Call for Master Thesis on Biomedical Concept Detection

Update: The master thesis has been assigned to Florian Winkler as of June 26, 2014.

This is a call for a master thesis on biomedical concept detection.

Detecting biomedical concepts in text documents or queries is a useful tool for improving biomedical information retrieval or navigating biomedical document collections. Known successful techniques for biomedical concept detection rely on natural language processing and/or machine learning and incur a substantial processing overhead at the document indexing stage compared to keyword-only indexing.
The goal of this master thesis is to evaluate, on a recent public biomedical data set, a novel and efficient concept detection algorithm based on position-dependent keyword matching. Concepts are taken from the Medical Subject Headings (MeSH), a biomedical thesaurus. Results should be compared to the accuracy achieved by the well-known MetaMap concept detector.
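
The call does not fix an accuracy measure for this comparison. As a minimal sketch, assuming that each detector outputs a set of concept identifiers per document and that a reference annotation is available, precision, recall and F1 could be computed as follows. The function name and the identifiers `C1` to `C4` are hypothetical placeholders, not MeSH identifiers.

```python
def precision_recall_f1(detected, reference):
    """Compare detected concept identifiers with reference annotations for one document."""
    detected, reference = set(detected), set(reference)
    true_positives = len(detected & reference)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical detector output and reference annotation for a single document.
detected = {"C1", "C2", "C3"}
reference = {"C1", "C2", "C4"}
precision, recall, f1 = precision_recall_f1(detected, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same function on the output of both detectors against the same reference would give directly comparable scores.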

Textual MCR Methods

I recently completed a technical report on textual methods for medical case retrieval. A corresponding poster prepared for an internal ITEC meeting is available here.

M. Taschwer. Textual methods for medical case retrieval. Technical Report TR/ITEC/14/2.01, Institute of Information Technology (ITEC), Alpen-Adria-Universität Klagenfurt, Austria, May 2014.

Medical case retrieval (MCR) is information retrieval in a collection of medical case descriptions, where descriptions of patients’ symptoms are used as queries. We apply known text retrieval techniques based on query and document expansion to this problem, and combine them with new algorithms to match queries and documents with Medical Subject Headings (MeSH). We ran comprehensive experiments to evaluate 546 method combinations on the ImageCLEF 2013 MCR dataset. Methods combining MeSH query expansion with pseudo-relevance feedback performed best, delivering retrieval performance comparable to or slightly better than the best MCR run submitted to ImageCLEF 2013.
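
For readers unfamiliar with pseudo-relevance feedback, the idea can be sketched as follows: the top-ranked documents of an initial retrieval run are assumed to be relevant, and their most frequent terms are added to the query. This is a generic illustration with a simple frequency-based term selection, not the method evaluated in the report; the function name, the parameters and the toy documents are assumptions.

```python
from collections import Counter


def expand_with_feedback(query_terms, ranked_docs, top_docs=2, new_terms=2):
    """Add the most frequent terms of the top-ranked documents to the query.

    ranked_docs is a list of tokenized documents, best match first.
    """
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(term for term in doc if term not in query_terms)
    # Sort by descending frequency, then alphabetically, for a deterministic result.
    candidates = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return list(query_terms) + [term for term, _ in candidates[:new_terms]]


# Toy initial ranking for the query "fever cough" (hypothetical data).
ranked_docs = [
    ["fever", "cough", "pneumonia", "infiltrate"],
    ["cough", "pneumonia", "antibiotic"],
    ["fracture", "femur"],
]
print(expand_with_feedback(["fever", "cough"], ranked_docs))
```

The expanded query is then submitted to the retrieval engine again. A real system would typically weight candidate terms by more than raw frequency and filter out stop words.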

ImageCLEF 2013 participation

I participated in one of the ImageCLEF 2013 medical tasks by experimenting with text-based retrieval and query expansion using the MeSH ontology. The resulting CLEF 2013 working note paper is available here. The achieved performance for medical case retrieval was moderate, but there are additional options for text retrieval to explore before turning to visual retrieval.

I will present a poster (not reviewed) at the CLEF 2013 conference in two weeks. Here is the abstract:

Our approach to the ImageCLEF medical case retrieval task consists of text-only retrieval combined with utilizing the Medical Subject Headings (MeSH) ontology. MeSH terms extracted from the query are used for query expansion or query term weighting. MeSH annotations of documents available from PubMed Central are added to the corpus. Retrieval results improve slightly upon full-text retrieval.

The poster is available here.
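
The query expansion step mentioned in the abstract can be illustrated with a small sketch. It assumes a toy thesaurus that maps a heading to its entry terms, matches them as whole phrases in the query, and adds the tokens of a matched concept with a reduced weight. The matching rule, the weight of 0.5 and all names are assumptions for illustration, not the procedure used in the working note.

```python
def expand_query(query, thesaurus, mesh_weight=0.5):
    """Return a term -> weight mapping for the query, expanded with thesaurus terms."""
    text = query.lower()
    # Original query terms keep full weight.
    weights = {token: 1.0 for token in text.split()}
    padded = f" {text} "
    for heading, entry_terms in thesaurus.items():
        names = [heading.lower()] + [term.lower() for term in entry_terms]
        # A concept matches if its heading or any entry term occurs as a whole phrase.
        if any(f" {name} " in padded for name in names):
            for name in names:
                for token in name.split():
                    # Added terms get a lower weight; existing query terms are not changed.
                    weights.setdefault(token, mesh_weight)
    return weights


# Toy thesaurus with a single concept (illustrative only).
thesaurus = {"Myocardial Infarction": ["heart attack"]}
print(expand_query("patient with heart attack", thesaurus))
```

Setting `mesh_weight` to 1.0 corresponds to plain query expansion, while smaller values keep the original query terms dominant.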

PhD proposal

I just finished my PhD proposal on medical case retrieval (available as PDF). Here is the abstract:

The proposed PhD project addresses the problem of medical case retrieval (MCR), where a medical case is represented by a multimedia document describing a certain disease or a patient’s history. The ImageCLEF evaluation campaign poses a yearly MCR task using a heterogeneous dataset of more than 75,000 medical publications consisting of text and images. The best results achieved by participants of the ImageCLEF MCR task in 2012 are moderate and call for improvement. Interestingly, approaches based on visual retrieval perform significantly worse than text-only retrieval, even if combined with text retrieval. This project therefore aims at designing an MCR model that is able to deliver a substantially better retrieval performance on the ImageCLEF dataset. Moreover, the potential of further improvement by leveraging the feedback of medical expert users for long-term learning will be investigated.