Courses in winter term 2016

Here are the courses I give this winter term, starting on October 3:

621.702 Computer organization (lab)
621.704 Computer organization (lab)

Students access the course material through non-public Moodle. If you are not enrolled in these courses but are interested in the course material (available in German only), please drop me an e-mail.

Compound Figure Separation Journal Paper

We submitted extended work on compound figure separation to the MTAP Journal.

Update: The revised version of our paper was accepted for publication on Dec 1, 2016 and published online on Dec 29, 2016. The printed version appeared in January 2018.

Title: Automatic Separation of Compound Figures in Scientific Articles

Content-based analysis and retrieval of digital images found in scientific articles is often hindered by images consisting of multiple subfigures (compound figures). We address this problem by proposing a method (ComFig) to automatically classify and separate compound figures, which consists of two main steps: (i) a supervised compound figure classifier (ComFig classifier) discriminates between compound and non-compound figures using task-specific image features; and (ii) an image processing algorithm is applied to predicted compound images to perform compound figure separation (ComFig separation). The proposed ComFig classifier is shown to achieve state-of-the-art classification performance on a published dataset. Our ComFig separation algorithm shows superior separation accuracy on two different datasets compared to other known automatic approaches. Finally, we propose a method to evaluate the effectiveness of the ComFig chain combining classifier and separation algorithm, and use it to optimize the misclassification loss of the ComFig classifier for maximal effectiveness in the chain.


BibTeX citation:

  Title                    = {Automatic separation of compound figures in scientific articles},
  Author                   = {Taschwer, Mario and Marques, Oge},
  Journal                  = {Multimedia Tools and Applications},
  Year                     = {2018},
  Month                    = {Jan},
  Number                   = {1},
  Pages                    = {519--548},
  Volume                   = {77},
  Doi                      = {10.1007/s11042-016-4237-x},
  ISSN                     = {1573-7721}

Courses in summer term 2016

In the upcoming summer term (starting on March 1, 2016), I will give the following two courses:

  • 620.002 Introduction to computer science (exercises, German)
  • 621.401 Compiler construction (lab, English)

Course material will be provided to students in non-public Moodle. If you are not enrolled in these courses but are interested in the course material, please drop me an e-mail.

Courses in winter term 2015

Here are the courses I give this winter term, starting this week:

620.005 Introduction to computer science (exercises)
621.704 Computer organization (lab)

Students access the course material through non-public Moodle. If you are not enrolled in these courses but are interested in the course material (available in German only), please drop me an e-mail.

Compound Figure Separation at MMM 2016

Our extended work on compound figure separation has been accepted as a regular paper at the MMM 2016 conference. A preprint is available here.

Update: Slides of my presentation on Jan 5, 2016 at the MMM conference. Official link to the published paper. BibTeX citation:

Title                    = {Compound Figure Separation Combining Edge and Band Separator Detection},
Author                   = {Taschwer, Mario and Marques, Oge},
Booktitle                = {MultiMedia Modeling},
Publisher                = {Springer International Publishing},
Year                     = {2016},
Editor                   = {Tian, Qi and Sebe, Nicu and Qi, Guo-Jun and Huet, Benoit and Hong, Richang and Liu, Xueliang},
Pages                    = {162--173},
Series                   = {Lecture Notes in Computer Science},
Volume                   = {9516},
Doi                      = {10.1007/978-3-319-27671-7_14},
ISBN                     = {978-3-319-27670-0}

We propose an image processing algorithm to automatically separate compound figures appearing in scientific articles. We classify compound images into two classes and apply a different algorithm for detecting vertical and horizontal separators to each class: the edge-based algorithm aims at detecting visible edges between subfigures, whereas the band-based algorithm tries to detect whitespace separating subfigures (separator bands). The proposed algorithm has been evaluated on two recent datasets for compound figure separation (CFS) in the biomedical domain and achieves slightly better detection accuracy than state-of-the-art approaches. Our experiments investigate the CFS effectiveness and classification accuracy of various classifier implementations.
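
To illustrate the band-based idea, here is a minimal Python sketch that looks for vertical whitespace bands in a grayscale image and cuts the image there. It is not the implementation from the paper: the function names, the brightness threshold, the minimum band width, and the rule of ignoring bands that touch the image border are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed representation: a grayscale image as a list of rows,
# each pixel an int from 0 (black) to 255 (white).
Image = List[List[int]]


def find_vertical_bands(image: Image, white_threshold: int = 240,
                        min_width: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges (end exclusive) of interior blank bands."""
    if not image or not image[0]:
        return []
    width = len(image[0])
    # A column counts as blank if every pixel in it is at least white_threshold.
    blank = [all(row[x] >= white_threshold for row in image) for x in range(width)]

    bands = []
    start = None
    for x in range(width + 1):
        is_blank = x < width and blank[x]
        if is_blank and start is None:
            start = x  # a run of blank columns begins
        elif not is_blank and start is not None:
            # Keep only runs that are wide enough and do not touch the border
            # (border whitespace is a margin, not a separator).
            if x - start >= min_width and start > 0 and x < width:
                bands.append((start, x))
            start = None
    return bands


def split_at_bands(image: Image, bands: List[Tuple[int, int]]) -> List[Image]:
    """Cut the image into subimages at the middle of each vertical band."""
    cuts = [0] + [(s + e) // 2 for s, e in bands] + [len(image[0])]
    return [[row[a:b] for row in image] for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    # Two dark 4-pixel-wide blocks separated by a 2-pixel white band.
    img = [[0] * 4 + [255] * 2 + [0] * 4 for _ in range(4)]
    bands = find_vertical_bands(img)
    print(bands)
    print([len(part[0]) for part in split_at_bands(img, bands)])
```

Horizontal bands can be found with the same function by transposing the image first, for example with `[list(col) for col in zip(*image)]`.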

Compound Figure Separation at ImageCLEF 2015

We participated in the compound figure separation subtask of the ImageCLEF 2015 medical classification task (group AAUITEC). The paper describing our approach has been accepted for the CLEF 2015 Working Notes. A preprint version is available here.

Update: I presented a poster at CLEF 2015 in Toulouse on September 9, 2015. The paper appeared online in CEUR Workshop Proceedings. Here is the BibTeX citation:

Title                    = {{AAUITEC} at {ImageCLEF} 2015: Compound Figure Separation},
Author                   = {Taschwer, Mario and Marques, Oge},
Booktitle                = {{CLEF} 2015 Working Notes},
Year                     = {2015},
Month                    = {September},
Series                   = {CEUR Workshop Proceedings, ISSN 1613-0073},
Volume                   = {1391},
Location                 = {Toulouse, France},
Url                      = {}

Our approach to automatically separating compound figures appearing in biomedical articles is split into two image processing algorithms: one is based on detecting separator edges, and the other tries to identify background bands separating subfigures. Only one algorithm is applied to a given image, according to the prediction of a binary classifier trained to distinguish graphical illustrations from other images in biomedical articles. Our submission to the ImageCLEF 2015 compound figure separation task achieved an accuracy of 49% on the provided test set of about 3400 compound images. This falls clearly behind the best submission of other participants (85% accuracy), but our approach is an order of magnitude faster than other approaches reported in the literature.

ACM Multimedia 2014 Doctoral Symposium

Update: I gave my presentation at ACM Multimedia 2014 on November 5, and received the Best Doctoral Symposium Paper Award. The slides are available here.

My PhD proposal has been accepted by the ACM Multimedia 2014 Doctoral Symposium. The paper is available here.

The proposed PhD project addresses the problem of finding descriptions of diseases or patients’ health records that are relevant to a given description of a patient’s symptoms, also known as medical case retrieval (MCR). Designing an automatic multimodal MCR system applicable to general medical data sets is still an open research problem, as indicated by the ImageCLEF 2013 MCR challenge, where the best submitted runs achieved only moderate retrieval performance and used purely textual techniques. This project therefore aims at designing a multimodal MCR model that is capable of achieving substantially better retrieval performance on the ImageCLEF data set than state-of-the-art techniques. Moreover, the potential for further improvement by leveraging relevance feedback from medical expert users for long-term learning will be investigated.

Call for Master Thesis on Biomedical Concept Detection

Update: The master thesis has been assigned to Florian Winkler as of June 26, 2014.

This is a call for a master thesis on biomedical concept detection.

Detecting biomedical concepts in text documents or queries is a useful tool for improving biomedical information retrieval or navigating biomedical document collections. Known successful techniques for biomedical concept detection rely on natural language processing and/or machine learning and incur a substantial processing overhead at the document indexing stage compared to keyword-only indexing.
The goal of this master thesis is to evaluate, on a recent public biomedical data set, a novel efficient concept detection algorithm based on position-dependent keyword matching. Concepts are taken from a biomedical thesaurus called Medical Subject Headings (MeSH). Results should be compared to the accuracy achieved by the well-known MetaMap concept detector.
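
To give a rough idea of the kind of evaluation involved, here is a minimal Python sketch of plain dictionary-based concept detection together with a precision/recall computation. It is only a baseline illustration under stated assumptions, not the algorithm to be evaluated in the thesis: the matching rule (thesaurus terms must occur as consecutive tokens), the function names, and the toy thesaurus with placeholder identifiers `X1` to `X3` are all hypothetical and are not real MeSH data.

```python
import re
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def detect_concepts(text: str, thesaurus: Dict[str, str]) -> Set[str]:
    """Return ids of thesaurus terms that occur as consecutive tokens in text.

    thesaurus maps a term (one or more words) to a concept id.
    """
    tokens = tokenize(text)
    term_tokens = {tuple(tokenize(term)): cid for term, cid in thesaurus.items()}
    max_len = max((len(t) for t in term_tokens), default=0)
    found: Set[str] = set()
    for i in range(len(tokens)):
        # Try every phrase length starting at token position i.
        for n in range(1, max_len + 1):
            cid = term_tokens.get(tuple(tokens[i:i + n]))
            if cid is not None:
                found.add(cid)
    return found


def precision_recall(predicted: Set[str], reference: Set[str]) -> Tuple[float, float]:
    """Compare detected concept ids against reference annotations."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy thesaurus; the ids are placeholders, not MeSH identifiers.
    toy_thesaurus = {"heart attack": "X1", "aspirin": "X2", "lung cancer": "X3"}
    detected = detect_concepts("Aspirin after a heart attack.", toy_thesaurus)
    print(sorted(detected))
    print(precision_recall(detected, {"X1", "X3"}))
```

The same `precision_recall` function could be applied to the output of any other detector on the same documents, which is what makes a side-by-side comparison possible.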