BioTextQuest (BTQ) implements an enhanced version of the TextQuest algorithm (Iliopoulos et al., 2001) behind a user-friendly web interface. BTQ collects abstracts matching a user query from the Medline literature database and from OMIM. Identifying relevant terms allows text records to be represented in a Vector Space Model, from which pairwise document similarities are calculated. Suitable clustering algorithms then group the records into clusters, each reported with its corresponding terms. Besides its document processing and clustering algorithms, BTQ relies on public web services such as NCBI eSearch, Reflect, and WhatIzIt to query biomedical databases and to annotate and enrich biomedically significant terms. Data integration and further bioinformatics analysis of the tagged bioentities are available through the BioCompendium service. Additional value-added features include a variety of clustering, stemming, co-occurrence analysis and visualization techniques that allow interactive navigation of the results.
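The Vector Space Model step can be illustrated with a minimal sketch. It is not BTQ's actual implementation: the tokenizer, the smoothed TF-IDF weighting, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `pairwise_similarities`) are assumptions chosen for illustration, and cosine similarity stands in for whichever similarity measure BTQ uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphanumeric runs (keeps terms like "p53").
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    # Represent each document as a sparse {term: weight} vector.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumption): always positive, rarer terms weigh more.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_similarities(docs):
    # Full symmetric similarity matrix, the usual input to a clustering step.
    vectors = tfidf_vectors(docs)
    return [[cosine(u, v) for v in vectors] for u in vectors]
```

Calling `pairwise_similarities` on a list of abstracts returns a square matrix in which abstracts that share weighted terms score higher; a clustering algorithm would then operate on that matrix.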

BTQ takes a Medline/OMIM-like query as input. Tunable parameters include the clustering and stemming algorithms to be used, along with their own parameters. Results are presented in four views: document clusters, tag clouds, interactive document graphs, and the list of extracted terms of biomedical significance. To enhance these views further, the extracted terms are annotated according to the biological entity they describe (gene, protein, biological process, molecular function, and/or cellular component). Importantly, users can sub-cluster and/or recluster results from previous analyses.
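As a rough illustration of how a Medline/OMIM-like query might be passed to NCBI eSearch, the sketch below only builds the request URL. How BTQ itself calls the service is not documented here, so the helper name `build_esearch_url` and the default values are assumptions; `db`, `term` and `retmax` are standard E-utilities parameters.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities eSearch endpoint.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query, db="pubmed", retmax=100):
    # Hypothetical helper: encode the user query for the chosen Entrez
    # database ("pubmed" for Medline abstracts, "omim" for OMIM records).
    params = {"db": db, "term": query, "retmax": retmax}
    return ESEARCH_BASE + "?" + urlencode(params)


pubmed_url = build_esearch_url("p53 AND apoptosis")
omim_url = build_esearch_url("cystic fibrosis", db="omim", retmax=20)
```

Fetching either URL returns the identifiers of matching records, which would then be retrieved and processed in later steps.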

References

1. Papanikolaou N*, Pavlopoulos GA*, Pafilis E, Theodosiou T, Schneider R, Satagopam V, Ouzounis C, Eliopoulos A, Promponas VJ, Iliopoulos I. (2014) BioTextQuest+: A knowledge integration platform for literature mining and concept discovery. Bioinformatics, August 6, doi: 10.1093/bioinformatics/btu524. [PubMed]

2. Papanikolaou N., Pafilis E.J., Nikolaou S., Ouzounis C.A., Iliopoulos I. and Promponas V. (2011) BioTextQuest: A Web-based Biomedical Text Mining Suite for Concept Discovery. Bioinformatics, Dec 1;27(23):3327-3328. [PubMed]

3. Iliopoulos, I., Enright, A.J. and Ouzounis C.A. (2001) TextQuest: Document Clustering of Medline Abstracts for Concept Discovery in Molecular Biology. Pac. Symp. Biocomput., 384-395. [PubMed]

Contributors


Evangelos Pafilis

Nikolaos Papanikolaou

Georgios Pavlopoulos

Andreas Antonakis

Theodosis Theodosiou

Venkata Satagopam

Reinhard Schneider

Stavros Nikolaou

Christos Ouzounis

Aristeidis Eliopoulos

Vasilis Promponas

Ioannis Iliopoulos