1.7 Conclusion
Mining the complex structure of scientific publications is a daunting task, yet
an achievable one. Not only can layout information be used to improve on classical
tasks [9], we have also shown that it opens up new possibilities for retrieving
information from publications. Being able to find images and table data
specifically helps not only in the biological scenario of our case studies. It is
also applicable to other areas, such as chemistry to find structural formulae,
engineering to find results of standardized tests in tables, or pharmacology to
find which drugs did not work on a given problem.
1.8 Future Trends
The next topic of interest would be to show whether the mistakes made by the
automatic layout detection outweigh the gains when applying it to classical tasks
like classification and clustering. In table recognition, it would be interesting to
see whether the data from the tables can be automatically extracted into databases
for easier access. These are just a few of the many possibilities that the layout
analysis of scientific documents opens up.
Layout information can also be used to improve the analysis of web sites
or any other textual and graphical medium, such as newspapers or magazines.
As the amount of data presented in this way is rising very fast and will probably
continue to do so, it becomes more and more important to find
effective ways to handle this data and get the most out of it.
References
1. Liu, Y., Navathe, S.B., Civera, J., Dasigi, V., Ram, A., Ciliax, B.J., Dingledine, R.:
Text mining biomedical literature for discovering gene-to-gene relationships: a
comparative study of algorithms. IEEE/ACM Trans. Comput. Biol. Bioinform. 2(1),
62-76 (2005)
2. Hu, X., Wu, D.D.: Data mining and predictive modeling of biomolecular network
from biomedical literature databases. IEEE/ACM Trans. Comput. Biol. Bioinform.
4(2), 251-263 (2007)
3. Tanabe, L., Scherf, U., Smith, L.H., Lee, J.K., Hunter, L., Weinstein, J.N.:
MedMiner: an Internet text-mining tool for biomedical information, with application
to gene expression profiling. Biotechniques 27(6), 1210-4, 1216-7 (1999)
4. Chaussabel, D.: Biomedical literature mining: challenges and solutions in the
'omics' era. Am. J. Pharmacogenomics 4(6), 383-393 (2004)
5. Natarajan, J., Berrar, D., Dubitzky, W., Hack, C., Zhang, Y., DeSesa, C., Van
Brocklyn, J.R., Bremer, E.G.: Text mining of full-text journal articles combined
with gene expression analysis reveals a relationship between
sphingosine-1-phosphate and invasiveness of a glioblastoma cell line.
BMC Bioinformatics 7, 373 (2006)
6. Faulstich, L.C., Stadler, P.F., Thurner, C., Witwer, C.: litsift: Automated text
categorization in bibliographic search. In: Data Mining and Text Mining for
Bioinformatics, Workshop at the ECML / PKDD 2003 (2003)