Optimized seamless integration of biomolecular data
- 1 January 2001
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Today, scientific data is inevitably digitized, stored in a variety of heterogeneous formats, and accessible over the Internet. Scientists need to access an integrated view of multiple remote or local heterogeneous data sources. They then integrate the results of complex queries and apply further analysis and visualization to support the task of scientific discovery. Building a digital library for scientific discovery requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, as well as data that is locally materialized in warehouses or is generated by software. We consider several tasks to provide optimized and seamless integration of biomolecular data. Challenges to be addressed include capturing and representing source capabilities; developing a methodology to acquire and represent metadata about source contents and access costs; providing decision support to select sources and capabilities using cost-based and semantic knowledge; and generating low-cost query evaluation plans.
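As a rough illustration of the cost-based source selection the abstract mentions, the sketch below picks, for each step of a query, the cheapest source whose declared capabilities cover that step. It is not the authors' method: the `Source` record, the capability names, the cost figures, and the greedy per-step choice are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of a data source (not from the paper)."""
    name: str
    capabilities: frozenset  # operations the source can evaluate
    access_cost: float       # estimated cost per call, arbitrary units


def cheapest_source(sources, required):
    """Return the lowest-cost source supporting every required capability, or None."""
    candidates = [s for s in sources if required <= s.capabilities]
    return min(candidates, key=lambda s: s.access_cost, default=None)


def plan(sources, steps):
    """Greedily assign one source per step; return (source names, total estimated cost)."""
    assignments = []
    total = 0.0
    for required in steps:
        src = cheapest_source(sources, frozenset(required))
        if src is None:
            raise ValueError(f"no source supports {sorted(required)}")
        assignments.append(src.name)
        total += src.access_cost
    return assignments, total


# Made-up sources and costs, for illustration only.
SOURCES = [
    Source("remote_db", frozenset({"keyword_search", "sequence_fetch"}), 5.0),
    Source("local_warehouse", frozenset({"sequence_fetch"}), 1.0),
    Source("web_wrapper", frozenset({"keyword_search"}), 8.0),
]

assignments, total = plan(SOURCES, [{"keyword_search"}, {"sequence_fetch"}])
print(assignments, total)
```

A per-step greedy choice ignores interactions between steps, such as the cost of shipping intermediate results between sources, which a full query optimizer would have to weigh.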
This publication has 38 references indexed in Scilit:
- A schema-based approach to building a bioinformatics database federation. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Integrating life sciences data-with a little Garlic. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Scientific data integration: wrapping textual documents with a database view mechanism and an XML engine. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Alternative splicing: increasing diversity in the proteomic world. Trends in Genetics, 2001
- DiscoveryLink: A system for integrated access to life sciences data sources. IBM Systems Journal, 2001
- A meta-wrapper for scaling up to multiple autonomous distributed information sources. Published by Institute of Electrical and Electronics Engineers (IEEE), 1998
- NOMENCLATURE. Genomics, 1997
- LASSAP, a LArge Scale Sequence compArison Package. Bioinformatics, 1997
- Mediators in the architecture of future information systems. Computer, 1992
- Randomized algorithms for optimizing large join queries. Published by Association for Computing Machinery (ACM), 1990