The Design of Discovery Net: Towards Open Grid Services for Knowledge Discovery
- 1 August 2003
- journal article
- Published by SAGE Publications in The International Journal of High Performance Computing Applications
- Vol. 17 (3), 297-315
- https://doi.org/10.1177/1094342003173003
Abstract
With the emergence of distributed resources and grid technologies there is a need to provide higher level informatics infrastructures allowing scientists to easily create and execute meaningful data integration and analysis processes that take advantage of the distributed nature of the available resources. These resources typically include heterogeneous data sources, computational resources for task execution and various application-specific services. The effort of the high performance community has so far mainly focused on the delivery of low-level informatics infrastructures enabling the basic needs of grid applications. Such infrastructures are essential but do not directly help end-users in creating generic and re-usable applications. In this paper, we present the Discovery Net architecture for building grid-based knowledge discovery applications. Our architecture enables the creation of high-level, re-usable and distributed application workflows that use a variety of common types of distributed resources. It is built on top of standard protocols and standard infrastructures such as Globus but also defines its own protocols such as the Discovery Process Mark-up Language for data flow management. We discuss an implementation of our architecture and evaluate it by building a real-time genome annotation environment on top.
This publication has 27 references indexed in Scilit:
- SETI@home. Communications of the ACM, 2002
- Prediction of Human Protein Function from Post-translational Modifications and Localization Features. Journal of Molecular Biology, 2002
- Genome annotation: from sequence to biology. Nature Reviews Genetics, 2001
- Predicting Subcellular Localization of Proteins Based on their N-terminal Amino Acid Sequence. Journal of Molecular Biology, 2000
- The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Research, 2000
- Towards seamless computing and metacomputing in Java. Concurrency: Practice and Experience, 1998
- Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 1997
- Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology, 1997
- tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence. Nucleic Acids Research, 1997
- Basic local alignment search tool. Journal of Molecular Biology, 1990