A notation and system for expressing and executing cleanly typed workflows on messy scientific data
- 1 September 2005
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGMOD Record
- Vol. 34 (3), 37-43
- https://doi.org/10.1145/1084805.1084813
Abstract
The description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with "messy" issues like heterogeneous storage formats and ad-hoc file system structures. We show how these difficulties can be overcome via a typed, compositional workflow notation within which issues of physical representation are cleanly separated from logical typing, and by the implementation of this notation within the context of a powerful runtime system that supports distributed execution. The resulting notation and system are capable both of expressing complex workflows in a simple, compact form, and of enacting those workflows in distributed environments. We apply our technique to cognitive neuroscience workflows that analyze functional MRI image data, and demonstrate significant reductions in code size relative to other approaches.
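The separation of logical typing from physical representation described in the abstract can be illustrated with a minimal sketch. This is not the paper's notation or system. It is a Python analogy in which every name (`Volume`, `Run`, `map_run`, `apply_to_run`, `reorient`) and the `.img`/`.hdr` file-naming scheme are assumptions made for illustration only.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable, List


@dataclass(frozen=True)
class Volume:
    """Logical type: one image volume, made of an image file and a header file."""
    image: PurePosixPath
    header: PurePosixPath


@dataclass(frozen=True)
class Run:
    """Logical type: an ordered series of volumes."""
    volumes: List[Volume]


def map_run(root: str, prefix: str, count: int) -> Run:
    """Hypothetical mapper: the only place that knows the on-disk naming scheme."""
    base = PurePosixPath(root)
    return Run([
        Volume(base / f"{prefix}_{i:03d}.img", base / f"{prefix}_{i:03d}.hdr")
        for i in range(1, count + 1)
    ])


def apply_to_run(step: Callable[[Volume], Volume], run: Run) -> Run:
    """Typed composition: lift a per-volume step to a whole run."""
    return Run([step(v) for v in run.volumes])


def reorient(v: Volume) -> Volume:
    """Placeholder step: only derives output names, no image processing is done."""
    return Volume(v.image.with_name("r" + v.image.name),
                  v.header.with_name("r" + v.header.name))


# Usage: the workflow is written against Run and Volume, never against raw paths.
run = map_run("/data/study1", "bold1", 3)
out = apply_to_run(reorient, run)
```

In this sketch, changing the storage layout would only require a different mapper; `apply_to_run` and `reorient` would stay the same because they depend on the logical types alone.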
This publication has 4 references indexed in Scilit:
- XDTM: The XML Data Type and Mapping for Specifying Datasets. Published by Springer Nature, 2005
- Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics, 2004
- Chimera: a virtual data system for representing, querying, and automating data derivation. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Taming heterogeneity - the Ptolemy approach. Proceedings of the IEEE, 2003