Managing information quality in e-science: the Qurator workbench
International Conference on Management of Data
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Beijing, China
SESSION: Group 4
Pages: 1150 - 1152
Year of Publication: 2007
ISBN: 978-1-59593-686-8
ABSTRACT
Data-intensive e-science applications often rely on third-party data found in public repositories, whose quality is largely unknown. Although scientists are aware that this uncertainty may lead to incorrect scientific conclusions, in the absence of a quantitative characterization of data quality properties they find it difficult to formulate precise data acceptability criteria. We present an Information Quality management workbench, called Qurator, that supports data experts in the specification of personal quality models, and lets them derive effective criteria for data acceptability. The demo of our working prototype will illustrate our approach on a real e-science workflow for a bioinformatics application.