ABSTRACT
High-throughput glycoproteomics, like genomics and proteomics, involves extremely large volumes of distributed, heterogeneous data as a basis for the identification and quantification of a structurally diverse collection of biomolecules. Sharing, comparing, querying and, most critically, correlating datasets using their native biological relationships are among the challenges faced by glycobiology researchers. To address these challenges, we are building a semantic structure, using a suite of ontologies, that supports the management of data and information at each step of the experimental lifecycle. This framework will enable researchers to leverage the large scale of glycoproteomics data to their benefit. In this paper, we focus on the design of these biological ontology schemas, with an emphasis on relationships between biological concepts; on the use of novel approaches to populate these complex ontologies, including integrating extremely large datasets ( 500MB) as part of the instance base; and on the evaluation of the ontologies using OntoQA [38] metrics. The application of these ontologies in providing informatics solutions for the high-throughput glycoproteomics experimental domain is also discussed. We present our experience as a use case of developing two ontologies in one domain, to be part of a set of use cases that are used in developing an emergent framework for building and deploying biological ontologies.
REFERENCES
[1] B. Aleman-Meza, C. Halaschek, A. Sheth, I. B. Arpinar, G. Sannapareddy, "SWETO: Large-Scale Semantic Web Test-bed," Proc. of the 16th Intl. Conf. on Software Engineering & Knowledge Engineering (SEKE2004): Intl. Workshop on Ontology in Action, Banff, Canada, June 21-24, 2004, pp. 490-493. http://lsdis.cs.uga.edu/library/resources/
[2] M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. S. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin, G. Sherlock, Gene Ontology: Tool for the Unification of Biology. Nature Genetics, 25:25-29, 2000.
[3] A. Bohne-Lang, T. Lang, E. Forster, C. W. von der Lieth, 2001. LINUCS: linear notation for unique description of carbohydrate sequences. Carbohydr Res. 336:1-11.
[4] M. Cristani and R. Cuel, A Survey on Ontology Creation Methodologies. Int. J. Semantic Web Inf. Syst. 1(2): 49-69 (2005).
[5] S. Doubet and P. Albersheim, CarbBank. Glycobiology, 2, 1992, 505.
[7] R. Guha and R. McCool, The TAP knowledge base. http://tap.stanford.edu/
[8] I. Horrocks, P. F. Patel-Schneider and F. van Harmelen, From SHIQ and RDF to OWL: the making of a Web Ontology Language, Journal of Web Semantics 1(1): 7-26 (2003).
[17] IUPAC Commission on the Nomenclature of Organic Chemistry (CNOC) and IUPAC-IUB Commission on Biochemical Nomenclature (CBN). Nomenclature of Cyclitols. Recommendations, 1973. Biochem J. 1976 Jan 1;153(1):23-31.
[18] D. M. Jones, T. J. M. Bench-Capon, and P. R. S. Visser, Methodologies for Ontology Development. In J. Cuena, editor, Proc. ITi and KNOWS Conference of the 15th IFIP World Computer Congress, pages 62-75, London, UK, 1998. Chapman and Hall Ltd.
[19] M. Kanehisa and S. Goto, KEGG: Kyoto Encyclopedia of Genes and Genomes, Nucleic Acids Research, 2000, Vol. 28, No. 1, 27-30.
[20] A. Loß, P. Bunsmann, A. Bohne, A. Loß, E. Schwarzer, E. Lang, and C. W. von der Lieth, SWEET-DB: an attempt to create annotated data collections for carbohydrates, Nucleic Acids Research, 2002 January 1; Vol. 30, No. 1, 405-408.
[21] NCRR Integrated Technology Resource for Biomedical Glycomics: http://lsdis.cs.uga.edu/projects/glycomics/, http://cell.ccrc.uga.edu/world/glycomics/researchprogram.php
[23] Natalya F. Noy, Michael Sintek, Stefan Decker, Monica Crubézy, Ray W. Fergerson, Mark A. Musen, Creating Semantic Web Contents with Protégé-2000, IEEE Intelligent Systems, v.16 n.2, p.60-71, March 2001. doi: 10.1109/5254.920601
[24] N. F. Noy and D. L. McGuinness, "Ontology Development 101: A Guide to Creating Your First Ontology". Knowledge Systems Laboratory, March 2001.
[25] D. Ramachandran, P. Reagan, K. Goolsbey, First-Orderized ResearchCyc: Expressivity and Efficiency in a Common-Sense Ontology. In Papers from the AAAI Workshop on Contexts and Ontologies: Theory, Practice and Applications. Pittsburgh, Pennsylvania, July 2005.
[27] S. S. Sahoo, A. P. Sheth, W. S. York, J. A. Miller, "Semantic Web Services for N-Glycosylation Process", International Symposium on Web Services for Computational Biology and Bioinformatics, VBI, Blacksburg, VA, May 26-27, 2005.
[28] S. S. Sahoo, C. Thomas, A. Sheth, C. Henson, W. S. York, GLYDE - an expressive XML standard for the representation of glycan structure, Carbohydr Res. 2005 Dec 30;340(18):2802-7. Epub 2005 Oct 20. PMID: 16242678
[29] S. Schulze-Kremer, Ontologies for molecular biology and bioinformatics, In Silico Biology 2, 0017 (2002).
[30] Amit Sheth, Clemens Bertram, David Avant, Brian Hammond, Krzysztof Kochut, Yashodhan Warke, Managing Semantic Content for the Web, IEEE Internet Computing, v.6 n.4, p.80-87, July 2002. doi: 10.1109/MIC.2002.1020330
[31] A. Sheth, I. B. Arpinar, and V. Kashyap, Relationships at the Heart of Semantic Web: Modeling, Discovering, and Exploiting Complex Semantic Relationships, in Enhancing the Power of the Internet: Studies in Fuzziness and Soft Computing, M. Nikravesh, L. A. Zadeh, B. Azvine, R. R. Yager (Eds), Springer-Verlag, 63-94.
[32] L. N. Soldatova and R. D. King, Are the current ontologies in biology good ontologies? Nature Biotechnology, 23, 1095-1098 (2005).
[33] R. Stevens, C. A. Goble and S. Bechhofer, Ontology-based Knowledge Representation for Bioinformatics, Briefings in Bioinformatics. 2000 Nov;1(4):398-414.
[34] R. Stevens, P. Baker, S. Bechhofer, G. Ng, A. Jacoby, N. W. Paton, C. A. Goble, A. Brass, TAMBIS: transparent access to multiple bioinformatics information sources, Bioinformatics. 2000 Feb;16(2):184-5.
[35] C. J. Stoeckert Jr, H. C. Causton and C. A. Ball, Microarray databases: standards and ontologies, Nature Genetics, 32 Supplement - Chipping Forecast II (December 2002), pp. 469-473.
[36] SUMO: http://ontology.teknowledge.com/
[37] N. Takahashi and K. Kato, GlycoTree, Trends in Glycoscience and Glycotechnology, 15, 2003: 235-251.
[38] S. Tartir, I. B. Arpinar, M. Moore, A. Sheth, B. Aleman-Meza, OntoQA: Metric-Based Ontology Quality Analysis, IEEE ICDM 2005 Workshop on Knowledge Acquisition from Distributed, Autonomous, Semantically Heterogeneous Data and Knowledge Sources. Houston, Texas, November 27, 2005.
[39] C. F. Taylor et al., "A systematic approach to modeling, capturing, and disseminating proteomics experimental data", Nat. Biotechnol. 2003 Mar;21(3):247-54.
[41] J. Zhao, C. Wroe, C. Goble, R. Stevens, D. Quan, M. Greenwood, Using Semantic Web Technologies for Representing e-Science Provenance, in Proc. 3rd International Semantic Web Conference ISWC2004, Hiroshima, Japan, 9-11 Nov 2004, Springer LNCS 3298.
CITED BY 4
Boanerges Aleman-Meza, Farshad Hakimpour, I. Budak Arpinar, Amit P. Sheth, SwetoDblp ontology of Computer Science publications, Web Semantics: Science, Services and Agents on the World Wide Web, v.5 n.3, p.151-155, September 2007
Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee Giles, Extraction and search of chemical formulae in text documents on the web, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
Darren A. Natale, Cecilia N. Arighi, Winona Barker, Judith Blake, Ti-Cheng Chang, Zhangzhi Hu, Hongfang Liu, Barry Smith, Cathy H. Wu, Framework for a protein ontology, Proceedings of the 1st international workshop on Text mining in bioinformatics, November 10-10, 2006, Arlington, Virginia, USA
INDEX TERMS
Primary Classification: H. Information Systems > H.3 INFORMATION STORAGE AND RETRIEVAL > H.3.3 Information Search and Retrieval
Additional Classification: H. Information Systems > H.1 MODELS AND PRINCIPLES > H.1.m Miscellaneous
General Terms: Design, Languages, Management, Standardization
Keywords: ProPreO, bioinformatics ontology, biological ontology development, glycO, glycoproteomics, ontology population, ontology structural metrics, semantic bioinformatics