Matching and integration across heterogeneous data sources
Source ACM International Conference Proceeding Series; Vol. 151
Proceedings of the 2006 international conference on Digital government research
San Diego, California
POSTER SESSION: Posters
Pages: 438 - 439
Year of Publication: 2006
Authors
Patrick Pantel, University of Southern California, Marina del Rey, CA
Andrew Philpot, University of Southern California, Marina del Rey, CA
Eduard Hovy, University of Southern California, Marina del Rey, CA
Sponsor
NSF: National Science Foundation
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1146598.1146738

ABSTRACT

A sea of undifferentiated information is forming from the body of data that is collected by people and organizations, across government, for different purposes, at different times, and using different methodologies. The resulting massive data heterogeneity requires automatic methods for data alignment, matching and/or merging. In this poster, we describe two systems, Guspin™ and Sift™, for automatically identifying equivalence classes and for aligning data across databases. Our technology, based on principles of information theory, measures the relative importance of data, leveraging them to quantify the similarity between entities. These systems have been applied to solve real problems faced by the Environmental Protection Agency and its counterparts at the state and local government level.
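
The abstract does not spell out how the relative importance of data is measured, so the following is only a minimal sketch of one common information-theoretic approach, not the method used in Guspin or Sift. It assumes each record is a set of field values, weights each value by its self-information (rare values carry more information than common ones), and scores two records with an information-weighted overlap. The function names, the record layout, and the toy values are all hypothetical.

```python
import math
from collections import Counter


def self_information(records):
    """Map each value to -log2 of the fraction of records that contain it.

    Assumption: a value seen in every record gets weight 0, a rare value
    gets a high weight.
    """
    n = len(records)
    counts = Counter(v for rec in records for v in set(rec))
    return {v: -math.log2(c / n) for v, c in counts.items()}


def weighted_overlap(a, b, info):
    """Information-weighted Jaccard overlap between two records, in [0, 1]."""
    a, b = set(a), set(b)
    shared = sum(info.get(v, 0.0) for v in a & b)
    total = sum(info.get(v, 0.0) for v in a | b)
    return shared / total if total else 0.0


# Hypothetical toy records (sets of field values), not data from the poster.
records = [
    {"CA", "diesel", "plant-17"},
    {"CA", "diesel", "plant-17"},
    {"CA", "gasoline", "plant-42"},
    {"CA", "diesel", "plant-99"},
]
info = self_information(records)

for i in (1, 2, 3):
    score = weighted_overlap(records[0], records[i], info)
    print(f"record 0 vs record {i}: {score:.3f}")
```

In this sketch the value "CA" appears in every record, so sharing it contributes nothing to the score, while sharing the rarer "plant-17" contributes a great deal. Records whose score exceeds some chosen threshold could then be grouped as candidate equivalence classes; the threshold itself would have to be tuned on real data.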
