Data quality inference
Source: Information Quality in Informational Systems
Proceedings of the 2nd international workshop on Information quality in information systems
Baltimore, Maryland
SESSION: Paper session III: statistics, clustering
Pages: 105 - 111  
Year of Publication: 2005
ISBN: 1-59593-160-0
Authors
Raymond K. Pon, UCLA Computer Science, Los Angeles, CA
Alfonso F. Cárdenas, UCLA Computer Science, Los Angeles, CA
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1077501.1077519

ABSTRACT

In sensor networks, data integration and collaboration, and intelligence gathering, information on the quality of data sources is important but often unavailable. We describe a technique that ranks data sources by observing and comparing their behavior (i.e., the data they produce). Intuitively, our measure characterizes data sources that agree with accurate or high-quality data sources as likely accurate. Furthermore, our measure includes a temporal component that takes into account a data source's past accuracy in evaluating its current accuracy. Initial experimental results on simulated data support our hypothesis, demonstrating high precision and recall in identifying the most accurate data sources.
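
The abstract does not give the formula behind the measure, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes numeric readings, a hypothetical tolerance `tol` for deciding when two sources agree, a simple iterative scoring in which a source gains score by agreeing with high-scoring sources, and a hypothetical weight `alpha` that blends a source's past score into its current one. The names `agreement` and `rank_sources` are invented for this example.

```python
from typing import Dict, Optional


def agreement(a: float, b: float, tol: float) -> float:
    """Return 1.0 if two readings agree within tol, else 0.0 (assumed rule)."""
    return 1.0 if abs(a - b) <= tol else 0.0


def rank_sources(
    readings: Dict[str, float],
    prior: Optional[Dict[str, float]] = None,
    tol: float = 0.5,
    alpha: float = 0.5,
    iters: int = 50,
) -> Dict[str, float]:
    """Score sources by mutual agreement, blended with past scores.

    readings: one current reading per source.
    prior:    scores from an earlier round (the assumed temporal component).
    alpha:    weight given to the prior score, between 0 and 1.
    """
    names = sorted(readings)
    n = len(names)
    # Pairwise agreement matrix; each source trivially agrees with itself.
    agree = [
        [agreement(readings[a], readings[b], tol) for b in names]
        for a in names
    ]
    # Repeatedly give each source the summed score of the sources it
    # agrees with, then normalize so the scores sum to 1.
    scores = [1.0 / n] * n
    for _ in range(iters):
        updated = [
            sum(agree[i][j] * scores[j] for j in range(n)) for i in range(n)
        ]
        total = sum(updated)
        scores = [s / total for s in updated]
    current = dict(zip(names, scores))
    if prior is None:
        return current
    # Temporal blend: unseen sources are treated as having a prior of 0.
    return {
        name: alpha * prior.get(name, 0.0) + (1.0 - alpha) * current[name]
        for name in names
    }


# Round 1: s4 disagrees with the other three sources.
round1 = rank_sources({"s1": 10.0, "s2": 10.2, "s3": 10.1, "s4": 15.0})
# Round 2: s1 is now the outlier, but its good history softens the penalty.
round2 = rank_sources(
    {"s1": 20.0, "s2": 10.0, "s3": 10.1, "s4": 10.2}, prior=round1
)
```

In this toy run, the source that disagrees with the majority in round 1 ends up with a score near zero, while in round 2 a source with a good history keeps part of its score despite a single bad reading. Both behaviors follow from the assumptions above rather than from anything stated in the paper.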

