ABSTRACT
We live in the Information Era: the Web has made a huge amount of useful information available and eased the sharing of data among sources. Despite the richness of information surrounding us, an information user is often overwhelmed by the sheer volume of raw, heterogeneous, and even conflicting data. Data sources can differ in quality, providing information at different levels of accuracy, freshness, and completeness, and data can flow between sources, being copied, reformatted, verified, and modified along the way. There is an increasing need to help users find the information and the sources that are of highest quality and authority, to help data producers understand how their data are being used (and possibly protect their rights), and to help analysts and auditors understand how information has been disseminated and how rumors have been propagated [1].
- L. Berti-Equille, A. D. Sarma, X. L. Dong, A. Marian, and D. Srivastava. Sailing the information ocean with awareness of currents: Discovery and application of source dependence. In CIDR, 2009.
- X. L. Dong, L. Berti-Equille, Y. Hu, and D. Srivastava. Global detection of complex copying relationships between sources. PVLDB, 2010.
- X. L. Dong, L. Berti-Equille, and D. Srivastava. Integrating conflicting data: the role of source dependence. PVLDB, 2(1), 2009.
- X. L. Dong, L. Berti-Equille, and D. Srivastava. Truth discovery and copying detection in a dynamic world. PVLDB, 2(1), 2009.
Index Terms
- Solomon: seeking the truth via copying detection