ABSTRACT
This paper focuses on detecting how concepts are linked across multiple text documents by generating an evidence trail that explains the connection. A traditional search involving, for example, two or more person names will attempt to find documents mentioning all of these individuals. This research focuses on a different interpretation of such a query: what is the best evidence trail across documents that explains a connection between these individuals? For example, all may be good golfers. A generalization of this task involves query terms representing general concepts (e.g., indictment, foreign policy). Such queries reflect a special case of text mining. Previous attempts to solve this problem have focused on graph approaches involving hyperlinked documents and on link analysis tools exploiting named entities. A new robust framework is presented, based on (i) generating concept chain graphs, a hybrid content representation, (ii) performing graph matching to select candidate subgraphs, and (iii) subsequently using graphical models to validate hypotheses using ranked evidence trails. We adapt the DUC data set for cross-document summarization to evaluate evidence trails generated by this approach.