DOI: 10.1145/3151759.3151784
research-article

Medical documents processing for summary generation and keywords highlighting based on natural language processing and ontology graph descriptor approach

Published: 4 December 2017

ABSTRACT

This paper proposes a new method for retrieving data from free-text documents in the medical domain. The approach produces a document summary and highlights important keywords in the text to support further analysis of multiple medical documents. The document is processed with natural language processing techniques to find medical keywords and map them to concepts in a medical ontology. These concepts contribute to higher levels of the ontology hierarchy and build the document descriptor, a graph with concepts as nodes and their corresponding relevance points. The descriptor is used to generate the summary in the form of a tree. Finally, the most important keywords are highlighted in the original text. The experiments presented demonstrate that the approach successfully summarizes and highlights meaningful medical information.
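The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the keyword lexicon, the `decay` factor used to pass relevance points upward, and the `threshold` for inclusion are all assumptions made for illustration, and plain substring matching stands in for a real NLP pipeline and medical ontology.

```python
import re
from collections import defaultdict

# Hypothetical toy ontology: child concept -> parent concept (None marks the root).
PARENT = {
    "disease": None,
    "cardiovascular disease": "disease",
    "respiratory disease": "disease",
    "hypertension": "cardiovascular disease",
    "myocardial infarction": "cardiovascular disease",
    "asthma": "respiratory disease",
}

# Hypothetical keyword -> concept lexicon standing in for the NLP step.
LEXICON = {
    "hypertension": "hypertension",
    "high blood pressure": "hypertension",
    "heart attack": "myocardial infarction",
    "asthma": "asthma",
}

def find_keywords(text):
    """Count case-insensitive occurrences of each lexicon keyword in the text."""
    lowered = text.lower()
    return {kw: lowered.count(kw) for kw in LEXICON if kw in lowered}

def build_descriptor(keyword_counts, decay=0.5):
    """Score each matched concept, passing a decayed share up to every ancestor.

    The decay rule is an assumption; the abstract does not state how
    relevance points are propagated.
    """
    scores = defaultdict(float)
    for kw, count in keyword_counts.items():
        concept, weight = LEXICON[kw], float(count)
        while concept is not None:
            scores[concept] += weight
            concept, weight = PARENT[concept], weight * decay
    return dict(scores)

def summary_tree(scores, root="disease", threshold=1.0, depth=0):
    """Render concepts scoring at or above the threshold as an indented tree."""
    lines = []
    if scores.get(root, 0.0) >= threshold:
        lines.append("  " * depth + f"{root} ({scores[root]:.2f})")
    for child in sorted(c for c, p in PARENT.items() if p == root):
        lines.extend(summary_tree(scores, child, threshold, depth + 1))
    return lines

def highlight(text, scores, threshold=1.0):
    """Wrap keywords whose concept scores at or above the threshold in ** markers."""
    for kw in sorted(LEXICON, key=len, reverse=True):
        if scores.get(LEXICON[kw], 0.0) >= threshold:
            text = re.sub(re.escape(kw), lambda m: f"**{m.group(0)}**",
                          text, flags=re.IGNORECASE)
    return text

if __name__ == "__main__":
    note = "History of hypertension; high blood pressure noted. Prior heart attack."
    descriptor = build_descriptor(find_keywords(note))
    print("\n".join(summary_tree(descriptor)))
    print(highlight(note, descriptor))
```

In this sketch the descriptor is a plain dictionary from concept to relevance points, and the tree view simply walks the parent links. A real system would likely need negation handling and word-boundary matching, which are omitted here.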


    • Published in

      iiWAS '17: Proceedings of the 19th International Conference on Information Integration and Web-based Applications & Services
      December 2017
      609 pages
      ISBN: 9781450352994
      DOI: 10.1145/3151759

      Copyright © 2017 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States
