DOI: 10.1145/2600428.2609601
Research Article
Open Access

Evaluation of machine-learning protocols for technology-assisted review in electronic discovery

Published: 03 July 2014

ABSTRACT

Using a novel evaluation toolkit that simulates a human reviewer in the loop, we compare the effectiveness of three machine-learning protocols for technology-assisted review as used in document review for discovery in legal proceedings. Our comparison addresses a central question in the deployment of technology-assisted review: Should training documents be selected at random, or should they be selected using one or more non-random methods, such as keyword search or active learning? On eight review tasks -- four derived from the TREC 2009 Legal Track and four derived from actual legal matters -- recall was measured as a function of human review effort. The results show that entirely non-random training methods, in which the initial training documents are selected using a simple keyword search and subsequent training documents are selected by active learning, require substantially and significantly less human review effort (P<0.01) to achieve any given level of recall than passive learning, in which the machine-learning algorithm plays no role in the selection of training documents. Among passive-learning methods, significantly less human review effort (P<0.01) is required when keywords are used instead of random sampling to select the initial training documents. Among active-learning methods, continuous active learning with relevance feedback yields generally superior results to simple active learning with uncertainty sampling, while avoiding the vexing issue of "stabilization" -- determining when training is adequate, and therefore may stop.
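The three protocols differ chiefly in how each successive batch of training documents is chosen for human review. As a rough illustration of that difference (not the authors' toolkit or implementation), the sketch below simulates a reviewer in the loop with a generic scikit-learn logistic-regression classifier; the function names, batch size, review budget, and the assumption that the keyword-search seed set contains both relevant and non-relevant documents are all illustrative assumptions.

    # Illustrative sketch of the selection rules compared in the paper, assuming
    # binary 0/1 relevance labels (y) stand in for the human reviewer.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def select_batch(model, X, unlabeled_idx, batch_size, protocol):
        """Pick the next documents to send for human review."""
        scores = model.predict_proba(X[unlabeled_idx])[:, 1]  # P(relevant)
        if protocol == "CAL":
            # Continuous active learning: relevance feedback -- review the
            # documents the current model ranks as most likely relevant.
            order = np.argsort(-scores)
        else:  # "SAL"
            # Simple active learning: uncertainty sampling -- review the
            # documents closest to the model's decision boundary.
            order = np.argsort(np.abs(scores - 0.5))
        return unlabeled_idx[order[:batch_size]]

    def simulate_review(X, y, seed_idx, protocol="CAL", batch_size=100, budget=2000):
        """Return recall as a function of documents reviewed (hypothetical names).

        seed_idx: initial training documents, e.g., hits from a simple keyword
        search, assumed to include at least one relevant and one non-relevant
        document so the classifier can be trained.
        """
        reviewed = list(seed_idx)
        curve = []
        while len(reviewed) < budget:
            model = LogisticRegression(max_iter=1000).fit(X[reviewed], y[reviewed])
            unlabeled = np.setdiff1d(np.arange(len(y)), reviewed)
            if unlabeled.size == 0:
                break
            batch = select_batch(model, X, unlabeled, batch_size, protocol)
            reviewed.extend(int(i) for i in batch)  # the "human" labels this batch
            curve.append((len(reviewed), float(y[reviewed].sum() / max(y.sum(), 1))))
        return curve

Under this reading, the passive-learning baselines correspond to training once on documents chosen without the model's help (random, or keyword-seeded plus random), whereas both active protocols retrain after every reviewed batch; the recall-versus-effort curves reported in the paper are analogous to the values accumulated in curve above.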


Published in

SIGIR '14: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval
July 2014
1330 pages
ISBN: 9781450322577
DOI: 10.1145/2600428

Copyright © 2014 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 3 July 2014

Qualifiers

research-article

Acceptance Rates

SIGIR '14 paper acceptance rate: 82 of 387 submissions, 21%. Overall acceptance rate: 792 of 3,983 submissions, 20%.
