ABSTRACT
The main contribution of this paper is a method for creating a Graph-Embedded-Tree-based ontology that uses domain knowledge from a patent classification scheme to support patent classification. Our contribution is twofold. First, we propose a novel definition of the GeTCo ontology, which consists of four types of concept: Class, Document, Phrase, and Term. For each pair of concepts, we further define semantic information according to their relationship, giving our classifier better reasoning capability whenever semantic ambiguity occurs. Second, we propose a novel method for constructing the ontology from the United States Patent Classification (USPC) scheme without relying on rule-based concept extraction, thereby avoiding the intensive manual effort of traditional ontology construction. To evaluate the proposed ontology, we developed a prototype application on top of the Rocchio classifier, called the GeTCo-enabled Rocchio classifier. Our experiments on a filtered set of 9,703 single-class patents showed that the GeTCo-enabled Rocchio classifier, backed by our proposed directed-graph ontology, yields a higher F1-score (+7%) than the original Rocchio classifier without GeTCo support.
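For readers unfamiliar with the baseline, the following is a minimal sketch of a plain Rocchio (nearest-centroid) text classifier, the kind of classifier the prototype is built on. It does not reproduce the GeTCo ontology or the paper's implementation. The function names, the smoothed IDF weighting, and the toy documents and class labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def normalize(vec):
    """Scale a sparse vector (dict) to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def train_rocchio(docs, labels):
    """Build one unit-length TF-IDF centroid per class (illustrative only)."""
    n = len(docs)
    # Document frequency of each term, then a smoothed IDF (an assumed variant).
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        tf = Counter(doc)
        vec = normalize({t: tf[t] * idf[t] for t in tf})
        for t, w in vec.items():
            sums[label][t] += w

    # Centroid = mean of the class's document vectors, re-normalized.
    centroids = {
        label: normalize({t: w / counts[label] for t, w in terms.items()})
        for label, terms in sums.items()
    }
    return centroids, idf


def classify(tokens, centroids, idf):
    """Return the class whose centroid has the highest cosine similarity."""
    tf = Counter(t for t in tokens if t in idf)  # ignore unseen terms
    query = normalize({t: tf[t] * idf[t] for t in tf})

    def cosine(centroid):
        return sum(w * centroid.get(t, 0.0) for t, w in query.items())

    return max(centroids, key=lambda label: cosine(centroids[label]))


# Hypothetical toy data: tokenized documents with made-up class labels.
TOY_DOCS = [
    ["engine", "piston", "fuel"],
    ["engine", "valve", "combustion"],
    ["antenna", "signal", "receiver"],
    ["signal", "modulation", "antenna"],
]
TOY_LABELS = ["mechanical", "mechanical", "telecom", "telecom"]

if __name__ == "__main__":
    centroids, idf = train_rocchio(TOY_DOCS, TOY_LABELS)
    print(classify(["fuel", "engine"], centroids, idf))
```

In this sketch a document is assigned purely by term overlap with class centroids, which is the step where ambiguous terms can mislead the classifier; the abstract describes the ontology as additional semantic information for such cases.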
Index Terms
- GeTCo: an ontology-based approach for patent classification search