Research article, DOI: 10.1145/2790798.2790799

Frequent Subgraph Mining from Streams of Uncertain Data

Published: 13 July 2015

ABSTRACT

In the current era of big data, high volumes of high-value data, such as social network data, can be generated at high velocity. The quality and accuracy of these data depend on their veracity, that is, on the uncertainty of the data. A collection of these uncertain data can be viewed as a big, interlinked, dynamic graph structure. Embedded in these big data is implicit, previously unknown, and potentially useful knowledge. Hence, efficient and effective knowledge discovery algorithms for mining frequent subgraphs from these dynamic, streaming, graph-structured data are in demand. Most existing algorithms mine frequent subgraphs from streams of precise data. However, in many real-life scientific and engineering applications, data are uncertain. Hence, in this paper, we propose algorithms that use limited memory space for mining frequent subgraphs from streams of uncertain data. Evaluation results show the effectiveness of our algorithms in mining frequent subgraphs from streams of uncertain data.
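
The abstract does not spell out the proposed algorithms, so the following is only a minimal sketch of the problem setting, not the authors' method. It assumes the expected-support model that is common in uncertain data mining: each edge of a streamed graph carries an independent existential probability, and a subgraph counts as frequent when the sum of its occurrence probabilities over the current window reaches a threshold. The graph representation, the fixed-size sliding window (used here as a simple stand-in for "limited memory space"), and the names `expected_support`, `is_connected`, `frequent_subgraphs`, and `minsup` are all illustrative assumptions.

```python
from collections import deque
from itertools import combinations
from math import prod

# Assumed representation: one streamed graph is a dict that maps an edge
# (a sorted pair of vertex labels) to its existential probability.

def expected_support(edge_set, window):
    """Sum over the window of the probability that all edges in edge_set
    occur together in a graph (edges are assumed to be independent)."""
    return sum(
        prod(graph[e] for e in edge_set)
        for graph in window
        if all(e in graph for e in edge_set)
    )

def is_connected(edge_set):
    """Return True if the edges form a single connected subgraph."""
    edges = list(edge_set)
    reached = set(edges[0])
    remaining = edges[1:]
    changed = True
    while remaining and changed:
        changed = False
        for e in list(remaining):
            if reached & set(e):
                reached |= set(e)
                remaining.remove(e)
                changed = True
    return not remaining

def frequent_subgraphs(window, minsup, max_edges=2):
    """Brute-force enumeration of connected edge sets whose expected
    support in the window is at least minsup (illustration only)."""
    edges = sorted({e for graph in window for e in graph})
    result = {}
    for k in range(1, max_edges + 1):
        for combo in combinations(edges, k):
            if is_connected(combo):
                support = expected_support(combo, window)
                if support >= minsup:
                    result[combo] = support
    return result

# A window that keeps only the two most recent graphs bounds memory use.
window = deque(maxlen=2)
stream = [
    {("a", "b"): 0.9, ("b", "c"): 0.5},
    {("a", "b"): 0.8, ("c", "d"): 0.4},
    {("a", "b"): 0.5, ("b", "c"): 1.0},
]
for graph in stream:
    window.append(graph)  # the oldest graph is dropped automatically

frequent = frequent_subgraphs(window, minsup=1.0)
for subgraph, support in frequent.items():
    print(subgraph, round(support, 3))
```

The enumeration is exponential in the number of edges and is meant only to make the definition concrete; a practical stream miner would replace it with a compact summary structure and pruning.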

Published in

C3S2E '15: Proceedings of the Eighth International C* Conference on Computer Science & Software Engineering
July 2015
166 pages
ISBN: 9781450334198
DOI: 10.1145/2790798

Copyright © 2015 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article, Research, Refereed limited

Overall acceptance rate: 12 of 42 submissions, 29%
