ABSTRACT
In the current era of big data, high volumes of high-value data (such as social network data) can be generated at a high velocity. The quality and accuracy of these data depend on their veracity, that is, on how uncertain the data are. A collection of these uncertain data can be viewed as a big, interlinked, dynamic graph structure. Embedded in these big data is implicit, previously unknown, and potentially useful knowledge. Hence, efficient and effective knowledge discovery algorithms for mining frequent subgraphs from these dynamic, streaming, graph-structured data are in demand. Most existing algorithms mine frequent subgraphs from streams of precise data. However, in many real-life scientific and engineering applications, data are uncertain. Hence, in this paper, we propose algorithms that use limited memory space for mining frequent subgraphs from streams of uncertain data. Evaluation results show that our algorithms are effective for this task.
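To make the problem setting concrete, the following is a minimal sketch and not the algorithms proposed in the paper. It assumes that each graph snapshot in the stream is a set of edges with existential probabilities, that edges exist independently of one another, and that a pattern is frequent when its expected support (the sum, over snapshots, of the product of its edge probabilities) reaches a threshold. The names `expected_supports`, `frequent_edge_sets`, `minsup`, and `max_size` are hypothetical. For brevity, the sketch treats any set of edges as a candidate pattern and ignores connectivity and isomorphism checks; a fixed-size sliding window stands in for the limited-memory requirement.

```python
from collections import deque
from itertools import combinations
from math import prod


def expected_supports(window, max_size=2):
    """Accumulate the expected support of every edge set of up to max_size edges.

    Each snapshot is a dict mapping an edge (a pair of vertex labels) to the
    probability that the edge exists. Edge independence is an assumption.
    """
    totals = {}
    for snapshot in window:
        edges = sorted(snapshot)
        for k in range(1, max_size + 1):
            for combo in combinations(edges, k):
                # Probability that all edges of the set co-occur in this snapshot
                p = prod(snapshot[e] for e in combo)
                totals[combo] = totals.get(combo, 0.0) + p
    return totals


def frequent_edge_sets(window, minsup, max_size=2):
    """Keep the edge sets whose expected support is at least minsup."""
    supports = expected_supports(window, max_size)
    return {s: v for s, v in supports.items() if v >= minsup}


# The deque retains only the two most recent snapshots, bounding memory use.
window = deque(maxlen=2)
stream = [
    {("a", "b"): 0.9, ("b", "c"): 0.5},
    {("a", "b"): 0.8, ("a", "c"): 0.4},
    {("a", "b"): 0.6, ("b", "c"): 0.7},
]
for snapshot in stream:
    window.append(snapshot)

result = frequent_edge_sets(window, minsup=1.0)
```

With this toy stream, only the two newest snapshots remain in the window, and the single edge ("a", "b") is the only pattern whose expected support (about 1.4) reaches the threshold of 1.0. Enumerating all edge sets grows exponentially with `max_size`, which is why practical miners rely on pruning and compact summary structures instead.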
Index Terms
- Frequent Subgraph Mining from Streams of Uncertain Data