ABSTRACT
In an interconnected and dynamic world, the evolution of one entity may cause a series of significant value changes for some others. For example, the currency inflation of Thailand caused the currency slump of other Asian countries, which eventually led to the financial crisis of 1997. We call such high impact entities shakers. To discover shakers, we first introduce the concept of a cascading graph to capture the causality relationships among evolving entities over some period of time, and then infer shakers from the graph. In a cascading graph, nodes represent entities and weighted links represent the causality effects. In order to find hidden shakers in such a graph, two scoring functions are proposed, each of which estimates how much the target entity can affect the values of some others. The idea is to artificially inject a significant change on the target entity, and estimate its direct and indirect influence on the others, by following an inference rule under the Markovian assumption. Both scoring functions are proven to be only dependent on the structure of a cascading graph and can be calculated in polynomial time. Experiments included three datasets in social sciences. Without directly applicable previous methods, we modified three graphical models as baselines. The two proposed scoring functions can effectively capture those high impact entities. For example, in the experiment to discover stock market shakers, the proposed models outperform the three baselines by as much as 50% in accuracy with the ground truth obtained from Yahoo!~Finance.
- N. Agarwal, H. Liu, L. Tang, and P. S. Yu. Identifying the influential bloggers in a community. In WSDM, pages 207--218, 2008. Google ScholarDigital Library
- World bank dataset. http://data.worldbank.org/.Google Scholar
- BRIC. http://en.wikipedia.org/wiki/bric.Google Scholar
- S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. In WWW, pages 107--117, 1998. Google ScholarDigital Library
- D.J. Fenn, M.A. Porter, P.J. Mucha, M. McDonald, S.Williams, N.F. Johnson, and N.S. Jones. Dynamical clustering of exchange rates. preprint (2010) arXiv:0905.4912v2, 2010.Google Scholar
- Yahoo! finance. http://finance.yahoo.com/.Google Scholar
- Yahoo! finance industry ranking. http://finance.yahoo.com/q/in?s=aaplGoogle Scholar
- industry.Google Scholar
- M. Gomez-Rodriguez, J. Leskovec, and A. Krause. Inferring networks of diffusion and influence. In KDD, pages 1019--1028, 2010. Google ScholarDigital Library
- C. W. J. Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3):424--438, 1969.Google ScholarCross Ref
- D. Heckerman, D. Geiger, and D. M. Chickering. Learning bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197--243, 1995. Google ScholarDigital Library
- D. Kempe, J. Kleinberg, and É. Tardos. Influential nodes in a diffusion model for social networks. In IN ICALP, pages 1127--1138, 2005. Google ScholarDigital Library
- D. Kempe, J. M. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD, pages 137--146, 2003. Google ScholarDigital Library
- M. Kimura, K. Saito, and R. Nakano. Extracting influential nodes for information diffusion on a social network. In AAAI, pages 1371--1376, 2007. Google ScholarDigital Library
- J. M. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, and A. S. Tomkins. The web as a graph: measurements, models, and methods. In COCOON, pages 1--17, 1999. Google ScholarDigital Library
- T. Lappas, E. Terzi, D. Gunopulos, and H. Mannila. Finding effectors in social networks. In KDD, pages 1059--1068, 2010. Google ScholarDigital Library
- J. Leskovec, L. A. Adamic, and B. A. Huberman. The dynamics of viral marketing. In EC, pages 228--237, 2006. Google ScholarDigital Library
- J. Leskovec, M. McGlohon, C. Faloutsos, N. S. Glance, and M. Hurst. Patterns of cascading behavior in large blog graphs. In SDM, pages 551--556, 2007.Google ScholarCross Ref
- D. Li, J. Shen, and J. Xie. Study on representation of time series based on subsection polynomial fitting. In FSKD, pages 1371--1376, 2007. Google ScholarDigital Library
- D. Miles and A. Scott. Macroeconomics: understanding the wealth of nations. John Wiley and Sons., 2nd edition, 2005.Google Scholar
- A. Mueen and E. J. Keogh. Online discovery and maintenance of time series motifs. In KDD, pages 1089--1098, 2010. Google ScholarDigital Library
- A. Mueen, E. J. Keogh, Q. Zhu, S. Cash, and M. B. Westover. Exact discovery of time series motifs. In SDM, pages 473--484, 2009.Google ScholarCross Ref
- S. Papadimitriou, J. Sun, and C. Faloutsos. Streaming pattern discovery in multiple time-series. In VLDB, pages 697--708, 2005. Google ScholarDigital Library
- J. Pearl. Causal inference in statistics: an overview. Statistics Surveys, 3:96--146, 2009.Google ScholarCross Ref
- Country ranking in GDP. http://en.wikipedia.org/wiki/list-of-countries-by-gdp-(nominal).Google Scholar
- C. A. Ratanamahatana, J. Lin, D. Gunopulos, E. J. Keogh, M. Vlachos, and G. Das. Mining time series data. In Data Mining and Knowledge Discovery Handbook, pages 1049--1077. 2010.Google Scholar
- Market reaction on AAPL. http://www.finanzen.net/nachricht/aktien/market-snapshot-stocks-mostly-down-after-apple-goldman-results-1012909.Google Scholar
- G. Shafer. Causal logic. In ECAI, pages 711--720, 1998.Google Scholar
- J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 22(8):888--905, 2000. Google ScholarDigital Library
- Q. Smith. Causation and the logical impossibility of a divine cause. Philosophical Topics, 21(1):169--191, 1996.Google ScholarCross Ref
- Y. Sun, Q. Meng, Z. Chen, and Y. Zhang. Normalizing the polynomial match for the nonlinear signal in transducers. Sensors and Actuators, 125:76--83, 2005.Google ScholarCross Ref
- T. D. Tregarthen and L. Rittenberg. Macroeconomics. Worth Publishers, 2nd edition, 2000.Google Scholar
- Ioannis Tsamardinos. Causal data mining in bioinformatics. ERCIM News, 2007(69), 2007.Google Scholar
- Y. Wang, G. Cong, G. Song, and K. Xie. Community-based greedy algorithm for mining top-k influential nodes in mobile social networks. In KDD, pages 1039--1048, 2010. Google ScholarDigital Library
- D. Wu, Y. Ke, J. X. Yu, P. S. Yu, and L. Chen. Detecting leaders from correlated time series. In DASFAA, pages 352--367, 2010. Google ScholarDigital Library
- B. Yi, N. Sidiropoulos, T. Johnson, H. V. Jagadish, C. Faloutsos, and A. Biliris. Online data mining for co-evolving time sequences. In ICDE, pages 13--22, 2000. Google ScholarDigital Library
Index Terms
- Discovering shakers from evolving entities via cascading graph inference
Recommendations
A Probabilistic Inference Based Approach for Querying Associative Entities in Knowledge Graph
Web and Big DataAbstractQuerying associative entities is to provide top ranked entities in knowledge graph (KG). Many entities are not linked explicitly in KG but actually associated when incorporating outside user-generated data, which could enrich entity associations ...
Unsupervised Graph-Based Entity Resolution for Complex Entities
Entity resolution (ER) is the process of linking records that refer to the same entity. Traditionally, this process compares attribute values of records to calculate similarities and then classifies pairs of records as referring to the same entity or not ...
Comments