DOI: 10.1145/2810103.2813689
research-article

AUTOREB: Automatically Understanding the Review-to-Behavior Fidelity in Android Applications

Published: 12 October 2015

ABSTRACT

With the growing popularity of mobile devices, mobile apps raise severe security and privacy concerns. On Google Play, user reviews offer a unique view of apps' security and privacy issues from the users' perspective; they are valuable feedback that reflects users' expectations. To best assist end users, in this paper we automatically learn the security- and privacy-related behaviors of apps inferred from an analysis of user reviews, a property we call review-to-behavior fidelity. We design AUTOREB, a system that automatically assesses the review-to-behavior fidelity of mobile apps. AUTOREB employs state-of-the-art machine learning techniques to infer the relations between user reviews and four categories of security-related behaviors. Moreover, it uses a crowdsourcing approach to automatically aggregate security issues from the review level to the app level. To our knowledge, AUTOREB is the first work that explores user review information and uses review semantics to predict risky behaviors at both the review level and the app level.

We crawled a real-world dataset of 2,614,186 users, 12,783 apps, and 13,129,783 reviews from Google Play and used it to comprehensively evaluate AUTOREB. The experimental results show that our method can predict mobile app behaviors at the user-review level with accuracy as high as 94.05%, and that it can also predict security issues at the app level by aggregating the review-level predictions. Our research offers insight into mobile app security concerns from the users' perspective and helps bridge the gap between security issues and users' perception.
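
As a rough illustration of the review-to-app aggregation idea described in the abstract, the sketch below combines binary review-level predictions into an app-level verdict with a reliability-weighted vote. This is not AUTOREB's actual crowdsourcing model, which the abstract does not specify. The function name `aggregate_app_level`, the per-user `reliability` weights, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_app_level(predictions, reliability=None, threshold=0.5):
    """Aggregate review-level flags into one verdict per app.

    predictions: iterable of (app_id, user_id, flagged) tuples, where
        flagged is 1 if the review was predicted to report a given
        security-related behavior and 0 otherwise (hypothetical format).
    reliability: optional dict mapping user_id to a non-negative weight;
        users not listed get weight 1.0 (an assumption, not from the paper).
    threshold: minimum weighted fraction of flagged reviews for the app
        to be marked as exhibiting the behavior.
    """
    reliability = reliability or {}
    weighted_flags = defaultdict(float)
    total_weight = defaultdict(float)

    for app_id, user_id, flagged in predictions:
        weight = reliability.get(user_id, 1.0)
        weighted_flags[app_id] += weight * flagged
        total_weight[app_id] += weight

    verdicts = {}
    for app_id, total in total_weight.items():
        # Apps whose reviews all carry zero weight are left unflagged.
        share = weighted_flags[app_id] / total if total > 0 else 0.0
        verdicts[app_id] = share >= threshold
    return verdicts


# Hypothetical review-level predictions for a single behavior category.
example_predictions = [
    ("app1", "u1", 1),
    ("app1", "u2", 1),
    ("app1", "u3", 0),
    ("app2", "u1", 0),
    ("app2", "u4", 1),
]
example_verdicts = aggregate_app_level(example_predictions, reliability={"u4": 0.5})
```

In this toy setup, lowering the weight of a single reviewer can change an app's verdict. That is the kind of effect a crowdsourcing-style aggregation is meant to capture, though a real system would have to estimate reviewer reliability rather than take it as given.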

Published in

CCS '15: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security
October 2015
1750 pages
ISBN: 9781450338325
DOI: 10.1145/2810103

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

CCS '15 paper acceptance rate: 128 of 660 submissions, 19%. Overall acceptance rate: 1,261 of 6,999 submissions, 18%.