DOI: 10.1145/2393347.2396398
poster

Conversationally-inspired stylometric features for authorship attribution in instant messaging

Published: 29 October 2012

Abstract

Authorship attribution (AA) aims to automatically recognize the author of a given text sample. Traditionally applied to literary texts, AA now faces the new challenge of recognizing the identity of people involved in chat conversations. Chat conversations share many aspects with spoken conversations, but AA approaches have not taken this into account so far. This paper tries to fill the gap and proposes two novelties that improve the effectiveness of traditional AA approaches for this type of data: the first is to adopt features inspired by Conversation Analysis (in particular, turn-taking); the second is to extract the features from individual turns rather than from entire conversations. The experiments were performed on a corpus of dyadic chat conversations (77 individuals in total). The performance in identifying the persons involved in each exchange, measured as the area under the Cumulative Match Characteristic curve, is 89.5%.
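
The evaluation measure named in the abstract can be illustrated with a minimal sketch. It assumes a generic identification setup that the abstract does not spell out: each probe sample is compared against a gallery of known identities, producing one dissimilarity score per identity. The function names `cmc_curve` and `normalized_auc`, the toy distance matrix, and the optimistic handling of ties are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def cmc_curve(distances: Sequence[Sequence[float]],
              true_ids: Sequence[int]) -> List[float]:
    """Cumulative Match Characteristic curve.

    distances[p][g] is the dissimilarity between probe p and gallery
    identity g; true_ids[p] is the gallery index of probe p's real author.
    Entry k-1 of the result is the fraction of probes whose correct
    identity appears among the k closest gallery identities.
    """
    n_probes = len(distances)
    n_gallery = len(distances[0])
    hits = [0] * n_gallery
    for row, true_id in zip(distances, true_ids):
        # 0-based rank of the correct identity: how many gallery entries
        # are strictly closer (ties resolved optimistically, an assumption).
        rank = sum(1 for d in row if d < row[true_id])
        hits[rank] += 1
    curve = []
    cumulative = 0
    for h in hits:
        cumulative += h
        curve.append(cumulative / n_probes)
    return curve


def normalized_auc(curve: Sequence[float]) -> float:
    """Area under the CMC curve, normalized to [0, 1] by the gallery size."""
    return sum(curve) / len(curve)


# Toy example (hypothetical values): 3 probes, 3 gallery identities.
toy_distances = [
    [0.1, 0.5, 0.9],  # probe 0, true author 0: ranked first
    [0.4, 0.2, 0.7],  # probe 1, true author 1: ranked first
    [0.3, 0.6, 0.8],  # probe 2, true author 2: ranked third
]
toy_curve = cmc_curve(toy_distances, [0, 1, 2])
print(toy_curve, round(normalized_auc(toy_curve), 4))
```

Under this reading, a normalized area of 1.0 would mean every probe is matched to its author at rank 1, so a single number summarizes how quickly the correct identity surfaces across all ranks.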

    Published In

    MM '12: Proceedings of the 20th ACM international conference on Multimedia
    October 2012
    1584 pages
    ISBN:9781450310895
    DOI:10.1145/2393347

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. authorship attribution
    2. chat analysis
    3. re-identification

    Qualifiers

    • Poster

    Conference

    MM '12
    MM '12: ACM Multimedia Conference
    October 29 - November 2, 2012
    Nara, Japan

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Cited By

    • (2024) Recognition algorithm for cross-texting in text chat conversations. Data & Knowledge Engineering 150 (102261). DOI: 10.1016/j.datak.2023.102261. Online publication date: Mar-2024
    • (2023) Machine Learning-Based Respiration Rate and Blood Oxygen Saturation Estimation Using Photoplethysmogram Signals. Bioengineering 10:2 (167). DOI: 10.3390/bioengineering10020167. Online publication date: 28-Jan-2023
    • (2023) TDRLM: Stylometric learning for authorship verification by Topic-Debiasing. Expert Systems with Applications 233 (120745). DOI: 10.1016/j.eswa.2023.120745. Online publication date: Dec-2023
    • (2022) There is a fine Line between Personalization and Surveillance: Semantic User Interest Tracing via Entity-level Analytics. Proceedings of the 14th ACM Web Science Conference 2022 (22-33). DOI: 10.1145/3501247.3531592. Online publication date: 26-Jun-2022
    • (2022) Automatically Estimating the Severity of Multiple Symptoms Associated with Depression. Early Detection of Mental Health Disorders by Social Media Monitoring (247-261). DOI: 10.1007/978-3-031-04431-1_11. Online publication date: 15-Sep-2022
    • (2022) Studying Dishonest Intentions in Brazilian Portuguese Texts. Deceptive AI (166-178). DOI: 10.1007/978-3-030-91779-1_12. Online publication date: 1-Jan-2022
    • (2021) Attribution of authorship in instant messaging software applications, based on similarity measures of the stylometric features’ vector. Computer Science and Mathematical Modelling (33-41). DOI: 10.5604/01.3001.0015.2735. Online publication date: 30-Jun-2021
    • (2021) Writer Identification Using Microblogging Texts for Social Media Forensics. IEEE Transactions on Biometrics, Behavior, and Identity Science 3:3 (405-426). DOI: 10.1109/TBIOM.2021.3078073. Online publication date: Jul-2021
    • (2021) I Feel it in Your Fingers: Inference of Self-Assessed Personality Traits from Keystroke Dynamics in Dyadic Interactive Chats. 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII) (1-8). DOI: 10.1109/ACII52823.2021.9597389. Online publication date: 28-Sep-2021
    • (2021) A Novel Non-Invasive Estimation of Respiration Rate From Motion Corrupted Photoplethysmograph Signal Using Machine Learning Model. IEEE Access 9 (96775-96790). DOI: 10.1109/ACCESS.2021.3095380. Online publication date: 2021
