Mohammad Hammas Saeed
Postdoctoral Associate
George Washington University
Area of Expertise: Applied AI for Cybersafety
Mohammad Hammas Saeed is a postdoctoral associate at George Washington University. His research focuses on the intersection of applied AI/machine learning, cybersafety, and computational social science, emphasizing the understanding, detection, and mitigation of online harms through a data-driven approach. Saeed identifies pressing issues in cyberspace and develops tools to counter malicious behavior.
Featured Publications
Mohammad Hammas Saeed, Shiza Ali, Pujan Paudel, Jeremy Blackburn, and Gianluca Stringhini. 2024. Unraveling the Web of Disinformation: Exploring the Larger Context of State-Sponsored Influence Campaigns on Twitter. In Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses (RAID '24). Association for Computing Machinery, New York, NY, USA, 353–367.
Social media platforms offer unprecedented opportunities for connectivity and exchange of ideas; however, they also serve as fertile grounds for the dissemination of disinformation. Over the years, there has been a rise in state-sponsored campaigns aiming to spread disinformation and sway public opinion on sensitive topics through designated accounts, known as troll accounts. Past work on detecting accounts belonging to state-backed operations focuses on a single campaign. While campaign-specific detection techniques are easier to build, no prior work has developed campaign-agnostic systems that offer generalized detection of troll accounts, unaffected by the biases of the specific campaign they belong to. In this paper, we identify several strategies adopted across different state actors and present a system that leverages them to detect accounts from previously unseen campaigns. We study 19 state-sponsored disinformation campaigns that took place on Twitter, originating from various countries. The strategies include sending automated messages through popular scheduling services, retweeting and sharing selective content, and using fake versions of verified applications to push content. By translating these traits into a feature set, we build a machine-learning-based classifier that can correctly identify up to 94% of accounts from unseen campaigns. Additionally, we run our system in the wild and find additional accounts that could potentially belong to state-backed operations. We also present case studies to highlight the similarity between the accounts found by our system and those identified by Twitter.
Full Paper
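For readers curious how this kind of trait-based detection might look in code, the sketch below shows a leave-one-campaign-out evaluation of a simple classifier over per-account features. The feature names (scheduler_ratio, retweet_ratio, suspicious_client_ratio), the client lists, the random forest model, and the data layout are all illustrative assumptions, not the paper's actual pipeline.

```python
# Illustrative sketch, not the paper's pipeline: leave-one-campaign-out
# evaluation of a trait-based troll-account classifier. Feature names, client
# lists, and the random forest are assumptions made for illustration only.
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestClassifier

SCHEDULING_CLIENTS = {"dlvr.it", "IFTTT", "twittbot.net"}   # example scheduling apps
OFFICIAL_CLIENTS = {"Twitter for iPhone", "Twitter for Android", "Twitter Web App"}

def account_features(tweets):
    """Turn one account's tweets into a small feature vector.

    Each tweet is assumed to be a dict with a 'source' (client app name) and an
    'is_retweet' flag; real Twitter data would need richer parsing.
    """
    n = max(len(tweets), 1)
    source_counts = Counter(t["source"] for t in tweets)
    scheduler_ratio = sum(source_counts[s] for s in SCHEDULING_CLIENTS) / n
    retweet_ratio = sum(bool(t["is_retweet"]) for t in tweets) / n
    # Tweets posted from clients that are neither scheduling services nor
    # official apps: a crude stand-in for "fake versions of verified applications".
    suspicious_client_ratio = sum(
        c for s, c in source_counts.items()
        if s not in SCHEDULING_CLIENTS and s not in OFFICIAL_CLIENTS
    ) / n
    return [scheduler_ratio, retweet_ratio, suspicious_client_ratio]

def leave_one_campaign_out(accounts):
    """accounts: list of (campaign_id, label, tweets) tuples. Train on all
    campaigns except one and test on the held-out one, mimicking detection
    of accounts from an unseen campaign."""
    for held_out in sorted({c for c, _, _ in accounts}):
        train = [(account_features(t), y) for c, y, t in accounts if c != held_out]
        test = [(account_features(t), y) for c, y, t in accounts if c == held_out]
        X_tr, y_tr = map(np.array, zip(*train))
        X_te, y_te = map(np.array, zip(*test))
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(X_tr, y_tr)
        print(f"held-out campaign {held_out}: accuracy {clf.score(X_te, y_te):.2f}")
```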
Saeed, M.H., Papadamou, K., Blackburn, J., De Cristofaro, E. and Stringhini, G. 2024. TUBERAIDER: Attributing Coordinated Hate Attacks on YouTube Videos to Their Source Communities. Proceedings of the International AAAI Conference on Web and Social Media. 18, 1 (May 2024), 1354-1366.
Alas, coordinated hate attacks, or raids, are becoming increasingly common online. In a nutshell, these are perpetrated by a group of aggressors who organize and coordinate operations on a platform (e.g., 4chan) to target victims on another community (e.g., YouTube). In this paper, we focus on attributing raids to their source community, paving the way for moderation approaches that take the context (and potentially the motivation) of an attack into consideration. We present TUBERAIDER, an attribution system achieving over 75% accuracy in detecting and attributing coordinated hate attacks on YouTube videos. We instantiate it using links to YouTube videos shared on 4chan's /pol/ board, r/The_Donald, and 16 Incels-related subreddits. We use a peak detector to identify a rise in the comment activity of a YouTube video, which signals that an attack may be occurring. We then train a machine learning classifier based on the community language (i.e., TF-IDF scores of relevant keywords) to perform the attribution. We test TUBERAIDER in the wild and present a few case studies of actual aggression attacks identified by it to showcase its effectiveness.
Full Paper
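The two-stage idea described in the abstract (detect a spike in a video's comment activity, then attribute it to a source community from its language) can be sketched as follows. The z-score peak rule, the TF-IDF plus logistic-regression classifier, and all parameter values are assumptions for illustration, not TUBERAIDER's actual configuration.

```python
# Minimal sketch of a two-stage detect-then-attribute pipeline, not TUBERAIDER
# itself: (1) flag spikes in hourly comment counts, (2) attribute the spike to a
# source community with a TF-IDF text classifier.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def find_peaks(hourly_counts, z_threshold=3.0):
    """Return indices of hours whose comment volume is unusually high
    (a simple z-score rule standing in for the paper's peak detector)."""
    counts = np.asarray(hourly_counts, dtype=float)
    mean, std = counts.mean(), counts.std() + 1e-9
    return np.where((counts - mean) / std > z_threshold)[0]

def train_attributor(posts, communities):
    """posts: raw post texts from known communities (e.g., /pol/, r/The_Donald,
    Incels-related subreddits); communities: the matching labels."""
    model = make_pipeline(
        TfidfVectorizer(lowercase=True, stop_words="english", max_features=20000),
        LogisticRegression(max_iter=1000),
    )
    model.fit(posts, communities)
    return model

# Usage sketch: once a peak is flagged, classify the comments posted during it.
# peak_hours = find_peaks(comments_per_hour)
# attributor = train_attributor(training_posts, training_labels)
# predicted_source = attributor.predict([" ".join(peak_window_comments)])
```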
Paudel, P., Saeed, M. H., Auger, R., Wells, C., & Stringhini, G. (2024). Enabling contextual soft moderation on social media through contrastive textual deviation. In 33rd USENIX Security Symposium (USENIX Security 24) (pp. 4409-4426).
Automated soft moderation systems are unable to ascertain if a post supports or refutes a false claim, resulting in a large number of contextual false positives. This limits their effectiveness, for example by undermining trust in health experts when warnings are added to their posts, or by resorting to vague warnings instead of granular fact-checks, which desensitizes users. In this paper, we propose to incorporate stance detection into existing automated soft-moderation pipelines, with the goal of ruling out contextual false positives and providing more precise recommendations for social media content that should receive warnings. We develop a textual deviation task called Contrastive Textual Deviation (CTD), and show that it outperforms existing stance detection approaches when applied to soft moderation. We then integrate CTD into Lambretta, the state-of-the-art system for automated soft moderation, showing that our approach can reduce contextual false positives from 20% to 2.1%, providing another important building block towards deploying reliable automated soft moderation tools on social media.
Full Paper
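The sketch below illustrates the kind of stance-aware filtering the paper argues for: a warning is attached only when a post appears to support a debunked claim, so posts that refute it are not flagged. It uses a generic off-the-shelf NLI model (facebook/bart-large-mnli) as a stand-in for CTD; the model choice, the threshold, and the should_warn/attach_warning names are hypothetical, not the paper's implementation.

```python
# Illustrative stand-in for stance-aware soft moderation, using a generic NLI
# model rather than the paper's Contrastive Textual Deviation method.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def should_warn(post: str, false_claim: str, threshold: float = 0.7) -> bool:
    """Return True only if the post appears to entail (i.e., support) the false
    claim. Posts that refute the claim are treated as contextual false positives
    and skipped."""
    inputs = tokenizer(post, false_claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits.softmax(dim=-1)[0]
    entail_idx = model.config.label2id.get("entailment", 2)
    return probs[entail_idx].item() >= threshold

# Usage sketch (attach_warning is a hypothetical moderation hook):
# if should_warn(post_text, "5G towers cause COVID-19"):
#     attach_warning(post_id)
```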