Degree
Doctor of Philosophy (PhD)
Department
Computer Science
Document Type
Dissertation
Abstract
Bot-driven abuse and human-assisted automation have become persistent threats to modern web services, enabling fraud, impersonation, scraping, and large-scale misuse of online resources. While defenses such as CAPTCHAs and browser fingerprinting are widely deployed to detect automated activity, attackers increasingly rely on CAPTCHA farms, residential proxies, and other evasion techniques to bypass these protections. This dissertation investigates the problem of in-the-wild bot attacks from both a measurement and mitigation perspective, with the goal of enabling real-time visibility into abuse ecosystems and designing practical defenses that remain effective against modern attacker capabilities.
The first part of this dissertation introduces two large-scale measurement systems designed to capture real-world abuse activity. The first system leverages the operational protocols of modern behavioral CAPTCHAs and human-driven CAPTCHA farms to collect the first global, real-time dataset of CAPTCHA-targeted attacks across thousands of websites. Using this vantage point, the study characterizes over four hundred thousand attacks spanning diverse abuse categories, including ticket scalping, account fraud, content scraping, and automated abuse of public services. The second measurement study focuses on human-driven impersonation attacks on social media platforms, revealing attacker persistence, reuse of payment infrastructure, and adaptive evasion strategies that enable real-time victim targeting.
Building on insights from these measurements, the second part of this dissertation focuses on strengthening practical defenses against bot-driven abuse. Browser fingerprinting is widely used in bot detection, but automated clients increasingly attempt to evade these signals by manipulating fingerprinting surfaces such as the HTML5 canvas element.
We systematically analyze canvas evasions implemented by browser extensions and privacy-focused browsers, evaluating nine extensions and the built-in protections of five major browsers. Based on this analysis, we introduce two defensive techniques (Pixel-Recovery and Statistical Recovery) that enable websites to recover stable canvas-derived signals despite rendering perturbations.
Finally, this dissertation introduces HWAM, a hardware-backed web authentication mechanism that strengthens CAPTCHA-based defenses against human-assisted bot attacks. HWAM leverages the Web Authentication API (WebAuthn) to bind CAPTCHA challenges and tokens to physical devices using trusted execution environments. By requiring fresh device-bound proofs for protected actions, HWAM prevents attackers from relaying solved challenges across distributed CAPTCHA farm networks and substantially increases the operational cost of large-scale abuse while preserving usability for legitimate users.
Together, these contributions provide a holistic view of the modern web abuse ecosystem, combining large-scale empirical measurement with practical mitigation strategies. This dissertation demonstrates that effective defenses against bot-driven abuse require not only stronger mechanisms but also accurate, in-the-wild understanding of attacker behavior, enabling security systems to adapt alongside evolving abuse techniques.
Date
3-6-2026
Recommended Citation
Nguyen, Hoang Dai, "Real-Time Tracking and Mitigation Against In-the-Wild Bot Attacks and Human-Driven Abuse on the Web" (2026). LSU Doctoral Dissertations. 7003.
https://repository.lsu.edu/gradschool_dissertations/7003
Committee Chair
Vadrevu, Phani