Email Babel: Does Language Affect Criminal Activity in Compromised Webmail Accounts?


Abstract in English

We set out to understand the effects of differing language on the ability of cybercriminals to navigate webmail accounts and locate sensitive information in them. To this end, we configured thirty Gmail honeypot accounts with English, Romanian, and Greek language settings. We populated the accounts with email messages in those languages by subscribing them to selected online newsletters. We hid email messages about fake bank accounts in fifteen of the accounts to mimic real-world webmail users that sometimes store sensitive information in their accounts. We then leaked credentials to the honey accounts via paste sites on the Surface Web and the Dark Web, and collected data for fifteen days. Our statistical analyses on the data show that cybercriminals are more likely to discover sensitive information (bank account information) in the Greek accounts than the remaining accounts, contrary to the expectation that Greek ought to constitute a barrier to the understanding of non-Greek visitors to the Greek accounts. We also extracted the important words among the emails that cybercriminals accessed (as an approximation of the keywords that they searched for within the honey accounts), and found that financial terms featured among the top words. In summary, we show that language plays a significant role in the ability of cybercriminals to access sensitive information hidden in compromised webmail accounts.

Download