Ana Mićković | 2024
Accounting fraud destroys billions in shareholder value and undermines trust in markets. Traditional detection methods rely on financial ratios or simple text measures, but these often miss subtle cues. We wanted to know: can advanced language models that capture context in financial reports help regulators and investors detect fraud earlier and more reliably?
We trained BERT, a state-of-the-art deep learning model for language, on the Management's Discussion and Analysis (MD&A) sections of annual 10-K filings. We found that contextual learning substantially improves fraud detection. At the same investigation sample size, our model identifies five times more fraudulent filings than standard text methods and three times more than traditional financial ratios.
Fraud detection is costly, and regulators cannot investigate every firm. A model that ranks risky firms more accurately can direct attention to where it matters most. By showing that context in financial reports carries powerful signals of misreporting, our findings demonstrate the potential of AI tools to support regulators, investors, and auditors in safeguarding markets.
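To make the idea of a fixed investigation budget concrete, here is a minimal sketch of how two ranking methods could be compared by counting the fraudulent filings each one places among its top k highest-risk filings. The function name `frauds_in_top_k`, the labels, and both sets of scores are hypothetical toy values chosen for illustration; they are not data or code from the study.

```python
def frauds_in_top_k(scores, labels, k):
    """Count fraudulent filings among the k filings with the highest risk score.

    scores: predicted fraud risk per filing (higher means riskier).
    labels: 1 if the filing was later identified as fraudulent, else 0.
    k:      number of filings that can be investigated (the budget).
    """
    # Rank filing indices from highest to lowest predicted risk.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Sum the fraud labels of the k highest-ranked filings.
    return sum(labels[i] for i in ranked[:k])


# Hypothetical toy data: 10 filings, 3 of them fraudulent.
labels = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]

# Hypothetical risk scores from two different methods for the same filings.
method_a = [0.90, 0.20, 0.10, 0.80, 0.30, 0.05, 0.70, 0.15, 0.25, 0.40]
method_b = [0.30, 0.90, 0.80, 0.20, 0.10, 0.60, 0.70, 0.50, 0.40, 0.05]

k = 3  # assumed budget: only three filings can be investigated
captured_a = frauds_in_top_k(method_a, labels, k)
captured_b = frauds_in_top_k(method_b, labels, k)
print(captured_a, captured_b)
```

Holding k fixed for every method keeps the comparison fair: each method gets the same number of investigations, so the only difference is how well it ranks the truly fraudulent filings near the top.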
Bhattacharya, I., & Mickovic, A. (2024). Accounting fraud detection using contextual language learning. International Journal of Accounting Information Systems 53: 100682.