How can independent researchers reliably detect bias, discrimination, and other systematic errors in software-based decision-making systems?

In a project with Austin Hounsel (Princeton) and Nicholas Feamster (Chicago), we prototyped software for volunteers to participate in a crowd-sourced audit, using tech companies' political advertising policies as our case study.

Facebook, Google, and Twitter review millions of ads to protect American elections from unlawful influence and to disclose legitimate campaign activity. But how often do they make mistakes? We submitted hundreds of non-election ads to Facebook and Google, recorded their responses, and analyzed the results. (Findings from this study apply only to the 2018 US midterm election.)
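
As a rough illustration of the record-keeping this kind of audit involves, here is a minimal sketch in Python. The field names and the `tally_decisions` helper are hypothetical, chosen for illustration; they are not the study's actual software or schema.

```python
# Hypothetical sketch of how each submitted ad and the platform's decision
# could be recorded for later analysis; not the study's actual code or schema.
from dataclasses import dataclass
from collections import Counter

@dataclass
class AdSubmission:
    platform: str      # e.g. "Facebook" or "Google"
    topic: str         # e.g. "national parks", "veterans day"
    leaning: str       # e.g. "left", "right", or "neutral"
    decision: str      # platform's response: "approved" or "blocked"

def tally_decisions(submissions):
    """Count how often each platform approved or blocked the submitted ads."""
    return Counter((s.platform, s.decision) for s in submissions)

# Example usage with placeholder records.
records = [
    AdSubmission("Facebook", "national parks", "neutral", "blocked"),
    AdSubmission("Google", "national parks", "neutral", "approved"),
]
print(tally_decisions(records))
```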

We found evidence of systematic errors by Facebook, which blocked civically meaningful ads about national parks, churches, and Veterans Day celebrations that weren't election-related. We did not find statistically significant differences in how often Google or Facebook blocked left-leaning versus right-leaning content.
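
To illustrate the kind of comparison behind a claim like that, here is a minimal sketch using Fisher's exact test from SciPy. The counts below are placeholder values, not data from the study, and the study's actual analysis may have used a different test.

```python
# Illustrative comparison of block rates for left- vs right-leaning ads.
# The counts are hypothetical placeholders, not results from the study.
from scipy.stats import fisher_exact

# Rows: left-leaning, right-leaning ads; columns: blocked, allowed (hypothetical).
table = [
    [12, 88],   # left-leaning: blocked, allowed
    [15, 85],   # right-leaning: blocked, allowed
]

odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
if p_value >= 0.05:
    print("No statistically significant difference at the 0.05 level.")
```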

We also published a computer science article about the design challenge of creating software to support crowd-sourced audits of algorithmic decision-makers:

More About the Software for Volunteer, Crowd-Sourced Audits

More About Our Audit of Facebook and Google