Who Broke Mechanical Turk?
Catherine Marshall
Crowdsourcing platforms provide a valuable way to perform a wide range of human intelligence tasks — e.g., data labeling, content moderation, text translation, citizen science — as well as a convenient venue for collecting participant data. I’ve been using Amazon Mechanical Turk in various capacities since 2010, and have followed worker forums, labor organizing efforts, and the development of worker-centered tools (on one side) and increasingly sophisticated uses of the crowd (on the other). Early on, my colleagues and I were (perhaps naively) delighted by the quality of the data we gathered and by generally positive interactions we had with workers. Using practical advice from the literature, we were able to vet work and encourage good-faith participation in our studies.
More recently, a handful of researchers from diverse disciplines who use crowdsourcing platforms have described an uptick in unusable data from US-based workers. Frank Shipman and I saw this ourselves in 2018 and 2019 when we re-ran a survey we’d used successfully five years earlier: by 2019, we had to exclude more than 12% of the completed HITs according to our established cleaning heuristics. Even knowing this, what we saw on Mechanical Turk this spring and summer startled us. Almost 90% of the data was unusable. In this talk, I’ll use a preliminary analysis of our own and other researchers’ data in an effort to explain what seems to be happening on Mechanical Turk, present evidence that it’s not necessarily a symptom of bots, autocompletion tools, or bad-faith work, and speculate about why Amazon has little incentive to do anything about it.
This seminar will be held both online and in person. You are welcome to join us either in South Hall or via Zoom.
For online participants
Online participants must have a Zoom account and be logged in. If you do not have an account, sign up for a free one beforehand. If this is your first time using Zoom, please allow a few extra minutes to download and install the browser plugin or mobile app.
Cathy Marshall is an adjunct professor in the Department of Computer Science and Engineering at Texas A&M University. She was previously a principal researcher at Microsoft Research, Silicon Valley.