Social Science / Data Science

Doing Academic Research with Amazon Mechanical Turk

Amazon Mechanical Turk (MTurk) has become increasingly popular as an online tool for conducting social science research. What are the specific advantages and downsides of using online crowdsourcing tools like MTurk for conducting research? What practical and/or moral dilemmas might emerge in the course of the research process, and what concrete strategies have scientists developed to address them?

Presented as part of the Social Sciences and Data Science event series, co-sponsored by the UC Berkeley D-Lab, a panel discussion recorded on October 1, 2021, brought together researchers from diverse disciplines, who shared their experiences with the MTurk platform and discussed the social and ethical aspects of MTurk more generally.

Moderated by Serena Chen, Professor and Chair of Psychology and the Marian E. and Daniel E. Koshland, Jr. Distinguished Chair for Innovative Teaching and Research at UC Berkeley, the panel featured Ali Alkhatib, Interim Director of the Center for Applied Data Ethics at the University of San Francisco; Stefano DellaVigna, Daniel Koshland, Sr. Distinguished Professor of Economics and Professor of Business Administration at UC Berkeley; and Gabriel Lenz, Professor of Political Science at UC Berkeley.

“MTurk has been a huge boon to the social sciences in general, partly because along with a lot of other online platforms, it has reduced the cost, especially the administrative costs, of running experiments,” Lenz said. “MTurk has lots of issues you all should be aware of. But it’s still been a net positive and helped us understand real-world problems and real-world behaviors.”

Lenz said that researchers should be wary of assuming MTurk provides a representative sample of large populations, though he noted that there may be some predictability in when and how MTurk is not representative, based on what is known about the platform’s “worker” population.

“Demographically, this is not a representative sample of the US population, and you should never treat it that way,” Lenz said. “If you’re hoping to generalize your findings to the US population, don’t. But the argument for it is that it’s a more diverse sample than your typical lab sample.”

Lenz also warned that researchers should be attuned to bias based on “social desirability,” as MTurk survey respondents may not input their honest opinions. There may also be bias due to workers’ high level of exposure to information about certain topics, such as politics. He recommended using real-world examples, rather than hypotheticals, to encourage more candid responses. “Try to use Mechanical Turk in ways that you know will reflect more on the real world,” Lenz advised. “For example, we always try to ask people about their actual members of Congress when we’re doing studies on voting.”

One of the trade-offs of using a paid survey service such as Mechanical Turk, Lenz noted, is that the more you pay, the more respondents appear to cheat or use bots to shortcut the survey process. “You want to pay people more, but you don’t want people trying to do the study many times,” Lenz said. “Everybody struggles with this.”

In his talk, Stefano DellaVigna discussed how MTurk has made it more efficient to replicate studies without a high investment. “It is wonderful to be able to have this quick access to obtain data and evaluate replicability,” DellaVigna said.

He also praised the platform for enabling research during the pandemic, and for allowing graduate students to conduct small-scale studies to gather initial results; he shared an anecdote about a PhD student who came up with a question and ran a study on MTurk in a matter of hours. “It is so empowering and lowers inequality in access to study samples,” he said.

In his talk, Ali Alkhatib from the Center for Applied Data Ethics explained that he is less a user of MTurk than a researcher focused on understanding the workers behind the platform. “I have been studying the crowd workers themselves, and what they are experiencing as they engage with these platforms,” Alkhatib explained.

He noted that researchers should keep in mind the circumstances of the workers on MTurk and similar platforms, who often are struggling to make a living. He noted that, if the workers are in communication with each other, it may be because “they’re not trying to game the system; they’re just trying to not get stiffed. These workers are highly networked and talking with each other and trying to exchange notes.”

He also explained that researchers should work to build trust in the MTurk community, and gain an understanding of how the platform works before diving in. “Mechanical Turk is very much a community, very much a culture,” he said. “Think of this as a relationship that you try to foster and build and nurture, because these are people, and as much as we would like to think that they pass through and are stateless, the reality is that they are human beings who are just as affected by the research and the treatments that we bring to them as anybody else.”

Alkhatib said that researchers should be “as clear as possible” and “as communicative as possible,” while also trying to be “as humane as possible to the people that we’re working with. It also leads you to a much richer sort of understanding of why you get certain findings or why things don’t necessarily add up.”

“Mechanical Turk is not a panacea,” Alkhatib said. “It doesn’t solve all the problems, but it solves some of them, or it may ameliorate some of them. But we do need to be conscious of how it shifts other problems around as well.”
