Today’s social scientists have access to a diverse toolbox of software for analyzing data, but when it comes to gathering the data in the first place, surprisingly few standardized solutions are available.
To tackle this challenge, Social Science Matrix is sponsoring a year-long research seminar called “T4M,” or “Technology for Measurement,” which seeks to examine the landscape of existing data-gathering tools, with an eye toward developing new approaches and training for measuring social data in all its forms.
“There has been an enormous surge in interest in data science, but that is generally focused on data that is already collected and in a nice format,” says Dav Clark, a data scientist with UC Berkeley’s D-Lab, who is leading the Matrix seminar. “Particularly in the social sciences, everything up to that point is where the hard work is. You can find small communities that have written software that does what you want, but it’s hard to adapt and integrate. This [seminar] is meant to be a survey of different approaches that people are using for data collection and management.”
The seminar continues the work of the Technology for Measurement (T4M) workgroup, which brings together five campus entities, namely D-Lab, the Berkeley Institute for Data Science (BIDS), X-Lab, the Big Data Psychology community, and the Center for Effective Global Action (CEGA), to advance the application of new technology solutions in field research settings. The seminar was inspired in part by a previous Matrix seminar on Behavior Measurement and Change, which explored broadly how digital technologies are changing opportunities for social scientists to observe (and change) behavior in a variety of settings.
The T4M group is focusing on the practical, hands-on challenges of gathering data in the first place. “The goal is to bring people up to speed on what to us appear to be the most useful pieces of technology, and in some cases, working toward interoperation,” Clark explains. “There has been an enormous proliferation of technology that fits a variety of needs, but there has been zero attention to interoperation. There is still a significant amount of work for the researcher to integrate to get the picture they want to see.”
During the first part of the seminar (in Spring 2015), participants explored the landscape of existing solutions, focusing on four areas: environmental sensing, health care-related data collection, educational measurement, and development engineering. While each of these domains has its own challenges, all are plagued by shared issues, such as a lack of standardization. “They all have this measurement problem,” Clark explains. “For data analysis, people have figured out core algorithms and data structures that are very broadly useful, and they have standardized them. That’s the kind of thing we’re hoping will happen with this sort of data collection and management.”
As part of their learning, the group brought in topical experts, such as Tony Fountain of UC San Diego, who conducts research into sensor network applications in such areas as ecology, oceanography, and civil engineering. “It was a great meeting where we managed to get people who are interested in environmental sensing, social interaction, development engineering, and behavior change intervention all describing their viewpoints,” Clark says. “It really was clear that there are some shared technical problems, and it is particularly challenging when you want to start crossing boundaries.”
Following the first semester’s exploratory process, Clark and his fellow seminar participants have begun developing their own basic tool for real-time data collection. “We wanted to implement something ourselves so we understand the nuts and bolts,” he says.
In addition, the group is developing a credit course on “Hacking Measurement” that will be delivered through the UC Berkeley School of Information in Fall 2015. The goal is to help social scientists learn to “hack” (i.e., adapt) existing solutions for their respective social-data needs. “It’s a first whack at something that the university should be offering as part of training the next generation of scientists,” Clark says. “Researchers have a combination of traditional academic domain knowledge and statistical skills, but no one really knows how to teach ‘hacking skills’ in an academic context.”
One of the key goals of the Matrix seminar is to help improve measurement tools for use in resource-constrained settings, including developing nations, in part by eliminating the need for surveys and other self-reporting methods. Potential projects include using sensors to monitor the use of public goods, such as latrines, water pumps, or handwashing stations; using satellite imagery and geospatial mapping to improve farming; enhancing humanitarian responses to natural disasters; and tackling environmental challenges such as deforestation or desertification.
“All of a sudden we have the ability to really cheaply and efficiently measure what’s going on in a variety of contexts, but it’s really hard to take advantage of that, because we lack the skill and we lack the basic orientation,” Clark explains. “People are struggling with these one-off tools that are specialized in a way that doesn’t cover what they need to study. That’s happening in all of these domains. This seminar is very cross-disciplinary, and it is letting them join forces.”