Big data, big implications: 2020 Clifford Symposium tackles data across disciplines



This year’s Clifford Symposium, titled “The Rise of Big Data,” focused on the growth of data across disciplines and its importance for the future of learning at Middlebury. The conference was held over Zoom on Thursday, Sept. 24.


‘Truth in Numbers? Data in Environmental Science’

By Tony Sjodin

The event kicked off with a panel discussion featuring the four professors who teach the core classes of the Environmental Studies major. Director of the Environmental Studies program Dan Brayton moderated the panel, and the Howard E. Woodin Colloquium series co-sponsored the event.

Joseph Holler, assistant professor of geography, spoke about replicability in scientific studies and the issues surrounding transparency in research methods. Holler highlighted a proposed Environmental Protection Agency (EPA) rule that would require the data and methods of research used for EPA regulations to be made public so that others can reproduce the studies. He also raised the challenges that health-data privacy and intellectual property pose to making research more transparent.

Holler presented his own research that used data from 2017, when emergency call centers were overwhelmed during Hurricane Harvey and people took to Twitter to ask for help. 

“I’m really concerned about breaking the science-policy relationship because of the way that we treat data computationally in our research and trying to achieve this standard of reproducibility while also preserving privacy,” Holler said.

Marc Lapin, associate laboratory professor of environmental studies, discussed the importance of pairing quantitative understanding with qualitative understanding. Lapin cited an analysis of carbon sequestration — the storage of carbon dioxide — in trees and spoke about how such data can be used to inform conservation efforts when paired with human knowledge.

“These are things that some may claim can be coded and quantified and put into these more analytical algorithmic tools, but … our human minds are the best integrators of all this information,” Lapin said.

Kathryn Morse, John C. Elder professor of environmental studies, presented research on the first New Deal and on case files from families who applied for help to save their farms.

“The files are full of data … as a result I’ve gotten interested in how New Deal bureaucrats made sense of all of this, what they used it for, and how they grew meaning from it,” Morse said.

Morse shared maps created from this data that display various metrics of New Deal programs and discussed how the data was used to communicate with the public and Congress.

Christopher McGrory Klyza, Stafford professor of public policy, political science, and environmental studies, spoke about the Clean Air Act and the use of data for setting ambient air quality standards.

“The EPA goes through a multi-phase process to examine the latest public health science. In this process, the data is central and the EPA is typically reliant on scientific studies, including many large epidemiological studies,” Klyza said.


‘Clifford Symposium Welcome and Launch of MiddData’

By Genny Gottdiener 

Jason Grant, assistant professor of computer science, and Alex Lyford, assistant professor of mathematics, introduced the symposium’s second event. Panelists from a variety of departments, including environmental studies, economics and sociology, introduced the MiddData initiative, a project that aims to make data usage less intimidating.

President Laurie Patton gave a speech about the power that data science has to unite people. The exploration of data science will serve as a way to honor the symposium’s namesake, Nick Clifford, a professor emeritus of history who passed away last year, according to Patton.

“MiddData is a major new initiative to provide equitable and inclusive access to powerful tools for empirical research, data analysis and critical digital scholarship from the moment students arrive on campus,” said Morse, a co-director of the MiddData initiative.

Although the initiative is still in the works, the group aims to introduce new campus-wide programming, such as introductory courses in statistics applicable across disciplines and credit-bearing data science boot camps. They also hope to expand data courses across the curriculum, starting with first-year seminars and extending across all majors, and to introduce a data science minor along with other interdisciplinary minors in fields like public policy and medicine.

“We all need to be data and digital people,” said Caitlin Myers, co-director of the MiddData Initiative and professor of economics. “[This is the way we can] maintain timeless values of liberal arts in a world which is rapidly changing around us.” 


‘Bigotry Data: How Big Data and Algorithms Perpetuate Racism and Inequality’

By Nicole Pollack

Linus Owens, associate professor of sociology, gave this year’s Clifford Symposium keynote address. The talk explored big data through the lens of social analysis, highlighting the ways in which big data not only reflects reality but also reflects and reinforces users’ expectations of reality, ultimately encoding and replicating inequity through data.

“Big data is not a neutral process,” Owens said. “And it’s a relationship that gets rationalized.”

Owens spoke about the mechanisms through which data explicitly amplifies racial hierarchies and structural oppression. The use of data in policing can reinforce existing biases, he said: individuals and groups who were disproportionately targeted in the past are more likely to have additional data collected on them, which increases their chances of being targeted again.

Owens said that the role of institutions like Middlebury is “to rationalize and streamline and normalize the data through our usage of it and treatment of it as real and true and objective.”

“When we think about structural racism, we have to think about the way in which it’s embedded in just about every institution in our society,” he said. “And so responding to that will require fundamental changes in the institutions that we are accustomed to.”


‘Meet Data, Your New Assistant Coach’

By Maggie Reynolds

Data in sports may be changing the game. As data usage in sports has skyrocketed, coaches at Middlebury now consider a variety of statistics to help them make strategic game-time decisions.

When Middlebury football coach Bob Ritter plans his team’s practices for the week, he uses recent game footage as well as data to track injuries and determine optimal pre-game rest for his players.

Data has also allowed coaches to be more informed while preparing for games. Women’s basketball coach KJ Krasco uses data on shot percentages for the Panthers and each opposing team to optimize the Middlebury team’s strategies. Baseball coach Mike Leonard uses data to provide players with specific cues to improve their technique and understand the tendencies of pitchers and players on opposing teams. 

Though teams at all levels have adopted data to better inform coaches and players, challenges accompany the effective use of big data at the Division III level. NFL, NBA and MLB teams work with much larger sample sizes when they use data. For Leonard, it is not always easy to determine which data points used at the Major League level are still relevant to the substantially shorter college season. Ritter said that with a season of fewer than 10 football games, it can be risky to follow strategies indicated by the data because the sample sizes are so small. Still, Krasco believes it is possible to select helpful data points and use those as tangible goals for her team.

This readily available data has shifted the role of coaches in athletic competitions. The three coaches agreed that they try to make well-informed decisions based on the data but still take responsibility if an error happens in the game. Despite its applications, big data cannot replace coaches’ personal connections with players or their goals of helping players grow as athletes and as people. But when data is balanced with human interaction, coaches can use it to give players motivation and instill confidence in their abilities.