At the Women in Data Science conference held at UC Berkeley this past week, four educators affiliated with the School of Information presented to a crowd of over 60 attendees. The event was hosted by Master of Information and Data Science (MIDS) continuing lecturer Joyce Shen, who invited various industry and faculty leaders to speak on topics related to data science and women in STEM.
A Rise in Female Data Scientists
First on the docket was Catherine Cronquist-Browning, former Assistant Dean of Academic Programs and of Equity and Inclusion at the I School and current Assistant Vice Provost and Chief of Staff at the Division of Undergraduate Education. Upon starting, Cronquist-Browning addressed the room, asking attendees to raise their hands if they were often the “only woman in the room.” A majority did.
In fact, it seemed that this was commonplace for the industry as a whole. Cronquist-Browning revealed that only 26% of all data analytics professionals in the U.S. were women, and the number was even lower worldwide. However, she also noted that this was beginning to change. In the MIDS program alone, she pointed out a 9% increase in women enrolled since the program's inception and suggested that a preparatory class such as a Python boot camp could have played a part in the increase.
In hopes of seeing this trend continue, she then pointed out various ways to make a difference in the space. First, she implored attendees to create a supportive environment and check in with their peers to encourage them. She also recommended that the audience offer mentorship, plan community-building opportunities, and avoid gatekeeping when hiring by being clear about expectations and requirements.
At the same time, Cronquist-Browning warned the audience against falling back on stereotypes and tokenism. She cautioned against making assumptions about the ways women think and advocated respect for the diversity of their experiences.
Show and Tell: Researching Information Visualization
I School Interim Dean and Professor Marti Hearst then took the stage to discuss her information visualization research on the relationship between visuals and text in charts. In her presentation, she highlighted the various approaches to data visualization, from all text to all visual to mixed options, and described the conflicting opinions about preferences. She explained that, while many people preferred a mixture of text and visuals in a chart, quite a number of people preferred text alone.
Delving deeper, Hearst discovered that text and visuals were good for different things. Visuals were often helpful in showing overall trends within a dataset, whereas text often made the numbers in the dataset stand out. When the two were paired together, however, Hearst noted that people were not good at taking advantage of each element’s benefits.
In her talk, she also explored other intersecting ideas in the field of information visualization, such as fluent reading, cognitive theories, and the effects of generative AI. One key finding she discussed was how hyperlinks and visualizations embedded in text could distract readers and interrupt their reading process, in turn affecting their ability to understand the text. As a result, Hearst identified a need for further research in this field, such as looking into how people interpret visualizations, the role of bias and misleading representations, and visualization for people who are sight impaired.
Data Science and Drug Discoveries: A New Model
Alum Brittney Vierra (MIDS ’21) currently serves as the associate director of data science at the biotech company Recursion. In her talk, she discussed how she has used artificial intelligence and machine learning in the field of computational biology, particularly in pharmaceuticals. “The drug discovery process is failing,” she declared, pointing to the industry’s inability to recapitulate academic literature, siloing of data, and reliance on archaic methods of data keeping such as PDFs and scanned printouts to track results.
In her role, however, she has been making strides to modernize the process. For example, she aims to redefine the role of technology, using it not to scale data but to understand it. By creating a database of all compounds and genetic knockout states, she could teach a machine to identify key cell phenotypes to predict and test the effectiveness of a new drug.
As a result of her efforts, her team at Recursion recently demonstrated their new product: the LLM-Orchestrated Workflow Engine (LOWE). The engine helps simplify the drug discovery process by orchestrating complex workflows, generating novel compounds, and scheduling them for synthesis and experimentation. Vierra hopes that with the help of LOWE, drug discovery scientists will be able to use data to make the drug discovery process more efficient and less complicated.
A.I. Meets Human Values: The Present and the Future
Assistant Professor Morgan Ames was the conference’s final speaker and gave a talk on the relationship between artificial intelligence and human values. She addressed the prevailing fear of AI taking over jobs and introduced the concept of the socio-technical gap, which states that there is a divide between what people want and what technology can do. By doing so, she reassured audiences that technology still requires human intervention.
Whereas in the past there was a push to automate the technology or make it more human-like, Ames pointed out that there is now increased interest in augmentation, or playing to the strengths of both humans and machines. Despite this interest, however, the augmentation process is not perfect. Companies have taken advantage of this to employ people to do ghost work, which can be dehumanizing, isolating, and economically unviable.
Questions have also arisen about who takes responsibility when technology fails, especially as self-driving car companies such as Cruise have fired an entire team of engineers over an accident that occurred in San Francisco.
Despite ongoing media discourse about such technology and its setbacks, Professor Ames reminded the audience about their ability to choose. She explained that while AI will continue to be built and anxieties about technology will continue to grow, humans are ultimately the ones who give machines meaning and choose whether to engage with the technology.
WiDS Berkeley is independently organized by the UC Berkeley School of Information and co-sponsored by the College of Computing, Data Science, and Society (CDSS), the Center for Information Technology Research in the Interest of Society and the Banatao Institute (CITRIS), and other UC Berkeley partners. It is part of the mission to increase the participation of women in data science and to feature outstanding women doing outstanding work.