Information Course Schedule Spring 2021
Upper-Division
Surveying history through the lens of information and information through the lens of history, this course looks across time to consider what might distinguish ours as “the information age” and what that description implies about the role of “information technology” across time. We will select moments in societies’ development of information production, circulation, consumption, and storage from the earliest writing and numbering systems to the world of social media. In every instance, we’ll be concerned not only with what and when, but also with how and why. Throughout, we will keep returning to questions about how information-technological developments affect society and vice versa.
Three hours of lecture per week. Methods and concepts of creating design requirements and evaluating prototypes and existing systems. Emphasis on computer-based systems, including mobile systems and ubiquitous computing, but may be suitable for students interested in other domains of design for end-users. Includes quantitative and qualitative methods as applied to design, usually for short-term studies intended to provide guidance for designers. Students will receive no credit for 114 after taking 214.
This course applies economic tools and principles, including game theory, industrial organization, information economics, and behavioral economics, to analyze business strategies and public policy issues surrounding information technologies and IT industries. Topics include: economics of information goods, services, and platforms; economics of information and asymmetric information; economics of artificial intelligence, cybersecurity, data privacy, and peer production; strategic pricing; strategic complements and substitutes; competition and antitrust; Internet industry structure and regulation; network cascades, network formation, and network structure.
This course introduces students to natural language processing and exposes them to the variety of methods available for reasoning about text in computational systems. NLP is deeply interdisciplinary, drawing on both linguistics and computer science, and helps drive much contemporary work in text analysis (as used in computational social science, the digital humanities, and computational journalism). We will focus on major algorithms used in NLP for various applications (part-of-speech tagging, parsing, coreference resolution, machine translation) and on the linguistic phenomena those algorithms attempt to model. Students will implement algorithms and create linguistically annotated data on which those algorithms depend.
Graduate
Introduces the data sciences landscape, with a particular focus on learning data science techniques to uncover and answer the questions students will encounter in industry. Lectures, readings, discussions, and assignments will teach how to apply disciplined, creative methods to ask better questions, gather data, interpret results, and convey findings to various audiences. The emphasis throughout is on making practical contributions to real decisions that organizations will and should make.
This course is an introduction to the topics and issues associated with information and information technology and their role in society. Throughout the semester we will consider both the consequences and impact of technologies on social groups and on social interaction and how society defines and shapes the technologies that are produced. Students will be exposed to a broad range of applied and practical problems, theoretical issues, and methods used in social scientific analysis. The four sections of the course are: 1) theories of technology in society, 2) information technology in workplaces, 3) automation vs. humans, and 4) networked sociability.
This course uses examples from various commercial domains (retail, health, credit, entertainment, social media, and biosensing/quantified self) to explore legal and ethical issues including freedom of expression, privacy, research ethics, consumer protection, information and cybersecurity, and copyright. The class emphasizes how existing legal and policy frameworks constrain, inform, and enable the architecture, interfaces, data practices, and consumer-facing policies and documentation of such offerings, and fosters reflection on the ethical impact of information and communication technologies and the role of information professionals in legal and ethical work.
This course addresses concepts and methods of user experience research, from understanding and identifying needs, to evaluating concepts and designs, to assessing the usability of products and solutions. We emphasize methods of collecting and interpreting qualitative data about user activities, working both individually and in teams, and translating them into design decisions. Students gain hands-on practice with observation, interview, survey, focus groups, and expert review. Team activities and group work are required during class and for most assignments. Additional topics include research in enterprise, consulting, and startup organizations, lean/agile techniques, mobile research approaches, and strategies for communicating findings.
"Behavioral Economics" is one important perspective on how information impacts human behavior. The goal of this class is to deploy a few important theories about the relationship between information and behavior in practical settings, emphasizing the design of experiments that can now be incorporated into many 'applications' in day-to-day life. Truly 'smart systems' will have built into them precise, testable propositions about how human behavior can be modified by what the systems tell us and do for us. So let's design these experiments into our systems from the ground up! This class develops a theoretically informed, practical point of view on how to do that more effectively and with greater impact.
Discusses the application of social psychological theory and research to information technologies and systems. We focus on sociological social psychology, which is largely concerned with group processes, networks, and interpersonal relationships. Information technologies considered include software systems used on the internet such as social networks, email, and social games, as well as specific hardware technologies such as mobile devices, computers, wearables, and virtual/augmented reality devices. We examine human communication practices through the lens of different social psychology theories, including symbolic interaction, identity theories, social exchange theory, status construction theory, and social networks and social structure theory.
Three hours of lecture per week. This course applies economic tools and principles, including game theory, industrial organization, information economics, and behavioral economics, to analyze business strategies and public policy issues surrounding information technologies and IT industries. Topics include: economics of information goods, services, and platforms; economics of information and asymmetric information; economics of artificial intelligence, cybersecurity, data privacy, and peer production; strategic pricing; strategic complements and substitutes; competition and antitrust; Internet industry structure and regulation; network cascades, network formation, and network structure.
The introduction of technology increasingly delegates responsibility to technical actors, often reducing traditional forms of transparency and challenging traditional methods for accountability. This course explores the interaction between technical design and values including: privacy, accessibility, fairness, and freedom of expression. We will draw on literature from design, science and technology studies, computer science, law, and ethics, as well as primary sources in policy, standards and source code. We will investigate approaches to identifying the value implications of technical designs and use methods and tools for intentionally building in values at the outset.
The design and presentation of digital information. Use of graphics, animation, sound, visualization software, and hypermedia in presenting information to the user. Methods of presenting complex information to enhance comprehension and analysis. Incorporation of visualization techniques into human-computer interfaces. Three hours of lecture and one hour of laboratory per week.
Provides a theoretical and practical introduction to modern techniques in applied machine learning. Covers key concepts in supervised and unsupervised machine learning, including the design of machine learning experiments, algorithms for prediction and inference, optimization, and evaluation. Students will learn functional, procedural, and statistical programming techniques for working with real-world data.
This course is a survey of web technologies that are used to build back-end systems that enable rich web applications. Utilizing technologies such as Python, Flask, Docker, RDBMS/NoSQL databases, and Spark, this class aims to cover the foundational concepts that drive the web today. This class focuses on building APIs using micro-services that power everything from content management systems to data engineering pipelines that provide insights by processing large amounts of data. The goal of this course is to provide an overview of the technical issues surrounding back-end systems today, and to provide a solid and comprehensive perspective of the web’s constantly evolving landscape.
This course introduces students to natural language processing and exposes them to the variety of methods available for reasoning about text in computational systems. NLP is deeply interdisciplinary, drawing on both linguistics and computer science, and helps drive much contemporary work in text analysis (as used in computational social science, the digital humanities, and computational journalism). We will focus on major algorithms used in NLP for various applications (part-of-speech tagging, parsing, coreference resolution, machine translation) and on the linguistic phenomena those algorithms attempt to model. Students will implement algorithms and create linguistically annotated data on which those algorithms depend.
This course will cover new interface metaphors beyond desktops (e.g., for mobile devices, computationally enhanced environments, tangible user interfaces) but will also cover visual design basics (e.g., color, layout, typography, iconography) so that we have a systematic and critical understanding of aesthetically engaging interfaces. Students will get hands-on learning experience with these topics through course projects, design critiques, and discussion, in addition to lectures and readings. Two hours of lecture per week.
Three hours of seminar per week. This seminar reviews current literature and debates regarding Information and Communication Technologies and Development (ICTD). This is an interdisciplinary and practice-oriented field that draws on insights from economics, sociology, engineering, computer science, management, public health, etc.
New Venture Discovery introduces students to the process of launching an information-intensive venture — a social enterprise, business startup, or venture inside an established organization. It is motivated by the recognition that new enterprises fail more often from lack of customers than flaws in technology or product development. The course takes an iterative, design-oriented, and feedback-driven approach to the search process: identifying a problem or need to address, developing a prototype, discovering customers, refining the concept, testing and validating demand, and developing a sustainable business model.
As new sources of digital data proliferate in developing economies, there is the exciting possibility that such data could be used to benefit the world’s poor. Through a careful reading of recent research and through hands-on analysis of large-scale datasets, this course introduces students to the opportunities and challenges for data-intensive approaches to international development. Students should be prepared to dissect, discuss, and replicate academic publications from several fields including development economics, machine learning, information science, and computational social science. Students will also conduct original statistical and computational analysis of real-world data.
The Future of Cybersecurity Reading Group (FCRG) is a two-credit discussion seminar focused on cybersecurity. In the seminar, graduate, professional, and undergraduate students discuss current cybersecurity scholarship, notable cybersecurity books, developments in the science of security, and evolving thinking in how cybersecurity relates to political science, law, economics, military, and intelligence gathering. Students are required to participate in weekly sessions, present short papers on the readings, and write response pieces. The goals of the FCRG are to provide a forum for students from different disciplinary perspectives to deepen their understanding of cybersecurity and to foster and workshop scholarship on cybersecurity.
Data and algorithmic systems are ubiquitous in everyday life. These data encode our daily choices, actions, and behaviors, as well as our more persistent social identities. They also enrich the lives of some while limiting the life chances of others. In this way, data generated and collected about us form a type of information infrastructure: pervasive, hidden, and at times insidious. As technology and data-driven systems increasingly enter into our public, professional, and personal spheres, more of these worlds become encoded in data and result in shifts in the power relations within those worlds. In a word, data is a medium which reconfigures power.
In this seminar, we will engage readings around data, power, and infrastructure, drawing from a number of interdisciplinary academic, artistic, and activist traditions. We’ll discuss topics related to state projects of legibility and quantification; the genealogy of the modern data subject; the politics of classification systems; the surveillance of Blackness and the carceral logics of technology; administrative violence and trans and gender non-conforming identities; the invisible labor powering data-driven systems; and the resistances, obfuscations, and refusals to datafication and surveillance.
In this group study class, we will cover the material in Data 8 using the online Data 8X, a three-part professional certificate program in data science from UC Berkeley. The first course, “Computational Thinking with Python,” focuses on programming and data visualization. The second course, “Inferential Thinking by Resampling,” focuses on statistical inference. The third course is “Prediction and Machine Learning.”
This group study is intended for graduate students in professional schools who seek an introduction to data science in order to integrate techniques into their domain or to pursue further educational opportunities such as the graduate certificate in applied data science. The class format is essentially self-guided: students will watch the video lecture and complete the assignments before class, and then meet to discuss the lesson. Undergraduate assistants from Data 8 will coach class participants as necessary. There are small class projects that allow students to work with their own datasets.
Data 8X is based on a rigorous first-year undergraduate course at UC Berkeley called Foundations of Data Science. Over 1,000 students take this course each semester. The course is designed as an introduction to programming and statistics for students from many different majors. It teaches practical techniques that apply across many disciplines, and also serves as the technical foundation for more advanced courses in data science, statistics, and computer science.
No prior programming experience is necessary, but many of the programming techniques covered in this course do not appear in a typical introduction to programming. The programming content of this course focuses on manipulating data tables, rather than building software applications. Therefore, students who take the course after taking other programming courses often learn a new approach to programming that they haven't encountered before.
For firms and organizations that handle personal data, the desire to extract valuable information and insight must be balanced against the privacy interests of individuals. This task has grown considerably harder in the last few decades, with the development of advanced learning algorithms that can leverage statistical patterns to infer personal information. As a result, databases that were recently considered anonymized have been proven vulnerable to attack. Starting with the seminal definition of differential privacy, researchers are now responding with a new generation of algorithmic techniques, based on strong adversary models and offering mathematical bounds on worst-case privacy loss. This course is an introduction to the field known as formal privacy or differential privacy. It includes both foundational theory and algorithmic techniques for building private algorithms. A particular focus is placed on algorithms for statistical learning and on research that incorporates a statistical perspective.
The first third of the course is structured like a bootcamp, with problem sets to build fluency in the most common mathematical structures used in the field. The latter two-thirds of the course is structured like a research seminar, with student-led discussion of published articles each week. The course completes with a final research project, giving students a chance to develop new algorithms, extend theoretical results, or build systems that incorporate formal privacy guarantees.
In this class, students will continue HCI research projects from INFO 217A. The class includes weekly one-on-one meetings with each project team. Students will read literature related to their project, as assigned by the instructor, and continue their projects. The final deliverable for the class will be a full conference or journal paper.
How do you create a concise and compelling User Experience portfolio? Applying the principles of effective storytelling to make a complex project quickly comprehensible is key. Your portfolio case studies should articulate the initial problem, synopsize the design process, explain the key decisions that moved the project forward, and highlight why the solution was appropriate. This course will include talks by several UX hiring managers who will discuss what they look for in portfolios and common mistakes to avoid.
Students should come to the course with a completed project to use as the basis for their case study; they will finish with a completed case study and repeatable process. Although this class focuses on UX, students from related fields who are expected to share examples and outcomes of past projects during the interview process (data science, product management, etc.) are welcome to join.
This class will cover the principles and practices of managing data at scale, with a focus on use cases in data analysis and data preparation for machine learning. We will cover the entire life cycle of data management and science, ranging from data preparation to exploration, visualization and analysis, to machine learning and collaboration.
The class will balance foundational concerns with exposure to practical languages, tools, and real-world concerns. We will study the foundations of prevalent data models in use today, including relations, tensors, and dataframes, and mappings between them. We will study SQL as a means to query and manipulate data at scale, including performance concerns like views and indexes, query processing and optimization, and transactions, all from a user perspective. We will study the foundations and realities of data preparation, including hands-on work with real-world data using standard Python and SQL frameworks. We will explore data exploration modalities for non-programmers, including the fundamentals behind spreadsheet systems and interactive visual analytics packages. We will look at approaches for managing the machine learning lifecycle of data preparation, model selection and training, and model serving and monitoring. Time permitting, we will look at technologies for moving, sharing, and caching data, including event streaming systems, key-value/document stores, log analytics, and search engines.
In this course you’ll learn industry-standard agile and lean software development techniques such as test-driven development, refactoring, pair programming, and specification through example. You’ll also learn good object-oriented programming style. We’ll cover the theory and principles behind agile engineering practices, such as continuous integration and continuous delivery.
This class will be taught in a flipped-classroom format, with students programming in class. We'll use the Java programming language. Students need not be expert programmers, but should be enthusiastic about learning to program. Please come to class with laptops, and install IntelliJ IDEA Community Edition. Students signing up should be comfortable writing simple programs in Java (or a Java-like language such as C#).
An intensive weekly discussion of current and ongoing research by Ph.D. students with a research interest in issues of information (social, legal, technical, theoretical, etc.). Our goal is to focus on critiquing research problems, theories, and methodologies from multiple perspectives so that we can produce high-quality, publishable work in the interdisciplinary area of information research. Circulated material may include dissertation chapters, qualifying papers, article drafts, and/or new project ideas. We want to have critical and productive discussion, but above all else we want to make our work better: more interesting, more accessible, more rigorous, more theoretically grounded, and more like the stuff we enjoy reading.
One hour colloquium per week. Must be taken on a satisfactory/unsatisfactory basis. Prerequisites: Ph.D. standing in the School of Information. Colloquia, discussion, and readings designed to introduce students to the range of interests of the school.
Topics in information management and systems and related fields. Specific topics vary from year to year. May be repeated for credit, with change of content. May be offered as a two-semester sequence.