Collecting and Analyzing Big Data

This course was offered in Summer 2021.

This course is an introduction to collecting and analyzing “big data” for social scientists. Over the last decade, the variety of data available to researchers has exploded. This includes not only contemporary data, such as data from websites and social media platforms, but also historical data, from digitized interviews to 19th-century newspapers. At the same time, analytic techniques from computer science are increasingly being used to solve social science problems. One week is not enough time to master the techniques for collecting and analyzing big data, but it is enough to establish a foundation for developing these skills. The course is designed as a practical overview. It is practical in that the emphasis in each class will be on applying specific techniques rather than on their mathematical basis. It is an overview in that each lesson will introduce a new method in order to demonstrate the range of methods available. Taken together, the lessons will give students the skills and resources to apply these methods to theoretically relevant problems in the social sciences. The course GitHub repository includes the Python materials for the course as Jupyter notebooks.

This course has been taught as part of the Oslo Summer School in Comparative Social Science Studies in 2017, 2018, and 2019; as a Capita Selecta course at KU Leuven in 2019 and 2021; and at the University of North Carolina in the fall of 2020.

Neal Caren
Associate Professor of Sociology

My research interests include social movements, protest events, web scraping, and text analysis.