A short course on collecting data from the internet, including accessing APIs and web scraping, using Python. While the course was taught in person, the materials are designed for self-paced, independent learning for social scientists with no background in Python. Each lesson includes exercises that can be completed within the notebook, along with answers.
Directions (starting from scratch):
- Read the setup notebook online. Follow the directions for installing Python.
- Download this repository by clicking on the green “Clone or Download⌄” button above. You may need to unzip the folder, depending on your operating system.
- Using the instructions in the setup notebook, start the Anaconda Navigator program, launch a Jupyter notebook, and navigate to the “Notebooks” folder that you downloaded in the previous step.
- The first two notebooks (`2_Python.ipynb` and `3_Data.ipynb`) provide an introduction to working with Python.
- The other numbered notebooks are the materials that were covered in class.
- The `Bonus` notebooks detail some additional techniques for data collection.
The lessons can also be completed entirely online without installing anything on your computer.