College Admissions vs Graduation Rate

Data Gathering


The data used in this analysis comes from the U.S. Department of Education's “College Scorecard” database, which is available through both an API and an interactive web interface. The database dates back to 1996 and provides institution-specific data for all postsecondary institutions in the United States. Because it covers all of these institutions rather than a sample of them, the complete raw dataset can be treated as population data for postsecondary institutions in the United States.


College Scorecard Database

Structurally, the raw data is a series of separate CSV files, one for each school year from 1996 to 2020. The original dataset includes a variety of metrics for all institutions that primarily award certificates, bachelor's degrees, or master's degrees. The metrics range from broad measures to highly specific ones such as the “percentage of graduate federal student loan borrowers with all loans discharged after 2 years.” Because the raw data spans so many years and has very high dimensionality, this research uses a subset of it.
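
As a rough illustration of how such a subset could be built from per-year CSV files, here is a minimal sketch using only Python's standard library. It is not the code used in this analysis. The function name `load_subset`, the `source_file` field, and the directory layout are hypothetical, and the column names in `KEEP_COLUMNS` are assumptions that should be checked against the College Scorecard data dictionary.

```python
import csv
from pathlib import Path

# Assumed column names for illustration only; verify them against the
# College Scorecard data dictionary before relying on them.
KEEP_COLUMNS = ["UNITID", "INSTNM", "ADM_RATE", "C150_4"]


def load_subset(data_dir, keep_columns=KEEP_COLUMNS):
    """Read every CSV file in data_dir and keep only the chosen columns.

    Each file is assumed to hold one school year. Columns missing from a
    given year's file are filled with an empty string.
    """
    rows = []
    for path in sorted(Path(data_dir).glob("*.csv")):
        # utf-8-sig also handles files that start with a byte-order mark.
        with open(path, newline="", encoding="utf-8-sig") as f:
            for record in csv.DictReader(f):
                row = {col: record.get(col, "") for col in keep_columns}
                # Remember which yearly file the row came from.
                row["source_file"] = path.name
                rows.append(row)
    return rows
```

Calling `load_subset("raw_data")` on a folder of yearly files (the folder name is hypothetical) would return one dictionary per institution per year. All values are returned as strings, so numeric fields still need converting before any analysis.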

Interested in exploring the dataset more? Visit this link to see all of the data.