Predicting Levels of Earthquake Damage: A Comparison of Classification Models

Introduction


The 2015 Nepal earthquake (also known as the Gorkha Earthquake) struck near the city of Kathmandu, killing around 9,000 people and structurally damaging more than 600,000 buildings. Following the earthquake, Nepal carried out a survey to assess building damage in the affected districts. The survey data was made publicly available through the 2015 Nepal Earthquake Open Data Portal and is one of the largest post-disaster datasets ever collected, with information on earthquake impact, household conditions, and socio-economic demographics.

Our team competed in the “Richter's Predictor: Modeling Earthquake Damage” challenge hosted by DrivenData, which asks competitors to use the data from this survey to develop the best statistical model for predicting the level of damage sustained by each building. There are three levels of structural damage to predict: (1) low damage, (2) medium damage, and (3) near-complete destruction. To this end, we built a series of classification models and tuned each one to maximize the accuracy of its damage predictions.