Beware the Overfit Trap in Data Analysis

It can be exciting when your data analysis suggests a surprising or counterintuitive prediction. But the result might be due to overfitting, which occurs when a statistical model describes random noise rather than the underlying relationship you need to capture. You can guard against this trap by keeping your analysis simple. Watch for spurious correlations, and look for relationships that measure important effects tied to clear, logical hypotheses. Test for overfitting by randomly dividing the data into a training set, with which you’ll estimate the model, and a validation set, with which you’ll test the accuracy of the model’s predictions. An overfit model might predict well within the training set but raise warning flags by performing poorly on the validation set. You might also consider alternative narratives: Is there another story you could tell with the same data? If so, you cannot be confident that the relationship you have uncovered is the right one, or the only one.
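To make the training/validation check concrete, here is a minimal sketch in Python using NumPy. It is an illustration, not part of the original tip. The synthetic data, the 70/30 split, the polynomial degrees, and the helper name `train_validation_errors` are all assumptions chosen for the example. It fits a simple model and a deliberately over-flexible one to the same training set, then reports the error of each on both sets.

```python
import numpy as np


def train_validation_errors(x, y, degree, train_fraction=0.7, seed=0):
    """Fit a polynomial on a random training split.

    Returns (training MSE, validation MSE). Hypothetical helper for illustration.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))  # randomly divide the data
    n_train = int(train_fraction * len(x))
    train_idx, valid_idx = order[:n_train], order[n_train:]

    # Estimate the model using the training set only.
    coeffs = np.polyfit(x[train_idx], y[train_idx], degree)

    def mse(idx):
        residuals = y[idx] - np.polyval(coeffs, x[idx])
        return float(np.mean(residuals ** 2))

    return mse(train_idx), mse(valid_idx)


# Synthetic data (assumed for this example): a straight line plus random noise.
data_rng = np.random.default_rng(42)
x = np.linspace(-1.0, 1.0, 40)
y = 2.0 * x + data_rng.normal(scale=0.5, size=x.size)

# Degree 1 is the simple model; degree 10 has far more freedom to chase noise.
for degree in (1, 10):
    train_mse, valid_mse = train_validation_errors(x, y, degree)
    print(f"degree {degree:2d}: training MSE {train_mse:.3f}, "
          f"validation MSE {valid_mse:.3f}")
```

Because both fits use the same split, the more flexible model will never do worse on the training set. The warning flag described above is a validation error that is clearly higher than the training error, or higher than the simpler model's validation error.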

Source: HBR


