True Business Data

Data is Open for Business

Why Open Business Data?

As of 2012, America’s small businesses (those with fewer than 500 employees) were responsible for a staggering 64% of net new private-sector jobs and employed nearly half of America’s workforce (1).


Yet the health of small business in the U.S. is hampered by a poor understanding of this rich and varied ecosystem. The root cause: a severe lack of reliable, rich and regular data.

A True Data Opportunity

The data already exists.

Data science makes it accessible.

The Common Crawl is an open repository of web crawl data that anyone can access and analyze. By processing the entire Common Crawl and applying a variety of intelligent steps to the data, we turn the Web into a dataset of businesses. Discover more about the Common Crawl or our process below.

The Common Crawl

The TBD Handbook

True Business Data: Making the case

The True Business Team

This project was conducted as part of Capstone Course W210 in the MIDS program at UC Berkeley. The team: