Understanding what industry a business is in proves to be relevant for nearly every aspect of customer onboarding, underwriting, and monitoring. Without knowing how a business makes its money, it’s difficult to answer any of the following questions:
In short, industry helps put everything else you know about a business in context. It tells you whether a business having a website is common or uncommon, or whether a business is relatively small or large.
The fundamental value of identifying a business’s industry is exactly why Enigma is excited to introduce our new, high-accuracy industry classification data.
Right now, nearly every financial institution that serves small businesses cites industry classification as one of its main challenges. When interviewing institutions, we found that they struggled with poor accuracy rates, antiquated taxonomies, insufficient granularity, and low coverage of small businesses.
Rampant inaccuracy
Some financial institutions cited accuracy rates of only 25% to 40%, while others gave examples such as knowing that a business engaged in retail but not what it sold. Low accuracy forces financial institutions to rely on manual research, creating significant inefficiency.
Systems that don’t reflect today’s businesses
We also learned about key failings in today’s industry taxonomies. NAICS, the standard taxonomy used by most institutions, is expansive and detailed. But it groups businesses in old-fashioned and unintuitive ways, which produces some nonsensical categories. My personal favorite 2-digit NAICS sector is “Administrative and Support and Waste Management and Remediation Services”, which covers everything from septic tank cleaning to temp agencies. Other institutions use the GICS industry taxonomy, which provides more common-sense groupings but can lack granularity.
Insufficient granularity
Granularity provides details that are often crucial to understanding if and how you want to work with a business. For example, a construction company can be a one-person plumbing contractor, a home remodeling agency, or a commercial construction company building skyscrapers. Each of these hypothetical businesses presents distinct risks and opportunities, so getting details that go deeper than “construction company” is essential. 6-digit NAICS codes can provide much-needed detail, but they’re inconsistent in specificity, and the distinctions they draw range from essential to trivial. Among the roughly 1,000 6-digit NAICS codes, a single code can mark the difference between a parking lot (812930) and a pet care service (812910); but these codes can also separate minutiae, such as whether a company engages in Dimension Stone Mining and Quarrying (212311) or Industrial Sand Mining (212322).
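To make the inconsistency concrete, here is a minimal Python sketch that relies only on the standard NAICS structure, in which each additional digit narrows the category (2 digits for a sector, up to 6 for a national industry). The function names are illustrative and are not part of any Enigma or Census Bureau tooling. It shows that both pairs of codes above diverge at exactly the same level of the hierarchy, even though one distinction matters far more than the other.

```python
# Standard NAICS hierarchy: each prefix length names a level.
NAICS_LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry group",
    5: "NAICS industry",
    6: "national industry",
}


def naics_levels(code: str) -> dict[str, str]:
    """Split a 6-digit NAICS code into its hierarchical prefixes."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a 6-digit NAICS code, got {code!r}")
    return {name: code[:length] for length, name in NAICS_LEVELS.items()}


def shared_prefix(code_a: str, code_b: str) -> str:
    """Return the longest leading run of digits the two codes have in common."""
    length = 0
    for digit_a, digit_b in zip(code_a, code_b):
        if digit_a != digit_b:
            break
        length += 1
    return code_a[:length]


# Parking lot vs. pet care service: very different businesses.
print(shared_prefix("812930", "812910"))  # 8129
# Dimension stone mining vs. industrial sand mining: a fine distinction.
print(shared_prefix("212311", "212322"))  # 2123
```

In both cases the codes share a 4-digit industry group and split at the fifth digit, so the depth of a NAICS code alone says little about how meaningful a distinction is.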
Low small business coverage
Financial institutions have also told us that they have very low fill rates when it comes to identifying industries for small businesses. This low coverage can again result in more manual research, but also potentially jeopardizes the extent to which you can onboard and service small business customers.
Overall, financial institutions are in a bind: they are forced to choose between accurate high-level classifications that lack important details and granular classifications that are less accurate and overly noisy. We knew that this was a problem we had to fix, especially because accurate industry data makes our other data attributes even more powerful.
We set out to build our own industry classification system based on the following principles:
High accuracy through advanced data science
So far, our industry classification is achieving accuracy rates 2-3x higher than those of incumbent providers. This has major implications for financial institutions: readily available and accurate data about every business’s industry will help them reduce risk exposure and related losses. Accurate industry data also allows institutions to minimize resources spent on manual research, a significant operational inefficiency.
We’ve attained this accuracy by building predictive models that reflect how a human would classify companies into an industry. We heard from many different companies that they spend considerable time and resources plugging business names into search engines and combing through search results and business data aggregators to get detailed information. Based on this, we automated the manual investigation process. By combining online and other public information about businesses with advanced linguistic models, we’re now able to replicate the human research process and classify industries far more accurately than current providers. We also hold ourselves to an internal accuracy bar: we don’t release an industry category until it achieves 85% accuracy or higher.
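As a rough illustration of what a per-category release bar like this could look like, the sketch below checks predictions against hand-labeled answers and keeps only the categories that clear 85%. This is not Enigma’s actual pipeline: the function name, the toy data, and the choice to measure accuracy as the share of correct predictions within each predicted category are all assumptions made for the example.

```python
from collections import defaultdict

RELEASE_THRESHOLD = 0.85  # the 85% bar described above


def releasable_categories(labeled_predictions, threshold=RELEASE_THRESHOLD):
    """Return {category: accuracy} for categories that meet the threshold.

    labeled_predictions: iterable of (predicted_category, true_category) pairs.
    Accuracy is computed per predicted category (an assumption of this sketch).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, actual in labeled_predictions:
        total[predicted] += 1
        if predicted == actual:
            correct[predicted] += 1

    accuracy = {category: correct[category] / total[category] for category in total}
    return {category: acc for category, acc in accuracy.items() if acc >= threshold}


# Made-up evaluation data: 9 of 10 "restaurant" predictions are right,
# but only 3 of 5 "plumbing" predictions are.
sample = (
    [("restaurant", "restaurant")] * 9
    + [("restaurant", "bar")]
    + [("plumbing", "plumbing")] * 3
    + [("plumbing", "hvac")] * 2
)
print(releasable_categories(sample))  # {'restaurant': 0.9}
```

Under these assumptions, a category that falls short simply stays out of the released data until its accuracy improves.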
Classification that makes sense for modern businesses
Based on what we learned from our customers, we built an industry taxonomy that provides common-sense groupings of companies and reflects modern business models. We’re also integrating operations flags to capture how modern businesses operate (more details below).
Details, not noise
Our industry classification data provides as much granularity as a 4-6 digit NAICS code, while maintaining the high level of accuracy detailed above. We focus on providing detail for industries where it’s beneficial, not where it creates additional noise.
The coverage you’d expect from an SMB data provider
Lastly, our industry coverage extends to even the smallest businesses in ways that incumbent providers cannot. On average our fill rates are 10% higher, and we expect that number to get better as our small business data expands.
In the next 2 weeks, our coverage will expand from 20 industries to roughly 50 industries, covering 60% of US small businesses. Further coverage expansions are planned in the coming months.
We will also be adding further nuance to industries in the form of operations flags, which describe how a company actually offers its goods and services. There’s more to come on this front, but we believe operational details will help our customers get an even deeper understanding of businesses and their associated risks and resiliencies.
Enigma’s beta industry classification data is available right now via our API. You can test the data for free, and we welcome your feedback. If you have any questions or would like to learn more about our industry classification data, please reach out.