An artificial intelligence-enabled industry classification and its interpretation

Daejin Kim, Hyoung Goo Kang, Kyounghun Bae, Seongmin Jeon

Research output: Contribution to journal › Article › peer-review


Purpose: To overcome the shortcomings of traditional industry classification systems such as the Standard Industrial Classification (SIC), the North American Industry Classification System (NAICS), and the Global Industry Classification Standard (GICS), the authors explore industry classifications using machine learning methods as an application of interpretable artificial intelligence (AI).

Design/methodology/approach: The authors propose a text-based industry classification combined with a machine learning technique by extracting distinguishable features from business descriptions in financial reports. The proposed method can reduce the dimensions of word vectors to avoid the curse of dimensionality when measuring the similarities of firms.

Findings: Using the proposed method, the sample firms form clusters of distinctive industries, thus overcoming the limitations of existing classifications. The method also clarifies industry boundaries based on lower-dimensional information. The graphical closeness between industries can reflect the industry-level relationship as well as the closeness between individual firms.

Originality/value: The authors’ work contributes to the industry classification literature by empirically investigating the effectiveness of machine learning methods. The text mining method resolves issues concerning the timeliness of traditional industry classifications by capturing new information in annual reports. In addition, the authors’ approach can solve the computing concerns of high dimensionality.
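To make the idea of measuring firm similarity from business descriptions concrete, here is a minimal sketch using only the Python standard library. It is not the authors' implementation: the firm names, the toy descriptions, the whitespace tokenizer, and the TF-IDF weighting are all assumptions chosen for illustration, and the dimensionality-reduction step described in the abstract is deliberately left out.

```python
import math
from collections import Counter

# Hypothetical toy business descriptions; not data from the paper.
descriptions = {
    "FirmA": "cloud software platform for enterprise data analytics",
    "FirmB": "enterprise software and cloud data services",
    "FirmC": "offshore oil drilling and natural gas exploration",
}


def tfidf_vectors(docs):
    """Map each firm to a sparse {term: weight} vector.

    Weight = term frequency * smoothed inverse document frequency
    (an assumed weighting; the abstract does not specify one).
    """
    tokens = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokens)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for name, words in tokens.items():
        counts = Counter(words)
        vectors[name] = {
            term: (count / len(words))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = tfidf_vectors(descriptions)
sim_ab = cosine(vectors["FirmA"], vectors["FirmB"])  # two software firms
sim_ac = cosine(vectors["FirmA"], vectors["FirmC"])  # software vs. oil and gas
sim_bc = cosine(vectors["FirmB"], vectors["FirmC"])  # share only the word "and"
print(f"A-B: {sim_ab:.3f}  A-C: {sim_ac:.3f}  B-C: {sim_bc:.3f}")
```

In this toy setting the two software firms score as more similar to each other than either does to the oil and gas firm. With real annual reports the vocabulary runs to many thousands of terms, which is where the abstract's dimension-reduction step would sit: between building the word vectors and computing similarities.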

Original language: English
Journal: Internet Research
State: Accepted/In press - 2021
Externally published: Yes


  • Autoencoder
  • Dimensionality reduction
  • Firm similarity
  • Industry classification
  • Machine learning
  • Text mining

