The impact of artificial intelligence and natural language processing on the efficiency of the business process of standardizing unstructured textual data

Authors

  • Antonija Buzov, University of Split, Faculty of Economics, Business and Tourism, Split, Croatia
  • Mario Jadrić, University of Split, Faculty of Economics, Business and Tourism, Split, Croatia

DOI:

https://doi.org/10.48188/so.7.8

Keywords:

natural language processing, artificial intelligence, Python, unstructured textual data

Abstract

Aim: To examine the role of natural language processing (NLP) in supporting business processes by reliably transforming user-submitted unstructured textual data, specifically requests for medicines, into standardized product entries.
Methods: We collected 24 medicine requests and processed them with a Python-based pipeline that combined preprocessing, BERT embeddings, and fuzzy string matching. In this context, association means correctly linking a free-text request to a database entry; impact is measured by accuracy, precision, recall, and F1-score; natural language is the unstructured text provided by users; processing denotes the computational steps used to clean, tokenize, and match the data; and the business process is the transformation of user-submitted unstructured requests into structured database records.
Results: At a similarity threshold of 95%, the model achieved an accuracy of 0.94, a precision of 0.89, a recall of 1.0, and an F1-score of 0.941. When the threshold was lowered to 85%, accuracy dropped to 0.25, mainly because of false duplicate matches. The model consistently standardized strength and form (e.g., “500 mg tab” → “500 mg Tablet”). Errors occurred when distinct medicines had highly similar names.
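As a quick consistency check on the figures above, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values reported here; the function name is our own, and the small difference from the reported 0.941 is consistent with rounding of the inputs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall as reported at the 95% threshold.
f1 = f1_score(0.89, 1.0)
print(round(f1, 2))  # 0.94
```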
Conclusions: NLP methods can support the automated standardization of unstructured textual data in business processes, provided that high similarity thresholds and well-structured databases are maintained. Our findings highlight both the potential efficiency gains and the limitations of lightweight NLP models.
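
The thresholded matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it omits the BERT embedding stage, uses Python's standard-library difflib in place of whatever fuzzy-matching library the study used, and the abbreviation map, catalog entries, and function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map and product catalog (not taken from the study).
ABBREVIATIONS = {"tab": "tablet", "tabs": "tablet", "cap": "capsule", "caps": "capsule"}
CATALOG = [
    "Paracetamol 500 mg Tablet",
    "Ibuprofen 400 mg Tablet",
    "Amoxicillin 500 mg Capsule",
]


def normalize(text: str) -> str:
    """Lowercase, separate numbers from units, and expand form abbreviations."""
    text = text.lower()
    text = re.sub(r"(\d)\s*(mg|g|ml)\b", r"\1 \2", text)  # "500mg" -> "500 mg"
    tokens = re.findall(r"[a-z0-9]+", text)
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def best_match(request: str, catalog=CATALOG, threshold: float = 0.95):
    """Return (catalog entry, score), or (None, score) if below the threshold."""
    query = normalize(request)
    scored = [
        (SequenceMatcher(None, query, normalize(entry)).ratio(), entry)
        for entry in catalog
    ]
    score, entry = max(scored)
    return (entry, score) if score >= threshold else (None, score)


print(best_match("paracetamol 500mg tab"))
print(best_match("vitamin c 1000 mg"))
```

With a high threshold, a request that has no close counterpart in the catalog is left unmatched rather than forced onto the nearest entry. Lowering the threshold accepts weaker matches, which is the kind of setting under which the false duplicate matches described in the Results would be expected.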

 

Published

2026-04-05
