The impact of artificial intelligence and natural language processing on the efficiency of the business process of standardizing unstructured textual data
DOI:
https://doi.org/10.48188/so.7.8
Keywords:
natural language processing, artificial intelligence, Python, unstructured textual data
Abstract
Aim: To examine the role of natural language processing (NLP) in supporting business processes by reliably transforming user-submitted unstructured textual data, specifically requests for medicines, into standardized product entries.
Methods: We collected a dataset of 24 medicine requests and processed it with a Python-based pipeline that combined preprocessing, BERT embeddings, and fuzzy string matching. In this context, association means correctly linking a free-text request to a database entry; impact is measured through accuracy, precision, recall, and F1-score; natural language is the unstructured text provided by users; processing denotes the computational steps used to clean, tokenize, and match the data; and the business process is the transformation of user-submitted unstructured requests into structured database records.
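The matching step can be illustrated with a minimal sketch. It is not the authors' pipeline: it leaves out the BERT embeddings entirely and uses Python's standard-library `difflib.SequenceMatcher` in place of whichever fuzzy-matching library the study used. The catalog entries, the abbreviation map, and the function names `normalize` and `best_match` are hypothetical and chosen only for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map and product catalog (not taken from the study).
ABBREVIATIONS = {"tab": "tablet", "tabs": "tablet", "cap": "capsule", "caps": "capsule"}
CATALOG = [
    "Paracetamol 500 mg Tablet",
    "Ibuprofen 400 mg Tablet",
    "Amoxicillin 250 mg Capsule",
]


def normalize(text):
    """Lowercase, split letters from numbers, and expand known abbreviations."""
    tokens = re.findall(r"[a-z]+|\d+(?:\.\d+)?", text.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def best_match(request, catalog=CATALOG, threshold=0.95):
    """Return (entry, score) for the most similar catalog entry.

    The entry is None when the best score falls below the threshold.
    """
    query = normalize(request)
    scored = [
        (SequenceMatcher(None, query, normalize(entry)).ratio(), entry)
        for entry in catalog
    ]
    score, entry = max(scored)
    return (entry, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    print(best_match("paracetamol 500mg tab"))
    print(best_match("aspirin 100 mg tab"))
```

A character-level ratio like this one can still score two entries that differ only in strength or by a few letters very highly, so the threshold on its own does not rule out the kind of confusion between similarly named medicines that the Results describe.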
Results: At a similarity threshold of 95%, the model achieved 0.94 accuracy, 0.89 precision, 1.0 recall, and an F1-score of 0.941. When the threshold was reduced to 85%, performance dropped to 0.25 accuracy, mainly due to false duplicate matches. The model consistently standardized strength and form (e.g., “500 mg tab” → “500 mg Tablet”). Errors occurred when distinct medicines had highly similar names.
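For readers who want to see how the four reported metrics relate to one another, the sketch below computes them from confusion-matrix counts. The counts in the usage example are hypothetical and are not the study's confusion matrix, which the abstract does not report; the function name `classification_metrics` is likewise an assumption made for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not reported in the source).
    for name, value in classification_metrics(tp=8, fp=1, tn=8, fn=0).items():
        print(f"{name}: {value:.3f}")
```

With no false negatives, recall is 1.0 and the F1-score is driven entirely by precision, which is why a single false positive is enough to pull both precision and F1 below 1.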
Conclusions: NLP methods can support automating the standardization of unstructured textual data in business processes, provided that high similarity thresholds and well-structured databases are maintained. Our findings highlight both the potential efficiency gains and the limitations of lightweight NLP models.
License
Copyright (c) 2026 Antonija Buzov, Mario Jadrić

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgment of the work’s authorship and initial publication in this journal.


