Automatic Text Labeling Method Based on Large Language Models


  • CHENWU LI, Graduate School, University of the East, Manila, Philippines, and GuangDong Polytechnic, Guangdong, China
  • Henry Dyke A. Balmeo, Graduate School, University of the East, Manila, Philippines



data automatic annotation, large model, prompt engineering, text similarity clustering algorithm


As model development demands ever larger amounts of training data, this paper proposes LLM4Label, an automatic text labeling method based on large language models that assists human labelers in annotating text data. LLM4Label first selects the most representative seed data using a clustering algorithm based on text similarity. It then constructs prompt dialogues with few-shot examples to elicit better performance from the language model on entity labeling tasks, enabling it to label more data automatically and efficiently. Finally, LLM4Label introduces human feedback to correct uncertain labeling results and retrains the model with the corrected annotations. Experiments show that LLM4Label produces high-quality labeled data at low human labeling cost. The proposed method provides an effective way to obtain sizable, high-quality annotated datasets with minimal manual effort, which can strongly support downstream natural language processing tasks.
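The three stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the prompt wording, or how uncertainty is measured. The sketch assumes a greedy bag-of-words cosine clustering, a hypothetical prompt template, and a per-label confidence score with an arbitrary cutoff. All function names (`select_seeds`, `build_prompt`, `route`) and thresholds are illustrative.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (assumed text representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_seeds(texts, threshold=0.5):
    """Greedy similarity clustering (an assumption, not the paper's algorithm).

    A text that resembles an existing seed (cosine >= threshold) is treated
    as part of that seed's cluster; otherwise it becomes a new seed.
    Returns the indices of the seed texts.
    """
    vectors = [bow(t) for t in texts]
    seeds = []
    for i, vec in enumerate(vectors):
        if not any(cosine(vec, vectors[s]) >= threshold for s in seeds):
            seeds.append(i)
    return seeds


def build_prompt(examples, query):
    """Assemble a few-shot prompt from human-labeled (text, entities) pairs.

    The template wording is hypothetical.
    """
    parts = ["Label the entities in each sentence."]
    for text, entities in examples:
        parts.append(f"Sentence: {text}\nEntities: {entities}")
    parts.append(f"Sentence: {query}\nEntities:")
    return "\n\n".join(parts)


def route(predictions, min_confidence=0.8):
    """Split (text, label, confidence) triples into accepted labels and
    items sent to a human for correction. The confidence score and the
    0.8 cutoff are assumptions."""
    accepted = [p for p in predictions if p[2] >= min_confidence]
    review = [p for p in predictions if p[2] < min_confidence]
    return accepted, review
```

In this sketch, a human would label only the texts returned by `select_seeds`, those pairs would serve as the few-shot examples passed to `build_prompt`, and the items that `route` places in the review list would go back to the human labeler for correction.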










Research Articles


How to Cite

C. LI and H. D. A. Balmeo, “Automatic Text Labeling Method Based on Large Language Models”, ijetaa, vol. 1, no. 1, Feb. 2024, doi: 10.62677/IJETAA.2401102.
