Do GPT-3.5 and GPT-4 Have a Writing Style Different from Human Style? An Exploratory Study for Spanish
DOI:
https://doi.org/10.58859/rael.v23i1.666
Keywords:
writing style, large language models, GPT-3.5, GPT-4, corpus linguistics
Abstract
This study uses statistical techniques to verify that the generative language models GPT-3.5 (free version) and GPT-4 (paid version) of ChatGPT have a writing style of their own, distinct from that of humans, and that they can be distinguished by at least three types of features: lexical features, punctuation marks, and syntactic sentence structure. Determining whether large language models have their own style matters for detecting the automatic authorship of texts. In previous work, we built a comparable corpus of human-written and automatically generated texts in Spanish and, through a qualitative study, identified a set of linguistic and stylistic features specific to each author. The present work shows quantitatively that the 17 identified lexical and punctuation variables differ to a statistically significant degree between human authors and the GPT-3.5 and GPT-4 models.
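As an illustration of the kind of comparison the abstract describes, the following minimal sketch measures one punctuation variable per text and tests whether it differs between two groups of texts. It is not the authors' procedure: the abstract does not name the statistical test used, so the two-sided permutation test, the choice of commas per 100 tokens as the variable, the function names, and the toy texts are all assumptions made for illustration.

```python
import random
import re
from statistics import mean


def commas_per_100_tokens(text: str) -> float:
    """Hypothetical punctuation variable: commas per 100 tokens.

    Tokens are word runs or single punctuation characters.
    """
    tokens = re.findall(r"\w+|[^\w\s]", text)
    if not tokens:
        return 0.0
    return 100.0 * tokens.count(",") / len(tokens)


def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Assumption: this test stands in for whatever test the study used.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Toy texts invented for this example, not taken from the corpus.
    human_texts = ["Bueno, pues nada, ya veremos.", "Vine, vi y me fui."]
    model_texts = ["El texto presenta una idea clara.", "La idea es simple."]
    human_values = [commas_per_100_tokens(t) for t in human_texts]
    model_values = [commas_per_100_tokens(t) for t in model_texts]
    print(permutation_p_value(human_values, model_values))
```

With samples this small the resulting p-value is not informative; a real comparison would apply the same steps to every text in the corpus and to each of the 17 variables, with a correction for multiple comparisons.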
License
Copyright (c) 2025 Lara Alonso Simón, Ana María Fernández-Pampillón Cesteros, Marianela Fernández Trinidad, Manuel Márquez Cruz

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Attribution - Non-commercial (CC BY-NC). Under this license the user can copy, distribute and publicly display the work and can create derivative works as long as these new creations acknowledge the authorship of the original work and are not used commercially.
Authors retain the copyright and full publishing rights without restrictions.