YUAN Yulin. ChatGPT and Other Large Language Models: Their Language Processing Mechanisms and Their Theoretical Implications[J]. Journal of Foreign Languages, 2024, 47(4): 2-14.

ChatGPT and Other Large Language Models: Their Language Processing Mechanisms and Their Theoretical Implications

  • This paper briefly explains the language processing mechanisms, mathematical foundations, and theoretical implications of ChatGPT and other modern large language models. Firstly, it demonstrates the performance of large language models in semantic understanding and commonsense reasoning by testing ChatGPT’s understanding of ambiguous sentences. Secondly, it introduces the Transformer, the novel building block of these large language models, which is equipped with so-called multi-head attention (MHA). It also presents word embeddings, i.e., real-valued vector representations grounded in distributional semantics, and the role of word vectors in language processing and analogical reasoning (illustrated in the first sketch below). Thirdly, it details how Transformers successfully predict the next word and generate appropriate texts by tracing and passing on information about syntactic and semantic relationships between words through multi-head attention (MHA) and feed-forward networks (FFN) (see the attention sketch below). Finally, it provides an overview of the training methods of large language models and shows how their way of “recreating a language” helps us reconfirm relevant design features of human natural languages (including distributivity and predictability) and inspires us to re-examine the various syntactic and semantic theories that have been developed and formulated so far.
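The abstract’s point about analogical reasoning with word vectors can be illustrated with a small, self-contained sketch. The four-dimensional embeddings below are hypothetical toy values, not vectors from any real model; actual embeddings are learned from corpus co-occurrence statistics and have hundreds or thousands of dimensions.

```python
# A minimal sketch of analogical reasoning with word vectors, assuming toy
# hand-made embeddings; real models learn their vectors from corpus statistics.
import numpy as np

# Hypothetical 4-dimensional embeddings (illustrative values only).
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.7]),
    "queen": np.array([0.9, 0.1, 0.8, 0.7]),
    "man":   np.array([0.2, 0.9, 0.1, 0.3]),
    "woman": np.array([0.2, 0.1, 0.9, 0.3]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# The classic analogy: king - man + woman should land near queen.
target = vectors["king"] - vectors["man"] + vectors["woman"]
best = max(vectors, key=lambda w: cosine(vectors[w], target))
print(best)  # expected: "queen"
```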
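The next-word prediction mechanism described in the abstract rests on scaled dot-product attention, the core operation inside multi-head attention. The sketch below is a minimal single-head version with randomly initialized weights, intended only to show how each word’s representation absorbs information from the words it attends to; the dimensions and weight matrices are illustrative assumptions, not the actual Transformer configuration used by ChatGPT.

```python
# A minimal sketch of scaled dot-product attention (one head); shapes and
# weights are illustrative, not real model parameters.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, W_q, W_k, W_v):
    """Single attention head: each position mixes in information from the
    positions it attends to, which is how relationship information between
    words is traced and passed on."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # how strongly each word attends to every other word
    weights = softmax(scores, axis=-1)       # attention distribution per word
    return weights @ V                       # weighted blend of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                  # 5 tokens, 8-dimensional embeddings
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(attention(X, W_q, W_k, W_v).shape)     # (5, 8): one updated vector per token
```

Multi-head attention runs several such heads in parallel and concatenates their outputs, and a position-wise feed-forward network (FFN) then transforms each token’s vector before the next layer, ultimately feeding the prediction of the next word.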
