Can Counting Words make Sense? A Simple Application of Visual Tools for Basic Machine Learning in Linguistics

Marc-Daniel Rahn

Abstract

The field of Natural Language Processing (NLP) is advancing rapidly with the development of tools for automatic text and language processing. The introduction of ChatGPT and similar tools has sparked discussions about the use of artificial intelligence (AI) in society and academia. These tools are typically based on machine learning (ML), which allows computer programs to learn from data and build models that make decisions. While ML models can be trained on language data to predict answers to research questions, they are not always easily accessible to linguists or optimised for their needs. This paper proposes using simple graphical tools for machine learning to help linguists formulate research questions and hypotheses, and so assess whether an investigation of a specific data set is likely to be fruitful. By providing a user-friendly interface, these tools aim to overcome the barriers that deter linguists from utilising machine learning techniques. To illustrate the application of such a tool in preliminary studies, this paper describes three experiments, each of which addresses a real question of interest within the field of Caucasiology. The appendix gives a step-by-step guide for recreating the process.

Keywords: Machine Learning, Caucasiology, Linguistics, Graphical Tools

Published: Sep 2, 2024

Article Details

Section
Language Technologies