Resilience and Reliability of the Internet
Computer scientist David Clark on the user's role in ensuring internet resilience, the policy of communication...
The standard use of computer algorithms or mathematical calculations can be described as follows: a programmer takes input data, describes a sequence of operations on it, and obtains an answer by running the resulting program. Now imagine a situation where there are input conditions and examples of what was done in similar cases, but no explicit instructions for finding the answer, and no answer itself, for the task at hand. Machine learning is a methodology for building an “intelligent” model: it learns to solve a problem from the examples provided to it, without being told how the task should be solved.
This significantly simplifies many practical problems. For example, you can classify images: gather hundreds of pictures of mushrooms and clouds, label them, and thus create a training dataset for the task of identifying objects in pictures. Then you can feed this data to a chosen model, and it will learn from the examples to distinguish mushrooms from clouds on its own. The mushrooms-and-clouds example is relatively straightforward and can be implemented with a regular program. However, when the number of image types grows from 2 to 1000, writing such a program by hand becomes extremely complex, and this is where machine learning comes to the rescue.
In practice, a model is a specific mathematical function with various parameters. During the training process, the model’s parameters are adjusted in such a way that the algorithm’s error is minimized. The correct parameters are not known in advance; this is the essence of how the model works. After transitioning from the training dataset to real-world examples, the model should demonstrate the ability to make accurate predictions.
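To make the idea of "adjusting parameters to minimize error" concrete, here is a minimal sketch in Python. It assumes the simplest possible model, a straight line with two parameters, and uses plain gradient descent; the function names, the learning rate, and the toy data are illustrative choices, not something taken from the text above.

```python
def predict(w, b, x):
    """The model: a function of the input x with two parameters, w and b."""
    return w * x + b

def mean_squared_error(w, b, data):
    """Average squared difference between predictions and known answers."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def train(data, learning_rate=0.05, steps=2000):
    """Adjust w and b step by step so that the error on the data shrinks."""
    w, b = 0.0, 0.0  # the correct parameters are not known in advance
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy training set generated from the rule y = 2x + 1 (an assumption for the demo).
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(training_data)

# After training, the model can be applied to an input it has never seen.
prediction_for_new_input = predict(w, b, 10.0)
```

The training loop never sees the rule behind the data; it only sees examples, yet the learned parameters allow a reasonable prediction for a new input.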
Some might say that artificial intelligence is a specific instance of machine learning. More likely, it’s the other way around: machine learning is a specific instance of artificial intelligence, as solving intelligent tasks can be achieved through various methods, including ML.
Artificial intelligence, however, is a term with different meanings. Firstly, it is a field of scientific research that deals with algorithms, some of which aim to replicate human intelligence or the process of solving intelligent tasks. Secondly, it represents a set of technologies for creating algorithms, with today’s focus primarily on symbolic or neural network-based intelligence.
Symbolic AI, or symbolic artificial intelligence, is an approach within the field of artificial intelligence that attempts to model human reasoning. Imagine a task: a mathematician is proving a theorem. Let’s try to model this process. The mathematician performs certain steps, recording them in the formal language of mathematical notation. The result of this process is a mathematical proof. Certain rules define the language: how to write in it, and what can and should be written. A system that reasons in a formal language in this way is called symbolic artificial intelligence because it is based on the manipulation of symbols.
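As a rough illustration of reasoning by symbol manipulation, the sketch below implements forward chaining, one simple technique from symbolic AI: known facts and "if these premises hold, then this conclusion holds" rules are combined until nothing new can be derived. The fact names and rules are invented for the example, and a real theorem prover is far more elaborate.

```python
def forward_chain(facts, rules):
    """Repeatedly apply rules of the form (premises, conclusion)
    until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all its premises are already known.
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules written as plain symbols; the system never
# "understands" them, it only matches and combines them.
rules = [
    (("is_human",), "is_mortal"),
    (("is_mortal", "is_greek"), "is_mortal_greek"),
]
facts = {"is_human", "is_greek"}
derived = forward_chain(facts, rules)
```

Every derived fact can be traced back to the rules that produced it, which is what makes this style of system resemble a written proof.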
Neural networks represent another approach to building artificial intelligence, one that is based not on modeling how humans reason but on how our brains work during learning. To achieve this, a network of “neurons” is created, each of which is a simplified mathematical model of a “real” neuron (the structural and functional unit of the human brain, the basic element of information processing). Neurons are interconnected to form a network, and the result of information processing is determined by the connections between the elements of that network.
Hence, a neural network can serve as a model used in machine learning, allowing us to employ a network of neurons with numerous parameters that define the transmission of signals between individual neurons. These model parameters will be adjusted during the training process to minimize errors.
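The following sketch shows what such a network can look like in code. It assumes one common textbook formulation of an artificial neuron (a weighted sum of inputs passed through a sigmoid function); the weights here are set by hand for illustration rather than learned, so that the tiny network computes "exclusive or" of two inputs.

```python
import math

def sigmoid(z):
    """Squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """A mathematical 'neuron': weighted sum of inputs, then a nonlinearity."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def network(inputs, hidden_layer, output_neuron):
    """Two hidden neurons feed one output neuron."""
    hidden = [neuron(inputs, weights, bias) for weights, bias in hidden_layer]
    out_weights, out_bias = output_neuron
    return neuron(hidden, out_weights, out_bias)

# Hand-picked parameters (an assumption for the demo, not trained values).
# The first hidden neuron behaves like OR, the second like NAND,
# and the output neuron like AND, which together give XOR.
hidden_layer = [([20.0, 20.0], -10.0), ([-20.0, -20.0], 30.0)]
output_neuron = ([20.0, 20.0], -30.0)

xor_table = {
    (a, b): round(network([a, b], hidden_layer, output_neuron))
    for a in (0, 1)
    for b in (0, 1)
}
```

Changing any of the weights changes what the network computes. In machine learning these numbers are not chosen by hand but adjusted automatically during training.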
Being a language model, ChatGPT is implemented using a neural network. More broadly, a language model can be implemented by various methods (and conversely, neural networks can be used not only for language modeling but also for tasks like speech recognition, image processing, and more).
For instance, one can gather a large corpus of texts and calculate the likelihood of a particular word occurring in all these materials. Then, one can create the simplest probabilistic language model where each subsequent word in a sentence is chosen proportionally to the calculated probabilities. While the quality of such a model may be low, it still qualifies as a language model. Thus, language modeling involves predicting what should come next in a text.
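The simplest probabilistic model described above can be sketched in a few lines of Python. The toy corpus and the function names are assumptions made for the example; a real corpus would contain millions of words.

```python
import random
from collections import Counter

def train_unigram(corpus):
    """Estimate the probability of each word from its frequency in the corpus."""
    counts = Counter(corpus.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def generate(model, length, seed=None):
    """Pick each next word at random, in proportion to its probability."""
    rng = random.Random(seed)
    words = list(model)
    weights = [model[word] for word in words]
    return " ".join(rng.choices(words, weights=weights, k=length))

# A tiny illustrative corpus (13 words, "the" appears 4 times).
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_unigram(corpus)
sample = generate(model, length=5, seed=0)
```

Because this model ignores all preceding words, its output is a jumble of frequent words. Better language models differ mainly in how much of the preceding text they take into account when predicting the next word.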
ChatGPT, however, is not just a language model; it’s a model tailored for simulating live human interactions on the internet, a “humanized” version. When we give ChatGPT a task, such as writing a story, the model generates a continuation that usually follows the words “write a story.” Imagine having a large dataset with examples where, in some cases, the text “write a story” was followed by an actual story. The neural network can generate something akin to a story based on the examples it has seen. But it can also respond, “I’m not in the mood to do that right now,” if it has seen a similar example in its training data. However, the likelihood of such responses is usually reduced to prevent user frustration.
Another interesting question is whether the process of a user fine-tuning ChatGPT for their specific purposes through conversation and clarifications can be considered a form of machine learning. From a formal perspective, it may not be considered training, as it doesn’t affect the model’s core parameters. However, by altering the initial query, you can influence the response it provides specifically for you.
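The distinction between training and steering a model through the query can be shown on a deliberately tiny stand-in. The sketch below assumes a bigram model (each word is predicted only from the one before it), which is vastly simpler than ChatGPT; the corpus and function names are invented for the example. The parameters are fixed once, and afterwards only the prompt changes.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for every word, which words follow it in the corpus."""
    table = defaultdict(Counter)
    words = corpus.lower().split()
    for prev, nxt in zip(words, words[1:]):
        table[prev][nxt] += 1
    return table

def continue_text(table, prompt, max_words=3):
    """Extend the prompt with the most frequent follower of the last word.
    The table is only read here, never modified."""
    words = prompt.lower().split()
    for _ in range(max_words):
        followers = table.get(words[-1])
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Training happens once; after this the parameters stay fixed.
table = train_bigram("story once upon a time . poem roses are red .")
snapshot = {word: dict(followers) for word, followers in table.items()}

# Two different prompts lead to two different continuations.
story_reply = continue_text(table, "tell me a story")
poem_reply = continue_text(table, "tell me a poem")

# The model's parameters are the same as before the "conversation".
parameters_unchanged = snapshot == {w: dict(f) for w, f in table.items()}
```

In this sketch, rephrasing the prompt changes the answer while the learned counts stay exactly as they were, which mirrors the point made above: the conversation shapes the response without retraining the model.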
An example of a synthesis of the neural and symbolic approaches is the program AlphaGo, developed by Google DeepMind in 2015. In 2016, it defeated world champion Go player Lee Sedol. AlphaGo combined neural networks that evaluated the state of the game board with a tree search procedure (Monte Carlo tree search), which is essentially symbolic. Similar developments are now underway for other tasks, including language models.
Furthermore, it’s conceivable that in the future, models could communicate with each other and share experiences, pooling resources to solve various tasks and problems. This leads to much more mysterious scenarios that are challenging to predict in advance. Nonetheless, research in this direction is ongoing worldwide.