Deep Learning
AI specialist Jürgen Schmidhuber on credit assignment, recurrent neural networks and how can you solve the par...
To understand the challenges facing artificial intelligence (AI), we must consider the historically established approaches that underlie the tasks we will need to address in the near future.
Artificial intelligence as a field of research emerged in the mid-20th century. People began tackling the problem from two directions. The first, most obvious idea is this: if we know that humans possess intelligence and want to reproduce it, then from a biological perspective that intelligence is a product of the brain and the nervous system. Thus, we can approach the creation of intelligent programs and algorithms by modelling the brain’s function. To model the brain, we take its components, individual nerve cells (neurons), and build from them a network that can solve intellectual tasks. This approach in artificial intelligence is known as “artificial neural networks.”
On the other hand, we can consider human intelligence as a form of activity and study it from the perspective of psychology and practical problem-solving. In this case, we say that we do not want to reproduce intelligence as a biological entity; this is irrelevant since we are not trying to grow a copy of a human to make a robot. We aim to replicate the essence of intelligence on another material substrate, using principles that we can identify in human intellectual activity.
If the first approach is an attempt to model the brain, the second direction involves working from the top down, not assembling intelligence from individual elements but deconstructing it. By understanding the activities that lead to intellectual outcomes, we will investigate and try to find algorithms that reproduce them. The research community saw that the second path—the path of logical or symbolic AI—seemed much more promising. Thus, this direction developed most rapidly after the emergence of AI. Research on artificial neural networks continued but shifted somewhat to the periphery.
Logic, inference engines, and systems capable of playing formal games thrived. This led to several exciting advancements, such as knowledge-based systems, in which knowledge of a specific subject area is described formally. Such a system needs not only inference rules but also knowledge about the entities in the domain and their interrelationships (the domain knowledge) and the specific facts from which we want to draw conclusions. In other words, there is general knowledge, and there are the specific conditions of the problem being solved. We apply the formal inference rules to the domain knowledge and the facts to derive new facts, verify statements, or make predictions. This description of subject areas was formalized into a tool called an “ontology”: a formal description of a subject area that lists the concepts and objects involved in some activity or prediction, the formalized rules of interaction between these objects, and the rules for reasoning within the domain.
By the late 1990s, this approach reached its peak when IBM’s Deep Blue, a chess program, defeated world champion Garry Kasparov. Deep Blue, which incorporated the formal rules of the game and move prediction, could compute to a great depth and was much stronger than a human in formal games. However, the ontology-based approach could not encompass the world’s richness: it worked excellently in narrow areas but failed with broader tasks. It could not improve itself by expanding its knowledge with new information, and it could not handle poorly formalized tasks. These are two significant shortcomings of the method.
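The way a knowledge-based system derives new facts from domain knowledge and given facts can be illustrated with a very small forward-chaining loop. This is only a minimal sketch: the facts, the rule format, and the function name `forward_chain` are hypothetical and chosen for illustration, and real knowledge-based systems use far richer rule languages.

```python
# Hypothetical toy knowledge base; all names are illustrative only.
# A fact is a triple: (relation, entity, category).
facts = {("is_a", "socrates", "human")}

# Domain knowledge: each rule (X, Y) reads "anything that is_a X is also is_a Y".
rules = [("human", "mortal"), ("mortal", "finite")]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # Iterate over a snapshot, because we add to `derived` in the loop.
            for relation, entity, category in list(derived):
                new_fact = ("is_a", entity, conclusion)
                if (relation == "is_a" and category == premise
                        and new_fact not in derived):
                    derived.add(new_fact)
                    changed = True
    return derived


result = forward_chain(facts, rules)
for fact in sorted(result):
    print(fact)
```

Starting from a single fact, the loop derives that the same entity is also "mortal" and "finite". Nothing is learned here: every conclusion follows from rules that a person wrote down in advance.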
Building a knowledge-based system requires experts in the field and a person who can translate the expert’s knowledge into a formal system. The system must be designed to cover as many real-world cases as possible. Practice showed that creating a system widely applicable in any field was impossible.
Neural network algorithms possess the qualities missing in knowledge-based systems. Creating them does not require formalizing expert knowledge—they learn automatically from examples of correct task solutions. There are input data (facts from the subject area, analogous to knowledge-based systems), and the output should be conclusions from these facts. The neural network must learn to provide the correct answers from examples of inputs and outputs, possibly building an ontology in a way that humans cannot predict.
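Learning from examples of inputs and correct outputs can be shown with the simplest possible case, a single artificial neuron (a perceptron) that learns the logical AND function. This is a sketch under stated assumptions: the task, the learning rate, the number of epochs, and the function names are chosen for illustration and are not taken from the text, and a modern deep network is many orders of magnitude larger.

```python
def train_perceptron(examples, epochs=20, lr=1.0):
    """Learn weights and a bias from (inputs, target) pairs."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction  # 0 when the answer is already right
            # Perceptron rule: nudge the weights toward the correct answer.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return 1 if the neuron fires for these inputs, otherwise 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Examples of correct solutions: inputs and the expected output (logical AND).
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_examples)

for inputs, target in and_examples:
    print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

No rule for AND is written anywhere in the code. The behaviour comes entirely from the examples, which is the property that knowledge-based systems lack.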
Despite attempts to use neural networks for solving intellectual tasks, they did not achieve strong results, and AI research stagnated in the early 2000s. The term “artificial intelligence” became almost derogatory in scientific circles, where it was considered the domain of charlatans making unrealistic predictions. Ten years later, increased computational power and a significant accumulation of data opened a window for creating much larger neural networks. Here we see the difference between the top-down and bottom-up approaches: neural networks, unlike formal approaches, require substantial amounts of data to learn from. For a formal system, you can encode knowledge obtained from experts and then make only minor parameter adjustments, but learning from scratch requires many examples. Until then, neural networks had not been trained on datasets of this size, and the networks themselves were fairly primitive. Once people could use deep neural networks with many layers and train them on vast amounts of data, neural networks achieved impressive results, surpassing traditional methods in specific tasks.
Thus, neural networks solve tasks we cannot formalize, but they require large datasets. It started with image recognition: within a few years, neural networks learned to recognize and classify objects in images better than humans, a task on which they had previously performed several times worse. A recent triumph was a neural network beating a human at Go. The Go algorithm evaluated the situation on the board rather than directly computing all possible moves. This hybrid solution combined deep search and move computation with something like a player’s intuition: a neural network, trained on many played games of Go, estimated which positions were better or worse. This allowed the search algorithm to calculate only the moves that the neural network predicted to be good, significantly reducing computation.
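The idea of letting an evaluator decide which moves the search should bother to calculate can be sketched on a toy game. This is not the algorithm of the actual Go system. It is a minimal illustration that assumes a simple subtraction game (take 1, 2, or 3 stones, and whoever takes the last stone wins) and uses a hand-written `evaluate` function as a stand-in for a trained neural network. All names and parameters here are hypothetical.

```python
def legal_moves(pile):
    """A player may take 1, 2, or 3 stones, but no more than remain."""
    return [take for take in (1, 2, 3) if take <= pile]


def evaluate(pile):
    """Stand-in for a trained network (hand-written, NOT learned).

    Scores the position for the player about to move:
    +1.0 looks winning, -1.0 looks losing.
    """
    return -1.0 if pile % 4 == 0 else 1.0


def search(pile, depth, top_k, counter):
    """Negamax search that expands only the top_k moves the evaluator prefers."""
    counter[0] += 1  # count visited positions to measure the work done
    moves = legal_moves(pile)
    if depth == 0 or not moves:
        return evaluate(pile), None
    # Prefer moves that leave the opponent in the worst-looking position.
    ranked = sorted(moves, key=lambda take: evaluate(pile - take))
    best_score, best_move = float("-inf"), None
    for take in ranked[:top_k]:
        score, _ = search(pile - take, depth - 1, top_k, counter)
        score = -score  # the opponent's gain is our loss
        if score > best_score:
            best_score, best_move = score, take
    return best_score, best_move


full_nodes, pruned_nodes = [0], [0]
full_result = search(10, 4, top_k=3, counter=full_nodes)
pruned_result = search(10, 4, top_k=1, counter=pruned_nodes)
print("full search:  ", full_result, "positions visited:", full_nodes[0])
print("guided search:", pruned_result, "positions visited:", pruned_nodes[0])
```

Both searches choose the same move, but the guided one visits far fewer positions. The saving is only as good as the evaluator: here the stand-in happens to be exact for this toy game, whereas a learned evaluator is an approximation.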
Today, we face a duality in our approaches and methods for solving AI tasks. On the one hand, we have ontology-based approaches. Vast ontologies have been created for various knowledge areas, formalizing useful information. On the other hand, we have neural networks learning from examples. However, finding many examples is challenging for some knowledge areas, limiting neural network use. The future of AI lies in combining ontological and neural network approaches. How can we achieve this?
In knowledge-based systems, one problem is linking the entities in our ontologies to concrete situations, that is, recognizing a real situation as one the system can reason about. Here, neural networks could help by recognizing situations correctly and mapping them onto the ontology. Conversely, we could use ontologies to plan our intelligent agent’s actions or determine the correct response in a given situation. Modern neural network methods excel at classification tasks but do not match logic-based systems in planning and long chains of computation. The ideal solution might be to let neural networks handle the parts of the intelligent agent that we cannot formalize, and to use ontologies and knowledge-based systems for reasoning and planning. The remaining task is to devise a good way to combine these approaches.
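One possible shape of such a combination, sketched under loud assumptions: a recognizer maps raw observations onto a concept in the ontology, and symbolic rules then choose the response. Everything below is hypothetical. The concepts, the rules, and the two-number "features" are invented for illustration, and the nearest-prototype `recognize` function is only a stand-in for a trained neural network.

```python
import math

# Hypothetical ontology: concept -> more general category.
ONTOLOGY = {"pedestrian": "obstacle", "car": "obstacle", "green_light": "signal"}

# Hypothetical symbolic rules: category -> response of the agent.
RULES = {"obstacle": "brake", "signal": "proceed"}

# Stand-in for a trained network: one prototype feature vector per concept.
PROTOTYPES = {
    "pedestrian": (0.9, 0.1),
    "car": (0.8, 0.8),
    "green_light": (0.1, 0.9),
}


def recognize(features):
    """Perception step: map raw features to the closest known concept."""
    return min(PROTOTYPES, key=lambda c: math.dist(features, PROTOTYPES[c]))


def decide(features):
    """Reasoning step: look the concept up in the ontology and apply a rule."""
    concept = recognize(features)
    category = ONTOLOGY[concept]
    return concept, category, RULES[category]


print(decide((0.85, 0.2)))
print(decide((0.0, 1.0)))
```

The division of labour follows the paragraph above: the part that is hard to formalize (what am I looking at?) is handled by the recognizer, while the part that is easy to state as rules (what should I do about it?) stays symbolic and inspectable.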