What else could the science of cybernetics have been called? Can the brain be considered a universal computer? And what did a twelve-year-old boy from Detroit do to amaze the philosopher Bertrand Russell?
Norbert Wiener was one of the outstanding scientists of the 20th century. His life took a remarkable course, one typical of many scientists of the early 20th century. He graduated from high school at eleven, and at fourteen he earned a bachelor’s degree in mathematics from Tufts University. By eighteen, he had already defended his PhD dissertation. After that, he went to Europe, to Cambridge, where he met Bertrand Russell, who had recently published his famous “Principia Mathematica,” an attempt to derive mathematics from logic.
By the start of the war, Wiener was already an established mathematician with considerable recognition. His views took shape in the pre-war period, and this is reflected in his later work. On the one hand, he had a great interest in the foundations of science and philosophy. On the other hand, he realized that philosophy could not fully engage an intellect geared toward formal description and mathematics, and he decided to work on general mathematical methods that could be applied to real-world problems.
At the beginning of the war, he understood that many practical problems required solving partial differential equations, which are very difficult to solve analytically. They can be solved numerically instead, but that still requires a significant amount of computation. Therefore, at the request of his supervisor, he prepared proposals for the development of computational architectures. At that time, the choice was between electromechanical relay computers and analogue computers, which could integrate these equations and produce their solutions.
Wiener, however, had experience with analogue computers and knew that their errors accumulated quickly, so he looked towards digital machines. He also understood that relay-based computers were slow. In his proposals, he argued that it was necessary to move to electronic computers, which could be ten thousand times faster than relay computers. The limiting factor, however, was memory: in relay computers, the main architecture of the time, memory was implemented with punch cards. The initial conditions of a task were coded onto punch cards and loaded into the computer; the computer read them, performed the calculations, and recorded the results of each computational step back onto punch cards. Solving a problem iteratively, as is often the case when differential equations are solved numerically, therefore meant reloading new punch cards again and again.
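The iterative pattern described above can be sketched in a few lines of Python. This is only an illustration, not anything taken from Wiener's proposals: the one-dimensional heat equation, the five-point grid, and the names `heat_step`, `solve`, and `alpha` are all assumptions chosen for brevity. The point is that every step takes the previous step's output as its input, which is exactly the data a relay machine had to punch out and read back in.

```python
def heat_step(u, alpha):
    """One explicit finite-difference step of the 1D heat equation.

    u     -- list of values on an evenly spaced grid (end points stay fixed)
    alpha -- D * dt / dx**2; the scheme is stable only for alpha <= 0.5
    """
    new = u[:]  # copy, so the end points keep their boundary values
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def solve(u0, alpha, steps):
    """Repeat the step; each iteration consumes the previous result."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, alpha)  # the "reload the cards" moment
    return u


# Hypothetical initial condition: a single hot point in the middle.
initial = [0.0, 0.0, 1.0, 0.0, 0.0]
after_one = heat_step(initial, alpha=0.25)
final = solve(initial, alpha=0.25, steps=50)
print(after_one)
print(final)
```

Even this toy problem needs dozens of passes over the whole grid; with intermediate results stored on cards, each pass would have involved a full write and read cycle.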
Punching cards and reading them back was a slow process. Wiener was one of those who proposed that fast memory, too, should be built from electronic circuits. At the same time, Wiener’s interest in modeling and understanding brain function became noticeable. He saw that an automatic guidance system, for example, when treated as an adaptive system, resembles the system that controls movement in animals. In this sense, he saw the brain as a giant computer and tried to understand whether knowledge of how the brain works could be used to develop a new computer architecture.
At this point, he began collaborating with neurophysiologists at MIT and Harvard. During the war, he became acquainted with two scientists. One was Warren McCulloch, and the other was McCulloch’s young assistant Walter Pitts, who could also be considered a genius, albeit with a completely different background from Wiener’s.
It is worth saying a few words about Pitts because, in some ways, his fate resembled Wiener’s; perhaps this is why he became one of Wiener’s students. Pitts grew up in a troubled family: he was bullied on the street, and his father sent him to work instead of school. He often sought refuge from these troubles in the public library in Detroit, where he studied Latin, mathematics, and other subjects. This was in the 1930s.
One evening, hiding once again from bullies who were chasing him, he was so afraid of being beaten up if he went outside that he stayed in the library overnight. That night, he stumbled upon the three-volume edition of Bertrand Russell’s “Principia Mathematica.” This massive philosophical work, full of formal descriptions and derivations, was beyond the grasp of many philosophers of the time, yet he read it in three days without leaving the library. He found several errors and, unable to contain himself, wrote a letter to Bertrand Russell. When Russell read the letter, he invited Pitts to become his graduate student at Cambridge. Pitts was only twelve at the time, so he did not go to Cambridge and ended up instead at the University of Chicago, where he started attending philosophy lectures in which Russell’s work was discussed. Among the renowned philosophers there at the time was Rudolf Carnap, who also tried to link the philosophy of science with probability theory.
Many scientists at that time started from the philosophical foundations of science, from Bertrand Russell’s book and from attempts to describe the foundations of mathematics and science rigorously, and then switched to trying to understand cognition by studying how the brains of animals and humans perceive the world. Likewise, after some time, Walter Pitts found himself in a group engaged in the biophysical modeling of neurons. There he met Warren McCulloch, a philosopher and neurophysiologist who lacked the formal and mathematical skills needed to express his ideas rigorously.
Around that time, Turing’s work had been published, showing that any computation, any problem that can be computed at all, can be carried out by a universal Turing machine. McCulloch’s idea was that if each neuron in the brain can be modeled by a simple logical function, then the brain as a whole would also be equivalent to a Turing machine. But he could not prove this because he lacked the technical skills. When he met Pitts, he realized that this was the person who could help him. He invited Pitts to live with him, and for several years Pitts lived at his home, working on this research. In the mid-1940s, Pitts met Wiener. The story goes that when Pitts came to Wiener’s office, Wiener initially did not ask him anything but continued working through a formal problem on the board. Pitts commented and suggested what to do next. When Wiener reached the second board, continuing the solution, he said, “Young man, I want to invite you to become my graduate student as well; let’s work together.”
By this time, Pitts and McCulloch had already written a paper that can be considered foundational for the field of artificial intelligence, titled “A Logical Calculus of the Ideas Immanent in Nervous Activity.” In it, they showed that if the work of a primitive neural network is modeled as a sequence of logical operations, then the basic operations of a Turing machine can be expressed in terms of those operations. Therefore, if neurons act as such logical calculators, the brain is comparable to a Turing machine and can perform any computation that a Turing machine can. The fundamental significance of the paper lay in the argument that, under this approach, the brain can indeed be considered a universal machine. Thus, a computer built on such logical functions could achieve the same computational potential as the brain.
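The idea of neurons as logical calculators can be illustrated with a small Python sketch. It is a simplified threshold unit in the spirit of the McCulloch-Pitts model, not the paper's own notation: the function names, the weights and thresholds, and the use of a negative weight to stand in for inhibition are assumptions made for this example. It shows only that such units can implement AND, OR, and NOT and be wired into more complex functions such as XOR; it does not by itself demonstrate equivalence to a Turing machine.

```python
def mp_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of binary inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def AND(a, b):
    # Both inputs must be active to reach the threshold of 2.
    return mp_neuron([a, b], [1, 1], threshold=2)


def OR(a, b):
    # A single active input is enough.
    return mp_neuron([a, b], [1, 1], threshold=1)


def NOT(a):
    # Assumption: inhibition is modelled as a negative weight.
    return mp_neuron([a], [-1], threshold=0)


def XOR(a, b):
    # A small two-layer network: (a OR b) AND NOT (a AND b).
    return AND(OR(a, b), NOT(AND(a, b)))


truth_table = [(a, b, XOR(a, b)) for a in (0, 1) for b in (0, 1)]
for row in truth_table:
    print(row)
```

XOR cannot be produced by a single unit of this kind, which is why the sketch combines three of them; composing units in layers is what gives such networks their expressive power.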
This greatly interested Wiener because it intersected with his thoughts on what a computer should be and with his idea that modeling the brain, and drawing on how it works, was the way to create intelligent systems. Wiener was so inspired by these ideas and results that he tried to form a group at MIT, bringing together neurophysiologists and mathematicians, to begin work on a new field of research, which he had not yet named but which would later become cybernetics.
This group included McCulloch, Pitts, Bigelow, and Rosenblueth. Wiener also tried to involve John von Neumann. In 1944–1945, Wiener was very active in this effort, corresponding almost monthly with von Neumann and organizing meetings to discuss possible research directions. However, by the end of 1945, relations between von Neumann and Wiener had cooled. This was due to their different views on how science could be used for military purposes. Wiener was deeply shocked by the US bombing of Hiroshima and Nagasaki and opposed the use of science for military purposes.
After the war, he began collaborating with a former colleague from Harvard who was heading a center in Mexico that studied heart rhythms controlled by nerve impulses and excitable tissue. It was there, in Mexico, that he wrote the book “Cybernetics.” The group he had tried to assemble never came together as a team working in one direction, and in the book he tried to summarize all his thoughts on the subject. He discussed creating a new scientific field that would study how information is transmitted in living organisms, machines, and society. In his view, these principles of information transmission make it possible to build a new science of adaptive learning systems.
Interestingly, he spent a long time looking for a suitable name for this science. Since it was meant to be about the transmission of information, he began going through various Greek words. One option was “angelos” (meaning “messenger”). But he rejected it because, in Western culture, an angel is a messenger of God, and it was unclear what the science would then be called (angelic analytics, perhaps). He therefore settled on another Greek root, the one behind “cybernetics,” which refers to one who steers or governs something, such as a boat, an organization, or a people. He decided to name the new science cybernetics.