Machine learning is a set of techniques and algorithms that allow computer programs to learn simple or complex tasks by analyzing training data (examples of how they should behave). Some believe machine learning is the first stage in the development of true AI, since it is the first time machines can accomplish tasks without being explicitly programmed for them.
A variety of techniques exist to assign a label or a value to a given set of attributes. These include decision trees, support vector machines, clustering and, perhaps most importantly, neural networks.
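As a minimal sketch of what "assigning a label to a set of attributes" looks like in practice, here is a decision tree trained with the scikit-learn library; the dataset and attribute meanings are invented purely for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Attribute vectors: [height_cm, weight_kg]; labels: species (toy data)
X = [[20, 4.0], [25, 5.5], [60, 30.0], [65, 32.0]]
y = ["cat", "cat", "dog", "dog"]

clf = DecisionTreeClassifier()
clf.fit(X, y)                      # learn split rules from the labeled examples
print(clf.predict([[22, 4.5]]))    # -> ['cat']
```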
Artificial neural networks are composed of a set of interconnected units (neurons). Each neuron is connected to others by connections of varying weights. The values arriving on the incoming connections are summed, the sum is processed by an activation function (such as the hyperbolic tangent), and the result is passed on to other neurons.
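A single artificial neuron can be sketched in a few lines of Python; the weights, bias, and inputs below are arbitrary illustrative values:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the incoming values, squashed by an activation function
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return math.tanh(total)  # hyperbolic tangent, as mentioned above

print(neuron([0.5, -1.0, 2.0], [0.8, 0.2, -0.5], bias=0.1))
```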
The task of the learning algorithm is to progressively choose better and better values for the weights so that the network behaves as desired, according to the sample input/output data.
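One common such learning algorithm is gradient descent, which nudges each weight in the direction that reduces the error on the sample data. A bare-bones sketch for a single linear neuron (the learning rate and toy data are made up for illustration):

```python
# Fit y = w * x with gradient descent on squared error (toy example)
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, target) pairs, true w = 2
w, lr = 0.0, 0.05

for _ in range(100):
    for x, target in data:
        error = w * x - target
        w -= lr * 2 * error * x  # derivative of (w*x - target)^2 w.r.t. w

print(w)  # converges near 2.0
```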
Among these techniques, recurrent neural networks contain neurons or layers of neurons that feed their outputs back into an earlier layer, making it possible to make decisions based on previously seen data (thus creating a form of memory).
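This feedback can be sketched as a hidden state carried from one step to the next; below is a minimal, illustrative Elman-style recurrence with arbitrary weights:

```python
import math

def rnn_step(x, h_prev, w_x, w_h, bias):
    # The new hidden state mixes the current input with the previous state
    return math.tanh(w_x * x + w_h * h_prev + bias)

h = 0.0                       # initial (empty) memory
for x in [1.0, 0.5, -0.3]:    # a toy input sequence
    h = rnn_step(x, h, w_x=0.7, w_h=0.4, bias=0.0)
    print(h)                  # the state depends on all inputs seen so far
```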
Deep neural networks have been able to learn to describe the contents of a photo, both in terms of individual objects and in terms of short sentences.[1] Results are sometimes imprecise, but surprisingly good for a computer.
Neural networks have also been used to play arcade games by only looking at the pixels on the screen, without knowing the game rules in advance.[2] Networks have even rediscovered some common tricks used by human players to achieve better scores faster.
In 2016, a competition for making a neural network capable of playing Doom by only looking at the rendered contents of the screen was announced.[3]
Recurrent neural networks are capable of learning the syntax rules of a language (either natural or artificial) by looking at some sample data, and then producing similar-looking text. RNNs have been shown to be able to learn and produce quasi-realistic English text, XML, LaTeX, wiki markup and C code.[4]
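The generation side of such a model can be sketched as repeatedly sampling the next character from the network's predicted probability distribution. In this sketch the distribution is a hard-coded stand-in; a real trained model would supply it:

```python
import random

def sample_next(char_probs):
    # Pick the next character according to the model's predicted distribution
    chars, probs = zip(*char_probs.items())
    return random.choices(chars, weights=probs, k=1)[0]

# Hypothetical distribution a trained model might output after seeing "th"
print(sample_next({"e": 0.7, "a": 0.2, "o": 0.1}))
```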
Transformer[5]-based networks such as BERT and GPT-3 (and derivatives, including ChatGPT) took over within half a decade, thanks to advantages over RNNs such as parallelizable training and better handling of long-range dependencies, and were the main focus of the field in the early 2020s.
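At the heart of the transformer is the attention mechanism. A minimal sketch of scaled dot-product self-attention using NumPy, with purely illustrative matrix shapes and values:

```python
import numpy as np

def attention(Q, K, V):
    # Each output row is a weighted mix of the rows of V,
    # weighted by how well each query matches each key.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V

# Three toy "tokens" with 4-dimensional representations
X = np.random.rand(3, 4)
print(attention(X, X, X))  # self-attention: Q, K, V all derived from X
```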
When language models grow large (hundreds of millions of "parameters" – the network's learned weights – or more), as such transformer networks have, they are called large language models.