Most GPUs and TPUs used for AI workloads are designed to accelerate the training and inference of Deep Neural Networks (DNNs).
DNNs were initially developed largely in the 1980s and 1990s by AI and cognitive science researchers, including Geoffrey Hinton, Yann LeCun, and Yoshua Bengio, among others.
Geoffrey Hinton is often referred to as the "Godfather of Deep Learning" because of his significant contributions to the development of backpropagation, the technique used to train DNNs. He also co-authored a seminal paper on deep belief networks in 2006.
Yann LeCun is known for developing the Convolutional Neural Network (CNN) architecture, which is widely used in image and video recognition. Yoshua Bengio has made broad contributions to Deep Learning and is known in particular for his work on word embeddings and neural machine translation.
DNNs are the result of contributions from many AI and computing researchers over several decades.
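To make backpropagation concrete, here is a minimal sketch, in plain NumPy, of a two-layer network learning the XOR function. The architecture, learning rate, and task are illustrative choices, not anything from the original papers.

```python
import numpy as np

# Illustrative sketch of backpropagation: a two-layer network learning XOR.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))  # input -> hidden weights
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))  # hidden -> output weights
b2 = np.zeros(1)
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass: compute activations layer by layer.
    h = sigmoid(X @ W1 + b1)    # hidden activations
    out = sigmoid(h @ W2 + b2)  # network output

    # Backward pass: propagate the error gradient back through each
    # layer via the chain rule -- the essence of backpropagation.
    d_out = (out - y) * out * (1 - out)  # grad at output pre-activation
    d_h = (d_out @ W2.T) * h * (1 - h)   # grad at hidden pre-activation

    # Gradient descent update on every weight and bias.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2))  # should approach [[0], [1], [1], [0]]
```

The same forward/backward pattern, scaled up to billions of parameters and run by automatic differentiation frameworks, is what GPUs and TPUs spend most of their time accelerating.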
The most recent advances in AI use Large Language Models (LLMs), a class of Neural Network model capable of generating human-like natural language text.
These LLMs are trained on massive amounts of text, typically billions of words, using deep learning techniques.
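As a toy illustration of that training setup (not any production pipeline), the snippet below shows how raw text can be turned into next-token prediction examples, the core objective on which LLMs are trained. The whitespace tokenizer and three-word context are deliberate simplifications.

```python
# Toy example: turning text into (context -> next token) training pairs.
text = "the cat sat on the mat"
tokens = text.split()  # real LLMs use subword tokenizers, not whitespace

context_size = 3
examples = []
for i in range(context_size, len(tokens)):
    context = tokens[i - context_size:i]  # the preceding tokens
    target = tokens[i]                    # the token to predict
    examples.append((context, target))

for context, target in examples:
    print(f"{context} -> {target!r}")
# ['the', 'cat', 'sat'] -> 'on'
# ['cat', 'sat', 'on'] -> 'the'
# ['sat', 'on', 'the'] -> 'mat'
```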
Large Language Models typically use Deep Neural Networks as their underlying architecture; indeed, an LLM is a type of Neural Network designed specifically for natural language processing tasks.
Several Neural Network architectures have been used for language modeling, including the Gated Recurrent Unit (GRU) and the Long Short-Term Memory (LSTM) network, but modern LLMs are overwhelmingly built on the Transformer. These networks serve as the basic building blocks of the LLM architecture, with many stacked layers and sophisticated training techniques used to optimize performance.
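To give a flavor of the Transformer building block, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation of a Transformer layer. The random projection matrices stand in for learned weights; real models add multiple attention heads, residual connections, and feed-forward sublayers, then stack many such layers.

```python
import numpy as np

# Minimal sketch of scaled dot-product self-attention.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                  # 4 tokens, 8-dim embeddings
x = rng.normal(size=(seq_len, d_model))  # token embeddings

# Projection matrices (random stand-ins for learned weights).
Wq = rng.normal(size=(d_model, d_model))
Wk = rng.normal(size=(d_model, d_model))
Wv = rng.normal(size=(d_model, d_model))

Q, K, V = x @ Wq, x @ Wk, x @ Wv

# Each token attends to every token: query-key similarity scores,
# scaled, then normalized into attention weights via softmax.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1

output = weights @ V     # weighted mix of value vectors
print(weights.round(2))  # attention pattern across the 4 tokens
print(output.shape)      # (4, 8): one updated vector per token
```

Because every token's attention scores can be computed in parallel as matrix multiplications, this operation maps naturally onto GPUs and TPUs, which is one reason the Transformer displaced sequential recurrent architectures like the GRU and LSTM.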
DNNs are an essential component of LLMs, as they allow the models to learn complex patterns and relationships in their training data.
The use of DNNs has been instrumental in the recent advances in natural language processing and has enabled the development of powerful language models such as those behind ChatGPT.