A Large Language Model (LLM) is a type of generative AI (GenAI) software. It contains traditional software code and libraries, as well as the model’s parameters. The parameters are the knowledge stored in the model: numerical values that are adjusted during training and that represent, in a statistical manner, the relationships and patterns learned from the input data. Examples of learned patterns include contextual information in the input text (whether ‘bank’ refers to a place to store money or the side of a river) and semantic information (such as cats and dogs both being animals).
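To make this concrete, the sketch below shows how semantic similarity can be read out of purely numerical parameters. The 4-dimensional word vectors are made up for illustration only; real models learn embeddings with hundreds or thousands of dimensions during training.

```python
import numpy as np

# Hypothetical 4-dimensional word vectors, invented for illustration.
# In a real LLM these values are learned parameters, not hand-written.
vectors = {
    "cat":  np.array([0.9, 0.8, 0.1, 0.0]),
    "dog":  np.array([0.8, 0.9, 0.2, 0.1]),
    "bank": np.array([0.1, 0.0, 0.9, 0.7]),
}

def cosine_similarity(a, b):
    """How similar two vectors are, from -1 (opposite) to 1 (identical)."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# "cat" and "dog" end up close together because training data uses them
# in similar contexts; "bank" lands farther away.
print(cosine_similarity(vectors["cat"], vectors["dog"]))   # high, ~0.99
print(cosine_similarity(vectors["cat"], vectors["bank"]))  # low, ~0.13
```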
The traditional software contains components known as neural networks, which loosely mimic the human brain. These neural networks are arranged according to the model’s architecture; the transformer architecture is currently dominant in LLMs. In the brain analogy, the traditional software corresponds to the physical brain, and the parameters correspond to the learned information stored in the brain’s synapses. The number of parameters in current LLMs typically ranges from billions to trillions, whereas the human brain has hundreds of trillions of synapses.
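As a rough illustration of parameters arranged by an architecture, here is a minimal sketch of a single fully connected neural-network layer. The sizes and random values are arbitrary assumptions; a real transformer stacks many such layers together with attention blocks.

```python
import numpy as np

# The weight matrix W and bias vector b are the "parameters": plain
# numbers that training adjusts. This toy layer has only 15 of them;
# current LLMs have billions to trillions.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # 4 inputs -> 3 outputs: 12 weight parameters
b = np.zeros(3)               # 3 bias parameters

def layer(x):
    # Multiply inputs by learned weights, add the bias,
    # then apply a nonlinearity (ReLU).
    return np.maximum(0.0, x @ W + b)

x = np.array([1.0, 0.5, -0.3, 2.0])          # an example input vector
print(layer(x))
print("parameter count:", W.size + b.size)   # 15 for this toy layer
```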
When an LLM performs a task given as text, the execution flow inside the GenAI software is controlled by the traditional software, which determines which parameters are used in the calculations and the order in which operations are performed. This flow results in a probability distribution over possible next tokens, which is used to determine the answer to the given task.
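The sketch below illustrates that last step under simplified assumptions: hypothetical raw scores (logits) for a few candidate tokens are converted with a softmax into probabilities, and the most probable token is picked as the answer (greedy decoding). The token names and numbers are invented for illustration.

```python
import numpy as np

# Hypothetical raw scores an LLM might assign to candidate next tokens.
logits = {"Paris": 5.1, "London": 2.3, "banana": -1.0}

def softmax(scores):
    # Convert raw scores into probabilities that sum to 1.
    exp = np.exp(scores - np.max(scores))  # subtract max for numerical stability
    return exp / exp.sum()

probs = softmax(np.array(list(logits.values())))
for token, p in zip(logits, probs):
    print(f"{token}: {p:.3f}")   # Paris ~0.94, London ~0.06, banana ~0.00

# Picking the highest-probability token yields the answer (greedy decoding).
print("answer:", max(zip(logits, probs), key=lambda t: t[1])[0])
```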