1.Intro to LLM

Intro to LLM

Personal study notes on what an LLM is and how it works.

What is a large language model?

A subset of Generative AI; a type of foundation model. An LLM consists of: 1. A set of parameters 2. A set of code that runs the parameters

How are parameters obtained? - Model training: a chunk of the internet is run through GPUs, compressing text into a file.

What is a parameter? - Variables a model learns to make predictions, by assigning every word a token and creating a neural network.

How are LLMs made? 1. Pretraining 1. Download text 2. Get a cluster 3. Compress the text into a neural network 4. Obtain base model 2. Finetuning 1. Write labeling instructions 2. Collect high-quality Q&A responses or comparisons 3. Finetune base model with new data 4. Obtain assistant model 5. Run evaluations 6. Deploy 7. Monitor, collect misbehaviors, go back to step 1

Neural Network

Predicts the next word in the sequence; made up of parameters.

How to use a neural network? The neural network "hallucinates" the internet documents in order to create an answer.
How does it work? Little is known. Billions of parameters are dispersed through this network and can be adjusted to make better predictions.
Types:
CNN (Convolutional Neural Network): spatial feature extraction (eyes); mimics eyes by scanning for small patterns then combining them into shapes then objects, as opposed to every single pixel. Example: FaceID.
RNN (Recurrent Neural Network): sequential processing — processes data in the order that it matters. Example: speech to text. Problem: forgetful.
- Solution: Transformers — looks at every word in a sentence simultaneously.

Assistant Model

Specialized applications built on top of LLMs, trained on a smaller data set for quality (not quantity); used to outline what information needs to be where for an answer to a prompt.

How to use AI: 1. Prompt: giving the AI an input. 2. Skills: give AI an input and a framework on how to do one thing perfectly. 3. Agent: take a prompt; using the agent to communicate between the AI and you to utilize your skills to get a job done. 4. Context: using the data, such as retrieving other data. 5. Vibe Coding: code in English, describing the vibe of the intended application. 6. OpenClaw: open-source framework that runs terminal commands on local files.

Tell me the most popular agents as a high school student (100 examples of AI agents) - A2A and MCP - Context: using the data, such as retrieving other data - Vibe Coding: code in English, describing the vibe of the intended application - OpenClaw: open-source framework, runs terminal commands on local files

Other: - LLM Scaling Laws: measures the number of parameters and amount of text. - Capabilities: use tools to perform tasks (calculator, browser, etc.). - Multimodality: - LLM can see images (not just make images). - Auditory processing and transmission.

Future of LLM

Thinking System 2
Book: Thinking, Fast and Slow (two types of thinking).
System 1: quick, automatic (currently LLMs).
System 2: rational, slower — convert time into accuracy.
Self-Improvement: (AlphaGo) learn by self-improvement, the way AlphaGo learned Go from humans then beat humans through reward training. Open question: can LLMs do this?
Customization

LLM Security

Jailbreak attacks:
Through role-play, get ChatGPT to do whatever.
Base64 coding.
Universal transferable suffix (can be appended to any prompt; can also be an image).
Prompt injection: taking over the prompt through some sort of injection.
Data poisoning: can create a trigger word that, when attached, can do anything.