Introduction¶
Generative AI¶
AI systems that can produce high-quality unstructured content: text, images, audio
Impact on Jobs¶
Greater effect on:
- higher-paid jobs
- knowledge workers
Lifecycle of GenAI Project¶
- Scoping
- Build prototype
- Internal evaluation
- Improve system
- Deploy
- Monitor
LLMs¶
Large Language Models
Supervised learning to repeatedly predict the next word
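A minimal sketch of what "predict the next word" training data looks like; the sentence is made up and tokenization is simplified to whole words for illustration.

```python
# Turn a sentence into (context, next-word) training pairs,
# the supervised signal an LLM is trained on.
sentence = "my favorite food is a bagel"
tokens = sentence.split()  # simplified: real tokenizers split into subword tokens

training_pairs = [
    (tokens[:i], tokens[i])          # (context so far, word to predict)
    for i in range(1, len(tokens))
]

for context, target in training_pairs:
    print(" ".join(context), "->", target)
# my -> favorite
# my favorite -> food
# ...
```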
Applications¶
- Finding new information
- Writing
- Assistant
- Translation
- Reading
- Proof reading
- Summarization
- Chatting
Advice for chatbots: start with an internal-facing bot that works alongside staff
What an LLM can do¶
Rule of thumb
Whatever a fresh undergraduate could accomplish given only the prompt, with:
- No internet/other resources
- No training specific to your prompt
- No memory of previous tasks
Prompting Tips¶
- Be detailed: give the LLM sufficient context and the information required for the task at hand (see the prompt sketch after this list)
- Be specific
- Guide the model to think through its answer: suggest the steps for performing the task
- Experiment and Iterate
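A sketch contrasting a vague prompt with a detailed, specific one that also suggests steps. The OpenAI Python client and the model name used here are assumptions for illustration, not part of the notes.

```python
# Sketch only: assumes the official `openai` Python package and an API key
# available in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

vague_prompt = "Write about our product."

# Detailed + specific + guides the model through steps
detailed_prompt = (
    "You are writing marketing copy for a reusable water bottle aimed at hikers.\n"
    "Step 1: List three benefits relevant to hikers.\n"
    "Step 2: Turn them into a 50-word product description.\n"
    "Keep the tone friendly and avoid technical jargon."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",            # assumed model name; substitute your own
    messages=[{"role": "user", "content": detailed_prompt}],
)
print(response.choices[0].message.content)
```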
Objective¶
Helpful, Honest, Harmless
How it works¶
- Instruction tuning
- RLHF: Reinforcement Learning from Human Feedback
    - Train a separate supervised model that scores answer quality (a reward model)
    - Train the LLM to generate responses that score highly under the reward model (see the sketch below)
Can be used to reduce bias
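A toy sketch of the reward-model step in RLHF: a hypothetical reward model scores candidate answers, and higher-scoring responses are preferred. Real RLHF uses these scores as a reinforcement signal to update the LLM; only the scoring/selection step is illustrated here.

```python
# Toy illustration of the reward-model step in RLHF.
# `reward_model` stands in for a separately trained supervised model
# that maps a (prompt, response) pair to a quality score.

def reward_model(prompt: str, response: str) -> float:
    # Hypothetical scorer: prefers responses that are helpful and not too short.
    score = 1.0 if "sorry" not in response.lower() else 0.2
    return score + min(len(response) / 200, 1.0)

prompt = "How do I reset my password?"
candidates = [
    "Sorry, I can't help with that.",
    "Go to Settings > Account > Reset Password and follow the emailed link.",
]

# In real RLHF, these scores become the reward used to update the LLM;
# here we simply pick the highest-scoring candidate.
best = max(candidates, key=lambda r: reward_model(prompt, r))
print(best)
```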
Tool-Use¶
Action¶
Reasoning¶
Agents¶
- Use an LLM to choose and carry out a complex sequence of actions
- Not production-ready at the time of writing
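A minimal sketch of the tool-use / agent loop implied by the sections above: the LLM reasons, picks an action (a tool call), the tool runs, and the observation is fed back. `llm_decide` and the tools are placeholders for illustration, not a real agent framework.

```python
# Skeleton of a reason -> act -> observe loop.
def search_web(query: str) -> str:
    return f"(pretend search results for '{query}')"

def calculator(expression: str) -> str:
    return str(eval(expression))  # toy only; never eval untrusted input

TOOLS = {"search": search_web, "calc": calculator}

def llm_decide(task: str, history: list[str]) -> tuple[str, str]:
    # A real agent would prompt the LLM here; this stub finishes after one tool call.
    if not history:
        return ("calc", "19.99 * 3")
    return ("finish", f"Done. Observations so far: {history}")

def run_agent(task: str, max_steps: int = 5) -> str:
    history: list[str] = []
    for _ in range(max_steps):
        action, arg = llm_decide(task, history)   # reasoning step
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)          # action step
        history.append(observation)               # feed observation back
    return "Stopped: too many steps."

print(run_agent("What do three $19.99 items cost?"))
```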
Image Generation¶
Diffusion Model
Noise + Prompt -> Generated Image
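A highly simplified sketch of the diffusion idea: start from pure noise and repeatedly apply a denoising step conditioned on the prompt. `denoise_step` is a placeholder for the trained network, not a real model.

```python
import numpy as np

def denoise_step(image: np.ndarray, prompt: str, step: int) -> np.ndarray:
    # Placeholder for the trained denoising network; here it just damps the noise.
    return image * 0.9

def generate(prompt: str, steps: int = 50, size: int = 64) -> np.ndarray:
    image = np.random.randn(size, size, 3)   # start from pure noise
    for t in reversed(range(steps)):         # gradually remove noise, guided by the prompt
        image = denoise_step(image, prompt, t)
    return image

img = generate("a watercolor painting of a lighthouse")
print(img.shape)  # (64, 64, 3)
```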
Limitations¶
- Knowledge cut-off
- Hallucinations: LLM can produce confident responses which are completely false
- Prompt size is limited
- Does not work with structured data
- Does not do arithmetic well
- Bias & Toxicity
Caveats¶
- Be careful with confidential information
- Double-check: LLMs do not necessarily give reliable information
- For user-facing services, it is better to show a confirmation dialog before performing a transaction
Cost of LLM¶
Relatively cheap to use
4 tokens ~ 3 words
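A back-of-the-envelope cost calculation using the 4 tokens ≈ 3 words rule; the per-token price is an assumed example rate, not a quoted price.

```python
# Rough cost estimate: 4 tokens ~ 3 words, i.e. ~1.33 tokens per word.
words = 300_000                      # e.g. a large batch of generated text
tokens = words * 4 / 3               # ~400,000 tokens
price_per_1k_tokens = 0.002          # assumed example rate in USD, not a quoted price
cost = tokens / 1000 * price_per_1k_tokens
print(f"{tokens:,.0f} tokens -> ${cost:.2f}")   # 400,000 tokens -> $0.80
```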
RAG¶
- Given a question, search relevant documents for the answer
- Incorporate the retrieved text into an updated prompt
- Generate the answer from the new prompt with the additional context (see the sketch below)
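A minimal sketch of the three RAG steps above. The retrieval here is naive keyword matching (real systems typically use embeddings), and `ask_llm` is a placeholder for an actual model call; the documents are made up.

```python
# Retrieve -> augment prompt -> generate, with toy keyword retrieval.
documents = {
    "vacation_policy.txt": "Employees receive 20 paid vacation days per year.",
    "expense_policy.txt": "Meals under $50 can be expensed without a receipt.",
}

def retrieve(question: str, top_k: int = 1) -> list[str]:
    # Naive relevance score: count of shared words (real systems use embeddings).
    def score(text: str) -> int:
        return len(set(question.lower().split()) & set(text.lower().split()))
    ranked = sorted(documents.values(), key=score, reverse=True)
    return ranked[:top_k]

def ask_llm(prompt: str) -> str:
    return "(LLM answer would appear here)"   # placeholder for a real model call

question = "How many vacation days do employees get?"
context = "\n".join(retrieve(question))
augmented_prompt = f"Use the context to answer.\nContext:\n{context}\n\nQuestion: {question}"
print(ask_llm(augmented_prompt))
```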
Fine-Tuning¶
- To carry out a task that isn't easy to define in a prompt
- To help the LLM gain specific knowledge
- To get a smaller model to perform a task (see the sketch below)
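A hedged sketch of supervised fine-tuning with the Hugging Face `transformers` Trainer; the base model, dataset file, and hyperparameters are assumptions for illustration, not recommendations from the notes.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "gpt2"                              # assumed small base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token        # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumed domain-specific text file; replace with your own corpus.
dataset = load_dataset("text", data_files={"train": "my_domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```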
Pre-Training¶
- Very costly
- Requires large amount of data
For building a specific application, pre-training is the last resort
LLM Model Size¶
General guidelines
| Parameters | Capability | Example |
|---|---|---|
| 1B | Pattern matching; basic knowledge of the world | Restaurant review sentiment |
| 10B | Greater world knowledge; can follow basic instructions | Food order chatbot |
| > 100B | Rich world knowledge; complex reasoning | Brainstorming |