Module 2: LLM Fundamentals¶
Objective¶
Understand how ChatGPT, Claude, Gemini, and Bedrock models actually work before we start building AIOps solutions.
By the end of this module, you'll understand:
- Tokens
- Context Window
- Prompts
- Temperature
- System Prompts
- Inference
- Why LLMs hallucinate
- How DevOps engineers use LLMs in AIOps
1. What is an LLM?¶
LLM stands for Large Language Model.
Think of it as:
Massive Dataset
+
Deep Learning
+
Billions of Parameters
=
LLM
Examples:
- OpenAI GPT
- Anthropic Claude
- Google Gemini
- Meta Llama
An LLM's main job is:
Predict the next token.
Everything else is built on top of that.
2. What are Tokens?¶
Most beginners think:
Sentence -> Words
LLMs think:
Sentence -> Tokens
Example:
Kubernetes is awesome
May become:
Kubern
etes
is
awesome
4 tokens.
The model never sees words.
It sees tokens.
DevOps Example¶
Suppose you send: 5000 lines of Kubernetes logs to Bedrock.
Bedrock converts everything into tokens.
More tokens means:
- More cost
- More latency
- More processing time
This becomes critical in AIOps.
3. Context Window¶
This is the most important concept after tokens.
Context Window = Model Memory
Example:
Conversation
+
Prompt
+
Documents
+
Logs
must fit inside the context window.
Think of it like RAM.
DevOps Analogy¶
Server
├── 4 GB RAM
├── 8 GB RAM
└── 32 GB RAM
Similarly:
LLM
├── 8K tokens
├── 32K tokens
├── 128K tokens
└── 1M+ tokens
If you exceed the limit:
Older context gets dropped
or
Request fails
AIOps Example¶
Imagine:
1 Million Application Logs
You cannot send all logs directly.
Instead:
Logs
↓
Filter
↓
Relevant Logs
↓
LLM
Later we'll learn RAG which solves this problem.
4. Prompt¶
Prompt = Input given to the model.
Example:
Explain Terraform modules.
This is a prompt.
DevOps Example¶
Bad Prompt:
Pod issue.
Good Prompt:
My pod is in CrashLoopBackOff.
Below are the logs.
<logs>
Explain:
1. Root Cause
2. Troubleshooting Steps
3. Fix
The better the prompt, the better the answer.
5. System Prompt¶
Normal Prompt:
Explain Kubernetes.
System Prompt:
You are a Senior Kubernetes Engineer.
Always explain with real examples.
The system prompt controls behavior.
AIOps Example¶
System Prompt:
You are a Senior SRE Engineer.
Analyze incidents.
Return:
- Root Cause
- Severity
- Recommendation
Now every response follows this format.
6. Temperature¶
Temperature controls creativity.
Think of it as randomness.
Temperature = 0¶
Very deterministic.
2 + 2 = 4
Almost always the same answer everytime.
Temperature = 1¶
More creative.
Multiple valid responses possible.
DevOps Rule¶
For troubleshooting:
Temperature = Low
For RCA:
Temperature = Low
For automation:
Temperature = Very Low
Because consistency matters.
7. Inference¶
Training:
Model learns data
Inference:
You ask question
Model answers
As DevOps engineers, we rarely train foundation models.
We mostly use:
Prompt
↓
Inference
↓
Response
8. Hallucination¶
A hallucination occurs when the model confidently generates incorrect information.
Example:
What is the Terraform resource
aws_super_database_cluster?
The model might invent an answer.
Even though that resource doesn't exist.
Why Hallucinations Happen¶
The model predicts likely tokens.
It doesn't truly know facts.
That's why:
LLM + Company Documents
is better than:
LLM Alone
This leads us to RAG later.
How LLMs Fit Into AIOps¶
Traditional DevOps:
Alert
↓
Engineer
↓
Investigate
↓
Fix
AIOps:
Alert
↓
LLM
↓
Incident Summary
↓
Root Cause
↓
Recommendation
↓
Engineer
Example:
Prometheus Alert
↓
LLM reads:
- Logs
- Metrics
- Events
↓
Produces RCA
Hands-On Assignment¶
Answer these questions in your own words:
Q1¶
Why do tokens matter when sending logs to Bedrock?
Q2¶
Context Window is similar to what concept in computing?
Q3¶
Difference between Prompt and System Prompt?
Q4¶
For a Kubernetes troubleshooting assistant:
Should temperature be high or low?
Why?
Q5¶
Why are hallucinations dangerous in AIOps?
Mini Project¶
Create a prompt template for an AI Kubernetes Troubleshooter.
Input:
Pod Name
Namespace
Logs
Output:
Root Cause
Impact
Recommended Fix
kubectl Commands