Skip to content

Module 2: LLM Fundamentals

Objective

Understand how ChatGPT, Claude, Gemini, and Bedrock models actually work before we start building AIOps solutions.

By the end of this module, you'll understand:

  • Tokens
  • Context Window
  • Prompts
  • Temperature
  • System Prompts
  • Inference
  • Why LLMs hallucinate
  • How DevOps engineers use LLMs in AIOps

1. What is an LLM?

LLM stands for Large Language Model.

Think of it as:

Massive Dataset
      +
Deep Learning
      +
Billions of Parameters
      =
LLM

Examples:

  • OpenAI GPT
  • Anthropic Claude
  • Google Gemini
  • Meta Llama

An LLM's main job is:

Predict the next token.

Everything else is built on top of that.


2. What are Tokens?

Most beginners think:

Sentence -> Words

LLMs think:

Sentence -> Tokens

Example:

Kubernetes is awesome

May become:

Kubern
etes
is
awesome

4 tokens.

The model never sees words.

It sees tokens.


DevOps Example

Suppose you send: 5000 lines of Kubernetes logs to Bedrock.

Bedrock converts everything into tokens.

More tokens means:

  • More cost
  • More latency
  • More processing time

This becomes critical in AIOps.


3. Context Window

This is the most important concept after tokens.

Context Window = Model Memory

Example:

Conversation
+
Prompt
+
Documents
+
Logs

must fit inside the context window.

Think of it like RAM.


DevOps Analogy

Server
 ├── 4 GB RAM
 ├── 8 GB RAM
 └── 32 GB RAM

Similarly:

LLM
 ├── 8K tokens
 ├── 32K tokens
 ├── 128K tokens
 └── 1M+ tokens

If you exceed the limit:

Older context gets dropped

or

Request fails

AIOps Example

Imagine:

1 Million Application Logs

You cannot send all logs directly.

Instead:

Logs
 ↓
Filter
 ↓
Relevant Logs
 ↓
LLM

Later we'll learn RAG which solves this problem.


4. Prompt

Prompt = Input given to the model.

Example:

Explain Terraform modules.

This is a prompt.


DevOps Example

Bad Prompt:

Pod issue.

Good Prompt:

My pod is in CrashLoopBackOff.
Below are the logs.

<logs>

Explain:
1. Root Cause
2. Troubleshooting Steps
3. Fix

The better the prompt, the better the answer.


5. System Prompt

Normal Prompt:

Explain Kubernetes.

System Prompt:

You are a Senior Kubernetes Engineer.
Always explain with real examples.

The system prompt controls behavior.


AIOps Example

System Prompt:

You are a Senior SRE Engineer.

Analyze incidents.

Return:
- Root Cause
- Severity
- Recommendation

Now every response follows this format.


6. Temperature

Temperature controls creativity.

Think of it as randomness.

Temperature = 0

Very deterministic.

2 + 2 = 4

Almost always the same answer everytime.


Temperature = 1

More creative.

Multiple valid responses possible.


DevOps Rule

For troubleshooting:

Temperature = Low

For RCA:

Temperature = Low

For automation:

Temperature = Very Low

Because consistency matters.


7. Inference

Training:

Model learns data

Inference:

You ask question
Model answers

As DevOps engineers, we rarely train foundation models.

We mostly use:

Prompt
 ↓
Inference
 ↓
Response

8. Hallucination

A hallucination occurs when the model confidently generates incorrect information.

Example:

What is the Terraform resource
aws_super_database_cluster?

The model might invent an answer.

Even though that resource doesn't exist.


Why Hallucinations Happen

The model predicts likely tokens.

It doesn't truly know facts.

That's why:

LLM + Company Documents

is better than:

LLM Alone

This leads us to RAG later.


How LLMs Fit Into AIOps

Traditional DevOps:

Alert
 ↓
Engineer
 ↓
Investigate
 ↓
Fix

AIOps:

Alert
 ↓
LLM
 ↓
Incident Summary
 ↓
Root Cause
 ↓
Recommendation
 ↓
Engineer

Example:

Prometheus Alert
       ↓
LLM reads:
- Logs
- Metrics
- Events
       ↓
Produces RCA

Hands-On Assignment

Answer these questions in your own words:

Q1

Why do tokens matter when sending logs to Bedrock?

Q2

Context Window is similar to what concept in computing?

Q3

Difference between Prompt and System Prompt?

Q4

For a Kubernetes troubleshooting assistant:

Should temperature be high or low?

Why?

Q5

Why are hallucinations dangerous in AIOps?


Mini Project

Create a prompt template for an AI Kubernetes Troubleshooter.

Input:

Pod Name
Namespace
Logs

Output:

Root Cause
Impact
Recommended Fix
kubectl Commands