📡 You're offline — showing cached content
New version available!
Quick Access
Tutorials Cybersecurity and AI Security AI Security — LLM Threats

AI Security — LLM Threats

6 min read Quiz at the end
LLM threats: prompt injection, jailbreaking, training data poisoning, model extraction, indirect injection.

AI Security: LLM-Specific Threats

ThreatDescriptionImpactMitigation
Prompt InjectionMalicious input overrides system instructionsData exfiltration, policy bypassInput sanitisation, output validation
JailbreakingBypass safety guardrails via crafted promptsGenerate harmful contentRegular red-teaming, safety classifiers
Training Data PoisoningInject malicious data into fine-tuning setBackdoor model behaviourData provenance, anomaly detection
Model ExtractionRepeated queries to replicate modelIP theft, bypass rate limitsRate limiting, query anomaly detection
Hallucination ExploitationAttacker crafts prompt to induce false factsMisinformation, wrong decisionsRAG grounding, citation requirements
Indirect Prompt InjectionMalicious instructions in retrieved documentsAgent hijackingSanitise all tool/retrieval outputs
Membership InferenceDetermine if data was in training setPrivacy violationDifferential privacy, output perturbation
Topic Quiz · 1 questions

Test your understanding before moving on

1. What is indirect prompt injection in AI agents?
💡 Indirect injection hides attacker instructions in retrieved content (web pages, PDFs) — agents execute them unknowingly.