Chapter 1: Why AI Security Is Being Done Wrong
The Pattern We Keep Repeating
A financial services firm spent eighteen months building what they called their "AI-first" customer service platform. They evaluated model providers, negotiated enterprise agreements, established an AI center of excellence, and hired prompt engineers. The CISO's team was brought in late—about six weeks before launch—and asked to "sign off on security."
The security review focused on the obvious: Was the model API using TLS? Were API keys rotated? Did the vendor have SOC 2? The answers were yes, yes, and yes. The platform launched.
Four months later, the firm discovered that their AI system had been surfacing internal customer notes—comments written by support staff that were never meant to be customer-facing—in response to certain queries. The notes included everything from credit risk assessments to informal complaints about difficult customers. The AI wasn't hallucinating. It was doing exactly what the system was built to do: retrieve relevant context and surface it.
The root cause wasn't the model. It wasn't prompt injection. It wasn't a vendor breach. The root cause was that internal knowledge bases—some containing sensitive annotations—had been connected to the retrieval system during a "data enrichment" sprint. No one had mapped what data was flowing into the AI's context window. No one had asked what trust boundaries existed between staff-facing notes and customer-facing outputs.
The security team had been looking at the wrong layer entirely.
This chapter is about why that keeps happening—and why most organizations are fundamentally misunderstanding where AI security risk actually lives.
The Model Obsession
When security teams think about AI, they think about models. This is understandable. Models are the novel component. They're the thing that seems magical, opaque, and therefore dangerous. The security industry has responded accordingly: we now have tools for scanning models, red-teaming prompts, detecting jailbreaks, and monitoring for hallucinations.
None of this is wrong, exactly. But it's incomplete in a way that creates false confidence.
The model obsession mirrors a pattern we've seen before. When cloud computing emerged, security teams focused on hypervisor escapes and multi-tenancy risks—the novel, scary parts. Meanwhile, the actual breaches were happening through misconfigured S3 buckets, overprivileged IAM roles, and forgotten API keys. The exotic threats made for good conference talks. The mundane architectural failures caused the incidents.
The same dynamic is playing out with AI. Security teams are building elaborate defenses against prompt injection while leaving the data pipelines that feed those prompts completely unmonitored. They're evaluating model providers' security certifications while ignoring the fact that their own retrieval systems can access half the company's file shares.
The real problem is that AI security has been framed as a model problem when it's actually a systems problem.
Why This Framing Fails
Think about what an AI system actually is in production. There's a model, yes—but that model is surrounded by infrastructure: data stores, retrieval systems, orchestration layers, API gateways, logging pipelines, identity systems, and integration points. The model is the focal point, but it's not where most of the risk lives.
When something goes wrong with an AI system, the failure mode is almost never "the model did something unexpected in isolation." The failure mode is:
- Data that shouldn't have been accessible was fed into the model's context
- Outputs were trusted in downstream systems without validation
- Identity and authorization were handled at the API layer but not at the data layer
- Logging captured prompts but not the retrieval results that influenced responses
- Integration points assumed AI outputs were safe to act upon
These are architectural failures. They're failures of system design, trust boundaries, and data flow control. They have nothing to do with whether your model can be jailbroken.
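To make the first of those failure modes concrete, here is a minimal sketch, in Python, of a trust-boundary check applied where context is assembled rather than at the prompt or the output. The chunk structure, the `audience` label, and the `assemble_context` helper are hypothetical; the point is only where the control sits.

```python
from dataclasses import dataclass

# Hypothetical retrieval result: each chunk carries the classification
# metadata it inherited from its source system.
@dataclass
class RetrievedChunk:
    text: str
    source: str
    audience: str  # e.g. "customer-facing" or "staff-only"

def assemble_context(chunks: list[RetrievedChunk], allowed_audiences: set[str]) -> str:
    """Build the model's context, enforcing a trust boundary on its inputs.

    Chunks whose audience label is not permitted for this caller are dropped
    (and recorded) before they ever reach the model.
    """
    permitted, refused = [], []
    for chunk in chunks:
        (permitted if chunk.audience in allowed_audiences else refused).append(chunk)

    for chunk in refused:
        # In a real system this would feed your observability pipeline.
        print(f"refused: {chunk.source} (audience={chunk.audience})")

    return "\n\n".join(chunk.text for chunk in permitted)

# A customer-facing assistant only ever sees customer-facing material,
# regardless of what the retrieval index happens to contain.
chunks = [
    RetrievedChunk("Published refund policy...", "kb/refunds", "customer-facing"),
    RetrievedChunk("Credit risk note: handle with caution...", "crm/notes/123", "staff-only"),
]
context = assemble_context(chunks, allowed_audiences={"customer-facing"})
```

Nothing in that sketch touches the model. It is a system-design control, which is exactly the point.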
And yet, the security industry has built an entire market around model-layer defenses while leaving these architectural gaps largely unaddressed.
The Vendor-Industrial Complex
There's a reason for this mismatch, and it's not conspiracy—it's market dynamics.
Model-layer security is easy to productize. You can build a tool that sits in front of an LLM API, scans inputs for injection patterns, and filters outputs for sensitive content. It's a clean integration point. It produces dashboards. It generates alerts. It looks like security.
Architectural security is hard to productize. Every organization's data flows are different. Their trust boundaries are different. Their identity systems are different. There's no universal "AI firewall" that solves the problem of poorly designed integrations.
So the market builds what it can sell, and security teams buy what the market offers. The result is a growing gap between perceived security ("we have an AI security tool") and actual security ("we understand the architectural risks of our AI systems").
This isn't a criticism of vendors—they're responding to demand. It's a criticism of how the demand has been shaped. We've allowed the conversation about AI security to be dominated by model-layer concerns because those concerns are legible and addressable with products. The harder, more important work—understanding your data flows, mapping your trust boundaries, controlling your integration points—doesn't fit neatly into a procurement cycle.
The Historical Pattern
We've done this before. Multiple times.
The Firewall Era: When network security emerged, the firewall became the symbol of protection. Organizations invested heavily in perimeter defenses while internal networks remained flat, overprivileged, and unmonitored. The result was a generation of breaches where attackers walked through the front door with stolen credentials and found no resistance inside.
The Cloud Transition: When organizations moved to cloud infrastructure, security teams initially treated it like a hostile external environment—focusing on encryption, access controls at the cloud boundary, and vendor risk assessments. Meanwhile, the actual breaches were happening because of misconfigurations, excessive permissions, and poor secrets management. The cloud wasn't the threat; the way organizations used the cloud was the threat.
The DevOps Shift: When development and operations merged, security was initially bolted on—scan the code, scan the containers, add a security review gate. The result was friction without effectiveness. The organizations that actually improved security were the ones who integrated security thinking into the architecture of their pipelines, not the ones who added more scanning.
In each case, the pattern was the same: focus on the new, novel component; neglect the systemic and architectural; get burned by the fundamentals.
AI security is following the same trajectory. The novel component is the model, so that's where attention flows. But the risk is systemic, and the failures will be architectural.
The Lifecycle Reframe
Here's the mental model that should replace the model obsession: AI security is about securing a lifecycle.
That lifecycle has distinct stages, each with its own risk profile:
Data. Before a model does anything, data has to be collected, cleaned, stored, and made accessible. This is where lineage matters, where poisoning happens, where sensitive information gets inadvertently included. If you can't trace your data from source to model, you can't secure your AI.
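What tracing data from source to model can look like in practice: a minimal sketch, with hypothetical field names, that attaches lineage metadata to every document before it enters a retrieval corpus. If you can answer "where did this come from, when, and how sensitive is it" for anything the model can be shown, you can reason about your AI.

```python
import hashlib
import json
from datetime import datetime, timezone

def lineage_record(content: bytes, source_system: str, classification: str) -> dict:
    """Attach lineage metadata to a document before it enters the corpus."""
    return {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "source_system": source_system,    # e.g. "crm", "wiki", "sharepoint"
        "classification": classification,  # e.g. "public", "internal", "restricted"
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

doc = b"Q3 credit policy update ..."
print(json.dumps(lineage_record(doc, source_system="wiki", classification="internal"), indent=2))
```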
Training and Fine-Tuning. Models are software artifacts. They have provenance. They come from somewhere—a vendor, an open-source repository, an internal team. So do the datasets they're trained and fine-tuned on. Both can be tampered with, corrupted, or simply poorly constructed. The supply-chain thinking we've developed for software applies here.
Deployment. The moment a model starts accepting inputs and producing outputs, it becomes an attack surface. But the attack surface isn't just the model—it's every integration point, every system that feeds it context, every downstream consumer of its outputs. Deployment is where architectural decisions become security decisions.
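One concrete expression of that idea is that downstream consumers treat model output as untrusted input, the same way they would treat anything arriving over the network. A minimal sketch, assuming a hypothetical assistant that proposes actions as JSON:

```python
import json

# Hypothetical allowlist: the only actions a downstream system will execute
# on the model's suggestion, regardless of what the output asks for.
ALLOWED_ACTIONS = {"send_receipt", "open_ticket"}

def act_on_model_output(raw_output: str) -> None:
    """Validate structure and intent before anything is executed."""
    try:
        proposal = json.loads(raw_output)
    except json.JSONDecodeError:
        raise ValueError("model output is not well-formed JSON; refusing to act")

    if not isinstance(proposal, dict) or proposal.get("action") not in ALLOWED_ACTIONS:
        raise ValueError("proposed action is not on the allowlist; refusing to act")

    # Only now does the request reach the real system, which still applies
    # its own authorization checks.
    print(f"dispatching {proposal['action']} with {proposal.get('params', {})}")

act_on_model_output('{"action": "send_receipt", "params": {"order_id": "A-1001"}}')
```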
Runtime. Modern AI systems aren't static. They use tools, access external systems, maintain memory across sessions, and operate with increasing autonomy. Runtime is where privilege becomes dangerous, where persistence creates risk, and where feedback loops can amplify failures.
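What containing that privilege can look like: a minimal sketch in which an agent session carries an explicit, per-session set of tool grants instead of inheriting every tool the platform exposes. The session object and tool names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class AgentSession:
    user_id: str
    # Per-session grants, not a global registry the agent inherits wholesale.
    granted_tools: set[str] = field(default_factory=set)

def call_tool(session: AgentSession, tool_name: str, **kwargs) -> None:
    """Enforce least privilege at the point where the agent acts."""
    if tool_name not in session.granted_tools:
        raise PermissionError(f"session for {session.user_id} has no grant for {tool_name!r}")
    print(f"calling {tool_name} for {session.user_id} with {kwargs}")

session = AgentSession(user_id="u-42", granted_tools={"search_kb", "create_ticket"})
call_tool(session, "search_kb", query="refund policy")
# call_tool(session, "delete_customer", customer_id="c-9")  # raises PermissionError
```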
Infrastructure. AI systems run on compute, connect to networks, authenticate against identity providers, and log to observability platforms. All the infrastructure security fundamentals still apply—and they're often neglected because teams are focused on AI-specific concerns.
Governance. Someone has to be accountable for how AI systems behave. Policies have to be defined, enforced, and audited. But governance that isn't connected to architecture is theater—it produces documents without producing security.
This lifecycle isn't linear. Data decisions affect runtime risk. Infrastructure choices constrain deployment options. Governance failures often trace back to training decisions made months earlier. The lifecycle is the system, and securing any single stage in isolation is insufficient.
What "Done Wrong" Actually Means
When I say AI security is being done wrong, I don't mean that security teams are incompetent. I mean that the framing they've been given is incomplete.
Here's what "done wrong" looks like in practice:
Securing the API but not the data. Teams implement authentication, rate limiting, and input validation at the AI service's API layer. But the retrieval system behind that API can access sensitive data stores with broad permissions. The API is secure; the architecture isn't.
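A minimal sketch of the difference, with hypothetical document identifiers and group names: the retrieval layer evaluates the end user's entitlements against document-level ACLs instead of querying the data store as a broadly privileged service account.

```python
# Hypothetical document-level ACLs, keyed by document id.
DOCUMENT_ACLS = {
    "kb/refunds": {"group:customers", "group:support"},
    "crm/notes/123": {"group:support"},  # staff-only
}

def user_can_read(user_groups: set[str], doc_id: str) -> bool:
    """Document-level entitlement check, evaluated per end user, per document."""
    return bool(user_groups & DOCUMENT_ACLS.get(doc_id, set()))

def retrieve_for_user(user_groups: set[str], candidate_doc_ids: list[str]) -> list[str]:
    """Filter retrieval candidates with the end user's identity, not the
    service account the AI platform happens to run as."""
    return [d for d in candidate_doc_ids if user_can_read(user_groups, d)]

# A customer-facing request never sees the staff-only note, even though
# the retrieval index contains it.
print(retrieve_for_user({"group:customers"}, ["kb/refunds", "crm/notes/123"]))
```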
Monitoring outputs but not context. Teams deploy tools that scan AI outputs for sensitive content, PII leakage, or policy violations. But they don't monitor what data is being retrieved and fed into the model's context window. You can't understand why an AI said something if you don't know what it was shown.
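A minimal sketch of closing that gap: one audit record per interaction that ties together the prompt, the identifiers of the retrieved context, and the output, so an investigation can reconstruct what the model was shown. Field names are illustrative.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("ai.audit")

def log_ai_interaction(request_id: str, prompt: str,
                       retrieved_doc_ids: list[str], output: str) -> None:
    """One record per interaction, capturing context and output together."""
    log.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id,
        "prompt": prompt,
        "retrieved_doc_ids": retrieved_doc_ids,  # what the model was shown
        "output": output,                        # what it said
    }))

log_ai_interaction(
    request_id="req-7f3a",
    prompt="What is your refund policy?",
    retrieved_doc_ids=["kb/refunds", "crm/notes/123"],
    output="Refunds are processed within 14 days...",
)
```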
Treating models as black boxes. Teams evaluate model providers based on security certifications and contractual terms. But they don't map the trust implications of giving that model access to internal systems, data, and identity contexts. A SOC 2 report doesn't tell you what happens when the model has access to your customer database.
Applying static controls to dynamic systems. Teams define what an AI system should and shouldn't do at design time, then treat those definitions as fixed. But AI systems in production encounter novel inputs, edge cases, and context combinations that weren't anticipated. Static controls fail when the system behavior is inherently dynamic.
Separating AI security from security. Teams create dedicated "AI security" functions that operate separately from infrastructure security, application security, and data security. But AI systems span all these domains. Organizational silos create architectural blind spots.
Each of these patterns represents a well-intentioned effort that fails because it's operating at the wrong level of abstraction. The work isn't wasted—it's just incomplete.
The Uncomfortable Truth
Here's the part that doesn't fit neatly into a product pitch or a maturity framework: most organizations aren't ready to secure AI systems because they haven't secured the foundations those systems depend on.
If you don't have data lineage, you can't trace what's feeding your AI. If you don't have least-privilege access controls, your AI inherits excessive permissions. If you don't have meaningful observability, you can't detect AI-related incidents. If you don't have clear trust boundaries, your AI will cross them without anyone noticing.
AI doesn't create these gaps—it exposes them. It amplifies them. It turns latent architectural debt into active risk.
This is why organizations that treat AI security as a separate discipline, something that can be addressed with dedicated tools and dedicated teams, keep getting surprised. The AI isn't the problem. The architecture is the problem. The AI just makes the architecture's weaknesses visible.
What This Book Is About
The chapters that follow are organized around the lifecycle I described earlier. Each chapter takes a stage—data, training, deployment, runtime, infrastructure, governance—and examines it architecturally.
The goal isn't to give you a checklist of controls to implement. Checklists are seductive because they're concrete, but they're also brittle. They tell you what to do without helping you understand why, which means they fail the moment you encounter a situation that isn't on the list.
Instead, the goal is to help you think architecturally about AI security. That means:
- Understanding where trust boundaries exist and where they should exist
- Tracing data flows and identifying where sensitive information could leak
- Recognizing failure modes and reasoning about blast radius
- Asking diagnostic questions that reveal gaps before incidents do
- Connecting security decisions to the lifecycle stages where they actually matter
This isn't glamorous work. It's the same kind of architectural discipline that separates good security programs from compliance-driven theater in every other domain. AI doesn't change the fundamentals—it tests whether you actually have them.
A Note on What This Book Is Not
This book will not teach you how models work internally. You don't need to understand attention mechanisms, gradient descent, or tokenization to secure AI systems. Those details matter to researchers and ML engineers. They don't matter to architects and security practitioners.
This book will not give you a catalog of AI-specific threats. Threat catalogs are useful for awareness, but they encourage a whack-a-mole mentality: find the threat, deploy the countermeasure, repeat. Architectural thinking is different. It asks what conditions allow threats to succeed, then addresses those conditions.
This book will not tell you which vendors to use or which tools to deploy. Vendor landscapes change. Tools come and go. Architecture endures. If you understand the architectural requirements, you can evaluate tools yourself.
This book will not reassure you that AI security is manageable if you just follow the right steps. It's not manageable—not if you're doing anything interesting with AI. The risk is real, the systems are complex, and the controls are imperfect. The goal is to be clear-eyed about that reality, not to pretend it away.
Key Takeaways
AI security is a systems problem, not a model problem. The model is one component. The architecture surrounding it—data flows, trust boundaries, identity systems, integration points—is where most risk actually lives.
The security industry's focus on model-layer defenses creates false confidence. Tools that scan prompts and filter outputs are useful but insufficient. They address the visible attack surface while leaving architectural gaps unexamined.
AI exposes architectural debt. If your data governance is weak, your AI will surface sensitive information. If your access controls are broad, your AI will inherit excessive permissions. AI doesn't create these problems—it amplifies them.
Lifecycle thinking replaces model thinking. Securing AI requires attention to data, training, deployment, runtime, infrastructure, and governance. Each stage has distinct risks, and they're interconnected.
Architectural discipline is the only durable foundation. Checklists and tools change. Threats evolve. But the fundamentals—trust boundaries, data flow control, least privilege, observability—remain constant. Build on those.
The chapters that follow will make this concrete. We'll examine AI systems as systems, trace the architectural patterns that create risk, and develop the questions that reveal gaps before incidents do. The work isn't easy, but it's necessary. Let's begin.