Chapter 10 — An Architectural Checklist for AI Security
Opening Scenario
A technology company decided to get serious about AI security. They hired consultants, purchased tools, and launched a six-month AI security program. The program delivered exactly what was promised: a comprehensive risk assessment, a prioritized remediation roadmap, and a governance framework aligned with industry standards.
Eighteen months later, they suffered a significant breach through their AI systems. Attackers had poisoned training data for a customer-facing recommendation engine, gradually shifting outputs to promote fraudulent products. The manipulation had been active for four months before detection.
When the incident response team investigated, they made a troubling discovery. The AI security program had assessed risks, but hadn't verified whether controls actually existed. The roadmap had prioritized remediation, but items had been marked "complete" when policies were written—not when technical controls were implemented. The governance framework defined responsibilities, but no one had confirmed those responsibilities were being executed.
The company had invested in AI security artifacts. They had not invested in AI security architecture. They had a checklist of things they should do, but no mechanism to verify they had done them. The roadmap gave them confidence. The breach revealed that confidence was unfounded.
This chapter is different from the others. It's not about a specific lifecycle stage or capability. It's about how to honestly assess whether your AI security architecture actually exists—and how to identify the gaps that documentation and programs often obscure.
Why This Chapter Matters
The conventional wisdom says that security maturity comes from programs, frameworks, and roadmaps. Assess your current state, define your target state, create a plan to close the gaps, execute the plan, and repeat. This model assumes that completing activities produces security outcomes.
The reality is that activities and outcomes are different things. You can complete a risk assessment without reducing risk. You can implement a governance framework without governing anything. You can check every box on a compliance checklist and still be fundamentally insecure. The activities create artifacts. Only architecture creates security.
This book has argued throughout that AI security is about securing the lifecycle—data, training, deployment, runtime, infrastructure, governance—as an integrated system. Each chapter has explored a specific stage, its unique risks, and the architectural patterns that address them. But understanding the stages isn't enough. You need to know whether your organization has actually built the architecture.
The gap between knowing what to do and verifying that it's done is where AI security programs fail. Teams read about data lineage and agree it's important. They plan to implement it. They may even start projects to address it. But unless someone asks "can you actually trace this specific training dataset to its sources, right now, today?"—and demands a concrete answer—the gap between intent and reality persists.
This chapter provides the questions that reveal those gaps. Not aspirational questions about what you should build, but diagnostic questions about what actually exists. Questions that can be answered with "yes, here's the evidence" or "no, we can't do that." Questions that, when you can't answer them, tell you exactly where your AI security architecture is incomplete.
The architectural principle is simple: if you can't demonstrate a capability, you don't have it. Documentation that describes a capability is not the capability. A project to build a capability is not the capability. Only working systems that produce verifiable evidence constitute actual security architecture.
How to Use This Checklist
This is not a compliance checklist. Checking boxes will not make you secure. This is a diagnostic instrument—a way to identify where your AI security architecture has gaps that need architectural solutions.
For each question:
Attempt to answer concretely. Not "we have a process for that" but "yes, here's how we do it" or "no, we cannot do that today."
Demand evidence. If the answer is yes, can you demonstrate it? Can you show the system, the log, the control, the output? If you can't show it, you don't have it.
Note the gaps honestly. Questions you cannot answer are not failures—they're information. They tell you where to focus architectural investment.
Prioritize by blast radius. Not all gaps are equal. Gaps that affect high-risk systems, sensitive data, or autonomous capabilities matter more than gaps in low-stakes applications.
Revisit regularly. Architecture degrades. Controls drift. What you could demonstrate six months ago may not be true today.
The questions are organized by lifecycle stage, matching the structure of this book. Within each stage, questions progress from foundational (you need this to have any security) to advanced (you need this for mature security). If you can't answer foundational questions, advanced questions are irrelevant.
Data: The First Attack Surface
Data is where AI security begins. If you can't answer questions about your data, nothing else matters.
Foundational Questions
Do you have an inventory of all data sources used in AI training and fine-tuning?
Not a general description—an actual inventory. For every AI system in production, can you list the specific datasets, data pipelines, and data sources that contributed to it? If you acquired a pre-trained model, do you know what data it was trained on?
If you cannot answer this: You have no foundation for data security. You cannot assess risk, detect poisoning, or respond to data-related incidents for systems whose data origins are unknown.
Can you trace any training sample back to its original source?
Pick a random sample from a training dataset. Can you identify where it came from, when it was collected, what transformations it underwent, and who had access to it along the way? Can you do this in hours, not weeks?
If you cannot answer this: You have data lineage documentation, not data lineage capability. When incidents occur, you will not be able to determine whether data was compromised.
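As a minimal sketch of what "lineage capability" (as opposed to lineage documentation) means in practice, consider a provenance record keyed by the content hash of each sample. The record structure and function names below are illustrative, not a specific tool's API; a real implementation would back this with a lineage store or metadata service rather than an in-memory dictionary.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class LineageRecord:
    """Provenance metadata kept for every training sample, keyed by content hash."""
    sample_hash: str          # SHA-256 of the raw sample
    source_uri: str           # where the sample was originally collected
    collected_at: str         # ISO-8601 timestamp of collection
    transformations: list = field(default_factory=list)  # ordered pipeline steps applied
    accessed_by: list = field(default_factory=list)       # principals who touched the sample

# Illustrative in-memory index; in practice, a dedicated lineage store.
LINEAGE_INDEX: dict[str, LineageRecord] = {}

def record_sample(raw: bytes, source_uri: str, collected_at: str) -> str:
    sample_hash = hashlib.sha256(raw).hexdigest()
    LINEAGE_INDEX[sample_hash] = LineageRecord(sample_hash, source_uri, collected_at)
    return sample_hash

def trace(raw: bytes) -> LineageRecord | None:
    """The capability the question asks for: given a sample, where did it come from?"""
    return LINEAGE_INDEX.get(hashlib.sha256(raw).hexdigest())
```

The point is not the data structure but the contract: given a sample, the answer comes back in one lookup, not a weeks-long archaeology exercise.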
Do you know what sensitive data exists in your training sets?
Not what categories of data you intended to include—what actually exists. Have you scanned training data for PII, credentials, proprietary information, or other sensitive content? Would you detect it if sensitive data entered your pipeline unintentionally?
If you cannot answer this: Your models may be trained on data that creates legal, privacy, or security exposure you're unaware of. The risk exists whether or not you've measured it.
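A minimal sketch of the kind of scan this question implies: a few regex heuristics for common PII and credential patterns, run over training records before they are approved. The patterns below are illustrative and deliberately incomplete; a real deployment would layer dedicated classifiers or DLP tooling on top.

```python
import re

# Illustrative detectors; real scanners combine patterns, checksums, and ML classifiers.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_record(text: str) -> list[str]:
    """Return the names of sensitive-data patterns found in a single training record."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]

def scan_dataset(records: list[str]) -> dict[int, list[str]]:
    """Map record index -> findings, so flagged samples can be reviewed or quarantined."""
    findings = {}
    for i, record in enumerate(records):
        hits = scan_record(record)
        if hits:
            findings[i] = hits
    return findings
```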
Operational Questions
Can you detect unauthorized modifications to training data?
If someone altered records in a training dataset—either through malicious access or compromised pipelines—would you know? Do you have integrity verification that would detect changes between when data was validated and when it was used?
If you cannot answer this: Data poisoning attacks would succeed without detection. You're trusting data integrity without verifying it.
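One minimal form of the integrity check this question asks about: hash every dataset file at validation time, store the manifest somewhere the data pipeline cannot write to, and verify it again immediately before training. The paths and function names here are illustrative.

```python
import hashlib
import json
from pathlib import Path

def sha256_file(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(dataset_dir: Path) -> dict[str, str]:
    """Run once, at the moment the dataset is validated and approved."""
    return {str(p.relative_to(dataset_dir)): sha256_file(p)
            for p in sorted(dataset_dir.rglob("*")) if p.is_file()}

def verify_manifest(dataset_dir: Path, manifest_path: Path) -> list[str]:
    """Run immediately before training; returns files that changed, appeared, or disappeared."""
    expected = json.loads(manifest_path.read_text())
    actual = build_manifest(dataset_dir)
    return sorted(k for k in expected.keys() | actual.keys()
                  if expected.get(k) != actual.get(k))
```

The critical design point is where the manifest lives: if it sits next to the data with the same write permissions, an attacker who can alter the data can alter the manifest too.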
Do you enforce access controls on training data with the same rigor as production databases?
Who can read training data? Who can modify it? Are these permissions based on legitimate need, regularly reviewed, and logged? Or is training data treated as "just research data" with relaxed controls?
If you cannot answer this: Training data may be your most sensitive asset—it shapes model behavior. Treating it with weaker controls than production data inverts your actual risk priorities.
Can you reproduce the exact dataset used to train any production model?
If you needed to retrain a model with identical data—for debugging, validation, or incident response—could you? Do you version datasets, or only models?
If you cannot answer this: You cannot investigate data-related issues. You cannot verify that training was performed on approved data. You've lost forensic capability.
Advanced Questions
Do you monitor data pipelines for anomalies that might indicate poisoning?
Statistical distribution shifts, unexpected sources, unusual volumes, modified schemas—do you have detection for data-level anomalies before data reaches training?
Can you selectively remove data from a trained model if that data is later found to be problematic?
If you discover that training data was biased, poisoned, or included data you shouldn't have used, can you remediate without full retraining?
Do you have different data governance based on AI use case risk levels?
Not all AI systems need identical data controls. Do you have a risk-based approach that applies stronger governance to higher-risk applications?
Training and Fine-Tuning: The Supply Chain
Models are software artifacts with supply chains. Treat them accordingly.
Foundational Questions
Do you know the provenance of every model in production?
For each model: Was it trained internally or acquired externally? If external, from what source? What license governs its use? What do you know about how it was trained, by whom, and on what data?
If you cannot answer this: You're running code of unknown origin in production. This is the AI equivalent of downloading random executables from the internet.
Are models stored with integrity verification that would detect tampering?
Model files are just files. Someone with write access can modify them. Do you have cryptographic verification that would detect if a model was altered after validation?
If you cannot answer this: An attacker who gains access to model storage can replace models with malicious versions. You would deploy the compromised model without knowing.
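A sketch of what tamper detection can look like for model artifacts: compute an authentication tag when the model passes validation, store it in the model registry, and verify it in the serving process before the file is deserialized. This is a minimal sketch assuming an HMAC key managed outside the storage-write path (for example, in a secret manager scoped to the validation and serving identities); the function names are illustrative.

```python
import hashlib
import hmac
from pathlib import Path

def sign_model(model_path: Path, key: bytes) -> str:
    """Run once when the model passes validation; store the tag in the model registry."""
    # For very large artifacts, stream the file in chunks rather than reading it whole.
    return hmac.new(key, model_path.read_bytes(), hashlib.sha256).hexdigest()

def verify_model(model_path: Path, expected_tag: str, key: bytes) -> None:
    """Run in the serving process, before the model file is deserialized."""
    actual = hmac.new(key, model_path.read_bytes(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(actual, expected_tag):
        raise RuntimeError(
            f"Model artifact {model_path} failed integrity verification; refusing to load."
        )
```

The property that matters: whoever can write to model storage must not also be able to produce a valid tag, or the check verifies nothing.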
Is access to model training infrastructure controlled and audited?
Who can initiate training runs? Who can access training environments? Who can modify training configurations? Are these actions logged? Would you detect unauthorized training?
If you cannot answer this: Attackers could train malicious models using your infrastructure, potentially using your data, and deploy them into your pipeline.
Operational Questions
Can you recreate any production model from its training artifacts?
Given the code, configuration, and data reference for a model—can you reproduce training and get an equivalent model? Have you verified this capability?
If you cannot answer this: You cannot verify that models were trained as documented. You cannot investigate anomalies. You've lost auditability.
Do you validate models for security properties before deployment?
Not just accuracy and performance—security properties. Resistance to known attacks, behavior on adversarial inputs, absence of obvious backdoors. Is security validation part of your model release process?
If you cannot answer this: You're deploying models whose security characteristics are unknown. You're trusting without verifying.
How do you manage third-party model risk?
For models you didn't train—open source models, vendor models, fine-tuned versions of external models—what due diligence do you perform? How do you monitor for vulnerabilities or compromises in those models?
If you cannot answer this: Third-party models bring third-party risks. If a widely used open source model is found to be compromised, would you know whether you're affected?
Advanced Questions
Do you have a model inventory that tracks all versions, their lineage, and their deployment status?
Can you answer: "What models are running in production? What versions? When were they deployed? What's different between the current and previous version?"
Can you detect if a model was trained on data it shouldn't have access to?
If someone trained a model using unauthorized data—either through malice or mistake—would your systems detect it?
Do you have automated checks for model vulnerabilities integrated into your training pipeline?
Similar to SAST/DAST for code—automated security validation that runs as part of model development, not just before production deployment.
Deployment: Where AI Meets the World
Deployment is where theoretical risks become actual exposures.
Foundational Questions
Do you have an inventory of all deployed AI systems and their integration points?
Every AI model serving inferences somewhere—do you know where they all are? What systems call them? What data they receive? What actions their outputs trigger?
If you cannot answer this: You cannot assess your attack surface. You cannot respond to vulnerabilities. You have AI systems operating outside your security visibility.
Are AI endpoints protected with authentication and authorization appropriate to their risk?
Who can call your AI systems? How is that verified? Is access control based on identity or just network location? Does authorization distinguish which queries each caller is permitted to make?
If you cannot answer this: Your AI systems are accessible to anyone who can reach them on the network. Internal doesn't mean trusted.
Do you validate and sanitize inputs to AI systems?
What prevents malformed, malicious, or out-of-scope inputs from reaching your models? Do you have input validation that rejects obviously problematic queries before inference?
If you cannot answer this: Your AI systems will process whatever they receive. Prompt injection, adversarial inputs, and malformed queries will reach model logic unimpeded.
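A sketch of the "something between the caller and the model" this question asks about. The checks below—a length cap, control-character stripping, and a couple of prompt-injection heuristics—are illustrative only; real deployments layer schema validation, classifiers, and allowlists on top of heuristics like these.

```python
import re

MAX_INPUT_CHARS = 4_000

# Crude heuristics for common injection phrasing; illustrative, not a complete defense.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]

class RejectedInput(ValueError):
    pass

def validate_input(raw: str) -> str:
    """Reject or normalize a request before it reaches the model."""
    if len(raw) > MAX_INPUT_CHARS:
        raise RejectedInput("input exceeds maximum length")
    cleaned = "".join(ch for ch in raw if ch.isprintable() or ch in "\n\t")
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(cleaned):
            raise RejectedInput("input matches a known prompt-injection pattern")
    return cleaned
```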
Operational Questions
Can you rate-limit or throttle AI system usage?
If an attacker attempts to abuse your AI system through volume—extraction attacks, resource exhaustion, reconnaissance—can you detect and limit it?
If you cannot answer this: Resource exhaustion and extraction attacks can proceed without restriction. A determined attacker can make unlimited queries.
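A minimal per-caller token-bucket sketch, as one concrete form of the throttling the question describes. The limits are illustrative, and a production system would enforce this at the API gateway with state persisted outside any single process.

```python
import time
from dataclasses import dataclass, field

@dataclass
class TokenBucket:
    rate: float                 # tokens refilled per second
    capacity: float             # maximum burst size
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self):
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per caller identity; e.g. 2 requests/second with bursts of up to 20.
buckets: dict[str, TokenBucket] = {}

def check_rate_limit(caller_id: str) -> bool:
    bucket = buckets.setdefault(caller_id, TokenBucket(rate=2.0, capacity=20))
    return bucket.allow()
```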
Do you monitor AI system outputs for policy violations before they reach users or downstream systems?
Is there anything between model output and consumption that checks whether that output should actually be delivered? Or does raw model output flow directly to users?
If you cannot answer this: Models can and do produce outputs that violate policies. If nothing checks outputs, violations reach users.
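A sketch of an output gate that sits between the model and the consumer. The policy rules here are placeholders; the architectural point is simply that something other than the model decides whether its output ships.

```python
import re
from dataclasses import dataclass

@dataclass
class PolicyVerdict:
    allowed: bool
    reasons: list[str]

# Placeholder rules; real gates combine pattern checks, classifiers, and allowlists.
BLOCKED_PATTERNS = {
    "credential_leak": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def check_output(model_output: str) -> PolicyVerdict:
    reasons = [name for name, pattern in BLOCKED_PATTERNS.items()
               if pattern.search(model_output)]
    return PolicyVerdict(allowed=not reasons, reasons=reasons)

def deliver(model_output: str) -> str:
    verdict = check_output(model_output)
    if not verdict.allowed:
        # Log the violation for investigation; return a safe fallback, never the raw output.
        return "The response was withheld by policy."
    return model_output
```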
Can you quickly disable or roll back a specific AI deployment?
If you discover a deployed model is compromised, behaving badly, or under attack—can you remove it from production in minutes? Hours? Days? Have you tested this capability?
If you cannot answer this: When incidents occur, your response time is limited by your deployment architecture. If rollback is a multi-day process, compromised models serve traffic for days.
Advanced Questions
Do you have different deployment security postures based on AI system risk?
Customer-facing systems vs. internal tools vs. autonomous agents—are security controls calibrated to risk? Or do all AI systems get the same controls regardless of exposure?
Can you deploy AI systems to isolated environments that limit blast radius?
For high-risk AI applications, can you deploy with restricted network access, limited data access, and contained failure modes?
Do you test AI deployments against known attack patterns before production?
Adversarial testing, prompt injection attempts, extraction techniques—do you validate resilience before deployment, not just after incidents?
Runtime and Agents: Autonomy Is Privilege
When AI systems can take actions, security requirements escalate dramatically.
Foundational Questions
Do you have an inventory of AI systems with tool use or action capabilities?
Which AI systems can do more than generate text? Which can execute code, access databases, call APIs, modify files, or take actions in other systems? Do you know, definitively?
If you cannot answer this: You don't know which AI systems can act on your environment. You cannot assess or control the privileges they hold.
Are agent capabilities explicitly defined and enforced?
For each AI agent: what exactly can it do? Is that defined as an explicit permission set? Is that permission set enforced by technical controls, or just described in documentation?
If you cannot answer this: Agent capabilities are implicit and unbounded. Agents can do whatever their integration points allow, which may be far more than intended.
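A sketch of what "explicitly defined and technically enforced" can look like: each agent carries a declared capability set, and a single tool dispatcher refuses anything outside it. The agent names, tools, and registry below are illustrative stubs, not any particular framework's API.

```python
from typing import Callable

# The tools that exist in the environment (illustrative stubs).
TOOL_REGISTRY: dict[str, Callable[..., object]] = {
    "read_ticket": lambda ticket_id: f"ticket {ticket_id} contents",
    "send_email": lambda to, body: f"sent to {to}",
    "run_sql": lambda query: "query results",
}

# Explicit, reviewable permission sets -- not whatever the integration happens to allow.
AGENT_CAPABILITIES: dict[str, set[str]] = {
    "support-triage-agent": {"read_ticket"},
    "billing-agent": {"read_ticket", "send_email"},
}

class CapabilityViolation(PermissionError):
    pass

def dispatch(agent_id: str, tool_name: str, **kwargs) -> object:
    """The single choke point through which every agent tool call must pass."""
    allowed = AGENT_CAPABILITIES.get(agent_id, set())
    if tool_name not in allowed:
        raise CapabilityViolation(f"{agent_id} is not permitted to call {tool_name}")
    return TOOL_REGISTRY[tool_name](**kwargs)

# dispatch("support-triage-agent", "read_ticket", ticket_id="T-1234")  -> allowed
# dispatch("support-triage-agent", "run_sql", query="...")             -> CapabilityViolation
```

The test for your own environment: is there a choke point like this at all, or can agents reach tools through integrations that never consult a permission set?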
Can you trace any agent action back to the request and reasoning that triggered it?
If an agent modified a database, sent an email, or called an API—can you find out why? What request initiated the chain? What reasoning led to that action?
If you cannot answer this: You have autonomous systems taking actions you cannot explain. When something goes wrong, you cannot investigate causation.
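A sketch of the audit record that makes this traceability possible: every tool invocation is logged with the request that started the chain and the reasoning the agent gave for the step. The field names are illustrative.

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("agent_audit")

def log_agent_action(request_id: str, agent_id: str, reasoning: str,
                     tool_name: str, arguments: dict) -> str:
    """Emit one structured audit event per tool call, correlated by request_id."""
    action_id = str(uuid.uuid4())
    logger.info(json.dumps({
        "action_id": action_id,
        "request_id": request_id,   # the user or system request that started the chain
        "agent_id": agent_id,
        "reasoning": reasoning,     # the agent's stated justification for this step
        "tool": tool_name,
        "arguments": arguments,
        "timestamp": time.time(),
    }))
    return action_id
```

With request_id threaded through the whole chain, "why did the agent send that email?" becomes a log query instead of a reconstruction exercise.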
Operational Questions
Do agents operate with least-privilege permissions?
Are agent permissions scoped to what they actually need? Or do they hold broad permissions because that was easier to configure?
If you cannot answer this: Agents likely have more privilege than required. If an agent is compromised or misbehaves, its blast radius is larger than necessary.
Can you halt all agent actions quickly if needed?
Kill switch. If you determine that your agents are compromised or behaving dangerously—can you stop all autonomous actions immediately? Have you tested this?
If you cannot answer this: Compromised agents will continue acting while you figure out how to stop them. Your incident response cannot move faster than your control architecture.
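A sketch of a kill switch enforced at the same choke point as the capability check: a flag stored outside the agents' own processes, consulted before every action. The file-based flag and path below are purely illustrative—in practice this is usually a feature-flag service or centrally managed config store—the important property is that flipping it does not require the agents' cooperation.

```python
from pathlib import Path

# A control location the agents themselves cannot write to (illustrative path).
KILL_SWITCH_PATH = Path("/etc/ai-controls/agents-halted")

class AgentsHalted(RuntimeError):
    pass

def ensure_agents_enabled() -> None:
    """Call before every autonomous action; raises if the kill switch is engaged."""
    if KILL_SWITCH_PATH.exists():
        raise AgentsHalted("Agent actions are globally halted by the kill switch.")
```

The test that matters is operational, not architectural: has someone actually flipped the switch in a drill and confirmed that in-flight agents stopped?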
Do you monitor agent actions for anomalous behavior?
Not just operational metrics—behavioral anomalies. Is an agent taking actions outside its normal patterns? Accessing resources it doesn't usually access? Making more calls than usual?
If you cannot answer this: Compromised or manipulated agents look normal by operational metrics. You need behavioral baselines to detect subtle manipulation.
Advanced Questions
Can you limit the scope of actions an agent can take in a single session?
Even within allowed capabilities—can you constrain cumulative impact? Prevent an agent from modifying thousands of records even if it can modify one?
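One way to bound cumulative impact, sketched under the assumption that all side-effecting actions pass through the dispatch choke point: a per-session budget that is decremented on each high-impact action and cannot be raised by the agent itself. The action types and limits are illustrative.

```python
from collections import defaultdict

# Per-session budgets for side-effecting actions; illustrative numbers.
SESSION_LIMITS = {"record_update": 25, "email_send": 5, "external_api_call": 50}

_session_usage: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))

class BudgetExceeded(RuntimeError):
    pass

def charge(session_id: str, action_type: str) -> None:
    """Call at the dispatch choke point before executing a side-effecting action."""
    limit = SESSION_LIMITS.get(action_type)
    if limit is None:
        return  # this action type carries no budget
    used = _session_usage[session_id][action_type]
    if used >= limit:
        raise BudgetExceeded(
            f"session {session_id} exceeded its {action_type} budget ({limit})"
        )
    _session_usage[session_id][action_type] = used + 1
```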
Do you have approval workflows for high-impact agent actions?
For actions above a risk threshold—does a human need to approve before the agent proceeds? Is this enforced technically, or just policy?
Can you simulate agent behavior before deploying capability changes?
If you expand what an agent can do—can you predict how it will use those new capabilities? Do you test in environments that reveal misuse before production?
Infrastructure: AI Is Cloud at Scale
AI infrastructure is infrastructure. The fundamentals apply.
Foundational Questions
Is AI infrastructure identity-managed with the same rigor as other production systems?
Service accounts for AI workloads—are they inventoried, scoped to least privilege, regularly rotated, monitored for misuse? Or are they set-and-forget with broad permissions?
If you cannot answer this: AI infrastructure likely has standing privileges that exceed requirements. Compromised AI workloads inherit those privileges.
Is network segmentation implemented for AI training and inference environments?
Can training environments reach production databases? Can inference services reach training data? Are AI workloads on the same network segments as everything else?
If you cannot answer this: Compromised AI workloads can likely move laterally. Training environments with research-grade security become paths to production data.
Do you have visibility into AI infrastructure costs and resource consumption?
Unexpected GPU usage, unusual storage growth, abnormal network traffic—would you detect these? Do you have baselines that would reveal anomalies?
If you cannot answer this: Attackers using your AI infrastructure for their purposes—cryptomining, unauthorized training, exfiltration—would go unnoticed.
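A sketch of the kind of baseline check the question implies: compare current consumption against a rolling history and flag large deviations. Real monitoring stacks express this as alerting rules; the arithmetic below just shows the shape of the check, with illustrative numbers.

```python
from statistics import mean, stdev

def is_anomalous(history: list[float], current: float, threshold_sigmas: float = 3.0) -> bool:
    """Flag a reading that deviates sharply from recent history (e.g. GPU-hours per day)."""
    if len(history) < 2:
        return False  # not enough history to establish a baseline
    baseline, spread = mean(history), stdev(history)
    if spread == 0:
        return current != baseline
    return abs(current - baseline) > threshold_sigmas * spread

# Example: daily GPU-hours for one training project over the past two weeks.
gpu_hours = [110, 95, 102, 98, 130, 105, 99, 101, 97, 112, 104, 100, 96, 103]
print(is_anomalous(gpu_hours, 340))  # True -- worth asking who launched that workload
```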
Operational Questions
Are secrets and credentials used by AI systems managed securely?
API keys, database credentials, service tokens—are they in secret managers with access logging and rotation? Or hardcoded in configurations and notebooks?
If you cannot answer this: AI credentials are likely the most weakly managed in your environment. A research and experimentation culture often deprioritizes credential hygiene.
Do you have logging and audit capability for AI infrastructure that supports security investigation?
If you need to investigate what an AI workload did—who accessed it, what data it read, what network connections it made—can you find out?
If you cannot answer this: Security investigations involving AI infrastructure will be limited by observability gaps. You may not be able to determine scope or impact.
Is AI infrastructure included in your vulnerability management program?
Container images, ML frameworks, GPU drivers, inference servers—are these scanned, patched, and maintained? Or treated as special and excluded from standard processes?
If you cannot answer this: AI infrastructure will accumulate vulnerabilities that your vulnerability program doesn't see. Known exploits will succeed because patches weren't applied.
Advanced Questions
Can you isolate AI workloads from each other based on data sensitivity?
Workloads processing public data vs. workloads processing sensitive data—can they be segregated? Or do all AI workloads share infrastructure regardless of data classification?
Do you have infrastructure-level controls that enforce AI security policies?
At the platform level—not per-application—can you enforce that models must be signed, that data sources must be approved, that deployments must be authorized?
Is your AI infrastructure resilient to compromise of a single component?
If one training job is compromised, is the blast radius limited? Can it affect other workloads, access other data, or pivot to other environments?
Governance: Connecting Policy to Systems
Governance without technical enforcement is documentation.
Foundational Questions
Can you demonstrate that AI policies are technically enforced, not just documented?
Pick a policy. Show how it is enforced in systems. If the policy says "bias testing required before deployment"—what technical control prevents deployment without bias testing?
If you cannot answer this: Your governance is aspirational. Policies exist as intentions, not constraints. Systems can operate outside policy without prevention or detection.
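A sketch of what "technically enforced" means for the bias-testing example: a gate in the deployment pipeline that looks for evidence in the model registry and fails the release if it is absent, failing, or stale. The registry record and its field names are assumptions for illustration, not any particular registry's schema.

```python
from datetime import datetime, timedelta, timezone

MAX_EVIDENCE_AGE = timedelta(days=90)

class PolicyGateFailure(RuntimeError):
    pass

def enforce_bias_testing(model_record: dict) -> None:
    """Deployment pipeline step: refuse to release a model without current bias-test evidence."""
    evidence = model_record.get("bias_test")
    if not evidence or not evidence.get("passed"):
        raise PolicyGateFailure("No passing bias-test evidence recorded for this model version.")
    tested_at = datetime.fromisoformat(evidence["tested_at"])
    if datetime.now(timezone.utc) - tested_at > MAX_EVIDENCE_AGE:
        raise PolicyGateFailure("Bias-test evidence is older than the allowed window; re-test required.")

# Illustrative registry entry: the gate passes only because evidence exists, passed, and is recent.
recent = (datetime.now(timezone.utc) - timedelta(days=10)).isoformat()
enforce_bias_testing({
    "model": "recommender", "version": "3.4.1",
    "bias_test": {"passed": True, "tested_at": recent},
})
```

The same pattern generalizes: for each written policy, there should be a pipeline or platform check that can fail, and a record of it having failed at least once in testing.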
Do you have operational definitions for AI governance requirements?
"Fair," "transparent," "accountable"—what do these mean for specific systems? How are they measured? What thresholds define compliance?
If you cannot answer this: Governance requirements cannot be verified. Every audit becomes interpretation. You cannot prove compliance because you haven't defined what compliance means operationally.
Is there a clear accountability matrix for AI systems?
For any AI system: who owns the data, the model, the deployment, the operation, the compliance, the risk? Are these documented and known? When something goes wrong, is it immediately clear who answers questions?
If you cannot answer this: Accountability is determined after incidents, not before. Gaps between responsibilities become gaps in governance.
Operational Questions
Do you measure governance compliance continuously?
Not at deployment, not quarterly, not at audits—continuously. Are there dashboards showing current compliance status? Alerts when compliance degrades?
If you cannot answer this: You only know compliance state at audit time. Between audits, compliance can degrade without detection.
Can you produce audit evidence from system telemetry, not manual documentation?
When auditors ask for evidence—is it automatically generated by systems? Or assembled manually by people reviewing records?
If you cannot answer this: Your evidence is only as good as your documentation discipline. Systems operate; people sometimes document. What's undocumented is unauditable.
How quickly can you respond to a governance policy change?
If a new regulation requires changes to AI systems—how long to implement? Can governance controls be updated without rebuilding AI systems?
If you cannot answer this: Policy changes become major projects. The gap between policy update and system update becomes a compliance gap.
Advanced Questions
Does AI governance integrate with enterprise risk, compliance, security, and data governance?
Or is it a separate silo with separate processes, separate evidence, separate reviews?
Can governance decisions be traced to technical controls?
From policy requirement to procedure to control to enforcement mechanism—is the chain complete and documented?
Do governance teams have technical capability to verify compliance themselves?
Or do they depend entirely on technical teams to tell them whether systems comply?
Cross-Cutting: Organizational Capability
Technical architecture exists within organizational capability.
Foundational Questions
Do you have people with both security expertise and AI systems understanding?
Not security people who've read about AI, and not AI people who've read about security—people who can think architecturally about both. How many do you have? Where are they?
If you cannot answer this: AI security falls in the gap between teams. Security teams don't understand AI systems. AI teams don't understand security architecture. Neither builds what's needed.
Is AI security integrated into existing security processes or parallel to them?
Threat modeling, vulnerability management, incident response, security reviews—do they include AI, or does AI have separate processes?
If you cannot answer this: AI security is probably siloed. Separate processes create gaps and inconsistencies. Integration with existing processes creates leverage.
Can your organization learn from AI security incidents?
When things go wrong—is there a mechanism to understand what failed, update architecture, and prevent recurrence? Or do incidents get resolved and forgotten?
If you cannot answer this: You will repeat failures. Without learning mechanisms, the same architectural gaps will cause repeated incidents.
Operational Questions
Do development teams have AI security guidance they can actually use?
Not theoretical frameworks—practical guidance. When someone is building an AI system, do they know what security controls to implement? Is guidance specific enough to be actionable?
If you cannot answer this: Security depends on individual developer judgment. Quality varies by team. Consistent architecture requires consistent guidance.
Is there a mechanism for security to engage with AI projects early?
Before architecture is locked, before data is collected, before models are trained—can security influence design? Or does security only see AI systems at deployment review?
If you cannot answer this: Security becomes a deployment gate that finds problems too late to fix efficiently. Remediation is expensive. Prevention is cheap.
Can you prioritize AI security investments based on actual risk?
Not all AI systems are equal risk. Do you have a way to assess which systems need more controls? Are your investments proportional to risk?
If you cannot answer this: You're either over-controlling low-risk systems or under-controlling high-risk systems. Probably both.
Advanced Questions
Do you have metrics that would show AI security posture trends over time?
Are things getting better or worse? Can you tell?
Is there executive visibility into AI security risks and investments?
Can you explain AI security posture to leadership in terms they can act on?
Can you demonstrate ROI on AI security investments?
When you invest in AI security capabilities, can you show what risk was reduced?
Using the Results
After working through these questions, you will have one of three answers for each:
"Yes, and here's the evidence." This is the only good answer. You have the capability. You can demonstrate it. Move on.
"Yes, but I'd need to check." This is a warning. The capability might exist, but it's not operationalized. If you can't quickly demonstrate it, you can't rely on it in an incident. Treat this as a gap that needs verification.
"No." This is honest. You have a gap. Now you know where to invest.
The gaps you identify should be prioritized:
By blast radius. Gaps in foundational areas affect everything downstream. Gaps in data security undermine training security, which undermines deployment security. Fix upstream gaps first.
By system risk. Not all AI systems are equal. Gaps affecting autonomous agents with broad permissions matter more than gaps affecting internal recommendation systems.
By exploitability. Some gaps are theoretical. Some are actively exploitable by current attack techniques. Prioritize gaps that attackers know how to use.
By organizational readiness. Some gaps require organizational change, not just technical implementation. Sequence investments based on what your organization can actually absorb.
The goal is not to answer "yes" to every question. The goal is to have honest answers and intentional priorities. An organization that knows its gaps and has prioritized remediation is more secure than one that has checked boxes without verification.
Key Takeaways
Questions that demand evidence reveal reality that documentation obscures. The difference between "we have a process for that" and "yes, here's how it works" is the difference between governance theater and actual security. Demand evidence. Accept nothing less.
Foundational gaps invalidate advanced capabilities. If you can't trace your training data, it doesn't matter how sophisticated your model validation is. If you don't know what AI systems are deployed, runtime monitoring is incomplete by definition. Fix foundations first.
AI security is lifecycle security. These questions span data, training, deployment, runtime, infrastructure, and governance because AI security spans all of them. Securing one stage while ignoring others leaves attack paths open. The lifecycle is the unit of security.
Organizational capability is architectural capability. Technical controls require people who can build, operate, and improve them. Processes require people who understand both security and AI. Investment in people and integration is investment in architecture.
Checklists don't create security—architecture does. These questions are diagnostic, not prescriptive. Answering them identifies gaps. Closing those gaps requires building actual systems: data lineage infrastructure, model provenance systems, deployment controls, agent governance, infrastructure security, and governance enforcement mechanisms. The checklist tells you what's missing. You still have to build what's missing.