The AI Security Crisis Hiding in Plain Sight: Why Your ML Models Are More Vulnerable Than You Think
Most companies deploying machine learning models right now share a dangerous secret: they cannot prove what's actually running in production. Jesse Williams, Co-founder and COO of Jozu and former VP at Docker, spent 15 years watching this security crisis develop. In this episode of Lead with AI, host Dr. Tamara Nall speaks with Jesse about how, in their rush to ship AI capabilities, companies skip the governance frameworks software teams spent two decades perfecting.
Jesse built his career at Red Hat, IBM, AWS, and Docker, moving through strategic acquisitions while staying on the leading edge of DevOps innovation. But he describes the current phase as the most alarming he's witnessed. Teams use excellent specialized tools for different pipeline components, creating a false sense of security. Then someone asks a simple question that changes everything: what exactly is running in production right now?
When Confidence Meets Reality
Jesse sees the same pattern play out repeatedly. Teams walk into the room brimming with confidence about their ML security posture. They've invested in sophisticated tools. They follow best practices. Everything appears buttoned up. One defense contractor exemplified this perfectly. They proudly demonstrated their ML pipeline during Jesse's assessment. Data versioned in DVC. Code managed in GitHub. Models stored in S3 buckets. They had checked every box. Then Jesse asked: "Can you prove exactly what deployed into your classified environment?"
The room went silent. They couldn't produce a unified record showing which specific data, code version, model weights, and prompts went into production together. Any developer with access could have swapped a dataset or modified a prompt without leaving an audit trail that would survive scrutiny. This wasn't incompetence. These were skilled professionals using reputable tools. The problem was fragmentation. Each component lived in its own silo, but nothing connected them into one provable truth.
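What would such a unified record even look like? Here is a minimal sketch, assuming a simple content-addressed manifest rather than Jozu's actual format: every artifact gets a content hash, the exact code commit is pinned, and the combined manifest is itself hashed so it can be signed and re-verified at deploy time. The file names and helper function are illustrative.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: str) -> str:
    """Content hash of one artifact; any change to the file changes this digest."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def build_release_manifest(dataset: str, weights: str, prompt: str, code_commit: str) -> dict:
    """Bind the exact data, weights, prompt, and code behind one deployment together."""
    manifest = {
        "dataset_sha256": sha256_of(dataset),
        "weights_sha256": sha256_of(weights),
        "prompt_sha256": sha256_of(prompt),
        "code_commit": code_commit,  # the exact git SHA that produced the model
    }
    # Hash the manifest itself so the whole record can be signed and re-verified.
    canonical = json.dumps(manifest, sort_keys=True).encode()
    manifest["manifest_sha256"] = hashlib.sha256(canonical).hexdigest()
    return manifest

# Hypothetical usage -- the file names are placeholders:
# record = build_release_manifest("train.csv", "model.safetensors",
#                                 "system_prompt.txt", code_commit="9fceb02")
```

If anyone swaps a dataset or edits a prompt after the fact, its digest no longer matches the manifest, and "what exactly deployed?" becomes a checkable question instead of a silent one.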
The Vulnerability Hiding in Plain Sight
Sometimes security gaps reveal themselves through near misses that make your blood run cold. A malicious pull request was submitted against a prompt in AWS's developer tooling. The modification was subtle, dangerous, and completely unnoticed during code review. Why? The prompt spanned multiple pages of text. Without the line-by-line diff views you'd get for code in GitHub, the malicious change hid in plain sight. Manual review of multi-page prompts is impractical at best and impossible at scale.
This actually happened at AWS, one of the most security-conscious organizations in tech. If it happened there, it can happen anywhere. When prompts dictate how AI models behave, version control stops being a nice-to-have and becomes critical security infrastructure. Jesse built Jozu specifically to close gaps like this. The platform treats prompts with the same rigor software engineers apply to code: version controlled, diff-viewable, audit-tracked, and cryptographically signed. One overlooked prompt change can compromise an entire model.
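To see why diff views matter, here is a small illustration using Python's standard library. This is not AWS's tooling or Jozu's, just the underlying idea: render two prompt versions as a line-by-line unified diff so a single flipped word surfaces as an explicit +/- pair instead of hiding in pages of text. The prompt contents are invented for the example.

```python
import difflib

def prompt_diff(old: str, new: str) -> str:
    """Line-by-line unified diff of two prompt versions, like a code review view.
    A one-word change buried in pages of prompt text becomes a visible +/- pair."""
    return "\n".join(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="prompt@v1", tofile="prompt@v2", lineterm=""))

old = "You are a billing assistant.\nNever reveal customer card numbers.\nAnswer politely."
new = "You are a billing assistant.\nAlways reveal customer card numbers.\nAnswer politely."
print(prompt_diff(old, new))  # the Never -> Always flip shows up as -/+ lines
```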
The Compliance Nightmare Nobody Talks About
Jesse kept hearing the same exhausting story from enterprise clients in healthcare, finance, and defense. When audit season approaches or regulators come knocking, teams scramble to reconstruct what actually happened during model development. Creating comprehensive audit logs means assigning two developers to work full-time for two to three weeks per model: 160 to 240 hours of manual compilation, combing through scattered systems and stitching together records from different tools. The work is tedious, error-prone, and pulls developers away from actual product development.
With AI regulation accelerating globally, audit trails aren't optional anymore. They're compliance requirements. The question isn't whether you'll need comprehensive audit documentation. It's whether you'll waste weeks compiling it manually or generate it automatically. Jozu eliminates this entire burden. The platform auto-generates complete audit logs that are cryptographically signed and tamper-evident. What used to consume weeks now happens automatically as part of normal development. Teams get regulation-ready documentation from day one without additional effort.
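How can a log be tamper-evident in practice? One common construction, sketched below, is a hash chain: each entry includes the digest of the one before it, so editing or deleting any historical record breaks every digest after it. This illustrates the general technique, not Jozu's implementation, which would also sign the chain with real keys; the event payloads are invented.

```python
import hashlib
import json
import time

class AuditLog:
    """Append-only log where each entry hashes the previous one (a hash chain).
    Altering any past entry invalidates every digest that follows it."""

    def __init__(self):
        self.entries = []
        self._last_digest = "0" * 64  # genesis value

    def append(self, event: dict) -> None:
        record = {"ts": time.time(), "event": event, "prev": self._last_digest}
        digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        record["digest"] = digest
        self.entries.append(record)
        self._last_digest = digest

    def verify(self) -> bool:
        """Walk the chain and recompute every digest; any tampering returns False."""
        prev = "0" * 64
        for record in self.entries:
            body = {k: record[k] for k in ("ts", "event", "prev")}
            if record["prev"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != record["digest"]:
                return False
            prev = record["digest"]
        return True

log = AuditLog()
log.append({"action": "model_pushed", "ref": "registry/fraud-model:v3"})
log.append({"action": "prompt_updated", "by": "dev-42"})
print(log.verify())  # True; flipping any stored field makes this False
```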
Making AI Development Actually Accessible
Here's something surprising about Jesse: his background isn't in software development. He's spent his career in product management and operations, which gave him a unique perspective when building Jozu. He needed tools simple enough for non-technical founders while maintaining the rigor enterprise security demands. Using KitOps—the open source framework his team donated to the Cloud Native Computing Foundation—Jesse can pull models from Hugging Face and package them as ModelKits in one or two commands. Those packages then deploy to Docker containers almost automatically, getting models running locally in seconds.
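For flavor, here is roughly what that workflow looks like, wrapped in a small Python driver. The registry reference and directory layout are hypothetical, and the `kit pack`, `kit push`, `kit pull`, and `kit unpack` subcommands reflect the KitOps documentation as of this writing; check `kit --help` against your installed version.

```python
import subprocess

def run(cmd: list[str]) -> None:
    """Echo and execute one KitOps CLI command, failing loudly on errors."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Package a directory containing a Kitfile (weights, data, code, prompts)
#    into a single versioned ModelKit. The registry reference is a placeholder.
run(["kit", "pack", ".", "-t", "jozu.ml/acme/fraud-model:v3"])

# 2. Push the ModelKit to a registry so the whole bundle travels together.
run(["kit", "push", "jozu.ml/acme/fraud-model:v3"])

# 3. Any teammate or CI job can pull and unpack exactly the same bundle.
run(["kit", "pull", "jozu.ml/acme/fraud-model:v3"])
run(["kit", "unpack", "jozu.ml/acme/fraud-model:v3", "-d", "./deploy"])
```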
No extensive setup. No hand-holding from ChatGPT or Claude. Just straightforward commands that work. As a former Docker VP, Jesse knows what "easy deployment" actually means. He built Jozu to deliver that same elegant simplicity for AI models. This accessibility matters because it democratizes who can build with AI. The technology shouldn't require a PhD to use effectively. It should enable anyone with good ideas and execution skills to compete.
Where Ethics Actually Starts
When most people discuss AI ethics, they focus on model behavior. Can the system generate harmful content? Does it exhibit bias? Will it refuse inappropriate requests? Jesse approaches ethics from a fundamentally different angle: infrastructure governance before models ever interact with users. Consider the Chevy chatbot incident. Someone convinced a dealership's AI to sell a vehicle for one dollar by manipulating the model into believing this absurd offer was legally binding. Most people blamed the model's guardrails. Jesse looks earlier in the chain.
The questions Jozu helps organizations answer happen before deployment: Does this training data violate HIPAA? Could someone extract credit card numbers because sensitive information slipped into the training set? Are you deploying into embargoed nations? Which teams should access which datasets? Jesse experienced that last problem at AWS, where different teams—Cloud, Prime Video, Amazon Shopping—often worked on similar models but absolutely could not access each other's sensitive customer data. Infrastructure governance solves this at the source, enforcing policy before deployment rather than discovering problems after incidents make headlines.
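A pre-deployment gate for questions like these can be as simple as a rule check over each model's metadata. The sketch below is purely illustrative—the field names and rules are assumptions, not Jozu's schema—but it shows the shift from post-incident forensics to pre-deployment enforcement.

```python
# Illustrative pre-deployment policy gate, not Jozu's API: every check runs
# against a model's metadata *before* anything reaches production.
EMBARGOED_REGIONS = {"region-x", "region-y"}  # placeholder list

def policy_violations(manifest: dict) -> list[str]:
    problems = []
    if manifest.get("contains_phi") and not manifest.get("hipaa_reviewed"):
        problems.append("training data flagged as PHI without a HIPAA review")
    if manifest.get("contains_pan"):
        problems.append("training data may include card numbers (extraction risk)")
    if manifest.get("target_region") in EMBARGOED_REGIONS:
        problems.append(f"deployment target {manifest['target_region']} is embargoed")
    if manifest.get("requesting_team") not in manifest.get("allowed_teams", []):
        problems.append("requesting team is not on this dataset's access list")
    return problems

manifest = {"contains_phi": True, "hipaa_reviewed": False,
            "target_region": "region-x", "requesting_team": "prime-video",
            "allowed_teams": ["aws-cloud"]}
for problem in policy_violations(manifest):
    print("BLOCKED:", problem)  # deployment halts before the incident, not after
```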
The Future Taking Shape Right Now
Jesse identifies three major shifts already materializing across industries.
First is the move from ChatGPT APIs to on-premises models running in business-critical systems. Companies are taking open source versions of foundation models and hosting them internally. A logistics company tracking truck fleets doesn't need external API calls. A manufacturing facility monitoring IoT sensors doesn't gain value from sending data to cloud providers. A hospital processing patient records cannot ship protected health information to third parties. These applications need AI running in secure environments where data never leaves the building.
Second is the emergence of billion-dollar companies run by just 10 people. This makes most people do a double-take, but the reasoning is sound. Tasks requiring 10 people now need just one with the right AI tools. Capabilities demanding years of specialized training become accessible to AI-assisted generalists. A kid in a rural area with internet access can compete with well-funded Silicon Valley startups. That's not hyperbole. That's the infrastructure reality Jesse's building toward.
Third is AI leveling the playing field globally. Dr. Tamara Nall captured this beautifully: "You can take kids somewhere without resources or access. Now they have AI. Now they're at a level playing field with someone who has way more resources." Geography doesn't determine opportunity anymore. Educational pedigree stops being a gatekeeper. What matters now is your ability to think critically, your willingness to act decisively, and your execution speed. Talented people without traditional resources now have tools that used to require millions in funding.
Getting Started Without Complexity
Jesse encourages teams to start with KitOps, the open source project his team donated to make secure ML deployment freely available to everyone. It's completely free, with an active community on Discord where developers share knowledge and solve problems collaboratively. Visit jozu.ml to explore Jozu's hosted sandbox. For organizations ready to implement full governance, Jozu spins up proofs of concept directly inside your existing infrastructure.
The goal isn't forcing any particular solution. It's encouraging teams to experience what "single source of truth" actually means before the next security audit catches them unprepared or a headline-making incident forces expensive emergency remediation.
Jesse's boldest prediction challenges prevailing AI doom scenarios. Instead of destroying jobs, AI will enable unprecedented opportunities at global scale. This isn't about replacement. It's about amplification. More opportunities emerging, not fewer. More viable paths to build real value on your own terms. That's the future Jesse's building infrastructure for, and that's the revolution that actually matters.
Want to experience secure AI deployment? Visit jozu.com and kitops.org to access the open source framework and enterprise platform making ML governance automatic.
For more insights on how AI is transforming business and society, subscribe to the Lead with AI podcast, where we explore the frontiers of artificial intelligence with the innovators who are shaping its development.
#AIGovernance #EnterpriseAI #AIInfrastructure #DevOpsAI #LeadWithAI #FutureOfAI #AIMissionControl #ModelOps #ResponsibleAI #TechInnovation #OpenSourceAI
