Frontier models are predicting recursive self-improvement (i.e., AI that improves itself without human involvement) within 12 to 18 months. The window to build adequate defenses is narrower than most organizations realize.
The real threat isn’t CVE volume; it’s the ability to chain medium-severity vulnerabilities into critical exploits, making CVSS scores an increasingly unreliable measure of actual risk.
AI-generated code will largely eliminate classic vulnerability classes like SQL injection and XSS within a few years, but business logic flaws, prompt injection, and other AI-native attack surfaces are moving in to replace them.
Developers will adopt AI coding tools whether security approves them or not. The organizations winning aren’t restricting access; they’re rethinking workflows entirely and building secure defaults in from the start.
Traditional frameworks weren’t designed for agentic AI. The emerging response centers on scoped identity, action-level authorization, deep observability, and aggressive containment.
What does securing software in an AI-accelerated world actually mean?
Trying to make sense of this question, and offering expert prescriptive advice around it, was the theme of VibeSecCon Returns, an OX Security-hosted virtual event featuring CISOs, industry experts, security researchers, and ample audience participation.
From recursive self-improvement to exploit chains to agentic governance, the conversations were candid, technical, and even a little alarming at times.
Here are five key takeaways from VibeSecCon Returns 2026.
If you want to understand where AI security is heading, it helps to understand where AI itself is heading. OX Security CEO and co-founder Neatsun Ziv mapped the generational evolution of frontier models: from GPT-3’s static knowledge, to GPT-4’s “inner loop” reasoning, to Claude 4.5’s “outer loop” tool use, to today’s Mythos-class models with their deep domain focus and multi-step goal pursuit.
But Neatsun’s more provocative point was about what comes next. Recursive self-improvement (RSI) is the next frontier, and the timeline is shorter than you may realize.
“Most of the frontier models are predicting anywhere between 12 to 18 months until they have RSI live in production,” he said. “And for the frontier models to say ‘this is live in 12 months,’ it means that probably they’ve got a running POC right now in the lab that is doing self-improvement.”
The implication for security is massive. Today, we’re still learning to defend against Mythos-era capabilities. Within a year, we may be facing models that improve themselves continuously without human involvement, operating across dimensions we can’t yet frame.
The window to get ahead of this is narrow.
OX Head of Security Research Moshe Siman Tov Bustan grounded the discussion in practice when he walked through original research on Mythos-attributed CVEs.
His team manually reviewed dozens of vulnerabilities that Mythos had identified and found something counterintuitive: most of them, in isolation, didn’t amount to much.
The danger lies in the chain of vulnerabilities, not any single one.
Moshe demonstrated how four vulnerabilities in a database system (none of them individually critical) could be linked together to achieve full remote code execution. Authentication bypass led to a JDBC bypass, which enabled SQL injection, which enabled a serialization exploit, which enabled RCE. The weakest link in the chain was a medium-severity finding.
“The weakest vulnerability is a medium vulnerability,” he said. “Without that specific medium vulnerability, all of the chain wasn’t able to walk.”
The practical takeaway: CVSS scores are a poor proxy for real-world risk. Alert fatigue is real, and it’s being made worse by AI-generated vulnerability volume.
What security teams actually need is the ability to understand which vulnerabilities can be linked, and what path leads to actual business impact.
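One way to picture that kind of chain analysis is a small graph search, where each finding needs one attacker capability and grants another. The sketch below is illustrative only: the finding names, severity labels, and capability strings are hypothetical stand-ins loosely modeled on the chain described above, not data from the OX research, and `find_chain` is an assumed helper rather than any tool's real API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    name: str
    severity: str  # CVSS-style label (hypothetical values)
    requires: str  # capability the attacker must already hold
    grants: str    # capability gained by exploiting the finding


# Hypothetical findings; none is rated critical on its own.
FINDINGS = [
    Finding("auth-bypass", "high", "network-access", "authenticated-session"),
    Finding("jdbc-bypass", "medium", "authenticated-session", "raw-jdbc-connection"),
    Finding("sql-injection", "high", "raw-jdbc-connection", "arbitrary-sql"),
    Finding("unsafe-deserialization", "high", "arbitrary-sql", "remote-code-execution"),
    Finding("verbose-errors", "low", "network-access", "stack-traces"),
]


def find_chain(findings, start, goal):
    """Breadth-first search for the shortest exploit chain from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        capability, path = queue.popleft()
        if capability == goal:
            return path
        for finding in findings:
            if finding.requires == capability and finding.grants not in seen:
                seen.add(finding.grants)
                queue.append((finding.grants, path + [finding]))
    return None  # no chain reaches the goal


chain = find_chain(FINDINGS, "network-access", "remote-code-execution")
print([f.name for f in chain])

# Removing the single medium-severity finding breaks the whole chain.
without_medium = [f for f in FINDINGS if f.severity != "medium"]
print(find_chain(without_medium, "network-access", "remote-code-execution"))
```

In this toy model, the medium-rated finding matters more than its score suggests because every path to code execution runs through it, which is the kind of signal a per-finding severity score cannot express.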
For decades, the same categories of vulnerabilities have dominated application security: SQL injection, cross-site scripting, broken authentication among them. The OWASP Top 10 has been a reliable map of the threat landscape for so long that most security programs are built around it.
That map is about to get redrawn.
The operational panel (hosted by OX Security Field CTO Boaz Barzel and joined by James Berthoty of Latio, Daniel Begimher of AWS, and Pritam Mungse of Poshmark) took on this topic, and Pritam made this case directly:
AI-generated code will largely eliminate the classic vulnerability classes within the next few years, simply because the models generating the code are getting good enough to avoid them by default.
“Issues that we are concerned about today, I don’t think they will be that relevant in 12 months or so,” he said. “There will be quite a lot of changes to the OWASP Top 10. Vulnerabilities that we are primarily targeting today, like SQL injection and XSS — those kind of things will go away, because AI will predominantly try to generate secure code in that aspect.”
That’s not an entirely reassuring picture, however.
The same shift that makes traditional vulnerabilities less common is creating new categories of risk that existing security programs aren’t equipped to handle. Business logic flaws, insecure direct object references, and entirely new AI-native vulnerability classes (prompt injection, skill-based attacks, compromised MCP servers) are moving into the space that SQL injection and XSS are vacating.
“New areas of issues will come up with all of the new technologies,” Pritam noted. “There will still be some things that continue, like business logic issues, that are very complex for AI to understand so far.”
For security teams, this means the tools and workflows built around catching yesterday’s most common vulnerabilities are becoming less relevant, while the exposure surface that actually matters is shifting somewhere programs may not be looking yet.
There’s a tension every security team already feels: developers want to move fast, and AI has dramatically increased their ability to do so.
The problem? Governance models haven’t kept up.
James put it in resonant terms: developers will use the AI tools they want to use, with or without security’s blessing.
“The modern version of that is now if I can’t use my Claude Code, then I’m just not interested in it, and I’m gonna go use it anyways,” he said. “So, creating the safe ways for them to use the tooling that they want to use in the first place.”
The panel’s consensus was that the organizations pulling ahead aren’t the ones running governance committees to decide which models to allow, but the ones rethinking workflows from the ground up. They’re asking what it means to do their work differently in an AI world, not just which AI tools to permit.
Security teams that shift from gatekeeping to enabling (i.e., building secure defaults into agent context, threat modeling codebases with AI, providing fixes rather than just findings) are the ones building durable programs.
How do you govern systems that behave in ways you can’t fully predict?
The closing session, hosted by Rain Capital Managing General Partner Chenxi Wang and joined by Pieter VanIperen of AlphaSense, Samir Sharif of Fastly, and Daniel Liber of Monday.com, tackled this difficult question.
Most existing security frameworks weren’t built for this reality. Traditional controls assume known assets, defined behaviors, and deterministic policies. Agentic AI breaks all three assumptions.
Pieter captured the attribution problem precisely:
“There’s this gut reaction to say that, Daniel spurred that agent, so Daniel’s responsible for what the agent did, but that’s kind of like saying that you’re gonna play a game of telephone with perhaps the seven most terrible people you know in your life, and then be held responsible for what the seventh person in that game of telephone does,” he said.
The emerging framework that practitioners are coalescing around has four pillars:
Scoped identity (no shared credentials, no standing access)
Action-level authorization, with human-in-the-loop approval for high-impact operations
Full observability with enough telemetry to reconstruct what an agent was trying to do and why
Containment, with network segmentation, blast radius isolation, and kill switches
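As a rough illustration of how the first three pillars might fit together in code, the sketch below gives each agent its own identity with an explicit action allowlist, checks every action individually, holds high-impact actions for human approval, and records each decision along with the agent's stated intent. Every name here (`AgentIdentity`, `authorize`, the action strings, the `HIGH_IMPACT_ACTIONS` set) is a hypothetical assumption for illustration, not something presented by the panel. Containment is left out because it usually lives at the network and infrastructure layer.

```python
from dataclasses import dataclass

# Hypothetical set of actions that always need a human sign-off.
HIGH_IMPACT_ACTIONS = frozenset({"deploy:production", "secrets:rotate"})


@dataclass(frozen=True)
class AgentIdentity:
    """Scoped identity: one per agent, with an explicit action allowlist."""
    agent_id: str
    allowed_actions: frozenset


# Observability: every decision is recorded with the agent's stated intent.
audit_log = []


def authorize(identity, action, intent, human_approved=False):
    """Decide a single action: 'allow', 'deny', or 'needs_approval'."""
    if action not in identity.allowed_actions:
        decision = "deny"
    elif action in HIGH_IMPACT_ACTIONS and not human_approved:
        decision = "needs_approval"
    else:
        decision = "allow"
    audit_log.append({
        "agent": identity.agent_id,
        "action": action,
        "intent": intent,
        "decision": decision,
    })
    return decision


# Hypothetical agent that may read the repo and (with approval) deploy.
release_bot = AgentIdentity(
    agent_id="release-bot-7f3a",
    allowed_actions=frozenset({"repo:read", "deploy:production"}),
)
```

With this setup, `authorize(release_bot, "deploy:production", "ship hotfix")` comes back as `"needs_approval"` until a human signs off, and any action outside the allowlist is denied outright.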
Containment is the most important layer, Daniel said, because if you start there, whatever goes wrong will hurt less.
Samir added a supply chain dimension: organizations need to centralize build-fail policy, get real visibility at the edge, and close the runtime feedback loop, because the attack surface is expanding faster than most teams can track.
Lastly, Pieter offered what became the session’s most quotable piece of operational advice:
“Be violent with your agents.”
Agents are ephemeral software. When something looks wrong, don’t deliberate — kill it and restart. Building agentic systems that can’t survive that kind of intervention is itself an architectural mistake.
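A minimal sketch of what "survivable" could look like, assuming progress is checkpointed outside the agent so that a kill costs almost nothing. The `Checkpoint`, `Agent`, and `Supervisor` classes are hypothetical and run in memory; a real system would terminate a process or container and persist checkpoints in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Progress stored outside the agent, so it survives a kill."""
    completed: list = field(default_factory=list)


class Agent:
    def __init__(self, generation, checkpoint):
        self.generation = generation
        self.checkpoint = checkpoint
        self.alive = True

    def run_step(self, step):
        if not self.alive:
            raise RuntimeError("agent was killed")
        self.checkpoint.completed.append(step)


class Supervisor:
    def __init__(self, checkpoint):
        self.checkpoint = checkpoint
        self.generation = 0
        self.agent = Agent(self.generation, checkpoint)

    def kill_and_restart(self):
        """Kill the current agent without deliberation and start a fresh one."""
        self.agent.alive = False
        self.generation += 1
        self.agent = Agent(self.generation, self.checkpoint)
        return self.agent

    def remaining(self, plan):
        """Steps the replacement agent still has to do."""
        return [s for s in plan if s not in self.checkpoint.completed]


plan = ["fetch", "analyze", "report"]
supervisor = Supervisor(Checkpoint())
supervisor.agent.run_step("fetch")

suspect = supervisor.agent       # something looks wrong with this one
supervisor.kill_and_restart()    # no debate: replace it

print(supervisor.remaining(plan))
```

Because the replacement agent resumes from the shared checkpoint, killing a suspicious agent only costs the step that was in flight.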
The final poll of the day told an honest story: 80% of attendees said balancing AI innovation and security control is a constant struggle, and not a single person said they were confident their existing frameworks could adapt to the AI era.
That isn’t an admission of failure. It’s simply where the industry is right now.
The practitioners who showed up to VibeSecCon Returns aren’t waiting for certainty — they’re building frameworks in real time, sharing what’s working, and being honest about what isn’t.
That kind of shared, open conversation is exactly what the moment calls for.
Catch up on a full replay of VibeSecCon Returns.
The post AI is Rewriting the Rules of Software Security. Here’s What the Experts Say. appeared first on OX Security.