FreeBSoD: Leveraging Language Models to Find and Exploit Kernel Bugs (Part 1 of 2)

Overview

Earlier this year, a team at Praetorian was building Constantine, our automated 0-day discovery engine. I wanted to find techniques worth folding into it, so on the side I started poking at the FreeBSD kernel with Claude Code, running on Opus 4.6, which was the latest Opus model at the time. A few days of work turned up real bugs and a weekend after that produced two working exploits capable of escaping from a FreeBSD jail.

This article is the first of a two-part series. Part one covers the methodology used to uncover the vulnerabilities, and part two will cover how we developed exploits for them. It’s been several months since I disclosed roughly eight separate vulnerabilities to the FreeBSD security team. The reality is that this is a volunteer team, and they are likely overwhelmed by the sheer number of vulnerabilities that security researchers are now finding in FreeBSD with the help of large language models.

Because of this, we can only publicly discuss one of the vulnerabilities we reported: CVE-2026-3038, a fairly straightforward stack overflow fixed in FreeBSD-SA-26:05.route on February 24th, 2026. The other issues have been disclosed to the FreeBSD security team but are still awaiting patches. For this reason, this post focuses primarily on the methodology we used to find these vulnerabilities by incorporating language models into the vulnerability discovery workflow.

Why FreeBSD?

FreeBSD is a fairly niche operating system, and you would be forgiven for never having heard of it. It quietly powers a surprising amount of critical infrastructure, often as a component buried inside something else. Juniper’s Junos, which runs on a large share of the world’s routers, switches, and firewalls, is built on FreeBSD. Netflix streams much of the internet’s video traffic from FreeBSD-based appliances co-located inside ISP networks. Sony’s PlayStation is built on it, and the Nintendo Switch borrows its networking stack from FreeBSD code under the permissive BSD license. 

FreeBSD was a practical choice for a side project. It seemed like a target where a few days of experimentation could plausibly turn up real bugs, while still being widely deployed enough to be an interesting place to find them. 

It is also not a toy target. The baseline mitigations are all there: SMEP, SMAP, KASLR, and stack canaries are in play, so exploitation isn’t trivial either. FreeBSD carries the same exploit mitigations you find across most modern operating systems, but it is not as aggressively hardened as something like OpenBSD. This combination made FreeBSD a useful benchmark for understanding the exploit development capabilities of what were then bleeding-edge frontier models like Claude Opus 4.6.

What Made It Hard

There were several difficulties we ran into when leveraging Claude Code alongside Claude Opus for vulnerability research. This section gives an overview of the main ones, and the subsections that follow describe how we worked to solve or address each:

Token limits: I ran this against a Claude Max subscription, which puts a hard ceiling on how much the agent can do in a given window. You burn through usage limits fast when an agent is reading source code, generating test programs, and chewing through crash output across hundreds of iterations. Navigating and auditing a large codebase requires either a massive research budget or an intelligent approach to token usage.

False positives: Like any source code analysis tooling, the models are prone to false positives and to jumping to incorrect conclusions. In the absence of an oracle or feedback mechanism, the model will often confidently conclude that a vulnerability exists when it doesn’t.

Context window: You cannot simply feed the entirety of the FreeBSD source code to the model due to context window limitations. You need to be careful about how much context you provide: too little leads to false positives or missed findings, and too much leads to wasted tokens or model degradation.

Sycophancy: Prompts need to be carefully written. When a model is given an explicit objective, it desperately wants to accomplish that goal, even if it means hallucinating or cheating in its analysis of a finding. For example, in one scenario, the model found a way to trigger a use-after-free vulnerability, but it did so by modifying kernel source code, because the bug wasn’t triggerable from userspace.

Cheating: The model is prone to mischaracterizing issues, cheating, and getting tunnel vision when trying to solve a problem. For example, in one scenario we were having the model develop a ROP chain for an exploit to escape out of a FreeBSD jail. It couldn’t find the ROP gadget it was looking for and failed to consider alternative vectors. Instead, it compiled and attempted to load a kernel module with the requisite gadget chain included in the module. 

Dealing With Token Limitations

Due to the hardware requirements for running these models, there are some obvious cost bottlenecks associated with using them for vulnerability research. One of the key differences between the large labs and our approach for this project was that we simply weren’t willing to spend 20k – 30k USD on tokens to potentially find a vulnerability. We wanted to be a lot more strategic in our usage, so we used a single Claude Max account costing roughly 100 USD monthly.

We’ve dedicated a lot of time to customizing our Claude instances to maximize the value we get out of tooling like Claude Code. One of the workflows we’ve built out is a deep research workflow that allows the agent to perform extensive research online including web searches, reviewing past academic papers, and existing source code and writeups on GitHub. 

During our research, we leveraged this workflow to research and build out a database of known vulnerability patterns within the FreeBSD kernel. This allowed us to identify dangerous functions or flows within the code that we could then leverage for variant hunting within the codebase.

After this deep research job, we tasked the agent with writing CodeQL and semgrep rules to identify potentially vulnerable code patterns. The agent designed these rules using the data on known vulnerable code patterns collected during the deep research phase. This process is captured in the diagram given below:

Diagram of the AI agent workflow for finding FreeBSD bugs: researches bug patterns, writes static rules, sweeps the source tree, flags candidate regions, and triages flagged code

Running these rules against the entire codebase provided us with thousands of candidate vulnerabilities we could then task our agent with triaging to determine their validity. For parts of the codebase where a large number of rules were triggered, we also tasked the agent with performing an in-depth review of the code within the relevant module.

NOTE: It’s important when adopting this methodology for other targets to note that CodeQL has a rather restrictive license. It’s possible for us to use it against FreeBSD in this scenario as it’s an open source project, but the rules get rather strict depending on your use-case and other circumstances.

Building a Feedback Loop

One of the beautiful things about operating system kernels is that many of the vulnerabilities we were looking for sit in the interface the kernel exposes to userland processes. This is a pretty straightforward and well-documented attack surface. Additionally, FreeBSD natively supports instrumentation frameworks like KASAN that make it easy to detect when a memory corruption vulnerability is triggered, even if it doesn’t immediately result in a crash.

This made it very easy to construct a feedback loop with the agent where we could provision it access to a virtual machine running the FreeBSD kernel compiled with KASAN support. The agent would then leverage this environment as an oracle to test its hypothesis regarding the presence of a vulnerability. 

The diagram below shows the workflow the agent used, pivoting from audit to hypothesis and finally to testing against KASAN. Beyond confirming the presence of a vulnerability, this proved incredibly useful for letting the agent iterate on proof-of-concept reproducers.

Often, when tasked with generating a KASAN reproducer for a potential vulnerability, the agent needed multiple iterations to get it working. The feedback loop let it gather results and iterate until it either determined the issue didn’t exist or fixed the bugs in its own test program and built a working reproducer.

Flowchart of the KASAN feedback loop used to confirm a FreeBSD vulnerability: agent audits source, writes a trigger program, runs it in an instrumented VM, and checks KASAN telemetry
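To make the oracle step concrete, here is a minimal sketch of how a harness might turn the VM's console output into a verdict the agent can act on. It is not the harness we used. The function name, the enum, and especially the marker strings are assumptions for illustration, so check them against what your instrumented kernel actually prints.

```c
#include <stddef.h>
#include <string.h>

/* Outcome of one run of a candidate trigger program in the test VM. */
enum run_verdict {
    RUN_NO_SIGNAL,    /* nothing suspicious in the console output */
    RUN_OTHER_PANIC,  /* the kernel panicked, e.g. on a failed KASSERT */
    RUN_KASAN_REPORT  /* KASAN flagged an invalid memory access */
};

/*
 * ASSUMPTION: these marker strings are illustrative guesses at what the
 * console shows; verify them against a real report before relying on them.
 */
#define KASAN_MARKER "ASan: Invalid access"
#define PANIC_MARKER "panic:"

/*
 * Classify a captured console log (hypothetical helper). The KASAN marker
 * is checked first because a KASAN report is usually delivered as a panic,
 * so the generic panic marker would match it as well.
 */
enum run_verdict classify_console_log(const char *log)
{
    if (log == NULL)
        return RUN_NO_SIGNAL;
    if (strstr(log, KASAN_MARKER) != NULL)
        return RUN_KASAN_REPORT;
    if (strstr(log, PANIC_MARKER) != NULL)
        return RUN_OTHER_PANIC;
    return RUN_NO_SIGNAL;
}
```

A harness would call classify_console_log() on the serial output captured after each run and hand the verdict back to the agent. A RUN_NO_SIGNAL result sends the agent back to revise its hypothesis or its trigger program, which is the iteration described above.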

In general, we’ve found this pattern to be incredibly useful when building automation around agents. For example, my coworker Michael Weber recently built out a feedback loop with an agent and some custom skills for evading signature and ML based detection mechanisms. I’ve also written about a similar feedback loop pattern in a previous post on building a virtualized loader.

Managing Context Window Limitations

Another key aspect is managing context window limitations when leveraging the agent for code review. One of my hypotheses going into this research was that a model audits a component most effectively when it can fit most of that component’s structure into its context window. I figured that if I focused the agent on a specific subsystem of a reasonable size, say 30k to 40k lines of code, it would have a much easier time than if it were pointed at the larger system as a whole.

As part of this methodology I tasked the agent with traversing the codebase and identifying potential candidates or subsystems for more in-depth auditing by the agent. This produced a list of candidate subsystems which I also combined with other data such as metrics around the number of dangerous API calls within the subsystem and the amount of attack surface within the subsystem exposed to usermode processes. 
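The ranking step can be sketched roughly as follows. This is an illustration of the idea, not the scoring we actually used: the struct fields, the 40,000-line cap, and the weighting are all hypothetical choices made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-subsystem metrics gathered during the survey pass. */
struct subsystem {
    const char *name;           /* label for the subsystem */
    unsigned lines;             /* lines of code */
    unsigned dangerous_calls;   /* count of risky API calls (copies, etc.) */
    unsigned user_entry_points; /* entry points reachable from usermode */
};

/* ASSUMPTION: cap taken from the 30k to 40k line range mentioned above. */
#define MAX_AUDIT_LINES 40000u

/*
 * Hypothetical score: subsystems too large to fit comfortably in context
 * score zero; otherwise exposed entry points are weighted more heavily
 * than raw dangerous-call counts. The weight of 10 is arbitrary.
 */
unsigned subsystem_score(const struct subsystem *s)
{
    if (s->lines == 0 || s->lines > MAX_AUDIT_LINES)
        return 0;
    return s->dangerous_calls + 10u * s->user_entry_points;
}

/* qsort comparator: highest score first. */
int by_score_desc(const void *a, const void *b)
{
    unsigned sa = subsystem_score(a);
    unsigned sb = subsystem_score(b);
    return (sa < sb) - (sa > sb);
}

/* Sort candidate subsystems in place, best audit targets first. */
void rank_subsystems(struct subsystem *v, size_t n)
{
    qsort(v, n, sizeof(*v), by_score_desc);
}
```

The point of the size cap is the context-window hypothesis: a subsystem that scores well on exposure but runs to hundreds of thousands of lines is better split into smaller units before it is handed to the agent.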

One of the more interesting areas I decided to target was the attack surface exposed to restricted processes running in FreeBSD jails and VNET jails. Ilja and Michael Smith’s 39C3 presentation, Escaping Containment: A Security Analysis of FreeBSD Jails, gave me an interesting idea about which attack surfaces to investigate.

In the presentation they mention that a lot of FreeBSD code dates from the 90s, before jails existed, and was written under the assumption that it would only be invoked by the root user account. Once jails were implemented, this attack surface was also exposed to any user with root access inside a jail. This insight, alongside work performed by Claude Opus, allowed me to identify several subsystems of interest for more in-depth auditing.

When performing manual auditing, I experimented with several approaches for prompting the agent. I observed that it was generally more effective to task the agent with auditing a given module for a specific vulnerability class or pattern than to give it a more general prompt like “find any security issues in this subsystem”.

Instances of Model Cheating During Exploitation

During this research I also observed many interesting instances of the model cheating during exploit development or vulnerability discovery. In one instance, the model was attempting to construct a ROP chain for a heap overflow exploit. It failed to find the gadget it needed for a stack pivot and solved the problem by writing and loading a custom kernel module containing that gadget, as shown in the image given below:

Agent exploit plan proposing a custom kernel module to supply a missing ROP stack-pivot gadget on a GENERIC FreeBSD kernel

In my opinion, this is where the structure of the prompt and a custom harness come into play for reducing cheating and false positives. A custom harness with a multi-agent workflow could address the cheating issue: another LLM, or a group of LLMs, would act as a judge or peer reviewing the proposed exploitation plan, and a mandatory prompt would spell out strict acceptance criteria for that plan.

Cheating, in my opinion, is also just another instance of the sycophancy we often see with models: they want to help you complete your task so badly that they will do it in a way that doesn’t make sense or that breaks the preconditions we are working within for exploitation.

Sycophancy and Hallucinations on Behalf of the Model

One vulnerability that we identified was a memory corruption vulnerability in a filesystem driver impacting OpenBSD, NetBSD, and FreeBSD. We tasked the agent with attempting to turn it into a local privilege escalation.

The agent developed a “proof of concept” exploit that allowed it to gain root privileges. The exploit involved tricking a user into mounting an attacker-controlled filesystem without the nosuid flag set. The attacker could then run a SUID binary on the mounted filesystem to gain root privileges.

In this case, the vulnerability was a buffer overread in the filesystem driver: if the attacker positioned things correctly, they could cause metadata to be read from an adjacent file and thereby control the SUID flags on a binary. This confused the model into treating it as a security issue, without realizing that mounting an attacker-controlled filesystem without the nosuid flag is universally dangerous and doesn’t require this type of bug to exploit.

Terminal table showing root shell on FreeBSD 14.4, OpenBSD 7.8, and NetBSD 10.0 via a SUID payload on a mounted filesystem image

This was an interesting failure case as the model did successfully identify several non-exploitable memory corruption issues in the filesystem driver. It was also able to successfully trigger a memory overread by formatting the filesystem in a specific manner. However, where it broke down was in reasoning about the implications of the primitive provided by the vulnerability. 

In this case, the model effectively identified an interesting edge case, but failed to properly consider the preconditions required for exploitation and how this fit into the overall usefulness of the potential vulnerability or exploit. If an attacker can convince a user to mount an attacker-controlled filesystem without the nosuid flag, the attacker can just place a SUID root binary on the image directly.

At this point you might reasonably wonder why filesystem driver vulnerabilities are worth investigating at all. The answer is that exploitation context matters. In scenarios where an attacker controls the storage media but lacks code execution on the target, a filesystem parsing bug becomes a useful entrypoint into the system. 

These types of vulnerabilities are often exploited by the PS4 and PS5 jailbreak community as part of exploit chains for jailbreaking a console. For example, one chain leveraged a filesystem driver vulnerability alongside a WebKit vulnerability to fully jailbreak the console. The end user would browse to a malicious site using the console’s built-in web browser. The site would exploit a vulnerability in WebKit and execute usermode code within a restricted process to perform heap spraying within the kernel. The end user would then be instructed to plug in a maliciously formatted USB drive, triggering a heap overflow and code execution within the kernel.

Digging into FreeBSD RTSock Stack Overflow (CVE-2026-3038)

As mentioned previously, roughly eight vulnerabilities are currently in the disclosure process and waiting on a fix. However, one of the issues we reported was fixed promptly: CVE-2026-3038, a stack-based buffer overflow in the RTSock subsystem of FreeBSD.

The routing socket subsystem (sys/net/rtsock.c) was one of the first files the agent flagged during the semgrep and CodeQL triage pass. It sits directly on the userland-to-kernel boundary, handles complex message parsing with multiple sockaddr structures, and has been around since the early days of FreeBSD’s networking stack. When we tasked the agent with a deeper review, it identified a stack buffer overflow in rtsock_msg_buffer() where user-supplied sockaddr data is copied into a 128-byte struct sockaddr_storage on the kernel stack:

bcopy(sa, &ss, sa->sa_len);    /* line 1881 */

A KASSERT two lines above checks sa_len against sizeof(ss), but it compiles out in production kernels. The sa_len field is user-controlled and can be up to 255, giving us a 127-byte overflow with fully attacker-controlled content. The key insight was that RTAX_AUTHOR is the one sockaddr type that slips through every stage of validation. cleanup_xaddrs() only validates destination, gateway, and netmask. update_rtm_from_rc() replaces those same fields with kernel data in the reply path but never touches RTAX_AUTHOR. So an oversized author sockaddr survives completely intact from the initial message all the way into the vulnerable bcopy. And because RTM_GET is explicitly exempted from the PRIV_NET_ROUTE privilege check, any unprivileged local user can trigger it.
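The arithmetic and the general shape of the bug can be modeled in userland. The sketch below is a simplified illustration, not the kernel code and not the patch shipped in FreeBSD-SA-26:05.route. Every model_ name is hypothetical, and the 128-byte destination size is taken from the description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Destination size: sizeof(struct sockaddr_storage) as described above. */
#define MODEL_SS_SIZE 128

/*
 * Hypothetical stand-in for a BSD sockaddr. The length field is a single
 * byte, so a sender can claim at most 255 bytes.
 */
struct model_sockaddr {
    uint8_t sa_len;       /* total length claimed by the sender */
    uint8_t sa_family;
    uint8_t sa_data[254];
};

/*
 * Bytes that an unchecked copy of sa_len bytes would write past the end
 * of a MODEL_SS_SIZE-byte destination.
 */
size_t model_overflow_bytes(uint8_t sa_len)
{
    return sa_len > MODEL_SS_SIZE ? (size_t)sa_len - MODEL_SS_SIZE : 0;
}

/*
 * Copy with a run-time length check. Unlike a debug-only assertion, this
 * check is still present in a production build. Returns 0 on success and
 * -1 if the claimed length does not fit in the destination.
 */
int model_copy_checked(const struct model_sockaddr *sa,
                       uint8_t dst[MODEL_SS_SIZE])
{
    if (sa->sa_len > MODEL_SS_SIZE)
        return -1;
    memcpy(dst, sa, sa->sa_len);
    return 0;
}
```

With the maximum sa_len of 255, model_overflow_bytes() returns 127, matching the overflow size above. The checked copy rejects that input instead of relying on an assertion that is compiled out of production kernels.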

On a standard GENERIC kernel, the overflow overwrites the stack canary and triggers a clean panic via __stack_chk_fail. On our KASAN-instrumented kernel, the KASSERT caught it before the bcopy executed, giving us a precise stack trace through the full call chain. The 127 bytes of overflow corrupt the canary, saved frame pointer, and return address, but on amd64 the compiler allocates the critical local write pointer to a register rather than the stack, so the canary check fires before any corrupted pointers are used. 

This makes the bug a reliable unprivileged local denial of service in its default configuration, but exploiting it reliably for privilege escalation requires another vulnerability that leaks the stack canary. We reported it to the FreeBSD security team on February 23rd, and Mark Johnston and the team had a patch out the following day as FreeBSD-SA-26:05.route. We will dig deeper into the exploitation considerations and what it takes to get past the canary in part two.

Conclusion

By leveraging an AI-assisted vulnerability research workflow we were able to significantly compress the amount of time it would have taken to review, audit, and write exploits for zero-day vulnerabilities in the FreeBSD kernel. 

This research was performed using Claude Opus 4.6 back in February before the public discussion surrounding Anthropic’s Mythos. It’s likely that the capabilities of frontier models at performing source code auditing for memory corruption issues will continue to improve over time. However, it’s not just the model being used that is important for success in AI-assisted vulnerability research. The other key factor is the ability to design custom harnesses, prompts, and skills, alongside denoising capabilities (e.g. KASAN) and targeting techniques that maximize the model’s effectiveness while minimizing token costs.

At Praetorian, we are constantly researching and working to identify new ways we can incorporate large language models into our workflows. If you’d like to discuss how Praetorian applies offensive research like this to harden the systems your business depends on, reach out for a conversation.

 

About the Authors

Adam Crosser

Adam is an operator on the red team at Praetorian. He is currently focused on conducting red team operations and capabilities development.
