@mavdol
Created January 2, 2026 11:22
Notes on sandboxing untrusted code - why Python can't be sandboxed, comparing Firecracker/gVisor/WASM approaches

Sandboxing Untrusted Python

Python doesn't have a built-in way to run untrusted code safely. Multiple attempts have been made, but none really succeeded.

Why? Because Python is a highly introspective object-oriented language with a mutable runtime. Core elements of the interpreter can be accessed through the object graph, frames and tracebacks, making runtime isolation difficult. This means that even aggressive restrictions can be bypassed:

# Attempt: Remove dangerous built-ins
del __builtins__.eval
del __builtins__.__import__

# 1. Bypass via introspection: climb from any object up to `object`,
#    then enumerate every class loaded in the interpreter
().__class__.__bases__[0].__subclasses__()

# 2. Bypass via exceptions and frames: recover the real builtins
#    from the globals of the raising frame
try:
    raise Exception
except Exception as e:
    e.__traceback__.tb_frame.f_globals['__builtins__']

Note

Older alternatives like sandbox-2 exist, but they provide isolation near the OS level, not the language level. At that point we might as well use Docker or VMs.

So people concluded it's safer to run Python in a sandbox rather than sandbox Python itself.

The thing is, Python dominates AI/ML, especially the AI agents space. We're moving from deterministic systems to probabilistic ones, where executing untrusted code is becoming common.

Why has sandboxing become important now?

2025 was marked by great progress, but it also showed us that isolation for AI agents goes beyond resource control or retry strategies: it has become a security issue.

LLMs have architectural flaws. The most notorious one is prompt injection, which exploits the fact that LLMs can't tell the difference between the system prompt, legitimate user instructions, and malicious instructions injected from external sources. For example, people have demonstrated how a hidden instruction on a web page can flow through a coding agent and exfiltrate sensitive data from your .env file.

It's a pretty common pattern; we've found similar flaws across many AI tools in recent months. Take the Model Context Protocol (MCP), for example: it shows how improper implementation extends the attack surface. The SQLite MCP server was forked thousands of times despite containing SQL injection vulnerabilities.

Developers using coding agents or MCP servers are at least likely to know about, or be informed of, the risks. But for non-technical users accessing AI through third-party services, it's a different story, as the incidents we keep seeing make clear: private data leaks, browser-based AI issues, and so on.

For me, the most important thing is to make sure that unaware users remain safe when using the AI agents we implement.


The solution isn’t better prompts. It’s isolation.

As we said, focusing on the prompt is missing the point. You can't filter or prompt-engineer your way out of injections or architectural flaws. The solution has to be at the infrastructure level, through isolation and least privilege.

But what does isolation look like in practice? If your agent needs to read a specific configuration file, it should only have access to that file, not your entire filesystem. If it needs to query a customer database, it should connect with read-only credentials scoped to specific tables, not root access.
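
To make that concrete, here's a minimal sketch in plain Python, assuming a hypothetical agent that needs one scratch directory and read-only access to a Postgres customer database (the host, database, and role names below are made up for illustration):

import os
import tempfile

import psycopg2  # assumption: the customer database is Postgres

# Filesystem: the agent only gets a dedicated scratch directory,
# never the caller's home directory or project tree.
SANDBOX_DIR = tempfile.mkdtemp(prefix="agent_sandbox_")

def read_agent_file(filename: str) -> str:
    """Resolve a path inside the sandbox and refuse anything that escapes it."""
    path = os.path.realpath(os.path.join(SANDBOX_DIR, filename))
    if not path.startswith(SANDBOX_DIR + os.sep):
        raise PermissionError(f"{filename} is outside the agent sandbox")
    with open(path) as f:
        return f.read()

# Credentials: connect as a role that only has SELECT on the tables the
# agent actually needs (the role and its grants are created out of band),
# and additionally mark the session itself read-only.
conn = psycopg2.connect(
    host="db.internal",                       # hypothetical host
    dbname="crm",                             # hypothetical database
    user="agent_readonly",                    # hypothetical least-privilege role
    password=os.environ["AGENT_DB_PASSWORD"],
)
conn.set_session(readonly=True)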

We can think of this in terms of levels of isolation:

flowchart TB
    direction TB
    Code["AI Agent"]

    subgraph Isolation["🛡️ Isolation Layers"]
        direction TB
        FS["<b>Filesystem Isolation</b> <br/><i>Only /tmp/agent_sandbox</i>"]
        NET["<b>Network Isolation</b> <br/><i>Allowlisted APIs only</i>"]
        CRED["<b>Credential Scoping</b> <br/><i>Least privilege tokens</i>"]
        RT["<b>Runtime Isolation</b> <br/><i>Sandboxed environment</i>"]
    end

    subgraph Protected["🔒 Protected Resources"]
        direction TB
        Home["/home, /etc, .env"]
        ExtAPI["External APIs"]
        DB["Databases & CRMs"]
        Infra["Core Infrastructure"]
    end

    Code --> FS
    Code --> NET
    Code --> CRED
    Code --> RT

    FS -.->|❌ Blocked| Home
    NET -.->|❌ Blocked| ExtAPI
    CRED -.->|❌ Blocked| DB
    RT -.->|❌ Blocked| Infra

    style Isolation fill:#2ecc71,stroke:#27ae60,color:#fff
    style Protected fill:#e74c3c,stroke:#c0392b,color:#fff
    style FS fill:#9b59b6,stroke:#8e44ad,color:#fff
    style NET fill:#9b59b6,stroke:#8e44ad,color:#fff
    style CRED fill:#9b59b6,stroke:#8e44ad,color:#fff
    style RT fill:#9b59b6,stroke:#8e44ad,color:#fff

In a perfect world, we would apply these isolation layers to all AI agents, whether they're part of large enterprise platforms or small frameworks.

I believe this is the only way to prevent systemic issues. However, I'm also aware that many agents do need more context or access to specific resources to function properly.

So, the real challenge is finding the right balance between security and functionality.


What the industry does

There are several ways to sandbox AI agents, most of them operating at the infrastructure level, outside the Python code itself. From what I see, two main paradigms stand out: one is to sandbox the entire agent, and the other is to sandbox each individual task separately.

For sandboxing the entire agent, we have many solutions:

  • Firecracker: A minimal virtual machine monitor that runs lightweight microVMs, giving untrusted code a sandboxed environment with its own guest kernel. AWS originally built it for Lambda, which makes it the closest option to "secure by default". It requires KVM, so it's Linux-only, but it's still a solid solution for agent-level isolation. The downside is that for granular task isolation it introduces more overhead, higher resource consumption, and added complexity (a minimal API sketch follows after this list).

  • Docker: Everyone uses it, but it's not the most secure option, since containers share the host kernel. Security teams recommend Firecracker or gVisor (which we'll cover below) for agent-level isolation instead. And like Firecracker, it's a bit heavy for granular, per-task isolation.
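
To make the Firecracker option concrete, here's a rough sketch of driving its REST API over its Unix socket from Python, using the third-party requests-unixsocket package. The kernel and rootfs paths are placeholders, and it assumes a firecracker process is already listening on the socket (e.g. firecracker --api-sock /tmp/firecracker.sock):

import requests_unixsocket  # assumption: pip install requests-unixsocket

# URL-encoded path to the socket the firecracker process listens on.
BASE = "http+unix://%2Ftmp%2Ffirecracker.sock"
session = requests_unixsocket.Session()

# Size the microVM: a small, single-purpose sandbox for one agent.
session.put(f"{BASE}/machine-config", json={"vcpu_count": 1, "mem_size_mib": 256})

# Guest kernel and root filesystem (placeholder paths).
session.put(f"{BASE}/boot-source", json={
    "kernel_image_path": "/opt/sandbox/vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1",
})
session.put(f"{BASE}/drives/rootfs", json={
    "drive_id": "rootfs",
    "path_on_host": "/opt/sandbox/python-rootfs.ext4",
    "is_root_device": True,
    "is_read_only": False,
})

# Boot the microVM; the untrusted Python runs inside the guest.
session.put(f"{BASE}/actions", json={"action_type": "InstanceStart"})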

On the other hand, for task-level isolation, we have only one popular option: gVisor.

gVisor sits between containers and VMs and provides strong isolation. Not as strong as Firecracker, but still a solid choice. In my opinion, if you're already using Kubernetes, gVisor is the natural fit, even though it's flexible enough to work with any container runtime.

The only downside is that it's also Linux-only, since it was designed to secure Linux containers by intercepting and reimplementing Linux system calls. On top of that, it adds non-trivial overhead, since system calls pass through its user-space kernel. That's something to keep in mind when you're sandboxing at the task level.
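
For task-level isolation, the switch to gVisor is mostly a runtime flag. Here's a sketch with the Docker Python SDK, assuming runsc is already installed and registered as a Docker runtime; the image and inline script are placeholders:

import docker  # assumption: pip install docker, runsc registered in /etc/docker/daemon.json

client = docker.from_env()

# Run one agent task in a gVisor-backed container: the task talks to
# runsc's reimplemented kernel surface instead of the host kernel.
output = client.containers.run(
    "python:3.12-slim",                              # placeholder image
    ["python", "-c", "print('task ran under gVisor')"],
    runtime="runsc",                                 # this is the gVisor part
    network_disabled=True,                           # network isolation
    mem_limit="512m",                                # resource cap
    read_only=True,                                  # no writes to the image filesystem
    remove=True,
)
print(output.decode())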

An emerging alternative: WebAssembly (WASM)

I remember reading an NVIDIA article about using WebAssembly to sandbox AI agents and run them directly in the browser. This is a very interesting approach, though WASM can be used for sandboxing in other contexts too.

I'll admit I might be a bit biased, since I've started a few projects around WASM recently. But I wouldn't mention it if I weren't convinced by its technical strengths: no elevated privileges by default, and no filesystem, network, or environment variable access unless explicitly granted, which is a great advantage. It can definitely work alongside, or even compete with, solutions like Firecracker or gVisor for low-overhead, task-level isolation.
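
Here's a small sketch of that capability model with the wasmtime Python bindings: the guest gets exactly the capabilities you hand it and nothing else. The task.wasm module and the pre-opened directory are placeholders, and the exact API may differ slightly between wasmtime versions:

from wasmtime import Engine, Linker, Module, Store, WasiConfig

engine = Engine()
linker = Linker(engine)
linker.define_wasi()

store = Store(engine)

# Capabilities are opt-in: the guest gets stdout and ONE pre-opened
# directory, and nothing else (no network, no env vars, no other paths).
wasi = WasiConfig()
wasi.inherit_stdout()
wasi.preopen_dir("/tmp/agent_sandbox", "/data")  # host path -> guest path
store.set_wasi(wasi)

# Placeholder: a WASI command module (e.g. a compiled task or interpreter).
module = Module.from_file(engine, "task.wasm")
instance = linker.instantiate(store, module)

# WASI command modules expose _start as their entry point.
instance.exports(store)["_start"](store)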

Obviously, the ecosystem is still young, so there are some limits. It supports pure Python well, but support for C extensions is still evolving and not fully there yet, which impacts ML libraries like NumPy, Pandas, and TensorFlow.

Even with these constraints, I believe this could be promising in the future. That's why I'm working on an open-source solution that uses it to isolate individual agent tasks. The goal is to keep it simple: you can sandbox a task just by adding a decorator:

from capsule import task

@task(name="analyze_data", compute="MEDIUM", ram="512MB", timeout="30s", max_retries=1)
def analyze_data(dataset: list) -> dict:
    # Your code runs safely in a Wasm sandbox
    return {"processed": len(dataset), "status": "complete"}

If you're curious, the project is on my GitHub.

Where do we go from here?

Firecracker and gVisor have shown us that strong isolation is possible. And now we're seeing newer players like WebAssembly come in, which can help us isolate things at a much more granular, per-task level.

So if you're designing agent systems now, I would recommend planning for failure from the start. Assume that an agent will, at some point, execute untrusted code, process malicious instructions, or simply consume excessive resources. Your architecture must be ready to contain all of these scenarios.
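
Even before you adopt a full sandbox, you can at least contain the "consume excessive resources" scenario with plain POSIX limits. A rough sketch, assuming Linux and a hypothetical untrusted_task.py:

import resource
import subprocess

def limit_resources():
    # Runs in the child just before exec: cap CPU seconds and address space.
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))               # 5 s of CPU time
    resource.setrlimit(resource.RLIMIT_AS, (512 * 1024**2,) * 2)  # 512 MB of memory

result = subprocess.run(
    ["python", "untrusted_task.py"],   # hypothetical task script
    preexec_fn=limit_resources,        # POSIX-only
    timeout=30,                        # wall-clock cap on top of the CPU cap
    capture_output=True,
)
print(result.returncode, result.stdout[:200])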

Thank you for reading. I'm always interested in discussing these approaches, so feel free to reach out with any thoughts or feedback.

@SerJaimeLannister

Very interesting. Wasm is usually a good solution.

There are also libriscv and, more recently, libloong and tinykvm (GPL / their custom dual license), written by fwsgonozo, which are really nice.

Also, as you have mentioned, Docker isn't the most secure solution. If you are interested in the most secure option with Docker, gVisor is one choice, as you mention, though I have heard it has some issues with complete kernel compatibility or something. Still, I agree it's a good decision.

I'd also like to point out Incus containers, https://linuxcontainers.org/incus/; they recently created their own OS as well, which can streamline things.

Another aspect of Incus is that you can run Docker containers as VMs, more or less, to get that isolation easily.

Also, regarding Firecracker: I think it's very bare-bones and is usually used in multi-cloud solutions. Developing and managing it is tough.

It's usually used when you need a very high number of sandboxes in a short amount of time. Wasm is another good pick, but if you're picking something more persistent, other solutions are good as well.

@mavdol
Author

mavdol commented Jan 5, 2026

@SerJaimeLannister Thanks for the insights! I wasn't familiar with Incus; I'll definitely take a look.

Totally agree on Firecracker too, it's often complex to manage for this kind of use case. E2B helps leverage it, though.

@corv89

corv89 commented Jan 6, 2026

Your sandbox-2 mention is interesting. I've been building on it with a different philosophy: rather than trying to contain/block agents, capture what they want to do and show it to the user before executing. The syscall interception becomes a visibility layer, not just an isolation layer. https://github.com/corv89/shannot

@awesomebytes

@corv89 your link is broken for me; it should point to: https://github.com/corv89/shannot

@alexellis

We have been working with Firecracker since ~2021, first in actuated (CI/CD for GitHub, GitLab, and Jenkins in microVMs), then with SlicerVM.com as either a YAML/config- or REST API/SDK-driven approach. Boot times are fairly quick with a full systemd setup and Ubuntu LTS, and can be made even quicker for very specialised tasks like this, such as running Python.

I won't deep link as I don't want to hijack the thread.

Feel free to check it out if you're interested in Firecracker but want to make it actually usable / production-ready pretty much out of the box. The SlicerVM.com website also has getting-started videos on Firecracker/microVMs if you want to do your own thing.
