Skip to content

Benchmarking: omnipkg vs Docker vs Conda

Data collected Feb 2026 on production hardware (NVIDIA GPU).

The “Impossible” Benchmark

Traditional wisdom says you cannot run multiple versions of C-extension libraries (like TensorFlow or PyTorch) in the same process due to OS linker conflicts.

omnipkg solves this via the Worker Daemon.

1. Execution Latency (Hot Path)

How long does it take to execute code in an isolated environment once initialized?

Solution Mechanism Latency Speedup Factor
omnipkg Daemon Zero-Copy SHM ~2.3ms 1.0x (Baseline)
Docker HTTP/Socket API ~50ms 21x Slower
Conda Run Process Spawn ~400ms 173x Slower
Venv (Subprocess) Python Startup ~80ms 34x Slower

2. Cold Startup

How long to spin up a new isolation context from scratch?

Solution Time Notes
omnipkg ~300ms Fork-server architecture
Docker ~2000ms+ Container initialization overhead
Conda ~1500ms Solver/Linker overhead

3. Memory Overhead (Per Worker)

Running 8 concurrent workers.

Solution RAM per Worker Total System Load
omnipkg ~330 MB Shared libs via Copy-on-Write
Docker ~600 MB+ Duplicated kernel namespaces
Venv ~400 MB Minimal sharing

The “Triple Python” Multiverse

omnipkg is the only solution that allows Zero-Copy Data Transfer between different Python versions.

Scenario: Pass a 1GB Tensor from Python 3.9 (Torch 1.13) $\to$ Python 3.11 (Torch 2.2).

  • Docker: Requires serializing 1GB to disk/network, context switching, and deserializing. Time: >100ms
  • omnipkg: Passes a CUDA pointer via ctypes. Time: <5µs

Verdict: omnipkg provides 1.9x better memory efficiency and 160x faster startup than Docker for Python-specific workloads.