DEV DROPBOX VERDICT · BUILD-NARROWER research report · 2026-06-27

Developer Dropbox: Feasibility, Architecture, and MVP Brief

A build-decision document. Research conducted across eight technical areas; this report integrates all findings. Sources are cited inline as [Area-Source] references; full URLs are listed in the consolidated Sources section at the end.


1. Executive Summary

Verdict: BUILD-NARROWER.

The idea is technically feasible and the white space is real. No shipping product combines (a) instant copy-on-write agent workspaces, (b) server-enforced per-path read permissions, and (c) transparent cross-machine workspace sync. Incumbents cannot retrofit the combination because Git's data model structurally prevents per-blob ACL enforcement at the pack-protocol layer, which rules out (b). The four pain points raised in the operator brief are all real, current, and worth paying to solve.

The single thing to build first: an instant, isolated, copy-on-write workspace fork per agent, backed by a content-addressed store (CAS), with zero branch-lock contention and no per-agent node_modules install. This is the agent-parallelism wedge. It directly addresses the most acute and measurable pain (AI-heavy teams burning minutes and gigabytes per agent worktree), it requires no VCS migration, and it is the primitive that every other capability (cross-machine sync, per-path ACL) builds on.

What to build second: server-enforced per-path read ACL. This is the structural moat. No Git-compatible hosting platform can add it without abandoning the pack protocol. The CVE embargo use case (ACL-gated branch with timed reveal and access audit log) is the sharpest, most fundable wedge within this area.

What to build third: transparent cross-machine sync via the same CAS. Once the CAS exists and agents are using it, syncing a second machine is a tree-pointer update away.

macOS: the hardest technical problem in the whole stack

This must be called out early, because macOS is the operator's primary development platform.

The kext-based FUSE (macFUSE) is effectively blocked for products: Apple Silicon requires Reduced Security mode and a Recovery-mode reboot to enable kexts, which is disqualifying for broad adoption [B-S12].

Apple's officially blessed API for on-demand file materialization, NSFileProviderReplicatedExtension, forces all files into ~/Library/CloudStorage/[AppBundleID]-[Domain]/. There is no API to redirect this to ~/code, ~/devdropbox, or any developer-chosen path. This is a structural constraint, not a missing feature. It makes File Provider correct for a consumer cloud sync product but incompatible with the "files live where I want them" developer-tool UX [B-S11].

The practical path for the MVP on macOS is FUSE-T, a kext-free FUSE implementation that uses an NFSv4 loopback server, installs via brew install fuse-t, requires no security policy changes, and mounts at arbitrary paths [B-S13]. The documented risk: under heavy concurrent I/O (many agents all reading files through the same mount simultaneously), the macOS NFS client can show "Server connections interrupted" dialogs and force-unmount the volume, exactly as Meta's EdenFS team experienced when they were forced off macFUSE [B-S16].

The macOS verdict: BUILD-NARROWER, Linux-first. Build the MVP on Linux first (clean FUSE, no constraints, kernel OverlayFS for agent isolation). Bring macOS in with FUSE-T for the developer-single-machine case. The high-concurrency multi-agent use case on macOS may require waiting for FUSE-T's FSKit backend (available on macOS 26 Tahoe, which skips the NFS layer entirely) or accepting Linux as the production platform for agent farms [H-S3].

FSKit (macOS 15.4+) does not solve this. It targets block-device filesystems (exFAT, MSDOS replacement) and has no virtual/network filesystem model, no process attribution, and no committed Apple support for the use case [B-S17, B-S18].


Figure: Recommended build order and platform readiness.

  1. Instant copy-on-write agent workspaces (the wedge). Fork an isolated view in <5ms, no branch-lock, no per-agent node_modules reinstall.
  2. Server-enforced per-path read ACL (the moat). GitHub/GitLab cannot retrofit this without breaking the Git pack protocol.
  3. Cross-machine sync over the same CAS. Once the store exists, syncing a second machine is a root-pointer update away.

Platform readiness: Linux (FUSE): GO. Windows (ProjFS): GO. macOS (FUSE-T): VIABLE, CAVEAT.

macOS is the hardest target: macFUSE is blocked on Apple Silicon and Apple's File Provider forces files into ~/Library/CloudStorage, so the MVP path is FUSE-T with a concurrency caveat. Build Linux-first.

2. Validated Pain Points

P1: Git permissions are repo-level, not file/change-level (REAL)

Git's pack protocol negotiates at the object-hash level with no path context. A blob's path only exists in tree objects; the server receiving a want <hash> request cannot easily infer which file path that blob is associated with without an inverted index. More critically, a tree object's hash is derived from all its children's hashes: if the server omits a blob, it cannot return a tree whose hash the client expected, breaking tree-hash integrity. There is no filtering stage between negotiation and packfile delivery [E-S10].
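
The tree-hash argument can be illustrated with a minimal sketch. It assumes a simplified Git-style tree (SHA-1 over sorted name/hash entries) rather than Git's exact object encoding, and the function names and file contents are illustrative only.

```python
import hashlib

def blob_hash(content: bytes) -> str:
    # Simplified: hash of the raw content (real Git prepends a "blob <len>\0" header).
    return hashlib.sha1(content).hexdigest()

def tree_hash(entries: dict[str, str]) -> str:
    # A tree's hash covers every child's name and hash, in sorted order.
    h = hashlib.sha1()
    for name in sorted(entries):
        h.update(f"{name}\0{entries[name]}\n".encode())
    return h.hexdigest()

full = {
    "README.md": blob_hash(b"public docs"),
    "secrets.env": blob_hash(b"API_KEY=example"),
}
# The server "withholds" secrets.env by dropping its entry from the tree.
filtered = {name: h for name, h in full.items() if name != "secrets.env"}

expected = tree_hash(full)      # the hash the commit points at
served = tree_hash(filtered)    # the hash of what the server could actually send
tree_matches = expected == served
print(tree_matches)  # False
```

Because the client verifies the tree against the hash recorded in the commit, the filtered tree is rejected; hiding the blob would require rewriting the tree and every commit above it.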

GitHub and GitLab support only repo-level or branch-level access control. CODEOWNERS on both platforms gates required reviewers, not file readability [E-S4, E-S5]. Gerrit, the most sophisticated Git ACL system, operates at ref (branch/tag) level, never per-file or per-directory [E-S1]. Perforce Helix Core supports per-file ACL but is not Git-compatible.

Client-side encryption (git-crypt, SOPS, age) does not provide read privacy. The encrypted ciphertext is present in every clone; file names, directory structure, and file sizes are always visible; revocation is difficult [E-S2, E-S3, E-S9]. These tools are secrets management, not access control.

Evidence: The claim is accurate and is an architectural consequence, not a policy gap. Per-path read privacy requires a non-Git transport: a custom blob-fetch API where the server withholds specific blobs by hash after checking a path ACL. Google Piper proves this is operationally feasible at scale (less than 1% of files carry ACLs; CitC FUSE overlay materializes files on demand and returns nothing for unauthorized paths) [E-S8].

P2: git worktrees are bad for parallel agents (REAL)

The git-worktree documentation states: "By default, add refuses to create a new worktree when <commit-ish> is a branch name and is already checked out by another worktree." The resulting error is fatal: 'feature/foo' is already checked out at '/path/to/other-worktree' [F-S1]. This is a ref-ownership check, not a filesystem lock, making it architectural, not a bug. The --force flag has limited bypass capability and git's own docs warn against using it to double-checkout a live branch [F-S1].

The node_modules cost is the bigger practical pain. Each worktree starts without dependencies (node_modules is gitignored). Trigger.dev's engineering team measured 9.82 GB of disk usage across two worktrees of a TypeScript monorepo, roughly 5x the raw repository size, due to duplicated node_modules and build caches [F-S2]. On macOS APFS, each install takes 31-140 seconds (see P4), so two agents cost two installs, roughly one to five minutes of install time, and about 10 GB.

Stale base drift (agents branching from an old commit as main advances) requires explicit rebase logic in the orchestration layer; git worktrees provide no automatic base-tracking.

Evidence: P2 is confirmed as accurate. The branch-lock is architectural; the node_modules cost is measured; the stale base problem is documented [F-S12].

P3: Keeping several machines in sync is manual friction (PARTLY REAL)

The operator's framing is correct: syncing multiple Macs and an Ubuntu box (folder layout, env vars, code) requires manual steps. However, this is a friction problem, not a fundamental impossibility. Git itself handles code sync; the gaps are (a) no lazy materialization (you get full clones), (b) no env var / secrets sync built in, and (c) the APFS penalty on each machine means "sync" involves a slow install per machine.

Content addressing solves (a) structurally: once a file is in the CAS and identified by its BLAKE3 hash, any machine that has seen any version of that file can serve it from its local cache. "Sync" becomes delivering a new root tree hash; file bytes only travel if the machine has never seen that content before.

For env vars and secrets: runtime injection tools (1Password CLI op:// references, Doppler doppler run, Infisical) use cloud-side source of truth and need no workspace CAS at all [C-S29, C-S30, C-S31].

Evidence: The pain is real and the CAS model directly addresses the structural cause. This pain is less acute than P1 and P2 and should not be the primary product pitch.

P4: macOS APFS is very slow at creating many small files (REAL, numbers confirmed)

The benchmark numbers from the operator brief are confirmed against a community-curated multi-contributor dataset [D-S1]:

Hardware | OS / FS | pnpm install
--- | --- | ---
Apple M4 (base, 8 core) | macOS 15.2, APFS (Encrypted) | 31.4s
Apple M1 Ultra | macOS 15.3, APFS | 137.5s
Apple M4 Pro (14 core) | macOS 15.3.2, APFS (Encrypted) | 145.4s
Apple M4 Pro (same hardware) | Ubuntu 24.04, btrfs (OrbStack) | 17.3s
AMD Ryzen 5 7640U | Ubuntu 24.04, Ext4 | 5.9s

The operator's claimed figures (7s Ubuntu, 31s M4, 140s M1 Ultra) are consistent with the dataset rows (5.9s, 31.4s, and 137.5s respectively). The claim that the M1 Ultra is 4-5x slower than the M4 base despite more cores is confirmed (137.5s / 31.4s is about 4.4x) and is the diagnostic fingerprint of global mutex serialization: more cores pile up behind APFS's per-volume B-tree lock and add queuing delay [D-S2].

The operator's root-cause hypothesis ("fsync amplification") is partly correct. F_FULLFSYNC on macOS causes an extreme performance penalty (46 IOPS vs 40,000 IOPS without it on M1 MacBook Air [D-S5, D-S6]). However, pnpm's primary bottleneck is APFS B-tree metadata lock contention during concurrent hard-link creation, not F_FULLFSYNC directly (Node.js's fs.writeFile does not call F_FULLFSYNC by default). Both mechanisms are real; F_FULLFSYNC is more relevant for git and database tools.

Does a lazy virtual filesystem sidestep the penalty? Yes, proportionally to how sparse the agent's file access is. A content-addressed lazy FS that materializes files only when an agent opens them avoids writing anything to APFS for files never accessed. For a typical AI coding agent touching 50-200 source files out of 50,000 node_modules files, it pays 0.1-0.4% of the full install cost. The go/no-go threshold: if an agent accesses fewer than 20% of the file tree, lazy hydration wins. If a common agent task forces >50% materialization, the FUSE-T IPC overhead may negate the benefit [D-section 4].
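
The arithmetic behind the 0.1-0.4% figure and the go/no-go thresholds can be written out as a short sketch. It assumes a uniform per-file materialization cost, which is a simplification; the function names are illustrative, and the 20%/50% cut-offs are taken from the paragraph above.

```python
def materialized_fraction(files_opened: int, files_total: int) -> float:
    """Fraction of the tree a lazy FS writes to disk (assumes uniform per-file cost)."""
    return files_opened / files_total

def lazy_verdict(fraction: float) -> str:
    # Thresholds from the text: below 20% wins, above 50% may lose.
    # The band in between is not characterized and needs measurement.
    if fraction < 0.20:
        return "lazy hydration wins"
    if fraction > 0.50:
        return "overhead may negate the benefit"
    return "unclear, measure"

low = materialized_fraction(50, 50_000)
high = materialized_fraction(200, 50_000)
print(f"{low:.1%} to {high:.1%}")  # 0.1% to 0.4%
print(lazy_verdict(high))          # lazy hydration wins
```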

Evidence: P4 is confirmed. The numbers are accurate and from a reputable crowd-sourced dataset. The lazy-FS hypothesis for sidestepping the penalty is well-founded but must be validated empirically against real agent access patterns.


Figure: pnpm install time for the same cached install across hardware and filesystem. Source: community-curated dataset [D-S1]. The M1 Ultra being slower than the base M4 despite more cores is the fingerprint of APFS B-tree lock contention.

  • ~0.1-0.4% of full install cost is paid by a lazy FS when an agent touches 50-200 of ~50,000 files.
  • >20% file-tree access is the go/no-go threshold where lazy hydration stops winning.

3. Prior-Art Teardown

System | How it works | What to copy | What to avoid | License | macOS support
--- | --- | --- | --- | --- | ---
Meta EdenFS + Mononoke | Virtual FS (FUSE/NFSv3/ProjFS); non-materialized inodes reference CAS hash; materializes on read; Mononoke CAS server uses Blake2b blob store with pluggable backends (S3/MySQL/SQLite) | Inode model (non-materialized = hash pointer, materializes on read); blobstore abstraction; RedactedBlobstore for per-blob ACL | GPL-2.0 blocks proprietary embedding; NFSv3 workaround on macOS is not a foundation; Mononoke not supported for external use | GPL-2.0 | Functional but degraded (NFSv3, "Server connections interrupted" under load; targeting FSKit)
Microsoft VFS-for-Git / Scalar | ProjFS placeholders (metadata on disk, no content); hydrate on read via PRJ_GET_FILE_DATA_CB callback; Scalar pivoted to sparse-checkout + partial-clone on macOS when Apple killed kernel APIs | ProjFS placeholder lifecycle as state machine reference; Git partial-clone as a no-virtual-FS lazy fetch fallback | macOS port was abandoned; building on single-OS kernel API repeats the GVFS mistake | MIT | Not supported; Scalar (the successor) runs on macOS but uses sparse-checkout, not virtual FS
Google Piper + CitC | FUSE overlay on Linux; files materialize from Piper (Spanner-backed) on access; average workspace holds fewer than 10 files despite 86 TB repo; per-file ACL enforced server-side | "Average workspace = fewer than 10 files" as proof lazy hydration is practical at scale; FUSE overlay as CitC architecture | Linux-only, closed, coupled to Google infra; cannot copy code | Closed (internal) | Linux-only
Jujutsu (JJ) | Changes (mutable, stable change-ID) vs commits (immutable, content-hash); working copy is a real commit, auto-amended on every command; operation log for safe concurrent writes; pluggable storage backend (currently uses .git via gitoxide); Git-colocated workspaces | Git-compat adoption strategy (users can stay on git); change-vs-commit model for tracking agent work across rebases; operation log as concurrent-write coordination; pluggable backend trait as the seam for a cloud CAS | No built-in lazy hydration or virtual FS (working copy is fully materialized); CAS backend must be built | Apache 2.0 | Fully supported (pure userspace)
Pijul / Darcs | Patch-commutation theory (patches commute when order is reversible); sound merge semantics; Darcs had O(2^h) worst-case complexity (partially fixed); Pijul is pre-1.0 as of 2026 | Conceptual insight: a private change is a patch some observers cannot see (useful framing for permission model) | Asking anyone to replace Git; ecosystem lock-in beat better theory for Darcs and Mercurial; Pijul still segfaults on its own manual as of 2021 | GPL-2.0 | Works but niche

Closest to vision: EdenFS (architecture) + JJ (data model and adoption strategy). EdenFS proves inode-level on-demand materialization and per-blob ACL at Meta scale; JJ provides an Apache-2.0, Git-compatible, pluggable-backend foundation to build on. The combination: use JJ's change model and Git interop as the VCS layer; implement a cloud CAS backend (copying Mononoke's blobstore design without the GPL code); build the macOS FS layer with FUSE-T and Linux FS layer with native FUSE.


Figure: One portable core, three thin per-OS filesystem adapters (the EdenFS-proven shape).

  • Portable core (Rust): content-addressed store (BLAKE3), sync, per-path permission enforcement, agent-overlay daemon.
  • Linux adapter: FUSE + OverlayFS (GO).
  • Windows adapter: ProjFS placeholders (GO).
  • macOS adapter: FUSE-T, NFSv4 loopback (CAVEAT).

~90% of the system is OS-agnostic and written once; only the thin virtual-filesystem adapter differs per platform. Meta's EdenFS ships exactly this shape (FUSE / NFS / ProjFS behind one daemon).

4. Technical Feasibility by Area

4A. Prior Art (summary in table above)

The prior art validates that lazy-hydration virtual workspaces are technically achievable at scale and that Git-compatible adoption is the only realistic path. The critical gap is macOS: EdenFS uses a degraded NFSv3 workaround; VFS-for-Git was abandoned on macOS; CitC is Linux-only. See Area B for the macOS FS verdict.

4B. Lazy-Hydration Virtual Filesystem

Linux: Standard FUSE via /dev/fuse. The pattern: (1) serve readdir from an in-memory metadata index at sub-millisecond cost; (2) on read, fetch just that blob from the CAS and return it. EdenFS does this in production at Meta for monorepos with millions of files. Typical FUSE overhead: 5-15% throughput degradation; up to 3x in pathological small-file-metadata-heavy workloads [B-S4]. For a lazy-hydration product, the dominant latency is the CAS fetch, not FUSE overhead. Linux FUSE: go.
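
The two-step pattern (metadata from a local index, content fetched on first read) can be modeled without any FUSE binding. The sketch below is a toy, not a filesystem: LazyWorkspace and put are hypothetical names, the remote CAS is a plain dict, and hashlib.blake2b stands in for BLAKE3 because BLAKE3 is not in the Python standard library.

```python
import hashlib

class LazyWorkspace:
    """Toy model of lazy hydration: metadata is local, content is fetched on first read."""

    def __init__(self, index: dict[str, str], remote: dict[str, bytes]):
        self.index = index                  # path -> content hash (in-memory metadata index)
        self.remote = remote                # stand-in for the remote CAS (hash -> bytes)
        self.cache: dict[str, bytes] = {}   # blobs materialized locally so far
        self.fetches = 0

    def readdir(self, prefix: str) -> list[str]:
        # Served purely from metadata; no blob is fetched.
        return sorted(p for p in self.index if p.startswith(prefix))

    def read(self, path: str) -> bytes:
        digest = self.index[path]
        if digest not in self.cache:
            self.cache[digest] = self.remote[digest]   # fetch just this blob
            self.fetches += 1
        return self.cache[digest]

def put(remote: dict[str, bytes], content: bytes) -> str:
    # Assumption: blake2b as a stdlib stand-in for the BLAKE3 hash the report recommends.
    digest = hashlib.blake2b(content).hexdigest()
    remote[digest] = content
    return digest

remote: dict[str, bytes] = {}
index = {
    "src/app.ts": put(remote, b"console.log('hi')"),
    "node_modules/a/index.js": put(remote, b"module.exports = 1"),
    "node_modules/b/index.js": put(remote, b"module.exports = 2"),
}
ws = LazyWorkspace(index, remote)
listing = ws.readdir("node_modules/")   # two entries, zero fetches
data = ws.read("src/app.ts")            # first read fetches one blob
ws.read("src/app.ts")                   # second read is a cache hit
print(len(listing), ws.fetches)         # 2 1
```

In a real adapter the readdir and read methods would sit behind the FUSE callbacks; the point of the sketch is only that listing a directory never touches blob storage.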

Windows: ProjFS (minifilter driver, no kext, no admin privilege for providers). Five file states: Virtual, Placeholder, Hydrated, Full, Tombstone. Provider callbacks: PRJ_GET_PLACEHOLDER_INFO_CB (serve metadata to create a placeholder), PRJ_GET_FILE_DATA_CB (serve content to hydrate). Proven at scale by VFS-for-Git. Windows ProjFS: go.

macOS (decisive, with caveats): Three paths evaluated:

macFUSE (kext): BLOCKED for products. Apple Silicon requires Reduced Security boot mode to load kexts. Disqualifying for broad adoption [B-S12].

Apple File Provider (NSFileProviderReplicatedExtension): CORRECT API, WRONG PATH. Kext-free, Apple-blessed, used by Dropbox and iCloud. BUT: files must live at ~/Library/CloudStorage/[AppBundleID]-[DomainName]/. No API exists to redirect to ~/code or any developer-chosen path. This is structural to how macOS scopes the File Provider sandbox [B-S11]. Extension cold-start adds approximately 3 seconds before fetchContents is even called [B-S19]. Correct for a consumer cloud sync product. Not viable for the developer-tool use case.

FUSE-T (NFSv4 loopback, kext-free): VIABLE FOR MVP, RELIABILITY CAVEAT. Installs via brew install fuse-t. Mounts at arbitrary paths. Drop-in macFUSE API. No kext, no SIP, no reboot. Documented reliability risk: the FORGET callback (which tells the daemon to release inode state) is never sent over NFS, causing unbounded memory growth under sustained use. Under heavy concurrent I/O, macOS displays "Server connections interrupted" dialogs that force-unmount the volume, exactly as Meta's EdenFS team experienced [B-S16]. Under moderate single-developer load, FUSE-T works. Under high multi-agent concurrency, the reliability ceiling is unverified and likely lower than needed. FUSE-T 1.x uses NFSv4 loopback; on macOS 26 Tahoe, FUSE-T adds an FSKit backend that bypasses NFS entirely [H-S3]. FUSE-T: viable for MVP at moderate concurrency; unproven at high concurrency; FSKit on macOS 26 may resolve this.

FSKit (macOS 15.4+): NOT APPLICABLE. Targets block-device filesystems (exFAT, MSDOS). No virtual/network filesystem model. No process attribution (no PID in requests). Apple has made no commitment to supporting virtual/network use cases [B-S17, B-S18].

macOS verdict: Build with FUSE-T for the MVP. Run the de-risk experiment in week one (10 parallel cat processes against a FUSE-T mount for 60 seconds; watch for NFS disconnect dialogs). If the failure threshold is below N=10, the macOS path for multi-agent use requires either (a) macOS 26+ for FSKit backend or (b) Linux as the agent-farm platform with macOS only for single-developer use.
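
A rough harness for the week-one experiment might look like the following. It is a sketch under assumptions: threads stand in for the parallel cat processes, the hammer function and its parameters are hypothetical, and the script can only observe failures that surface as OSError (it cannot detect the macOS dialog itself, which still has to be watched for by hand).

```python
import os
import threading
import time

def hammer(mount: str, readers: int = 10, seconds: float = 60.0) -> tuple[int, int]:
    """Read every file under `mount` in a loop from `readers` threads.

    Returns (successful_reads, os_errors).
    """
    paths = [os.path.join(d, f) for d, _, files in os.walk(mount) for f in files]
    if not paths:
        return 0, 0
    deadline = time.monotonic() + seconds
    lock = threading.Lock()
    totals = [0, 0]  # [reads, errors], updated once per worker

    def worker() -> None:
        reads = errors = 0
        while time.monotonic() < deadline:
            for p in paths:
                if time.monotonic() >= deadline:
                    break
                try:
                    with open(p, "rb") as fh:
                        fh.read()
                    reads += 1
                except OSError:
                    errors += 1   # e.g. the mount disappeared mid-run
        with lock:
            totals[0] += reads
            totals[1] += errors

    threads = [threading.Thread(target=worker) for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals[0], totals[1]
```

Usage would be something like reads, errors = hammer("/path/to/fuse-t-mount"), with the path being wherever the test volume is mounted. Re-running with increasing readers values gives a first estimate of the failure threshold N.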

4C. Content-Addressed Store + Sync

Every major CAS system converges on key = hash(content). This structural deduplication means identical file content costs zero incremental storage regardless of which agent, branch, or machine writes it.

Recommended MVP CAS:

  • Hash function: BLAKE3 (4-8 GB/s software throughput vs SHA-256's 0.5-1 GB/s without hardware acceleration [H-S7])
  • Store: S3 with blob keys at s3://<bucket>/blobs/<prefix2>/<blake3-hex>. Same-region S3 GET latency: 50-200ms cold; 5-10ms for S3 Express One Zone [C-S33, C-S34]
  • Local LRU cache: ~/.devdropbox/cache/ at 20 GB default. Local cache hits: sub-5ms
  • Tree objects: Git-style Merkle trees (mode, name, hash per entry), stored as blobs under their own BLAKE3 hash
  • Root pointer: s3://<bucket>/workspaces/<id>/HEAD updated with a conditional PUT to prevent clobbering
  • Large-file chunking: FastCDC for files above 512 KB (3-10x faster than Rabin-based CDC at same dedup ratio [C-S15, C-S17]); whole-file blobs for smaller files (no chunking overhead)
  • Secrets: SOPS + AWS KMS for files under a secrets/ prefix. CAS stores only ciphertext; KMS IAM controls decryption
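
The key layout and structural deduplication in the list above can be sketched as follows. Assumptions are labeled in the code: hashlib.blake2b stands in for BLAKE3, <prefix2> is read as the first two hex characters of the digest, and BlobStore is an in-memory stand-in for the S3 bucket rather than a real client.

```python
import hashlib

def blob_key(content: bytes) -> str:
    # Assumption: blake2b (stdlib) in place of BLAKE3, and <prefix2> taken to
    # mean the first two hex characters of the digest.
    digest = hashlib.blake2b(content, digest_size=32).hexdigest()
    return f"blobs/{digest[:2]}/{digest}"

class BlobStore:
    """In-memory stand-in for the bucket: key -> bytes."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.bytes_written = 0

    def put(self, content: bytes) -> str:
        key = blob_key(content)
        if key not in self.objects:      # identical content is stored once
            self.objects[key] = content
            self.bytes_written += len(content)
        return key

    def get(self, key: str) -> bytes:
        return self.objects[key]

store = BlobStore()
k1 = store.put(b"export const x = 1;\n")
k2 = store.put(b"export const x = 1;\n")   # same content written by another agent
print(k1 == k2, len(store.objects))         # True 1
```

Chunking for large files and the SOPS-encrypted secrets prefix are left out to keep the sketch short.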

Git partial clone is NOT a substitute for per-file ACL. Any user with repo read access can request any object by SHA at any time using git cat-file or git fetch <hash>. Sparse-checkout is a client-side filter, not a security boundary [C-section 3.3].

Cross-machine sync is a tree-pointer update: the two machines share the same CAS; "sync" means delivering the new root hash. Files the second machine never opens are never fetched. Content is immutable once stored, so there are no write conflicts on the CAS itself.
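
A minimal sketch of sync as a root-pointer update, under stated assumptions: RootPointer models the conditional PUT as an in-memory compare-and-swap, the hash strings are placeholders, and missing_blobs computes an upper bound on transfer (with lazy hydration, only the subset that is actually opened would be fetched).

```python
class RootPointer:
    """Stand-in for workspaces/<id>/HEAD with a conditional (compare-and-swap) update."""

    def __init__(self, root: str) -> None:
        self.root = root

    def update(self, expected: str, new: str) -> bool:
        # Mirrors a conditional PUT: succeed only if HEAD has not moved since it was read.
        if self.root != expected:
            return False
        self.root = new
        return True

def missing_blobs(tree: dict[str, str], local_cache: set[str]) -> set[str]:
    # Only content this machine has never seen would need to travel.
    return set(tree.values()) - local_cache

head = RootPointer("root-v1")
seen = head.root
ok_a = head.update(expected=seen, new="root-v2")    # machine A publishes a new root
ok_b = head.update(expected=seen, new="root-v2b")   # a stale writer is rejected

new_tree = {"src/a.ts": "h1", "src/b.ts": "h2", "README.md": "h3"}
to_fetch = missing_blobs(new_tree, local_cache={"h1", "h3"})
print(ok_a, ok_b, to_fetch)   # True False {'h2'}
```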

4D. The APFS Small-Files Problem

Root cause is APFS B-tree metadata lock contention under concurrent hard-link creation (pnpm's primary path), compounded by SIP/XProtect security-extension scanning (2.5x improvement when disabled) and optionally F_FULLFSYNC for durability-requiring tools. The benchmark numbers are confirmed [D-S1].

The best native-macOS fix without security tradeoffs: @pnpm/reflink native addon, which calls clonefile(2) per file directly, bypassing the hard-link serialization. APFS CoW clone per file is near-instant (one metadata write vs serial B-tree update per link) [D-S4].

A lazy-hydrating virtual FS sidesteps the penalty proportionally to access sparseness. An agent accessing 1% of node_modules files pays approximately 1% of the full install cost, and individual sequential file materializations avoid the burst-parallel B-tree contention entirely. For typical AI coding agent access patterns (50-200 source files, few config files), the effective APFS cost is negligible. The boundary: agents accessing more than 50% of files get diminishing returns vs just using a RAM disk [D-section 4].

4E. Per-Path / Per-Change Permissions

Why the Git pack protocol cannot enforce per-path ACL: Tree-hash integrity prevents blob withholding without recomputing the tree (and thus the commit). No filtering step exists between negotiation and packfile delivery. Per-path read privacy requires a non-Git transport [E-section 6].

The server-enforced CAS model: The server holds a content-addressed blob store and a path ACL table (path_prefix, principal, permission, expires_at). Client fetches blobs via GET /blobs/{hash} (HTTP API, not git-upload-pack). The server checks blob_hash -> [path] (inverted index) against the ACL before responding. A client without READ permission gets 403 for that blob. Tree objects can be returned with restricted entries redacted.
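
The check described above can be sketched in a few lines. Several choices here are assumptions, not part of the source design: paths not covered by any rule are treated as open, a blob is readable if the principal may read at least one path that references it, and expires_at is omitted to keep the example short. AclRule, can_read, and fetch_blob are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AclRule:
    path_prefix: str
    principal: str
    permission: str   # only "READ" is modeled in this sketch

def can_read(principal: str, path: str, rules: list[AclRule]) -> bool:
    # Assumption: a path is restricted only if some rule's prefix covers it.
    covering = [r for r in rules if path.startswith(r.path_prefix)]
    if not covering:
        return True
    return any(r.principal == principal and r.permission == "READ" for r in covering)

def fetch_blob(principal: str, blob_hash: str,
               blob_paths: dict[str, list[str]],
               rules: list[AclRule],
               blobs: dict[str, bytes]) -> tuple[int, bytes]:
    """Return (status, body) roughly as GET /blobs/{hash} would."""
    paths = blob_paths.get(blob_hash)
    if paths is None:
        return 404, b""
    # Assumption: access is granted if any referencing path is readable.
    if any(can_read(principal, p, rules) for p in paths):
        return 200, blobs[blob_hash]
    return 403, b""

rules = [AclRule("secrets/", "alice", "READ")]
blob_paths = {"h1": ["src/app.ts"], "h2": ["secrets/prod.env"]}   # inverted index
blobs = {"h1": b"code", "h2": b"TOKEN=..."}

print(fetch_blob("bob", "h1", blob_paths, rules, blobs)[0])     # 200
print(fetch_blob("bob", "h2", blob_paths, rules, blobs)[0])     # 403
print(fetch_blob("alice", "h2", blob_paths, rules, blobs)[0])   # 200
```

The "any referencing path" policy matters when the same content appears at both a restricted and an unrestricted path; a stricter deployment might require all referencing paths to be readable instead.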

Hash deduplication side channel: In a multi-tenant CAS, a user who knows a sensitive file's content can compute its hash and probe GET /blobs/{hash} to confirm existence. Standard mitigation: per-tenant HMAC keying (key = HMAC(tenant_key, content_hash)) disables cross-tenant deduplication for sensitive paths while preserving within-tenant dedup [E-S11, E-S12].
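
The keying formula can be demonstrated with the standard library. This is a sketch: hashlib.blake2b stands in for the content hash, HMAC-SHA256 is one reasonable choice of HMAC, and the tenant keys and file content are made-up values.

```python
import hashlib
import hmac

def content_hash(content: bytes) -> bytes:
    # Assumption: blake2b (stdlib) in place of BLAKE3.
    return hashlib.blake2b(content, digest_size=32).digest()

def tenant_blob_key(tenant_key: bytes, content: bytes) -> str:
    # key = HMAC(tenant_key, content_hash), as in the mitigation described above.
    return hmac.new(tenant_key, content_hash(content), hashlib.sha256).hexdigest()

secret_file = b"-----BEGIN PRIVATE KEY-----..."
key_a1 = tenant_blob_key(b"tenant-a-secret", secret_file)
key_a2 = tenant_blob_key(b"tenant-a-secret", secret_file)
key_b = tenant_blob_key(b"tenant-b-secret", secret_file)

print(key_a1 == key_a2)   # True: dedup still works inside tenant A
print(key_a1 == key_b)    # False: tenant B cannot derive A's storage key from the content
```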

Private branches and PRs: A branch is a named ref with an acl_policy_id. When the policy is restricted (author + named reviewers), the server omits the ref from listing and returns 403 on the commit and all downstream blobs. PRs private until merge: the PR branch has a restricted ACL; merge transitions it to the target branch's ACL atomically.

Timed/embargoed reveal: The expires_at field in the ACL table enables automatic disclosure at a specified time. Before expires_at, only the named principal can fetch blobs. After, the ACL entry is expired and broader access is granted. This is the mechanism the CVE embargo use case needs: a security researcher pushes a fix to a restricted branch; distros are granted READ until the embargo date; at expires_at, the branch becomes public. Every fetch is logged [E-section 7.5, 8].
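
A minimal sketch of the timed reveal, assuming that expires_at marks the instant the restriction lifts and that access before then is limited to a named set of principals. The Embargo type, the principal names, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Embargo:
    path_prefix: str
    principals: frozenset[str]
    expires_at: datetime   # assumption: the restriction lifts at this instant

def can_read(principal: str, path: str, embargo: Embargo, now: datetime) -> bool:
    if not path.startswith(embargo.path_prefix):
        return True                      # not covered by the embargo
    if now >= embargo.expires_at:
        return True                      # embargo expired: broader access
    return principal in embargo.principals

audit_log: list[tuple[str, str, str, bool]] = []

def fetch(principal: str, path: str, embargo: Embargo, now: datetime) -> bool:
    allowed = can_read(principal, path, embargo, now)
    audit_log.append((now.isoformat(), principal, path, allowed))   # every fetch is logged
    return allowed

embargo = Embargo(
    path_prefix="security-fix/",
    principals=frozenset({"researcher", "distro-a"}),
    expires_at=datetime(2026, 7, 15, tzinfo=timezone.utc),
)
before = datetime(2026, 7, 1, tzinfo=timezone.utc)
after = datetime(2026, 7, 16, tzinfo=timezone.utc)

print(fetch("distro-a", "security-fix/patch.c", embargo, before))    # True
print(fetch("anonymous", "security-fix/patch.c", embargo, before))   # False
print(fetch("anonymous", "security-fix/patch.c", embargo, after))    # True
print(len(audit_log))                                                # 3
```

Logging denied attempts alongside granted ones is a design choice in this sketch; it gives the embargo coordinator a record of who probed for the fix before disclosure.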

Current CVE embargo process (the gap): The Openwall linux-distros process requires patches sent as email attachments (.tar.xz because the mail list rejects .bundle), trust-based embargo (no technical enforcement), no structured tracking, and no timed reveal [E-S13, E-S14]. The missing product is exactly the CAS ACL model described above.

4F. The Agent-Parallelism Wedge

The ideal primitive: a mutable per-agent overlay over a shared immutable CAS base. Reads fall through to the CAS. Writes are captured in the agent's overlay only. Fork cost is O(1): one SQLite insert creating an empty delta record. No git clone, no pnpm install, no container boot.

Why existing approaches fail:

  • git worktrees: branch-lock blocks parallel agents on same branch; node_modules per-worktree costs 5x disk and minutes of APFS time; stale base requires orchestration-layer rebase [F-S1, F-S2]
  • Full clones: disk scales as O(N); slow on large repos
  • Docker + worktree: doubles the overhead (container boot + worktree pain)
  • e2b / Daytona snapshots: fast VM boot (28-90ms) but still a full snapshot restore, not a true CoW fork; each agent gets its own independent copy [F-S4, F-S6]
  • GitButler virtual branches: agents share one physical directory; race conditions on shared files are possible; no true file isolation [F-S10]
  • DeltaDB (Zed): the closest announced competitor; CRDT-based virtual worktrees with "effective zero cost" forks; but it replaces Git's data model (large adoption ask) and was in closed beta as of June 2026 [F-S9]

The correct fork primitive is NOT APFS clonefile(2) on a directory. Apple's man page explicitly "strongly discourages" using clonefile(2) for directory hierarchies. In one real-world measurement, cloning a 77,000-file directory took just under two minutes [H-S1, H-S2]. The correct primitive is a metadata-only write-delta:

CREATE TABLE agent_overlay (
  workspace_id TEXT NOT NULL,
  agent_id     TEXT NOT NULL,
  path         TEXT NOT NULL,
  blob_hash    TEXT,  -- NULL = deleted in this agent's view
  PRIMARY KEY (workspace_id, agent_id, path)
);

Fork = one INSERT. Fork latency: sub-millisecond, regardless of repo size. This is architecturally identical to AgentFS (Turso's per-agent overlay for AI agent workspaces) [F-S8].
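
The overlay semantics (reads fall through to the base, writes and deletions stay in the agent's view) can be exercised with Python's built-in sqlite3 module. This is a sketch under assumptions: the agent_fork registry table is hypothetical and is added only to give the single fork-time INSERT somewhere to land, and BASE is a dict standing in for the shared CAS snapshot.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE agent_overlay (
  workspace_id TEXT NOT NULL,
  agent_id     TEXT NOT NULL,
  path         TEXT NOT NULL,
  blob_hash    TEXT,  -- NULL = deleted in this agent's view
  PRIMARY KEY (workspace_id, agent_id, path)
);
-- Hypothetical registry table: the one row written at fork time.
CREATE TABLE agent_fork (
  workspace_id TEXT NOT NULL,
  agent_id     TEXT NOT NULL,
  base_root    TEXT NOT NULL,
  PRIMARY KEY (workspace_id, agent_id)
);
""")

BASE = {"src/app.ts": "h-app", "package.json": "h-pkg"}   # shared immutable snapshot

def fork(ws: str, agent: str, base_root: str = "root-v1") -> None:
    # Fork cost is one row, independent of how many files the base holds.
    db.execute("INSERT INTO agent_fork VALUES (?, ?, ?)", (ws, agent, base_root))

def write(ws: str, agent: str, path: str, blob_hash) -> None:
    # blob_hash=None records a deletion in this agent's view only.
    db.execute("INSERT OR REPLACE INTO agent_overlay VALUES (?, ?, ?, ?)",
               (ws, agent, path, blob_hash))

def resolve(ws: str, agent: str, path: str):
    row = db.execute(
        "SELECT blob_hash FROM agent_overlay "
        "WHERE workspace_id = ? AND agent_id = ? AND path = ?",
        (ws, agent, path),
    ).fetchone()
    if row is not None:
        return row[0]          # overlay wins, including deletions (None)
    return BASE.get(path)      # otherwise fall through to the shared base

fork("ws1", "agent-a")
fork("ws1", "agent-b")
write("ws1", "agent-a", "src/app.ts", "h-app-edited")
write("ws1", "agent-a", "package.json", None)

print(resolve("ws1", "agent-a", "src/app.ts"))     # h-app-edited
print(resolve("ws1", "agent-a", "package.json"))   # None
print(resolve("ws1", "agent-b", "src/app.ts"))     # h-app
```

Agent B is unaffected by agent A's edit and deletion, which is the isolation property the wedge depends on.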

On Linux, OverlayFS provides a kernel-native equivalent: one mount(2) call with lowerdir=CAS_base,upperdir=empty_dir,workdir=tmp completes in microseconds and is production-grade (used by Docker and Podman) [H-S10].

node_modules deduplication: When node_modules is part of the CAS base snapshot (pre-ingested), all N agents share the same file data via CAS lookup. No per-agent install. Reads from node_modules hydrate from the shared local CAS cache. Agents that need to modify a dependency write only their change into the overlay. This eliminates both the "9.82 GB for two worktrees" cost and the "31-140s pnpm install on APFS" cost [F-S2, F-S3].

Latency targets for a compelling demo:

  • Fork a new agent workspace: under 5ms (SQLite insert) on both platforms
  • First open of a file from local CAS cache: under 50ms
  • First open from remote S3 (same region): under 500ms
  • Warm file open (already in local cache): under 5ms
  • Extract agent diff as a git patch: under 2 seconds for diffs under 100 files

Figure: Why lazy hydration sidesteps the APFS small-files penalty.

  • ~50-200 files an agent actually opens (materialized)
  • ~50,000 node_modules files never touched (never written to disk)

A content-addressed lazy FS only writes the files an agent opens. For typical agent access patterns that is a fraction of a percent of the tree, so the APFS per-file write penalty is paid on almost nothing.

5. Competitive Landscape, Business Model, and Pricing

Competitor Map

Competitor | Per-path ACL | Instant parallel workspaces | Cross-machine sync | Key gap
--- | --- | --- | --- | ---
Cursor | Best-effort guardrails only (not a hard security boundary) [G-S11] | Yes (isolated VMs per agent) | No | No per-path ACL as security primitive
Zed + DeltaDB | No (project-level only) [G-S6] | Beta, not shipping | Yes (CRDT op-stream) | No ACL; no workspace isolation; in closed beta
Daytona | No [G-S3] | Yes (27-90ms, massively parallel) | Partial (snapshots) | No ACL; no developer cross-machine sync
E2B | No [G-S4] | Yes (up to 1,100 concurrent) | Partial (pause/resume) | No ACL; no cross-machine sync
Modal | No [G-S5] | Yes (100K+ concurrent) | Partial (Volume, artifact-only) | No ACL; Volumes are artifact storage
Coder | Org-level RBAC only [G-S12] | Yes (parallel templates) | No explicit sync | No per-file ACL
GitButler | N/A (local, no ACL) [G-S13] | No (single local directory) | No | Local-only
Graphite | N/A (GitHub repo-level) [G-S14] | No | Partial (GitHub as truth) | PR tooling only
GitHub Codespaces | No [G-S15] | Partial (each codespace is a full clone, billed separately) | No | No CoW fork; $0.18-2.88/hr per codespace
Sapling/Meta | No public product | No public product | No public product | Not a commercial competitor
Dropbox/Google Drive | No | No | Yes (file-level) | Breaks Git atomicity on .git internals [G-S17]

White Space Verdict

The three-capability combination (per-path ACL + instant agent workspaces + cross-machine sync) is genuinely unoccupied. The gap is confirmed, not hypothetical.

Per-path read ACL is the rarest capability because it requires abandoning the git-upload-pack wire protocol as the primary transfer layer. GitHub, GitLab, and Bitbucket cannot add it without breaking every existing Git client. This is a structural constraint, not a product decision gap [G-section 2].

Instant parallel workspaces are near-commodity (Daytona, E2B, Modal are all competing on this). Do not try to win on VM boot latency. Win on the combination: CoW forks that share the CAS base so agents never pay for node_modules reinstall, and that require no branch-name allocation.

Open-core, interop-first.

Community Edition (free, self-hostable): The workspace client, the Git interop protocol, and the local CoW fork primitive. Any developer can run their own CAS server. This is the "land" half of land-and-expand: get the tool into every agent-heavy team's workflow without friction.

Hosted service (monetized): The hosted CAS server that enforces per-path blob ACL, runs deduplication and chunking, and provides the agent-runtime coordination layer. This is where the real security primitive lives: client-side permission enforcement is not a defensible boundary; server blob withholding is. This is also where the switching cost accumulates: the permission graph, the ACL dataset, and the audit logs do not exist in Git and cannot be recreated from Git history.

This mirrors the HashiCorp open-core model before the BSL relicensing [G-S20]. The client and protocol stay genuinely free; the hosted governance layer is monetized.

Adoption strategy (interop-first):

  • devdropbox init in an existing Git repo adds a sidecar metadata directory (like .jj); does not touch .git; existing Git workflow unchanged
  • CoW forks produce standard Git patches or commits that push to any Git remote
  • Per-path ACL is opt-in per path prefix; teams can enable it for secrets/ or security-fix/ without touching other paths
  • A developer who never uses sensitive paths never notices the non-Git layer
  • Template: Jujutsu's Git-colocated adoption model, where JJ and Git share the same .git directory and collaborators who use git see normal commits [G-S22]

Never ask anyone to replace Git. This is the lesson from Darcs, Pijul, Bazaar, and Mercurial. All four were technically competitive; all four failed because they required teams to switch their source of truth. Git's network effects (GitHub's pull requests, CI/CD integrations, code review tools) cannot be overcome with technical superiority [G-S7, G-S8, G-S9, G-S10].

The agent wedge avoids this: the product is workspace infrastructure, not a VCS replacement. Teams continue to run git push, git pull, and git log. CI/CD pipelines see standard Git refs. The product's value is measurable (minutes of agent setup time, gigabytes of disk, seconds of fork latency) without requiring any VCS migration.

Pricing Shape

Not seat-based. Seat pricing underprices the agent-parallelism use case (a team of 5 engineers may run 500 agent workspaces per day).

Two axes:

  1. Storage: $0.015-0.02 per GB-month stored. Comparable to Cloudflare R2 / Backblaze B2. A typical 50 GB workspace history: $0.75-1.00 per month.

  2. Workspace-hours: $0.01-0.02 per active agent workspace-hour. No full VM is provisioned; the cost is coordination metadata and CAS bandwidth, not compute. At 200 agent workspaces per day, 1 hour average runtime: $2-4 per day per team.

For a 5-engineer team running 200 agent workspaces/day: roughly $60-120/month. This is far cheaper than GitHub Codespaces at equivalent agent count ($0.18-0.36 per hour per codespace: 200 workspaces x 1 hour = $36-72 per day) and competitive with E2B Pro ($150/month for 100 concurrent sandboxes [G-S4]).

(Pricing estimate. No direct public comp exists for this exact product shape.)
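The arithmetic behind these figures can be checked with a short sketch. The rates are the estimates quoted above; the 30-day month and the function names are assumptions introduced here for illustration.

```python
# Worked version of the pricing arithmetic; rates are the report's estimates.
DAYS_PER_MONTH = 30  # assumption: the report does not state a month length


def monthly_range(units_per_day, low_rate, high_rate, days=DAYS_PER_MONTH):
    """Return (low, high) monthly dollars for a per-unit usage charge."""
    return (units_per_day * low_rate * days, units_per_day * high_rate * days)


def storage_range(gigabytes, low_rate=0.015, high_rate=0.02):
    """Return (low, high) monthly dollars for stored gigabytes."""
    return (gigabytes * low_rate, gigabytes * high_rate)


# 200 agent workspaces per day at 1 hour average runtime.
workspace_hours_per_day = 200 * 1

proposed = monthly_range(workspace_hours_per_day, 0.01, 0.02)
codespaces = monthly_range(workspace_hours_per_day, 0.18, 0.36)
storage = storage_range(50)

for label, (low, high) in [
    ("Proposed workspace-hours", proposed),
    ("Codespaces-equivalent", codespaces),
    ("50 GB storage", storage),
]:
    print(f"{label}: ${low:,.2f} to ${high:,.2f} per month")
```

Under the 30-day assumption, the workspace-hours axis comes to $60-120 per month, against roughly $1,080-2,160 per month for the Codespaces-equivalent usage; storage adds under a dollar for a 50 GB history.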

First Paying Customer

Agent-platform builders: companies building products that orchestrate many parallel AI coding agents (Devin-class autonomous SWE tools, parallel code-review bots, AI-native development platforms). These companies already spend millions per month on sandbox infrastructure; E2B scaled from 40K to 15M sandbox runs per month in 12 months [G-S4]. They feel the permission and workspace-isolation gaps acutely and have the technical sophistication to adopt early. They sign enterprise contracts.

Secondary customer: AI-heavy development teams at mid-size software companies running 10-50 agents concurrently and hitting cross-machine sync and permission friction.

Not the first customer: individual developers or hobby projects. The per-path ACL and cross-machine sync features are not painful enough to pay for until a team is running agents at scale.


Figure: The three-capability combination nobody ships

Product                        Per-path read ACL   Instant parallel workspaces   Cross-machine sync
Developer Dropbox (proposed)   Yes                 Yes                           Yes
Cursor                         Best-effort         Yes                           No
Zed + DeltaDB                  No                  Beta                          Yes
Daytona                        No                  Yes                           Partial
E2B                            No                  Yes                           Partial
Modal                          No                  Yes                           Partial
Coder                          Org-only            Yes                           No
GitButler                      N/A                 No                            No
Graphite                       N/A                 No                            Partial
GitHub Codespaces              No                  Partial                       No
Sapling / Meta                 N/A                 N/A                           N/A
Dropbox / Google Drive         No                  No                            Yes

Top row is the proposed product. Pricing is monthly-equivalent for a 5-engineer team running ~200 agent workspaces/day; Codespaces normalized from its ~$36–72/day figure. Estimate only, no exact public comp exists.

6. MVP Milestone Ladder

The MVP proves the four scariest technical bets in order, with a measurable go/no-go gate at each stage. Total estimated build time for a solo founder: 3-4 weeks of focused work.

M0: FUSE daemon serves a static fake tree (2-3 days)

What it proves: FUSE-T installs and works on the operator's Macs without kext or SIP changes; the Rust fuser daemon mounts at an arbitrary path; directory listing and file reads function end-to-end.

Build: A minimal Rust binary implementing lookup, getattr, readdir, read for a hardcoded 10-file tree with inline content. No CAS, no network. Mount at ~/devdropbox/test on macOS (via FUSE-T) and /tmp/testmount on Linux.

GO gate: ls ~/devdropbox/test returns 10 placeholder files; cat returns expected content; stat latency is under 10ms at p99 on both Linux and macOS.

Kill risk: FUSE-T NFS overhead adds more than 10ms per stat call on macOS. This makes directory traversal by build tools unacceptably slow. Run this benchmark before writing any CAS code. If p99 exceeds 10ms, the macOS path must shift to File Provider (with the ~/Library/CloudStorage constraint) or wait for FUSE-T's FSKit backend on macOS 26.
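A minimal sketch of the stat benchmark behind this gate, using only the Python standard library. The function names, the round count, and the nearest-rank percentile method are assumptions made for illustration. Repeated stat calls on the same path may be answered from an attribute cache rather than the daemon, so a real run would likely need to control for caching.

```python
import math
import os
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples collected")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def stat_latencies_ms(root, rounds=100):
    """Time os.stat() on every entry directly under root, `rounds` times."""
    paths = [os.path.join(root, name) for name in os.listdir(root)]
    samples = []
    for _ in range(rounds):
        for path in paths:
            start = time.perf_counter()
            os.stat(path)
            samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def m0_gate(root, budget_ms=10.0):
    """Return (p99 in milliseconds, True if within the M0 budget)."""
    p99 = percentile(stat_latencies_ms(root), 99)
    return p99, p99 <= budget_ms
```

Usage would be along the lines of m0_gate(os.path.expanduser("~/devdropbox/test")) on macOS and m0_gate("/tmp/testmount") on Linux, comparing the two p99 values against the 10ms budget.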

M1: Opening a file fetches its blob from a local CAS (5-7 days)

What it proves: the lazy-hydration core loop works end-to-end. The FUSE daemon knows which blobs exist (from a root tree), presents them as readable files, and fetches blob bytes from the CAS on first read.

Build: Rust CAS client with local ~/.devdropbox/cas/ store keyed by BLAKE3 hash. Tree deserializer builds in-memory path-to-hash index. FUSE read callback looks up hash, fetches from CAS, returns bytes. Ingest a real repo. Mount and benchmark. Use a local MinIO instance (docker run -p 9000:9000 minio/minio server /data) as an S3-compatible remote for the cross-network test.
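The core loop can be sketched in a few lines. This is an illustration in Python rather than the planned Rust client: BLAKE2b from the standard library stands in for BLAKE3, and the LocalCas class, the two-character directory fan-out, and the flat path-to-hash dictionary are assumptions, not the report's design.

```python
import hashlib
import os


def content_hash(data):
    # Stand-in: the report specifies BLAKE3, which is not in Python's stdlib.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


class LocalCas:
    """Blobs stored on local disk, addressed by the hash of their content."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, digest):
        return os.path.join(self.root, digest[:2], digest[2:])

    def put(self, data):
        digest = content_hash(data)
        path = self._path(digest)
        if not os.path.exists(path):  # identical content is stored once
            os.makedirs(os.path.dirname(path), exist_ok=True)
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                f.write(data)
            os.replace(tmp, path)  # atomic rename into place
        return digest

    def get(self, digest):
        with open(self._path(digest), "rb") as f:
            data = f.read()
        if content_hash(data) != digest:  # verify on every read
            raise IOError(f"corrupt blob {digest}")
        return data


def ingest(cas, src_root):
    """Walk src_root, store every file, and return a path -> hash index."""
    index = {}
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                index[os.path.relpath(full, src_root)] = cas.put(f.read())
    return index


def read_file(cas, index, rel_path):
    """What a read callback would do: look up the hash, fetch the bytes."""
    return cas.get(index[rel_path])
```

Verifying the hash on every read is what makes the byte-for-byte identity gate below cheap to assert; a production client would probably verify once at fetch time and trust the local cache afterwards.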

GO gate:

  • First open from local CAS: under 50ms at p99
  • First open from local MinIO (simulated S3): under 150ms at p99
  • Warm open (bytes in local cache): under 5ms at p99
  • File content byte-for-byte identical to original (verified with BLAKE3 hash)

Kill risk: FUSE read callback round-trip (no I/O, immediate return) exceeds 5ms on macOS FUSE-T. This would indicate the NFS translation layer is the bottleneck, not CAS fetch time. Run a no-op read microbenchmark before building the CAS layer.

M2: Two machines mount the same content-addressed workspace root (3-5 days)

What it proves: the sync primitive works; an identical tree is visible on two machines without re-ingesting; a file pushed on machine A appears on machine B within seconds.

Build: Connect Rust CAS client to a real S3 bucket. Implement root pointer at s3://<bucket>/workspaces/<id>/HEAD updated with a conditional PUT. Sync daemon polls HEAD every 5 seconds; updates in-memory index on change. Test cross-machine file creation and hydration.
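The root-pointer update is a compare-and-swap. The sketch below models it with an in-memory stand-in rather than a real S3 client: HeadStore, put_if_version, and the integer version counter are hypothetical names that play the role of the HEAD object and its conditional-write precondition.

```python
import threading


class PreconditionFailed(Exception):
    """Raised when HEAD changed between the read and the conditional write."""


class HeadStore:
    """In-memory stand-in for the HEAD object; not an S3 client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
        self._version = 0

    def get(self):
        with self._lock:
            return self._value, self._version

    def put_if_version(self, new_value, expected_version):
        with self._lock:
            if self._version != expected_version:
                raise PreconditionFailed(
                    f"expected v{expected_version}, found v{self._version}")
            self._value = new_value
            self._version += 1
            return self._version


def advance_head(store, make_new_root, max_attempts=5):
    """Read HEAD, derive a new root hash, and write it only if HEAD is unchanged."""
    for _ in range(max_attempts):
        current, version = store.get()
        candidate = make_new_root(current)
        try:
            store.put_if_version(candidate, version)
            return candidate
        except PreconditionFailed:
            continue  # another machine won the race; re-read and retry
    raise RuntimeError("HEAD contention: gave up after retries")


def poll_once(store, last_seen_version):
    """One tick of the sync daemon: return (root, version) if HEAD moved, else None."""
    value, version = store.get()
    return (value, version) if version != last_seen_version else None
```

The retry loop is where a real implementation would rebase or merge the local tree onto the new remote root before trying again; the sketch simply recomputes the candidate from whatever HEAD currently holds.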

GO gate:

  • A file created on machine A is openable on machine B within 15 seconds of the HEAD update reaching S3
  • No content corruption (hash verification pass over the full tree)
  • Cold mount: first open of any file under 1 MB completes in under 1 second at p95

Kill risk: S3 cross-region consistency delay exceeds 2 seconds for HEAD pointer updates. Mitigation: require same AWS region, or use DynamoDB (strongly consistent) as the HEAD coordination store. Benchmark S3 update visibility latency before assuming sub-second consistency [H-section M2 kill risk, C-S35].

M3: N agents fork isolated CoW workspaces in under 5ms (5-8 days)

What it proves: the agent-parallelism primitive works; N agents have independent views of the same base with private in-flight writes; no branch-lock contention; no disk-copy overhead; agents share the CAS base's node_modules data without duplication.

Build: Add agent_overlay SQLite table. Implement fork(workspace_id) -> agent_id as a single INSERT. Update FUSE read/write callbacks to route through overlay lookup. Each agent process identifies itself by an agent ID passed as a mount option or side-channel socket. Linux path: OverlayFS for agent isolation (one mount(2) call per agent, microseconds). Test with N=10 concurrent agents; verify isolation; extract diffs.
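A minimal sketch of the overlay bookkeeping, using Python's built-in sqlite3 module in place of rusqlite. The column layout, the separate agents table that receives the fork INSERT, and the convention that a NULL hash marks a deletion are all assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE agents (
    agent_id     INTEGER PRIMARY KEY AUTOINCREMENT,
    workspace_id TEXT NOT NULL
);
CREATE TABLE agent_overlay (
    agent_id  INTEGER NOT NULL,
    path      TEXT NOT NULL,
    blob_hash TEXT,               -- NULL marks a deletion in this fork
    PRIMARY KEY (agent_id, path)
);
"""


def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def fork(db, workspace_id):
    """Create a fork: one INSERT, no file data is copied."""
    cur = db.execute("INSERT INTO agents (workspace_id) VALUES (?)", (workspace_id,))
    db.commit()
    return cur.lastrowid


def write(db, agent_id, path, blob_hash):
    """Record a private write (or a deletion when blob_hash is None)."""
    db.execute(
        "INSERT OR REPLACE INTO agent_overlay (agent_id, path, blob_hash) "
        "VALUES (?, ?, ?)",
        (agent_id, path, blob_hash),
    )
    db.commit()


def resolve(db, base_index, agent_id, path):
    """Read path for one agent: overlay entry first, shared base otherwise."""
    row = db.execute(
        "SELECT blob_hash FROM agent_overlay WHERE agent_id = ? AND path = ?",
        (agent_id, path),
    ).fetchone()
    if row is not None:
        return row[0]  # None here means the agent deleted the file
    return base_index.get(path)


def changed_paths(db, agent_id):
    """Overlay rows for one agent: the input to diff extraction."""
    return db.execute(
        "SELECT path, blob_hash FROM agent_overlay WHERE agent_id = ? ORDER BY path",
        (agent_id,),
    ).fetchall()
```

Because every agent resolves unmodified paths against the same base index, shared data such as node_modules exists once in the CAS regardless of how many forks are open.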

GO gate:

  • Fork creation: under 5ms on both Linux and macOS
  • Isolation: agent A's writes not visible to agent B (verified)
  • N=10 concurrent agents: 30 seconds of sustained reads with no NFS reconnect dialogs on macOS
  • Diff extraction: git-compatible patch in under 2 seconds for diffs under 100 files

Kill risk (macOS-specific): With 10 agents funneling FUSE calls through one FUSE-T NFS loopback connection, macOS shows "Server connections interrupted" dialogs and force-unmounts. This is the specific failure Meta's EdenFS team documented. If this occurs below N=10, macOS requires either (a) FUSE-T on macOS 26 for the FSKit backend, (b) one FUSE-T daemon instance per agent (higher overhead), or (c) Linux-only for the multi-agent use case. Run the concurrency stress test as soon as a FUSE-T mount exists (week one, per R1): 10 parallel cat processes against the mount for 60 seconds. Fail fast on this risk before investing in M3.
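The stress test can be approximated with a short script. This sketch uses reader threads instead of separate cat processes, which is an assumption that may load the mount differently; it also counts only I/O errors, so the "Server connections interrupted" dialog itself still has to be watched for by eye.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def read_loop(paths, deadline):
    """Read every file repeatedly until the deadline; return (reads, errors)."""
    reads = errors = 0
    while time.monotonic() < deadline:
        for path in paths:
            try:
                with open(path, "rb") as f:
                    f.read()
                reads += 1
            except OSError:
                errors += 1  # e.g. the mount dropped mid-run
    return reads, errors


def stress(root, workers=10, seconds=60.0):
    """Run `workers` parallel readers against the files directly under root."""
    paths = [
        os.path.join(root, name)
        for name in sorted(os.listdir(root))
        if os.path.isfile(os.path.join(root, name))
    ]
    if not paths:
        raise ValueError(f"no files found under {root}")
    deadline = time.monotonic() + seconds
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: read_loop(paths, deadline), range(workers)))
    total_reads = sum(r for r, _ in results)
    total_errors = sum(e for _, e in results)
    return total_reads, total_errors
```

A run of stress(mount_path) that finishes with zero errors and a still-mounted filesystem is the pass condition; any non-zero error count or a force-unmount is the signal to fall back to one of the options listed above.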

M4 (optional): Per-path read ACL at the FUSE layer (2-3 days)

What it proves: the FUSE daemon can enforce per-path read restrictions; a file returns EACCES for a principal without READ permission even if the principal knows the path.

Build: Add acl table (path_prefix, principal, permission, expires_at) to workspace SQLite. In FUSE getattr and read callbacks, check ACL before serving. Test: configure one agent as READ-only on /src and DENY on /secrets; verify EACCES for the denied path.
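The check itself is a small filter. The sketch below keeps rules as in-memory tuples shaped like the acl table's columns; the longest-prefix-wins rule, the default-deny fallback, and the reading of expires_at as "this rule stops applying at that time" are assumptions, since the report does not spell out the semantics.

```python
import errno
import time

# Each rule mirrors the acl table: (path_prefix, principal, permission, expires_at).
# expires_at is a Unix timestamp, or None for a rule that never expires.


def is_under(path, prefix):
    """True if path equals prefix or lies beneath it on a component boundary."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def check_read(rules, principal, path, now=None):
    """Most specific unexpired rule for this principal wins; no rule means deny."""
    now = time.time() if now is None else now
    best = None
    for prefix, who, permission, expires_at in rules:
        if who != principal:
            continue
        if expires_at is not None and now >= expires_at:
            continue  # expired rule: this is what produces a timed reveal
        if not is_under(path, prefix):
            continue
        if best is None or len(prefix) > len(best[0]):
            best = (prefix, permission)
    return best is not None and best[1] == "READ"


def guard_read(rules, principal, path, now=None):
    """Call at the top of getattr/read; raises the error the gate expects."""
    if not check_read(rules, principal, path, now):
        raise PermissionError(errno.EACCES, "read denied by ACL", path)
```

Under these assumptions, an embargo is a DENY rule with an expires_at on the sensitive prefix layered over a broader READ rule: once the DENY expires, the broader rule takes over and the path becomes readable without any further write to the table.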

GO gate: cat /mnt/agent-A/secrets/prod.key returns EACCES for agent A (DENY configured); same path returns content for agent B (READ configured). Optional: verify expires_at timed reveal with a 30-second embargo.

Kill risk: Low. FUSE ACL enforcement is a straightforward filter in existing callback code. The only caveat is that hash-probing (direct CAS blob API calls) can bypass the FUSE layer; the CAS HTTP API must also enforce ACL with authentication for true server-side enforcement.

Stack Summary

Layer                        Technology
Core language                Rust (fuser, object_store, blake3, rusqlite, tokio)
Linux FUSE                   fuser crate + kernel FUSE; OverlayFS for agent isolation
macOS FUSE                   FUSE-T (brew install fuse-t); FSKit backend on macOS 26+
CAS hash                     BLAKE3 (4-8 GB/s software)
CAS blob store               S3 + local disk LRU cache (object_store crate)
Agent overlay                SQLite agent_overlay table; O(1) fork via single INSERT
Agent isolation (Linux)      Kernel OverlayFS; O(1) mount(2)
Root pointer coordination    S3 conditional PUT or DynamoDB for strong consistency
Secrets                      SOPS + AWS KMS; CAS stores ciphertext only
Swift FFI (M4 only)          Mozilla UniFFI for future File Provider extension

Figure: MVP milestone ladder (~3–4 weeks solo, each gate a go/no-go)

Milestone   Deliverable                                          Duration (upper bound)
M0          FUSE daemon serves a static fake tree                3 days
M1          Opening a file fetches its blob from a local CAS     7 days
M2          Two machines mount the same workspace root           5 days
M3          N agents fork isolated CoW workspaces in <5ms        8 days
M4          Per-path read ACL at the FUSE layer (optional)       3 days

Upper-bound durations shown sequentially. The highest-risk unknown (FUSE-T concurrency on macOS) is settled in week one with a 10-parallel-read stress test before M1 is built. Latency targets are the demo go-gates.

7. Open Risks and Unknowns

R1: FUSE-T concurrency ceiling on macOS (HIGH PRIORITY, validate in week 1)

The macOS FUSE-T NFS loopback has a documented reliability ceiling under high concurrent I/O, confirmed by Meta's EdenFS team [B-S16]. The exact concurrency threshold for the "Server connections interrupted" dialog is unknown for the lazy-hydration + agent-overlay use case. This is the single highest-risk unknown in the plan. Run the de-risk experiment (10 parallel cat processes for 60 seconds) in the first week, before any M1 build work. If the threshold is below N=5, reconsider macOS as the agent-farm platform.

R2: Agent file-access sparseness (validate in M3)

The claim that lazy hydration sidesteps the APFS small-file penalty rests on agents accessing fewer than 20% of the file tree. This must be validated empirically with instrumented agent runs. If a common agent task (e.g., TypeScript type-checking) traverses all files in node_modules for resolution, the penalty partially returns. Build an access-count instrumentation harness as part of M3 to measure actual access patterns across representative agent tasks.
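A minimal sketch of what the access-count harness could record. The AccessLog class and its method names are hypothetical; in the real daemon the record call would sit inside the FUSE read callback.

```python
from collections import Counter


class AccessLog:
    """Counts reads per path so the touched fraction of the tree can be reported."""

    def __init__(self, all_paths):
        self.all_paths = set(all_paths)
        if not self.all_paths:
            raise ValueError("tree is empty")
        self.counts = Counter()

    def record(self, path):
        """Call once per read; paths outside the tree are counted but ignored later."""
        self.counts[path] += 1

    def fraction_touched(self):
        """Share of files in the tree that were read at least once."""
        touched = self.all_paths.intersection(self.counts)
        return len(touched) / len(self.all_paths)

    def hottest(self, n=10):
        """The n most frequently read paths, most read first."""
        return self.counts.most_common(n)


def sparseness_holds(log, threshold=0.20):
    """True if the run stayed under the 20% access assumption stated above."""
    return log.fraction_touched() < threshold
```

Running one log per representative agent task (a type-check, a test run, a refactor) would show which task classes break the assumption, rather than averaging them into a single figure.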

R3: S3 cross-region sync latency

S3 cross-region replication (CRR) has no SLA; documented delays can exceed 180 seconds for multi-file transactions [C-S35]. If the operator's machines span AWS regions, the M2 "visible within 15 seconds" gate may not be achievable without a DynamoDB coordination layer. This is a known risk with a known mitigation; confirm the operator's machine geography before M2.

R4: Adoption friction from the non-Git blob API

Per-path ACL enforcement requires agents to fetch files via a custom HTTP blob API, not git clone. Teams that instrument their agent pipelines to use this API are early adopters. Teams that rely on standard git clone before spawning agents get no per-path ACL benefit. The adoption risk is that the ACL feature is only available to teams willing to adopt the custom client. Mitigation: the Community Edition CoW workspace fork (no ACL) uses the same client, so teams adopt the client for the fork primitive and get ACL as an add-on, not as the entry requirement.

R5: DeltaDB (Zed) shipping as described

If DeltaDB ships with "effective zero cost" workspace forks and CRDT-based parallel agent editing as announced, it closes the "instant fork" part of the gap. However, DeltaDB requires abandoning Git's commit-as-snapshot data model (a large adoption ask), has no per-path ACL model, and was in closed beta as of June 2026. Monitor, but do not assume it will ship on schedule or at the described capability level.

R6: Hash deduplication side channel in multi-tenant CAS

A user who knows a sensitive file's content can compute its BLAKE3 hash and probe the CAS API to confirm existence, even without READ permission. Standard mitigation (per-tenant HMAC keying) prevents cross-tenant dedup but adds storage cost. For the single-operator MVP (all files owned by one tenant), this is not an issue. For a multi-tenant hosted service, the mitigation must be designed in from the start [E-S11, E-S12].
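The probe and its mitigation can be illustrated with the standard library. HMAC-SHA256 and BLAKE2b are stand-ins chosen here because BLAKE3 is not in Python's stdlib (BLAKE3 has its own keyed mode); the tenant secrets and function names are hypothetical.

```python
import hashlib
import hmac


def plain_key(data):
    """Unkeyed content address: anyone holding the bytes can compute it."""
    # Stand-in for the BLAKE3 hash the report specifies.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def tenant_key(tenant_secret, data):
    """Per-tenant keyed address: computing it requires the tenant's secret."""
    return hmac.new(tenant_secret, data, hashlib.sha256).hexdigest()


# Hypothetical secrets; in practice these would be generated and held server-side.
TENANT_A_SECRET = b"tenant-a-secret-for-illustration"
TENANT_B_SECRET = b"tenant-b-secret-for-illustration"

sensitive = b"-----BEGIN PRIVATE KEY----- example bytes"

# With unkeyed addressing, an outsider who guesses the content gets the same
# address the store uses, and can probe for its existence.
outsider_probe = plain_key(sensitive)
stored_unkeyed = plain_key(sensitive)

# With per-tenant keying, the same bytes map to different addresses per tenant,
# so the probe fails, and so does cross-tenant deduplication.
stored_for_a = tenant_key(TENANT_A_SECRET, sensitive)
stored_for_b = tenant_key(TENANT_B_SECRET, sensitive)

print("unkeyed probe matches store:", outsider_probe == stored_unkeyed)
print("tenants share an address:", stored_for_a == stored_for_b)
```

The second comparison is the storage cost mentioned above: two tenants holding the same blob now store it twice, while deduplication within a single tenant is unaffected.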

R7: FUSE-T's FSKit backend (macOS 26) maturity

FUSE-T's FSKit backend on macOS 26 Tahoe is announced but its maturity and reliability characteristics as of mid-2026 are unknown. If FSKit resolves the NFS reliability issues, it becomes the macOS production path for multi-agent concurrency. If FSKit introduces new limitations (Apple has not publicly committed to supporting virtual/network use cases in FSKit), the macOS path remains constrained. This risk resolves through direct testing on macOS 26 after M0.

R8: The SCM-replacement graveyard is a cultural risk, not just technical

Every VCS that tried to displace Git from the front door failed, regardless of technical merit. The interop-first strategy (Git stays the source of truth; the product is a layer) is the correct response, but it requires sustained discipline not to drift into "just one more feature that requires everyone to switch." Every product decision must be evaluated against the question: "Can a team use this feature while still pushing standard Git commits to GitHub?" If the answer is ever no, the feature must be redesigned.


Figure: Open risk register

ID   Risk                                           Status / action
R1   FUSE-T concurrency ceiling on macOS            HIGH · validate week 1
R2   Agent file-access sparseness assumption        validate in M3
R3   S3 cross-region sync latency                   known mitigation
R4   Adoption friction from the non-Git blob API    design
R5   DeltaDB (Zed) shipping as described            monitor
R6   Hash dedup side channel in multi-tenant CAS    per-tenant HMAC
R7   FUSE-T FSKit backend (macOS 26) maturity       test on macOS 26
R8   SCM-replacement is a cultural graveyard        stay interop-first

Two risks are load-bearing: R1 (resolve in week one before building) and R8 (every feature must work while the team still pushes standard Git).

Sources

All citations below are organized by the area notes that sourced them. The area prefix (A, B, C, D, E, F, G, H) maps to the eight research-notes files under docs/research-notes/.

Area A: Prior Art

Area B: Lazy-Hydration Virtual Filesystem

Area C: Content-Addressed Store + Sync

Area D: APFS Small-Files Problem

Area E: Per-Path Permissions

Area F: Agent Parallelism

Area G: Competitive Landscape

Area H: MVP