How to keep private files tracked beside public code while publishing a private-free public history with Jujutsu filesets, revsets, sync merges, and split.

While working on Vihren I encountered the following issue. Vihren is intended to be open-source. At the same time there are files that I’m not ready to share as part of the public repo. Mostly those are files that inform my agents how to drive development. Some of them are product documentation containing internal goals, a roadmap or ideas for monetisation. Some of the work consists of experiments and half-baked ideas that I haven’t discovered the final shape of yet.

All of that private context needs to be version-controlled. At the same time I don’t want it to end up in the public repo by accident.

The need to version-control private context also comes from working in parallel. Jujutsu offers jj workspaces, which allow agents to explore several ideas simultaneously without interfering with each other. For this to work, setting up a new jj workspace has to be very simple. If all work is version-controlled, I can just do:

jj workspace new ../vihren-workspace-xyz
cd ../vihren-workspace-xyz
direnv allow

The last command, direnv allow, can set up a complete development environment based on Nix.

This approach can only work if all files that I care about are tracked by jj.

So I needed a way to work on an open-source project while maintaining private data that never goes out of sync with the publicly visible code.

Alternatives

Why not .gitignore? Ignoring a file makes it untracked — it lives on my disk and nowhere else. That is the opposite of what I need: the private context has to be version-controlled and travel with the repo, so a fresh jj workspace plus direnv allow reproduces it. .gitignore solves “don’t publish” by discarding the history I am trying to keep.

The other options each give up one of the two things I want — context that is tracked and adjacent, and a public history that never contains private paths:

A separate private repo keeps both histories but loses adjacency: I cannot commit a code change and its private note together, and the two drift out of sync.
A submodule for the private directory stays versioned, but it is a second history with its own commit-and-push dance, and the pointer still rides along in public. Also, Jujutsu doesn’t currently support git submodules, which makes working with workspaces difficult.
git-crypt keeps the private files in the public repo, only encrypted; I do not want them in public history at all.
sparse-checkout or skip-worktree only hide files in the working tree — they never produce a private-free history to push.

The heavyweight version is a real projection service like Copybara or Josh, syncing between two repositories. What follows is the small version: one local repository, and three jj features — filesets, revsets, and the git.private-commits setting — doing the rest.

The public-private split

We want to split files into public and private, and to maintain a line of commits that contains only public files and can be freely pushed to a branch in the public repo. At the same time we want to keep track of the private context associated with each public commit. So we are aiming at the following structure for our log history:

---
config:
  gitGraph:
    mainBranchName: public
---
gitGraph
   commit id: "P0"
   branch private
   commit id: "S0"
   checkout public
   commit id: "P1"
   checkout private
   merge public id: "S1"
   checkout public
   commit id: "P2"
   checkout private
   merge public id: "S2"
   commit id: "S3"

Each private commit branches from the public commit whose context it carries, and merges the public line forward as public advances; that merge edge is the association between the two lines. Not every private commit has a public partner: S3 is private-only work (notes or experiments with no public counterpart yet).

To say which files are public and which are private we can use filesets. Here are my settings at the moment:

[fileset-aliases]
'private' = 'root-glob:"**/private/**" | root-glob:".github/workflows/*.private.yml"'
'public' = '~private'
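A rough way to see what these two globs cover is to mimic them with plain shell patterns. The helper below is only a sketch: is_private_path is a hypothetical name, and it assumes that "**/" also matches zero directories (so a top-level private/ counts) and that "*" stays within one path component. jj itself remains the authority on what the fileset matches.

```shell
# Hypothetical helper: approximate the 'private' fileset with shell patterns.
# Assumptions: "**/" may match zero directories, "*" does not cross "/".
is_private_path() {
  local path="$1" name
  # root-glob:"**/private/**": a directory named "private" at any depth,
  # with at least one path component below it.
  case "/$path" in
    */private/?*) return 0 ;;
  esac
  # root-glob:".github/workflows/*.private.yml": direct children only.
  case "$path" in
    .github/workflows/*.private.yml)
      name="${path#.github/workflows/}"
      case "$name" in
        */*) ;;            # nested deeper than one level: not a match
        *) return 0 ;;
      esac
      ;;
  esac
  return 1
}

# Classify a few illustrative paths.
for p in docs/private/roadmap.md private/notes.md src/main.rs \
         .github/workflows/deploy.private.yml .github/workflows/ci.yml; do
  if is_private_path "$p"; then echo "private  $p"; else echo "public   $p"; fi
done
```

The real check against a working copy would be jj file list private, which should print the tracked files the fileset selects.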

Then we can define which commits can be published:

[revset-aliases]
'publishable' = '~(files(private)::)'

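Here files(private) selects the commits that change a private path, and the trailing :: adds all of their descendants. So a commit cannot be published if it or any of its ancestors touches a private file, and it can be published only when its whole history is free of private paths.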
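Before pushing, the alias can also be queried directly. The wrapper below is a minimal sketch, assuming the publishable alias above is loaded and that the argument names a single revision; is_publishable is a hypothetical name, not something jj provides.

```shell
# Hypothetical helper: succeed if REV (default: the working copy) is
# publishable. Assumes the 'publishable' revset alias is configured.
is_publishable() {
  local rev="${1:-@}" hit
  # The intersection is empty when REV descends from a private-touching commit.
  hit="$(jj log --no-graph -r "(${rev}) & publishable" -T 'commit_id ++ "\n"')"
  [ -n "$hit" ]
}
```

Because everything that is not publishable is closed under descendants, a publishable commit implies publishable ancestors, so checking the tip alone should be enough.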

Finally we can limit which commits can be pushed:

[git]
private-commits = '~publishable'

With this setting jj refuses to push a private commit, so it cannot happen by accident. Pushing one deliberately is still possible. I use a private remote for that, and pushing can be done with a simple alias:

[aliases]
push-private = ['git', 'push', '--remote', 'private', '--allow-private']

So instead of

jj git push -b private-bookmark

I do

jj push-private -b private-bookmark

Splitting commits

During normal work I tend to generate a stack of commits. This stack can involve changes to both public and private files. For example, in my agentic workflow I often have my agent maintain a status file that it needs to update frequently when working on a big task. So this private status file gets updated with each commit.

So after the work is done and ready to be published, I need to separate the public parts from the private ones and bring the commit history into the correct shape.

The key operation for this is jj split --onto.

Let’s say that we have a commit history like:

---
config:
  gitGraph:
    mainBranchName: public
---
gitGraph
   commit id: "P0"
   branch private
   commit id: "S0"
   checkout public
   commit id: "P1"
   checkout private
   merge public id: "S1"
   checkout public
   commit id: "P2"
   checkout private
   merge public id: "S2"
   commit id: "C"

In this case the commit C contains public work that needs to be extracted. We can do it like so:

jj split -r C --onto P2 public

This will extract all changes to public files into a new commit P3 on top of P2 and reshape the history:

---
config:
  gitGraph:
    mainBranchName: public
---
gitGraph
   commit id: "P0"
   branch private
   commit id: "S0"
   checkout public
   commit id: "P1"
   checkout private
   merge public id: "S1"
   checkout public
   commit id: "P2"
   checkout private
   merge public id: "S2"
   checkout public
   commit id: "P3"
   checkout private
   merge public id: "C1"

The commit C1 will now only contain the private changes from C, while all public changes will have moved to P3. C1 merges P3, so the public work stays reachable from the private line without living on it.
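Judging by the script further down, the merge edge appears to come from a rebase that follows the split and gives the private remainder two parents. For a single commit the sequence can be sketched as below. peel_public is a hypothetical name, the arguments stand in for C, P2 and S2 from the diagram, and the sketch assumes a jj version that accepts --onto / -o on both split and rebase.

```shell
# Hypothetical helper mirroring the steps the split-stack script performs for
# one commit. Pass the mixed commit as a change ID (the split rewrites its
# commit ID), plus the public tip to build on and the private parent to keep.
peel_public() {
  local mixed="$1" public_tip="$2" private_parent="$3" new_public
  # Step 1: move the public-file changes into a new commit on the public tip,
  # reusing the original description.
  jj split -r "$mixed" --onto "$public_tip" public \
    -m "$(jj log --no-graph -r "$mixed" -T description)"
  # Step 2: find that new commit: the single child of the public tip that
  # touches public files.
  new_public="$(jj log --no-graph \
    -r "exactly(${public_tip}+ & files(public), 1)" -T commit_id)"
  # Step 3: re-parent the private remainder onto both lines, which makes it
  # a merge of the private parent and the new public commit.
  jj rebase -s "$mixed" -o "$private_parent" -o "$new_public"
}
```

With the names from the diagram this would be called as peel_public C P2 S2.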

When I need to convert a whole stack like this I use a small script. It takes a single argument — the private base to operate from — and derives everything else:

#!/usr/bin/env bash
# jj-split-stack PRIV_BASE [HEAD]   (HEAD defaults to @)
# Split PRIV_BASE..HEAD into a public rail and a private ladder. The public
# base is derived as projection(PRIV_BASE), not supplied.
set -euo pipefail
[ $# -ge 1 ] || { echo "usage: jj-split-stack PRIV_BASE [HEAD]" >&2; exit 2; }
priv_base="$1"; head="${2:-@}"

# Derive and validate the public base in one step.
if ! pub_base="$(jj log --no-graph -r "projection($priv_base)" -T commit_id 2>/dev/null)"; then
  echo "error: cannot derive a public base for '$priv_base' (it is unprojectable," >&2
  echo "       or the public history forked). See the revset algebra below." >&2
  exit 1
fi

# Cursors are commit-ids. The stack is walked by change-id because split/rebase
# rewrite commit-ids but preserve change-ids.
public_cursor="$pub_base"
private_cursor="$(jj log --no-graph -r "$priv_base" -T commit_id)"

for c in $(jj log --no-graph --reversed -r "stack($priv_base, $head)" -T 'change_id ++ "\n"'); do
  if [ -n "$(jj diff -r "$c" --name-only public)" ]; then
    jj split  -r "$c" --onto "$public_cursor" public \
              -m "$(jj log --no-graph -r "$c" -T description)"
    public_cursor="$(jj log --no-graph -r "exactly($public_cursor+ & files(public), 1)" -T commit_id)"
    jj rebase -s "$c" -o "$private_cursor" -o "$public_cursor"
  fi
  private_cursor="$(jj log --no-graph -r "$c" -T commit_id)"
done

jj bookmark set public  -r "$public_cursor"
jj bookmark set private -r "$private_cursor"

The script leans on a few revsets that build on each other:

[revset-aliases]
# public work that is not publishable: a mixed commit, or a public-only
# edit authored on the private line. This is what blocks a clean projection.
'stranded'         = 'files(public) & ~publishable'
# no stranded commit anywhere in history => the public content is fully on
# the rail
'projectable'      = '~(stranded::)'
# the unique publishable commit a projectable commit sits on — its public
# base. exactly(...,1) makes it self-validating.
'projection(priv)' = 'exactly(heads(::(priv & projectable) & publishable), 1)'

'stack(base, head)' = 'base..head'

[aliases]
split-stack = ['util', 'exec', '--', 'bash', '-c', '"$JJ_WORKSPACE_ROOT"/jj/split-stack "$@"', 'jj-split-stack']

The interesting one is projection. A private base does not need me to tell it which public commit it belongs to — the graph already determines that. projection(priv) is the newest publishable commit in priv’s history (heads(::priv & publishable)), restricted to the case where priv is projectable, meaning no stranded public work sits below it. exactly(...,1) then collapses the result to a single commit or a loud failure: it fails when priv is unprojectable (there is public work that has not been peeled onto the rail yet) and when the public history has forked, so the base would be ambiguous.
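When projection fails, it can help to look at the two failure cases separately. The diagnostics below are a sketch under the assumption that the stranded and publishable aliases above are loaded; the function names are hypothetical.

```shell
# Hypothetical diagnostics for a failing projection(priv).
# Assumes the 'stranded' and 'publishable' revset aliases are configured.
stranded_below() {
  # Public work in PRIV's history that never made it onto the public rail.
  jj log --no-graph -r "::($1) & stranded" \
    -T 'change_id.short() ++ " " ++ description.first_line() ++ "\n"'
}

public_heads_below() {
  # Candidate public bases: the newest publishable commits in PRIV's history.
  jj log --no-graph -r "heads(::($1) & publishable)" \
    -T 'change_id.short() ++ "\n"'
}

diagnose_projection() {
  local priv="$1" stranded heads
  stranded="$(stranded_below "$priv")"
  heads="$(public_heads_below "$priv" | grep -c . || true)"
  if [ -n "$stranded" ]; then
    echo "unprojectable: stranded public work below $priv:"
    echo "$stranded"
    return 1
  elif [ "$heads" -ne 1 ]; then
    echo "ambiguous: $heads publishable heads below $priv"
    return 1
  fi
  echo "ok: exactly one public base"
}
```

The first branch corresponds to public work that has not been peeled onto the rail yet, the second to a forked public history.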

So to restructure the stack ending at @ I pass only the private base:

jj split-stack private_base

and the public base is derived.

Sharp edges

This setup is simple and covers my current needs, but has limitations. The two that matter both come from the same fact: the boundary is a path convention. The fileset matches files under private/ directories; it has no idea what is inside a file.

Private content can still leak through public files. The guard guarantees that no file under a private path ends up in public history. But there is no way to ensure that public files don’t contain private content. For example, we could have an internal URL in a README, a customer name in a comment, or an unreleased feature in a docstring. Or we could import a private module from a public one. Such mistakes happen and are easy to miss when the code is generated by an agent. jj filesets cannot see content, so this slips straight through and the commit still looks publishable.
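One ordinary check that could sit on top is a deny-list scan over the public files. The sketch below is not part of the setup described here: scan_for_private_content is a hypothetical name, and the deny-list file with one extended regex per line is an assumption.

```shell
# Hypothetical content check: flag files whose text matches any pattern in a
# deny-list (one extended regex per line). Returns non-zero on any match.
scan_for_private_content() {
  local patterns="$1"
  shift
  local status=0 f
  for f in "$@"; do
    # -n: line numbers, -H: file name, -E: extended regex, -f: pattern file
    if grep -nHE -f "$patterns" -- "$f"; then
      status=1
    fi
  done
  return "$status"
}
```

The file list could come from jj file list public. A scan like this only catches strings you thought to list, so it narrows the gap rather than closing it.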

A private file in the wrong commit poisons the line. The publishable revset is history-aware: a commit is publishable only if neither it nor any ancestor touched a private path. So if a private file sneaks into a commit on the public line, deleting it later does not help. The history still touched the private fileset, so that commit and everything after it stay unpublishable forever. This can also happen retroactively, when I decide to add more files to the private fileset. The only fix is to rewrite history so that the private touch never happened. jj makes that easy mechanically, but it is surgery on the graph, not rm followed by a commit.
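To find where the surgery has to start, the earliest private-touching commits in a line's history can be listed. This is a sketch assuming the private fileset alias is loaded; poison_roots is a hypothetical name.

```shell
# Hypothetical helper: list the earliest commits in REV's history that touch
# a private path. Assumes the 'private' fileset alias is configured.
poison_roots() {
  jj log --no-graph -r "roots(::(${1:-@}) & files(private))" \
    -T 'change_id.short() ++ " " ++ description.first_line() ++ "\n"'
}
```

Run against the public bookmark, this should print nothing on a healthy line. After widening the fileset, whatever it prints is where the rewrite would need to begin.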

Two smaller ones are worth knowing. The git.private-commits setting lives in my local jj config, so a fresh clone, a CI job, or a raw git push is not protected by it. And the path guard says nothing about whether the public line actually builds: nothing stops a public package from importing something that only exists under a private path, which leaves you with a public repo that does not compile. Both need ordinary checks layered on top of the fileset machinery.
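For the first of these, a guard that needs only git and grep could run in CI, where the jj config is absent. The sketch below is an assumption-laden stand-in: reject_private_paths is a hypothetical name, the two regexes only approximate the globs of the private fileset, and origin/main in the usage comment is a placeholder for whatever branch is being published.

```shell
# Hypothetical CI guard: read paths on stdin and fail if any of them matches
# the private path convention. The regexes approximate the 'private' fileset.
reject_private_paths() {
  local hits
  hits="$(grep -E -e '(^|/)private/.' \
                  -e '^\.github/workflows/[^/]*\.private\.yml$' || true)"
  if [ -n "$hits" ]; then
    echo "private paths found in public history:" >&2
    echo "$hits" >&2
    return 1
  fi
}

# Usage in CI, over every path ever touched on the published branch:
#   git log --format= --name-only origin/main | sort -u | reject_private_paths
```

Because it walks the whole history, it flags a private path even if a later commit deleted the file, which matches the history-aware behaviour of publishable.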

Conclusion

So this is my workflow right now. It is a bit rough around the edges but it gets the job done, and it stays a lot smaller than the heavier projection tooling it stands in for.

Let me know what you think. You can see and copy the complete solution at https://github.com/vihren-dev/vihren/tree/main/jj

A jj Workflow for splitting public and private files
