While working on Vihren I encountered the following issue. Vihren is intended to be open-source. At the same time there are files that I’m not ready to share as part of the public repo. Mostly those are files that inform my agents how to drive development. Some of them are product documentation containing internal goals, a roadmap or ideas for monetisation. Some of the work consists of experiments and half-baked ideas that I haven’t discovered the final shape of yet.
All of that private context needs to be version-controlled. At the same time I don’t want it to end up in the public repo by accident.
The need to version-control private context also comes from working in parallel. Jujutsu offers jj workspaces, which let agents explore several ideas simultaneously without interfering with each other. For this to work, setting up a new jj workspace has to be very simple. If all work is version-controlled, I can just do:
jj workspace new ../vihren-workspace-xyz
cd ../vihren-workspace-xyz
direnv allow
The last command, direnv allow, can set up a complete development
environment based on Nix.
This approach can only work if all files that I care about are tracked by jj.
So I needed a way to work on an open-source project while maintaining private data that does not go out of sync with the publicly visible parts.
Alternatives
Why not .gitignore? Ignoring a file makes it untracked — it
lives on my disk and nowhere else. That is the opposite of what I need:
the private context has to be version-controlled and travel with the
repo, so a fresh jj workspace plus direnv allow reproduces it.
.gitignore solves “don’t publish” by discarding the history I am
trying to keep.
The other options each give up one of the two things I want — context that is tracked and adjacent, and a public history that never contains private paths:
- A separate private repo keeps both histories but loses adjacency: I cannot commit a code change and its private note together, and the two drift out of sync.
- A submodule for the private directory stays versioned, but it is a second history with its own commit-and-push dance, and the pointer still rides along in public. Jujutsu also doesn’t currently support git submodules, which makes working with workspaces difficult.
- git-crypt keeps the private files in the public repo, only encrypted; I do not want them in public history at all.
- sparse-checkout or skip-worktree only hide files in the working tree — they never produce a private-free history to push.
The heavyweight version is a real projection service like
Copybara or
Josh, syncing between two
repositories. What follows is the small version: one local repository,
and three jj features — filesets, revsets, and the git.private-commits
setting — doing the rest.
The public-private split
We want to split files into public and private, and to maintain lines of commits that contain only public files and can be freely pushed to a branch in the public repo. At the same time we want to keep track of the private context associated with each public commit. So we are aiming for the following structure in the log:
---
config:
gitGraph:
mainBranchName: public
---
gitGraph
commit id: "P0"
branch private
commit id: "S0"
checkout public
commit id: "P1"
checkout private
merge public id: "S1"
checkout public
commit id: "P2"
checkout private
merge public id: "S2"
commit id: "S3"
Each private commit branches from the public commit whose context it carries, and merges the public line forward as public advances; that merge edge is the association between the two lines. Not every private commit has a public partner: S3 is private-only work (notes or experiments with no public counterpart yet).
In order to be able to say which files are public and which are private we can use filesets. Here’s my setting at the moment:
[fileset-aliases]
'private' = 'root-glob:"**/private/**" | root-glob:".github/workflows/*.private.yml"'
'public' = '~private'
Then we can define which commits can be published:
[revset-aliases]
'publishable' = '~(files(private)::)'
So a commit cannot be published if it touches a private file or descends from a commit that does. A commit can be published only if neither it nor any of its ancestors touches a private file.
Finally we can limit which commits can be pushed:
[git]
private-commits = '~publishable'
With this setting I cannot push a private commit by accident. Pushing one deliberately is still possible: I use a private remote for that, and pushing can be done with a simple alias:
[aliases]
push-private = ['git', 'push', '--remote', 'private', '--allow-private']
So instead of
jj git push -b private-bookmark
I do
jj push-private -b private-bookmark
Splitting commits
During normal work I tend to generate a stack of commits. This stack can involve changes to both public and private files. For example in my agentic workflow I often make my agent maintain a status file that it needs to update frequently when working on a big task. So this private status file gets updated with each commit.
So once the work is done and ready to be published, I need to separate the public parts from the private ones and bring the commit history into the correct shape.
The key operation for this is jj split --onto.
Let’s say that we have a commit history like:
---
config:
gitGraph:
mainBranchName: public
---
gitGraph
commit id: "P0"
branch private
commit id: "S0"
checkout public
commit id: "P1"
checkout private
merge public id: "S1"
checkout public
commit id: "P2"
checkout private
merge public id: "S2"
commit id: "C"
In this case the commit C contains public work that needs to be extracted. We can do it like so:
jj split -r C --onto P2 public
This extracts all changes to public files into a new commit P3 and rewrites the history to:
---
config:
gitGraph:
mainBranchName: public
---
gitGraph
commit id: "P0"
branch private
commit id: "S0"
checkout public
commit id: "P1"
checkout private
merge public id: "S1"
checkout public
commit id: "P2"
checkout private
merge public id: "S2"
checkout public
commit id: "P3"
checkout private
merge public id: "C1"
The commit C1 will now only contain the private changes from C, while all public changes will have moved to P3. C1 merges P3, so the public work stays reachable from the private line without living on it.
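After a split like this it can be worth confirming that the two halves really are clean. The following is a minimal sketch, not part of the original setup: check_split is a hypothetical helper name, and it assumes the private and public fileset aliases defined earlier. It uses the same jj diff --name-only form that the split-stack script relies on.

```shell
# Sketch (hypothetical helper): verify that a split left no private paths in
# the public commit and no public paths in the private one.
# Assumes the 'private' and 'public' fileset aliases are configured.
check_split() {
  local pub_rev="$1" priv_rev="$2" leaked leftover
  # Private paths changed by the supposedly public commit.
  leaked="$(jj diff -r "$pub_rev" --name-only private)"
  # Public paths still changed by the supposedly private commit.
  leftover="$(jj diff -r "$priv_rev" --name-only public)"
  if [ -n "$leaked" ]; then
    printf 'private paths in %s:\n%s\n' "$pub_rev" "$leaked" >&2
    return 1
  fi
  if [ -n "$leftover" ]; then
    printf 'public paths left in %s:\n%s\n' "$priv_rev" "$leftover" >&2
    return 1
  fi
}
```

For the example above this would be called as check_split P3 C1, with the placeholders replaced by real change IDs.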
When I need to convert a whole stack like this I use a small script. Its only required argument is the private base to operate from (an optional second argument sets the head, defaulting to @), and it derives everything else:
#!/usr/bin/env bash
# jj-split-stack PRIV_BASE [HEAD] (HEAD defaults to @)
# Split PRIV_BASE..HEAD into a public rail and a private ladder. The public
# base is derived as projection(PRIV_BASE), not supplied.
set -euo pipefail
[ $# -ge 1 ] || { echo "usage: jj-split-stack PRIV_BASE [HEAD]" >&2; exit 2; }
priv_base="$1"; head="${2:-@}"
# Derive and validate the public base in one step.
if ! pub_base="$(jj log --no-graph -r "projection($priv_base)" -T commit_id 2>/dev/null)"; then
  echo "error: cannot derive a public base for '$priv_base' (it is unprojectable," >&2
  echo " or the public history forked). See the revset algebra below." >&2
  exit 1
fi
# Cursors are commit-ids. The stack is walked by change-id because split/rebase
# rewrite commit-ids but preserve change-ids.
public_cursor="$pub_base"
private_cursor="$(jj log --no-graph -r "$priv_base" -T commit_id)"
for c in $(jj log --no-graph --reversed -r "stack($priv_base, $head)" -T 'change_id ++ "\n"'); do
  if [ -n "$(jj diff -r "$c" --name-only public)" ]; then
    jj split -r "$c" --onto "$public_cursor" public \
      -m "$(jj log --no-graph -r "$c" -T description)"
    public_cursor="$(jj log --no-graph -r "exactly($public_cursor+ & files(public), 1)" -T commit_id)"
    jj rebase -s "$c" -o "$private_cursor" -o "$public_cursor"
  fi
  private_cursor="$(jj log --no-graph -r "$c" -T commit_id)"
done
jj bookmark set public -r "$public_cursor"
jj bookmark set private -r "$private_cursor"
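Because the script rewrites history, a read-only preview can be useful before running it. The sketch below is an assumption on my part and not part of the original tooling: preview_split_stack is a hypothetical name, and it walks the same PRIV_BASE..HEAD range with the same jj log and jj diff calls as the loop above, but only prints what would happen.

```shell
# Sketch (hypothetical helper): list which commits in PRIV_BASE..HEAD carry
# public changes and would therefore be split. Makes no changes to the repo.
# Assumes the 'public' fileset alias is configured.
preview_split_stack() {
  local priv_base="$1" head="${2:-@}" c
  for c in $(jj log --no-graph --reversed -r "${priv_base}..${head}" \
               -T 'change_id ++ "\n"'); do
    if [ -n "$(jj diff -r "$c" --name-only public)" ]; then
      echo "split: $c"   # has public changes, would be peeled onto the rail
    else
      echo "keep:  $c"   # private-only, stays where it is
    fi
  done
}
```

It takes the same arguments as the script, for example preview_split_stack private_base.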
The script leans on a few revsets that build on each other:
[revset-aliases]
# public work that is not publishable: a mixed commit, or a public-only
# edit authored on the private line. This is what blocks a clean projection.
'stranded' = 'files(public) & ~publishable'
# no stranded commit anywhere in history => the public content is fully on
# the rail
'projectable' = '~(stranded::)'
# the unique publishable commit a projectable commit sits on — its public
# base. exactly(...,1) makes it self-validating.
'projection(priv)' = 'exactly(heads(::(priv & projectable) & publishable), 1)'
'stack(base, head)' = 'base..head'
[aliases]
split-stack = ['util', 'exec', '--', 'bash', '-c', '"$JJ_WORKSPACE_ROOT"/jj/split-stack "$@"', 'jj-split-stack']
The interesting one is projection. A private base does not need me to
tell it which public commit it belongs to — the graph already determines
that. projection(priv) is the newest publishable commit in priv’s
history (heads(::priv & publishable)), restricted to the case where
priv is projectable, meaning no stranded public work sits below it.
exactly(...,1) then collapses the result to a single commit or a loud
failure: it fails when priv is unprojectable (there is public work that
has not been peeled onto the rail yet) and when the public history has
forked, so the base would be ambiguous.
So to restructure the stack ending at @ I pass only the private base:
jj split-stack private_base
and the public base is derived.
Sharp edges
This setup is simple and covers my current needs, but has
limitations. The two that matter both come from the same fact: the
boundary is a path convention. The fileset matches files under
private/ directories; it has no idea what is inside a file.
Private content can still leak through public files. The guard guarantees that no file under a private path ends up in public history, but nothing ensures that public files are free of private content: an internal URL in a README, a customer name in a comment, an unreleased feature in a docstring, or a public module that imports a private one. Such mistakes are easy to miss when the code is generated by an agent. jj filesets cannot see content, so this slips straight through and the commit still looks publishable.
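One ordinary check that can be layered on top is a deny-list scan over the public files. This is a minimal sketch under stated assumptions: scan_public_content is a hypothetical helper, and the deny-list file (one fixed string per line, such as an internal hostname or a customer name) is something you would have to maintain yourself. It catches only literal strings, not private imports or paraphrased ideas.

```shell
# Sketch (hypothetical helper): fail if any given file contains a deny-listed
# string. $1 is a file with one fixed string per line; the rest are the
# public files to scan.
scan_public_content() {
  local patterns_file="$1"; shift
  # Nothing to scan: succeed rather than let grep wait on stdin.
  [ "$#" -gt 0 ] || return 0
  # -F fixed strings, -n line numbers, -H always print the file name,
  # -f read patterns from a file. grep exits 0 when something matched.
  if grep -FnH -f "$patterns_file" -- "$@"; then
    echo "error: deny-listed content found in public files" >&2
    return 1
  fi
}
```

The list of files to scan has to come from somewhere else, for example from the tracked files of the public line.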
A private file in the wrong commit poisons the line. The
publishable revset is history-aware: a commit is publishable only if
no ancestor touched a private path. So if a private file sneaks into
a commit on the public line, deleting it later does not help — the
history still touched the private fileset, so that commit and
everything after it stays unpublishable forever. This situation can
happen when I decide to add more files to the private fileset. The
only fix is to rewrite history so the private touch never happened. jj
makes that easy mechanically, but it is surgery on the graph, not rm
followed by a commit.
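The poisoning rule is easy to see in a toy model. The sketch below is an illustration only: it handles a linear history, it checks just the private/ directory rule and not the workflow-file pattern, and publishable_prefix is a hypothetical name. The real revset works on the full commit graph.

```shell
# Toy model of the 'publishable' rule on a LINEAR history, oldest first.
# Each input line is "<commit-id> <path> [<path> ...]": the paths that commit
# changed. A delete counts as a change, so removing the file later still
# touches the private fileset.
publishable_prefix() {
  local poisoned=0 path
  local -a fields
  while read -r -a fields; do
    [ "${#fields[@]}" -gt 0 ] || continue
    for path in "${fields[@]:1}"; do
      case "$path" in
        private/*|*/private/*) poisoned=1 ;;  # simplified directory rule
      esac
    done
    if [ "$poisoned" -eq 0 ]; then
      echo "${fields[0]} publishable"
    else
      echo "${fields[0]} blocked"
    fi
  done
}

# Example: B adds a private file, C deletes it again, D is public-only.
#   printf '%s\n' 'A src/lib.rs' 'B docs/private/plan.md' \
#     'C docs/private/plan.md' 'D src/main.rs' | publishable_prefix
# prints:
#   A publishable
#   B blocked
#   C blocked
#   D blocked
```

D changes only a public file, yet it stays blocked because B and C are among its ancestors.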
Two smaller ones are worth knowing. The git.private-commits setting
lives in my local jj config. A fresh clone, a CI job, or a raw git push
does not have it. And the path guard says nothing about whether the
public line actually builds — nothing stops a public package from
importing something that only exists under a private path, which
leaves you with a public repo that does not compile. Both need
ordinary checks layered on top of the fileset machinery.
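For the first of these, a check that does not depend on jj config could run in CI on the public repo. The sketch below is one possible shape and not part of the original setup: both function names are hypothetical, and is_private_path is a hand-written approximation of the private fileset, assuming that a single * in jj's glob does not cross a path separator.

```shell
# Sketch: approximate the 'private' fileset in plain bash.
# Returns 0 if the repo-relative path counts as private.
is_private_path() {
  local path="$1" rest
  case "$path" in
    # Anything under a directory named "private", at any depth.
    private/*|*/private/*) return 0 ;;
    .github/workflows/*.private.yml)
      # A case pattern's * also matches '/', so reject nested paths by hand.
      rest="${path#.github/workflows/}"
      case "$rest" in */*) return 1 ;; *) return 0 ;; esac ;;
    *) return 1 ;;
  esac
}

# Reads one path per line on stdin; fails if any of them is private.
check_no_private_paths() {
  local path bad=0
  while IFS= read -r path; do
    if is_private_path "$path"; then
      echo "private path in public tree: $path" >&2
      bad=1
    fi
  done
  return "$bad"
}

# In a CI job on the public repo this might be run as:
#   git ls-files | check_no_private_paths
```

This only looks at the current tree. The second problem, a public line that does not build, is best caught by building the public checkout on its own in the same job.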
Conclusion
So this is my workflow right now. It is a bit rough around the edges but it gets the job done, and it stays a lot smaller than the heavier projection tooling it stands in for.
You can see and copy the complete solution at https://github.com/vihren-dev/vihren/tree/main/jj. Let me know what you think.