There's nothing special about sub flakes here. You can build dependency graphs out of nix flakes whether or not they share a git repo. Those graphs let you control which versions of which flakes come into contact with each other, and from there you can reason inductively about which version contains the problematic code.
That's all I'm doing here.
But for the last year (since this was merged) it has been possible to have more than one flake in a repo. Since then I've been wanting to try structuring a project specifically with this kind of reasoning in mind.
Previously, new tests typically lived in the same repo as the code they tested, so when you walked your repo back in time, you also ended up walking your tests back in time.
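To make the inductive reasoning concrete, here is a minimal sketch in Python of the search it enables once the tests are held fixed and only the code under test is walked back. The revision names, `first_bad`, and `fails_pinned_tests` are hypothetical stand-ins, not anything from a real project, and the sketch assumes that once a revision is bad, every later one is bad too.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_bad is true.

    Assumes versions[0] is good, versions[-1] is bad, and that
    badness, once introduced, persists in every later version.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] good, versions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # the breakage is at mid or earlier
        else:
            lo = mid  # the breakage is after mid
    return versions[hi]


# Hypothetical history of the flake under test, oldest first.
revs = ["a1", "b2", "c3", "d4", "e5", "f6"]

# Stand-in for "build this revision against the fixed tests and run them".
# Here we simply pretend the breakage landed in d4.
broken_since = "d4"


def fails_pinned_tests(rev: str) -> bool:
    return revs.index(rev) >= revs.index(broken_since)


culprit = first_bad(revs, fails_pinned_tests)
print(culprit)
```

In practice `is_bad` would build the given revision against the fixed tests and report whether they fail; that wiring is omitted here. The point is only that the tests stay put while the revision under test moves.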