Lots of super-critical code here. I want, like, 173% code coverage. This is the exact same stuff that led to months of discover-nasty-bug-worry-whether-we've-corrupted-data games the last time we rewrote our history code, and this code is all new. And critical. And complicated. And prone to fantastically subtle bugs with disproportionate effects.

There are also a whole pile of cases that it deals with, so this is an attempt to be systematic about enumerating them, to make sure our tests really are exhaustive.

Marking

Generating a child roster and marking it is the basic operation done when we store a revision to the db. It has to detect all kinds of nastiness that revision might have, and if there is no nastiness, it must correctly generate the metadata stored in the roster.

The first thing to worry about is node lifecycle. The rule here is that each node can only be born exactly once, and once it has died in any revision, it must be dead in all descendents of that revision.

Another important rule is that nodes never change type -- dir->file or file->dir.

So, first we need to test that the lifecycle rules are respected and violations detected.

Then, there's the actual marking. Each node has a number of scalars associated with it:

  • basename+parent
  • file content (for files)
  • attributes

Attributes also have a life cycle -- they never go away. Basically, "set_attr" and "clear_attr" are both ways to set an attribute.

Each scalar has a number of different cases, described in the multi-*-merge doc (see writeup). These include:

  • No parents: a*
  • One parent: a a | | a b*
  • Two parents: a a a a a b a b \ / \ / \ / \ / a b c a? (two cases here)

Basically, we need to test each way each of these cases can arrive, for each of the scalars. This is made more complicated by the fact that they can arise in a number of ways:

  • No parents:
    • Any node in a roster with no parents
    • A newly added node in a roster with one parent
    • A newly added node in a roster with two parents
  • One parent:
    • An old node in a roster with one parent
    • An old node that only has a parent on one side, in a roster with two parents
  • Two parents
    • An old node that exists in both parents, in a roster with two parents

Plus, attributes make this even worse:

  • No parents:
    • Any of the above cases, with an attribute newly added in this node
  • One parent:
    • Any of the cases with one node parent and a pre-existing attribute
    • An old node that exists in both parents, in a roster with two parents

Plus, hey, might as well test each thing for both files and dirs.

Merging

The main thing to test here is roster_merge.cc, in particular the roster_merge function. It is totally self-contained and easily testable (it doesn't root around in the db or anything), and entirely encapsulates the core merge logic.

In particular we should put in merges that should conflict in every possible way, and check that they in fact do, and the resulting roster_merge_result is correct. (This is somewhat fiddly, it involves particular notes in the structure, and particular sorts of quirky structure in the resulting roster.) See comments in roster_merge.cc, they should unambiguously describe what should happen in each case.

We should also check that clean merges on each scalar work correctly.