Data cleanliness

One of the not-so-joyous elements of online collaboration is the create/merge/delete cycle.

It usually happens something like this:

      • Two or more people think a page needs to exist, and each creates it under a slightly different name, because each has a specific point of view about the content of the page.
      • Over time, the pages are expanded or otherwise improved, until someone discovers two of them and proposes that they be merged. Disagreements may ensue over which page should be kept and which merged into it, and over what content should be preserved in the final version.
      • Eventually a page is deleted. It may be the page made redundant after an orderly merge. It may be deleted pseudo-randomly by a deletionist who sees there is at least one other page. It may be deleted vengefully.

Any of these stages after the first may be recurring at any given moment.

It happens on many different platforms, but today’s example was on FamilySearch. In trying to track down Jean Guyon Sieur du Buisson[G], I discovered an entry which had been merged, with a link to the merge target. That target had itself been merged, again with a link to its target, as had the next. The one after that, however, had simply been deleted, with an annotation, in French, saying it was redundant.

None of the entries I was able to review had any particular value; most appeared unsourced. But it would have been better if, instead of being deleted, that final entry had been merged without carrying over any of its fields, simply to generate the link to the ‘correct’ page, assuming that page still exists.
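To make the difference concrete, here is a minimal sketch of following a chain of merge links. It is not FamilySearch's actual data model or API; the `Entry` class, its `status` and `merged_into` fields, and the `resolve` function are all hypothetical names invented for illustration. It only shows why a bare deletion leaves a dead end where an "empty" merge would have left a trail.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Entry:
    # Hypothetical record shape, assumed for this sketch only.
    entry_id: str
    status: str                        # "active", "merged", or "deleted"
    merged_into: Optional[str] = None  # link to the merge target, if any


def resolve(entries: Dict[str, Entry], start_id: str) -> Tuple[str, str]:
    """Follow merge links from start_id and report where the trail ends."""
    seen = set()
    current = start_id
    while True:
        if current in seen:
            return current, "cycle"    # merge links loop back on themselves
        seen.add(current)
        entry = entries.get(current)
        if entry is None:
            return current, "missing"  # link points at nothing we know of
        if entry.status == "merged" and entry.merged_into is not None:
            current = entry.merged_into
            continue
        return current, entry.status   # "active" or "deleted": trail ends here


# Three merges followed by a bare deletion: the trail goes cold at D.
dead_end = {
    "A": Entry("A", "merged", "B"),
    "B": Entry("B", "merged", "C"),
    "C": Entry("C", "merged", "D"),
    "D": Entry("D", "deleted"),
}

# The same chain, but D was merged (with no fields carried over) into E.
linked = dict(dead_end, D=Entry("D", "merged", "E"), E=Entry("E", "active"))

print(resolve(dead_end, "A"))  # ('D', 'deleted')
print(resolve(linked, "A"))    # ('E', 'active')
```

In the first case the reader ends up at a deleted entry with nowhere to go; in the second, the same number of hops lands on the surviving page.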

Ah well, the cycle will always continue.