On 24/05/05, Brion Vibber brion@pobox.com wrote:
The person who submitted an edit is not always the author of everything that they added: material may be copied from another article written by someone else, or even taken from an entirely separate resource that was under the GFDL or compatible license terms, which might not be compatible with a new license.
Hmm, that is indeed a good and rather worrying point: content imported under the GFDL or similar and compatible licences probably *should* be credited as such in some way which could easily be recognised - if not by computer, then by a dedicated team of humans telling the computer. But material copied from other articles - merges, splits, not to mention translations - is generally treated very laxly, with hopefully a reference in the edit summary, but not always even that.
Put it together with the thorny question of "when is a rewrite a rewrite", and it makes you wish for a meaningful "blame"/"who added this line" tool - though I tend to agree with the opinion that this would be an order of magnitude harder for the free text of encyclopedia articles than it is for source code. Although, it has to be noted that those IBM researchers managed to get meaningful data in their "history flow" system...
My point is that if relicensing of this sort *were* ever attempted on Wikipedia, it would be impossible to do it properly without some such tool to trace where every bit of the text came from, and to check (manually, since edit summaries aren't likely to be reliably machine-readable) that the crucial edits were not only made by editors who agree to the new license, but also came from sources which were themselves compatible with it.
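
To make the "blame" idea a bit more concrete, here is a rough sketch of word-level attribution over a list of revisions, using Python's standard difflib. It is purely illustrative and not how MediaWiki or the "history flow" system works: the function names `blame` and `authors_without_consent`, the (author, text) revision format and the `agreed` set are all assumptions of mine. Note that it credits every new word to whoever submitted the edit, which is exactly the weakness discussed above: text pasted in from another article or an outside source still shows up under the submitter's name.

```python
from difflib import SequenceMatcher


def blame(revisions):
    """Attribute each word of the latest revision to a submitter.

    `revisions` is a chronological list of (author, text) pairs
    (a hypothetical format, chosen for this sketch only).
    Returns a list of (word, author) pairs for the final text.
    """
    attributed = []  # (word, author) pairs for the previous revision
    for author, text in revisions:
        words = text.split()
        old_words = [w for w, _ in attributed]
        matcher = SequenceMatcher(a=old_words, b=words, autojunk=False)
        current = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged words keep their earlier attribution.
                current.extend(attributed[i1:i2])
            elif tag in ("replace", "insert"):
                # New or rewritten words go to the submitting editor,
                # even if they were really copied from elsewhere.
                current.extend((w, author) for w in words[j1:j2])
            # "delete": the words are gone, nothing to carry forward.
        attributed = current
    return attributed


def authors_without_consent(attributed, agreed):
    """Submitters with surviving words who are not in `agreed`."""
    return {author for _, author in attributed} - set(agreed)


if __name__ == "__main__":
    history = [
        ("alice", "the cat sat on the mat"),
        ("bob", "the cat sat quietly on the mat"),
        ("carol", "the dog sat quietly on the mat"),
    ]
    result = blame(history)
    print(result)
    print(authors_without_consent(result, agreed={"alice", "carol"}))
```

Even if something like this ran over full article histories, the output would only be a starting list for the manual check: each surviving span would still need a human to confirm where the text originally came from.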