[Foundation-l] Alternative approach for better video support
Robert Horning
robert_horning at netzero.net
Thu Jul 26 15:32:51 UTC 2007
Michael Snow wrote:
> Anthony wrote:
>
>> There are differing interpretations of what a transparent format is
>> (most of which are pretty obviously incorrect), but distributing more
>> than 100 copies on paper without providing any digital copy at all
>> pretty clearly violates the requirement to have a machine readable
>> copy.
>>
> Leaving aside for a moment the current state of GFDL legalism vis-a-vis
> technology, I don't see any fundamental reason why a paper copy couldn't
> qualify as machine-readable. There are some pretty substantial endeavors
> focusing on just that sort of thing.
>
> If you meant that it fails to meet the GFDL's definition of
> "transparent" you might have a stronger point. But that's a
> legalism-and-technology issue.
>
> Despite the charges some pundits like to raise, there is no
> philosophical reason for us to be enemies of the printed word. Let's not
> allow our technological inadequacies to lead us into dismissing the
> medium that has, over the course of history, spread more free knowledge
> to more people than the Wikimedia Foundation has ever managed.
>
> --Michael Snow
>
>
I would point out that none other than Richard Stallman himself has
addressed this topic (I still need to see if I can find the source; I
heard him explain this at a conference on the GFDL where I was in
attendance).
The problem with considering a paper copy as a non-opaque source is that
the content has to be transcribed, or scanned using OCR techniques, before
it can be edited again. Very few fonts are explicitly designed to be
easily read by OCR software (such fonts exist, but they are not commonly
used). The main issue is that transcription, or even cleanup after a
clean OCR scan, is a very labor intensive process that is unnecessary in
the era of digital technologies. Even once the text is transcribed, you
still have to put the formatting information back in, and the error rate
for scanning even a very clear document with a good font is far from
negligible. My experience with Distributed Proofreaders is that clean
scans still have about a 1% error rate, confusing letters like c and o,
or mistaking punctuation like the comma and the period. And these are
just the common problems that happen with a good OCR scan.
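
To put that 1% figure in perspective, here is a rough back-of-the-envelope
sketch in Python. The page size (about 2,000 characters) and book length
(300 pages) are assumptions picked for illustration, not numbers from
Distributed Proofreaders, and the function name is made up.

```python
def expected_ocr_errors(chars_per_page: int, pages: int, error_rate: float) -> float:
    """Expected number of misread characters, assuming a uniform per-character error rate."""
    return chars_per_page * pages * error_rate


# Assumed figures: ~2,000 characters per page, a 300-page book, 1% error rate.
per_page = expected_ocr_errors(2000, 1, 0.01)
per_book = expected_ocr_errors(2000, 300, 0.01)

print(f"errors per page: {per_page:.0f}")
print(f"errors per book: {per_book:.0f}")
```

Under those assumptions every page needs a couple dozen hand corrections,
which is where the labor goes.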
In short, a printed paper version is an opaque format and requires the
"source code" somewhere else.
I'm not sure if a 2-D barcode that holds the full contents of the
document would qualify as a non-opaque format, but that is something to
think about as well, if you don't want to include something like a CD as
an appendix or maintain a website with the data. But the raw text alone
is not sufficient.