[Foundation-l] Alternative approach for better video support

Thu Jul 26 15:32:51 UTC 2007

Michael Snow wrote:
> Anthony wrote:
>   
>> There are differing interpretations of what a transparent format is
>> (most of which are pretty obviously incorrect), but distributing more
>> than 100 copies on paper without providing any digital copy at all
>> pretty clearly violates the requirement to have a machine readable
>> copy.
>>     
> Leaving aside for a moment the current state of GFDL legalism vis-a-vis 
> technology, I don't see any fundamental reason why a paper copy couldn't 
> qualify as machine-readable. There are some pretty substantial endeavors 
> focusing on just that sort of thing.
>
> If you meant that it fails to meet the GFDL's definition of 
> "transparent" you might have a stronger point. But that's a 
> legalism-and-technology issue.
>
> Despite the charges some pundits like to raise, there is no 
> philosophical reason for us to be enemies of the printed word. Let's not 
> allow our technological inadequacies to lead us into dismissing the 
> medium that has, over the course of history, spread more free knowledge 
> to more people than the Wikimedia Foundation has ever managed.
>
> --Michael Snow
>
>   

I would point out that none other than Richard Stallman himself has 
addressed this topic.  (I still need to see if I can find the source; I 
heard him explain this at a conference on the GFDL where I was in 
attendance.)

The problem with considering a paper copy as a non-opaque source is that 
the content has to be transcribed, or scanned using OCR techniques, 
before it can be edited again.  Very few fonts are explicitly designed 
to be easily read with OCR technologies (such fonts exist, but they are 
not commonly used).  The main issue is that transcription, or even 
cleanup after a clean OCR scan, is a very labor-intensive process, and 
an unnecessary one in the era of digital technologies.  Beyond the 
transcription itself, you still have to put the formatting information 
back in, and the error rate for scanning even a very clear document 
with a good font is high.  My experience with Distributed Proofreaders 
is that clean scans still have about a 1% error rate: confusing letters 
like c and o, or mistaking punctuation like the comma and the period.  
And these are just the common problems that happen with a good OCR scan.
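
To put that 1% figure in perspective, here is a rough sketch of the 
arithmetic.  It assumes the rate is per character and that errors are 
independent, and the page and book sizes (2,000 characters, 300 pages) 
are numbers I picked for illustration, not measurements.

```python
# Back-of-the-envelope OCR cleanup estimate.
# Assumptions (mine, for illustration only): the error rate applies per
# character, errors are independent, and a page holds ~2,000 characters.

def expected_errors(chars_per_page: int, error_rate: float, pages: int = 1) -> float:
    """Expected number of wrong characters across the given pages."""
    return chars_per_page * error_rate * pages


def prob_page_clean(chars_per_page: int, error_rate: float) -> float:
    """Chance that a single page comes through with no errors at all."""
    return (1.0 - error_rate) ** chars_per_page


if __name__ == "__main__":
    per_page = expected_errors(2000, 0.01)
    per_book = expected_errors(2000, 0.01, pages=300)
    print(f"errors per page: {per_page:.0f}")
    print(f"errors per 300-page book: {per_book:.0f}")
    print(f"chance a page is error-free: {prob_page_clean(2000, 0.01):.2e}")
```

Under those assumptions every page needs a proofreader's attention, 
which is the labor I am talking about.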

In short, a printed paper version is an opaque format and requires the 
"source code" somewhere else.

I'm not sure if a 2-D barcode that would hold the full contents of the 
document would qualify as a non-opaque format, but that is something 
to think about as well, if you don't want to include something like a 
CD as an appendix or maintain a website with the data.  
But the raw text alone is not sufficient.
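
On the barcode idea, a quick sketch of how many codes a document would 
need.  I am assuming QR codes here (the choice of symbology is mine), 
using the largest size, version 40 at the lowest error-correction 
level, which holds 2,953 bytes in byte mode.  The function name is made 
up for this example.

```python
import math

# Assumption: QR code, version 40, error-correction level L, byte mode.
# That combination stores at most 2,953 bytes per symbol.
QR_V40_L_BYTES = 2953


def codes_needed(document: bytes, capacity: int = QR_V40_L_BYTES) -> int:
    """Number of barcodes needed to hold the document, ignoring any
    per-code header for sequencing the pieces back together."""
    return math.ceil(len(document) / capacity)


if __name__ == "__main__":
    # Stand-in for a plain-text article of about 30,000 bytes.
    article = b"x" * 30_000
    print(codes_needed(article))
```

That count is before any compression and covers only the text, so 
markup and images would push it higher.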