[WikiEN-l] EB1911 in Wikipedia

Tim Starling tstarling at wikimedia.org
Thu Jul 24 16:16:11 UTC 2008


John Vandenberg wrote:
> Hi,
> 
> I've been told that a large percentage of the EB1911 sits within the
> history of English Wikipedia, and during a recent discussion about
> EB1911, a few checks indicated that this is possibly true, and that
> the EB1911 text imported into Wikipedia is from a decent
> transcription.  In the following very long discussion, there are two
> tables consisting of five Wikipedia articles starting with "A" and
> "B", a link to the Wikipedia revision consisting of the EB1911 text, a
> link to the copy now on Wikisource, and a link to the pagescan (set up
> by Tim Starling):
> 
> http://en.wikipedia.org/wiki/Wikipedia_talk:Plagiarism
> 
> I am interested in piecing together the history of the EB1911 import,
> because if this was as extensive as some claim, hidden in Wikipedia is
> possibly the best and most complete available transcription of EB1911,
> and I would like to work out a good algorithm to pull it out and put
> it on Wikisource, which has slowly been building an online copy that
> is true to the original.  Or maybe we can find whoever imported it,
> and re-use the import files.
> 
> This will benefit Wikipedia, as it will allow readers and editors to
> determine which parts of those Wikipedia articles have not been altered
> since 1911, which will act as a caution flag for readers, and a todo
> item for editors.  There is a WikiProject to go back and verify all of
> the articles imported from EB1911; this task can be better distributed
> if the reader can see the original text without a degree in
> wiki-archeology.
> 
> http://en.wikipedia.org/wiki/WP:EB1911
> 
> The relevant Wikisource pages people may want to look at are:
> 
> http://en.wikisource.org/wiki/EB1911
> 
>   and the "project page" for that effort is at
> 
> http://en.wikisource.org/wiki/WS:EB1911
> 
>   and the complete set of scans in TIFF and PNG; I recommend
> installing the TIFF plugin, as those images are a joy to view and the
> plugin has a nice zoom interface.
> 
> http://en.wikisource.org/wiki/User:Tim_Starling
> 

The scan was never meant to just sit on my user page. Wikisource community
members were meant to copy it to some relevant location and make links to
it. They apparently didn't figure this out. I wrote the scanset extension
for the benefit of all of Wikisource, not just for my user space.

You can find the details of the origin of the scan in my original mailing
list post about the subject. The scan was made and distributed by a person
who, for religious reasons, wanted to see this material disseminated as
widely as possible. The scan was distributed as a CD set at low cost, and
on the CD set, it was stated that there were no restrictions.

The contents of the CD were put up on a website, with the website's name
discreetly overlaid in a corner of each TIFF image. A Wikipedian downloaded
them and sent them to me. I made a script to blank out the website name
and convert the images to PNG. The result is the version that we currently
host.

-- Tim Starling



