On 22/12/05, Anthony DiPierro wikilegal@inbox.org wrote:
On 12/22/05, David Gerard fun@thingy.apana.org.au wrote:
Anthony DiPierro wrote:
You'd have to spend a whole lot of money to get human editors to pick the "useful articles". It might pay off in the really long term, but it'd require a huge investment. And due to the GFDL some other company could just come along and take the results of that huge investment and drive you out of business anyway. I'm not at all surprised no one is doing it.
They certainly didn't for de:. Oh, wait ...
- d.
You're referring to the producers of the DVD, I assume. I don't know a whole lot about that project, but I assumed they used some automated method to select articles for inclusion (there was a mention of only using articles which were last edited by a certain selection of logged-in users), rather than having someone go through each one by hand.
First two versions were hand-processed, last version was automated.
First edition: "To produce the CD, a dump of the live Wikipedia had been copied to a separate server, where a team of seventy Wikipedians vetted the material, deleting nonsense articles and obvious copyright violations. Questionable articles were added to a special list, to be reviewed later. The final CD contained 132,000 articles and 1,200 images."
Second edition: "The vetting process was similar to the one for the CD described above and took place on a separate MediaWiki server. The process took about a week and involved 33 Wikipedians, communicating on IRC. To prevent duplication, editors would protect every article that they had reviewed; links to protected articles were shown in green. List of potential spam or vandalism had been produced ahead of time with SQL queries. Unacceptable articles were simply deleted on the spot. The final DVD contained about 205,000 articles, with every article linking to a list of contributors."
Third edition: The vetting process for this version was different and did not involve human intervention. A "white list" of trusted Wikipedians was assembled, the last 10 days of every article's history were examined, and the last version edited by a white-listed Wikipedian was chosen for the DVD. If no such version existed, the last version older than 10 days was used. Articles nominated for cleanup or deletion were not used.
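For what it's worth, the third-edition rule is simple enough to sketch in a few lines of Python. This is only my reading of the description above, not the actual script used for the DVD: the `Revision` type, the `pick_revision` function and its parameters are all hypothetical names, and I'm assuming "the last version edited by a white-listed Wikipedian" means the newest such revision inside the 10-day window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence, Set


@dataclass(frozen=True)
class Revision:
    """One entry in an article's history (hypothetical structure)."""
    timestamp: datetime
    editor: str
    text: str


def pick_revision(
    history: Sequence[Revision],
    whitelist: Set[str],
    now: datetime,
    window: timedelta = timedelta(days=10),
    flagged: bool = False,
) -> Optional[Revision]:
    """Choose the revision to ship, or None if the article is skipped.

    Assumed reading of the rule:
      1. Articles nominated for cleanup or deletion (flagged) are skipped.
      2. Within the last `window` of history, take the newest revision
         made by a white-listed editor.
      3. Otherwise fall back to the newest revision older than `window`.
    """
    if flagged:
        return None

    cutoff = now - window
    ordered = sorted(history, key=lambda r: r.timestamp)

    # Step 2: walk the recent window from newest to oldest.
    for rev in reversed(ordered):
        if rev.timestamp < cutoff:
            break
        if rev.editor in whitelist:
            return rev

    # Step 3: newest revision that predates the window, if any.
    older = [r for r in ordered if r.timestamp < cutoff]
    return older[-1] if older else None
```

Note the fallback quietly trusts anything that has survived for 10 days, on the assumption that vandalism older than that would already have been reverted. An article whose entire history is recent and has no white-listed edits gets no version at all under this reading; the description doesn't say what was actually done in that case.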
- Andrew Gray andrew.gray@dunelm.org.uk