On Wed, Mar 2, 2011 at 00:50, Nikhil Sheth <nikhil.js@gmail.com> wrote:
Hi All,

I had a great time reading the blogpost. Very amusing. But had to clarify some things before we all pick up our pitch-forks and torches, so..

http://nikhilsheth.blogspot.com/2011/03/clarification-about-wfsoe-and-english.html

I think you took it a bit personally. Shiju's blog post, IMHO, is a good guide, especially for Indian-language CD creation projects. Giving out ZIM format dumps has some adverse effects, which is what Shiju tried to share as a best practice. I think these are valid concerns. Let me explain.

I think he clearly mentioned the http://schools-wikipedia.org/ project for the English wiki, and there is no problem with such approaches. I think the blog post was mainly addressing the impact of giving out whole dumps of Indian-language Wikipedias, all of which [except English (yes, English is also an Indian language)] are in their early stages.

Some time back I proposed a similar CD creation project on Tamil Wikipedia and thought that, given that Santhosh's software is available, we could get a CD out in a couple of months' time. My timeline was rightfully mocked by the community, with valid concerns about quality, and we have just started collecting articles, which will then be peer reviewed, copy edited, etc. before we can use the software and create a CD. Along with Shiju's points, I am sharing some important viewpoints (which I learnt from the Tamil community) on why I think the whole exercise of fact checking before releasing is very important, instead of just dumping the content, in the case of Indian-language Wikipedias.

** Who had checked the validity of the content provided in the CD? Once the CD is created, the content is frozen for ever. We never know in what ways the content in CD is going to get duplicated.

>> This is very important. Imagine students using it to write answers for exams: if there is a factual error and the teacher does not give marks, the students will never turn back to Wikipedia again. We are losing young buds here. While you might say that Wikipedia is not a reliable information source but can still be used, for a school kid it is always binary. Indian-language Wikipedias are bound to have more spelling mistakes (since most of us use a non-native keyboard / translation, and we are human), so the error rate will be even higher. It's not just about schools: anyone who finds such errors in a published form (a CD is a published form that you can't change) will find it annoying and may arrive at a prejudice which we wouldn't want.

** How are we going to handle copyright violations in the text and images included in the CD? Can we always point our fingers at WMF?

>> This is a very serious issue. It's a different matter that we as individuals might not respect copyright because of the climate in India, but as a project Wikipedia holds copyright as one of the 5 pillars (do I need to tell this list about this? Sorry *runs away* :D).

** Who will answer the queries related to the explicit images contained in many articles? The inclusion of explicit images, controversial articles, and factual errors in a CD supplied to school children are not small issues.

>> Not many know this; Shiju may be sharing it because the Malayalam community actually burnt their fingers here. There was a (politically motivated?) article in a leading Malayalam daily/magazine criticizing the effort over its inclusion criteria and some factually erroneous information. This may not have been shared widely, to avoid multiplying the negative publicity. While at a personal level it hurts the people who worked hard on this, spending weeks on the project, at the project level it creates negative publicity for the language Wikipedia, which none of us would want.

>> On the content side as well, Indian-language Wikipedias aren't that great (barring a few). The stub ratio may be high, which will give people using the offline ZIM dumps the wrong impression that language Wikipedias only have templated stub content. While it's good to have stubs online, since they help in the improvement of articles, I don't fancy stubs in an offline Wikipedia.

>> While giving out ZIM format dumps might be useful for people without connectivity, IMHO they are definitely not suitable for mass consumption without a bunch of disclaimers (which might not be understood even by the well educated).

>> While the intentions may be good, they shouldn't cause a negative impact on the language Wikipedia; that is the bottom line. I hope you understand. (This was probably missing in Shiju's blog post, which made you think it was pointing fingers, but I am sure Shiju's intention was to share his "bottom line".)

Regards
Srikanth.L
http://srik.me