Hi,
I am a 4th year student in the Department of Computer Science and Engineering at the Indian Institute of Technology (IIT) Kharagpur, India.
I am good at programming in PHP, MySQL, JavaScript, jQuery, HTML, CSS, Java, C, C++ and Python. I have taken all the important courses, including Machine Learning, Artificial Intelligence, Algorithms, Information Retrieval, Natural Language Processing and Advanced Graph Theory.
I am very enthusiastic about working with Wikimedia in GSoC 2014. I have an idea which I believe can help improve Wikipedia content:
I want to implement a ranking system for the editors of Wikipedia (especially 3rd party editors), through which readers can distinguish between more and less reliable content on a page. This ranking system would increase content reliability. We can implement:
1. An extension which takes all the editors' information:
a. How many times the editor has edited this particular page; b. The number of pages the editor has edited and the editor's reputation (i.e. the number and type of badges).
We can get this information from the "View history" tab and then from the user info on the user page, and generate a reliability score by using data clustering (https://en.wikipedia.org/wiki/Cluster_analysis) for every line/paragraph of content on every Wikipedia page. After installing this extension, a user can right-click on any line to see the reliability score, all editor info and the history in concise form.
2. Show a line/paragraph in a different color if its content is very new and its reliability score is low.
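As a rough illustration of the two items above, here is a minimal Python sketch. It assumes the page history has already been fetched into simple records; the `Revision` shape, the `total_edits` and `badges` inputs, the weights and the thresholds are all hypothetical and not part of any MediaWiki API. It also uses a plain weighted sum instead of the data clustering mentioned above, just to keep the example short.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """One edit from a page's history (hypothetical record shape)."""
    editor: str
    paragraph: int  # index of the paragraph the edit touched


def paragraph_scores(history, total_edits, badges):
    """Average a naive per-editor weight over the edits to each paragraph.

    history     -- list of Revision for one page
    total_edits -- editor -> number of edits across all pages (assumed input)
    badges      -- editor -> number of badges (assumed input)
    """
    # Item a: how many times each editor has edited this particular page.
    page_edits = Counter(rev.editor for rev in history)

    def editor_weight(editor):
        # Illustrative weights only; this measures quantity, not quality.
        return (page_edits[editor]
                + 0.01 * total_edits.get(editor, 0)
                + badges.get(editor, 0))

    sums, counts = Counter(), Counter()
    for rev in history:
        sums[rev.paragraph] += editor_weight(rev.editor)
        counts[rev.paragraph] += 1
    return {p: sums[p] / counts[p] for p in sums}


def needs_highlight(score, age_days, min_score=3.0, max_age_days=7):
    """Item 2: flag content that is both very new and low-scoring."""
    return age_days <= max_age_days and score < min_score


history = [Revision("Alice", 0), Revision("Alice", 1), Revision("Bob", 1)]
scores = paragraph_scores(history, {"Alice": 500, "Bob": 10}, {"Alice": 2})
```

With this sample data, paragraph 0 (edited only by Alice) gets a higher score than paragraph 1 (edited by both Alice and Bob), and `needs_highlight` would flag only content that is both recent and below the chosen threshold.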
Please let me know if I should go with this idea. If not, please guide me on how to start working on a different idea.
Thanks and Regards
Devender (LinkedIn profile: http://in.linkedin.com/pub/devender-bindal/27/70b/133/)
4th Year Student, Dual Degree (B.Tech + M.Tech)
Computer Science and Engineering, IIT Kharagpur
Phone: +91-8967224480
Alternate email: devender@cse.iitkgp.ernet.in
Hi Devender, I'm not a developer but I hope my feedback as an editor is useful.
On 03/06/2014 12:02 AM, Devender wrote:
I want to implement a ranking system for the editors of Wikipedia (especially 3rd party editors), through which readers can distinguish between more and less reliable content on a page.
What do you mean by "3rd party editors"?
This ranking system will increase the content reliability
Content reliability is indeed an interesting value for wiki content, especially in projects like Wikipedia. However, basing the reliability of the content on the quantity of edits done by an editor is risky, to say the least.
Reliability is based on quality, not quantity. If you could find a way to assess the quality of an editor's edits (and therefore the reliability of that editor), then maybe you could provide a hint about the reliability of an article based on the reliability of the editors who edited it.
Even in that case it might be complex to figure out when the reliable editors are acting to add more quality to an already good article, or to fix the worst issues of a horrible article. When they add and when they revert...
And of course it may also happen that editors not identified as reliable produce great content, as is often the case with editors who are very specialized in a certain topic and have a short history of excellent edits.
2. Show a line/paragraph in a different color if its content is very new and its reliability score is low.
Even if there is some probability that older paragraphs that have survived many edits intact are somewhat reliable, it is too easy to find examples disproving this point. This is especially true of the articles most in need of a quality assessment: those that are not edited often and are not watched by many experienced editors.
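To make the "survived many edits" heuristic concrete, here is a minimal Python sketch. It assumes each revision of a page is available as a list of paragraph strings, oldest first; that input format and the function name are hypothetical. It only measures how long a paragraph has stayed unchanged, so it shares the weakness described above: on a rarely watched article, an error survives just as well as a correct statement.

```python
def revisions_survived(revisions):
    """Map each paragraph of the latest revision to the number of
    consecutive most-recent revisions in which it appears unchanged.

    revisions -- list of revisions, oldest first; each revision is a
                 list of paragraph strings (assumed input format)
    """
    latest = revisions[-1]
    survived = {}
    for paragraph in latest:
        count = 0
        # Walk backwards from the newest revision until the text differs.
        for older in reversed(revisions):
            if paragraph not in older:
                break
            count += 1
        survived[paragraph] = count
    return survived


revisions = [
    ["Intro text.", "Old claim."],
    ["Intro text.", "Revised claim."],
    ["Intro text.", "Revised claim.", "Brand new line."],
]
ages = revisions_survived(revisions)
```

Here `ages` maps "Intro text." to 3, "Revised claim." to 2 and "Brand new line." to 1, which says nothing about whether any of the three is correct.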
Please let me know if I should go with this idea. If not, please guide me on how to start working on a different idea.
This is just my personal opinion and I'm not an expert. Maybe someone else will have a different, more positive opinion about your project, or advice on how to refocus it.
In general, students proposing new projects have more chances of success if they start pitching and testing their ideas months before the GSoC. Add a factor of x5 at least if your main target is a Wikimedia project.
If you don't get mentors for your project very soon, then the safest option is to choose a project at https://www.mediawiki.org/wiki/Summer_of_Code_2014 and go for it.
Thank you for your interest in contributing to Wikimedia. Also thank you for following my suggestion to post at wikitech-l. I hope you will get more feedback from other people on this list.
Hi, Devender. Have you looked at WikiTrust[0]? It does roughly what you describe (though I don't think the live demo works anymore).
See also https://meta.wikimedia.org/wiki/Research:Content_persistence
On Thu, Mar 6, 2014 at 7:32 PM, Benjamin Lees emufarmers@gmail.com wrote:
Hi, Devender. Have you looked at WikiTrust[0]? It does roughly what you describe (though I don't think the live demo works anymore).
[0] https://en.wikipedia.org/wiki/Wikipedia:WikiTrust
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
I oppose such an idea or implementation. Automating the ranking of content sounds like a way to get people to focus aggressively on the rank/score instead of on the human work on content. They already focus on 'number of GA reviews' and 'number of FAs I contributed to', relying on style and content guides for these more than on the concept of freedom of knowledge. I recently created https://meta.wikimedia.org/wiki/Karma in an attempt to start gathering examples of such blindly misleading work.
If implemented, I dare to ask that the thing is opt-in...
On Fri, 7 Mar 2014, at 10:25, Quim Gil wrote:
-- Quim Gil Technical Contributor Coordinator @ Wikimedia Foundation http://www.mediawiki.org/wiki/User:Qgil
Hello, Devender, and thank you for being interested in open source and in Wikimedia!
Are you proposing to do this for *Wikipedia* specifically, or just as a MediaWiki extension that any MediaWiki installation might choose to install? If you want to do this to Wikipedia, then I think you haven't thought enough about this: https://www.mediawiki.org/wiki/Summer_of_Code_2013#Your_project
"MediaWiki != Wikipedia: YES to generic MediaWiki projects. YES to projects already backed by a Wikimedia community. NO to projects requiring Wikipedia to be convinced."
What you are proposing would require that you persuade the Wikipedia community that it is a good idea. In our experience, one summer is not long enough for you to do that and build it as well. If you want to change the experience for ALL of the users of all the Wikimedia sites, you need a lot of time for community discussion and design.
Let me give you some examples.
Good: https://www.mediawiki.org/wiki/Summer_of_Code_2014#UniversalLanguageSelector... see that this is an idea for an improvement to the software that simply improves an existing experience. So that's fine.
Bad: "my idea is to change English Wikipedia so that instead of a main page full of text, the main page is an interactive video/animation that autoplays." (I, Sumana, made this up.) That idea is a bad fit for several reasons, but the thing I would want to concentrate on here, Devender, is that this video idea would require that the proposer negotiate with the English Wikipedia community to get them to accept her idea and change the way they do things.
It's reasonable for large, long-term projects (like the VisualEditor or Flow or mobile editing/uploading) to have these kinds of conversations, to work with the wiki communities to rearrange workflow, or the basic way edits are displayed, or something like that. For Google Summer of Code/OPW projects, it just seems like a bad idea -- it doesn't seem like there would be enough time for that kind of consultation and negotiation.
So, how can you get an idea that's good, and that makes it easier for people to understand the quality of Wikipedia edits and Wikipedia edit history, but that you can do in three months? If you are interested in that, I suggest that you take a look at the existing projects at WMF analytics https://www.mediawiki.org/wiki/Analytics , and the existing work by researchers who care about this topic. https://meta.wikimedia.org/wiki/Research That should give you some ideas.
Thanks again!
Sumana Harihareswara Engineering Community Manager Wikimedia Foundation