On 2/12/15, James Salsman jsalsman@gmail.com wrote:
I invite review of this preliminary proposal for a Google Summer of Code project: http://www.mediawiki.org/wiki/Accuracy_review
If you would like to co-mentor this project, please sign up. I've been a GSoC mentor every year since 2010, and successfully mentored two students in 2012, resulting in work which has become academically relevant, including in languages which I cannot read (i.e., http://talknicer.com/turkish-tablet.pdf). I am most interested in co-mentors at the WMF or Wiki Education Foundation involved with engineering, design, or education.
Synopsis:
Create a Pywikibot to find articles in given categories, category trees, and lists. For each such article, add in-line templates to indicate the location of passages with (1) facts and statistics which are likely to have become out of date and have not been updated in a given number of years, and (2) phrases which are likely unclear. Use a customizable set of keywords and the DELPH-IN LOGON parser [http://erg.delph-in.net/logon] to find such passages for review. Prepare a table of each word in article dumps indicating its age. Convert flagged passages to GIFT questions [http://microformats.org/wiki/gift] for review and present them to one or more subscribed reviewers. Update the source template with the reviewer(s)' answers to the GIFT question, but keep the original text as part of the template. When reviewers disagree, update the template to reflect that fact, and present the question to a third reviewer to break the tie.
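The keyword-based part of the synopsis could look roughly like the sketch below. It is only an illustration, not the proposed bot: the keyword list, the function name flag_stale_passages, and the rule "keyword plus a year older than the threshold" are all assumptions made for the example, and it works on plain text rather than on wikitext fetched through Pywikibot.

```python
import re

# Hypothetical keyword set; the proposal only says the set is customizable.
STALE_KEYWORDS = ("currently", "as of", "recently", "latest", "to date")

YEAR_RE = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")   # four-digit years 1800-2099
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+")           # naive sentence splitter


def flag_stale_passages(text, current_year, max_age_years):
    """Return sentences that contain a staleness keyword and whose most
    recent mentioned year is at least max_age_years old (assumed rule)."""
    flagged = []
    for sentence in SENTENCE_RE.split(text.strip()):
        lowered = sentence.lower()
        has_keyword = any(
            re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
            for keyword in STALE_KEYWORDS
        )
        years = [int(y) for y in YEAR_RE.findall(sentence)]
        if has_keyword and years and current_year - max(years) >= max_age_years:
            flagged.append(sentence)
    return flagged


sample = (
    "The city was founded in 1850. "
    "As of 2009, the population is 12,000. "
    "Currently the mayor is unknown."
)
print(flag_stale_passages(sample, current_year=2015, max_age_years=5))
```

Only the second sentence is flagged here: the first has an old year but no keyword, and the third has a keyword but no date to judge its age by. A real bot would presumably combine this with the parser output and the word-age table.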
Possible stretch goals for Global Learning Xprize Meta-Team systems [http://www.wiki.xprize.org/Meta-team#Goals] integration TBD.
Best regards, James Salsman
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Have you run this by Wikipedians? (I'm assuming enwikipedia would be your target audience.) I would recommend making sure that enwikipedia is politically OK with this first, since it involves adding a bunch of templates to articles; it would suck for a GSoC student if their work went unused because of politics at the end.
Prepare a table of each word in article dumps indicating its age.
This in itself is a non-trivial problem (for a GSoC student anyway), assuming you need it for the entire enwikipedia and need it kept up to date as soon as people edit. Even getting the student sufficient storage and CPU resources to actually compute that could potentially be difficult (maybe?).
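To make the cost concrete, here is a minimal sketch of a per-article word-age computation, assuming the revision history is already available as (year, text) pairs ordered oldest first. The function name word_ages and the input shape are made up for the example; it uses Python's difflib to carry each surviving word's age forward from one revision to the next, and every revision of every article needs one such diff.

```python
from difflib import SequenceMatcher


def word_ages(revisions):
    """revisions: list of (year, text) pairs, oldest first (assumed input).
    Returns [(word, year_first_seen), ...] for the newest revision."""
    prev_words, prev_years = [], []
    for year, text in revisions:
        words = text.split()
        # Every word starts out as new in this revision...
        years = [year] * len(words)
        matcher = SequenceMatcher(a=prev_words, b=words, autojunk=False)
        # ...and words that survive from the previous revision keep their age.
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                years[block.b + offset] = prev_years[block.a + offset]
        prev_words, prev_years = words, years
    return list(zip(prev_words, prev_years))


history = [
    (2008, "The population is 10,000"),
    (2014, "The population is 12,000 people"),
]
for word, year in word_ages(history):
    print(word, year)
```

In this toy history "The population is" dates from 2008, while "12,000" and "people" date from 2014. Word-level diffs like this also miss moved text and reverts, which is part of why doing it well at full scale is hard.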
Convert flagged passages to GIFT questions for review and present them to one or more subscribed reviewers
Wouldn't you want to give the reviewers an actual form where they can fill out the questions, not something in a markup language? (Unless you mean you want to store them in that form internally, which seems like a rather minor implementation detail.)
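For reference, the internal-storage reading might amount to something like the sketch below, which turns a flagged passage into a GIFT true/false question string that a form could later be rendered from. The question wording, the default answer of {T}, and the function names are assumptions for illustration; the escaping follows the usual GIFT convention of backslash-escaping the control characters ~ = # { } and :.

```python
GIFT_SPECIALS = "~=#{}:"  # characters GIFT treats as control characters


def escape_gift(text):
    """Backslash-escape GIFT control characters in free text."""
    return "".join("\\" + ch if ch in GIFT_SPECIALS else ch for ch in text)


def passage_to_gift(title, passage):
    """Build a true/false GIFT question for one flagged passage.
    The question wording and the {T} default are hypothetical."""
    return "::{}::Is this statement still accurate? {} {{T}}".format(
        escape_gift(title), escape_gift(passage)
    )


print(passage_to_gift("Springfield population",
                      "As of 2009, the population is 12,000."))
```

Reviewers would never need to see this string; the form would show the passage and a yes/no choice, and only the stored record would be GIFT.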
--bawolff