Hi Everyone,
I am a GSoC student working with the Parsoid team [1] to build a Parsoid-based linter, Linttrap [2]. Linttrap will detect broken wikitext found on wiki pages and will also collect stats about certain wikitext usage patterns.
For this demo, Linttrap can currently detect four types of broken wikitext; other kinds of issues could be detected in the coming weeks and months:
* Fostered Wikitext: e.g. [3]
* Missing End Tag: e.g. [4]
* Missing Start Tag: e.g. [5]
* Stripped Tags: e.g. [6]
Linttrap also collects information about transclusion usage where multiple templates are used to construct a single DOM structure [7]. Here is our stats page [8].
Once a page is parsed, Linttrap uses a Parsoid-based logger facility to log the detected issues to a web service that we call Lintbridge [9]. Lintbridge is currently hosted on Wikimedia Labs and uses MongoDB to store all the issues. It offers a REST API, which bots and other applications can use to fix the broken wikitext. Linttrap uses the same REST API to store issues in Lintbridge.
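To illustrate how a client might store an issue through the REST API, here is a minimal Python sketch that builds a request for the POST /_api/add route. It makes assumptions this mail does not confirm: that the route accepts a JSON body, and that an issue carries fields named "wiki", "page", "revision", "type" and "src". Those field names are hypothetical and chosen only to mirror what the HTML app displays. The request is built but not sent.

```python
import json
import urllib.request

LINTBRIDGE = "http://lintbridge.wmflabs.org"  # base URL from this mail


def build_add_request(issue, base=LINTBRIDGE):
    """Build (but do not send) a POST request for the /_api/add route.

    Assumption: the route accepts a JSON-encoded body.
    """
    body = json.dumps(issue).encode("utf-8")
    return urllib.request.Request(
        base + "/_api/add",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical issue record; the real field names may differ.
example_issue = {
    "wiki": "enwiki",
    "page": "Example",
    "revision": 12345,
    "type": "fostered",
    "src": "<table>fostered text<tr><td>cell</td></tr></table>",
}

request = build_add_request(example_issue)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the request construction separate from sending makes it easy to inspect the payload before anything is written to the service.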
We have also built an HTML app on top of Lintbridge [10]. It is a basic app for now, used to demonstrate Linttrap's abilities, but it is already quite useful as it is. Feel free to browse the issues.
* You can use the links in the table to filter the issues.
* Click on an issue type to filter issues by type.
* You can filter issues by page too.
* You can use Fix All to fix all issues for that page.
* You can also use the filters on the top bar to filter by Wiki and Type.
* Each issue shows information about the wiki, page, and revision on the left and the wikitext snippet on the right.
Just for the demo of this working prototype, we have collected issues by parsing 1000 pages picked from http://parsoid-tests.wikimedia.org/topfails. If you want to try the JSON API, you can use the following routes:
GET /_api/issues : Show all issues (http://lintbridge.wmflabs.org/_api/issues)
GET /_api/issues/type/issue-type : Filter by issue type (http://lintbridge.wmflabs.org/_api/issues/type/fostered)
GET /_api/enwiki/issues : Filter by wiki, here enwiki (http://lintbridge.wmflabs.org/_api/enwiki/issues)
POST /_api/add : Add an issue to Lintbridge

Feedback is welcome.
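For anyone who wants to script against the GET routes listed above, here is a minimal Python sketch. The helper names issues_url and fetch_issues are made up for illustration, and nothing is assumed about the shape of the JSON the service returns beyond it being JSON.

```python
import json
import urllib.parse
import urllib.request

LINTBRIDGE = "http://lintbridge.wmflabs.org"  # base URL from this mail


def issues_url(wiki=None, issue_type=None, base=LINTBRIDGE):
    """Return the URL for one of the three GET routes listed above."""
    if wiki is not None and issue_type is not None:
        # The mail lists no route that combines both filters.
        raise ValueError("filter by wiki or by issue type, not both")
    if wiki is not None:
        return f"{base}/_api/{urllib.parse.quote(wiki, safe='')}/issues"
    if issue_type is not None:
        quoted = urllib.parse.quote(issue_type, safe="")
        return f"{base}/_api/issues/type/{quoted}"
    return f"{base}/_api/issues"


def fetch_issues(url, timeout=10):
    """Fetch a route and decode its JSON body (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, fetch_issues(issues_url(issue_type="fostered")) would request the fostered-content route shown above.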
Thank you,
Hardik Juneja
--
[1] Parsoid: https://www.mediawiki.org/wiki/Parsoid
[2] Linttrap: https://www.mediawiki.org/wiki/Parsoid/Linting/GSoC_2014_Application
[3] Fostered example: http://lintbridge.wmflabs.org/_html/issues/53a2fe1e94641d9101f8e8b2
[4] Missing End Tag example: http://lintbridge.wmflabs.org/_html/issues/53a2fca994641d9101f8e72a
[5] Missing Start Tag example: http://lintbridge.wmflabs.org/_html/issues/53a2fdc394641d9101f8e850
[6] Stripped Tag example: http://lintbridge.wmflabs.org/_html/issues/53a2fd4394641d9101f8e819
[7] Mixed Template example: http://lintbridge.wmflabs.org/_html/issues/53a3026794641d9101f8ea81
[8] Stats page: http://lintbridge.wmflabs.org/stats
[9] Lintbridge: http://lintbridge.wmflabs.org/
[10] HTML app: http://lintbridge.wmflabs.org/_html/issues