Re: [Wikimedia-search] [QA] [reading-wmf] top articles across languages for testing?

20 May 2015

For Parsoid, we run tests [1] against a set of 160K articles that we 
randomly picked a couple of years back: about 10K articles from each of 
16 wikis. We run roundtrip tests (wikitext -> html -> wikitext) and 
compare diffs, and we also run trivial edit tests (wikitext -> html -> 
add comment at end of page -> wikitext) to check how clean our 
roundtripping is.
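
As a rough illustration of the two checks described above, here is a 
minimal Python sketch. It is not Parsoid's actual test code: `wt2html` 
and `html2wt` are hypothetical stand-ins for the real converters, and 
the "edit" simply appends an HTML comment.

```python
import difflib


def wt2html(wikitext: str) -> str:
    # Hypothetical stand-in for the real wikitext -> html conversion:
    # wraps every line in a <p> element.
    return "\n".join(f"<p>{line}</p>" for line in wikitext.split("\n"))


def html2wt(html: str) -> str:
    # Hypothetical inverse of wt2html: unwraps <p> elements and passes
    # any other line (e.g. an added comment) through unchanged.
    out = []
    for line in html.split("\n"):
        if line.startswith("<p>") and line.endswith("</p>"):
            line = line[3:-4]
        out.append(line)
    return "\n".join(out)


def add_comment(html: str) -> str:
    # The "trivial edit": add a comment at the end of the page.
    return html + "\n<!-- test edit -->"


def roundtrip_diff(wikitext: str, edit=None) -> list[str]:
    # wikitext -> html -> (optional edit) -> wikitext, then diff the
    # result against the original. An empty list is a clean roundtrip.
    html = wt2html(wikitext)
    if edit is not None:
        html = edit(html)
    result = html2wt(html)
    return list(
        difflib.unified_diff(
            wikitext.splitlines(),
            result.splitlines(),
            "original",
            "roundtripped",
            lineterm="",
        )
    )
```

For the edit test, a clean result is a diff whose only change is the 
added comment line; any other added or removed line would point at a 
roundtripping problem.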

This testing has been extremely good at telling us when something is 
broken vs. when something is good to be deployed. Checking these results 
is part of our deployment process. We also collect performance 
statistics in each testing run. However, our testing database and its 
schema are not tuned well enough to let us actually track performance 
regressions, so that data has just sat in the db without being used for 
anything.

But we have also recently been talking about:
* refreshing this set to pick a more proportional selection of articles 
from the different wikis (more from enwiki, fewer from others, etc.); 
we have not done this yet.
* adding a separate, non-random set of pages that are particularly 
important (featured articles, etc.). So we would be interested in any 
set of articles that is considered important enough to be regularly 
tested against.
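
One way the proportional selection in the first bullet could be worked 
out is sketched below. This is only an assumption about how it might be 
done, not something we have implemented: the function name is 
hypothetical, and it uses largest-remainder rounding so that the 
per-wiki quotas add up exactly to the requested sample size.

```python
def proportional_quota(article_counts: dict[str, int], total: int) -> dict[str, int]:
    # Split `total` sample slots across wikis in proportion to each
    # wiki's article count. Assumes at least one wiki has articles.
    grand_total = sum(article_counts.values())
    exact = {wiki: total * n / grand_total for wiki, n in article_counts.items()}

    # Start from the rounded-down share of each wiki.
    quota = {wiki: int(share) for wiki, share in exact.items()}

    # Hand the slots lost to rounding to the wikis with the largest
    # fractional remainders.
    leftover = total - sum(quota.values())
    by_remainder = sorted(exact, key=lambda w: exact[w] - quota[w], reverse=True)
    for wiki in by_remainder[:leftover]:
        quota[wiki] += 1
    return quota
```

With made-up article counts of 50, 30 and 20 and a sample size of 10, 
this yields quotas of 5, 3 and 2. The articles within each wiki would 
then still be picked at random, as they are today.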

This map-reduce style testing code is general enough that it could be 
repurposed for other kinds of testing. For example, we have already 
repurposed the same rt-testing code for running visual diffs (comparing 
phantomjs renderings of php parser output and parsoid output for the 
same title) on a randomly selected set of about 800 enwiki articles [2].
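
The general shape of such a harness can be sketched in a few lines of 
Python. This is an illustration only, not the rt-testing code itself: 
`run_test` is a hypothetical placeholder for whatever per-title check 
is plugged in (a roundtrip diff, a visual diff, and so on).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_test(title: str) -> str:
    # Hypothetical placeholder for a real per-title check. It fails
    # any title containing "broken" so there is something to report.
    return "fail" if "broken" in title.lower() else "pass"


def run_all(titles, test=run_test, workers=4):
    titles = list(titles)
    # Map step: run the same test on every title, several at a time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(test, titles))
    # Reduce step: collapse per-title results into a summary plus the
    # list of titles that need a closer look.
    summary = Counter(results)
    failures = [t for t, r in zip(titles, results) if r != "pass"]
    return summary, failures
```

Swapping in a different `test` function is all that changes between 
one kind of testing and another; the map and reduce steps stay the same.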

This kind of testing is essential for our deployments. I am not sure 
whether it is appropriate for other teams, but I am sharing it just in 
case.

Subbu.

[1] See http://parsoid-tests.wikimedia.org/topfails and 
http://parsoid-tests.wikimedia.org/commits. The main page is 
http://parsoid-tests.wikimedia.org but it can sometimes time out when 
the db is clogged and old test results need clearing out.

[2] http://parsoid-tests.wikimedia.org/visualdiff/ with code at 
https://github.com/subbuss/parsoid_visual_diffs

On 05/20/2015 01:48 AM, Elena Tonkovidova wrote:
...
 On
 https://docs.google.com/spreadsheets/d/14Ei-KWYbZcmvT70irx6NGIJCi17tF2o1szX…
 there are articles that I usually check when I do regression testing.

 One group is a set of articles that used to have some sort of 
 performance/display issues:
 - Barack Obama, Cat, India, Richard Nixon, Europe, English language

 Another group of articles is where images or the Image Gallery are 
 tested (gif, svg, image map, charts, timeline, large number of images 
 in the Image Gallery):

 - *Claude Monet* - extensive Image Gallery (different image sizes)
 - *List of go games* - many svg images
 - Lilac chaser, Caridoid escape reaction - animated (gif) images
 - *The Club (dining club), Image map* - for image map images
 - *Tel Aviv* (Hebrew) - for the timeline image template
 - several specific articles with problems in their lead image
 And, yes, it'd be really great if we could 1) define more precisely 
 which article properties we are interested in testing (visit 
 statistics, size, structure, special layouts, images, etc.) and 2) 
 create a process (system) to find such articles.

 Also, there is still an open task,
 https://phabricator.wikimedia.org/T97151 - Testing Page issues and
 disambiguation templates (T90250). Going through the lists at
 http://en.wikipedia.org/wiki/Category:Wikipedia_articles_with_content_issues
 and
 http://en.wikipedia.org/wiki/Wikipedia:Template_messages/General#Disambigua…
 should help to catch some issues.

 thanks
 Elena

 On Tue, May 19, 2015 at 9:23 PM, Brian Gerstle
 <bgerstle@wikimedia.org> wrote:

     +search

     On Tue, May 19, 2015 at 3:14 PM, Brian Gerstle
     <bgerstle@wikimedia.org> wrote:

         The subject hints at a question that's been nagging me for a
         while, and now that I'm going to be hacking on testing in Lyon
         I wanted to ask:

         Do we have a list of articles we usually run tests against?

         If not, do we have any processes for curating such a list? 
         Would anyone be interested in a brainstorming session at Lyon
         to discuss this further?

         Basically, as a developer, I would love to have more
         confidence that some code I wrote doesn't break on our most
         popular articles.  Or, if we can get more sophisticated, that
         *certain properties of my code hold true for certain kinds of
         generated pages*.*

         Please respond with your thoughts and whether you think I
         should create a phab task for the hackathon about this.  In
         either case, ping me anytime or grab me at Lyon to discuss
         further!

         Regards,

         Brian

         * Yes, I'm talking about using property-based testing
         generators to create random, shrinkable MW pages that we can
         run tests on. Not sure if it's practical, but could be an
         interesting experiment.

         -- 
         EN Wikipedia user page:
         https://en.wikipedia.org/wiki/User:Brian.gerstle
         IRC: bgerstle

     -- 
     EN Wikipedia user page:
     https://en.wikipedia.org/wiki/User:Brian.gerstle
     IRC: bgerstle

     _______________________________________________
     reading-wmf mailing list
     reading-wmf@lists.wikimedia.org
     https://lists.wikimedia.org/mailman/listinfo/reading-wmf

 _______________________________________________
 QA mailing list
 QA@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/qa 
