Is there some list or tool that identifies Wikipedia pages that are slow to parse?
My interest is mainly in the context of Lua deployment. I would like to identify non-obvious templates that may be having an appreciable impact on performance. (As opposed to things like {{cite}}, where the performance problems are already well-known.)
I don't suppose the database stores a "time for last parse" somewhere?
On enwiki we've already made Lua conversions with most of the string templates, several formatting templates (e.g. {{rnd}}, {{precision}}), {{coord}}, and a number of others. And there is work underway on a number of the more complex overhauls (e.g. {{cite}}, {{convert}}). However, it would be nice to identify problematic templates that may be less obvious.
-Robert Rohde aka Dragons_flight
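Since MediaWiki stores no per-page parse time, one crude public workaround (a sketch, not an existing tool) is to time `action=parse` API calls yourself. Network latency and server-side caching will dominate, so treat the numbers only as rough relative signals:

```python
# Sketch: time how long one action=parse API round trip takes for a
# page. This measures client-observed latency, not server parse time.
import json
import time
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def parse_url(page, api=API):
    """Build an action=parse request URL for the given page title."""
    query = urllib.parse.urlencode(
        {"action": "parse", "page": page, "format": "json"})
    return f"{api}?{query}"

def time_parse(page):
    """Time one round trip of parsing `page` via the API, in seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(parse_url(page)) as resp:
        json.load(resp)
    return time.monotonic() - start
```

Sorting a list of candidate pages by `time_parse` would give a crude slow-page ranking, at the cost of one parse request per page.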
A partial answer: I wrote a script under the Pywikipedia framework to search for templates with many parser functions that are worth converting to Lua now: [[mw:Special:Code/pywikipedia/11099]]. I hope it helps you find good targets.
The result looks like http://hu.wikipedia.org/wiki/Wikip%C3%A9dia:Sablonm%C5%B1hely/Lua_k%C3%ADv%C...
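The idea behind the script can be re-sketched in a few lines of plain Python (this is not the actual pywikipedia code; the parser-function list and the threshold are illustrative):

```python
# Count parser-function invocations in a template's wikitext; templates
# that lean heavily on them tend to be good Lua-conversion candidates.
import re

PARSER_FUNCTIONS = ("#if", "#ifeq", "#ifexpr", "#expr", "#switch", "#time")

def count_parser_functions(wikitext):
    """Per-function counts of {{#func: ...}} invocations."""
    counts = {}
    for name in PARSER_FUNCTIONS:
        pattern = r"\{\{\s*" + re.escape(name) + r"\s*:"
        counts[name] = len(re.findall(pattern, wikitext))
    return counts

def is_lua_candidate(wikitext, threshold=20):
    """Heuristic flag: many parser-function calls suggest converting."""
    return sum(count_parser_functions(wikitext).values()) >= threshold
```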
2013/3/7 Robert Rohde rarohde@gmail.com
Is there some list or tool that identifies Wikipedia pages that are slow to parse?
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Mar 7, 2013 1:06 AM, "Robert Rohde" rarohde@gmail.com wrote:
Is there some list or tool that identifies Wikipedia pages that are slow to parse?
I don't suppose the database stores a "time for last parse" somewhere?
Not sure. Maybe try one of these cats: https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes...
There's slow-parse.log, but it's private unless a solution is found for https://gerrit.wikimedia.org/r/#/c/49678/ https://wikitech.wikimedia.org/wiki/Logs
Nemo
On 06/03/13 23:58, Federico Leva (Nemo) wrote:
There's slow-parse.log, but it's private unless a solution is found for https://gerrit.wikimedia.org/r/#/c/49678/ https://wikitech.wikimedia.org/wiki/Logs
And slow-parse.log is probably going to be kept private unless proven it is not harmful =)
On 03/07/2013 12:00 PM, Antoine Musso wrote:
And slow-parse.log is probably going to be kept private unless proven it is not harmful =)
Why would it be harmful for public wikis? Anyone can do this on an article-by-article basis by copying the source to their own MediaWiki instances.
But it ends up being repeated work.
Matt
On Thu, Mar 7, 2013 at 8:06 PM, Matthew Flaschen mflaschen@wikimedia.org wrote:
Why would it be harmful for public wikis? Anyone can do this on an article-by-article basis by copying the source to their own MediaWiki instances.
That user would have to pick which articles to copy and test (or test them all).
The log doesn't contain (I guess?) all articles. Only slow articles.
-Jeremy
On 2013-03-07 4:06 PM, "Matthew Flaschen" mflaschen@wikimedia.org wrote:
Why would it be harmful for public wikis? Anyone can do this on an article-by-article basis by copying the source to their own MediaWiki instances.
But it ends up being repeated work.
Matt
+1. I have trouble imagining how making this public could be harmful. There are plenty of well-known slow-to-parse pages already. There's also more than a couple of ways to convince MediaWiki to make slow queries (longer than the PHP time limit), we publicly release detailed profiling data, etc. While that sort of thing isn't exactly proclaimed to the world, it's also not a secret. If someone wanted to find slow points in MediaWiki, there are far worse things floating around the internet than a list of slow-to-parse pages.
-bawolff
On Thu, 07-03-2013 at 21:12 -0400, bawolff wrote:
+1. I have trouble imagining how making this public could be harmful. [...]
The log in its current form is not just a list of publicly viewable pages with parse times. The extraneous information would need to be removed before it could be made public.
Ariel
Federico Leva (Nemo) wrote:
There's slow-parse.log, but it's private unless a solution is found for https://gerrit.wikimedia.org/r/#/c/49678/ https://wikitech.wikimedia.org/wiki/Logs
"Separate slow-parse into public and private files" https://bugzilla.wikimedia.org/show_bug.cgi?id=45830
https://gerrit.wikimedia.org/r/49678 was abandoned; it looks like https://gerrit.wikimedia.org/r/52608 is now the relevant Gerrit changeset.
MZMcBride
On 06/03/13 22:05, Robert Rohde wrote:
On enwiki we've already made Lua conversions with most of the string templates, several formatting templates (e.g. {{rnd}}, {{precision}}), {{coord}}, and a number of others. And there is work underway on a number of the more complex overhauls (e.g. {{cite}}, {{convert}}). However, it would be nice to identify problematic templates that may be less obvious.
You can get in touch with Brad Jorsch and Tim Starling. They most probably have a list of templates that should quickly be converted to Lua modules.
If we get {{cite}} out, that will already be a nice improvement :-]
Antoine Musso hashar+wmf@free.fr wrote:
If we get {{cite}} out, that will already be a nice improvement :-]
Not really, given https://bugzilla.wikimedia.org/show_bug.cgi?id=45861
//Saper
Thanks for this thread. :-) I used some links from it in https://blog.wikimedia.org/2013/03/11/lua-templates-faster-more-flexible-pag... .
There are suggestions on https://en.wikipedia.org/wiki/Wikipedia:Lua_requests for templates to be converted; I'm not sure how other wikis are categorizing or listing their most inefficient templates. Thank you for https://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia/parserfuncti..., Bináris.
You can use Anomie's Greasemonkey script to see the performance gains more easily: https://en.wikipedia.org/wiki/User:Anomie/PP-report-Greasemonkey-script
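What that script surfaces is the "NewPP limit report" that the parser leaves in an HTML comment on each rendered page. A minimal sketch of pulling it out yourself (field names vary across MediaWiki versions, so treat the keys as illustrative):

```python
# Extract the NewPP limit report that MediaWiki embeds as an HTML
# comment in rendered page output, returned as a {field: value} dict.
import re

def newpp_report(html):
    """Return the NewPP limit report fields, or None if absent."""
    m = re.search(r"NewPP limit report\n(.*?)-->", html, re.S)
    if m is None:
        return None
    stats = {}
    for line in m.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats
```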