Hi all,
I keep finding myself making calls to the live API on the WMF wikis and thinking: writing the query from scratch (or copying it from bits and pieces of the MW source code) every time I need the data is nonsense, since it has already been done... in MediaWiki core.
MediaWiki also uses the API internally in some places (more so in extensions) through a FauxRequest, which calls the API without a real HTTP request. From a server hosting a live MediaWiki site this is very easy: in PHP, include includes/WebStart.php and make a FauxRequest(); one can make several requests, all without making an HTTP request. From the Toolserver, however, it's a little trickier since we're not on the same server.
So I was thinking:
* upload a MediaWiki install (the same version that WMF runs, e.g. 1.16wmf4 or 1.17wmf1)
* make it not publicly accessible (we don't want people actually browsing the wiki)
* configure it in a special way so that one can use the same code for any wiki (e.g. a $lang and $family variable of some kind)
Then one can include includes/WebStart.php and use the API, i.e. the huge library of queries already in MediaWiki core (e.g. action=query&list=categorymembers, using generators, getting properties, you name it), like this:
<source>
$site = 'wikipedia';
$lang = 'en';
// Loads all common code including LocalSettings.php.
// LocalSettings contains extra code to check for $site and $lang, figuring out
// the correct $wgDBname, $wgDBserver etc., a tiny bit like WMF's CommonSettings.php.
require( $mw_root . '/includes/WebStart.php' );

$apiRequest = array(
	'action' => 'query',
	'list' => '...',
	/* etc. */
);
/* etc. */
</source>
This should basically be includable by anyone so that not everybody has to re-do this.
E.g. it could live in /home/somebody/wmf-api/includes/WebStart.php, which would be a checkout of the wmf-branch in SVN with (maybe) the same extensions etc.
This will make it a lot easier to interact with the database when you need certain information. It will also prevent us from hardcoding names all the time (which I'm sure happens a lot, and is one of the reasons some tools break over time when just small details change).
I believe some of the Toolserver users already have parts of MediaWiki in their home directory (I imagine stuff like GlobalFunctions.php can be very handy at times).
Basically I'm asking three things:
* Has this been done already? If so, we should document this better, as I spent time looking for it but came up empty.
* Do we want this? Are there potential problems here? What do we need to tackle or fix on our side?
* Who would want to do this? (If nobody has plans for this already, I would like to do this.)
-- Krinkle
Krinkle wrote:
- Configure it in a special way so that one can use the same code for
any wiki (ie. a $lang and $family variable of some kind)
The general idea seems fine. My only comment about this proposal is that the lang/family combination is horrific and should never, ever be used. It ends up requiring abominations such as lang=commons or lang=strategy, which is completely nonsensical and completely avoidable. It also presents problems in URLs when parameters such as &lang=en get incorrectly auto-corrected by bad programs/scripts/applications into ⟨.
Wikimedia separates its wikis by database name. Please use "db", "wiki", or "site" (containing either the DB name or the URL) instead of family/language hackery.
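To illustrate both points, here is a minimal sketch of a tool link that selects the wiki by database name and escapes the query string before it is put into HTML. The function name buildToolLink(), the example URL and the "limit" parameter are made up for illustration; only the "wiki" parameter name comes from the suggestion above.

```php
<?php
// Hypothetical helper, not part of any existing tool: builds a link that
// selects the wiki by database name and is safe to embed in HTML.
function buildToolLink( $baseUrl, $dbName ) {
	// http_build_query() percent-encodes the values and joins the pairs with "&".
	$query = http_build_query( array( 'wiki' => $dbName, 'limit' => 10 ), '', '&' );
	// htmlspecialchars() turns "&" into "&amp;", so no parameter name can be
	// misread as the start of an HTML entity.
	return htmlspecialchars( $baseUrl . '?' . $query, ENT_QUOTES, 'UTF-8' );
}

echo buildToolLink( 'http://toolserver.org/~example/tool.php', 'commonswiki' ), "\n";
```

With a single "wiki" parameter there is no need for lang=commons style values, and escaping the ampersand protects whatever other parameter names a tool happens to use.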
MZMcBride
On 5 Feb 2011, at 23:49, MZMcBride wrote:
Although I forgot for a second that, as developers, we can simply pass the dbname directly. So I will certainly do that and simply require knowing the database name.
But so you know, look at:
http://noc.wikimedia.org/conf/highlight.php?file=CommonSettings.php
and search for: $lang
The horror begins (and hopefully at some point, ends) there :-)
-- Krinkle
Hello all!
"[Toolserver-l] Internally calling the MediaWiki API on Toolserver" is a very good idea!!
What about doing the same with, e.g., the current pywikipedia bot framework, to have an API for Python scripts as well?
Greetings Dr. Trigon
On 05.02.2011 23:42, Krinkle wrote:
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
Hi again,
I decided to drop this for now (at least for myself; anyone else can of course pick it up[1]), because there were too many problems setting it up:
* Missing tables that MW wants by default (mostly cache tables)
* Invisibility of some indexes that the API needs
* Configuration problems
* More errors, problems and incompatibilities
What I did do to solve the problem I was having (which inspired me to bring this up) is write the following tool: http://toolserver.org/~krinkle/getWikiAPI.php
It can be used to get wiki-specific information by any of the identifiers we know (like database name, sitename, host, url, domain, topdomain).
This can come in handy if you have one of the pieces but are missing the others (either from within a tool or off-toolserver).
The API supports a few human-readable formats, as well as serialized PHP and JSON/JSONP callbacks:
* http://toolserver.org/~krinkle/getWikiAPI.php?wikiids=commons#output
* http://toolserver.org/~krinkle/getWikiAPI.php?wikiids=media&format=json&...
* http://toolserver.org/~krinkle/getWikiAPI.php?wikiids=dewikt&format=php_...
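As a rough sketch of how a tool might build such a request URL: the parameter names "wikiids" and "format" and the value "json" are taken from the example URLs above, while the helper name getWikiApiUrl() is made up here, and any further parameters the tool accepts are not shown.

```php
<?php
// Sketch only: builds a request URL for getWikiAPI.php from a wiki identifier
// and an optional output format. Nothing is fetched.
function getWikiApiUrl( $wikiId, $format = null ) {
	$params = array( 'wikiids' => $wikiId );
	if ( $format !== null ) {
		// Only added when asked for; without it the tool shows its default output.
		$params['format'] = $format;
	}
	return 'http://toolserver.org/~krinkle/getWikiAPI.php?' . http_build_query( $params, '', '&' );
}

echo getWikiApiUrl( 'commons' ), "\n";
echo getWikiApiUrl( 'media', 'json' ), "\n";
```

The identifier can be any of the kinds listed above (database name, host, domain and so on), so a tool only needs to know one of them to look up the rest.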
-- Krinkle
[1] Some points for whoever would like to do this:
* svn checkout /branches/wmf/1.16wmf4 (better to wait two weeks so you can take /1.17wmf1)
* Regularly update this from svn
* Global variable array (e.g. $tsWmfApi) with the wanted database name, database user and database password (users can pass their .my.cnf user/password and then whichever database they need)
* Map this to the right dbhost (either replace _p with -p and go to dbname.rrdb.toolserver.org, or get it from sql.toolserver.org / toolserver.wiki table (SELECT server WHERE dbname=sql_clean($tsWmfApi['dbname'])))
* Include the right extension files for the right wikis (or include none for all wikis, but don't include all for all wikis)
** Use $wgConf, see http://noc.wikimedia.org/conf/CommonSettings.php.txt , http://noc.wikimedia.org/conf/InitialiseSettings.php.txt
* In LocalSettings.php set $wgLocalisationCacheConf['store'] = 'files'; // since l10n_cache doesn't exist on Toolserver
* Figure out how to fix the missing objectcache table as well.
* Figure out how to stop MediaWiki from forcing an index in queries, since those indexes are not visible from views on Toolserver
** There's a useIndexClause() function in /includes/db/DatabaseMysql.php. Reset it to return ""
** Then there are a lot of calls to $this->addOption( 'USE INDEX', /* .... */ ) in API modules; add an if() statement to not use the index if $value is USE INDEX.
* And more
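Two of these points can be sketched in a few lines. This is only an illustration: the function names tsDbHost() and tsStripUseIndex() are hypothetical, "enwiki_p" and "name_title" are just example values, and the options array merely imitates the kind of array a query builder might pass around; it is not the actual MediaWiki code path.

```php
<?php
// Hypothetical helpers; only the naming rule and the "USE INDEX" option name
// come from the list above.

// Maps a database name ending in "_p" to its host by replacing "_p" with "-p"
// and appending ".rrdb.toolserver.org".
function tsDbHost( $dbName ) {
	return preg_replace( '/_p$/', '-p', $dbName ) . '.rrdb.toolserver.org';
}

// Drops a forced index from a query options array, since the indexes are not
// visible through the views. All other options are left untouched.
function tsStripUseIndex( array $options ) {
	unset( $options['USE INDEX'] );
	return $options;
}

echo tsDbHost( 'enwiki_p' ), "\n";
print_r( tsStripUseIndex( array( 'USE INDEX' => 'name_title', 'LIMIT' => 10 ) ) );
```

Doing the filtering in one place would avoid touching every API module separately, though whether that is feasible inside MediaWiki itself is a separate question.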
On 02/06/2011 12:42 AM, Krinkle wrote:
So I was thinking:
- upload a MediaWiki install (the same version that WMF runs, e.g. 1.16wmf4 or 1.17wmf1)
- make it not publicly accessible (we don't want people actually browsing the wiki)
- configure it in a special way so that one can use the same code for any wiki (e.g. a $lang and $family variable of some kind)
So you basically want to set up an internal copy of MediaWiki core so that it can run on the Toolserver databases in read-only mode, and provide a wrapper so that it can be conveniently accessed from PHP code?
That sounds potentially very useful, at least for people who write their tools in PHP.
Then one can include includes/WebStart.php and use the API, i.e. the huge library of queries already in MediaWiki core (e.g. action=query&list=categorymembers, using generators, getting properties, you name it), like this:
<source>
$site = 'wikipedia';
$lang = 'en';
// Loads all common code including LocalSettings.php.
// LocalSettings contains extra code to check for $site and $lang, figuring out
// the correct $wgDBname, $wgDBserver etc., a tiny bit like WMF's CommonSettings.php.
require( $mw_root . '/includes/WebStart.php' );

$apiRequest = array(
	'action' => 'query',
	'list' => '...',
	/* etc. */
);
/* etc. */
</source>
On the other hand, I'm not sure why you'd want to do that. Calling the API internally via FauxRequest is basically a huge kluge. I suppose it may sometimes be the easiest way to get a particular job done, but it's not really the first (or even second or third) solution I'd consider.
Most of the time, if you have direct access to the MediaWiki core classes, you really should just use them directly.
-- Ilmari Karonen
Yes indeed. FauxRequest is just one example of what becomes possible with an internally working version of wmf-deployment on Toolserver.
Of course, using any other classes is just as easy and is probably a better idea.
-- Krinkle
If the API were only a thin wrapper around some base classes, then indeed you would not need FauxRequest. However, the API does a lot of processing itself, and in practice using FauxRequest is much easier.
Bryan
On 6 Feb 2011 02:29, "Krinkle" krinklemail@gmail.com wrote: