Dear admins,
Let me insist and ask again about the status and plans of the toolserver: starting on Monday, our team will begin a final interim round of programming on the Wikipoint database in our spare time.
So I'd really like to avoid redundant programming against outdated articles (the ones residing in dumps).
For now, we will use WikiProxy and hope that everything works as well as it can, right?
-- Stefan
P.S. There is yet another potential use of our Wikipoint service coming up: there is a proposal for a 'wiki safari across Germany' with the goal of capturing pictures for georeferenced articles that do not have one yet, cf. http://de.wikipedia.org/wiki/Wikipedia_Diskussion:Bilderw%C3%BCnsche. If they planned on the basis of a dump, which is always several weeks old, they would probably end up gathering photos where someone has already been...
-----Original Message-----
From: toolserver-l-bounces@Wikipedia.org [mailto:toolserver-l-bounces@Wikipedia.org] On Behalf Of Leo Büttiker
Sent: Wednesday, 22 March 2006 22:04
To: toolserver-l@wikipedia.org
Subject: [Toolserver-l] Troubles with reading Articles
Hi all,
For a toolserver project I will read all Wikipedia (pwiki_de) articles and parse them for geo information. After some trouble I have now fixed nearly all bugs, but I still have problems opening the articles.
I open the article with the help of the MediaWiki functions in the following way:
$title = Title::newFromID($page_id);
$art = new Article($title);
$text = $art->getContent(true);
For some articles this works quite well, but for some it doesn't return any text. I think there's a problem with the compression of the database (in a local environment with a Wikipedia dump it works), but I couldn't find a workaround. Any suggestions?
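What I suspect I would have to do instead is read the text table directly and unpack it myself; roughly like this (just a sketch of my understanding of the schema, $textId being the revision's rev_text_id, and the handling of the 'external' flag is a guess):
$dbr =& wfGetDB( DB_SLAVE );
$row = $dbr->selectRow( 'text', array( 'old_text', 'old_flags' ), array( 'old_id' => $textId ) );
$text = $row->old_text;
$flags = explode( ',', $row->old_flags );
if ( in_array( 'gzip', $flags ) ) {
    $text = gzinflate( $text );   // compressed revisions have to be inflated first
}
if ( in_array( 'external', $flags ) ) {
    // old_text is then only a pointer (DB://cluster/id); the real text sits on
    // an external storage cluster that might not be reachable from here
    $text = false;
}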
Thanks, Leo
Hello,
On Saturday, 01 April 2006 at 22:34, sfkeller@hsr.ch wrote:
Dear admins,
Let me insist and ask again about the status and plans of the toolserver: starting on Monday, our team will begin a final interim round of programming on the Wikipoint database in our spare time.
So I'd really like to avoid redundant programming against outdated articles (the ones residing in dumps).
For now, we will use WikiProxy and hope that everything works as well as it can, right?
You can do it that way, as long as you don't generate too much load on either the toolserver or the master database server. You shouldn't read more than a few articles per minute (10 at most, or so).
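Something like this on your side keeps it harmless (a rough sketch; fetch_article() just stands for however you read the text):
foreach ( $pageIds as $id ) {
    $text = fetch_article( $id );   // however you actually fetch the wikitext
    // ... parse it for geo information ...
    sleep( 6 );                     // about one article every 6 seconds, i.e. under 10 per minute
}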
-- Stefan
Sincerely, DaB.
Hi all
I just want to let you know that I have an extra database for the caching now (thanks Kate!) and can now add wikis to the cache on request. Currently, the following wikis are cached:
commons, en, de, fr, pl, nl, it, sv, pt, es, ru.
If you need more, please tell me. Note that *all* wikis can be accessed through WikiProxy; data from "uncached" wikis is simply passed through.
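The interface is the same in both cases, for example (the wiki and page here are just for illustration, $title being whatever page you want):
$text = file_get_contents(
    "http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php"
    . "?wiki=ja&title=" . urlencode( $title ) . "&rev=&go=Fetch" );
// "ja" is not in the cached list above, so this request is simply passed
// through to the live wiki; for a cached wiki the same call hits the cache.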
-- Daniel
Hi!
Great tool!
Because of the big replag at the moment, I use a modified way to access the data:
$replag = file_get_contents("http://tools.wikimedia.de/~interiot/cgi-bin/replag?raw");
if ($replag < 7200) {
    $text1 = file_get_contents("http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php?wiki=".$fields[1]."&title=".$fields[4]."&rev=&go=Fetch");
    $text2 = file_get_contents("http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php?wiki=".$fields[2]."&title=".$fields[4]."&rev=&go=Fetch");
} else {
    $text1 = file_get_contents("http://".$fields[1].".wikipedia.org/w/index.php?title=".$fields[...]);
    $text2 = file_get_contents("http://".$fields[2].".wikipedia.org/w/index.php?title=".$fields[...]);
}
I don't know if it's a good idea that I use this hack.
Perhaps Daniel should integrate it into HIS software.
flacus
FlaBot wrote:
I don't know if it's a good idea that I use this hack.
Not for people who need to parse the HTTP headers (running bots or similar tools with a login feature; those need to capture the Set-Cookie field in the HTTP reply, which file_get_contents doesn't support, I think).
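With cURL you can get at the headers, roughly like this (just a sketch, assuming cURL is available where the bot runs):
$ch = curl_init( $url );                          // whatever page the bot requests
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true ); // return the response instead of printing it
curl_setopt( $ch, CURLOPT_HEADER, true );         // keep the HTTP headers in the returned string
$response = curl_exec( $ch );
curl_close( $ch );
preg_match_all( '/^Set-Cookie:\s*([^;\r\n]+)/mi', $response, $matches );
$cookies = $matches[1];   // send these back in a "Cookie:" header on later requests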
Greets,
Marco
Hi FlaBot, hi all
Great tool!
Because of the big replag at the moment, I use a modified way to access the data:
[...]
I don't know if it's a good idea that I use this hack.
Perhaps Daniel should integrate it into HIS software.
No, I don't think so. The cache has to be consistent with the databases on the toolserver; otherwise I would not be able to match up revisions correctly. You should only use that hack if you use the article text on its own, without any data from the toolserver db...
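If you do mix them, the minimum check would be something like this (a sketch; $dbr is a connection to the wiki's replica on the toolserver, $fetchedRevId the revision id of the text you fetched, and $ns/$dbkey the page's namespace number and DB key):
$latest = $dbr->selectField( 'page', 'page_latest',
    array( 'page_namespace' => $ns, 'page_title' => $dbkey ) );
if ( $latest == $fetchedRevId ) {
    // the replica already knows this revision, so joining it with links,
    // categories etc. from the toolserver db is safe
} else {
    // the replica is still behind the text you fetched; use the wikitext alone
}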
Regards, Daniel
It's not a hack. Caches can store multiple rev_ids of wikitext, and if you use Special:Export when fetching the latest wikitext version, then you know which rev_id to cache the wikitext under.
I have a number of tools that use Wikipedia's latest rev_id rather than the replagged one, and I find it generally useful. As you said, you have to keep in mind that any metadata you get from the local database may be inconsistent with the latest version of the wikitext, but I still find that some tools are better off using the latest wikitext.
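A quick-and-dirty sketch of what I mean (regexes instead of a real XML parser, and $cache standing in for whatever storage the tool actually uses):
$xml = file_get_contents(
    "http://en.wikipedia.org/wiki/Special:Export/" . urlencode( $title ) );
preg_match( '!<revision>.*?<id>(\d+)</id>!s', $xml, $m );
$revId = $m[1];                              // the revision the exported text belongs to
preg_match( '!<text[^>]*>(.*?)</text>!s', $xml, $m );
$wikitext = html_entity_decode( $m[1] );     // the wikitext is XML-escaped inside <text>
$cache[$revId] = $wikitext;                  // keyed by rev_id, so it can't be ambiguous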
-Interiot