hello,
it is now possible to access page text directly from the toolserver. to do this, make an HTTP request to:
http://localhost:6987/text/<database>/<rev_id>
e.g.:
http://localhost:6987/text/enwiki_p/27785174
the text is replicated in real-time to a dedicated server on the same LAN as hemlock, so retrieval should be fast.
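For anyone calling this from a script, here is a minimal Python sketch of the request described above. The base URL and the example database/revision come from this message. The function names are made up for illustration, and the assumption that the response body is the raw page text in UTF-8 is not confirmed anywhere in the thread.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL from the announcement; only reachable on the toolserver itself.
TEXT_SERVICE = "http://localhost:6987/text"


def text_url(database: str, rev_id: int) -> str:
    """Build the request URL for one revision, e.g. ('enwiki_p', 27785174)."""
    if rev_id <= 0:
        raise ValueError("rev_id must be a positive integer")
    return f"{TEXT_SERVICE}/{quote(database, safe='')}/{int(rev_id)}"


def fetch_text(database: str, rev_id: int, timeout: float = 10.0) -> str:
    """Fetch the text of one revision.

    Assumption (not stated in the thread): the body is the raw page text,
    encoded as UTF-8.
    """
    with urlopen(text_url(database, rev_id), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On the toolserver, `fetch_text("enwiki_p", 27785174)` would request the example URL given above.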
direct access to the text database is not available, because there would then be no way to hide deleted and oversighted text.
for legal reasons, you cannot distribute large portions of page text to users. i have added another rule explaining this at [[Toolserver/Rules]].
i understand Daniel is working on making WikiProxy access the database directly, so for people who already use that, no change will be required.
please report any problems with accessing the text. this is currently experimental and might disappear later, or not work properly.
- river.
I'm assuming this includes English Wikipedia?
Toolserver-l mailing list Toolserver-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/toolserver-l
Messedrocker:
On 7/27/07, River Tarnell river@wikimedia.org wrote:
it is now possible to access page text directly from the toolserver.
I'm assuming this includes English Wikipedia?
this is for all wikis.
- river.
Awesome! Thanks river! This will certainly be of use--let's just hope that it eventually moves from the "experimental" realm to the permanent realm :)
-- Daniel Cannon (AmiDaniel)
http://amidaniel.com cannon.danielc@gmail.com
Hi all
I have enabled a new version of WikiProxy which uses the text databases on clematis to fetch page content. It has a couple of advantages over river's version:
* you don't need to know the revision id to fetch the latest version of a page (where "latest" is the latest version according to the toolserver's db, so it is consistent if the wiki's database is lagged)
* you can use prefixed page names, or alternatively a namespace id separate from the page title
* it automatically falls back to pulling the page content via HTTP should the text database fall behind
* it provides some meta-info about the page in the http header
* I'm also planning an alternative access path to avoid the overhead php-startup induces - maybe a basic web server written in php, maybe something involving fifos... not sure yet. Also don't know when it will come about.
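The fallback behaviour in the list above can be pictured roughly like this. It is only an illustrative Python sketch of the general pattern, not WikiProxy's actual code (WikiProxy is written in PHP). The two fetcher callables and the convention that the primary source returns `None` when it has no usable copy are assumptions made for the example.

```python
from typing import Callable, Optional

# A fetcher takes (wiki, title) and returns the page text, or None if it
# cannot supply it. Both fetchers here are hypothetical stand-ins.
Fetcher = Callable[[str, str], Optional[str]]


def fetch_with_fallback(wiki: str, title: str,
                        primary: Fetcher, fallback: Fetcher) -> str:
    """Try the fast local source first; use the slower one only if needed."""
    try:
        text = primary(wiki, title)
    except OSError:
        # Treat connection problems the same as "text not available".
        text = None
    if text is not None:
        return text
    text = fallback(wiki, title)
    if text is None:
        raise LookupError(f"no text found for {title!r} on {wiki!r}")
    return text
```

Passing the two sources in as callables keeps the decision logic separate from how each source is actually reached.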
@river: does your version prevent access from outside the toolserver? that would be important.
To use WikiProxy, you can go to http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php and fill in the form. This will only work from the toolserver, though - for access from the outside, you need an access token, available on request.
To use it programmatically, use a URL like the following:
http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php?wiki=de.wikipedia....
additional parameters:
* rev=12345 if you want a specific revision
* ns=5 if title is an un-prefixed page name (you can also use canonical or local namespace names)
The wiki parameter accepts names like "de", "dewiki", or "de.wikipedia.org".
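As a rough illustration, a script could assemble such a URL as sketched below in Python. The parameter names wiki, title, rev and ns are the ones mentioned in this thread. The helper name is made up, and nothing here covers the access token needed from outside the toolserver, since its parameter is not described.

```python
from typing import Optional, Union
from urllib.parse import urlencode

# Script location as given in the message above.
WIKIPROXY = "http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php"


def wikiproxy_url(wiki: str, title: str,
                  rev: Optional[int] = None,
                  ns: Optional[Union[int, str]] = None) -> str:
    """Build a WikiProxy request URL (hypothetical helper).

    wiki  -- e.g. "de", "dewiki" or "de.wikipedia.org"
    title -- page title, prefixed unless ns is given
    rev   -- specific revision id; omit for the latest known revision
    ns    -- namespace id or name, for un-prefixed titles
    """
    params = {"wiki": wiki, "title": title}
    if rev is not None:
        params["rev"] = str(rev)
    if ns is not None:
        params["ns"] = str(ns)
    # urlencode takes care of escaping spaces and non-ASCII characters.
    return WIKIPROXY + "?" + urlencode(params)
```

For example, `wikiproxy_url("dewiki", "SomePage", rev=12345)` adds `rev=12345` to the query string.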
Please try it and tell me if you experience any problems.
Enjoy, Daniel
WikiProxy is great, Daniel! Thanks for making such a robust interface for accessing text :-)
I have one question: is it possible to access the data WikiProxy holds without using HTTP? (For example, using PHP's CLI SAPI or direct inclusion of the WikiProxy library). You mention that you're concerned about PHP startup times, and I guess the latter method would alleviate that to some extent.
Thanks,
Tangotango
On Jul 28, 2007, at 9:45 PM, Daniel Kinzler wrote:
Hi all
<snip />
To use WikiProxy, you can go to http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php and fill in the form. This will only work from the toolserver, though - for access from the outside, you need an access token, available on request.
To use it programmatically, use a URL like the following:
http://tools.wikimedia.de/~daniel/WikiSense/WikiProxy.php?wiki=de.wikipedia.org&title=SomePage
additional parameters:
- rev=12345 if you want a specific revision
- ns=5 if title is an un-prefixed page name (you can also use canonical or local namespace names)
The wiki parameter accepts names like "de", "dewiki", or "de.wikipedia.org".
Please try it and tell me if you experience any problems.
Enjoy, Daniel
Just an idea, not sure if it's good:
Would it make sense to keep a synced copy of the MediaWiki software itself on the toolserver, but only allow localhost access to api.php? That way all the queries, namespace localization and resolution, redirects, possibly even page parsing and HTML rendering, and lots of other logic that will keep changing with the schema would be available through the API. Tools could use either the local API or the real wiki's API transparently, with the local one being much faster. Additional toolserver-specific queries could also be added, since the API allows that. On top of that, the API allows internal PHP access, so any toolserver applications written in PHP could use it directly, without going over HTTP.
If you haven't seen it yet, it's at http://en.wikipedia.org/w/api.php
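To make the idea concrete, here is a small Python sketch of a tool that builds the same API query against either endpoint. The query parameters (action=query, prop=revisions, rvprop=content, titles, format=json) are standard MediaWiki API parameters. The local endpoint URL is purely hypothetical, since no such mirror exists on the toolserver, and the function name is made up for illustration.

```python
from urllib.parse import urlencode

REAL_API = "http://en.wikipedia.org/w/api.php"
# Hypothetical: where a localhost-only mirror might live if this were set up.
LOCAL_API = "http://localhost/w/api.php"


def revisions_query_url(title: str, api_base: str = REAL_API) -> str:
    """Build an api.php URL asking for the current text of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)
```

Only the base URL changes between the two cases, which is the transparency being suggested: `revisions_query_url("Main Page", LOCAL_API)` and `revisions_query_url("Main Page")` produce the same query string.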
--Yurik
Daniel Kinzler:
- I'm also planning an alternative access path to avoid the overhead php-startup induces - maybe a basic web server written in php, maybe something involving fifos... not sure yet. Also don't know when it will come about.
the main advantage of mine is that it's a standalone daemon, so it's very fast :)
@river: does your version prevent access from outside the toolserver? that would be important.
yes, it only listens to localhost.
- river.
Why would it not?
_____
From: toolserver-l-bounces@lists.wikimedia.org [mailto:toolserver-l-bounces@lists.wikimedia.org] On Behalf Of Messedrocker
Sent: Friday, July 27, 2007 11:20 PM
To: toolserver-l@lists.wikimedia.org
Subject: Re: [Toolserver-l] page text now available experimentally
I'm assuming this includes English Wikipedia?