Hello everyone, I would like to know whether it is possible to retrieve the content for a specific request from outside, that is, NOT through the Wikipedia website. The scenario would be: fill in a form (text area) on my website, in a program, or something similar, with the word I am searching for. This website, program, etc. connects (somehow) to the Wikipedia website and returns the content (if the word is specific enough) to the website, program, etc. I hope I have described the case in an understandable way. Now the concrete questions are:
- Is there a Wikipedia API specification that would let me reach my goal?
- If yes, how can I get it?
- If no, what alternatives are there?
Thank you in advance!
Gab
G@B wrote:
The scenario would be: fill in a form (text area) on my website, in a program, or something similar, with the word I am searching for. This website, program, etc. connects (somehow) to the Wikipedia website and returns the content (if the word is specific enough) to the website, program, etc. I hope I have described the case in an understandable way. Now the concrete questions are:
- Is there a Wikipedia API-Specification in order to reach my goal?
Not at this time.
- If no, which alternatives are there?
If you're a nice person: open a web browser and point it at Wikipedia.
If you're a very nice person: run your own mirror of the database and search & format data however you like, see http://download.wikimedia.org/
-- brion vibber (brion @ pobox.com)
At 2:28 PM -0800 3/21/06, Brion Vibber wrote:
G@B wrote:
- Is there a Wikipedia API-Specification in order to reach my goal?
Not at this time.
- If no, which alternatives are there?
If you're a nice person: open a web browser and point it at Wikipedia.
That's approximately what I do in my web pages. My text is rife with links to WP (to the point that $WP is defined in the PHP code :-). In fact, the ability to use WP as a source of explanatory footnotes is a very big win for me.
However, I can imagine situations where this would not be an optimal solution. For example, someone might wish to present the user with tooltips, image-mapped diagrams for context and navigation, etc. So, the web site would display derived information (probably linking to WP, as well).
This sort of analysis requires intimate familiarity with the structure of the input data. So, definitions, consistency, and stability are important requirements.
At 11:25 PM +0100 3/21/06, G@B wrote:
- If no, which alternatives are there?
Here are some possibilities I've considered:
* screen scraping
UI design is hard enough without trying to keep things convenient for use by programs. So, most developers (let alone contributors) won't optimize for this use.
XHTML, if used, helps with some low-level syntax issues (XML parsers work :-), but the structure may still be chaotic and subject to unannounced changes.
Nonetheless, I've suggested that Semantic WP (SWP) tags be part of the generated XHTML, to enable analysis (etc) by browsers.
* XML/SOAP/...
Quite possible, assuming that WP will allow it and someone can do (or support) the necessary standardization and implementation. The SWP folks will be forced to do something like this, if nobody gets there first. In any case, it won't be trivial to do right.
* RDBMS (e.g., MySQL)
Assuming that read access is available, a script can easily send off queries and evaluate the replies. WP could, in fact, allow this, but caution would be worthwhile, as this level of access might create new openings for DDoS attacks, etc. OTOH, if access were controlled, correct behavior could be enforced by fiat.
Otherwise, you fall back to Brion's suggestion of keeping a mirror. The last time I checked, this was not a turnkey procedure, but the situation may be different now. Is mirroring automated now?
* code-level (e.g., PHP) access
If you have access to the MW code base, you can grab any data you like. However, this puts you in the role of maintaining a forked version of MW. Of course, if your changes are deemed useful and safe, you might get them into the MW code base. In fact, putting XML access and/or SMW facilities into MW is an example of this.
* command-line access
If you have command-line access on a machine where MW is running, and appropriate permissions, you can access data in a number of ways. For example, you could look directly into the MySQL files or behind MW's back at generated files, etc. (However, YMMV!)
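To make the screen-scraping option above a bit more concrete, here is a minimal sketch in Python (standard library only). It pulls the first non-empty paragraph out of a rendered page. The base URL, the User-Agent string, and the names `ParagraphExtractor`, `first_paragraph`, and `fetch_article_html` are all my own assumptions for illustration. It also assumes the article text sits in plain `<p>` elements, which is exactly the kind of structure that may change without notice.

```python
from html.parser import HTMLParser
from urllib.parse import quote
from urllib.request import Request, urlopen


class ParagraphExtractor(HTMLParser):
    """Collect the text of every <p> element, dropping all inline markup."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraphs, in document order
        self._in_p = False     # True while we are inside a <p> element
        self._buf = []         # text fragments of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            text = "".join(self._buf).strip()
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def first_paragraph(html):
    """Return the first non-empty paragraph of an HTML page, or None."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs[0] if parser.paragraphs else None


def fetch_article_html(title, base_url="https://en.wikipedia.org/wiki/"):
    """Fetch the rendered page for a title (the URL layout is an assumption)."""
    url = base_url + quote(title.replace(" ", "_"))
    # Hypothetical User-Agent; identify your own tool and contact address here.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would be along the lines of `first_paragraph(fetch_article_html("Screen scraping"))`. Every such call costs the site a full page render, so cache the results and keep the request rate low.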
In summary, there are a variety of options. My own approach is to use mediated database access (e.g., via Perl's DBI module). This shields me from implementation details and reduces portability issues. With the exception of the DB structure, I can treat MW largely as a "black box".
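As a rough illustration of mediated database access, here is a sketch using Python's DB-API (the closest analogue to Perl's DBI) rather than DBI itself. The table and column names (`page`, `revision`, `text`, `page_latest`, `rev_text_id`, `old_text`) reflect my understanding of the MW 1.5-era schema, so treat them as assumptions and check them against your own installation. To keep the sketch runnable without a server, an in-memory SQLite database stands in for a MySQL mirror. The function names `make_demo_db` and `fetch_wikitext` are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for a mirrored MW database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT,
            page_latest INTEGER
        );
        CREATE TABLE revision (
            rev_id INTEGER PRIMARY KEY,
            rev_page INTEGER,
            rev_text_id INTEGER
        );
        CREATE TABLE text (
            old_id INTEGER PRIMARY KEY,
            old_text TEXT
        );
        INSERT INTO text VALUES (10, '''Hello'' world');
        INSERT INTO revision VALUES (100, 1, 10);
        INSERT INTO page VALUES (1, 0, 'Main_Page', 100);
        """
    )
    return conn


def fetch_wikitext(conn, title, namespace=0):
    """Return the current wikitext of a page, or None if it does not exist.

    `conn` can be any DB-API connection. The "?" placeholder is SQLite's
    style; MySQL drivers typically expect "%s" instead, so adjust as needed.
    """
    cursor = conn.cursor()
    cursor.execute(
        """
        SELECT t.old_text
        FROM page p
        JOIN revision r ON r.rev_id = p.page_latest
        JOIN text t ON t.old_id = r.rev_text_id
        WHERE p.page_namespace = ? AND p.page_title = ?
        """,
        # Titles are stored with underscores in place of spaces.
        (namespace, title.replace(" ", "_")),
    )
    row = cursor.fetchone()
    return row[0] if row else None
```

The point of the design is that only `fetch_wikitext` knows about the DB structure. Everything above it sees a title going in and wikitext coming out, which is the "black box" treatment in practice.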
Although I'm not sure I'll need it, DBI-Link provides a way to access arbitrary databases via PostgreSQL. So, if PostgreSQL can provide facilities that MySQL (or whatever) does not, it can be used as a "wrapper":
??? -> PostgreSQL -> PL/PerlU -> DBI-Link -> Perl DBI -> MySQL (etc)
I would be happy to hear of other possibilities, etc. TMTOWTDI!
-r
On 3/21/06, Brion Vibber <brion@pobox.com> wrote:
G@B wrote:
- Is there a Wikipedia API-Specification in order to reach my goal?
Not at this time.
- If no, which alternatives are there?
If you're a nice person: open a web browser and point it at Wikipedia.
If you're a very nice person: run your own mirror of the database and search & format data however you like, see http://download.wikimedia.org/
If you're a VERY nice person, work on a Wikipedia API...
The Cunctator wrote:
If you're a VERY nice person, work on a Wikipedia API...
*I* would work on a Wikipedia API, but it would only be reverted by Brion for being too expensive, or for putting spaces between ")" and ";" in the source, or some other very important reason.
Magnus
wikitech-l@lists.wikimedia.org