Hello Rich Morin,
on Wednesday, 22 March 2006 at 01:58 you wrote:
RM> At 2:28 PM -0800 3/21/06, Brion Vibber wrote:
RM> > G@B wrote:
RM> > > - Is there a Wikipedia API-Specification in order to reach my goal?
RM> > Not at this time.
RM> > > - If no, which alternatives are there?
RM> > If you're a nice person: open a web browser and point it at Wikipedia.
RM> That's approximately what I do in my web pages. My text is rife with
RM> links to WP (to the point that $WP is defined in the PHP code :-). In
RM> fact, the ability to use WP as a source of explanatory footnotes is a
RM> very big win for me.
RM>
RM> However, I can imagine situations where this would not be an optimal
RM> solution. For example, someone might wish to present the user with
RM> tooltips, image-mapped diagrams for context and navigation, etc. So,
RM> the web site would display derived information (probably linking to
RM> WP, as well).
RM>
RM> This sort of analysis requires intimate familiarity with the structure
RM> of the input data. So, definitions, consistency, and stability are
RM> important requirements.
RM> At 11:25 PM +0100 3/21/06, G@B wrote:
RM> > - If no, which alternatives are there?
RM> Here are some possibilities I've considered:
RM> * screen scraping
RM> UI design is hard enough without trying to keep things convenient
RM> for use by programs. So, most developers (let alone contributors)
RM> won't optimize for this use.
RM>
RM> XHTML, if used, helps with some low-level syntax issues (XML
RM> parsers work :-), but the structure may still be chaotic and
RM> subject to unannounced changes.
RM>
RM> Nonetheless, I've suggested that Semantic WP (SWP) tags be part
RM> of the generated XHTML, to enable analysis (etc) by browsers.
RM> * XML/SOAP/...
RM> Quite possible, assuming that WP will allow it and someone can do
RM> (or support) the necessary standardization and implementation.
RM> The SWP folks will be forced to do something like this, if nobody
RM> gets there first. In any case, it won't be trivial to do right.
RM> * RDBMS (eg, MySQL)
RM> Assuming that read access is available, a script can easily send
RM> off queries and evaluate the replies. WP could, in fact, allow
RM> this, but caution would be worthwhile, as this level of access
RM> might create new openings for DDoS attacks, etc. OTOH, if access
RM> were controlled, correct behavior could be enforced by fiat.
RM>
RM> Otherwise, you fall back to Brion's suggestion of keeping a mirror.
RM> The last time I checked, this was not a turnkey procedure, but the
RM> situation may be different now. Is mirroring automated now?
RM> * code-level (e.g., PHP) access
RM> If you have access to the MW code base, you can grab any data you
RM> like. However, this puts you in the role of maintaining a forked
RM> version of MW. Of course, if your changes are deemed useful and
RM> safe, you might get them into the MW code base. In fact, putting
RM> XML access and/or SMW facilities into MW is an example of this.
RM> * command-line access
RM> If you have command-line access on a machine where MW is running,
RM> and appropriate permissions, you can access data in a number of
RM> ways. For example, you could look directly into the MySQL files
RM> or behind MW's back at generated files, etc. (However, YMMV!)
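To make the XHTML point from the screen-scraping item concrete: if a page really is well-formed XHTML, a stock XML parser can pull out its links. This is only a minimal sketch in Python. SAMPLE_PAGE and extract_links are hypothetical names, and the markup is invented for illustration; real Wikipedia output may not be well-formed and can change without notice, which is exactly the fragility described above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page fragment (an assumption); real Wikipedia markup differs.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p>See <a href="/wiki/Perl" title="Perl">Perl</a> and
       <a href="/wiki/PHP" title="PHP">PHP</a>.</p>
    <p><a name="anchor-only">no href here</a></p>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return (href, link text) pairs for every <a> element that has an href."""
    # Raises xml.etree.ElementTree.ParseError if the input is not well-formed.
    root = ET.fromstring(xhtml_text)
    links = []
    for anchor in root.iter("{%s}a" % XHTML_NS):
        href = anchor.get("href")
        if href is not None:
            links.append((href, "".join(anchor.itertext()).strip()))
    return links


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_PAGE):
        print(href, text)
```

The parser only removes the low-level syntax problems; deciding which links or elements matter still depends on page structure that nobody has promised to keep stable.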
RM> In summary, there are a variety of options. My own approach is to use
RM> mediated database access (eg, via Perl's DBI module). This shields me
RM> from implementation details and reduces portability issues. With the
RM> exception of the DB structure, I can treat MW largely as a "black box".
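As a rough illustration of what mediated database access could look like, here is a sketch using Python's DB-API, which plays a role similar to Perl's DBI. Assumptions: an in-memory SQLite database stands in for MySQL, the `page` table is a cut-down imitation of MediaWiki's (check the schema of your own MW version), and make_demo_db and find_titles are hypothetical helper names.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for a wiki database (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page ("
        " page_id INTEGER PRIMARY KEY,"
        " page_namespace INTEGER NOT NULL,"
        " page_title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO page (page_namespace, page_title) VALUES (?, ?)",
        [(0, "Perl"), (0, "PHP"), (0, "Python"), (0, "Java"), (1, "Perl")],
    )
    conn.commit()
    return conn


def find_titles(conn, prefix, limit=10):
    """Return main-namespace (0) page titles that start with `prefix`."""
    cur = conn.cursor()
    # Placeholders keep user input out of the SQL text. The "?" style is
    # sqlite3's; other DB-API drivers (e.g. for MySQL) may use "%s" instead.
    cur.execute(
        "SELECT page_title FROM page "
        "WHERE page_namespace = 0 AND page_title LIKE ? "
        "ORDER BY page_title LIMIT ?",
        (prefix + "%", limit),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = make_demo_db()
    print(find_titles(conn, "P"))
```

Only the connection setup and the SQL depend on the database; the calling code sees plain lists of titles, which is the "black box" effect described above.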
RM> Although I'm not sure I'll need it, DBI-Link provides a way to access
RM> arbitrary databases via PostgreSQL. So, if PostgreSQL can provide
RM> facilities that MySQL (or whatever) does not, it can be used as a
RM> "wrapper":
RM> ??? -> PostgreSQL -> PL/PerlU -> DBI-Link -> Perl DBI -> MySQL (etc)
RM> I would be happy to hear of other possibilities, etc. TMTOWTDI!
RM> -r
Thank you for your advice, guys. I hope there will be a more convenient way to get the data in the future. For me, a solution in Java (for example) would be convenient, given that a lot of web technologies are being developed in that area (see JSP, JSF, etc.). One more question for Rich Morin: could you please give the URL of your page, so that I can see how "the wiki implementation" looks and works?
wikitech-l@lists.wikimedia.org