Christopher Sahnwaldt wrote:
> I'm pretty new to MediaWiki and I'm not sure if I understand this
> correctly... Here's my attempt at spelling it out in a bit more detail:
> When a user edits a page and sends the new text to the server, the
> server / the RDF extension parses the text, extracts the desired data
> and saves it in an RDF store.
> I hope I got that about right - please correct me if not!
More or less - the parser parses the text and hands the bit that is RDF
(Turtle) to the RDF extension for analysis. The extension analyzes the
statements and would save them to the database (this is not yet implemented).
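
Roughly, a standalone sketch of that step (the <rdf>...</rdf> tag name, the
use of rdflib and the in-memory graph are assumptions for illustration only;
the real extension hooks into the MediaWiki parser and would use its own
store):

    # Sketch: extract Turtle embedded in wiki text and analyze the statements.
    # The <rdf> tag name and rdflib are illustrative assumptions.
    import re
    from rdflib import Graph

    WIKITEXT = """Some article prose.
    <rdf>
    @prefix dbp: <http://dbpedia.org/property/> .
    <http://example.org/Berlin> dbp:population "3400000" .
    </rdf>
    More prose."""

    def extract_turtle(wikitext):
        """Return the Turtle fragments embedded in the page text."""
        return re.findall(r"<rdf>(.*?)</rdf>", wikitext, re.DOTALL)

    graph = Graph()
    for fragment in extract_turtle(WIKITEXT):
        graph.parse(data=fragment, format="turtle")  # analyze the statements

    # The real extension would write these triples to its database instead.
    for subj, pred, obj in graph:
        print(subj, pred, obj)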
> Now when I think about the pros and cons of having this process run
> integrated in MediaWiki or on a different server, a few questions come
> up... again, I'm new to MediaWiki, so these may be newbie questions... :-)
> How much parsing does MediaWiki currently do when it stores new text
> for an article? Are templates expanded / transcluded?
There is a preprocessor that expands all templates recursively. After that, the
real "parser" (read: munger) is invoked to turn wiki text into HTML.
In the case of a "semantified" infobox, the substitution process would generate
RDF/Turtle statements using the template parameters. These would in turn be
handed to the RDF extension, which would write the resulting triples to the
database.
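
For illustration, a sketch of what that substitution step could compute from
the template parameters (the template call, property namespace and parameter
names below are made up; the actual mapping would live in the "semantified"
templates themselves):

    # Sketch: turn infobox template parameters into Turtle statements.
    # Property URIs and parameter names are illustrative assumptions.
    from rdflib import Graph, Literal, Namespace, URIRef

    DBP = Namespace("http://dbpedia.org/property/")

    def infobox_to_turtle(page_uri, params):
        """Map template parameters to triples about the page's subject."""
        g = Graph()
        g.bind("dbp", DBP)
        subject = URIRef(page_uri)
        for name, value in params.items():
            g.add((subject, DBP[name], Literal(value)))
        return g.serialize(format="turtle")

    # e.g. {{Infobox city | name = Berlin | population = 3400000 }}
    print(infobox_to_turtle("http://example.org/Berlin",
                            {"name": "Berlin", "population": "3400000"}))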
> How are updates distributed? Do subscribers regularly poll the server
> for recent changes? Or is there some kind of store-and-forward /
> publish-subscribe?
There is the RSS/Atom feed (human readable, not easy to parse), and an OAI-PMH
interface ("live update feed"). There's also the web API for polling data in a
machine-readable form, and there's the RC ("recent changes") channel on IRC
(human readable, can't be parsed reliably). True XMPP-based pubsub is being
worked on, see <http://brightbyte.de/page/RecentChanges_via_Jabber>.
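
For the polling route, a minimal sketch against the standard web API
(list=recentchanges). The endpoint shown is the English Wikipedia's api.php;
the rcprop/rclimit options may need adjusting for a specific wiki:

    # Sketch: poll recent changes via the MediaWiki web API.
    import json
    import urllib.parse
    import urllib.request

    API = "https://en.wikipedia.org/w/api.php"

    def fetch_recent_changes(limit=10):
        params = urllib.parse.urlencode({
            "action": "query",
            "list": "recentchanges",
            "rcprop": "title|ids|timestamp",
            "rclimit": limit,
            "format": "json",
        })
        # A descriptive User-Agent is polite (and sometimes required).
        req = urllib.request.Request(API + "?" + params,
                                     headers={"User-Agent": "rc-poll-sketch/0.1"})
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        return data["query"]["recentchanges"]

    for change in fetch_recent_changes():
        print(change["timestamp"], change["title"])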
-- daniel