On Thursday night, I went to London PerlMongers social drinks and guzzled ridiculous quantities of Adnam's Oyster Stout and talked rubbish with geeks.
Minor anecdotal notes on our public relations with the advanced geek crowd:
* They really wish MediaWiki had decent WYSIWYG editing - Wikipedia's wikitext is so full of template code it repels casual editors. I explained the technical problems ("it's not a parser, it's a twisty maze of regexps" - they recoiled in horror) and that we're working on it. (Current status: promising vaporware.) They want WYSIWYG editing because they want to be able to say "no" to installing Confluence, which is horrible to administer and not much better to use.
* I told the story of why MediaWiki is written in PHP. (Magnus had read up on PHP to make some changes to NuPedia code, and decided he needed a project. So Phase 2 is Magnus' first ever proper PHP program ...)
* They really want machine-readability from Wikipedia. The infobox templates on Wikipedia are getting there. Mostly what they need is standardisation (is the image called "image", "Image" or "Img"?), and a base template that's {{Persondata}} or a reasonable approximation. This is a matter of parser-functions in the template wikitext on the 'pedia, but it's something someone needs to take on as a project: to re-plumb the templates without breaking the nice exposed external interface. Who knows parser-function code and is feeling ambitious and patient?
- d.
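As a rough illustration of the parameter standardisation raised above (is the image called "image", "Image" or "Img"?), here is a minimal Python sketch that reads a flat infobox call and folds the variant names into one canonical key. Everything in it is an assumption for illustration: the `ALIASES` table, the `parse_infobox` helper and the sample "Infobox person" call are hypothetical, and it is not how MediaWiki or any existing tool does this. It also assumes no nested templates or piped links inside the values, which real infoboxes frequently contain.

```python
# Hypothetical alias table built from the example names in the thread
# ("image", "Image", "Img"); a real table would be maintained per template.
ALIASES = {"image": "image", "img": "image"}


def parse_infobox(wikitext):
    """Parse a flat {{Name | key = value | ...}} call into (name, params).

    Sketch only: assumes no nested templates and no '|' inside values.
    """
    inner = wikitext.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call")
    parts = [p.strip() for p in inner[2:-2].split("|")]
    name, params = parts[0], {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if not sep:
            continue  # positional parameters are skipped in this sketch
        key = key.strip().lower()  # "Image" and "image" become the same key
        params[ALIASES.get(key, key)] = value.strip()
    return name, params


# Hypothetical sample call, using the "Img" spelling of the image parameter.
name, params = parse_infobox(
    "{{Infobox person | name = Ada Lovelace | Img = Ada.jpg}}"
)
print(name, params)
```

The point of the sketch is that an outside consumer only ever sees the canonical key, which is the "nice exposed external interface" that any re-plumbing of the templates would need to preserve.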
On 06/04/2008, David Gerard dgerard@gmail.com wrote:
On Thursday night, I went to London PerlMongers social drinks and guzzled ridiculous quantities of Adnam's Oyster Stout and talked rubbish with geeks.
Good times!
- I told the story of why MediaWiki is written in PHP. (Magnus had read up on PHP to make some changes to NuPedia code, and decided he needed a project. So Phase 2 is Magnus' first ever proper PHP program ...)
I hadn't heard that before - explains a few things.
- They really want machine-readability from Wikipedia. The infobox templates on Wikipedia are getting there. Mostly what they need is standardisation (is the image called "image", "Image" or "Img"?), and a base template that's {{Persondata}} or a reasonable approximation. This is a matter of parser-functions in the template wikitext on the 'pedia, but it's something someone needs to take on as a project: to re-plumb the templates without breaking the nice exposed external interface. Who knows parser-function code and is feeling ambitious and patient?
Is it worth getting Wikipedia to use Semantic MediaWiki? It would allow for much more powerful machine-readability than templates, but probably has hundreds of obstacles to trip over to get there.
On 06/04/2008, Thomas Dalton thomas.dalton@gmail.com wrote:
Is it worth getting Wikipedia to use Semantic MediaWiki? It would allow for much more powerful machine-readability than templates, but probably has hundreds of obstacles to trip over to get there.
Template standardisation struck me as a *feasible* way to the same thing. It has the advantage that consistency would appeal to the sort of geek who's happy to code parser-functions. And users are fine with templates taking parameters and hiding the horrible plumbing behind a nice interface.
The big problem I can see with Semantic MediaWiki is that it involves horrible new wikitext syntax ... although if that can be hidden inside the template code, all the better.
- d.
David Gerard wrote:
On 06/04/2008, Thomas Dalton thomas.dalton@gmail.com wrote:
Is it worth getting Wikipedia to use Semantic MediaWiki? It would allow for much more powerful machine-readability than templates, but probably has hundreds of obstacles to trip over to get there.
Template standardisation struck me as a *feasible* way to the same thing. It has the advantage that consistency would appeal to the sort of geek who's happy to code parser-functions. And users are fine with templates taking parameters and hiding the horrible plumbing behind a nice interface.
The big problem I can see with Semantic MediaWiki is that it involves horrible new wikitext syntax ... although if that can be hidden inside the template code, all the better.
Semantic MediaWiki is basically everything that's horrible about templates, but you can query the data in the parameters. :)
Alas, that may mean it's a bit at odds with the wysiwyg ideal of 'hide those awful templates'.
To the extent that templates are things like infoboxes, those *can* be sensibly separated from body text and handled easily. To the extent that references, formatting, and data relations are extensively embedded *into* body text, that's where things get a bit ugly.
-- brion vibber (brion @ wikimedia.org)
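To make the "infoboxes can be separated from body text" point concrete, here is a minimal Python sketch that peels a leading template call off a page by counting `{{` and `}}` nesting. The `split_leading_template` helper and the sample page are hypothetical, and the approach is an assumption for illustration rather than anything MediaWiki does: it ignores `<nowiki>`, HTML comments and `{{{parameter}}}` syntax, and it does nothing for templates embedded in the middle of running text, which is exactly the ugly case described above.

```python
def split_leading_template(wikitext):
    """Return (template, body) if the text starts with a {{...}} block.

    Returns ("", wikitext) when there is no leading template or when the
    braces never balance. Only {{ }} nesting depth is tracked.
    """
    text = wikitext.lstrip()
    if not text.startswith("{{"):
        return "", wikitext
    depth, i = 0, 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                # End of the outermost template: the rest is body text.
                return text[:i], text[i:].lstrip()
        else:
            i += 1
    return "", wikitext  # unbalanced braces: leave the text untouched


# Hypothetical page: an infobox with one nested template, then prose.
page = (
    "{{Infobox town | name = Springfield | pop = {{formatnum:30000}} }}\n"
    "'''Springfield''' is a town."
)
infobox, body = split_leading_template(page)
print(infobox)
print(body)
```

An editor built along these lines could hand the first part to a form and the second to a text area; references and inline templates would still need a real parser.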
On 07/04/2008, Brion Vibber brion@wikimedia.org wrote:
Alas, that may mean it's a bit at odds with the wysiwyg ideal of 'hide those awful templates'. To the extent that templates are things like infoboxes, those *can* be sensibly separated from body text and handled easily. To the extent that references, formatting, and data relations are extensively embedded *into* body text, that's where things get a bit ugly.
Yeah. When I said "templates" above, I meant "infoboxes" - which I wasn't much of a fan of until it clicked that they were in fact machine-readable information. (Presumably for the Opera project and suchlike, we can have a "display=no" parameter.) References are a whole other spitball of sheer joy ...
Infoboxes are a start on machine readability. For some short articles, the entire content can basically be encoded in the infobox. If only RamBot had done that for US places in 2003.
I suppose I should cc large chunks of this thread to wikien-l; this is really the technical end of "editorial".
- d.
Hoi,
The biggest problem with Semantic MediaWiki, and something that will prevent localisation at Betawiki, is the way it does its localisation. I have been told by Danny that they will fix this before Wikimania. Given that we need some time at Betawiki to absorb it, I hope that it will be sooner rather than later. Some parts of SMW have already been adapted to the standard way of localising, i.e. we know that there are no technical obstacles; it is just work that needs doing.
Thanks,
GerardM
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Thomas Dalton wrote:
On 06/04/2008, David Gerard dgerard@gmail.com wrote:
On Thursday night, I went to London PerlMongers social drinks and guzzled ridiculous quantities of Adnam's Oyster Stout and talked rubbish with geeks.
Good times!
- I told the story of why MediaWiki is written in PHP. (Magnus had read up on PHP to make some changes to NuPedia code, and decided he needed a project. So Phase 2 is Magnus' first ever proper PHP program ...)
I hadn't heard that before - explains a few things.
Well, it was my first PHP project as well, and it can't have been far off from Brion's first either. But we learnt as we went along, and I'd like to think we're PHP experts now. There's very little original Magnus/Lee code left these days.
-- Tim Starling
On 07/04/2008, Tim Starling tstarling@wikimedia.org wrote:
Well, it was my first PHP project as well, and it can't have been far off from Brion's first either. But we learnt as we went along, and I'd like to think we're PHP experts now. There's very little original Magnus/Lee code left these days.
Yeah. It's not like PHP is difficult to start on. The hard part of MediaWiki programming is, I expect, much more a matter of how to extract the best performance from MySQL on limited hardware with huge demands than it is of anything to do with PHP.
- d.