Re: [Wikitech-l] Exporting sections of pages

19 Feb 2007


On Saturday 17 February 2007 21:04, Jay R. Ashworth wrote:
On Fri, Feb 16, 2007 at 10:18:14AM +1100, Steve Bennett wrote:
On 2/16/07, Jay R. Ashworth <jra@baylink.com> wrote:
There was a fairly extensive discussion on this list a couple weeks ago
-- which I thought you had participated in -- about microformats (I
think the buzzword was) which amounted to "custom-made attributes on
HTML tags which would be usable for semantic extraction, but ignored by
browsers".
I'm sure I've never heard that term before, but fortunately
[[Microformats]] got me up to speed. (Is there anything Wikipedia
can't do :))
This topic, and the approach you and I foresee, is sort of an offshoot
thereof...
I guess. What would the syntax look like to the user?
Well, one suggested solution was piping the
==section header tag|secthead==
but while I understand why that is most intuitive to people who *get*
Wikipedia, I suspect it's a bit too breakable when confronted with
people who don't--and there are a lot more of them.  So I like a
template or parser function that takes an argument and expands to the
appropriate hidden markup to support the pointer, myself.
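
To make the parser-function idea concrete, here is a minimal sketch of such an expansion. The {{#secthead:...}} syntax, the function name, and the generated <span> markup are all assumptions made up for illustration; none of them is an existing MediaWiki feature.

```python
import html
import re

# Hypothetical marker syntax: {{#secthead:name}}
MARKER = re.compile(r"\{\{#secthead:([^{}|]+)\}\}")


def expand_section_markers(wikitext):
    """Replace each hypothetical marker with an empty, invisible anchor element."""
    def to_anchor(match):
        name = match.group(1).strip()
        # Spaces become underscores so the name can serve as an id value.
        anchor_id = html.escape(name.replace(" ", "_"), quote=True)
        return '<span id="%s" class="secthead"></span>' % anchor_id

    return MARKER.sub(to_anchor, wikitext)


print(expand_section_markers("== History ==\n{{#secthead:history}}\nText."))
```

The editor only types the marker and a name; the markup that an extraction tool would look for stays out of sight.
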
Semantic MediaWiki is in parts very similar to what microformats try to 
achieve. It collects semantic data and offers it in machine-readable formats. 
The difference is that it also can work on the data within the wiki (e.g. you 
can search for things). One could adjust Semantic MediaWiki to support 
microformat-like applications. Now I don't know which microformat you have in 
mind -- Semantic MediaWiki is not the solution for everything; but it has a 
lot of existing infrastructure that was built for storing and processing such 
structured data. So maybe extending Semantic MediaWiki is easier than 
building another parser extension for microformats.
In general, microformats are application-specific mini-markups that were 
tailored to simplify markup for people who write XHTML. Microformats are 
intended to be easy to write (for people used to writing XHTML), but they are 
not easy to parse. Extracting microformat data from XHTML may require 
substantial effort (at least that's what I heard from microformat people at 
the W3C Technical Plenary last year). Now if you create a wiki markup 
(Semantic MediaWiki has one, but you can invent another one if you need), 
just to re-embed the already extracted information into XHTML, this seems to 
be unnecessarily complicated. Since you already have the information, you can 
easily provide it in a separate block of data -- both XML and RDF based 
formats are available for many typical microformat applications.
There are many ways to attach metadata to HTML. Flickr, for instance, exports 
RDF metadata by directly embedding it into XHTML pages in a sort of 
customised way. The RDFa effort is about to standardise a clean solution for 
this. Semantic MediaWiki in turn puts the RDF on another URL which is linked 
to the HTML-document through the header (this is fully XHTML conformant and 
it scales to larger amounts of data); tools like Firefox's Piggy Bank extension 
will find the data and can import it.
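
As a rough illustration of the header-link approach, the sketch below shows how a client might discover the RDF URL from a page's head. It assumes the link is advertised as rel="alternate" with type="application/rdf+xml"; the sample page and its /rdf/Berlin.rdf URL are made up for the example and are not what any particular wiki emits.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collect href values of <link> elements that advertise RDF/XML data."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        is_rdf = attr.get("type") == "application/rdf+xml"
        if is_rdf and "alternate" in rel and attr.get("href"):
            self.rdf_links.append(attr["href"])


def find_rdf_links(page):
    """Return the URLs of all RDF documents linked from the given HTML text."""
    finder = RdfLinkFinder()
    finder.feed(page)
    return finder.rdf_links


# Hypothetical page; the RDF URL is an assumption for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Berlin</title>
<link rel="alternate" type="application/rdf+xml" href="/rdf/Berlin.rdf" />
<link rel="stylesheet" type="text/css" href="/skins/main.css" />
</head><body><p>Berlin is a city.</p></body></html>"""

print(find_rdf_links(SAMPLE_PAGE))
```

The client never has to inspect the page body: it reads one link element and fetches the data from the other URL.
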
Microformats are still a very good starting point: they have identified common 
applications and provide sets of important property definitions for each. 
This is certainly something to draw from. I just would not implement a wiki 
markup, XHTML encoding, and custom handling for each such format.
Btw. what tool that supports microformats do you have in mind? Maybe this tool 
supports further input formats that have been invented for similar 
applications.
Cheers,
Markus
-- 
Markus Krötzsch
Institute AIFB, University of Karlsruhe, D-76128 Karlsruhe
mak@aifb.uni-karlsruhe.de        phone +49 (0)721 608 7362
www.aifb.uni-karlsruhe.de/WBS/     fax +49 (0)721 693  717
