On 04/24/2014 05:24 AM, Daan Kuijsten wrote:
On 23-Apr-14 21:29, wikitech-l-request(a)lists.wikimedia.org wrote:
Re: API attribute ID for querying wikipedia pages
@Matma Rex: This is way too general; I think it would be a lot better if
this were more detailed. For example, when I want to fetch a table with
all currencies on
https://en.wikipedia.org/wiki/List_of_circulating_currencies, I would make
an API call like
this: https://en.wikipedia.org/w/api.php?action=parse&page=List%20of%20c…ormat=jsonfm.
This returns 5 sections with "numbers" that I can use as reference points,
but I would rather have a "number" for the table within the section, since a
section can have multiple tables.
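
As a rough illustration of the section "numbers" mentioned above, here is a minimal Python sketch. It assumes the call uses action=parse with prop=sections and format=json (the URL quoted above is truncated, so its exact parameters are not known), and the sample response is hand-written in the shape such a call returns, not real API output. The helper names build_sections_url and section_indexes are hypothetical.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_sections_url(page_title):
    """Build an action=parse URL that asks only for the page's section list."""
    params = {
        "action": "parse",
        "page": page_title,
        "prop": "sections",   # assumption: request only the section list
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def section_indexes(response):
    """Map each section heading to its index (usable as the `section` parameter)."""
    return {s["line"]: s["index"] for s in response["parse"]["sections"]}


# Hand-written sample in the shape of a prop=sections response (not real output).
sample_response = {
    "parse": {
        "title": "Example page",
        "sections": [
            {"toclevel": 1, "level": "2", "line": "First heading",
             "number": "1", "index": "1"},
            {"toclevel": 1, "level": "2", "line": "Second heading",
             "number": "2", "index": "2"},
        ],
    }
}

url = build_sections_url("List of circulating currencies")
indexes = section_indexes(sample_response)
```

The index identifies a whole section only; nothing in this response distinguishes one table from another inside the same section, which is the gap described above.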
Querying specific (structured) data from Wikipedia is still very difficult
in my opinion. My suggestion is that every paragraph, image, link and table
gets a unique, identifiable number. This would make Wikipedia more
machine-readable.
We (the Parsoid team) are actually working on this, see
https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec/Element_IDs
Besides making it possible to reference content, our goal is to use these
ids as a key that lets us associate additional metadata with each element in
the DOM.
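
To make the idea of IDs as metadata keys concrete, here is a minimal Python sketch using only the standard library. The markup, the id values ("e1", "e2", "e3") and the metadata dictionary are all invented for illustration; they are not the actual Parsoid ID format or storage scheme.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Record the tag name of every element that carries an id attribute."""

    def __init__(self):
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "id" in attr_map:
            self.elements[attr_map["id"]] = tag


def collect_ids(html):
    """Return a dict mapping element id -> tag name for the given HTML."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


# Hypothetical markup: one section containing a paragraph and two tables.
sample_html = (
    '<p id="e1">Intro</p>'
    '<table id="e2"><tr><td>EUR</td></tr></table>'
    '<table id="e3"><tr><td>USD</td></tr></table>'
)

elements = collect_ids(sample_html)

# With per-element ids, each table in the section can be addressed separately.
table_ids = [eid for eid, tag in elements.items() if tag == "table"]

# Hypothetical metadata store keyed by element id.
metadata = {"e2": {"note": "first currency table"}}
note_for_first_table = metadata.get(table_ids[0], {}).get("note")
```

The point of the sketch is only that a stable id works both as a reference to one element and as a lookup key for data stored alongside the document.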
We expect stable element ids to be available in Parsoid output by this summer.
Gabriel