Hello Andrea,
> I guess it could be very useful for importing those data into Wikidata.
Even if the OAI-PMH API were removed, we could still extract the data from the Index: page serialization. It's a bit more difficult, but not much more (and definitely far less than the entity-matching problem).
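Fetching that serialization only needs the standard MediaWiki query API. A minimal sketch (the Wikisource API URL and page title here are illustrative placeholders):

```python
import urllib.parse


def index_wikitext_url(index_title, api="https://fr.wikisource.org/w/api.php"):
    """Build a MediaWiki API URL that returns the raw wikitext of an
    Index: page, from which the Index template fields can be parsed."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": index_title,
        "format": "json",
    }
    return api + "?" + urllib.parse.urlencode(params)
```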
> The problem with this API is that it works only on Index: pages, which cover only a fraction of the "book" entities on Wikisource. Index pages are not linked in a structured way to their ns0 pages, and this is a problem for us.
It's possible to retrieve the ns0 pages that use a given Index: page through the <pages> tag: you just have to retrieve the list of transclusions of the Index: page, as if it were a regular template.
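That transclusion lookup maps onto the API's `prop=transcludedin` query. A sketch of the request, restricted to the main namespace (the wiki URL is again a placeholder):

```python
import urllib.parse


def index_transclusions_url(index_title, api="https://fr.wikisource.org/w/api.php"):
    """Build a MediaWiki API URL listing the pages that transclude the
    given Index: page, restricted to the main namespace (ns0)."""
    params = {
        "action": "query",
        "prop": "transcludedin",
        "titles": index_title,
        "tinamespace": "0",  # main (ns0) pages only
        "tilimit": "max",
        "format": "json",
    }
    return api + "?" + urllib.parse.urlencode(params)
```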
> Ideally, we would know when an Index page has only one ns0 page, and we would use the same set of data to create an entity (or more) in Wikidata.
Yes. What we could do is check whether the "Title" field of the Index page contains only one link to a ns0 page, and consider that the "one" ns0 page. Another possibility, when the header feature of the <pages> tag is used, is to retrieve the pages that use the automatic summary feature and, if there is only one, consider it the "one".
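The first heuristic can be sketched in a few lines. Note the assumptions: the field is named `Title` and the Index template layout varies between Wikisource languages, so this is only an illustration, not a general parser:

```python
import re


def single_ns0_title_link(index_wikitext):
    """If the |Title= field of an Index: page's wikitext contains exactly
    one wiki link to a main-namespace (ns0) page, return that link's
    target; otherwise return None.  Field name is an assumption."""
    m = re.search(r"\|\s*Title\s*=\s*(.*)", index_wikitext)
    if not m:
        return None
    # Collect link targets from the field value.
    links = re.findall(r"\[\[([^\]|#]+)", m.group(1))
    # ns0 titles have no "Namespace:" prefix.
    ns0 = [t.strip() for t in links if ":" not in t]
    return ns0[0] if len(ns0) == 1 else None
```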
> and I don't know if that uses your API.
I believe it doesn't, but we should definitely ask him whether it's useful for his use case.
Thomas
On 31 Dec 2016, at 12:58, Andrea Zanni zanni.andrea84@gmail.com wrote:
Hi Thomas.
One year ago, I used the API: I downloaded the data from the Index pages, and I think it would be good to have it while we still don't have Wikidata. I guess it could be very useful for importing those data into Wikidata.
The problem with this API is that it works only on Index: pages, which cover only a fraction of the "book" entities on Wikisource. Index pages are not linked in a structured way to their ns0 pages, and this is a problem for us.
Ideally, we would know when an Index page has only one ns0 page, and we would use the same set of data to create an entity (or more) in Wikidata.
I know that Sam is trying to develop a similar tool: https://tools.wmflabs.org/ws-search/ and I don't know if that uses your API.
Aubrey
On Fri, Dec 30, 2016 at 6:15 PM, Thomas PT thomaspt@hotmail.fr wrote:
I did use the pageviews API, so I understand now why the count was 0. Sorry for the incorrect info, and thank you for your correction.
But my proposal still stands as I do not know any actual user of the API.
Thomas
On 30 Dec 2016, at 18:11, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Sorry for the double message.
Thomas PT, 30/12/2016 17:31:
According to the Wikimedia PageView statistic tool
Did you literally use https://tools.wmflabs.org/pageviews , or did you ask for real request data? The pageviews API doesn't count requests to the OAI-PMH endpoint at all, because they have "content-type: text/xml" while text/html is required: https://meta.wikimedia.org/wiki/Research:Page_view#Definition
Only people with access to https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest#wmf.webrequest can extract data on how much it's used.
Nemo
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l