Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given the title of the page. I have done some research and found
out that the abstract will be in rvsection=0.
So, for example, if I want the abstract of the "Eiffel Tower" wiki page, I
query the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
and then parse the XML data we get back, taking the wikitext inside the <rev
xml:space="preserve"> tag, which represents the abstract of the Wikipedia page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know whether there is any way to remove the infobox data
and get only the wikitext related to the page's abstract, or whether there is an
alternative method by which I can get the abstract of the page directly.
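For reference, here is a rough sketch of my current approach in Java, together with the
brace-counting heuristic I am considering for dropping the leading infobox template; the
stripping step is only a guess on my part, not something I know to be the right way:

import java.net.URL;
import java.net.URLEncoder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class AbstractFetcher {

    // Fetch the section-0 wikitext of a page via prop=revisions&rvsection=0.
    static String fetchSectionZero(String title) throws Exception {
        String url = "http://en.wikipedia.org/w/api.php?action=query&prop=revisions"
                + "&rvprop=content&rvsection=0&format=xml&titles="
                + URLEncoder.encode(title, "UTF-8");
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new URL(url).openStream());
        NodeList revs = doc.getElementsByTagName("rev");
        return revs.getLength() > 0 ? revs.item(0).getTextContent() : "";
    }

    // Heuristic: drop leading {{...}} templates (infobox, hatnotes) by counting braces.
    static String stripLeadingTemplates(String wikitext) {
        String text = wikitext.trim();
        while (text.startsWith("{{")) {
            int depth = 0, i = 0;
            while (i < text.length()) {
                if (text.startsWith("{{", i)) { depth++; i += 2; }
                else if (text.startsWith("}}", i)) { depth--; i += 2; if (depth == 0) break; }
                else { i++; }
            }
            text = text.substring(i).trim();
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(stripLeadingTemplates(fetchSectionZero("Eiffel Tower")));
    }
}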
Looking forward to your help.
Thanks in Advance
Aditya Uppu
On Thu, Oct 27, 2011 at 12:04 PM, Robert Stojnic <rainmansr(a)gmail.com> wrote:
>
> Could be that one of the hosts has a stale copy of an index?
>
> r.
>
How do I check for that, and how do I fix it?
Roan
I'm seeing a problem here when paging through large result sets. It
appears as though the order and the count are changing as I'm walking through
a given result set.
For instance, I make a call:
http://en.wikipedia.org/w/api.php?srprop=sectiontitle&srlimit=25&srsearch=1…
And I see I get 689 results. But wait, I make the call again, and there
are 690. OK, you mirror your Lucene indexes across a cluster and one is
slightly out of sync. Not a big deal. I adjust the offset and move on.
http://en.wikipedia.org/w/api.php?srprop=sectiontitle&srlimit=25&srsearch=1…
But now... whenever I get the 689-result set, the order appears
changed. I can't page through the results without the ordering being deterministic!
#1 Is there a way to specify the sort order for this command?
#2 Did I find a bug in one of your Lucene indexes?
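For reference, this is roughly how I am walking the result set (the search string below is a
placeholder, since I can't paste the full query here, and I am assuming <p> and <searchinfo>
are the element names the XML format uses for each hit and for the hit count):

import java.net.URL;
import java.net.URLEncoder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SearchPager {
    public static void main(String[] args) throws Exception {
        String query = URLEncoder.encode("some search terms", "UTF-8"); // placeholder query
        int limit = 25;
        for (int offset = 0; ; offset += limit) {
            String url = "http://en.wikipedia.org/w/api.php?action=query&list=search"
                    + "&format=xml&srprop=sectiontitle&srlimit=" + limit
                    + "&sroffset=" + offset + "&srsearch=" + query;
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(new URL(url).openStream());

            // Total reported hits -- this is the number that flips between 689 and 690.
            NodeList info = doc.getElementsByTagName("searchinfo");
            if (info.getLength() > 0) {
                System.out.println("totalhits=" + ((Element) info.item(0)).getAttribute("totalhits"));
            }

            NodeList hits = doc.getElementsByTagName("p"); // one <p> per search result
            for (int i = 0; i < hits.getLength(); i++) {
                System.out.println((offset + i) + "\t" + ((Element) hits.item(i)).getAttribute("title"));
            }
            if (hits.getLength() < limit) {
                break; // reached the last page
            }
        }
    }
}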
Thanks in advance for any insights.
-Joe
Hello,
I want to do a reverse lookup of an article name using the curid. For example,
curid=1194195 is for the article "Hortense Ellis" (as indicated by the URL
http://en.wikipedia.org/w/index.php?curid=1194195 ). Is there any method
in the API to achieve this?
If not, is there any way to use the operations for fetching
interwiki links with the curid?
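To make it concrete, this is the kind of lookup I am hoping exists; I am assuming here that
action=query accepts the page id directly through a pageids parameter:

import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CuridLookup {
    // Resolve a curid (page id) to its article title.
    static String titleForCurid(long curid) throws Exception {
        String url = "http://en.wikipedia.org/w/api.php?action=query&format=xml"
                + "&pageids=" + curid;
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new URL(url).openStream());
        NodeList pages = doc.getElementsByTagName("page");
        if (pages.getLength() == 0) {
            return null; // no page with that id
        }
        return ((Element) pages.item(0)).getAttribute("title");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titleForCurid(1194195)); // expecting "Hortense Ellis"
    }
}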
Thanks for the help,
Prateek
--
Prateek Jain
Research Assistant, Kno.e.sis Center
Wright State University
Fairborn, Ohio 45435
http://knoesis.wright.edu/students/prateek/
Hello, I'm a computer science engineering student.
I'm working with MediaWiki dumps and I would like to know whether there is a
way to obtain the pagelinks and categorylinks for a single category (to avoid
downloading the huge dump). I'm using Special:Export and I have an XML
file, but it has no pagelinks (which I figured out after using mwdumper to
convert the XML to SQL).
Thanks in advance,
P
I'm looking for files like:
http://dumps.wikimedia.org/enwiki/20111007/ (categorylinks, pagelinks),
but for a given category only.
Maybe you are right, but I can't figure out how to do this.
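The closest thing I have tried on the API side looks like the sketch below; I am assuming
generator=categorymembers can be combined with prop=links|categories like this, and the
category name is just an example:

import java.net.URL;
import java.net.URLEncoder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CategoryLinks {
    public static void main(String[] args) throws Exception {
        // Pages in one category, with their outgoing links and categories, in one request.
        String category = URLEncoder.encode("Category:Towers in Paris", "UTF-8"); // example only
        String url = "http://en.wikipedia.org/w/api.php?action=query&format=xml"
                + "&generator=categorymembers&gcmtitle=" + category + "&gcmlimit=50"
                + "&prop=links|categories&pllimit=500&cllimit=500";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new URL(url).openStream());
        NodeList pages = doc.getElementsByTagName("page");
        for (int i = 0; i < pages.getLength(); i++) {
            Element page = (Element) pages.item(i);
            System.out.println(page.getAttribute("title"));
            NodeList links = page.getElementsByTagName("pl"); // outgoing page links
            for (int j = 0; j < links.getLength(); j++) {
                System.out.println("  -> " + ((Element) links.item(j)).getAttribute("title"));
            }
        }
        // A complete client would also follow the query-continue values to page
        // through large categories and long link lists.
    }
}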
Thanks and Regards,
GDR
Per the below, protocol-relative URLs are now enabled on
test.wikipedia.org and will be rolled out to the rest of the wikis
over the course of the next few weeks. What this means is that URLs
used in the interface will now look like //example.com instead of
http://example.com , so we can support both HTTP and HTTPS without
splitting our cache.
The API, in most cases, will not output protocol-relative URLs, but
will continue to output http:// URLs regardless of whether you call it
over HTTP or HTTPS. This is because we don't expect API clients to be
able to resolve these correctly, and because the context of these URLs
(which is needed to resolve them) will frequently get lost along the
way. And we don't wanna go breaking clients, now, do we? :)
The exceptions to this, as far as I am aware, are:
* HTML produced by the parser will have protocol-relative URLs in <a
href="..."> tags etc.
* prop=extlinks and list=exturlusage will output URLs verbatim as they
appear in the article, which means they may output protocol-relative
URLs
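If your client does consume those two sources, normalizing is straightforward; a minimal
sketch (pick whichever scheme your application prefers):

public class ProtocolRelative {
    // Expand a protocol-relative URL ("//example.com/...") against a known scheme.
    // URLs that already carry a scheme are returned unchanged.
    static String resolve(String url, boolean useHttps) {
        if (url != null && url.startsWith("//")) {
            return (useHttps ? "https:" : "http:") + url;
        }
        return url;
    }

    public static void main(String[] args) {
        System.out.println(resolve("//example.com/wiki/Foo", true));      // https://example.com/wiki/Foo
        System.out.println(resolve("http://example.com/wiki/Foo", true)); // unchanged
    }
}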
If you are getting protocol-relative URLs in some other place, that's
probably a bug (or maybe it's intentional and I forgot to list it
here), so please let me know, or e-mail this list, or file a bug, if you
see that happening.
Roan Kattouw (Catrope)
---------- Forwarded message ----------
From: Ryan Lane <rlane32(a)gmail.com>
Date: Thu, Jul 14, 2011 at 8:55 PM
Subject: [Wikitech-l] Protocol-relative URLs enabled on test.wikipedia.org
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
Over the past couple days Roan Kattouw and I have been pushing out
changes to enable protocol-relative URL support. We've gotten to a
point where we think it is stable and working.
We've enabled this on test.wikipedia.org, and plan on running it for
two weeks before enabling it elsewhere. Please test whether everything is
working properly, especially with regard to the API and bots. Report
any bugs you find in Bugzilla.
- Ryan
Per https://bugzilla.wikimedia.org/show_bug.cgi?id=24781 , the XML
output returned by the API has a namespace since 1.18.
1.17: <api>
1.18: <api xmlns="http://www.mediawiki.org/xml/api/">
Apparently this breaks things for some people using XPath expressions,
although comments on Bugzilla indicated this shouldn't be the case,
and since I don't know anything about XML namespaces I trusted my
fellow MW devs who said it wouldn't be a problem :)
Roan
Hi,
Is it normal that a namespace has been added to the XML response of the API
in MW 1.18?
I was quite busy lately, so I may have missed the announcement about this.
It means that responses from MW 1.18 are not compatible with responses from
previous versions.
How can we keep our tools working with both 1.18 and earlier versions?
Thanks
Nico
<?xml version="1.0" encoding="UTF-8"?>
<api xmlns="http://www.mediawiki.org/xml/api/">
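One way to keep a single code path working against both the 1.17 and 1.18 responses (a
sketch for a Java DOM/XPath client; other toolkits have equivalent options) is to parse
without namespace awareness, or to match elements by local-name() so the expressions
ignore the namespace:

import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class NamespaceAgnosticClient {
    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<api xmlns=\"http://www.mediawiki.org/xml/api/\">"
                + "<query><pages><page title=\"Foo\"/></pages></query></api>";

        // Parse WITHOUT namespace awareness: element names carry no namespace,
        // so the same plain XPath matches 1.17 and 1.18 output alike.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(false);
        Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList pages = (NodeList) xpath.evaluate("/api/query/pages/page",
                doc, XPathConstants.NODESET);
        System.out.println(pages.getLength()); // 1, with or without the xmlns attribute

        // Alternative for namespace-aware documents: match by local name, e.g.
        // "/*[local-name()='api']/*[local-name()='query']/*[local-name()='pages']/*[local-name()='page']"
    }
}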