Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given the title of the page. I have done some research and found
out that the abstract will be in rvsection=0.
So, for example, if I want the abstract of the "Eiffel Tower" wiki page, I
query the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
and then parse the XML data we get back, taking the wikitext inside the <rev
xml:space="preserve"> tag, which represents the abstract of the Wikipedia page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know whether there is any way to remove the infobox data
and get only the wikitext related to the page's abstract, or whether there is an
alternative method by which I can get the abstract of the page directly.
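For reference, here is a rough sketch of my current approach in Java, together with the
brace-counting heuristic I am considering for dropping the leading infobox template; the
stripping step is only a guess on my part, not something I know to be the right way:

import java.net.URL;
import java.net.URLEncoder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class AbstractFetcher {

    // Fetch the section-0 wikitext of a page via prop=revisions&rvsection=0.
    static String fetchSectionZero(String title) throws Exception {
        String url = "http://en.wikipedia.org/w/api.php?action=query&prop=revisions"
                + "&rvprop=content&rvsection=0&format=xml&titles="
                + URLEncoder.encode(title, "UTF-8");
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new URL(url).openStream());
        NodeList revs = doc.getElementsByTagName("rev");
        return revs.getLength() > 0 ? revs.item(0).getTextContent() : "";
    }

    // Heuristic: drop leading {{...}} templates (infobox, hatnotes) by counting braces.
    static String stripLeadingTemplates(String wikitext) {
        String text = wikitext.trim();
        while (text.startsWith("{{")) {
            int depth = 0, i = 0;
            while (i < text.length()) {
                if (text.startsWith("{{", i)) { depth++; i += 2; }
                else if (text.startsWith("}}", i)) { depth--; i += 2; if (depth == 0) break; }
                else { i++; }
            }
            text = text.substring(i).trim();
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(stripLeadingTemplates(fetchSectionZero("Eiffel Tower")));
    }
}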
Looking forward to your help.
Thanks in Advance
Aditya Uppu
On Thu, Oct 27, 2011 at 12:04 PM, Robert Stojnic <rainmansr(a)gmail.com> wrote:
>
> Could be that one of the hosts has a stale copy of an index?
>
> r.
>
How do I check for that, and how do I fix it?
Roan
I'm seeing a problem here when paging through large result sets. It
appears as though the order and the count are changing as I'm walking through
a given result set.
For instance, I make a call:
http://en.wikipedia.org/w/api.php?srprop=sectiontitle&srlimit=25&srsearch=1…
And I see I get 689 results. But wait, I make the call again, and there
are 690. OK, you mirror your Lucene indexes across a cluster and one is
slightly out of sync. Not a big deal. I adjust the offset and move on.
http://en.wikipedia.org/w/api.php?srprop=sectiontitle&srlimit=25&srsearch=1…
But now... whenever I get the 689-result set, the order appears
changed. I can't page through the results without the ordering being deterministic!
#1 Is there a way to specify the sort order for this command?
#2 Did I find a bug in one of your Lucene indexes?
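For reference, this is roughly how I am walking the result set (the search string below is a
placeholder, since I can't paste the full query here, and I am assuming <p> and <searchinfo>
are the element names the XML format uses for each hit and for the hit count):

import java.net.URL;
import java.net.URLEncoder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SearchPager {
    public static void main(String[] args) throws Exception {
        String query = URLEncoder.encode("some search terms", "UTF-8"); // placeholder query
        int limit = 25;
        for (int offset = 0; ; offset += limit) {
            String url = "http://en.wikipedia.org/w/api.php?action=query&list=search"
                    + "&format=xml&srprop=sectiontitle&srlimit=" + limit
                    + "&sroffset=" + offset + "&srsearch=" + query;
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(new URL(url).openStream());

            // Total reported hits -- this is the number that flips between 689 and 690.
            NodeList info = doc.getElementsByTagName("searchinfo");
            if (info.getLength() > 0) {
                System.out.println("totalhits=" + ((Element) info.item(0)).getAttribute("totalhits"));
            }

            NodeList hits = doc.getElementsByTagName("p"); // one <p> per search result
            for (int i = 0; i < hits.getLength(); i++) {
                System.out.println((offset + i) + "\t" + ((Element) hits.item(i)).getAttribute("title"));
            }
            if (hits.getLength() < limit) {
                break; // reached the last page
            }
        }
    }
}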
Thanks in advance for any insights.
-Joe
Hello,
I want to do a reverse lookup of an article name using the curid. For example,
curid=1194195 is for the article "Hortense Ellis" (as indicated by the URL
http://en.wikipedia.org/w/index.php?curid=1194195 ). Is there any method
in the API to achieve this?
If not, is there any way to use the operations for fetching
interwiki links with the curid?
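To make it concrete, this is the kind of lookup I am hoping exists; I am assuming here that
action=query accepts the page id directly through a pageids parameter:

import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CuridLookup {
    // Resolve a curid (page id) to its article title.
    static String titleForCurid(long curid) throws Exception {
        String url = "http://en.wikipedia.org/w/api.php?action=query&format=xml"
                + "&pageids=" + curid;
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new URL(url).openStream());
        NodeList pages = doc.getElementsByTagName("page");
        if (pages.getLength() == 0) {
            return null; // no page with that id
        }
        return ((Element) pages.item(0)).getAttribute("title");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titleForCurid(1194195)); // expecting "Hortense Ellis"
    }
}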
Thanks for the help,
Prateek
--
Prateek Jain
Research Assistant, Kno.e.sis Center
Wright State University
Fairborn, Ohio 45435
http://knoesis.wright.edu/students/prateek/
Hello, I'm a computer science engineering student.
I'm working with MediaWiki dumps and I would like to know whether there is a
way to obtain the pagelinks and categorylinks for a single category (to avoid
downloading the huge dump). I'm using Special:Export and I have an XML
file, but it has no pagelinks (which I figured out after using mwdumper to
convert the XML to SQL).
Thanks in advance,
P
I'm looking for files like:
http://dumps.wikimedia.org/enwiki/20111007/ (categorylinks, pagelinks),
but for a given category only.
Maybe you are right, but I can't figure out how to do this.
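The closest thing I have tried on the API side looks like the sketch below; I am assuming
generator=categorymembers can be combined with prop=links|categories like this, and the
category name is just an example:

import java.net.URL;
import java.net.URLEncoder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CategoryLinks {
    public static void main(String[] args) throws Exception {
        // Pages in one category, with their outgoing links and categories, in one request.
        String category = URLEncoder.encode("Category:Towers in Paris", "UTF-8"); // example only
        String url = "http://en.wikipedia.org/w/api.php?action=query&format=xml"
                + "&generator=categorymembers&gcmtitle=" + category + "&gcmlimit=50"
                + "&prop=links|categories&pllimit=500&cllimit=500";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new URL(url).openStream());
        NodeList pages = doc.getElementsByTagName("page");
        for (int i = 0; i < pages.getLength(); i++) {
            Element page = (Element) pages.item(i);
            System.out.println(page.getAttribute("title"));
            NodeList links = page.getElementsByTagName("pl"); // outgoing page links
            for (int j = 0; j < links.getLength(); j++) {
                System.out.println("  -> " + ((Element) links.item(j)).getAttribute("title"));
            }
        }
        // A complete client would also follow the query-continue values to page
        // through large categories and long link lists.
    }
}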
Thanks and Regards,
GDR
Per the below, protocol-relative URLs are now enabled on
test.wikipedia.org and will be rolled out to the rest of the wikis
over the course of the next few weeks. What this means is that URLs
used in the interface will now look like //example.com instead of
http://example.com , so we can support both HTTP and HTTPS without
splitting our cache.
The API, in most cases, will not output protocol-relative URLs, but
will continue to output http:// URLs regardless of whether you call it
over HTTP or HTTPS. This is because we don't expect API clients to be
able to resolve these correctly, and because the context of these URLs
(which is needed to resolve them) will frequently get lost along the
way. And we don't wanna go breaking clients, now, do we? :)
The exceptions to this, as far as I am aware, are:
* HTML produced by the parser will have protocol-relative URLs in <a
href="..."> tags etc.
* prop=extlinks and list=exturlusage will output URLs verbatim as they
appear in the article, which means they may output protocol-relative
URLs
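If your client does consume those two sources, normalizing is straightforward; a minimal
sketch (pick whichever scheme your application prefers):

public class ProtocolRelative {
    // Expand a protocol-relative URL ("//example.com/...") against a known scheme.
    // URLs that already carry a scheme are returned unchanged.
    static String resolve(String url, boolean useHttps) {
        if (url != null && url.startsWith("//")) {
            return (useHttps ? "https:" : "http:") + url;
        }
        return url;
    }

    public static void main(String[] args) {
        System.out.println(resolve("//example.com/wiki/Foo", true));      // https://example.com/wiki/Foo
        System.out.println(resolve("http://example.com/wiki/Foo", true)); // unchanged
    }
}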
If you are getting protocol-relative URLs in some other place, that's
probably a bug (or maybe it's intentional and I forgot to list it
here), so please let me know, or e-mail this list, or file a bug, if you
see that happening.
Roan Kattouw (Catrope)
---------- Forwarded message ----------
From: Ryan Lane <rlane32(a)gmail.com>
Date: Thu, Jul 14, 2011 at 8:55 PM
Subject: [Wikitech-l] Protocol-relative URLs enabled on test.wikipedia.org
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
Over the past couple days Roan Kattouw and I have been pushing out
changes to enable protocol-relative URL support. We've gotten to a
point where we think it is stable and working.
We've enabled this on test.wikipedia.org, and plan on running it for
two weeks before enabling it elsewhere. Please test whether everything is
working properly, especially with regard to the API and bots. Report
any bugs you find in Bugzilla.
- Ryan
Per https://bugzilla.wikimedia.org/show_bug.cgi?id=24781 , the XML
output returned by the API has a namespace since 1.18.
1.17: <api>
1.18: <api xmlns="http://www.mediawiki.org/xml/api/">
Apparently this breaks things for some people using XPath expressions,
although comments on Bugzilla indicated this shouldn't be the case,
and since I don't know anything about XML namespaces I trusted my
fellow MW devs who said it wouldn't be a problem :)
Roan
Hi,
Is it normal that a namespace has been added to the XML response of the API
in MW 1.18?
I was quite busy lately, so I may have missed the announcement about this.
It means that responses from MW 1.18 are not compatible with responses from
previous versions.
How can we keep our tools working with both 1.18 and earlier versions?
Thanks
Nico
<?xml version="1.0" encoding="UTF-8"?>
<api xmlns="http://www.mediawiki.org/xml/api/">
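One way to keep a single code path working against both the 1.17 and 1.18 responses (a
sketch for a Java DOM/XPath client; other toolkits have equivalent options) is to parse
without namespace awareness, or to match elements by local-name() so the expressions
ignore the namespace:

import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class NamespaceAgnosticClient {
    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<api xmlns=\"http://www.mediawiki.org/xml/api/\">"
                + "<query><pages><page title=\"Foo\"/></pages></query></api>";

        // Parse WITHOUT namespace awareness: element names carry no namespace,
        // so the same plain XPath matches 1.17 and 1.18 output alike.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(false);
        Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList pages = (NodeList) xpath.evaluate("/api/query/pages/page",
                doc, XPathConstants.NODESET);
        System.out.println(pages.getLength()); // 1, with or without the xmlns attribute

        // Alternative for namespace-aware documents: match by local name, e.g.
        // "/*[local-name()='api']/*[local-name()='query']/*[local-name()='pages']/*[local-name()='page']"
    }
}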