Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given its title. I have done some research and found out that the
abstract will be in rvsection=0.
So, for example, if I want the abstract of the "Eiffel Tower" wiki page,
then I query using the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
and parse the XML data we get back, taking the wikitext in the tag <rev
xml:space="preserve">, which represents the abstract of the Wikipedia page.
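For reference, this is roughly how I pull the wikitext out of the response with the standard DOM parser (the response below is a trimmed-down sample of the structure, not real API output):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class RevExtractor {
    // Pull the wikitext out of the first <rev> element of an API response.
    static String extractRev(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList revs = doc.getElementsByTagName("rev");
        return revs.getLength() > 0 ? revs.item(0).getTextContent() : null;
    }

    public static void main(String[] args) throws Exception {
        // Trimmed-down sample of the query response structure.
        String sample = "<api><query><pages><page><revisions>"
                + "<rev xml:space=\"preserve\">{{Infobox}}Intro text.</rev>"
                + "</revisions></page></pages></query></api>";
        System.out.println(extractRev(sample));
    }
}
```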
But this wikitext also contains the infobox data, which I do not need. I
would like to know if there is any way to remove the infobox data and get
only the wikitext related to the page's abstract, or if there is an
alternative method by which I can get the abstract of the page directly.
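One idea I had (assuming the infobox is the leading top-level {{...}} template, which may not hold for every page, and ignoring {{{...}}} parameter syntax) is to strip leading templates by counting brace pairs, something like:

```java
public class InfoboxStripper {
    // Remove leading top-level {{...}} templates (e.g. the infobox) from
    // wikitext by counting matching {{ and }} pairs. Assumes the infobox
    // comes first and braces are balanced; gives up otherwise.
    static String stripLeadingTemplates(String wikitext) {
        String s = wikitext.trim();
        while (s.startsWith("{{")) {
            int depth = 0, i = 0;
            while (i < s.length() - 1) {
                if (s.charAt(i) == '{' && s.charAt(i + 1) == '{') {
                    depth++; i += 2;
                } else if (s.charAt(i) == '}' && s.charAt(i + 1) == '}') {
                    depth--; i += 2;
                    if (depth == 0) break;
                } else {
                    i++;
                }
            }
            if (depth != 0) break;  // unbalanced: return rather than mangle
            s = s.substring(i).trim();
        }
        return s;
    }

    public static void main(String[] args) {
        String wikitext = "{{Infobox building|name=Eiffel Tower|height={{convert|324|m}}}}\n"
                + "The '''Eiffel Tower''' is a wrought-iron lattice tower.";
        System.out.println(stripLeadingTemplates(wikitext));
    }
}
```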
Looking forward to your help.
Thanks in Advance
Aditya Uppu
Hi all!
I've been studying the API and had a look at the extensions, and I have
some questions about the proper way to style the HTML obtained for an article.
I am looking at the parse api.
I wonder if contentformat or contentmodel do the trick to obtain an article
with "cleaner" HTML or the matching CSS, so that it can be re-styled.
Unfortunately, I have not been able to use these parameters:
http://en.wikipedia.org/w/api.php?action=parse&page=New%20Jersey&contentfor…
Could you please give me an example so that I understand what they do?
Alternatively, I could use the mobileformat parameter, which seems to
produce adapted, cleaner HTML.
But still, I would need to find out the scheme of IDs and classes used in
the HTML to re-style it.
Where could I find such a scheme?
I would also like to get rid of the [edit] or [update] elements: I want the
article in "read mode" only.
Is there perhaps a parameter providing this output?
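For now I strip them myself with a regex, assuming the [edit] links sit in a <span class="mw-editsection"> (or the older class="editsection") containing no nested spans, e.g.:

```java
import java.util.regex.Pattern;

public class EditLinkStripper {
    // Assumes [edit] links are wrapped in <span class="mw-editsection">...</span>
    // (older skins used class="editsection") with no nested spans inside;
    // the non-greedy match stops at the first </span>.
    private static final Pattern EDIT_SECTION =
            Pattern.compile("<span class=\"(mw-)?editsection\">.*?</span>");

    static String stripEditLinks(String html) {
        return EDIT_SECTION.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<h2>History<span class=\"mw-editsection\">"
                + "[<a href=\"/w/index.php?action=edit\">edit</a>]</span></h2>";
        System.out.println(stripEditLinks(html));
    }
}
```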
If any of you have ever seen the "Dictionary" application on a MacBook
(pointing at Wikipedia), that is the style I am aiming for.
Am I going in the proper direction?
Do you have any remarks or suggestions?
I was thinking of applying a CSS to the obtained HTML; if there were a JSON
output with just "bare" elements (such as tables, images, paragraphs,
external links, internal links) it would be awesome...
As always, thank you very much for sharing ideas.
--
Luigi Assom
Skype contact: oggigigi
Hello,
I read that there is no API to obtain the main image of Wikipedia articles,
but I also noticed that many topics on Stack Overflow are a year old.
Is there any progress on this?
Also, is it possible to obtain thumbnails of images (of a specified size)
for multiple pages?
If there is no way to obtain the first image appearing in the article,
which way would you suggest to pick the image from the set of images in the
page and get its thumbnail, without parsing the whole page?
This is important because I am trying to make sequential calls and I don't
want to query in parallel.
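One lead I found for the thumbnails part is the PageImages extension's prop=pageimages, which seems to return one representative thumbnail per page and can be batched over several titles; I build the query like this, though I am not sure it is the intended way (pilimit and pithumbsize are my assumptions about the relevant parameters):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ThumbQuery {
    // Build a batched prop=pageimages query (PageImages extension) asking
    // for one representative thumbnail per page, scaled to thumbSize px.
    // Titles are joined with "|" as the API expects for multi-page queries.
    static String buildThumbQuery(String[] titles, int thumbSize) throws Exception {
        String joined = String.join("|", titles);
        return "https://en.wikipedia.org/w/api.php?action=query&prop=pageimages"
                + "&pithumbsize=" + thumbSize
                + "&pilimit=50"
                + "&titles=" + URLEncoder.encode(joined, StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildThumbQuery(
                new String[] {"Eiffel Tower", "New Jersey"}, 200));
    }
}
```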
Thank you for sharing ideas and state of the art!
Luigi
--
Luigi Assom
Skype contact: oggigigi
Hi, I am requesting a number of pages to be parsed on Wikipedia. Generally I
am being returned the text, as expected. For some higher-traffic pages, I am
instead receiving a notice about the result being cached. Could someone give
me some pointers as to how I am meant to handle this? I'm guessing I need to
make a new request to the parser cache, using the provided key (see below).
Does anyone have a link to a page that explains this behaviour? The notice
text is below…
Cheers
Chris Thomas
"<ol>
<li>REDIRECT <a href="/wiki/Stonehenge"
title="Stonehenge">Stonehenge</a></li>
</ol>
<!--
NewPP limit report
Preprocessor visited node count: 1/1000000
Preprocessor generated node count: 4/1500000
Post-expand include size: 0/2048000 bytes
Template argument size: 0/2048000 bytes
Highest expansion depth: 1/40
Expensive parser function count: 0/500
-->
<!-- Saved in parser cache with key enwiki:pcache:idhash:518934-0!*!0!*!*!*!
* and timestamp 20130427125848 -->
"
Language links added by Wikidata are currently stored in the parser
cache and in the langlinks table in the database, which means they
work the same as in-page langlinks but also that the page must be
reparsed if these wikidata langlinks change. The Wikidata team has
proposed to remove the necessity for the page reparse, at the cost of
changing the behavior of the API with regard to langlinks.
Gerrit change 59997[1] (still in review) will make the following
behavioral changes:
* action=parse will return only the in-page langlinks by default.
Inclusion of Wikidata langlinks may be requested using a new
parameter.
* list=allpages with apfilterlanglinks will only consider in-page langlinks.
* list=langbacklinks will only consider in-page langlinks.
* prop=langlinks will only list in-page langlinks.
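For reference, the current behavior can be checked with a plain langlinks query such as (the new parameter for including Wikidata langlinks is not shown here, as it is still in review):

```
https://en.wikipedia.org/w/api.php?action=query&prop=langlinks&titles=New%20Jersey&lllimit=500
```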
Gerrit change 60034[2] (still in review) will make the following
behavioral changes:
* prop=langlinks will have a new parameter to request inclusion of the
Wikidata langlinks in the result.
A future change, not coded yet, will allow for Wikidata to flag its
langlinks in various ways. For example, it could indicate which of the
other-language articles are Featured Articles.
At this time, it seems likely that the first change will make it into
1.22wmf3.[3] The timing of the second and third changes is less
certain.
[1]: https://gerrit.wikimedia.org/r/#/c/59997
[2]: https://gerrit.wikimedia.org/r/#/c/60034
[3]: https://www.mediawiki.org/wiki/MediaWiki_1.22/Roadmap
--
Brad Jorsch
Software Engineer
Wikimedia Foundation
_______________________________________________
Mediawiki-api-announce mailing list
Mediawiki-api-announce(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce
Hello, all.
I have a request:
I am listening to the recent changes list, and for a "page restore" action
I receive a log entry like this:
<rc type="log" ns="0" title="William Pierce, Jr." rcid="565524106"
pageid="38932317" revid="0" old_revid="0" user="Secret" oldlen="0"
newlen="0" timestamp="2013-03-27T04:02:12Z" comment="5 revisions restored:
keep the original redirect" logid="48107996" logtype="delete"
logaction="restore"/>
It says that it restored 5 revisions, but it does not tell me which 5
revisions they were.
But in the log of "revision restore/delete", we can get the list of the
affected revisions.
Would it be possible to change the API to support this feature?
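A workaround I am considering, until the log entry itself lists the revisions, is to re-query the page's history right after the restore, using the pageid from the rc entry, e.g.:

```
https://en.wikipedia.org/w/api.php?action=query&prop=revisions&pageids=38932317&rvlimit=10&rvprop=ids|timestamp|user
```

but this only shows the revisions that are visible now, not which of them the restore brought back.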
Thanks.
Hi,
Is it possible to use the API on Wikidata, for example to remove or update
some interwiki links? How do you do that?
I'd like to create a few bot tools for deleting/renaming categories, and
I'd need to update the interwiki links on Wikidata.
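From reading the docs, it looks like the Wikibase modules might be the way; for example, a sitelink can apparently be set or updated with wbsetsitelink, something like the following (sent as a POST with a valid edit token; the item ID and title here are only illustrative):

```
https://www.wikidata.org/w/api.php?action=wbsetsitelink&id=Q60&linksite=frwiki&linktitle=New%20York&token=...
```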
Thanks
Nico
Hello!
I have a question:
Can the "delete page" and "restore page" actions cancel each other out?
I.e., does deleting a page and then restoring it amount to doing nothing to
the page?
I listened to the recent changes and got more than 5 revisions of this page:
http://en.wikipedia.org/w/index.php?title=William_Pierce,_Jr.&redirect=no
Then I got a delete page action, and later a restore page action, as you
can see from the log:
<rc type="log" ns="0" title="William Pierce, Jr." rcid="565524106"
pageid="38932317" revid="0" old_revid="0" user="Secret" oldlen="0"
newlen="0" timestamp="2013-03-27T04:02:12Z" comment="5 revisions
restored: keep the original redirect" logid="48107996" logtype="delete"
logaction="restore"/>
It only restored 5 revisions but did not specify which revisions they were.
In my opinion, the restore and delete page actions should cancel each other
out, like the restore and delete revision actions do.