Well, the pages that you do not want are called "disambiguation" pages.
They all should have a template that adds them to the category "All
disambiguation pages" on en.wikipedia. You can check for that category on
your pages so you know which ones to manually review.
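
As a rough sketch of that category check (not an official client), the Python below builds a `prop=categories` query restricted with `clcategories` and then filters a decoded JSON response. The helper names, the User-Agent string, and any sample titles are made up for illustration, and continuation of long result sets is not handled.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"
DISAMBIG_CATEGORY = "Category:All disambiguation pages"


def build_category_query(titles):
    """Build a query URL asking whether each title carries the disambiguation category."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "categories",
        # Only report this one category, so other categories are not listed.
        "clcategories": DISAMBIG_CATEGORY,
        "cllimit": "max",
        "titles": "|".join(titles),
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def disambiguation_titles(response):
    """Return the sorted titles whose page entry lists the disambiguation category."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(
        page["title"]
        for page in pages.values()
        if any(c.get("title") == DISAMBIG_CATEGORY for c in page.get("categories", []))
    )


def fetch_json(url):
    """Fetch a URL and decode its JSON body (needs network access)."""
    # Hypothetical User-Agent; replace with one that identifies your bot.
    request = urllib.request.Request(url, headers={"User-Agent": "disambig-check-sketch/0.1"})
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Calling `disambiguation_titles(fetch_json(build_category_query(my_titles)))` on a batch of titles should give the subset that needs manual review; pages without the category simply come back with no `categories` entry.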
--
Dan
On Dec 28, 2012 11:19 PM, "NetizenApps" <netizenapps(a)gmail.com> wrote:
Thanks Brad.
So is querying by title the recommended way to fetch article page content?
Right now, even though I am using a custom bot to confirm that article
pages exist on Wikipedia for my database of Wikipedia title queries, I am
manually verifying all of my hundreds of pages to ensure the correct,
relevant article page is fetched. This is especially necessary when
multiple entries exist for the same common title.
I want to avoid repeating this manual verification of all my pages in the
future.
Is there a way to query the page move history for a given time range?
This information would save me from manually re-verifying all my pages
from time to time on the assumption that some of them may have moved. I
could just run a bot to find out which pages were moved in the last 1 or 2
years, and then manually verify only those pages to ensure the correct,
relevant article page is fetched.
Please advise.
Thanks,
Ravi
On Dec 24, 2012, at 6:04 AM, Brad Jorsch <bjorsch(a)wikimedia.org> wrote:
> On Mon, Dec 24, 2012 at 8:56 AM, NetizenApps <netizenapps(a)gmail.com> wrote:
>>
>> I found that querying by title is not reliable, as I have seen titles
>> change when more articles with the same or similar titles are added.
>> Some of the older titles now lead to a generic page stating that the
>> title might mean one of the following, with a list of article page
>> links shown below this message. I want to avoid being redirected to
>> this generic page and always stay on the article page, as I plan to
>> show this Wikipedia content on my numerous sub-domain home pages.
>
> You're not actually being redirected. It's just that the old page was
> moved and a new page created at the old title.
>
>> Is there a page id that I can use which doesn't change at all and is
>> always associated with the same article? If so, how can I query the
>> page id using current titles? Any examples and pointers are very much
>> appreciated.
>
> There's the pageid, which can be queried using pageids instead of
> titles.[1] Note though that if the page is deleted and then undeleted
> (or recreated), a new pageid may be assigned.
>
> [1]: e.g. https://en.wikipedia.org/w/api.php?action=query&pageids=15580374
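
To illustrate the title-to-pageid step asked about above, here is a minimal Python sketch: it builds one query by `titles` to learn the pageids, and a second by `pageids` like the example URL in [1]. The function names and the sample response are assumptions for illustration; note that the API may normalize titles, so the title it reports can differ slightly from the one requested.

```python
import urllib.parse

API_URL = "https://en.wikipedia.org/w/api.php"


def title_query_url(titles):
    """Build a query URL that looks pages up by their current titles."""
    params = {"action": "query", "format": "json", "titles": "|".join(titles)}
    return API_URL + "?" + urllib.parse.urlencode(params)


def pageid_query_url(pageids):
    """Build a query URL that looks pages up by pageid instead of title."""
    params = {
        "action": "query",
        "format": "json",
        "pageids": "|".join(str(pageid) for pageid in pageids),
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def pageids_by_title(response):
    """Map each reported title to its pageid, skipping entries without a pageid."""
    pages = response.get("query", {}).get("pages", {})
    return {page["title"]: page["pageid"] for page in pages.values() if "pageid" in page}
```

The pageids returned for the current titles could be stored in the database once and used for later fetches, keeping in mind the caveat above that deletion and recreation may assign a new pageid.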
--
Brad Jorsch
Software Engineer
Wikimedia Foundation
_______________________________________________
Mediawiki-api mailing list
Mediawiki-api(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api