Folks,
Can anyone please tell me the most reliable way to request an article page using the MediaWiki API without fear of getting redirected?
I have found that querying by title is not reliable, because titles change when more articles with the same or similar titles are added. Some of my older titles now lead to a generic page that says the title might refer to one of several things, followed by a list of links to article pages. I want to avoid landing on this generic page and always stay on the article page, as I plan to show this Wikipedia content on the home pages of my numerous sub-domains.
Is there a page ID that never changes and is always associated with the same article? If so, how can I look up the page ID from a current title? Any examples and pointers are very much appreciated.
Thanks, Ravi
On Mon, Dec 24, 2012 at 8:56 AM, NetizenApps netizenapps@gmail.com wrote:
I have found that querying by title is not reliable, because titles change when more articles with the same or similar titles are added. Some of my older titles now lead to a generic page that says the title might refer to one of several things, followed by a list of links to article pages. I want to avoid landing on this generic page and always stay on the article page, as I plan to show this Wikipedia content on the home pages of my numerous sub-domains.
You're not actually being redirected. It's just that the old page was moved and a new page created at the old title.
Is there a page ID that never changes and is always associated with the same article? If so, how can I look up the page ID from a current title? Any examples and pointers are very much appreciated.
There's the pageid, which can be queried using pageids instead of titles.[1] Note though that if the page is deleted and then undeleted (or recreated), a new pageid may be assigned.
[1]: e.g. https://en.wikipedia.org/w/api.php?action=query&pageids=15580374
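As a rough illustration of the reply above, here is a minimal Python sketch that looks up the pageid for a current title and then queries by pageid. The helper names, the sample response and the User-Agent string are hypothetical; the response layout assumed here is the default JSON format, where `query.pages` is keyed by pageid and missing titles carry a `missing` key.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(**params):
    """Build an action=query URL that asks for JSON output."""
    query = {"action": "query", "format": "json"}
    query.update(params)
    return API_URL + "?" + urllib.parse.urlencode(query)


def title_lookup_url(titles):
    """URL that resolves one or more current titles (joined with '|')."""
    return build_query_url(titles="|".join(titles))


def pageids_url(pageids):
    """URL that queries pages by their pageid instead of their title."""
    return build_query_url(pageids="|".join(str(p) for p in pageids))


def extract_pageids(response):
    """Map each returned title to its pageid, or None if the page is missing."""
    pages = response.get("query", {}).get("pages", {})
    result = {}
    for page in pages.values():
        if "missing" in page or "invalid" in page:
            result[page.get("title")] = None
        else:
            result[page.get("title")] = page.get("pageid")
    return result


def fetch_json(url):
    """Fetch a URL and decode the JSON body (the User-Agent is a placeholder)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-pageid-script/0.1 (hypothetical)"}
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

A call such as `extract_pageids(fetch_json(title_lookup_url(["Some title"])))` would give the ids to store. The API may normalize titles, so the titles in the result can differ slightly from the input, and, as noted above, a stored pageid can still change if the page is deleted and later undeleted or recreated.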
Thanks Brad.
So is querying by title the recommended way to fetch article page content?
Right now, even though I use a custom bot to confirm that an article page exists on Wikipedia for each title in my database, I still manually verify all of my several hundred pages to make sure the correct, relevant article is fetched. This matters most when multiple entries exist for the same common title.
I want to avoid repeating this manual verification of all my pages in the future.
Is there a way to query the 'page move history' for a given time range?
That would spare me from re-verifying every page from time to time on the assumption that some may have moved. I could run a bot to find out which pages were moved in the last one or two years, and then manually verify only those pages to make sure the correct, relevant article is fetched.
Please advise.
Thanks, Ravi
On Dec 24, 2012, at 6:04 AM, Brad Jorsch bjorsch@wikimedia.org wrote:
-- Brad Jorsch Software Engineer Wikimedia Foundation
Mediawiki-api mailing list Mediawiki-api@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
Well, the pages that you do not want are called "disambiguation" pages. They all should have a template that adds them to the category "All disambiguation pages" on en.wikipedia. You can check for that category on your pages so you know which ones to manually review.
-- Dan
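The category check Dan describes could be sketched roughly as follows in Python. This assumes `prop=categories` with the `clcategories` filter, so that only pages belonging to "All disambiguation pages" come back with a `categories` entry; the helper names and the sample data are hypothetical.

```python
import urllib.parse

API_URL = "https://en.wikipedia.org/w/api.php"
DISAMBIG_CATEGORY = "Category:All disambiguation pages"


def disambiguation_check_url(pageids):
    """URL asking, for each pageid, whether it is in the disambiguation category."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "categories",
        "clcategories": DISAMBIG_CATEGORY,  # only report this one category
        "pageids": "|".join(str(p) for p in pageids),
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def find_disambiguation_pages(response):
    """Return the titles of pages that carry the disambiguation category."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(
        page["title"] for page in pages.values() if page.get("categories")
    )
```

Pages flagged this way are the ones to review by hand. For large batches the API may split the result across several responses, which this sketch does not handle.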
On Fri, Dec 28, 2012 at 11:19 PM, NetizenApps netizenapps@gmail.com wrote:
So is title the most recommended way to query the article page content?
Depends. It's usually the easiest.
Is there a way to query the 'page move history' for a given time range?
You can query for move log events.[1] Dealing with merges or splits will be harder, as there is no way to query those.
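A minimal Python sketch of such a move-log query, assuming `list=logevents` with `letype=move` and the `lestart`/`leend` time bounds. The helper names and sample data are hypothetical, and the field holding the new title is an assumption: depending on the MediaWiki version it may appear as `params.target_title` or as `move.new_title`, so the parser checks both.

```python
import urllib.parse

API_URL = "https://en.wikipedia.org/w/api.php"


def move_log_url(start, end, title=None, limit=50):
    """URL listing move log events between two ISO 8601 timestamps.

    With ledir=newer, `start` is the older bound and `end` the newer one.
    Passing `title` restricts the log to a single page via letitle.
    """
    params = {
        "action": "query",
        "format": "json",
        "list": "logevents",
        "letype": "move",
        "ledir": "newer",
        "lestart": start,
        "leend": end,
        "lelimit": str(limit),
    }
    if title is not None:
        params["letitle"] = title
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_moves(response):
    """Return (timestamp, old_title, new_title) for each move event."""
    moves = []
    for event in response.get("query", {}).get("logevents", []):
        # Assumption: the move target lives under "params" or "move".
        details = event.get("params") or event.get("move") or {}
        new_title = details.get("target_title") or details.get("new_title")
        moves.append((event.get("timestamp"), event.get("title"), new_title))
    return moves


def moves_affecting(moves, watched_titles):
    """Keep only the moves whose old title is one we are tracking."""
    watched = set(watched_titles)
    return [move for move in moves if move[1] in watched]
```

For a few hundred tracked titles, one `letitle` query per title is probably cheaper than scanning every move on the wiki over a year or two. As the reply notes, merges and splits would not show up in this log.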