Hi PyWikipedians,
Is there a simple way to get a list of pages created on a particular date?
I was hoping to find something in pagegenerators, but maybe not?
Thanks for any pointers! John
As far as I can recall, there is no way in the API to get pages by date of creation, so you can't get the list via PWB,
but you can do it via SQL queries. It's simple: http://www.mediawiki.org/wiki/Database_layout
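For illustration, here is a minimal sketch of what such a query might look like, assuming the standard MediaWiki `page` and `revision` tables (columns `page_id`, `page_namespace`, `page_title`, `rev_page`, `rev_timestamp`) and treating a page's creation time as the timestamp of its earliest revision. It runs against a tiny in-memory SQLite stand-in, not a real database replica; the sample rows and the helper names `make_demo_db` and `pages_created_between` are hypothetical.

```python
import sqlite3

# Creation time of a page = timestamp of its earliest revision.
# MediaWiki stores timestamps as 14-digit strings (YYYYMMDDHHMMSS),
# so plain string comparison orders them chronologically.
QUERY = """
SELECT page_title, MIN(rev_timestamp) AS created
FROM page
JOIN revision ON rev_page = page_id
WHERE page_namespace = 0
GROUP BY page_id, page_title
HAVING MIN(rev_timestamp) >= ? AND MIN(rev_timestamp) < ?
ORDER BY created
"""


def make_demo_db():
    """Build a hypothetical in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE revision (
            rev_id INTEGER PRIMARY KEY,
            rev_page INTEGER,
            rev_timestamp TEXT
        );
    """)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
        (1, 0, "Interesting_Topic1"),
        (2, 0, "Interesting_Topic2"),
        (3, 0, "Older_Topic"),
        (4, 1, "Interesting_Topic1"),  # talk namespace, filtered out
    ])
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?)", [
        (10, 1, "20111002040403"),
        (11, 1, "20120101000000"),
        (12, 2, "20111002040503"),
        (13, 3, "20100501120000"),
        (14, 3, "20111115000000"),  # later edit in range, page created earlier
        (15, 4, "20111002050000"),
    ])
    return conn


def pages_created_between(conn, start, end):
    """Return (title, creation timestamp) for articles created in [start, end)."""
    return conn.execute(QUERY, (start, end)).fetchall()


rows = pages_created_between(make_demo_db(), "20111001000000", "20130401000000")
for title, created in rows:
    print(created, title)
```

On a wiki the size of en.wikipedia.org, a grouped query like this over the whole `revision` table could be slow, so whoever runs it may prefer to restrict the range or work in batches.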
On 3/21/13, John R. Frank jrf@mit.edu wrote:
Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
but you can do it via SQL queries. It's simple: http://www.mediawiki.org/wiki/Database_layout
That's very cool. I don't have toolserver access, so I cannot run it directly on en.wikipedia.org's SQL database. I created this ticket:
https://jira.toolserver.org/browse/DBQ-204
Are any of you toolserver power users who could run this for us?
jrf
Text of ticket:
Would it be possible for someone to run a SQL query for pages created on dates between October 2011 and March 2013 in the English Wikipedia? This would be very helpful to the TREC KBA research effort at NIST: http://trec.nist.gov/ http://trec-kba.org/
Any text format output is fine. Just need the page creation date and the URL. For example:
2011-10-02T04:04:03Z http://en.wikipedia.org/wiki/Interesting_Topic1
2011-10-02T04:05:03Z http://en.wikipedia.org/wiki/Interesting_Topic2
If the date range is challenging, then limiting to pages created in 2012, or even just the first half of 2012, would still be very helpful.
Thank you for your help! Regards, John
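A small sketch of how the requested output lines could be produced, assuming creation timestamps arrive in MediaWiki's 14-digit form (YYYYMMDDHHMMSS, UTC) and titles use underscores as in page URLs. The function name `format_line` and the `BASE_URL` constant are hypothetical, chosen only for this example.

```python
from datetime import datetime
from urllib.parse import quote

BASE_URL = "http://en.wikipedia.org/wiki/"  # assumed URL prefix for page links


def format_line(timestamp, title):
    """Turn a 14-digit timestamp and a page title into one output line."""
    created = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    iso = created.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Spaces become underscores; other unsafe characters are percent-encoded.
    url = BASE_URL + quote(title.replace(" ", "_"), safe="/:")
    return iso + " " + url


line = format_line("20111002040403", "Interesting_Topic1")
print(line)
```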
I can run it. I'll email you the result.
On Thu, Mar 21, 2013 at 9:38 PM, John R. Frank jrf@mit.edu wrote:
On 03/21/2013 05:13 PM, John R. Frank wrote:
No simple way, but you can investigate these methods of the Page class:
getVersionHistory: Load the version history information from the wiki
getVersionHistoryTable: Create a wiki table from the history data
fullVersionHistory: Return all past versions, including wikitext
masti
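A rough sketch of how a version history could be used to check a single page's creation date. It does not call PyWikipedia itself: it assumes the history is available as a list of tuples whose second element is an ISO 8601 UTC timestamp string, which is an assumption about the shape of the data and not a documented guarantee. The names `creation_timestamp`, `created_on`, and the sample rows are hypothetical.

```python
def creation_timestamp(history):
    """Return the earliest timestamp in a version history, or None if empty.

    Assumes each entry looks like (revision_id, timestamp, user, comment)
    and that timestamps are ISO 8601 UTC strings, which sort
    chronologically when compared as plain strings.
    """
    if not history:
        return None
    return min(entry[1] for entry in history)


def created_on(history, date_prefix):
    """True if the page's first revision falls on the given YYYY-MM-DD date."""
    first = creation_timestamp(history)
    return first is not None and first.startswith(date_prefix)


# Hypothetical sample data standing in for a real page history.
sample_history = [
    (502, "2012-01-01T00:00:00Z", "ExampleUser2", "copyedit"),
    (501, "2011-10-02T04:04:03Z", "ExampleUser1", "created page"),
]

print(creation_timestamp(sample_history))
print(created_on(sample_history, "2011-10-02"))
```

This checks one page at a time, so it only helps once you already have a set of candidate pages. It does not by itself list every page created on a date.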