On Tue, Oct 25, 2011 at 7:11 AM, Rami Al-Rfou' <rmyeid(a)gmail.com> wrote:
Hi All,
So with more investigation I discovered that I can get a list of the users
depending on their skill at a specific language. For example:
http://en.wikipedia.org/w/index.php?title=Category:User_zh-N
It seems that such a list is populated from a database. Does anyone know
where I can find that database?
My other questions concern the partial dumps of Wikipedia. Are the dumps
sorted by any field? How can I get all the user pages? Are they stored in a
specific dump? Or are the dumps organized by page title or category?
http://csv.ozziesport.com/October%209%20-%20Wikipedia%20English%20Data.csv is
a file I have related to that. It is about a year old and a result of
manual data mining, where I looked for user boxes and which users had
transcluded them onto their user space. My file only covers English
Wikipedia and doesn't include every user box around. It might be a good
place to start. I don't think that userbox information is stored in a
separate user table, so I doubt that you would be able to get access to it
through that route. :/
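
If the category listing itself is enough, one option is to ask the MediaWiki
API for the members of a category such as Category:User_zh-N instead of
scraping the page. The following is only a rough sketch under a few
assumptions: it uses the standard api.php endpoint with
list=categorymembers, restricts results to the User namespace (2), and treats
subpages like User:Name/Userboxes as belonging to Name. The function names
and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"  # standard MediaWiki API endpoint


def build_url(category, cmcontinue=None, limit=500):
    """Build a list=categorymembers query URL for one category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": "2",  # namespace 2 = User pages
        "cmlimit": str(limit),
        "format": "json",
    }
    if cmcontinue:
        params["cmcontinue"] = cmcontinue  # resume a paged result
    return API + "?" + urllib.parse.urlencode(params)


def user_names(response):
    """Extract distinct user names from one decoded API response."""
    names = set()
    for member in response.get("query", {}).get("categorymembers", []):
        title = member["title"]
        if title.startswith("User:"):
            # Assumption: a subpage counts as its owner's page.
            names.add(title[len("User:"):].split("/", 1)[0])
    return sorted(names)


def fetch_all(category):
    """Follow continuation tokens until the whole category is read."""
    names, cont = set(), None
    while True:
        req = urllib.request.Request(
            build_url(category, cont),
            headers={"User-Agent": "userbox-sketch/0.1 (example)"},  # hypothetical
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        names.update(user_names(data))
        cont = data.get("continue", {}).get("cmcontinue")
        if not cont:
            return sorted(names)
```

Calling fetch_all("Category:User_zh-N") would then return the user names in
that category. This only finds users whose pages are actually categorized, so
userboxes that do not add a category would still need the manual approach.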
--
twitter: purplepopple
blog: ozziesport.com