Heya folks :)
Is anyone interested in getting us some stats for the deployment on the Hungarian Wikipedia? There is a database dump at http://dumps.wikimedia.org/backup-index.html from the 22nd of January that could be used. I'm interested in the effect Wikidata had so far on this one Wikipedia.
Cheers Lydia
-- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata
Wikimedia Deutschland e.V. Obentrautstr. 72 10963 Berlin www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
You won't be able to see this from database dumps, but I am interested in how much raw time Wikidata will save volunteers by eliminating the need to run bots and to add interlanguage links manually.
--guerillero
On 28/01/13 15:39, Lydia Pintscher wrote:
Is anyone interested in getting us some stats for the deployment on the Hungarian Wikipedia? There is a database dump at http://dumps.wikimedia.org/backup-index.html from the 22nd of January that could be used. I'm interested in the effect Wikidata had so far on this one Wikipedia.
Not from dumps, but from Toolserver: I don't see any reduction in bot activity. Number of bot edits per month since January 2012:
201201  61527
201202  48472
201203  55553
201204  60875
201205  56364
201206  56483
201207  49836
201208  50862
201209  39235
201210  44943
201211  37492
201212  52815
201301  40258
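For anyone who wants to put a number on the figures above, here is a minimal sketch that compares the 2012 monthly average with the January 2013 count. The counts are copied from the message; the function name `yearly_mean` is made up for illustration. It assumes the January 2013 figure covers only part of the month, since the message was sent on 28 January, so the two numbers are not directly comparable.

```python
from statistics import mean

# Monthly bot edit counts copied from the message above (YYYYMM -> edits).
BOT_EDITS = {
    "201201": 61527, "201202": 48472, "201203": 55553, "201204": 60875,
    "201205": 56364, "201206": 56483, "201207": 49836, "201208": 50862,
    "201209": 39235, "201210": 44943, "201211": 37492, "201212": 52815,
    "201301": 40258,
}


def yearly_mean(counts, year):
    """Average the monthly counts whose YYYYMM key starts with `year`."""
    values = [n for month, n in counts.items() if month.startswith(year)]
    return mean(values)


if __name__ == "__main__":
    baseline = yearly_mean(BOT_EDITS, "2012")
    latest = BOT_EDITS["201301"]
    # Assumption: 201301 is a partial month (the count was taken around
    # 28 January), so this ratio understates the full-month activity.
    print(f"2012 monthly mean: {baseline:.2f}")
    print(f"201301 so far:     {latest} ({latest / baseline:.0%} of the mean)")
```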
On the subject of stats are there any plans to add wikidata access stats to:
http://dumps.wikimedia.org/other/pagecounts-raw/
Or are they available elsewhere?
//Ed
On Mon, Jan 28, 2013 at 4:39 PM, Ed Summers ehs@pobox.com wrote:
On the subject of stats are there any plans to add wikidata access stats to:
http://dumps.wikimedia.org/other/pagecounts-raw/
Or are they available elsewhere?
FYI: Denny asked here: http://lists.wikimedia.org/pipermail/wikitech-l/2013-January/066028.html
Cheers Lydia
Ahh, I see that there was no response to Denny's question about wikidata stats?
I took a look in one of the hourly stats files with this:
curl http://dumps.wikimedia.org/other/pagecounts-raw/2013/2013-01/pagecounts-2013... | zcat - | grep wikidata
It does appear that wikidata is showing up in there, but it's just one line:
undefined//www.wikidata.org/w/api.php 8 50103
It would be nice to correct the 'undefined' so that it was something like 'wd'. Also, it's too bad that we don't actually get to see in the logs what article is being looked up via the API, I guess because these requests were POSTs instead of GETs.
I apologize if this has come up before, but is Hungarian Wikipedia using the Wikidata API for integration? Or is it talking directly to the Wikidata database?
//Ed
On 29.01.2013 11:35, Ed Summers wrote:
It does appear that wikidata is showing up in there, but it's just one line:
undefined//www.wikidata.org/w/api.php 8 50103
It would be nice to correct the 'undefined' so that it was something like 'wd'. Also, it's too bad that we don't actually get to see in the logs what article is being looked up via the API, I guess because these requests were POSTs instead of GETs.
But that's only for editing. Viewing should show up in the same way it shows for other wikis.
I apologize if this has come up before, but is Hungarian Wikipedia using the Wikidata API for integration? Or is it talking directly to the Wikidata database?
It's talking directly to the database.
-- daniel
On Tue, Jan 29, 2013 at 6:02 AM, Daniel Kinzler daniel.kinzler@wikimedia.de wrote:
But that's only for editing. Viewing should show up in the same way it shows for other wikis.
Thanks. Maybe I'm not looking correctly (or enough) but I don't see any wikidata pages being accessed, for example:
2013-01/pagecounts-20130129-110000.gz | zcat - | egrep ' Q\d+ '
de Q10 3 17060
de Q7 1 20607
en Q1 4 26849
en Q10 2 16419
en Q100 2 15580
en Q106 1 8122
en Q17 1 9697
en Q2 9 45346
en Q22 1 7835
en Q3 1 8520
en Q35 1 377
en Q374 3 21466
en Q4 9 57882
en Q400 1 34656
en Q6700 1 29519
en Q711 1 7148
en Q8 2 78309
en Q9 1 274
en Q9450 1 29412
en Q96 1 11959
es Q2 1 29036
es Q4 1 9167
fr Q1 1 8243
fr Q400 6 145497
hu Q10 3 210830
it Q4 2 14962
ja Q10 24 583983
ko Q10 1 13121
ko Q3 1 7785
nl Q7 1 365
ru Q4000 1 401
zh Q1 1 10929
zh Q10 3 41551
I suppose it's possible that nobody accessed wikidata.org during that hour. I will check again in an hour, after I have accessed some pages :-)
It's talking directly to the database.
Ok. From looking very quickly at pollForChanges I guess the polling doesn't use the API either? Does that mean that users of Wikidata who want to keep up to date with changes need to be hosted in the Wikimedia datacenter and granted read access to the Wikidata database?
//Ed
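The hourly pagecounts lines quoted above are space-separated: a project code, a page title, a request count and a byte total. Rows such as `en Q10` therefore appear to be Wikipedia pages that happen to be titled Q10, not Wikidata items, so filtering on the title alone can mislead. Below is a minimal sketch of a parser that also copes with the three-field `undefined//www.wikidata.org/w/api.php` line seen earlier in the thread. The names `PageCount`, `parse_line` and `is_wikidata` are hypothetical helpers, not part of any Wikimedia tool.

```python
from typing import NamedTuple, Optional


class PageCount(NamedTuple):
    project: Optional[str]  # e.g. "en"; None when the field is missing
    title: str
    requests: int
    size_bytes: int


def parse_line(line: str) -> PageCount:
    """Parse one space-separated pagecounts line."""
    # Take the two counters from the right so a malformed head still parses.
    head, requests, size_bytes = line.strip().rsplit(" ", 2)
    project, sep, title = head.partition(" ")
    if not sep:
        # Only one token precedes the counters, as in the
        # "undefined//www.wikidata.org/w/api.php" line quoted in the thread.
        project, title = None, head
    return PageCount(project, title, int(requests), int(size_bytes))


def is_wikidata(record: PageCount) -> bool:
    """Hypothetical check: does either field mention wikidata?"""
    return "wikidata" in (record.project or "") or "wikidata" in record.title


# Sample lines copied from the thread.
SAMPLE = [
    "en Q10 2 16419",
    "hu Q10 3 210830",
    "undefined//www.wikidata.org/w/api.php 8 50103",
]

if __name__ == "__main__":
    for raw in SAMPLE:
        rec = parse_line(raw)
        print(rec, is_wikidata(rec))
```

Reading a downloaded hourly file could then be a matter of opening it with `gzip.open(path, "rt")` and passing each line to `parse_line`.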
On 29.01.2013 13:09, Ed Summers wrote:
Ok. From looking very quickly at pollForChanges I guess the polling doesn't use the API either? Does that mean that users of Wikidata who want to keep up to date with changes need to be hosted in the Wikimedia datacenter and granted read access to the Wikidata database?
Both the now-deprecated pollForChanges and the new dispatchChanges poll the repo's database directly and push directly to the client's database.
3rd party clients which want to embed data from Wikidata, but cannot access the database directly, are not yet supported. We have designed the architecture in a way that should allow us to support them easily enough, but the necessary mechanisms are not yet implemented.
The plan is to eventually implement "remote" clients that fetch data via the API and get notifications pushed to them, probably via PubSubHubbub. I would very much like to see this, but our priority is to get Wikimedia sites feature complete first.
-- daniel
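To make the poll-and-push idea described above concrete, here is a toy sketch using two in-memory SQLite databases. It is not the Wikibase implementation: the table names (`changes`, `received`), their columns and the `dispatch` function are invented for illustration, and the only thing taken from the thread is the general shape (read new changes from the repo's database, write them into the client's database).

```python
import sqlite3


def make_repo():
    """Hypothetical repo database with a log of changes."""
    repo = sqlite3.connect(":memory:")
    repo.execute(
        "CREATE TABLE changes (id INTEGER PRIMARY KEY, item TEXT, payload TEXT)"
    )
    return repo


def make_client():
    """Hypothetical client database that receives pushed changes."""
    client = sqlite3.connect(":memory:")
    client.execute(
        "CREATE TABLE received "
        "(change_id INTEGER PRIMARY KEY, item TEXT, payload TEXT)"
    )
    return client


def dispatch(repo, client, last_seen):
    """Poll the repo for changes newer than last_seen and push them on.

    Returns the id of the newest change that was pushed, or last_seen
    unchanged when there was nothing new.
    """
    rows = repo.execute(
        "SELECT id, item, payload FROM changes WHERE id > ? ORDER BY id",
        (last_seen,),
    ).fetchall()
    client.executemany(
        "INSERT INTO received (change_id, item, payload) VALUES (?, ?, ?)",
        rows,
    )
    client.commit()
    return rows[-1][0] if rows else last_seen


if __name__ == "__main__":
    repo, client = make_repo(), make_client()
    repo.execute(
        "INSERT INTO changes (item, payload) VALUES ('Q10', 'sitelink added')"
    )
    repo.commit()
    position = dispatch(repo, client, 0)
    print(position, client.execute("SELECT * FROM received").fetchall())
```

A "remote" client of the kind mentioned in the thread would presumably replace the direct `SELECT` with an API call, which is why direct database access makes the current setup simpler.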
On Tue, Jan 29, 2013 at 7:14 AM, Daniel Kinzler daniel.kinzler@wikimedia.de wrote:
3rd party clients which want to embed data from Wikidata, but cannot access the database directly, are not yet supported. We have designed the architecture in a way that should allow us to support them easily enough, but the necessary mechanisms are not yet implemented.
The plan is to eventually implement "remote" clients that fetch data via the API and get notifications pushed to them, probably via PubSubHubbub. I would very much like to see this, but our priority is to get Wikimedia sites feature complete first.
Being able to talk to the database directly does simplify things greatly I imagine, and I can completely understand wanting to focus on Wikimedia sites first. Thanks for the details.
//Ed
I just checked with a new stats file, and the wikidata page I accessed during the hour was not recorded in the file. So my suspicion is the page views are not getting recorded. I could double check with the analytics folks to see what the best course of action is if that is helpful.
//Ed
Yes, that would be very helpful. I am not sure who the right person to poke is.
Are the numbers absolute, or a sample of one out of a thousand?
On Tue, Jan 29, 2013 at 7:29 AM, Denny Vrandečić denny.vrandecic@wikimedia.de wrote:
Are the numbers absolute, or a sample of one out of a thousand?
I believe they are absolute. I'll see if I can figure out what's going on by asking over on the analytics list.
//Ed
On 28.01.2013 16:31, Nikola Smolenski wrote:
On 28/01/13 15:39, Lydia Pintscher wrote:
Is anyone interested in getting us some stats for the deployment on the Hungarian Wikipedia? There is a database dump at http://dumps.wikimedia.org/backup-index.html from the 22nd of January that could be used. I'm interested in the effect Wikidata had so far on this one Wikipedia.
Not from dumps, but from Toolserver: I don't see any reduction in bot activity. Number of bot edits per month since January 2012:
Yea, well.. what's the status of communication with the bot owners? I was hoping that a process for keeping the bots from adding interlanguage links would evolve. Can't huwiki simply opt out of the interwiki bot stuff?
-- daniel