Regarding the files at http://dammit.lt/wikistats/ :
What are "en.b", "en.d", "en2", etc?
Are edits included, or only views?
Are the hit counts actual, or 1/10th sampled, or something else?
pagecounts-20090501-200000.gz <http://dammit.lt/wikistats/pagecounts-20090501-200000.gz>: is the hour *beginning* 20:00:00?
Hello Anthony,
I'm back at my lair (phew, finally ;-)
Regarding the files at http://dammit.lt/wikistats/ : What are "en.b", "en.d", "en2", etc?
suffixes indicate projects - from http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/filter.c?r... :
projects[] = {
    {"wikipedia","",NULL},
    {"wiktionary",".d",NULL},
    {"wikinews",".n",NULL},
    {"wikimedia",".m",check_wikimedia},
    {"wikibooks",".b",NULL},
    {"wikisource",".s",NULL},
    {"mediawiki",".w",NULL},
    {"wikiversity",".v",NULL},
    {"wikiquote",".q",NULL},
    NULL
},
en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a time, and apparently there're some referrals.
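For anyone scripting against these files, the suffix table above can be turned into a small lookup. This is only a sketch in Python and not part of webstatscollector: the names `SUFFIX_TO_PROJECT` and `split_project_code` are made up for illustration, and it assumes a code is a subdomain followed by an optional suffix from the table.

```python
# Suffix -> project, copied from the projects[] table quoted above.
# The empty suffix means Wikipedia.
SUFFIX_TO_PROJECT = {
    "": "wikipedia",
    ".d": "wiktionary",
    ".n": "wikinews",
    ".m": "wikimedia",
    ".b": "wikibooks",
    ".s": "wikisource",
    ".w": "mediawiki",
    ".v": "wikiversity",
    ".q": "wikiquote",
}


def split_project_code(code):
    """Split a code such as "en.b" into (subdomain, project).

    Hypothetical helper: project is None when the suffix is not in the table.
    """
    subdomain, dot, rest = code.partition(".")
    suffix = dot + rest  # "" when the code contains no dot
    return subdomain, SUFFIX_TO_PROJECT.get(suffix)


print(split_project_code("en.b"))  # ('en', 'wikibooks')
print(split_project_code("en.d"))  # ('en', 'wiktionary')
print(split_project_code("en2"))   # ('en2', 'wikipedia')
```

Returning None for an unknown suffix, rather than raising, keeps a scan over a whole file going when it meets an entry the table does not cover.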
Are edits included, or only views?
That is views only - though you can find actual logic in above file, it is mostly this pattern:
which is what we have for special pages and views.
Are the hit counts actual, or 1/10th sampled, or something else?
They are actual, with duplicates removed (that is, we don't count in cache-to-cache traffic, only end-user-to-cache).
pagecounts-20090501-200000.gz <http://dammit.lt/wikistats/pagecounts-20090501-200000.gz>: is the hour *beginning* 20:00:00?
ending, I think. let me check, yes, end time. logic is in produceDump() at http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/collector.... :)
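Since the timestamp in the file name is the end time, the interval a file covers can be worked out roughly as follows. A Python sketch only: `covered_interval` is a made-up name, and it assumes each file covers exactly one hour and that names follow the pagecounts-YYYYMMDD-HHMMSS.gz pattern seen above. The time zone is not stated in this thread, so the values are left naive.

```python
import re
from datetime import datetime, timedelta

# Matches the date and time parts of names like pagecounts-20090501-200000.gz
FILENAME_RE = re.compile(r"pagecounts-(\d{8})-(\d{6})\.gz$")


def covered_interval(filename):
    """Return (start, end) for a pagecounts file name or URL.

    Hypothetical helper. Assumption: one file per hour, stamped with
    the END of the hour it covers.
    """
    match = FILENAME_RE.search(filename)
    if match is None:
        raise ValueError("unexpected file name: %r" % filename)
    end = datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    start = end - timedelta(hours=1)
    return start, end


start, end = covered_interval("pagecounts-20090501-200000.gz")
print(start, "to", end)  # 2009-05-01 19:00:00 to 2009-05-01 20:00:00
```

So the file asked about would hold the views from 19:00:00 up to 20:00:00, under the one-hour assumption.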
I think I may end up documenting this somewhat more, but I need to do some promised and long overdue development on this project.
Domas
Domas Mituzas wrote:
Hello Anthony,
I'm back at my lair (phew, finally ;-)
Regarding the files at http://dammit.lt/wikistats/ : What are "en.b", "en.d", "en2", etc?
suffixes indicate projects - from http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/filter.c?r... :
projects[] = {
    {"wikipedia","",NULL},
    {"wiktionary",".d",NULL},
    {"wikinews",".n",NULL},
    {"wikimedia",".m",check_wikimedia},
    {"wikibooks",".b",NULL},
    {"wikisource",".s",NULL},
    {"mediawiki",".w",NULL},
    {"wikiversity",".v",NULL},
    {"wikiquote",".q",NULL},
    NULL
},
en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a time, and apparently there're some referrals.
Are edits included, or only views?
That is views only - though you can find actual logic in above file, it is mostly this pattern:
which is what we have for special pages and views.
Are the hit counts actual, or 1/10th sampled, or something else?
They are actual, with duplicates removed (that is, we don't count in cache-to-cache traffic, only end-user-to-cache).
pagecounts-20090501-200000.gz <http://dammit.lt/wikistats/pagecounts-20090501-200000.gz>: is the hour *beginning* 20:00:00?
ending, I think. let me check, yes, end time. logic is in produceDump() at http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/collector.... :)
I think I may end up documenting this somewhat more, but I need to do some promised and long overdue development on this project.
If no one minds, I think I will copy this email to the toolserver wiki :)
Domas Mituzas wrote:
Are edits included, or only views?
That is views only - though you can find actual logic in above file, it is mostly this pattern:
which is what we have for special pages and views.
However, note that after saving an edit, the editor will be sent to a view.
Domas Mituzas wrote:
Hi,
However, note that after saving an edit, the editor will be sent to a view.
yes, you're absolutely right, but no differentiation is done on that. technically, you're not editing, you're viewing :)
Domas
I know, but it's worth remembering for people who might want to do some kind of edit differentiation.
On Mon, Aug 31, 2009 at 11:03 AM, Domas Mituzas <midom.lists@gmail.com> wrote:
en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a time, and apparently there're some referrals.
Wikimedia news, October 2003: -- A portion of traffic to "www.wikipedia.org" will be diverted to "en2.wikipedia.org", while most of it will go to "en.wikipedia.org", where all logins will be directed. Until the server configuration is more stable and transparent load-sharing is set up, this should help share some of the traffic without burdening the other wikis too greatly. --
I think the reason that en got the lion's share is that en2 was on one machine with the other languages whereas en was on a machine on its own. At that time apparently en: still had significantly more traffic than all other languages taken together.
On 8/31/09 7:51 AM, Andre Engels wrote:
On Mon, Aug 31, 2009 at 11:03 AM, Domas Mituzas <midom.lists@gmail.com> wrote:
en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a time, and apparently there're some referrals.
Wikimedia news, October 2003:
A portion of traffic to "www.wikipedia.org" will be diverted to "en2.wikipedia.org", while most of it will go to "en.wikipedia.org", where all logins will be directed. Until the server configuration is more stable and transparent load-sharing is set up, this should help share some of the traffic without burdening the other wikis too greatly. --
I think the reason that en got the lion's share is that en2 was on one machine with the other languages whereas en was on a machine on its own. At that time apparently en: still had significantly more traffic than all other languages taken together.
Ah, the good old days! Sure glad we figured out Squid soon after that... ;)
-- brion
wikitech-l@lists.wikimedia.org