On Tue, Mar 30, 2010 at 00:08, River Tarnell <river.tarnell(a)wikimedia.de> wrote:
Lars Aronsson:
Are page view statistics (as in stats.grok.se) being imported to the
toolserver and available in a database for quick reference?
A user has made this available in raw form at /mnt/user-store/stats. We
are currently working on making this more official; that'll probably be
announced in a month or two.
It's not available in the database yet, but that's something we're
looking at doing. If anyone else has a particular reason to need this
data, it would help if they could describe it, so we can decide how to
format the data, and how detailed it needs to be.
I use it for Wikitrends[1], which I'm in the process of moving to
Toolserver. I like the current format, but the data is not perfect. It
contains duplicates (probably pages that redirect) and encoding
problems. Here are three examples that count as different pages in the
dumps but point to the same Wikipedia page.
Same word but ISO-8859-1 vs. UTF-8:
F%F6rst%E4rkare
F%C3%B6rst%C3%A4rkare
URL encoded vs. not URL encoded:
Adam_%26_Eva
Adam_&_Eva
Typical redirect:
Adolf_Hitler
Adolf_hitler
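
To illustrate the first two cases, here is a minimal sketch in Python of
how the titles could be normalized before counting. It is only an
assumption about one possible cleanup, not something the dumps or the
Toolserver currently do. The names normalize_title and merge_counts are
hypothetical, and the view counts in the usage note are made up. The
sketch percent-decodes the title, tries UTF-8 first and falls back to
ISO-8859-1, then upper-cases the first letter.

```python
from collections import Counter
from urllib.parse import unquote_to_bytes


def normalize_title(raw: str) -> str:
    """Return one canonical form for a page title taken from the raw stats.

    Hypothetical helper: percent-decode to bytes, decode as UTF-8 and fall
    back to ISO-8859-1 when the bytes are not valid UTF-8.
    """
    data = unquote_to_bytes(raw)
    try:
        title = data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is ISO-8859-1.
        title = data.decode("iso-8859-1")
    title = title.replace(" ", "_")
    # Only the first letter is upper-cased; the rest stays case-sensitive.
    return title[:1].upper() + title[1:]


def merge_counts(rows):
    """Sum view counts for titles that normalize to the same string."""
    totals = Counter()
    for raw_title, views in rows:
        totals[normalize_title(raw_title)] += views
    return totals
```

For example, merge_counts([("F%F6rst%E4rkare", 3), ("F%C3%B6rst%C3%A4rkare", 4)])
puts both rows under the single key "Förstärkare". The third case is
different: Adolf_hitler and Adolf_Hitler stay separate here, because a
redirect cannot be detected from the title alone and would need a lookup
against the wiki's redirect data.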
[1] http://users.student.lth.se/dt05jg2/wikitrends/en/24h.html
- river.