[Toolserver-l] Page view stats

Johan G johang at toolserver.org
Tue Mar 30 21:13:06 UTC 2010


On Tue, Mar 30, 2010 at 00:08, River Tarnell <river.tarnell at wikimedia.de> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Lars Aronsson:
>> Are page view statistics (as in stats.grok.se) being imported to the
>> toolserver and available in a database for quick reference?
>
> A user has made this available in raw form at /mnt/user-store/stats.  We
> are currently working on making this more official; that'll probably be
> announced in a month or two.
>
> It's not available in the database yet, but that's something we're
> looking at doing.  If anyone else has a particular reason to need this
> data, it would help if they could describe it, so we can decide how to
> format the data, and how detailed it needs to be.

I use it for Wikitrends[1], which I'm in the process of moving to the
Toolserver. I like the current format, but the data is not perfect: it
contains duplicates (probably pages that redirect) and encoding
problems. Here are three examples that count as different pages in the
dumps but point to the same Wikipedia page.

Same word but ISO-8859-1 vs. UTF-8:

F%F6rst%E4rkare
F%C3%B6rst%C3%A4rkare

URL encoded vs. not URL encoded:

Adam_%26_Eva
Adam_&_Eva

Typical redirect:

Adolf_Hitler
Adolf_hitler
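
A minimal sketch of how the first two kinds of duplicates could be
folded together before counting. It assumes that a title which is not
valid UTF-8 after percent-decoding is ISO-8859-1, and that the wiki
capitalizes the first letter of titles (true for most Wikipedias, but
an assumption here). The names normalize_title and merge_counts are
made up for illustration, and the (title, count) input shape is a
guess, not the actual dump format.

```python
from collections import Counter
from urllib.parse import unquote_to_bytes


def normalize_title(raw: str) -> str:
    """Collapse encoding variants of a page title into one key."""
    # Undo %XX escapes but keep raw bytes, so the charset can be guessed.
    data = unquote_to_bytes(raw)
    try:
        title = data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is ISO-8859-1.
        title = data.decode("iso-8859-1")
    title = title.replace(" ", "_")
    # Assumption: the wiki treats the first letter as case-insensitive.
    return title[:1].upper() + title[1:]


def merge_counts(rows):
    """Sum view counts over (title, count) pairs after normalization."""
    totals = Counter()
    for title, count in rows:
        totals[normalize_title(title)] += count
    return totals
```

This does not help with the third case: Adolf_hitler and Adolf_Hitler
differ after the first letter, so merging them would need the wiki's
redirect table rather than string normalization.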

[1] http://users.student.lth.se/dt05jg2/wikitrends/en/24h.html

>
>        - river.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (HP-UX)
>
> iEYEARECAAYFAkuxJMwACgkQIXd7fCuc5vLAbgCfdfmu8xoa78lT3CJCXyt6pF3q
> 1UgAn2Uzvy5pbJn/oJWSjmolgEDL0NwN
> =vyoR
> -----END PGP SIGNATURE-----


