Many people are probably aware by now that I've started a test of the Google Analytics
page counter on en.wikibooks. I hear that people are running a similar kind of test on
en.wikinews. Currently, these programs are opt-in: only registered users are using these
scripts, and doing so involves manually adding them to their personal monobook files. The
information received so far has been fantastic: counts of page hits, click patterns,
information about entry points that we can use to improve the welcome for new visitors,
etc. However, this test has also raised a few concerns. Some of them I would like to
address myself; on others I would like to get input from the foundation.
1) First and foremost is the issue of privacy. The information that Google Analytics
collects is a step above what is typically available to regular users, but not quite as
detailed as CheckUser (CU) data. Some information, such as the geographical area and ISP
of a user, is aggregated, but it is not attached in any way to a user's screenname. That
is, without a priori knowledge about the user, it is impossible to attach a particular
username to a particular ISP, geographical location, or any other piece of collected data.
I am currently inspecting the Google Analytics code looking for a way to suppress the
collection of ISP or geographical information, but haven't found one yet.
1a) Ancillary to the idea of privacy is the issue that the analytics code should probably
remain opt-in. Many users are conscious of privacy and security issues, and they shouldn't
be forced to choose between participating in a tracking program and not visiting wikibooks
at all. I've proposed a solution in which unregistered users would be tracked by default
(testing wgUserName == null), but registered users would need to opt in explicitly. After
all, I feel that information about our readers is far more important than the same
information about our editors.
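
As a rough sketch of the check described in 1a (not the script currently in use), the
decision could look something like the following. The function name shouldTrack and the
optedIn flag are hypothetical; only the wgUserName == null test comes from the proposal
above.

```javascript
// Hypothetical helper: decides whether the analytics script should be loaded.
// userName mirrors MediaWiki's wgUserName, which is null for logged-out visitors.
// optedIn is an assumed per-user flag (for example, set in a personal monobook file).
function shouldTrack(userName, optedIn) {
  if (userName === null || userName === undefined) {
    // Unregistered reader: tracked by default under this proposal.
    return true;
  }
  // Registered user: tracked only after an explicit opt-in.
  return optedIn === true;
}

// Assumed usage in a site script (analyticsOptIn is a made-up variable name):
//   if (shouldTrack(wgUserName, window.analyticsOptIn)) { /* load the tracker */ }
```

Keeping the decision in one small function would make it easy to change the policy later
without touching the tracking code itself.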
1b) Another related idea is that individual books could be tracked for readership
patterns, while the rest of the wikibooks project would remain script-free.
Notification templates could be used to indicate which books the scripts are active on. A
book could be tracked for a month or so at a time. We could track a handful of books at
once, and then change which books we track on a regular basis.
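
To illustrate the per-book idea in 1b, here is a minimal sketch that assumes a book's
pages are the book's root page plus its subpages ("Book/Chapter"). The book names, the
trackedBooks list, and the isTrackedPage function are all made up for the example; the
page name is assumed to arrive in the underscore form that wgPageName uses.

```javascript
// Hypothetical list of books being tracked this month.
var trackedBooks = ["Example Book A", "Example Book B"];

// Returns true if pageName is the root page or a subpage of a tracked book.
function isTrackedPage(pageName, books) {
  // Page names are assumed to use underscores in place of spaces.
  var title = pageName.replace(/_/g, " ");
  for (var i = 0; i < books.length; i++) {
    var book = books[i];
    // Match the book's root page or anything under "Book/".
    if (title === book || title.indexOf(book + "/") === 0) {
      return true;
    }
  }
  return false;
}
```

Rotating the tracked books would then only mean editing the list, and a title that merely
begins with the same words as a tracked book would not match by accident.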
2) Second is the issue of server load. Running the script currently involves one
additional JavaScript page access per user. However, the JavaScript files can be cached.
The script runs client-side in JavaScript and interacts with the Google Analytics website,
but does not transact with the WMF servers. I believe that server load for us should be
minimal (but I want confirmation about this from the techs).
3) Log files are only available by default to the Google account holder (myself) and other
people whom I specifically add to the profile. If we keep the access list
very restrictive, we don't need to worry about sensitive data becoming public.
However, we do run the risk of giving users with access "power", which is a
common fear. If we were to set up accounts on behalf of the project or the WMF (as opposed
to personal accounts), we could negate this issue entirely.
I'm looking for as much input on this issue as I can get. I'm not planning to make
any changes to any JavaScript for the foreseeable future, until the concerns are ironed out.
--Andrew Whitworth