Hi,
recently the report of the KnowPrivacy [1] study - a research project
by the School of Information from University of California in Berkeley
- hit the German media [2].
It came to the conclusion that "All of the top 50 websites contained
at least one web bug at some point in a one month time period." [3]
which includes wikipedia.org.
This is very troubling and irritating for some of our (German) users
who are very sensitive to data privacy topics. So I got in
contact with Brian W. Carver (University of California), who connected me
to David Cancel, the maintainer of Ghostery, which was used to
identify the web bugs. David wrote me today:
> The following web bug trackers were reported to us, on the following subdomains:
> Google Analytics - vls.wikipedia.org
> Doubleclick - hu.wikipedia.org
> Both were seen in yesterday's data so they're recent. We don't receive any page level information so that's as much detail as we have. Hope that helps.
I wasn't able to track down the Doubleclick web bug on the Hungarian
Wikipedia, but a Google Analytics web bug is integrated into every page of
the West Flemish Wikipedia via JavaScript [4].
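As a rough illustration of how a site-wide script such as MediaWiki:Common.js could be checked for tracker references, here is a minimal sketch. The hostname list and the sample snippet are assumptions made up for the example; they are not taken from the actual page content or from Ghostery's data.

```python
import re

# Hostnames are assumptions chosen for illustration; a real audit would use
# a maintained tracker list rather than this two-entry table.
TRACKER_HOSTS = {
    "Google Analytics": "google-analytics.com",
    "Doubleclick": "doubleclick.net",
}

def find_trackers(source: str) -> list[str]:
    """Return the names of trackers whose hostname appears in `source`."""
    found = []
    for name, host in TRACKER_HOSTS.items():
        # Match the hostname case-insensitively anywhere in the script text.
        if re.search(re.escape(host), source, flags=re.IGNORECASE):
            found.append(name)
    return found

# Hypothetical script fragment, not the real content of Common.js.
sample = 'document.write(unescape("%3Cscript src=\'http://www.google-analytics.com/ga.js\'%3E"));'
print(find_trackers(sample))
```

A plain substring scan like this only finds hostnames that appear literally in the script text; it would miss anything loaded indirectly.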
Our privacy policy [5] states "The Wikimedia Foundation may keep raw
logs of such transactions [IP and other technical information], but
these will not be published or used to track legitimate users." and
"As a general principle, the access to, and retention of, personally
identifiable data in all projects should be minimal and should be used
only internally to serve the well-being of the projects."
I think we should stop the current use of Google Analytics ASAP.
Bye, Tim.
--
http://wikimedia.de
[1] http://knowprivacy.org
[2] http://www.heise.de/newsticker/Studie-Google-fuehrend-bei-Web-Bug-Nutzung--…
[3] http://www.knowprivacy.org/report/KnowPrivacy_Final_Report.pdf, p. 4
[4] http://vls.wikipedia.org/wiki/MediaWiki:Common.js
[5] http://wikimediafoundation.org/wiki/Privacy_policy
Virgilio, you simply have not provided or described sufficient evidence to
back up the conclusion that the people who "run" pt.wp have severe
emotional problems. Such accusations serve only to call your own integrity
into question, which I'm sure you wish to avoid.
It should be noted that most disability access laws refer to the right of
access to certain classes of goods and services and employment. Editing
Wikipedia would not seem to fall into any of the typically covered
categories, even were it under the jurisdiction of such laws. While I'm not
an expert on the subject, I'm not aware of any laws that even require access
to the Internet, let alone resources or activities accessed through it. So
the question of law is really separate; if you want to make a case about
access, it needs to be done on other grounds.
In the last discussion it was said by many that the primary role of editors
is the contribution and improvement of free content, and the privilege of
editing access is provided for that purpose. If we can help people with
certain disabilities be productive as editors, we should. If a disabled
editor, as any editor, becomes disruptive and impedes the goal of the
project (and assistance fails to solve the problem) then that person should
be blocked.
My suggestion is that if you have a specific problem you'd like addressed,
bring that specific problem to the front. The way you've written your post,
it seems like you are trying to elicit statements that you can bring back to
pt.wp and use in a dispute - all without telling us what the actual dispute
is. That doesn't really fly here.
Nathan
It might be worth seeing if we can get the EFF to add our privacy policy to
http://www.tosback.org/timeline.php
In order to raise the profile of any changes.
--
geni
Hello all --
There is an ongoing discussion about a possible global biographies of living
people policy at Meta.[1]
Any input / comments / suggestions / etc. would be welcome on the talk page.
Best --
MZMcBride
public(a)mzmcbride.com
[1] http://meta.wikimedia.org/wiki/BLP
This is in reference to:
http://lists.wikimedia.org/pipermail/foundation-l/2009-May/051889.html
I would like to thank Michael Bimmler for
steering me through this mailing list. Michael
always addressed me in a polite, professional,
and non-judgmental manner. It was a pleasure to
correspond with him. We had the kind and level of
interaction I was expecting to find at the
pt:wiki. Thanks also for the sensible comment
made by Phil Nash. Although we might not be in
complete agreement, some good points were raised
and the benefit of experience is of great value.
Twice I asked for Cary Bass' advice about posting
this message, but I'm sorry to say that I never
got an answer. According to Michael, Cary is
Volunteer Coordinator at the Wikimedia
Foundation. I'm sure he had more pressing matters to attend to.
Let me try to organize the discussion by
separating a) a very real general question from
b) my hypothetical example. I believe that the
discussion of real examples will be beneficial to both.
a) A very real and clear statement was made by an
administrator bureaucrat, also a member of the
arbitration committee, which can be found here
(the quotations are in English):
http://pt.wikipedia.org/wiki/Wikipedia:Pedidos_a_administradores/Discuss%C3…
He quoted the Wikimedia:Non discrimination
policy, explaining that that policy did NOT allow
them to treat editors differently, based on their
[...] medical condition. Wikimedia:Code of Conduct Policy was also quoted.
I believe that "medical condition" includes the
whole spectrum of physical and mental illnesses,
but please let me know if my interpretation is not correct.
Phil Nash states that in case a registered user
is not able to communicate effectively, as it has
already happened on en:wiki, they have been
persuaded to be adopted by willing mentors.
I consider that a good example of treating
editors differently based on their medical
condition. This is also similar to the special
treatment given inexperienced users, namely
through the Adopt-a-User program
(http://en.wikipedia.org/wiki/Wikipedia:Adopt-a-User)
that has a parallel in the Portuguese Wikipedia
(please see interlanguage link.)
That procedure also conforms to current non
discriminatory legislation in many countries that
makes it compulsory to provide ramps for
wheelchairs, Braille markings and sound warnings,
and special education for those with all sorts of
illnesses, both physical and mental. That is, a
non discriminatory policy means that you treat
people differently based on their medical
condition. NOT treating editors differently,
based on their medical condition, is considered DISCRIMINATION.
In the Portuguese Wikipedia, as exemplified by
the statement of that administrator bureaucrat,
and member of the arbitration committee, there is
the exact opposite understanding and
interpretation, contrary to what non
discrimination is. So far, nobody else has
contradicted that position which was only
disclosed in response to my questioning.
My point is that this state of affairs in the
Portuguese Wikipedia cannot be tolerated,
condoned and supported by the resources of the
Wikimedia Foundation, generously provided by
volunteers and donors keen on improving the
general knowledge and welfare of humankind and
not the misguidance of a group that actively or
with their silence have taken over the Portuguese
Wikipedia. Swift and drastic measures need to be taken to stop this.
b) My strictly hypothetical case assumed that a
tetraplegic girl had learned how to use a
computer and found out about Wikipedia. After
registering as a user, she did all sorts of
trampling. To my question if there would be any
administrator willing to block her from editing
Wikipedia, three administrators, one of them a
bureaucrat and member of the arbitration committee answered YES:
http://pt.wikipedia.org/wiki/Wikipedia:Pedidos_a_administradores/Discuss%C3…
No dissenting opinion has been published, to this
date, anywhere on the Portuguese Wikipedia. I
have refused to do so for the reasons stated at
the conclusions of both part a) and b).
This is in stark contrast with the assumptions
and procedures advocated by Phil Nash. First he
narrows the case to one in which her physical
disability does not impair her mental faculties,
that she is aware of what she is doing, and
certainly should be after a number of warnings.
There's no problem with this scenario since it is added:
"If it's just a case of being unable to
communicate effectively, we do have users on
en:wiki with similar issues, and have persuaded
them to be adopted by willing mentors". Thus a
procedure is suggested to prevent errors at the
source or have someone at the ready to revert
them, without requesting that the user be blocked.
Admittedly, the corrective actions of such mentor
would also avoid the need for those requests to
be made and to act on them. I find this a viable and correct approach.
I beg to differ with Phil Nash when he states
that "However, the bottom line to me is whether
the harm to the encyclopedia (willed or not)
outweighs the benefit of having that person
editing". It is not difficult to conclude, even
without any figures, that this kind of
benefits-cost analysis would make any action in
favor of the disabled unfeasible, and disability
rights laws impossible to enact. The very nature of
Wikipedia makes it impossible to produce any harm
comparable to the benefit of making editing
available to anyone who's capable of doing it, no
matter at what cost in reverts. There's already
enough vandalism being done by people supposedly
sound of mind and body. It's hard to imagine that
the marginal costs of handling the errors of the
disabled would put the project in jeopardy. There
might even be a way to tap additional resources to cope with such costs.
Such is the current situation of the Portuguese
Wikipedia. I believe that as a consequence of the
self management of the project, it is now being
operated and run on a daily basis by a group of
people with severe mental, emotional, and
behavioral problems, completely out of control
and without any kind of supervision and/or
regulation. This has been corroborated by several
pt-wikipedians. In an attempt to gather a sample
of their statements, a non-exhaustive collection
was made
(http://pt.wikipedia.org/wiki/Usu%C3%A1rio:Vapmachado/Adeus_Wikip%C3%A9dia).
It was voted for deletion
(http://pt.wikipedia.org/wiki/Wikipedia:P%C3%A1ginas_para_eliminar/Usu%C3%A1…)
with arguments from both sides that are outright
embarrassing. Maintaining the page won by four votes.
This voting is just one of many examples of
rampant disrespect for the five pillars,
occurring, unchallenged, on a regular basis on the
Portuguese Wikipedia. Mobbing is practiced matter-of-factly
and promoted openly on discussion
pages. Just for your information, please be aware
that I was already harassed on the Portuguese
Wikipedia
(http://pt.wikipedia.org/wiki/Wikipedia:Esplanada/Arquivo/2009/Maio#Vapmacha…)
for bringing up this subject on "foundation-l." I
was under the threat of banishment
(http://pt.wikipedia.org/wiki/Usu%C3%A1rio_Discuss%C3%A3o:Vapmachado#Aviso_2)
from the pages where this harassment takes place,
by the same administrator bureaucrat and member
of the arbitration committee mentioned in both
parts a) and b). When I questioned the voting for
violating that Wikipedia is free content, I ended
up blocked for six days
(http://pt.wikipedia.org/wiki/Usu%C3%A1rio_Discuss%C3%A3o:Vapmachado#Bloquei…).
I don't think that analysis of much of the goings
on in the pt:wiki by competent professionals
would give it a clean bill of mental health. It's
a crazy world, I know, but the project is of an
encyclopedia, not a crazypedia (forgive my
hyperbole.) "Eppur si muove." Certainly, it does,
but at what cost, it is my turn to ask. Is it
really as impossible to bring a project like this
under control, once it gets spinning on its own
axis, as it is to stop the Earth from moving? Or
are there enough resources to correct the course?
Sincerely,
Virgílio A. P. Machado (Vapmachado)
Prof. Virgilio A. P. Machado vam(a)fct.unl.pt
Engenharia
Industrial
http://web.archive.org/web/20070824105539/www.ipei.pt/GDEI/
DEMI/FCT/UNL Fax: 351-21-294-8546 or 21-294-8531
Universidade de Portugal or 351-21-295-4461
2829-516 Caparica Tel.: 351-21-294-8542 or 21-294-8567
PORTUGAL or 351-21-294-8300 or 21 294-8500
Ext.112-32
96-888-6852
Faculdade de Ciencias e Tecnologia/UNL (FCT/UNL)
(Dr. Machado is Associate Professor of Industrial Engineering at the
School of Sciences and Engineering/UNL of the University of Portugal)
[repost with proper subscribed mail address]
Alex wrote:
> The plain pageview stats are already available.
> Erik Zachte has been doing some work on other stats.
> <http://stats.wikimedia.org/EN/VisitorsSampledLogRequests.htm>
> If I were to compile a wishlist of stats things:
> 1. stats.grok.se data for non-Wikipedia projects
> 2. A better interface for stats.wikimedia.org - There's a lot of data
> there, but it can be hard to find and it's not well publicized. The only
> reason I knew about the link above is because someone pointed it out to
> me once and I bookmarked it.
> 3. Pageview stats at <http://dammit.lt/wikistats/> in files based on
> projects. It would be a lot easier for people at the West Flemish
> Wikipedia to analyze statistics themselves if they didn't have to
> download tons of data they don't need.
Your enhancement requests:
1 IIRC this is already an (albeit undocumented) feature.
One can manually alter the URL to find e.g. Wiktionary stats.
But I forgot precisely how and see nothing on User:Henrik's talk page.
2 Seconded wholeheartedly. In fact I started to reshape the main page (just
eight links) this week :) I just uploaded it a bit earlier than planned:
http://stats.wikimedia.org/
3 That could be a useful extension on the preservation script described
below.
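To make request 3 concrete, here is a minimal sketch of splitting pagecount lines into per-project groups. It assumes each line has four whitespace-separated fields (project code, page title, view count, bytes); that layout and the sample lines are assumptions for illustration, not a description of the preservation script.

```python
from collections import defaultdict

def split_by_project(lines):
    """Group pagecount lines by their project code (the first field)."""
    per_project = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip lines that do not match the assumed layout
        per_project[fields[0]].append(line.rstrip("\n"))
    return dict(per_project)

# Made-up sample lines in the assumed "project title views bytes" layout.
sample = [
    "en Main_Page 1200 50000000\n",
    "vls Voorblad 12 340000\n",
    "hu Budapest 3 20000\n",
    "vls Brugge 4 90000\n",
]
groups = split_by_project(sample)
print(groups["vls"])
```

Each group could then be written to its own file, so that a small project only has to download its own lines.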
--------------------------------
General response
I would say that quite a lot has happened since the beginning of 2008. A recap:
As has already been said, Domas' (and Tim's) work was a major step forward.
http://dammit.lt/wikistats/
Two very useful aggregators of these on a page by page basis are
http://stats.grok.se/
http://wikistics.falsikon.de/
Based on the same data, on a higher aggregation level there are visitor
counts for all projects in an easily digestible fashion
http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm
Also, for the past two months we have known much more about Wikimedia traffic,
based on 8 reports with all kinds of cross sections:
http://infodisiac.com/blog/2009/04/wikimedia-traffic-analyzed/
With regard to dammit.lt raw data I helped to preserve these for posterity
in a more compact and slightly filtered state, so that we can query them
much longer. (The dammit.lt server has space for one or two months.) Actually
Mathias Schindler started this important rescue effort. Each day all files
are downloaded and processed, reduced from 40 Gb per month to 3 Gb (May
2009). I also made a script to query these files, which is much more
efficient than processing the original hourly files. But the runtime is still
considerable, so querying these files without restraints through a public
interface is not advisable. But the toolserver could get a copy of the files
of course.
http://infodisiac.com/blog/wp-content/uploads/2009/05/influenza1.png
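The compaction step described above can be sketched roughly as follows: hourly counts are summed per page and rarely viewed pages are dropped. The four-field line layout and the `min_views` threshold are assumptions for illustration; the actual filter used in the preservation script is not described here.

```python
from collections import Counter

def aggregate_hourly(hourly_files, min_views=2):
    """Sum per-page view counts over many hourly files, dropping rare pages."""
    totals = Counter()
    for lines in hourly_files:
        for line in lines:
            fields = line.split()
            # Assumed layout: project, title, views, bytes.
            if len(fields) != 4 or not fields[2].isdigit():
                continue
            totals[(fields[0], fields[1])] += int(fields[2])
    # Keep only pages that reach the (hypothetical) minimum view count.
    return {key: n for key, n in totals.items() if n >= min_views}

# Made-up hourly samples.
hour_00 = ["en Influenza 10 1\n", "en Rare_page 1 1\n"]
hour_01 = ["en Influenza 15 1\n"]
daily = aggregate_hourly([hour_00, hour_01])
print(daily)
```

Storing one total per page per day instead of 24 hourly lines is one plausible way to get a large size reduction, at the cost of hourly resolution.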
Is this enough? Of course not, there is so much more to learn.
Considering geo data: for many months a patch for Domas' (and Tim's) code has
been lying around, by Antonio José Reinoso Peinado, that would add country
level geolocation data from Maxmind's public database (ip->geo lookup).
Although I promised to look at it, I haven't found the time yet.
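For readers unfamiliar with the idea, an ip->geo lookup at country level can be sketched like this. The prefix table below is hypothetical and uses reserved documentation address ranges; it does not use Maxmind's database or API, and it is not the patch mentioned above.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-country table built from documentation ranges;
# a real deployment would load a full geolocation database instead.
PREFIX_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "NL"),
    (ipaddress.ip_network("198.51.100.0/24"), "DE"),
]

def country_for(ip: str) -> str:
    """Return the country code for `ip`, or "??" if no prefix matches."""
    addr = ipaddress.ip_address(ip)
    for network, country in PREFIX_TABLE:
        if addr in network:
            return country
    return "??"

def views_by_country(ips):
    """Count requests per country; only the country code is retained."""
    return Counter(country_for(ip) for ip in ips)

counts = views_by_country(["192.0.2.7", "198.51.100.20", "192.0.2.99", "203.0.113.5"])
print(counts)
```

Only the aggregated country counts leave the function, so the addresses themselves need not be stored.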
Considering web bugs: comScore also proposed such a scheme to us.
Apart from the question of how much it would bring us that we don't or can't
figure out ourselves, an overriding concern is privacy.
Erik Zachte
Data Analyst
Wikimedia Foundation, Inc.
E-Mail: ezachte(a)wikimedia.org
Probably some of you have already seen that Google has made something that
I think will become the new form of mainstream Internet
perception. You may read the Slashdot article [1], a good description at
the blog "Google Operating System" [2] (not officially connected with
Google) and, of course, you may see the official site with more than
one hour of presentation [3].
I expected this kind of tool (a client connected with others via a P2P
XML-based protocol, with servers for identification). However, I
didn't expect that it would come so soon, that it would be done by one
large corporation, and that it would be done the right way: open
protocol, free software reference implementation.
On the official site they say that it will start to work during this
year. As a large corporation is behind the project, and the free
and open source community is able to participate, I have no doubt
that it will be implemented all over the Internet (and not just the
Internet) very quickly. Probably, in two years the basic component of
a modern operating system will not be a Web browser, but a Wave
client. Probably, the Web will become a storage system, while all of the
interaction will be done via Waves.
This development of the Internet is very strongly related to the Wikimedia projects:
* I want to be able to edit Wikipedia through the Wave client.
* I want to add my own notes to articles, history of articles etc.
* I want to have collection of my knowledge at one place, including
Wikipedia articles and my notes.
* I want to be able to make a program which would analyze articles on
Wikipedia and to give the program and/or analysis to my friends.
* I want many more things to be browsable or editable or whatever from
a Wave client...
All of these wishes of mine (which, in a year, will not be just mine) may be
fulfilled just through work on MediaWiki and Pywikipediabot. So, I am
calling on all of you who are willing to think about it, or who are in a
position to think about it, to start thinking :)
[1] - http://tech.slashdot.org/story/09/05/28/1912226/Googles-Wave-Blurs-Chat-Ema…
[2] - http://googlesystem.blogspot.com/2009/05/google-wave.html
[3] - http://wave.google.com/
Given currently existing technology, and technology that we can reasonably
assume to be available within the next decade, how can the WMF best achieve
its goal of giving every person free access to our current best summary of
all human knowledge?
Consider that Google Translate has the best machine translation corpus,
consisting not only of the Internet but also all United Nations translations
and many other datasets. It is the closest existing thing to a Babelfish,
now supporting 41 languages and winning all translation competitions for
several years. It will continue to be the best for the foreseeable future.
Consider that 75% of the world is not online and that there may be a way to
beat market forces in the race to getting free Internet access to every
person by literally giving Wikipedia to every person instead, offline. Our
current micro-content distribution model would be sufficient if everyone had
access to the Internet. They don't, so it's not.
Consider that the money the WMF could potentially raise through competitive
market forces (the OLPC way) may lag behind the money they can raise through
their idealistic goals, uncompromised values and principles, and smart
ideas. This money can be used to give copies of the entirety of Wikipedia
away.
Consider that access to Wikipedia does not require readability proper
(beautiful prose), just the ability to comprehend the information, and just
barely. The human brain is the most powerful translator in existence, we
just have to meet said brain halfway. We may see a meta language in our
lifetimes but not within the next decade. The current best meta language is
a set of fuzzy translations that are a function of the size of the source
and target language corpuses.
I propose a cheap cellphone-sized device (OWPP) whose only purpose is to
read Wikipedia. The WMF teams up with Google to obtain CC-BY-SA translations
from all supported source languages to all supported target languages. The
device holds just one copy of all of the Wikipedias in a single target
language.
The technical specifications of such a device allow for it to be extremely
cheap.
Let's let those of us fortunate enough to have access to the Internet
write an encyclopedia and give it to those who are not,
sooner rather than later.
Brian
On Mon, Jun 1, 2009 at 12:20 PM, Thomas Dalton <thomas.dalton(a)gmail.com> wrote:
> 2009/6/1 Brian <Brian.Mingus(a)colorado.edu>:
> > While I'm thinking about it:
> >
> > I would like to see the WMF solicit feedback on these kinds of issues -
> > how it might further its goals (distribution for example) - from the
> > wider readership. The small, well informed and focused group on
> > foundation-l can do a lot, but what about inviting everyone to the
> > conversation in a medium that makes it easy for them to contribute
> > their ideas?
> >
> > Erik, you had pitched us the Ideazilla application not too long ago.
> > That in coordination with a site notice would be an awesome experiment.
> > Let's do it sooner rather than later? :)
>
> Did you see this email (and the resulting thread)?
> http://lists.wikimedia.org/pipermail/foundation-l/2009-April/051580.html
>
> The kind of discussion you suggest could well be part of that, or at
> least done in connection with it.
>
I'm glad they are doing that but it's not quite what I was thinking. My idea
is more of a combination of the ideas that liquid threads and ideazilla
bring to mind (not necessarily related to how those applications actually
work, however - just the ideas they elicit). I got the impression from
Michael's e-mail that Wikimedia's Strategic Planning would mostly be done by
really smart Wikimedians who are already meta-contributors. It was broad
enough to include all volunteers, but wasn't really oriented around helping
those volunteers become strategic planners. Ultimately, strategic planning
will be best done in coordination with a vast amount of evidence and
opinions. It doesn't make sense to create a strategic plan before
considering all possible options in detail. Nowadays we can do that better
than it's ever been done before.
Idea: You perform a Google search for some topic and end up at Wikipedia.
You find your information and are now looking for your next distraction when
you see a prominent site notice that says, "How can we make Wikipedia
better?" or somesuch. You click it and end up at a fully ajaxified
application that doesn't require (but supports) login, has no captchas and
does all anti-spam and anti-ballot stuffing on the backend (and a
"report/flag this thread" link for human spam detection). What you see is a
list of idea threads that are ranked according to simple ajax thumbs up /
thumbs down votes in addition to a fully ajax form for adding a new idea.
Clicking on it loads the threaded idea conversation on the same page. You
can vote on individual comments and reply to them on the same page.
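The ranking part of this idea could be sketched as below. Ordering by net votes with ties broken by total votes is just one simple choice made for illustration; the `Idea` class and the sample titles are hypothetical and not part of Ideazilla or LiquidThreads.

```python
from dataclasses import dataclass

@dataclass
class Idea:
    title: str
    up: int = 0
    down: int = 0

    @property
    def score(self) -> int:
        # Net score: thumbs up minus thumbs down.
        return self.up - self.down

def rank(ideas):
    """Order ideas by net score, breaking ties by total number of votes."""
    return sorted(ideas, key=lambda i: (i.score, i.up + i.down), reverse=True)

# Made-up idea threads with made-up vote counts.
ideas = [
    Idea("Offline reader", up=12, down=3),
    Idea("Better search", up=30, down=25),
    Idea("Dark theme", up=9, down=0),
]
print([i.title for i in rank(ideas)])
```

A net score is easy to explain to casual visitors, though it favors older threads that have had more time to collect votes.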
As you can tell I am of the opinion that loading pages incurs a heavy
cognitive load, lowering the probability that you will convert the reader
into a collaborator. There is of course the spam/ham tradeoff, but Gmail and
Craigslist have nailed the solution and we could too. An important takeaway
from the brain sciences regarding executive function is that you need to
"trick" yourself (or your users) into switching tasks. Task switching is
tough and every single additional degree of freedom that you add between
your user reading and then following up on that with some creative writing
lowers the probability that it will happen very significantly.
On Sunday, 31 May, the Norwegian newspaper VG (Verdens Gang) compared
Wikipedia in Norsk (Bokmål) and Store Norske Leksikon (SNL). The latter
encyclopedia is a large traditional paper lexicon transferred to a web
portal, together with two other lexicons: one medical and health lexicon
and one biographical lexicon. SNL has moved from a closed licensing
model to an open licensing model, and has also opened up for outside
contributions.
The roundup focused on five different areas: history, culture,
entertainment, society and politics, and sport. The grades go from 1
to 6, with 6 as the best.
History: Wp 5 - SNL 5
(Det norske arbeiderparti, Andre verdenskrig, Vikinger, Vikingtid)
Culture: Wp 4 - SNL 3
(Edvard Munch, Henrik Ibsen, Leif Ove Andsnes, Hjalmar Borgström,
Ungdommens kulturmønstring)
Entertainment: Wp 5 - SNL 2
(Melodi Grand Prix, Harald Eia, Tone Damli Aaberget, Nytt på nytt,
Alexander Rybak, Idol, Wenche Foss, Sivert Høyem)
Society and politics: Wp 5 - SNL 3
(Kristin Halvorsen, AUF-skandalen, Rødt, Saera Khan)
Sport: Wp 5 - SNL 3
(Marit Breivik, Tore André Flo, Tore Reginiussen, Kjetil André Aamodt,
Fotball-VM)
The comparison seems to have a slight bias because there are too many
articles about persons and events fairly close to the present time.
Wikipedia is much better on those kinds of articles than on more
classical lexicon articles, where it can be assumed that SNL would be
somewhat better.
John