On Sunday 28 July 2002 03:00 am, The Cunctator wrote:
> What are the articles this person has been changing?
For 66.108.155.126:
20:08 Jul 27, 2002 Computer
20:07 Jul 27, 2002 Exploit
20:07 Jul 27, 2002 AOL
20:05 Jul 27, 2002 Hacker
20:05 Jul 27, 2002 Leet
20:03 Jul 27, 2002 Root
20:02 Jul 27, 2002 Hacker
19:59 Jul 27, 2002 Hacker
19:58 Jul 27, 2002 Hacker
19:54 Jul 27, 2002 Principle of least astonishment
19:54 Jul 27, 2002 Hacker
19:52 Jul 27, 2002 Trance music
19:51 Jul 27, 2002 Trance music
For 208.24.115.6:
20:20 Jul 27, 2002 Hacker
For 141.157.232.26:
20:19 Jul 27, 2002 Hacker
Most of these were complete replacements of the article with incoherent
statements, such as "TAP IS THE ABSOLUTE DEFINITION OF THE NOUN HACKER" for Hacker.
For the specifics, follow http://www.wikipedia.com/wiki/Special:Ipblocklist
and look at the contribs.
--mav
Dear all,
Most of you would be aware of some of the discussions that have occurred
around Wikipedia in the Norwegian languages. Since the last round of
discussions on this list, there has been a lot of internal debate, as
well as what seems to be a fairly widely accepted agreement following
voting.
After a brief recap of Norwegian language and wikipedia issues, this
e-mail takes those interested through the latest developments and stakes
out the road ahead. It is also intended to inform the international
community about the current agreement on no.wikipedia, so as to prevent
misunderstandings in the future.
Finally, we will mention an unfortunate reaction to the vote by a small
number of users at the Norwegian Bokmål/Riksmål (no:) wikipedia who want
to disregard the result of the voting and are planning to create a
_third_ Norwegian wikipedia with the sole mission of mixing the contents
of the two current Norwegian versions.
== A short language history of Norway ==
Spoken Norwegian ("norsk") (ISO 639-1 alpha-2 code "no") is in a fairly
unique situation compared to most other languages of the world in that
it has two widely accepted written standards, Bokmål (ISO 639-1 alpha-2
code "nb") and Nynorsk (ISO 639-1 alpha-2 code "nn"). By national
legislation they are both regarded as official written forms of
Norwegian. In addition, many people still make a distinction between
Bokmål and its precursor Riksmål, which is still in use.
Briefly speaking, Bokmål and Riksmål are descendants of the Danish
written language. Until the 1800s, Danish was the only widely used
written language in Norway as a result of four centuries of union with
Denmark. With increasing independence came a wish to norwegianise the
Danish standard, with Knud Knudsen at the forefront of changing parts
of the vocabulary and orthography. Thus, Riksmål, and later Bokmål,
resulted. Together, these forms are today probably used by about 90% of
Norway's population, or somewhere around 3,500,000 people.
Parallel to this development, a new written standard was created by Ivar
Aasen. He travelled extensively throughout Norway, and based his new
language, landsmål, on the grammar and vocabulary of dialect samples
from around the country. This was later renamed Nynorsk. Modern Nynorsk
differs significantly from modern Bokmål, and may be regarded
linguistically as being as different from it (or as similar to it, if
you like) as Swedish is from Danish. For English or Dutch/German
speakers, the differences may be
likened to those between (Lowland) Scots and English or Low German and
Dutch. Today it is estimated that about 500,000-600,000 people have
Nynorsk as their first written language.
More information about the Norwegian language history can be found in
English, German, French, Spanish or Portuguese on the website of the
Norwegian Language Council:
http://www.sprakrad.no/templates/Page.aspx?id=653
== A short history of Wikipedia in Norwegian ==
The first Norwegian wikipedia started 26 November 2001 on the subdomain
no.wikipedia.org. Like most wikipedias, its contributor and article counts
started really picking up around the end of 2003. At the time, it
accepted all written standards of Norwegian, although the amount of
Nynorsk was minimal. There were already several debates about the
feasibility and appropriateness of keeping the two languages united on
one Wikipedia. On 31 July 2004 a Wikipedia for Nynorsk was created.
The creation of nn:, however, split the community at no: wikipedia. Many
felt that given that Nynorsk now had its own wikipedia, no: should
become a Bokmål/Riksmål Wikipedia only. Others disapproved and claimed
that there was no need to change and that it should continue its
language policy of accepting all and keep its interwiki link name of
"Norsk".
Nynorsk Wikipedia soon proved a success: within the next few months it
gathered several people who had felt uncomfortable in the (mainly)
Bokmål environment at no:. The name displayed in interwiki
links became "Norsk (nynorsk)" (languages are not spelt with upper case
in Norwegian). To date it continues to be one of the fastest growing
wikipedias, with a steady article increase, now at over 6000 articles
and more than 50 editors with more than 10 edits since arrival.
== Votes ==
The issue of no:'s language policy has come up time and again, and a
vote was held in March ([[:no:Wikipedia:Målform]]) as to which policy to
adopt. Independent of the method of the tally (whether or not to include
new contributors etc.) there was a majority for switching to a
Bokmål/Riksmål only language policy (50% for Bokmål/Riksmål, 43.2% for
Bokmål/Riksmål/Nynorsk/Høgnorsk, and 6.8% for the official variants
Bokmål/Nynorsk only).
Following this result, there is now going to be a vote on which
interwiki link name will most appropriately reflect the current language
policy of no:. The result of this vote will most likely be either "Norsk
(bokmål)" or "Norsk (bokmål/riksmål)".
Understandably, there has also been a debate as to whether the subdomain
should change from "no" to "nb", as this is the correct representation
of Bokmål according to ISO 639-1. However, there is some resentment
towards such a move, and currently a general acceptance of letting the
Bokmål wikipedia stay at "no". The alternative some have suggested is a
server-side redirect from "no" to "nb", in the same way that "nb" today
is a server-side redirect to the equivalent page on "no".
== Summary of the problem ==
Unfortunately, a small group of users (who all write Bokmål/Riksmål) are
ignoring the results from the vote, and are claiming they want to
re-establish a wikipedia for all written standards of Norwegian. They
claim they have been in touch with people centrally in Wikimedia
(developers? stewards?) and that they have so far received positive
comments. With this email, we would like to state that there
have been no official decisions about creating a third Norwegian
wikipedia containing both Bokmål and Nynorsk; it is merely an unofficial
initiative from a small group of users who started a sign-on list at
[[:no:Bruker:Norsk_Wikipedia]]. A spontaneous list with signatures
against this activity was immediately created at
[[:no:Wikipedia-diskusjon:Fellesnorsk]]. The process of creating a third
Norwegian wikipedia has not gone through a voting process in either of the
two existing Norwegian wikipedias (no: and nn:) and cannot be
considered a decision by the Norwegian Wikipedia community.
We believe the creation of a third wikipedia under the Wikimedia
foundation would have a serious and unfortunate impact on the existing
wikipedias in Norwegian, no: and nn:, and would undermine Wikipedia's
reputation in Norway. This being said, we are all for extensive
co-operation between the four Scandinavian language wikipedias (including
Swedish and Danish), as evidenced by the recent creation of
[[:meta:Skanwiki]], the Scandinavian meta-pages, and the use of featured
articles from neighbour wikipedias.
== Conclusion ==
Hopefully, this letter will help people better understand the
complicated language situation of the Norwegian Wikipedia community, so
as to give a background on which discussion can take place on this list
in the future, such as the inevitable debate following a possible
request for a re-establishment of the common (and third!) Norwegian
Wikipedia.
From the community of no.wikipedia.org and nn.wikipedia.org,
Bjarte Sørensen [[:meta:User:BjarteSorensen]] (Administrator/bureaucrat on nn:)
Lars Alvik [[:no:User:Profoss]] (Administrator/bureaucrat on no:)
Øyvind A. Holm [[:no:User:Sunny256]] (Administrator on no:)
Onar Vikingstad [[:no:User:Vikingstad]] (Administrator on no:)
Jon Harald Søby [[:no:User:Jhs]] (Administrator on no:)
Chris Nyborg [[:no:User:Cnyborg]] (Administrator on no:)
Guttorm Flatabø [[:no:User:Dittaeva]] (Administrator on nn:)
Gunleiv Hadland [[:meta:User:Gunnernett]] (Administrator on nn:)
Jarle Fagerheim [[:nn:User:Jarle]] (Administrator on nn:)
Øyvind Jo Heimdal Eik [[:en:User:Pladask]] (Administrator on nn: and no:)
Kristian André Gallis [[:nn:User:Kristaga]]
Vegard Wærp [[:no:User:Vegardw]]
Nina Aldin Thune [[:no:User:Nina]]
Thor-Rune Hansen [[:no:User:ThorRune]]
Claes Tande [[:no:User:Ctande]]
Arnt-Erik Krokaa [[:no:User:AEK]]
Rune Sattler [[:no:User:Shauni]]
Dear friend,
We are conducting a study on the motivation for knowledge sharing in the
Wikipedia community.
The contributors’ experience with Linux is very important to the design and
management of this knowledge platform.
Would you please post the following on-line questionnaire message to the
Wikipedia platform or forward the message to the members?
After the survey is done, we will randomly select twenty persons and present
them with 2 GB USB flash drives.
In addition, for each valid questionnaire, we will donate US $1 to the
Wikimedia Foundation.
The results of this survey will be analyzed anonymously and used for academic
purposes only.
Please help us to complete the data collection.
Thanks so much for your help.
Cheers,
Joanne
[The Message content]
Dear friends,
We are conducting a study on the motivation for knowledge sharing on
Wikipedia. Your experience of reading from and writing to Wikipedia is very
important to the design and management of this knowledge platform. The
survey will take about two minutes. We deeply appreciate your help in
answering the following questions.
After the survey is done, we will randomly select twenty persons and
present them with 2 GB USB flash drives. In addition, for each valid
questionnaire, we will donate US $1 to the Wikimedia Foundation. The
results of this survey will be analyzed anonymously and used for academic
purposes only. Please feel free to fill out the questionnaire. Thanks
again for your time and valuable input.
May happiness and health be with you every day!
★ On-line Questionnaire: http://140.119.19.152:8080/wiki/
Shari S. C. Shang
Eldon Y. Li
Professor,
Department of Management Information Systems,
National Chengchi University
Tel.: +886-2-82374038; Fax: +886-2-29393754; E-mail: s1213527(a)yahoo.com.tw
Dear All,
My name is Avanidhar Chandrasekaran
(http://en.wikipedia.org/wiki/User_talk:Avanidhar).
I work with GroupLens Research at the University of Minnesota, Twin Cities.
As part of my research, I am analyzing the usefulness and
necessity of author reputation in Wikipedia.
In light of this, I have simulated an interface that colors words in an article
based on their age.
As you are experienced contributors to Wikipedia, I invite you to participate in
this study, which involves the following.
1. Please visit the following instances of Wikipedia and evaluate the
interface components which have been incorporated into each of them. Each
of these uses its own algorithm to color text.
a) The Wikitrust project
http://wiki-trust.cse.ucsc.edu/index.php/Main_Page
b) The Wiki-reputation project at GroupLens Research
http://wiki-reputation.cs.umn.edu/index.php/Main_Page
2. Once you have evaluated the two interfaces, kindly complete this survey
on Wikipedia quality
http://www.surveymonkey.com/s.aspx?sm=hagN5S1JZHxH6pF9SmXkkA_3d_3d
We hope to get your valuable feedback on these interfaces and how Wikipedia
article quality can be improved.
Thanks for your time
Avanidhar Chandrasekaran,
GroupLens Research, University of Minnesota
----- Original Message -----
From: <wikipedia-l-request(a)lists.wikimedia.org>
To: <wikipedia-l(a)lists.wikimedia.org>
Sent: Tuesday, November 25, 2008 7:59 AM
Subject: Wikipedia-l Digest, Vol 64, Issue 4
> Send Wikipedia-l mailing list submissions to
> wikipedia-l(a)lists.wikimedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
> or, via email, send a message with subject or body 'help' to
> wikipedia-l-request(a)lists.wikimedia.org
>
> You can reach the person managing the list at
> wikipedia-l-owner(a)lists.wikimedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wikipedia-l digest..."
>
>
> Today's Topics:
>
> 1. Re: Study on Interfaces to Improving Wikipedia Quality
> (J.L.W.S. The Special One)
> 2. Re: Study on Interfaces to Improving Wikipedia Quality
> (Gregory Maxwell)
> 3. Re: Wikipedia-l Digest, Vol 64, Issue 3 (Jocla)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 25 Nov 2008 11:59:54 +0800
> From: "J.L.W.S. The Special One" <hildanknight(a)gmail.com>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID:
> <d41ac4640811241959o239db59bp1807b90877a33aa9(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> How would the system handle a paragraph full of high quality,
> well-referenced and well-organised content contributed by an editor A,
> that
> is thoroughly copyedited by an editor B? Would editor A be deemed less
> trustworthy when his prose is thoroughly copyedited?
>
> 2008/11/25, Luca de Alfaro <luca(a)dealfaro.org>:
>>
>> Maury,
>>
>> perhaps I can help explain the behavior you saw in the UCSC system (I am
>> one
>> of the developers).
>> New text is always somewhat orange, to signal to visitors that it has not
>> yet been fully reviewed.
>> The higher the reputation, the lighter the shade of orange, but orange it
>> still is (I have no idea of how high was your computed reputation when
>> you
>> started writing that article).
>>
>> Text background becomes white when other people revise it without
>> drastically changing it: this indicates consensus.
>> In our more recent code version, we also have a "vote" button; using
>> this,
>> text can more speedily gain trust without need for many revisions to
>> occur.
>> In a live experiment, where people can click on the vote button, I
>> presume
>> the trust of the text would rise more rapidly. Note that the code
>> prevents
>> double voting, or creating sock-puppet accounts to vote, etc etc.
>>
>> So I don't think based on what you say that the system is tripping over
>> diffs. It is simply considering new text less trusted, and more revised
>> text more trusted, which is what we wanted. It appears however we don't
>> do
>> a very good job on the web site describing the algorithm (I guess we put
>> most of the description work in writing the papers... we will try to
>> improve
>> the web site).
>>
>> We don't measure "edit work" in number of edits, but in number of words
>> changed.
>> As you say, for our system, changing 1000 words in separate edits is the
>> same (provided the edits are all kept, i.e., not reverted) as providing a
>> single 1000-word contribution. We thought of giving a larger prize to
>> larger contributions: precisely, of making the reputation increment
>> proportional to n^a, where n is the number of words, and a > 1. This did
>> not work well for the Wikipedia, because it ended up not rewarding enough
>> the work of the many editors, who clean and polish the articles, thus
>> making
>> many small edits. Technically it would be trivial to change the code to
>> include such a non-linear reward scheme (to adopt rewards proportional to
>> n^a rather than n); whether it is desirable, I have no idea. It does not
>> lead to better quantitative performance of the system, i.e., the
>> resulting
>> trust is not better at predicting future text deletions.
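
A minimal sketch of the two reward schemes described above, in Python. The function names and the constant of proportionality (taken as 1) are assumptions made for illustration; this is not the WikiTrust code. It only shows why a linear reward treats 1000 surviving one-word edits the same as one surviving 1000-word edit, while an exponent a > 1 favours the single large contribution.

```python
from typing import Iterable


def reputation_increment(words_changed: int, exponent: float = 1.0) -> float:
    """Reward for one surviving edit, proportional to n**a.

    The constant of proportionality is assumed to be 1 (hypothetical).
    exponent == 1.0 gives the linear scheme; exponent > 1.0 gives the
    non-linear scheme that favours larger contributions.
    """
    if words_changed < 0:
        raise ValueError("words_changed must be non-negative")
    return float(words_changed) ** exponent


def total_reward(edit_sizes: Iterable[int], exponent: float = 1.0) -> float:
    """Sum the increments over all edits that were kept (not reverted)."""
    return sum(reputation_increment(n, exponent) for n in edit_sizes)


many_small_edits = [1] * 1000   # 1000 edits of one word each
one_large_edit = [1000]         # a single 1000-word contribution

# Linear scheme (a = 1): both editors earn the same total.
linear_equal = total_reward(many_small_edits) == total_reward(one_large_edit)

# Non-linear scheme (a = 1.5): the single large edit earns far more,
# which is why it under-rewards editors who polish with many small edits.
nonlinear_gap = (total_reward(one_large_edit, 1.5)
                 > total_reward(many_small_edits, 1.5))

print(linear_equal, nonlinear_gap)
```

Under these assumptions both printed values are True, which matches the trade-off described in the message: the choice of exponent decides whether many small clean-up edits or one big contribution is rewarded more.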
>>
>>
>> Luca
>>
>>
>>
>> > The UCSC system did work, but gave me odd results. Apparently I have a
>> > very bad reputation, because when I look in the History at the first
>> > versions, which I wrote in entirety, it colored it all yellow!
>> >
>> > Newer versions of the same articles had much more white, even though
>> > huge portions of the text were still from the original. This may be due
>> > to diff problems -- I consider diff to be largely random in
>> > effectiveness, sometimes it works, but other times a single whitespace
>> > change, especially vertical, will make it think the entire article was
>> > edited.
>> >
>> > My guess is that the system is tripping over diffs like this, and thus
>> > considering the article to have been re-written by another editor.
>> > Since this has happened, MY reputation goes down, or so I understand
>> > it.
>> >
>> > I don't think this system could possibly work if based on wiki's
>> > diffs. If it's going to work it's going to need to use a much more
>> > reliable system.
>> >
>> > Another problem I see with it is that it will rank an author whose
>> > contributions are 1000 unchanged comma inserts to be as reliable as an
>> > author who created a perfect 1000 character article (or perhaps rate
>> > the first even higher). There should be some sort of length bias, if
>> > an author makes a big edit, out of character, that's important to
>> > know.
>> >
>> > Maury
>> >
>> > _______________________________________________
>> > Wikipedia-l mailing list
>> > Wikipedia-l(a)lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
>> >
>>
>
>
>
> --
> Written with passion,
> J.L.W.S. The Special One
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 24 Nov 2008 23:14:57 -0500
> From: "Gregory Maxwell" <gmaxwell(a)gmail.com>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID:
> <e692861c0811242014h464f5a2ei31a1cb3aecfea04b(a)mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Mon, Nov 24, 2008 at 8:35 PM, Luca de Alfaro <luca(a)dealfaro.org> wrote:
> [snip]
>> So I don't think based on what you say that the system is tripping over
>> diffs.
>
> For example: I can't figure out why the text in the image caption is
> colored here
> http://wiki-trust.cse.ucsc.edu/index.php/Digital_room_correction
>
> I couldn't initially figure out why *anything* above the external link
> section was colored, though the inability to diff contributed to that.
>
> On Mon, Nov 24, 2008 at 8:22 PM, Luca de Alfaro <luca(a)dealfaro.org> wrote:
>> I agree with Gregory that it is very useful to quantify the usefulness of
>> trust information on text -- otherwise, all comparisons are very
>> subjective.
>> In our WikiSym 08 paper, we measure various parameters of the "trust"
>> coloring we compute, including:
>>
>> - Recall of deletions. Only 3.4% of text is in the lower half of trust
>> values, yet this is 66% of the text that is deleted in the very next
>> revision.
>> - Precision of deletions. Text in the bottom half of trust values has
>> probability 33% of being deleted in the next revision, against a
>> probability
>> of 1.9% for general text. The deletion probability rises to 62% for
>> text
>> in the bottom 20% of trust values.
>> - We study the correlation between the trust of a word, sampled at
>> random
>> in all revisions, and the future lifespan of a word (correcting for the
>> finite horizon effect due to the finite number of revisions in each
>> article), showing positive correlation.
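
The first two measures in the list above can be sketched as follows, in Python. Everything here is an assumption made for illustration: the function name, the 0-10 trust scale, the threshold of 5 for "lower half", and the eight sample words, which are invented and are not data from the paper. "Recall" is the share of deleted words that were low-trust; "precision" is the share of low-trust words that were deleted in the next revision.

```python
from typing import List, Tuple


def deletion_precision_recall(
    words: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (precision, recall) of low trust as a predictor of deletion.

    Each item in `words` is (trust_value, deleted_in_next_revision).
    A word counts as low-trust when trust_value < threshold.
    """
    low_trust_flags = [deleted for trust, deleted in words if trust < threshold]
    low_trust_deleted = sum(low_trust_flags)
    all_deleted = sum(1 for _, deleted in words if deleted)

    # Precision: of the low-trust words, how many were deleted?
    precision = low_trust_deleted / len(low_trust_flags) if low_trust_flags else 0.0
    # Recall: of the deleted words, how many had been flagged as low-trust?
    recall = low_trust_deleted / all_deleted if all_deleted else 0.0
    return precision, recall


# Invented sample: (trust on an assumed 0-10 scale, deleted next revision?)
sample_words = [
    (1.0, True), (2.0, True), (3.0, False),      # low-trust words
    (8.0, True), (9.0, False), (9.0, False),     # high-trust words
    (10.0, False), (10.0, False),
]

precision, recall = deletion_precision_recall(sample_words, threshold=5.0)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up sample two of the three low-trust words are deleted, and two of the three deleted words were low-trust, so both figures come out at two thirds. The percentages quoted in the message come from the paper's own data, not from anything like this toy input.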
> [snip]
>
> These performance metrics are better than I would have guessed from
> browsing through the output. How does the color mapping reflect the
> trust values? Basically when I use it I see a *lot* of colored things
> which are perfectly fine. At least for me, the difference between
> shades is far less cognitively significant than colored vs
> non-colored, so that may be the source of my confusion.
>
> Have you compared your system to a simple toy trust metric? I'd
> propose "revisions by users in their first week and before their first
> 7 (?) edits are untrusted". This reflects the existing automatic
> trust system on the site (auto-confirmation), and also reflects a
> type of trust checking applied manually by editors. I think that's
> the bar any more sophisticated trust metric needs to outperform.
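
A minimal sketch of the toy baseline proposed above, in Python. The function name and parameters are hypothetical, and the thresholds (7 days, 7 edits) are the tentative ones from the message. The wording is read literally here: a revision is untrusted only if the author is both in their first week and has made fewer than 7 earlier edits. A stricter reading, closer to how auto-confirmation usually requires both conditions to be met, would replace `and` with `or`.

```python
from datetime import datetime, timedelta


def is_untrusted(
    registered: datetime,
    revision_time: datetime,
    prior_edits: int,
    min_account_age: timedelta = timedelta(days=7),
    min_prior_edits: int = 7,
) -> bool:
    """Toy trust baseline: flag revisions by very new, low-activity users.

    Literal reading of the proposal: untrusted when the account is younger
    than `min_account_age` AND the user has fewer than `min_prior_edits`
    earlier edits. Both thresholds are assumptions taken from the message.
    """
    in_first_week = (revision_time - registered) < min_account_age
    before_nth_edit = prior_edits < min_prior_edits
    return in_first_week and before_nth_edit


registered = datetime(2008, 11, 1)

# Two days after registering, with 2 earlier edits: untrusted.
print(is_untrusted(registered, datetime(2008, 11, 3), prior_edits=2))
# Nearly three weeks after registering: trusted under this reading.
print(is_untrusted(registered, datetime(2008, 11, 20), prior_edits=2))
```

A baseline this simple needs only the registration date and edit count of each author, so it is cheap to compare against a more elaborate trust metric on the same revisions.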
>
> Thank you so much for your response!
>
> ------------------------------
>
> Message: 3
> Date: Tue, 25 Nov 2008 07:59:19 -0000
> From: "Jocla" <paresdoce(a)gmail.com>
> Subject: Re: [Wikipedia-l] Wikipedia-l Digest, Vol 64, Issue 3
> To: <wikipedia-l(a)lists.wikimedia.org>
> Message-ID: <001701c94ed3$b9b78d50$7f01a8c0@windows337902b>
> Content-Type: text/plain; format=flowed; charset="iso-8859-1";
> reply-type=original
>
> susceibe
> ----- Original Message -----
> From: <wikipedia-l-request(a)lists.wikimedia.org>
> To: <wikipedia-l(a)lists.wikimedia.org>
> Sent: Tuesday, November 25, 2008 1:35 AM
> Subject: Wikipedia-l Digest, Vol 64, Issue 3
>
>
>> Today's Topics:
>>
>> 1. suscribe (Jocla)
>> 2. Study on Interfaces to Improving Wikipedia Quality
>> (avani(a)cs.umn.edu)
>> 3. Re: Study on Interfaces to Improving Wikipedia Quality
>> (michael west)
>> 4. Re: Study on Interfaces to Improving Wikipedia Quality
>> (Joseph Reagle)
>> 5. Re: Study on Interfaces to Improving Wikipedia Quality
>> (Maury Markowitz)
>> 6. Re: Study on Interfaces to Improving Wikipedia Quality
>> (Gregory Maxwell)
>> 7. Re: Study on Interfaces to Improving Wikipedia Quality
>> (Luca de Alfaro)
>> 8. Re: Study on Interfaces to Improving Wikipedia Quality
>> (Luca de Alfaro)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Wed, 19 Nov 2008 18:22:03 -0000
>> From: "Jocla" <paresdoce(a)gmail.com>
>> Subject: [Wikipedia-l] suscribe
>> To: <wikipedia-l(a)lists.wikimedia.org>
>> Message-ID: <001c01c94a73$ba1850e0$7f01a8c0@windows337902b>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> thanks for your e-mail, i would like to suscribe.
>>
>> ------------------------------
>>
>> Message: 2
>> Date: 19 Nov 2008 13:23:53 -0600
>> From: avani(a)cs.umn.edu
>> Subject: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
>> Quality
>> To: wikipedia-l(a)lists.wikimedia.org
>> Message-ID: <Prayer.1.0.18.0811191323530.7842(a)sabinus.cs.umn.edu>
>> Content-Type: text/plain; format=flowed; charset=ISO-8859-1
>>
>>
>> Dear All,
>>
>> My name is Avanidhar Chandrasekaran
>> (http://en.wikipedia.org/wiki/User_talk:Avanidhar).
>>
>> I work with GroupLens Research at the University of Minnesota, Twin
>> Cities.
>> As part of my research, I am involved in analyzing the usefulness and
>> Necessity of author reputation in Wikipedia.
>>
>> In lieu of this, I have simulated an Interface to color words in an
>> article
>> based on their Age.
>>
>> Being experienced contributors to Wikipedia, I invite you to participate
>> in
>> this study, which involves the following.
>>
>> 1. Please visit the following Instances of wikipedia and evaluate the
>> interface components which have been incorporated into each of them. Each
>> of these use their own algorithm to color text.
>>
>> a) The Wikitrust project
>>
>> http://wiki-trust.cse.ucsc.edu/index.php/Main_Page
>>
>> b) The Wiki-reputation project at Grouplens research
>>
>> http://wiki-reputation.cs.umn.edu/index.php/Main_Page
>>
>> 2) Once you have evaluated the two interfaces, kindly complete this
>> survey
>> on Wikipedia quality
>>
>> http://www.surveymonkey.com/s.aspx?sm=hagN5S1JZHxH6pF9SmXkkA_3d_3d
>>
>>
>> We hope to get your valuable feedback on these interfaces and how
>> Wikipedia
>> article quality can be improved.
>>
>> Thanks for your time
>>
>> Avanidhar Chandrasekaran,
>>
>> GroupLens Research, University of Minnesota
>>
>>
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Wed, 19 Nov 2008 20:01:27 +0000
>> From: "michael west" <michawest(a)gmail.com>
>> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
>> Quality
>> To: wikipedia-l(a)lists.wikimedia.org
>> Message-ID:
>> <cfe6de600811191201h727fb4e4s9660f64f2815c93f(a)mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> 2008/11/19 <avani(a)cs.umn.edu>
>>
>>> [snip]
>>
>> Quite interesting - the "age of words" color coding might be useful in
>> detecting obtuse type vandalism.
>>
>> m
>>
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Wed, 19 Nov 2008 17:40:23 -0500
>> From: Joseph Reagle <reagle(a)mit.edu>
>> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
>> Quality
>> To: wikipedia-l(a)lists.wikimedia.org
>> Message-ID: <200811191740.23471.reagle(a)mit.edu>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> On Wednesday 19 November 2008, avani(a)cs.umn.edu wrote:
>>> We hope to get your valuable feedback on these interfaces and how
>>> Wikipedia
>>> article quality can be improved.
>>
>> This might bias other respondents, but I thought it was an interesting
>> idea
>> so I wanted to share it. I concluded with the following which is no doubt
>> affected by my being a WikiGnome:
>>
>> [[
>> If I see an error, I fix it without much regard to time or author
>> reputation. I do pay attention to and investigate author reputation on
>> substantive issues on the discussion pages and it would be interesting to
>> see a discussion thread colored according to reputation.
>> ]]
>>
>>
>>
>> ------------------------------
>>
>> Message: 5
>> Date: Sun, 23 Nov 2008 09:03:25 -0500
>> From: "Maury Markowitz" <maury.markowitz(a)gmail.com>
>> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
>> Quality
>> To: wikipedia-l(a)lists.wikimedia.org
>> Message-ID:
>> <5bdbc9050811230603u5a9ca6e8ned59c4421c8eacb0(a)mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> On Wed, Nov 19, 2008 at 2:23 PM, <avani(a)cs.umn.edu> wrote:
>>> We hope to get your valuable feedback on these interfaces and how
>>> Wikipedia
>>> article quality can be improved.
>>
>> Given the older snapshots, I selected older articles that I had
>> started, NuBUS and ARCNET.
>>
>> The "time based" system from UMN did not work at all, every search
>> resulted in a page not found.
>>
>> The UCSC system did work, but gave me odd results. Apparently I have a
>> very bad reputation, because when I look in the History at the first
>> versions, which I wrote in entirety, it colored it all yellow!
>>
>> Newer versions of the same articles had much more white, even though
>> huge portions of the text were still from the original. This may be due
>> to diff problems -- I consider diff to be largely random in
>> effectiveness, sometimes it works, but other times a single whitespace
>> change, especially vertical, will make it think the entire article was
>> edited.
>>
>> My guess is that the system is tripping over diffs like this, and thus
>> considering the article to have been re-written by another editor.
>> Since this has happened, MY reputation goes down, or so I understand
>> it.
>>
>> I don't think this system could possibly work if based on wiki's
>> diffs. If it's going to work it's going to need to use a much more
>> reliable system.
>>
>> Another problem I see with it is that it will rank an author whose
>> contributions are 1000 unchanged comma inserts to be as reliable as an
>> author who created a perfect 1000 character article (or perhaps rate
>> the first even higher). There should be some sort of length bias, if
>> an author makes a big edit, out of character, that's important to
>> know.
>>
>> Maury
>>
>>
>>
>> ------------------------------
>>
>> Message: 6
>> Date: Sun, 23 Nov 2008 09:44:40 -0500
>> From: "Gregory Maxwell" <gmaxwell(a)gmail.com>
>> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
>> Quality
>> To: wikipedia-l(a)lists.wikimedia.org
>> Message-ID:
>> <e692861c0811230644i316f94abg6cafe7ef87f6bc3b(a)mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> On Sun, Nov 23, 2008 at 9:03 AM, Maury Markowitz
>> <maury.markowitz(a)gmail.com> wrote:
>>> On Wed, Nov 19, 2008 at 2:23 PM, <avani(a)cs.umn.edu> wrote:
>>>> We hope to get your valuable feedback on these interfaces and how
>>>> Wikipedia article quality can be improved.
>>>
>>> Given the older snapshots, I selected older articles that I had
>>> started, NuBUS and ARCNET.
>>>
>>> The "time based" system from UMN did not work at all, every search
>>> resulted in a page not found.
>>
>> The UMN system intentionally included only a small number (70?) of
>> articles. This is why you needed to use the random page function to
>> browse among them.
>>
>> This doesn't reflect any shortcoming of the system; it most
>> likely just reflects the limits of computational resources they were
>> working under.
>>
>> [snip]
>>> Newer versions of the same articles had much more white, even though
>>> huge portions of the text were still from the original. This may be due
>>> to diff problems -- I consider diff to be largely random in
>>> effectiveness, sometimes it works, but other times a single whitespace
>>> change, especially vertical, will make it think the entire article was
>>> edited.
>>
>> Yes, I had exactly the same experience with the UCSC system: different
>> coloring for text I'd added in the same edit which created the article.
>> Quite inscrutable.
>>
>> [snip]
>>> Another problem I see with it is that it will rank an author whose
>>> contributions are 1000 unchanged comma inserts to be as reliable as an
>>> author who created a perfect 1000 character article (or perhaps rate
>>> the first even higher). There should be some sort of length bias, if
>>> an author makes a big edit, out of character, that's important to
>>> know.
>>
>> For the articles it covered I found the UMN system to be more usable:
>> its output was more explicable, and the signal to noise ratio was
>> just better. This may be partially due to bugs in the UCSC history
>> analysis, and a different choice in coloring thresholds
>> (UCSC seemed to color almost everything, removing the usefulness of
>> color as something to draw my attention).
>>
>> Even so, I'm distrustful of "reputation" as an automated metric.
>> Reputation is a fuzzy thing (consider your comma example), but time is
>> just a straightforward metric which is much easier to get right. Your
>> tireless and unreverted editing of external links tells me very little
>> about your ability to make a reliable edit to the intro of an article,
>> ... or at least very little that I didn't already know by merely
>> knowing if your account was brand new or not. (New accounts are more
>> likely to be used by inexperienced and ill-motivated persons)
>>
>> I believe a metric applied correctly, consistently, and understandably
>> is just going to be more useful than a metric which considers more
>> data but is also subject to more noise. The differential performance
>> between these two systems has done nothing but confirm my suspicions
>> in this regard.
>>
>> A simple objective challenge for any predictive coloring system would
>> be to use it in the following experimental procedure:
>>
>> * Take a dump of Wikipedia up to a year old, and use this as the
>> underlying knowledge for the systems.
>> * Make several random selections of articles and include the newer
>> revisions not included in the initial set, up to 6 months old. Call
>> these the test sets.
>> * The predictive coloring system should then take each revision in a
>> test set in time order and predict if it will be reverted (within X
>> time?).
>> * The actual edits up to now should be analyzed to determine which
>> changes actually were reverted and when.
>>
>> The final score will be the false positive and false negative rates.
>> So long as we assume that the existing editing practices are not too
>> bad, we should find that the best predictive coloring system would
>> generally tend to minimize these rates.
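
The scoring step of the procedure above could be sketched roughly as follows. This is only an illustration of computing false positive and false negative rates from paired predictions and outcomes. The function name `error_rates` and the sample data are hypothetical, and nothing here comes from either research system.

```python
def error_rates(predicted_reverted, actually_reverted):
    """Return (false_positive_rate, false_negative_rate) for paired booleans."""
    pairs = list(zip(predicted_reverted, actually_reverted))
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    negatives = sum(1 for _, a in pairs if not a)  # revisions that were kept
    positives = sum(1 for _, a in pairs if a)      # revisions that were reverted
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr

# Hypothetical test set: one entry per revision, in time order.
predicted = [True, False, False, True, False, False]
actual = [True, False, True, False, False, False]
fpr, fnr = error_rates(predicted, actual)
print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```

A real evaluation would also have to fix the "within X time" window before deciding which revisions count as reverted.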
>>
>> ------------------------------
>>
>> Message: 7
>> Date: Mon, 24 Nov 2008 17:22:23 -0800
>> From: "Luca de Alfaro" <luca(a)dealfaro.org>
>> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
>> Quality
>> To: wikipedia-l(a)lists.wikimedia.org
>> Message-ID:
>> <28fa90930811241722y25c26bf1i6441b489e3ff6285(a)mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> I agree with Gregory that it is very useful to quantify the usefulness of
>> trust information on text -- otherwise, all comparisons are very
>> subjective.
>> In our WikiSym 08 paper, we measure various parameters of the "trust"
>> coloring we compute, including:
>>
>> - Recall of deletions. Only 3.4% of text is in the lower half of trust
>> values, yet this is 66% of the text that is deleted in the very next
>> revision.
>> - Precision of deletions. Text in the bottom half of trust values has
>> probability 33% of being deleted in the next revision, against a
>> probability of 1.9% for general text. The deletion probability rises
>> to 62% for text in the bottom 20% of trust values.
>> - We study the correlation between the trust of a word, sampled at
>> random in all revisions, and the future lifespan of a word (correcting
>> for the finite horizon effect due to the finite number of revisions in
>> each article), showing positive correlation.
>>
>> Some aspects are not captured by the above measures:
>>
>> - We ensured that every "tampering" (including cut-and-paste) is
>> reflected in the trust coloring, so it is hard to subvert the algorithm
>> (does "age" provide this?).
>> - We ensured the whole scheme is robust with respect to attacks (see the
>> various papers if you are interested).
>>
>> I fully believe that it should not be hard to improve on our system
>> regarding the above measurements. And I fully agree that the "reputation"
>> we compute is essentially an internal parameter of the system, and does
>> not really constitute a good summary of a person's overall Wikipedia
>> contribution; for this and other reasons we do not display it.
>>
>> Luca
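
For readers unfamiliar with the two measures named in this message, the following sketch shows one way precision and recall of deletions might be computed from word-level data. It assumes trust is normalized to [0, 1] and that "lower half" means trust below 0.5. The function name and the sample data are invented for illustration and are not taken from the WikiSym paper.

```python
def deletion_precision_recall(words, threshold=0.5):
    """words: list of (trust, deleted_in_next_revision) pairs.

    Precision: share of low-trust words that were deleted.
    Recall: share of deleted words that were low-trust.
    """
    low_trust = [deleted for trust, deleted in words if trust < threshold]
    low_deleted = sum(1 for deleted in low_trust if deleted)
    deleted_total = sum(1 for _, deleted in words if deleted)
    precision = low_deleted / len(low_trust) if low_trust else 0.0
    recall = low_deleted / deleted_total if deleted_total else 0.0
    return precision, recall

# Hypothetical sample: (trust value, was the word deleted in the next revision?)
sample = [
    (0.10, True), (0.20, False), (0.30, True), (0.40, False),
    (0.90, False), (0.80, True), (0.95, False), (0.70, False),
]
precision, recall = deletion_precision_recall(sample)
print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```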
>>
>>
>>
>> ------------------------------
>>
>> Message: 8
>> Date: Mon, 24 Nov 2008 17:35:13 -0800
>> From: "Luca de Alfaro" <luca(a)dealfaro.org>
>> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
>> Quality
>> To: wikipedia-l(a)lists.wikimedia.org
>> Message-ID:
>> <28fa90930811241735l235af9cag554632448d80ef7(a)mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Maury,
>>
>> perhaps I can help explain the behavior you saw in the UCSC system (I am
>> one of the developers).
>> New text is always somewhat orange, to signal to visitors that it has not
>> yet been fully reviewed.
>> The higher the reputation, the lighter the shade of orange, but orange it
>> still is (I have no idea how high your computed reputation was when you
>> started writing that article).
>>
>> Text background becomes white when other people revise it without
>> drastically changing it: this indicates consensus.
>> In our more recent code version, we also have a "vote" button; using
>> this, text can more speedily gain trust without need for many revisions
>> to occur.
>> In a live experiment, where people can click on the vote button, I
>> presume the trust of the text would rise more rapidly. Note that the
>> code prevents double voting, or creating sock-puppet accounts to vote,
>> etc.
>>
>> So I don't think, based on what you say, that the system is tripping over
>> diffs. It is simply considering new text less trusted, and more revised
>> text more trusted, which is what we wanted. It appears, however, that we
>> don't do a very good job on the web site describing the algorithm (I
>> guess we put most of the description work in writing the papers... we
>> will try to improve the web site).
>>
>> We don't measure "edit work" in number of edits, but in number of words
>> changed.
>> As you say, for our system, changing 1000 words in separate edits is the
>> same (provided the edits are all kept, i.e., not reverted) as providing a
>> single 1000-word contribution. We thought of giving a larger prize to
>> larger contributions: precisely, of making the reputation increment
>> proportional to n^a, where n is the number of words, and a > 1. This did
>> not work well for the Wikipedia, because it ended up not rewarding enough
>> the work of the many editors who clean and polish the articles, thus
>> making many small edits. Technically it would be trivial to change the
>> code to include such a non-linear reward scheme (to adopt rewards
>> proportional to n^a rather than n); whether it is desirable, I have no
>> idea. It does not lead to better quantitative performance of the system,
>> i.e., the resulting trust is not better at predicting future text
>> deletions.
>>
>> Luca
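
The difference between the linear and the n^a reward schemes described in this message can be seen in a toy calculation. This is a sketch only: the function name `reputation_gain` and the exponent 1.5 are assumptions chosen for illustration, and the real system's reputation update is described in the authors' papers, not here.

```python
def reputation_gain(edit_sizes, a=1.0):
    """Total reward for surviving edits, each rewarded in proportion to n**a."""
    return sum(n ** a for n in edit_sizes)

many_small = [1] * 1000  # 1000 separate one-word edits, all kept
one_large = [1000]       # a single 1000-word contribution

# Linear scheme (a = 1): both editors earn the same.
linear_small = reputation_gain(many_small)
linear_large = reputation_gain(one_large)

# Non-linear scheme with a hypothetical exponent a = 1.5.
boosted_small = reputation_gain(many_small, a=1.5)
boosted_large = reputation_gain(one_large, a=1.5)

print(linear_small, linear_large)
print(boosted_small, round(boosted_large))
```

With a > 1 the single large contribution earns far more than the same number of words spread over many small edits, which matches the stated reason the scheme under-rewarded editors who clean and polish articles.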
>>
>>
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> Wikipedia-l mailing list
>> Wikipedia-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
>>
>>
>> End of Wikipedia-l Digest, Vol 64, Issue 3
>> ******************************************
>>
>
>
>
>
> ------------------------------
>
> _______________________________________________
> Wikipedia-l mailing list
> Wikipedia-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
>
>
> End of Wikipedia-l Digest, Vol 64, Issue 4
> ******************************************
subscribe
----- Original Message -----
From: <wikipedia-l-request(a)lists.wikimedia.org>
To: <wikipedia-l(a)lists.wikimedia.org>
Sent: Tuesday, November 25, 2008 1:35 AM
Subject: Wikipedia-l Digest, Vol 64, Issue 3
> Send Wikipedia-l mailing list submissions to
> wikipedia-l(a)lists.wikimedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
> or, via email, send a message with subject or body 'help' to
> wikipedia-l-request(a)lists.wikimedia.org
>
> You can reach the person managing the list at
> wikipedia-l-owner(a)lists.wikimedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wikipedia-l digest..."
>
>
> Today's Topics:
>
> 1. suscribe (Jocla)
> 2. Study on Interfaces to Improving Wikipedia Quality
> (avani(a)cs.umn.edu)
> 3. Re: Study on Interfaces to Improving Wikipedia Quality
> (michael west)
> 4. Re: Study on Interfaces to Improving Wikipedia Quality
> (Joseph Reagle)
> 5. Re: Study on Interfaces to Improving Wikipedia Quality
> (Maury Markowitz)
> 6. Re: Study on Interfaces to Improving Wikipedia Quality
> (Gregory Maxwell)
> 7. Re: Study on Interfaces to Improving Wikipedia Quality
> (Luca de Alfaro)
> 8. Re: Study on Interfaces to Improving Wikipedia Quality
> (Luca de Alfaro)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 19 Nov 2008 18:22:03 -0000
> From: "Jocla" <paresdoce(a)gmail.com>
> Subject: [Wikipedia-l] suscribe
> To: <wikipedia-l(a)lists.wikimedia.org>
> Message-ID: <001c01c94a73$ba1850e0$7f01a8c0@windows337902b>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Thanks for your e-mail; I would like to subscribe.
>
> ------------------------------
>
> Message: 2
> Date: 19 Nov 2008 13:23:53 -0600
> From: avani(a)cs.umn.edu
> Subject: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID: <Prayer.1.0.18.0811191323530.7842(a)sabinus.cs.umn.edu>
> Content-Type: text/plain; format=flowed; charset=ISO-8859-1
>
>
> Dear All,
>
> My name is Avanidhar Chandrasekaran
> (http://en.wikipedia.org/wiki/User_talk:Avanidhar).
>
> I work with GroupLens Research at the University of Minnesota, Twin
> Cities. As part of my research, I am involved in analyzing the usefulness
> and necessity of author reputation in Wikipedia.
>
> In light of this, I have simulated an interface to color words in an
> article based on their age.
>
> As you are experienced contributors to Wikipedia, I invite you to
> participate in this study, which involves the following.
>
> 1. Please visit the following instances of Wikipedia and evaluate the
> interface components which have been incorporated into each of them. Each
> of these uses its own algorithm to color text.
>
> a) The Wikitrust project
>
> http://wiki-trust.cse.ucsc.edu/index.php/Main_Page
>
> b) The Wiki-reputation project at GroupLens Research
>
> http://wiki-reputation.cs.umn.edu/index.php/Main_Page
>
> 2. Once you have evaluated the two interfaces, kindly complete this survey
> on Wikipedia quality
>
> http://www.surveymonkey.com/s.aspx?sm=hagN5S1JZHxH6pF9SmXkkA_3d_3d
>
>
> We hope to get your valuable feedback on these interfaces and how
> Wikipedia article quality can be improved.
>
> Thanks for your time
>
> Avanidhar Chandrasekaran,
>
> GroupLens Research, University of Minnesota
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 19 Nov 2008 20:01:27 +0000
> From: "michael west" <michawest(a)gmail.com>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID:
> <cfe6de600811191201h727fb4e4s9660f64f2815c93f(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
>
> Quite interesting - the "age of words" color coding might be useful in
> detecting obtuse type vandalism.
>
> m
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 19 Nov 2008 17:40:23 -0500
> From: Joseph Reagle <reagle(a)mit.edu>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID: <200811191740.23471.reagle(a)mit.edu>
> Content-Type: text/plain; charset="iso-8859-1"
>
> On Wednesday 19 November 2008, avani(a)cs.umn.edu wrote:
>> We hope to get your valuable feedback on these interfaces and how
>> Wikipedia article quality can be improved.
>
> This might bias other respondents, but I thought it was an interesting idea
> so I wanted to share it. I concluded with the following, which is no doubt
> affected by my being a WikiGnome:
>
> [[
> If I see an error, I fix it without much regard to time or author
> reputation. I do pay attention to and investigate author reputation on
> substantive issues on the discussion pages and it would be interesting to
> see a discussion thread colored according to reputation.
> ]]
>
>
>
> ------------------------------
>
> Message: 5
> Date: Sun, 23 Nov 2008 09:03:25 -0500
> From: "Maury Markowitz" <maury.markowitz(a)gmail.com>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID:
> <5bdbc9050811230603u5a9ca6e8ned59c4421c8eacb0(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Wed, Nov 19, 2008 at 2:23 PM, <avani(a)cs.umn.edu> wrote:
>> We hope to get your valuable feedback on these interfaces and how
>> Wikipedia
>> article quality can be improved.
>
> Given the older snapshots, I selected older articles that I had
> started, NuBUS and ARCNET.
>
> The "time based" system from UMN did not work at all, every search
> resulted in a page not found.
>
> The USCS system did work, but gave me odd results. Apparently I have a
> very bad reputation, because when I look in the History at the first
> versions, which I wrote in entirety, it colored it all yellow!
>
> Newer versions of the same articles had much more white, even though
> huge portions of the text were still from the origial. This may be due
> to diff problems -- I consider diff to be largely random in
> effectiveness, sometimes it works, but othertimes a single whitespace
> change, especially vertical, will make it think the entire article was
> edited.
>
> My guess is that the system is tripping over diffs like this, and thus
> considering the article to have been re-written by another editor.
> Since this has happened, MY reputation goes down, or so I understand
> it.
>
> I don?t think this system could possibly work if based on wiki's
> diffs. If its going to work it?s going to need to use a much more
> reliable system.
>
> Another problem I see with it is that it will rank an author who?s
> contributions are 1000 unchanged comma inserts to be as reliable as an
> author who created a perfect 1000 character article (or perhaps rate
> the first even higher). There should be some sort of length bias, if
> an author makes a big edit, out of character, that?s important to
> know.
>
> Maury
>
>
>
> ------------------------------
>
> Message: 6
> Date: Sun, 23 Nov 2008 09:44:40 -0500
> From: "Gregory Maxwell" <gmaxwell(a)gmail.com>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID:
> <e692861c0811230644i316f94abg6cafe7ef87f6bc3b(a)mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Sun, Nov 23, 2008 at 9:03 AM, Maury Markowitz
> <maury.markowitz(a)gmail.com> wrote:
>> On Wed, Nov 19, 2008 at 2:23 PM, <avani(a)cs.umn.edu> wrote:
>>> We hope to get your valuable feedback on these interfaces and how
>>> Wikipedia
>>> article quality can be improved.
>>
>> Given the older snapshots, I selected older articles that I had
>> started, NuBUS and ARCNET.
>>
>> The "time based" system from UMN did not work at all, every search
>> resulted in a page not found.
>
> The UMN system intentionally included only a small number (70?)
> articles. This is why you needed to use the random page function to
> browse among them.
>
> This doesn't reflect any short coming of the system, but it most
> likely just reflects the limits of computational resources they were
> working under.
>
> [snip]
>> Newer versions of the same articles had much more white, even though
>> huge portions of the text were still from the origial. This may be due
>> to diff problems -- I consider diff to be largely random in
>> effectiveness, sometimes it works, but othertimes a single whitespace
>> change, especially vertical, will make it think the entire article was
>> edited.
>
> Yes, I had exactly the same experience with the USCS system: Different
> coloring for text I'd added in same edit which created the article.
> Quite inscrutable.
>
> [snip]
>> Another problem I see with it is that it will rank an author who?s
>> contributions are 1000 unchanged comma inserts to be as reliable as an
>> author who created a perfect 1000 character article (or perhaps rate
>> the first even higher). There should be some sort of length bias, if
>> an author makes a big edit, out of character, that?s important to
>> know.
>
> For the articles it covered I found the UMN system to be more usable:
> It's output was more explicable, and the signal to noise ratio was
> just better. This may be partially due to bugs in the USCS history
> analysis, and different a different choice in coloring thresholds
> (USCS seemed to color almost everything, removing the usefulness of
> color as something to draw my attention).
>
> Even so, I'm distrustful of "reputation" as an automated metric.
> Reputation is a fuzzy thing (consider your comma example), but time is
> just a straight forward metric which is much easier to get right. Your
> tireless and unreverted editing of external links tells me very little
> about your ability to make a reliable edit to the intro of an article,
> ... or at least very little that I didn't already know by merely
> knowing if your account was brand new or not. (New accounts are more
> likely to be used by inexperienced and ill-motivated persons)
>
> I believe a metric applied correctly, consistently, and understandably
> is just going to be more useful than a metric which considers more
> data but is also subject to more noise. The differential performance
> between these two systems has done nothing but confirm my suspicions
> in this regard.
>
> A simply objective challenge for any predictive coloring system would
> be to use them in the following experimental procedure:
>
> * Take a dump of Wikipedia up a year old, use this as the underlying
> knowledge for the systems.
> * Make several random selections of articles and include the newer
> revisions not included in the initial set up to 6 months old. Call
> these the test sets.
> * The predictive coloring system should then take each revision in a
> test set in time order and predict if it will be reverted (Within X
> time?).
> * The actual edits up to now should be analyzed to determined which
> changes actually were reverted and when.
>
> The final score will be the false positive and false negative rates.
> So long as e assume that the existing editing practices are not too
> bad we should find that the best predictive coloring system would
> generally tend to minimize these rates.
>
> ------------------------------
>
> Message: 7
> Date: Mon, 24 Nov 2008 17:22:23 -0800
> From: "Luca de Alfaro" <luca(a)dealfaro.org>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID:
> <28fa90930811241722y25c26bf1i6441b489e3ff6285(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> I agree with Gregory that it is very useful to quantify the usefulness of
> trust information on text -- otherwise, all comparison are very
> subjective.
> In our WikiSym 08 paper, we measure various parameters of the "trust"
> coloring we compute, including:
>
> - Recall of deletions. Only 3.4% of text is in the lower half of trust
> values, yet this is 66% of the text that is deleted in the very next
> revision.
> - Precision of deletions. Text is the bottom half of trust values has
> probability 33% of being deleted in the next revision, agaist a
> probability
> of 1.9% for general text. The deletion probability raises to 62% for
> text
> in the bottom 20% of trust values.
> - We study the correlation between the trust of a word, sampled at
> random
> in all revisions, and the future lifespan of a word (correcting for the
> finite horizon effect due to the finite number of revisions in each
> article), showing positive correlation.
>
> Some aspects are not captured by the above measures:
>
> - We ensured that every "tampering" (including cut-and-paste) are
> reflected in the trust coloring, so it is hard to subvert the algorithm
> (does "age" provide this?).
> - We ensured the whole scheme is robust wrt attacks (see the various
> papers if you are interested).
>
> I fully believe that it should not be hard to improve on our system re.
> the
> above measurements. And I fully agree that the "reputation" we compute is
> essentially an internal parameter of the system, and does not really
> constitute a good summary of a person's overall Wikipedia contribution;
> for
> this and other reasons we do not display it.
>
> Luca
>
> A simply objective challenge for any predictive coloring system would
>> be to use them in the following experimental procedure:
>>
>> * Take a dump of Wikipedia up a year old, use this as the underlying
>> knowledge for the systems.
>> * Make several random selections of articles and include the newer
>> revisions not included in the initial set up to 6 months old. Call
>> these the test sets.
>> * The predictive coloring system should then take each revision in a
>> test set in time order and predict if it will be reverted (Within X
>> time?).
>> * The actual edits up to now should be analyzed to determined which
>> changes actually were reverted and when.
>>
>> The final score will be the false positive and false negative rates.
>> So long as e assume that the existing editing practices are not too
>> bad we should find that the best predictive coloring system would
>> generally tend to minimize these rates.
>> _______________________________________________
>> Wikipedia-l mailing list
>> Wikipedia-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
>>
>
>
> ------------------------------
>
> Message: 8
> Date: Mon, 24 Nov 2008 17:35:13 -0800
> From: "Luca de Alfaro" <luca(a)dealfaro.org>
> Subject: Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia
> Quality
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID:
> <28fa90930811241735l235af9cag554632448d80ef7(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Maury,
>
> perhaps I can help explain the behavior you saw in the UCSC system (I am
> one
> of the developers).
> New text is always somewhat orange, to signal to visitors that it has not
> yet been fully reviewed.
> The higher the reputation, the lighter the shade of orange, but orange it
> still is (I have no idea of how high was your computed reputation when you
> started writing that article).
>
> Text background becomes white when other people revise it without
> drastically changing it: this indicates consensus.
> In our more recent code version, we also have a "vote" button; using this,
> text can more speedily gain trust without need for many revisions to
> occur.
> In a live experiment, where people can click on the vote button, I presume
> the trust of the text would raise more rapidly. Note that the code
> prevents
> double voting, or creating sock-puppet accounts to vote, etc etc.
>
> So I don't think based on what you say that the system is tripping over
> diffs. It is simply considering new text less trusted, and more revised
> text more trusted, which is what we wanted. It appears however we don't
> do
> a very good job on the web site describing the algorithm (I guess we put
> most of the description work in writing the papers... we will try to
> improve
> the web site).
>
> We don't measure "edit work" in number of edits, but in number of words
> changed.
> As you say, for our system, changing 1000 words in separate edits is the
> same (provided the edits are all kept, i.e., not reverted) as providing a
> single 1000-word contribution. We thought of giving a larger prize to
> larger contributions: precisely, of making the reputation increment
> proportional to n^a, where n is the number of words, and a > 1. This did
> not work well for the Wikipedia, because it ended up not rewarding enough
> the work of the many editors, who clean and polish the articles, thus
> making
> many small edits. Technically it would be trivial to change the code to
> include such a non-linear reward scheme (to adopt rewards proportional to
> n^a rather than n); whether it is desirable, I have no idea. It does not
> lead to better quantitative performance of the system, i.e., the resulting
> trust is not better at predicting future text deletions.
>
> Luca
>
>
>> The USCS system did work, but gave me odd results. Apparently I have a
>> very bad reputation, because when I look in the History at the first
>> versions, which I wrote in entirety, it colored it all yellow!
>>
>> Newer versions of the same articles had much more white, even though
>> huge portions of the text were still from the origial. This may be due
>> to diff problems -- I consider diff to be largely random in
>> effectiveness, sometimes it works, but othertimes a single whitespace
>> change, especially vertical, will make it think the entire article was
>> edited.
>>
>> My guess is that the system is tripping over diffs like this, and thus
>> considering the article to have been re-written by another editor.
>> Since this has happened, MY reputation goes down, or so I understand
>> it.
>>
>> I don?t think this system could possibly work if based on wiki's
>> diffs. If its going to work it?s going to need to use a much more
>> reliable system.
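
The whitespace concern can be illustrated with a small Python sketch using the standard-library `difflib` module. This is neither the wiki's diff nor the trust system's comparison code; the two helper functions and the sample text are hypothetical. It only shows why a line-based comparison can report a re-wrapped text as entirely changed while a word-based one does not.

```python
import difflib


def line_similarity(old: str, new: str) -> float:
    """Similarity ratio when whole lines are the unit of comparison."""
    matcher = difflib.SequenceMatcher(
        None, old.splitlines(), new.splitlines(), autojunk=False
    )
    return matcher.ratio()


def word_similarity(old: str, new: str) -> float:
    """Similarity ratio when whitespace-separated words are the unit."""
    matcher = difflib.SequenceMatcher(
        None, old.split(), new.split(), autojunk=False
    )
    return matcher.ratio()


# Same words, different line breaks (a purely "vertical" whitespace change).
original = "The quick brown fox\njumps over the lazy dog\nand runs away"
rewrapped = "The quick brown\nfox jumps over the lazy\ndog and runs away"

print(line_similarity(original, rewrapped))
print(word_similarity(original, rewrapped))
```

In this sample no line survives intact, so the line-based ratio drops to zero, while the word-based ratio stays at one because every word is preserved in order.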
>>
>> Another problem I see with it is that it will rank an author whose
>> contributions are 1000 unchanged comma inserts to be as reliable as an
>> author who created a perfect 1000 character article (or perhaps rate
>> the first even higher). There should be some sort of length bias, if
>> an author makes a big edit, out of character, that's important to
>> know.
>>
>> Maury
>>
>> _______________________________________________
>> Wikipedia-l mailing list
>> Wikipedia-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
>>
>
>
> ------------------------------
>
>
> End of Wikipedia-l Digest, Vol 64, Issue 3
> ******************************************
>
Hi, I'm a citizen of the Republic of Moldova, and I want to inform you
that in our country everyone writes the Moldovan language in Latin
letters.
When we were under Soviet occupation, they tried to Russify us
and forced us to write our language in Cyrillic.
In 1991, after gaining the freedom to choose, we chose to write our
language in Latin letters, as we did before the Russians conquered us
(without asking the people) and separated us from Romania (our motherland).
Therefore, as a free Moldovan-speaking man, I'm asking you to remove
mo.wikipedia.org (which is in Cyrillic and is very offensive to us) and
respect our choice as an independent nation, or to make it use Latin letters.
Thank you.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello all,
We're working on an update to the Wikipedia logo that can be used in
3-D, corrects all the incorrect glyphs, and includes
many other scripts that are not presently in the logo.
The project page is at <http://meta.wikimedia.org/wiki/Wikipedia/Logo>
and we're still looking for community members to discuss, to help sort
out characters, font styles and representations for the additional
alphabets as well as continue discussing the current glyphs on the
talk page at <http://meta.wikimedia.org/wiki/Talk:Wikipedia/Logo>.
Your input is greatly appreciated!
Cary
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFJGI0DyQg4JSymDYkRArwXAJsEYbANI9RrU4HWR9PXjE7Qz4IqxgCgrapl
7IiOfWJ+7jRPaioYuoBXGHg=
=nwDW
-----END PGP SIGNATURE-----