As some of you might remember, we have been working on author reputation and text trust systems for wikis; some of you may have seen our demo at WikiMania 2007, or the on-line demo http://wiki-trust.cse.ucsc.edu/
Since then, we have been hard at work building a system that can be deployed on any wiki and display the text trust information. And we finally made it!
We are pleased to announce the release of WikiTrust version 2!
With it, you can compute author reputation and text trust of your wikis in real-time, as edits to the wiki are made, and you can display text trust via a new "trust" tab. The tool can be installed as a MediaWiki extension, and is released open-source, under the BSD license; the project page is http://trust.cse.ucsc.edu/WikiTrust
WikiTrust can be deployed on both new and existing wikis. WikiTrust stores author reputation and text trust in additional database tables. If deployed on an existing wiki, WikiTrust first computes the reputation and trust information for the current wiki content, and then processes new edits as they are made. The computation is scalable, parallel, and fault-tolerant, in the sense that WikiTrust adaptively fills in missing trust or reputation information.
On my MacBook, running under Ubuntu in VMware, WikiTrust can analyze some 10-20 wiki revisions per second; so with a little patience, unless your wiki is truly huge, you can just deploy it and wait a bit. Go to http://trust.cse.ucsc.edu/WikiTrust for more information and for the code!
Feedback, comments, etc. are much appreciated!
Luca de Alfaro (with Ian Pye and Bo Adler)
If someone who is pretty new edits an article and the WikiTrust system colors his edits red, and then no one else edits that particular article for several years, will the coloring change according to any trust the contributor gains, or is the trust coloring of the article locked to his trust at that given moment in time?
As I recall, the trust of older contributions is kind of sticky to the trust at that earlier moment?
John
Wikiquality-l mailing list Wikiquality-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikiquality-l
Hi John,
you are correct. If person A makes her first edit to the Wikipedia, the edit is colored red as A has no reputation at the time of the edit. Suppose nobody then edits that page again. Even if A later gains reputation, the edit stays colored red.
In practice, we have not found this to be a problem on the Wikipedia. Pages that are even mildly interesting receive at least some edits every once in a while; there are very few pages that are both interesting and have not received an edit in the past year. We can certainly be suboptimal on "fringe" pages, but it is not a big issue.
On the other hand, we suspect the above can be a problem on other wikis. I am thinking, for instance, of technical wikis, where someone may have documented some piece of code or some protocol a long time ago, or added some technical specifications, and the information is then left unchanged, as it is understood to be correct and not relevant enough to warrant tinkering with the text.
So it would certainly be interesting to develop a sort of "trust correction crawler" that picks pages that have not been touched in a long while and recomputes the trust of their content based on the CURRENT reputation of the authors. I think that not only would it be interesting, it would also not be very hard to do. If any of you feels like adding this, we can help explain how to do it in the code; otherwise, we may add it to the tool ourselves at some point. If you don't mind, John, can I put this on the "todo" list?
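A rough sketch of what such a crawler could look like. Everything here is an illustrative assumption rather than WikiTrust's actual code or schema: the in-memory data layout stands in for the database tables, and the rule that trust only moves up toward the author's current reputation is one conservative design choice among several possible ones.

```python
# Hypothetical sketch of the "trust correction crawler" idea: pick pages
# that have not been edited in a long while and refresh their word trust
# from the authors' CURRENT reputations. Data layout is invented for
# illustration; WikiTrust's real schema differs.
import time

STALE_SECONDS = 365 * 24 * 3600  # pages untouched for a year

def recompute_stale_pages(pages, reputations, now=None):
    """pages: {title: {"last_edit": ts, "words": [(author, trust), ...]}}
    reputations: {author: current reputation, 0..9}
    Returns the titles whose trust was refreshed."""
    now = now if now is not None else time.time()
    refreshed = []
    for title, page in pages.items():
        if now - page["last_edit"] < STALE_SECONDS:
            continue  # recently edited; the normal edit analysis covers it
        new_words = []
        for author, old_trust in page["words"]:
            # Conservative choice: trust never decreases, it is only raised
            # toward the author's current reputation.
            new_words.append((author, max(old_trust, reputations.get(author, 0))))
        page["words"] = new_words
        refreshed.append(title)
    return refreshed
```

A cron job could run something like this against the real tables at off-peak hours, at a low rate.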
Luca
I dropped a note on our signpost and there is some interest in testing this; not much, but some. As we (http://no.wikipedia.org) are a medium-sized wiki, we could experiment with something like this without disturbing too many users.. :-O
I'll drop a note about how this is done right now.
John
Do we have any idea what server load is like? Is this something WMF could potentially deploy at this point?
Mike
From: Luca de Alfaro [mailto:luca@dealfaro.org] Sent: August 23, 2008 10:56 PM To: wikiquality-l@lists.wikimedia.org Subject: [Wikiquality-l] WikiTrust v2 released: reputation and trust for your wiki in real-time!
This is a very good question. In general, we would be delighted to find a small wiki (i.e., smaller than the En / De / Fr ones) that wants to try this out, and we think that this is realistic. We think that it would be essential to experiment first on smaller wikis to gain more experience with load, and how to distribute it on the hardware infrastructure, before jumping to the largest wikis. This is kind of obvious of course. Note that in any case you can throttle down the load of the extension without affecting the main wiki, as it is all implemented in asynchronous fashion (the main wiki never needs to wait for the extension when serving requests).
Let me now give a somewhat detailed answer on the load.
- When there is an edit, an asynchronous job is started to analyze it. The key word is asynchronous: it does not slow down the HTTP edit processing on the main server. Right now, the analysis job runs on the same machine that runs MediaWiki; this can change if desired. The analysis takes a fraction of a second, but requires reading 5-10 revisions from the database, plus some other minor db access as well. So clearly edits become more expensive, but even on the English Wikipedia (5 edits / second at most?) a single CPU would suffice. As long as edits are a small percentage of the reads (which is true of most wikis), we do not think that the analysis of edits is a significant load.
- When you ask to look at trust information, it is very quick: we just read some alternative markup, rather than the standard one.
- For each revision that is analyzed for trust, MediaWiki stores the original, unchanged revision, and we store the same revision, annotated for trust and text origin, in an additional database table. If you want trust information for all revisions, this causes the db size to expand by a factor of roughly 2.5: not an issue for small wikis, but an issue for very large ones. In practice, very few people are interested in the trust information of very old article versions, so we could keep the trust information only for the most recent 50 or so revisions (in fact, there is a better way to determine the threshold, but let's simplify). This would reduce storage, and make it proportional to the number of articles rather than the number of revisions. It would be quite easy for us to add a variable that enables such pruning; let me add it to our todo list.
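The pruning step itself is simple. This sketch operates on an in-memory stand-in for the annotated-revision table; in a real deployment it would be a DELETE query, and none of these names are WikiTrust's actual ones:

```python
# Sketch of the pruning idea: keep trust-annotated markup only for the most
# recent K revisions of each page, so annotated storage grows with the
# number of articles rather than the number of revisions.
KEEP_LAST = 50  # simplified threshold, as discussed in the text

def prune_annotations(annotated):
    """annotated: {page_id: [rev_id, ...], sorted oldest-first}.
    Drops all but the newest KEEP_LAST annotated revisions per page;
    returns how many annotated revisions were dropped."""
    dropped = 0
    for page_id, revs in annotated.items():
        if len(revs) > KEEP_LAST:
            dropped += len(revs) - KEEP_LAST
            annotated[page_id] = revs[-KEEP_LAST:]
    return dropped
```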
- The hard job is when you have a big existing wiki, and you need to analyze all its past, to get the extension up to date. We are trying to decide whether to build special tools that facilitate this "catch-up" starting from xml dumps; if the WMF expressed clear interest in our extension, we would consider it. The current implementation can "catch up with the past" at something like 10 revisions / second; to limit db usage, one can throttle this down, e.g. to 4 revisions / second. This "catch up" analysis can be run in the background, and can use a spare cpu connected to the db, or whatever. I believe the main bottleneck would be the db load, rather than the cpu load (the cpu load can occur on a separate cpu, the db is shared with the wiki servers). But if one accepts that the catch-up takes a bit of time, one can just throttle down the analysis; after all, this needs to be done only once for each wiki.
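The throttling mentioned above can be a simple pacing loop. This is only an illustrative sketch: `analyze_revision` is a stand-in for the real per-revision analysis, and the injectable `sleep`/`clock` parameters are there just to make the pacing logic testable.

```python
# Sketch of throttling the "catch up with the past" analysis to a target
# rate (e.g. 4 revisions / second) so the shared database is not hammered.
import time

def catch_up(revisions, analyze_revision, max_per_second=4.0,
             sleep=time.sleep, clock=time.monotonic):
    """Run analyze_revision over all revisions, pacing the loop so that
    no more than max_per_second revisions complete per second."""
    start = clock()
    for i, rev in enumerate(revisions):
        analyze_revision(rev)
        # Earliest wall-clock time at which revision i+1 may finish:
        not_before = start + (i + 1) / max_per_second
        delay = not_before - clock()
        if delay > 0:
            sleep(delay)
```

Since this only needs to run once per wiki, a low rate and a long wait are an acceptable trade for a quiet database.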
Suggestions and comments are welcome.
Luca
On Tue, Aug 26, 2008 at 5:33 PM, mike.lifeguard mike.lifeguard@gmail.com wrote:
even on the English Wikipedia (5 edits / second at most?) a single CPU would suffice
IIRC, the rate is far, far higher than that. Perhaps ask a sysadmin how many edits/sec (or are db transactions more relevant?) we're looking at currently. 5 edits/s seems low to me. But maybe I have been dazzled by numbers such as ">80000 SQL queries per sec" (from http://dammit.lt/tech/velocity-wikipedia-domas-2008.pdf; probably all WMF, not enwiki) etc O.o
Mike
On Tue, Aug 26, 2008 at 9:42 PM, mike.lifeguard mike.lifeguard@gmail.com wrote:
even on the English Wikipedia (5 edits / second at most?) a single CPU would suffice
But why start so large? Pick a smaller test wiki first like, say, en.wikibooks? We can throw that into the queue of things we want installed down at WB.
--Andrew Whitworth
Yes, I fully agree. We should start on a small project where people are interested. We can consider the bigger wikis later, once we are confident that we like it and that it works the way we want. I was citing enwiki just to discuss potential performance. Ian and I can offer advice etc. to anyone who wants to try this out. Luca
Well, en.labs would be best, as it is a site for testing :D But yes, English Wikibooks is a good candidate once we have explored the issue of long-untouched pages (i.e., the trust should be recalculated, I think)
Mike
Anyhow, it seems there is some interest at no.wikipedia.org, but we do have a traffic load even if it's not one of the main wikis: about 10,000-100,000 hits each hour.
Is it possible to split the tables out to another database? That way it isn't going to impact the overall performance. Also, is it possible to completely split out the preprocessing?
John
Hi John,
1) Splitting the tables to another database: this can be done. But note that when I compute the coloring, I also need access to the standard tables (e.g. to read the revision text to analyze). Do you think this is high priority for you? We can certainly get it done in, I guess, a couple of hours of coding.
2) Splitting the computation altogether to another server. Let me consider what "computation" is. There are three kinds of computation.
a) The computation to "catch up with the past", i.e., color the existing revisions of a wiki. This is by far the heaviest load. This can (and I argue should) be done on a separate machine, provided of course the machine has access to the db(s). I can add some way of throttling down the computation, so that the db access does not become too heavy.
b) The computation to analyze an edit. This is proportional to the number of edits, not hits, so unless you get at least 1 edit / second, you don't need to worry.
c) The computation to serve a trust-colored page. This is a bit difficult to split to another server, as it is done via PHP hooks that are supposed to run on the same machine. The coloring is performed by reading a different version of the page (from a different db table) and doing a bit of HTML processing on it. I don't think the load is much larger than serving a normal page; we need some testing.
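To illustrate why c) is cheap: once the annotated markup carries a trust value per word, serving is little more than wrapping words in shaded spans. This sketch is illustrative only; the class names, the word-list input, and the rendering function are all invented here, not the extension's actual PHP hook code.

```python
# Sketch of the display step: map each word's trust value (0..9) to a CSS
# class, so a stylesheet can shade low-trust text (e.g. in orange).
import html

def color_words(words):
    """words: [(text, trust)] with trust in 0..9.
    Returns HTML with one trust-classed span per word."""
    out = []
    for text, trust in words:
        level = max(0, min(9, int(trust)))  # clamp to the 0..9 scale
        out.append('<span class="trust%d">%s</span>' % (level, html.escape(text)))
    return " ".join(out)
```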
So I propose that I add some way of avoiding the db being accessed too often while "catching up with the past". Let me know if I should separate the wikitrust tables in a different db. Would you think these two measures are enough?
Luca
Splitting out the tables was just a wild guess at how it could be done on another computer.
If you come up with working ideas, it is no problem for me how you do it, as long as it is a viable solution for use on Wikipedia.
Note also that there is _some_ interest in this at no.wp, not a huge _we_want_this_ type of choir! ;)
I think it is one of several interesting tools for checking edits. The most interesting part is perhaps that it can be used to show the general public what's trustworthy and what's not. Perhaps an article can be put in a state (stable versions) where additions that are not within a given trust limit remain visible; handles could then be attached to those specific additions to let users change them easily. Thus we enable the common reader to interact easily with vandalized content, while the correct content is only changeable through ordinary editing, thereby exposing the vandals to an even harsher environment than before.
John
Regarding the issue of long-untouched pages:
1) Are there statistics in MediaWiki that allow a process to check how many times a page has been accessed? I can of course add a new hook and a DB table, but it may not work due to caching, and since this would run for every hit, I don't want to increase load.
2) The best way to add this reconsideration of long-unchanged pages is to implement it as a batch process, i.e., not run from Apache. The job could then be run as a cron job, or as a daemon, or something like that, in the background, at a low enough rate not to cause trouble. Does this approach sound satisfactory to you?
Oh, and what is en.labs?
Luca
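The batch approach in point 2 could be sketched roughly like this. All names here (fetch_stale_pages, recolor_page) and the thresholds are hypothetical illustrations, not WikiTrust's actual API:

```python
# Hypothetical sketch of the batch approach: a cron/daemon job that
# re-examines long-unchanged pages at a bounded rate, outside apache.
import time

STALE_AFTER_DAYS = 30    # assumed threshold for "long-unchanged"
MAX_PAGES_PER_SEC = 5    # throttle so the job never hogs the DB

def run_batch(fetch_stale_pages, recolor_page):
    """Recolor every page untouched for STALE_AFTER_DAYS, throttled."""
    for page_id in fetch_stale_pages(STALE_AFTER_DAYS):
        recolor_page(page_id)
        time.sleep(1.0 / MAX_PAGES_PER_SEC)  # simple rate limit
```

Running such a script from cron (or as a daemon) keeps the recoloring load entirely off the Apache workers, which is the point of doing it as a batch process.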
On Wed, Aug 27, 2008 at 2:42 AM, mike.lifeguard mike.lifeguard@gmail.com wrote:
Well, en.labs would be best, as it is a site for testing :D But yes, English Wikibooks is a good candidate once we have explored the issue of long-untouched pages (ie the trust should be recalculated, I think)
Mike
*From:* wikiquality-l-bounces@lists.wikimedia.org *On Behalf Of* Luca de Alfaro *Sent:* August 26, 2008 10:59 PM *To:* Wikimedia Quality Discussions *Subject:* Re: [Wikiquality-l] WikiTrust v2 released: reputation and trust for your wiki in real-time!
Yes, I fully agree. We should start on a small project where people are interested. We can consider the bigger wikis later, once we are confident that we like it and it works the way we want. I was citing Enwiki just to discuss potential performance. Ian and I can offer advice etc. to anyone who wants to try this out. Luca
On Tue, Aug 26, 2008 at 6:54 PM, Andrew Whitworth wknight8111@gmail.com wrote:
On Tue, Aug 26, 2008 at 9:42 PM, mike.lifeguard
mike.lifeguard@gmail.com wrote:
even on the English Wikipedia (5 edits / second at most?) a single CPU would suffice
But why start so large? Pick a smaller test wiki first like, say, en.wikibooks? We can throw that into the queue of things we want installed down at WB.
--Andrew Whitworth
Wikiquality-l mailing list Wikiquality-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikiquality-l
As to the first question, unfortunately no, only squid hit logs are maintained.
-Aaron Schulz
Accumulate a log from IRC, and write a todo-list regularly; the todo-list will contain both users and articles.
I have done some processing on the squid logs, and it seems there is a time limit within which all pages are visited. I think the todo-list can be compared to the complete list of articles: anything that has not been on the todo-list can be expected to need a rewrite of its trust if any one of its contributors is listed on the todo-list.
John
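If I read the idea right, the comparison could be sketched as set operations. The data shapes here (a dict from page to contributor names) are assumptions for illustration, not the actual log format:

```python
# Sketch of the todo-list comparison described above: pages whose trust
# may need recomputing are those NOT on the todo-list but having at
# least one contributor who IS on it.
def pages_needing_recolor(all_pages, todo_pages, todo_users, contributors):
    """contributors: dict mapping page -> set of user names (assumed)."""
    untouched = set(all_pages) - set(todo_pages)
    return {p for p in untouched
            if contributors.get(p, set()) & set(todo_users)}
```

This keeps the batch pass cheap: only the intersection of "untouched pages" and "pages with recently-active contributors" is recolored, rather than the whole wiki.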
Aaron Schulz wrote:
As to the first question, unfortunately no, only squid hit logs are maintained.
-Aaron Schulz
Having the process run in the background (part of the job queue? Not sure if that's a good idea or not; grab a developer for that, perhaps) seems fine to me. We have pages which sit for a long time without being touched, so having old measures of trust is not a great idea. Perhaps it should be recalculated when the cache is purged (or rather, every time the page is served not-from-cache)? But that might be too much load.
I suppose I should point out I am not a programmer, so technical details are far beyond me for the most part.
Mike
Actually, I would love to see this running on wikibooks! On all of them, or on some? What would you be interested in?
Luca
On Tue, Aug 26, 2008 at 6:54 PM, Andrew Whitworth wknight8111@gmail.com wrote:
On Tue, Aug 26, 2008 at 9:42 PM, mike.lifeguard mike.lifeguard@gmail.com wrote:
even on the English Wikipedia (5 edits / second at most?) a single CPU would suffice
But why start so large? Pick a smaller test wiki first like, say, en.wikibooks? We can throw that into the queue of things we want installed down at WB.
--Andrew Whitworth
On Tue, Aug 26, 2008 at 10:04 PM, Luca de Alfaro luca@dealfaro.org wrote:
Actually, I would love to see this running on wikibooks! On all of them, or on some? What would you be interested in?
I can't speak for all wikibooks language projects, of course. Technically, I can't "speak for" en.wikibooks either. I don't think we should run it on the wikimedia labs website because that doesn't get any real traffic.
simple.wikibooks gets occasional traffic, if we want to start very very small. en.wikibooks is much larger but still not overwhelming. I guess it all depends what projects, if any, the server admins are willing to install it on.
--Andrew Whitworth
Is enwikibooks a single database, or one MySQL database for each book? Do you have data for the number of articles and the number of revisions of enwikibooks?
Our plan was to try our code internally at UCSC (and to the world in read-only) on wikibooks, using a dump, but I see that there is no dump available due to lack of disk space. Does anyone have a dump we could use? It would be interesting for us to set it up with a dump, and try it, before asking the admins.
Lacking that, we are going to experiment with other dumps and wikis...
Who are the admins of enwikibooks? Same as the general admins, i.e., Brion and others?
Luca
On Tue, Aug 26, 2008 at 10:15 PM, Luca de Alfaro luca@dealfaro.org wrote:
Is enwikibooks a single database, or one mysql database for each book? Do you have data for n. of articles and n. of revisions of enwikibooks?
It's one database. The whole setup is almost identical to enwikipedia. I know this because we've been begging for a few changes to this setup for a while.
Who are the admins of enwikibooks? Same as the general admins, i.e., Brion and others?
Yes, the same team of rockstars. Brion knows much more about all of this than I do.
--Andrew Whitworth
Luca de Alfaro wrote:
Is enwikibooks a single database, or one mysql database for each book? Do you have data for n. of articles and n. of revisions of enwikibooks?
Each wiki is one single database. 106,341 total pages (31,167 good). http://en.wikibooks.org/wiki/Special:Statistics
Revisions:

$ 7z e -so enwikibooks-20080712-pages-meta-history.xml.7z | grep -c '<revision>'
1011483
Our plan was to try our code internally at UCSC (and to the world in read-only) on wikibooks, using a dump, but I see that there is no dump available due to lack of disk space. Does anyone have a dump we could use? It would be interesting for us to set it up with a dump, and try it, before asking the admins.
Lacking that, we are going to experiment with other dumps and wikis...
There's a full enwikibooks dump from 20080712 http://download.wikimedia.org/enwikibooks/20080712/
Who are the admins of enwikibooks? Same as the general admins, i.e., Brion and others?
Luca
Don't confuse the wiki's admins (sysops and bureaucrats) with the system admins, who are the same for all WMF wikis (Brion, Domas, Tim Starling...).
Dear Platonides,
many, many thanks for pointing us to the dump! This is extremely welcome. It does not seem so large, so we will start experimenting with it immediately. I will report back once we get it running, also with some stats on the time the analysis took. How did you find it? On download.wikimedia.org, I saw only the interrupted dump in progress; how does one get hold of older dumps?
Yes, I know the distinction between computer-admins and wiki-admins (sysops, bureaucrats, etc). Who are the latter for wikibooks?
Best,
Luca
Andrew and I are both sysops at English Wikibooks, hence our interest. Others are listed at http://en.wikibooks.org/wiki/Special:Listadmins - not sure if any others take an interest in this, or are subscribed to the mailing list.
Mike
Luca de Alfaro wrote:
Dear Platonides,
many many thanks for pointing the dump to us! This is extremely welcome. It does not seem so large, so we will start experimenting with it immediately. I will report back once we get it running, also with some stats on the time the analysis took. How did you find it? From download.wikimedia.org, I saw only the interrupted dump in progress; how does one get hold of older dumps?
Visit the dump page; while is_broken(dump), follow the link at the top, "previous dump from ...".
Or just remove the last URI component, which shows you a folder listing with the archived dumps.
Yes, I know the distinction between computer-admins and wiki-admins (sysops, bureaucrats, etc). Who are the latter for wikibooks?
Just wanted to make it clear. See http://en.wikibooks.org/wiki/Special:ListUsers/sysop
Why would you need page hits? If anything relies on page hits to do a background computation, it should be using the job queue instead. On small systems the job queue runs based on page hits; on bigger ones (e.g. Wikipedia) there are some boxes dedicated to it (maintenance/runJobs.php). Front-end caching would interfere with it, anyway. I think you would need to restrict the expiry time of the cached page to the expected time at which an editor's trust will be different. If the page is edited before then, it is updated (and a new expiry time is set). If the cache expires, the next hit will regenerate it. You can always set an upper bound for the caching. This is the mechanism used for some magic words such as {{currentmonth}}. Note that I'm not talking only about the browser/squid cache but also the parser cache.
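The expiry logic described above might look roughly like this (a sketch only, with assumed inputs; not the actual MediaWiki parser-cache code):

```python
# Sketch: cache a rendered page until the trust coloring would change,
# capped by an upper bound (the mechanism described for {{currentmonth}}).
def cache_expiry(now, next_trust_change, upper_bound):
    """Seconds to cache the page.

    next_trust_change: assumed estimate of when an editor's trust (and
    hence the coloring) would next differ; None if unknown.
    """
    if next_trust_change is None:
        return upper_bound
    return max(0, min(next_trust_change - now, upper_bound))
```

An edit before the expiry simply regenerates the page and sets a fresh expiry, so stale colorings never outlive the bound.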
At no.wp there are now questions about establishing trust for users within small or closed groups where there is little or no knowledge outside the group. Such situations will occur very rarely on English, or one of the other large wikis, yet they can occur very easily in smaller communities. It can also happen on a wiki with a large community if the given area of knowledge is sufficiently small.
An example is something like tetraploid salmon in fish farming (http://en.wikipedia.org/wiki/Fish_farming), where the English community might be able to check if this is meaningful, and perhaps also the Norwegian, Irish and Chilean communities, but what about the Polish community? Will they figure out the relations? Will an editor writing about this gain correct trust? Note that the places mentioned are known for salmon fish farming; someone in the Polish community might know about this because there is fish farming in Poland - just not with salmon.
I believe the system will, on average, produce valuable information, but it will not be very accurate without additional measures.
John
I tend to think that this may not be a huge problem... but "the proof is in the pudding": our server is now up and running and we are experimenting with some wikis, so why don't we try to color a dump of the Norwegian Wikipedia and see how the result looks? I am sure it won't be perfect, and the reasons you cite are the reasons why the tool never actually displays people's reputation values (we don't want visitors to read too much into them). But perhaps the tool will still enable the easy visual detection of recent changes, and unvalidated changes, and perhaps contributors will find this useful.
I downloaded the latest dump I could find (June 2008), and I will color it right away. I am not sure we will install all the extensions required for it to look pretty, and images may or may not work, so this will be a purely experimental setting (all of these things would work if we colored the real Norwegian Wikipedia; it's just that I am not sure I can find the time these days to install all the required extensions). Still, it will give you an idea.
(I haven't yet implemented the code to bring the trust of long-unchanged pages up to date; I will do it.)
Luca
Just drop a note and I'll post a notice on our signpost. I think it is valuable as a clue as to what's going on in an article, but I also believe there will be circumstances where it might give a false impression of trustworthiness. Let's see what people say once they have an example and can see for themselves! :D John
I agree - this is why I think using both an implicit method (trust colouring) *and* an explicit method (flaggedrevs) together will be best for small/slow projects like English Wikibooks etc.
Mike
-----Original Message----- From: wikiquality-l-bounces@lists.wikimedia.org [mailto:wikiquality-l-bounces@lists.wikimedia.org] On Behalf Of John Erling Blad Sent: September 1, 2008 12:43 PM To: Wikimedia Quality Discussions Subject: Re: [Wikiquality-l] WikiTrust v2 released: reputation and trustforyour wiki in real-time!
Jusrt drop a note and I'll post a notice on our signpost. I think it is valuable as a clue as to whats going on in an article, but I also believe it will be circumstances where it might give a false impression of thrustwortyness. Lets see what people say if they have an example and can see for them selves! :D John
Luca de Alfaro skrev:
I tend to think that this may not be a huge problem... but "the proof is in the pudding": our server is now up and running, we are experimenting with some wikis, why don't we try to color a dump of the Norwegian wikipedia and see how the result looks like? I am sure it won't be perfect, and the reasons you cite are the reasons why the tool actually never displays people's reputation value (we don't want visitors to read too much into it). But perhaps the tool will still enable the easy visual detection of recent changes, and unvalidated changes, and perhaps contributors will find this useful.
I downloaded the latest dump I could find (June 2008), and I will color it right away. I am not sure we will install all the extensions required for it to look pretty, and images may or may not work, so this will be a purely experimental setting (all of these things would work if we colored the real Norwegian Wikipedia, it's just that I am not sure I can find the time in these days to install all required extensions). Still, it will give you an idea.
(I haven't yet implemented the code to bring the trust of long-unchanged pages up to date; I will do it.)
Luca
2008/8/31 John Erling Blad <john.erling.blad@jeb.no>
At no.wp there are now questions about establishing trust for users within small or closed groups, where there is little or no knowledge outside the group. Such situations will occur very rarely on English, or one of the other large wikis, yet they can occur very easily in smaller communities. It can also happen on a wiki with a large community if the given area of knowledge is sufficiently small. An example is something like tetraploid salmon in fish farming (http://en.wikipedia.org/wiki/Fish_farming), where the English community might be able to check whether this is meaningful, and perhaps also the Norwegian, Irish, and Chilean communities, but what about the Polish community? Will they figure out the relations? Will an editor writing about this gain correct trust? Note that the places mentioned are known for salmon farming; someone in the Polish community might know about this because there is fish farming in Poland, just not with salmon. I believe the system will on average produce valuable information, but it will not be very accurate without additional measures. John
By the way:
It is easy for us to add a mechanism so that when someone flags a revision in FlaggedRevs, the trust of the revision increases, similarly (perhaps a bit less? we can discuss) to when there is an edit. Originally the difficulty in doing this was that someone could click on "sighted" many times, thus increasing the trust of a revision too much; additional clicks by the same author on the same revision should not cause additional trust. In the latest version of the code, this has been solved: we keep track of who has caused the trust of the text to rise, so we can discard duplicate clicks.
So if there is concrete interest from a wiki that has FlaggedRevs running and wants to deploy WikiTrust, we can make sure the two extensions work well together. The only reason we have not implemented this yet is that I wanted to know more about the concrete interest. Let me know!
Luca
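The duplicate-click rule Luca describes can be sketched in a few lines. This is purely an illustration, not WikiTrust's actual code: the class name, the `boost` parameter, and the trust-update formula (moving a fraction of the way toward a maximum) are all assumptions.

```python
# Illustrative sketch (not WikiTrust's implementation): a user's
# "sighted" click may raise a revision's trust only once; repeated
# clicks by the same user on the same revision are discarded.
class TrustVotes:
    def __init__(self):
        self.seen = set()  # (user_id, revision_id) pairs already counted

    def vote(self, user_id, revision_id, trust, boost=0.5, max_trust=10.0):
        """Return the revision's new trust; duplicate votes have no effect."""
        key = (user_id, revision_id)
        if key in self.seen:
            return trust  # duplicate click: discarded
        self.seen.add(key)
        # move trust a fraction of the way toward the maximum
        return trust + boost * (max_trust - trust)
```

With these made-up parameters, a first vote on a revision at trust 4.0 raises it to 7.0; a second click by the same user leaves it at 7.0, while a different user raises it further.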
At no.wp we are running a system for patrolling, whereby a user marks a revision as patrolled. This uses the old patrolling solution plus some added roles. It works pretty well as a simplified FlaggedRevs. John
Suppose I make an executable that takes as parameters the revision id being flagged and the user id doing the flagging; something like:
vote_trusted <revision_id> <user_id>
You could call this e.g. from the PHP code which implements the flagging. Would this work, i.e., would that PHP code be able to call it?
If it does, then I can think about how to implement it.
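The PHP hook would essentially shell out to the binary with two arguments. As a language-neutral sketch (the binary path and the wrapper function are assumptions; only the `vote_trusted <revision_id> <user_id>` interface comes from Luca's proposal), the call could be assembled like this:

```python
# Hypothetical glue code: build the argv list a flagging hook would
# run when a revision is sighted. Only the command's shape
# (vote_trusted <revision_id> <user_id>) is from the proposal above.
def vote_trusted_command(revision_id, user_id, binary="./vote_trusted"):
    """Return the argv list for the vote_trusted call."""
    return [binary, str(revision_id), str(user_id)]

# A deployment would then execute it, e.g. in Python:
#   subprocess.run(vote_trusted_command(rev_id, uid), check=True)
# or in PHP via an equivalent of exec()/proc_open().
```

Passing the ids as separate argv entries (rather than interpolating them into a shell string) also avoids shell-injection issues if the hook ever receives untrusted input.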
One important question: do people flag only the most recent revision of an article, or do they usually flag older revisions? I would imagine they flag only the most recent: if they prefer an older one, they could first revert to it, or fix whatever they don't like in the current revision, and then flag it. Is this correct?
Luca
People do not do this consistently. Usually they flag revisions as patrolled if they don't find anything wrong. If they can't establish correctness, they usually leave the revision alone. If they fix a problem, it might very well be vandalism or it might be a spelling error; they may well mark the revision as patrolled after they have fixed the problem.
Basically, if they mark it as patrolled and do nothing else, it is safe to increase the trust. If they do some follow-up actions, then probably nothing can be said about the trust. If the revision is left without any patrolling, then nothing can be said about the trust either, though it is probably fair to assume the edit is difficult to patrol for some reason, whatever that may be.
John
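John's three cases amount to a small decision rule. A sketch (the outcome labels are mine, not FlaggedRevs or WikiTrust terminology):

```python
def trust_signal(patrolled: bool, followed_up: bool) -> str:
    """Map a patrol outcome to what it tells us about trust.

    patrolled:   the reviewer marked the revision as patrolled
    followed_up: the reviewer also edited (fixed vandalism, typos, ...)
    """
    if patrolled and not followed_up:
        return "increase trust"  # reviewer found nothing wrong
    if patrolled and followed_up:
        return "no conclusion"   # the follow-up edit changed the text anyway
    return "no conclusion (hard to patrol?)"  # left unpatrolled
```

Only the first case, patrolled with no follow-up edit, would feed a trust increase into the system; the other two are treated as uninformative.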
Yes, I think setting the trust uniformly at a certain (high) level once a page has been flagged makes sense. I suppose that should be configurable - some wikis have a very lightweight process (dewiki, IIRC) and others will have a more robust method (enwiki, if it ever decides to get FlaggedRevs). And some, like English Wikibooks, will be somewhere in between (if the sysadmins ever enable it for us). So dewiki should perhaps not have a huge change in trust when a page is sighted. But on enwiki, they will probably want a larger effect on trust, since they will probably have very-highly-trusted users making such judgments. I would certainly like to revisit this issue after English Wikibooks gets some experience with FlaggedRevs. We are currently waiting for it to be enabled. After a month or two of using it we may have a better idea of what would work best here.
Mike
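The per-wiki tuning Mike describes could be expressed as a simple configuration table. The wiki names echo his examples, but the weights and the function are invented for illustration; nothing here is an actual WikiTrust or FlaggedRevs setting.

```python
# Hypothetical per-wiki weights for how strongly a "sighted" flag
# raises trust, relative to a normal edit (1.0). Values are made up.
SIGHT_TRUST_WEIGHT = {
    "dewiki": 0.3,       # lightweight sighting process: small boost
    "enwikibooks": 0.6,  # in-between review process
    "enwiki": 0.9,       # robust review by highly trusted users
}

def flag_boost(wiki: str, edit_boost: float, default: float = 0.5) -> float:
    """Scale the normal edit-driven trust boost by the wiki's weight."""
    return edit_boost * SIGHT_TRUST_WEIGHT.get(wiki, default)
```

A wiki with a lightweight process would then see only a small trust change per sighting, while one with a robust process would see nearly the full edit-sized boost.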
I may be wrong... but I simply took the number of the most recent revision at 20-minute intervals and computed the ratio. Also note that 5 edits/sec is over 400,000 new revisions per day; I am not sure, but the English Wikipedia was at about 50M revisions last time I checked, so that would be roughly 120 days of such growth... Does anyone know the numbers?
Luca
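Luca's back-of-the-envelope figures check out. Both inputs (5 edits/sec, ~50M revisions) are his rough estimates, not measured values:

```python
# Sanity-check the estimate: 5 edits/sec sustained against ~50M
# existing revisions on the English Wikipedia.
edits_per_sec = 5
revisions_per_day = edits_per_sec * 86_400   # seconds in a day
days_to_regrow = 50_000_000 / revisions_per_day
print(revisions_per_day)      # 432000, i.e. "over 400,000" per day
print(round(days_to_regrow))  # 116, i.e. roughly 120 days
```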
On Tue, Aug 26, 2008 at 6:42 PM, mike.lifeguard <mike.lifeguard@gmail.com> wrote:
> even on the English Wikipedia (5 edits / second at most?) a single CPU would suffice
IIRC, the rate is far, far higher than that. Perhaps ask a sysadmin how many edits/sec (or are db transactions more relevant?) we're looking at currently. 5 edits/s seems low to me. But maybe I have been dazzled by numbers such as ">80000 SQL queries per sec" (from http://dammit.lt/tech/velocity-wikipedia-domas-2008.pdf; probably all WMF, not enwiki) etc. O.o
Mike