Hey,
I've put together an extension for rating articles if anyone is interested. It's just a first version and hasn't been tested much, but the details can be found here:
http://www.wikihow.com/WikiHow:RateArticle-Extension
You can see an example here on our development server:
http://wiki16.wikidiy.com/Get-a-Better-Deal-on-a-Home-Loan
(username / password: wikihow / wikihow2006) - scroll down to the bottom of the page for the checkmarks.
I'd appreciate feedback if anyone has any. If someone wants to add this to extensions in svn, that'd be great.
Thanks, Travis
I've put together an extension for rating articles if anyone is interested. It's just a first version and hasn't been tested much, but the details can be found here:
[..snip..]
I'd appreciate feedback if anyone has any.
Some technical feedback:
1) I think you may need to max the rating out at 5.
E.g. I entered this malicious input to try and ballot-stuff the ratings:
http://wiki16.wikidiy.com/Special:RateArticle?page_id=10743&rating=99999
And then the rating of the loan article comes out as 5 stars. I would expect it to come out at 4.1 (or something like that), based on my prior ratings, if it was being capped at 5.
2) You might not want to show the "rate this article" bit when there is no article (e.g. the non-existent page http://wiki16.wikidiy.com/FAKE-FAKE-FAKE shows the ratings). Maybe something along these lines in the wfRateArticleForm() function in RateArticle.php:
      $page_id = $wgArticle->getID();
    + if (!$page_id) return;
3) Trivial stuff in wfSpecialRateArticle() function :
    $rat_page = $wgRequest->getVal("page_id");
    $rat_user = $wgUser->getID();
    $rat_user_text = $wgUser->getName();
    $rat_rating = $wgRequest->getVal('rating');

    $dbw =& wfGetDB( DB_MASTER );
    $dbw->insert( 'rating',
        array( 'rat_page' => $rat_page,
               'rat_user' => $rat_user,
               'rat_user_text' => $rat_user_text,
               'rat_rating' => $rat_rating ),
        $fname );
Maybe getInt() instead of getVal() for the "page_id" and 'rating' fields? Maybe verify that the page_id corresponds to a valid existing page?
4) Maybe a max of one rating per user per page? (Perhaps later votes overwrite/overrule earlier votes? Or just discard later votes?)
5) Fatal error on http://wiki16.wikidiy.com/Special:ListRatings (probably due to some of the bad input I introduced in doing the above testing - Sorry). The special page should ideally be robust against bad input though, e.g.:
      $t = Title::newFromID($row->rat_page);
    + if (is_null($t)) continue;
      $wgOut->addHTML("<li>" . $sk->makeLinkObj($t, $t->getFullText() ) . " ({$row->C}, {$row->R})</li>");
(or something along those lines).
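Points 1) and 3) above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the extension's PHP; the function name, the 1-5 scale bounds and the `existing_page_ids` set are assumptions made up for the example.

```python
from typing import Optional, Set, Tuple

MIN_RATING = 1  # assumed bottom of the scale
MAX_RATING = 5  # assumed top of the scale ("max the rating out at 5")


def parse_rating_request(page_id_raw: str, rating_raw: str,
                         existing_page_ids: Set[int]) -> Optional[Tuple[int, int]]:
    """Return (page_id, rating) for a usable request, or None to reject it."""
    try:
        # Integer parsing, in the spirit of getInt() instead of getVal().
        page_id = int(page_id_raw)
        rating = int(rating_raw)
    except (TypeError, ValueError):
        return None
    # Only accept ratings for pages that actually exist.
    if page_id not in existing_page_ids:
        return None
    # Cap out-of-range values so rating=99999 counts as 5, not as 99999.
    rating = max(MIN_RATING, min(MAX_RATING, rating))
    return page_id, rating
```

Whether to cap an out-of-range rating or discard it entirely is a policy choice; the sketch caps it, which matches the "I would expect it to come out at 4.1" expectation above.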
All the best, Nick.
Awesome, thanks.
On 8/30/06, Nick Jenkins nickpj@gmail.com wrote:
I've put together an extension for rating articles if anyone is interested. It's just a first version and hasn't been tested much, but the details can be found here:
- Maybe a max of one rating per user per page? (Perhaps later votes overwrite/overrule earlier votes? Or just discard later votes?)
This point is interesting and it's something I've been considering. Right now I think the best route is to capture the vote and ignore it later if required. I think an appropriate rating system will be weighted (with respect to time) and take into account that wiki pages change frequently. The purpose of gathering ratings for us is to draw attention to pages that need improvement and to appropriately sort similar articles in search results based on quality, since our wiki can have multiple articles on similar topics.
It might also be interesting to have a weighted system to weight votes based on how much a page has changed over time. So, when a vote of 3 is submitted and the page changes by so many characters, or so many times, and a vote of 5 comes along, the resulting rating is higher than (3+5) /2 = 4.
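As a rough sketch of the idea in the paragraph above, each vote can be discounted by how many edits the page has received since it was cast. The exponential `decay` factor and the function name are assumptions for illustration; nothing in the thread fixes a particular weighting.

```python
def change_weighted_rating(votes, decay=0.5):
    """Average ratings, discounting votes cast before later edits.

    votes: iterable of (rating, edits_since_vote) pairs.
    decay: assumed per-edit discount; 1.0 reduces to a plain average.
    """
    total = 0.0
    weight_sum = 0.0
    for rating, edits_since_vote in votes:
        weight = decay ** edits_since_vote
        total += weight * rating
        weight_sum += weight
    return total / weight_sum if weight_sum else None
```

With a vote of 3 cast two edits ago and a fresh vote of 5, the older vote gets weight 0.25, so the result lands above the plain average of 4.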
Travis
- Maybe a max of one rating per user per page? (Perhaps later votes overwrite/overrule earlier votes? Or just discard
later votes?)
This point is interesting and it's something I've been considering. Right now I think the best route is to capture the vote and ignore it later if required. I think an appropriate rating system will be weighted (with respect to time) and take into account that wiki pages change frequently. The purpose of gathering ratings for us is to draw attention to pages that need improvement and to appropriately sort similar articles in search results based on quality, since our wiki can have multiple articles on similar topics.
Capturing everything probably doesn't hurt (e.g. it allows you to say "which articles have improved the most with time, and which have gotten worse?")
But if I give something a rating of 3, and then I come back a few months later and say "this page has improved a lot, I think it's a 5 now", and submit a rating of 5, should my original vote of 3 still count?
It might also be interesting to have a weighted system to weight votes based on how much a page has changed over time.
That could work.
An alternative could be to have a sliding window, whereby only votes in the last month or so count - that way votes based on old revisions age out of the system. Trouble is, if a page is infrequently edited then votes should last longer (since what was voted upon a while ago will probably still be similar to the current revision), and if a page is infrequently voted upon then votes should last longer too (because otherwise your sample size will be so small that just the last few votes will be able to totally skew the ratings).
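A minimal sketch of such a window, with both caveats handled in the simplest way I can think of: the window stretches back to the last edit if that is older than the cutoff, and it falls back to the newest few votes when the sample would otherwise be too small. The 30-day default, `min_votes` and the function name are assumptions.

```python
from datetime import datetime, timedelta


def votes_in_window(votes, now, last_edit,
                    window=timedelta(days=30), min_votes=5):
    """Select the votes that should still count.

    votes: list of (timestamp, rating) pairs.
    last_edit: time of the page's most recent edit.
    """
    ordered = sorted(votes, key=lambda v: v[0], reverse=True)  # newest first
    # Rarely edited page: keep everything cast since the last edit.
    cutoff = min(now - window, last_edit)
    recent = [v for v in ordered if v[0] >= cutoff]
    # Rarely rated page: never drop below the newest min_votes votes.
    if len(recent) < min_votes:
        recent = ordered[:min_votes]
    return recent
```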
Awesome, thanks.
By the way, I might have just missed it, but if I haven't you might also want to include something somewhere about which license or licenses you're releasing your extension under ;-)
All the best, Nick.
An alternative could be to have a sliding window, whereby only votes in the last month or so count - that way votes based on old revisions age out of the system. Trouble is, if a page is infrequently edited then votes should last longer (since what was voted upon a while ago will probably still be similar to the current revision), and if a page is infrequently voted upon then votes should last longer too (because otherwise your sample size will be so small that just the last few votes will be able to totally skew the ratings).
Those are good points. It would be neat to use diffs to calculate how much an article has changed between edits, so if you voted 3 on a stub that was entirely rewritten (nothing in the original version was contained in the current version) your vote would have a weight of 0%, and if I voted 4 on a version that was only 50% different than the current version, my vote would have a weight of 50%. This would be hard to do on a per request basis, but would be easy to do on a daily basis and store the results in an intermediate table.
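The diff-based weighting could look something like the sketch below. It uses Python's standard `difflib.SequenceMatcher` on whitespace-separated words as a stand-in for a real revision diff; the function names and the word-level comparison are assumptions, and in practice the weights would be computed in the daily batch job rather than per request.

```python
import difflib


def revision_weight(rated_text, current_text):
    """Fraction of similarity between the rated and current revisions (0.0 to 1.0)."""
    matcher = difflib.SequenceMatcher(None, rated_text.split(), current_text.split())
    return matcher.ratio()


def diff_weighted_rating(votes, current_text):
    """votes: list of (rating, text_of_revision_that_was_rated) pairs."""
    weighted = [(revision_weight(text, current_text), rating)
                for rating, text in votes]
    weight_sum = sum(weight for weight, _ in weighted)
    if weight_sum == 0:
        return None  # every rated revision has been completely rewritten
    return sum(weight * rating for weight, rating in weighted) / weight_sum
```

A vote on a revision sharing no words with the current text gets weight 0, and a vote on a revision that is half the same gets weight 0.5, as in the example above.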
Awesome, thanks.
By the way, I might have just missed it, but if I haven't you might also want to include something somewhere about which license or licenses you're releasing your extension under ;-)
I hadn't thought that far ahead! What license are Wikipedia extensions usually released with? GPL?
On 9/5/06, Travis Derouin travis@wikihow.com wrote:
I hadn't thought that far ahead! What license are Wikipedia extensions usually released with? GPL?
All MediaWiki extensions have to be GPLd. Possibly altogether, but definitely if you want them on SVN/Wikimedia projects. AFAIK.
"Simetrical" Simetrical+wikitech@gmail.com wrote in message news:7c2a12e20609051655s35c52d71sac58a05bcd1003e7@mail.gmail.com...
On 9/5/06, Travis Derouin travis@wikihow.com wrote:
I hadn't thought that far ahead! What license are Wikipedia extensions usually released with? GPL?
All MediaWiki extensions have to be GPLd. Possibly altogether, but definitely if you want them on SVN/Wikimedia projects. AFAIK.
Actually, I'm not sure that's true. So long as your extension doesn't include any MW (or other GPL) code then it can be distributed under whatever license you like. If you distribute the extension _with_ MediaWiki, then I imagine that changes things and GPL must be used, but an independent distribution that only consists of your own code can surely use any license, even though you also need to install MediaWiki for it to be of any use.
Of course, if you want the code to be available in MediaWiki SVN then it will need to be GPL, although I think this is more a matter of policy than a legal requirement (so long as the above conditions remain true).
But then, ianal...
- Mark Clements (HappyDog)
On 9/5/06, Mark Clements gmane@kennel17.co.uk wrote:
Actually, I'm not sure that's true. So long as your extension doesn't include any MW (or other GPL) code then it can be distributed under whatever license you like.
Probably. I'm a bit hazy on what exactly constitutes a "derivative work".
GPL is fine for me, I've updated the code with the GPL license and made some changes that were suggested.
http://www.wikihow.com/WikiHow:RateArticle-Extension
How do extensions generally deal with images and stylesheets that need to be available outside the extensions directory, such as in the skins directory, when users are accessing them through SVN?
Travis
On 9/5/06, Simetrical Simetrical+wikitech@gmail.com wrote:
On 9/5/06, Travis Derouin travis@wikihow.com wrote:
I hadn't thought that far ahead! What license are Wikipedia extensions usually released with? GPL?
All MediaWiki extensions have to be GPLd. Possibly altogether, but definitely if you want them on SVN/Wikimedia projects. AFAIK.
Hmm . . . have you considered digging up some voting theorists and e-mailing them for advice on how to set this up? I suspect that any system that hasn't had a great deal of systematic thought put into it will inevitably be deeply flawed on various levels. If you could get a professor interested in having e-mail chats who could hook you up with some equations that get at what we want, I think you'd probably avoid all the issues that people brought up above, plus most of the issues they didn't think of.
On 9/6/06, Travis Derouin travis@wikihow.com wrote:
How do extensions generally deal with images and stylesheets that need to be available outside the extensions directory, such as in the skins directory, when users are accessing them through SVN?
Hooks, surely? The skins must have hooks in lots of convenient places. What do you mean, available outside the extensions directory? You're only running code inside the extensions directory.
On 9/6/06, Simetrical Simetrical+wikitech@gmail.com wrote:
Hmm . . . have you considered digging up some voting theorists and e-mailing them for advice on how to set this up? I suspect that any system that hasn't had a great deal of systematic thought put into it will inevitably be deeply flawed on various levels. If you could get a professor interested in having e-mail chats who could hook you up with some equations that get at what we want, I think you'd probably avoid all the issues that people brought up above, plus most of the issues they didn't think of.
I hadn't considered that. :) Thanks for the offer. I agree, any simple scheme definitely won't be perfect, but it might be suitable for finding which pages are good, and which pages need to be improved. Another option is to use a formula similar to IMDB's, which I assume has been arrived at due to some amount of research.
On 9/6/06, Travis Derouin travis@wikihow.com wrote:
How do extensions generally deal with images and stylesheets that need to be available outside the extensions directory, such as in the skins directory, when users are accessing them through SVN?
Hooks, surely? The skins must have hooks in lots of convenient places. What do you mean, available outside the extensions directory? You're only running code inside the extensions directory.
The extension uses images for buttons for rating the pages, and these buttons have to reside in a directory that is accessible through the browser, /extensions is not. So, in this case, do extensions generally just provide a tarball and have the user untar the images into the skins directory?
Thanks, Travis
Another option is to use a formula similar to IMDB's, which I assume has been arrived at due to some amount of research.
IMDB only count registered active users towards their "Top 250" search (which would be the equivalent of ignoring votes from anons). Maybe a good idea for things that you think people are likely to try and cheat on.
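For reference, the weighted-rating formula usually quoted for IMDB's Top 250 is WR = (v / (v + m)) * R + (m / (v + m)) * C, where R is the item's mean rating, v its vote count, m a minimum-votes threshold and C the mean rating across all items. The thread does not confirm this is the formula meant, so treat the sketch below as an assumption; the function and parameter names are made up for the example.

```python
def bayesian_weighted_rating(mean_rating, vote_count, min_votes, global_mean):
    """Pull an article's mean rating toward the site-wide mean.

    Articles with few votes stay close to global_mean; articles with many
    votes converge on their own mean_rating.
    """
    v, m = vote_count, min_votes
    if v + m == 0:
        return global_mean
    return (v / (v + m)) * mean_rating + (m / (v + m)) * global_mean
```

The effect is that a single 5-star vote on a new article barely moves it, which also blunts casual ballot-stuffing.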
How do extensions generally deal with images and stylesheets that need to be available outside the extensions directory, such as in the skins directory, when users are accessing them through SVN?
Hooks, surely? The skins must have hooks in lots of convenient places. What do you mean, available outside the extensions directory? You're only running code inside the extensions directory.
The extension uses images for buttons for rating the pages, and these buttons have to reside in a directory that is accessible through the browser, /extensions is not. So, in this case, do extensions generally just provide a tarball and have the user untar the images into the skins directory?
Maybe have a look at the wikihiero extension ( http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/wikihiero/ )
For example if I type "<hiero>Y5</hiero>" into a Wikipedia sandbox, then the resulting image path is: http://en.wikipedia.org/w/extensions/wikihiero/img/hiero_Y5.png (i.e. proof-by-existence that it is possible to store the extension's images in the extensions directory).
All the best, Nick.
On Mon, Sep 11, 2006 at 06:44:14PM +1000, Nick Jenkins wrote:
Another option is to use a formula similar to IMDB's, which I assume has been arrived at due to some amount of research.
IMDB only count registered active users towards their "Top 250" search (which would be the equivalent of ignoring votes from anons). Maybe a good idea for things that you think people are likely to try and cheat on.
And this seems like a good time for me to make a point near and dear to my heart:
Those who assert that the design of a system like this needn't be secure, "since nothing is actually based on it [now]" need to read RISKS a little more regularly.
If you *create* a system like this, people will eventually base things on it -- that's sort of what it's *for* -- and therefore you need to 1) choose the safest set of assumptions for that sort of environment, and 2) document exactly what those assumptions are (by preference, in the code that makes them) so that later users can make an informed decision about exactly whether the code is reasonable for their use.
Cheers, -- jra
"Travis Derouin" wrote:
- Maybe a max of one rating per user per page? (Perhaps later votes
overwrite/overrule earlier votes? Or just discard later votes?)
This point is interesting and it's something I've been considering. Right now I think the best route is to capture the vote and ignore it later if required. I think an appropriate rating system will be weighted (with respect to time) and take into account that wiki pages change frequently. The purpose of gathering ratings for us is to draw attention to pages that need improvement and to appropriately sort similar articles in search results based on quality, since our wiki can have multiple articles on similar topics.
It might also be interesting to have a weighted system to weight votes based on how much a page has changed over time. So, when a vote of 3 is submitted and the page changes by so many characters, or so many times, and a vote of 5 comes along, the resulting rating is higher than (3+5) /2 = 4.
Travis
A vote system must be able to deal with all kinds of tricks: people overvoting, voting each day, trying to improve their 'wikipedia pagerank', or shut down others'... Maybe you could store an IP hash so that a vote only counts if its hash differs from that of the last vote (or the last 5 votes...). When the hashes match, the stored vote could be replaced by the average of the two votes. Maybe even with special sensitivity to repeated requests from the same ISP (dynamic IPs connecting, voting, disconnecting).
About forgiving old votes when the page changes: take into account that the biggest changes by size are blankings and reverts, so a blank-revert cycle could reset the counter although the content hasn't changed.
P.S. http://wiki16.wikidiy.com/Special:ListRatings -> Fatal error: Call to a member function on a non-object in /var/www/html/wiki16/extensions/RateArticle.php on line 187
On 01/09/06, Platonides Platonides@gmail.com wrote:
Maybe you could store an IP hash so that a vote only counts if its hash differs from that of the last vote (or the last 5 votes...). When the hashes match, the stored vote could be replaced by the average of the two votes. Maybe even with special sensitivity to repeated requests from the same ISP (dynamic IPs connecting, voting, disconnecting).
Far, far simpler than that:
* Don't allow unregistered users to rate pages
* When a user rates a page, their new rating replaces existing ratings for that page from that user, allowing people's opinions to change over time
No sense in fiddling about with all this IP address hashing bollocks and super-secret hashing when there's no need. I think the idea of personal information and IP addresses and storage and so forth is going to people's heads.
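The two rules above amount to keying ratings on (page, user) and refusing anonymous users. A minimal in-memory sketch follows; the class name is hypothetical, and treating user ID 0 as "not logged in" is borrowed from how MediaWiki's `$wgUser->getID()` behaves.

```python
class RatingStore:
    """One rating per registered user per page; a new rating overwrites the old."""

    def __init__(self):
        self._ratings = {}  # (page_id, user_id) -> rating

    def rate(self, page_id, user_id, rating):
        if user_id == 0:  # assumption: 0 means an unregistered user
            return False
        self._ratings[(page_id, user_id)] = rating  # replaces any earlier rating
        return True

    def average(self, page_id):
        values = [r for (p, _), r in self._ratings.items() if p == page_id]
        return sum(values) / len(values) if values else None
```

In a database the same effect would come from a unique key on the page and user columns plus a replace-style write.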
Rob Church
"Rob Church" wrote:
Far, far simpler than that:
- Don't allow unregistered users to rate pages
- When a user rates a page, their new rating replaces existing ratings
for that page from that user, allowing people's opinions to change over time
No sense in fiddling about with all this IP address hashing bollocks and super-secret hashing when there's no need. I think the idea of personal information and IP addresses and storage and so forth is going to people's heads.
Rob Church
Much easier. But you'll probably get the most feedback from anons, at least as I understood it. The registered-only approach may be used if you have a number of Wikipedians 'reviewing articles'.
Such hashing is a weak system for preventing multiple votes. Note that no IP addresses are stored, only e.g. your IP modulo 12345. So it's highly improbable that two different anons with colliding IPs will vote on the same article within a short time, and even less probable if you only compare against the last voter. But then someone could stuff votes by alternating between two IPs, which is why I mentioned checking the last X voters.
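Roughly, the scheme described above could be sketched like this. The modulus 12345 is the example value from the message; the class name, the `remember` parameter and the choice to simply reject a repeat vote (rather than average it in) are assumptions.

```python
import ipaddress
from collections import deque

BUCKETS = 12345  # example modulus from the message above


def ip_bucket(ip):
    """Reduce an IP address to a small bucket number; the address itself is not kept."""
    return int(ipaddress.ip_address(ip)) % BUCKETS


class RecentVoterFilter:
    """Reject a vote if its IP bucket matches one of the last few voters on the page."""

    def __init__(self, remember=5):
        self._remember = remember
        self._recent = {}  # page_id -> deque of recent buckets

    def accept(self, page_id, ip):
        bucket = ip_bucket(ip)
        recent = self._recent.setdefault(page_id, deque(maxlen=self._remember))
        if bucket in recent:
            return False
        recent.append(bucket)
        return True
```

Remembering more than one bucket per page is what defeats the "alternate between two IPs" trick, at the cost of more false rejections from colliding addresses.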
On 01/09/06, Platonides Platonides@gmail.com wrote:
Much easier. But you'll probably get the most feedback from anons, at least as I understood it. The registered-only approach may be used if you have a number of Wikipedians 'reviewing articles'.
[snipped stuff I don't necessarily disagree with]
I think the point is that we're not talking about votes to elect a new Prime Minister or President, and we're not talking about something that's oh-so-vital that a few dud votes will kill us off.
It's a casual rating system to give people a bit of an idea as to how good various people think a page is. Supplemental to stable versioning, perhaps, but it still doesn't replace people checking their own sources, etc.
Your point about feedback coming from anons is a good one, and quite salient, since it's quite correct that most (and the best, from the point of view of what our "audience" thinks) feedback will come from viewers who might not have accounts.
Whatever is chosen in the implementation is going to be a toss-up, because life is not perfect, people are not perfect, and shared IP addressing is a sordid reality we have to put up with. :)
Rob Church
On 9/1/06, Rob Church robchur@gmail.com wrote: [snip]
It's a casual rating system to give people a bit of an idea as to how good various people think a page is. Supplemental to stable versioning, perhaps, but it still doesn't replace people checking their own sources, etc.
Your point about feedback coming from anons is a good one, and quite salient, since it's quite correct that most (and the best, from the point of view of what our "audience" thinks) feedback will come from viewers who might not have accounts.
Whatever is chosen in the implementation is going to be a toss-up, because life is not perfect, people are not perfect, and shared IP addressing is a sordid reality we have to put up with. :)
If it's just a casual rating system: don't make it a vote. Allow everyone to provide input (believe it or not, we do get useful complaints to OTRS from non-editors)... Collect everything, expose what we've collected.... people can then write their own tools to do things with the data.
The software should implement mechanism but not policy wherever possible.
Gregory Maxwell wrote:
On 9/1/06, Rob Church robchur@gmail.com wrote: [snip]
It's a casual rating system to give people a bit of an idea as to how good various people think a page is. Supplemental to stable versioning, perhaps, but it still doesn't replace people checking their own sources, etc.
Your point about feedback coming from anons is a good one, and quite salient, since it's quite correct that most (and the best, from the point of view of what our "audience" thinks) feedback will come from viewers who might not have accounts.
Whatever is chosen in the implementation is going to be a toss-up, because life is not perfect, people are not perfect, and shared IP addressing is a sordid reality we have to put up with. :)
If it's just a casual rating system: don't make it a vote. Allow everyone to provide input (believe it or not, we do get useful complaints to OTRS from non-editors)... Collect everything, expose what we've collected.... people can then write their own tools to do things with the data.
The software should implement mechanism but not policy wherever possible.
Absolutely. There are all sorts of ways to address sockpuppetry, bogus reviews, and ballot-stuffing.
For example, a Bayesian "rating the raters" approach might be useful, given a small set of "ground truth" ratings. If this is designed right, it should even be able to extract useful information from raters who are actively trying to game the system, for example by performing honest ratings of numerous other articles as well as plugging their favourites or down-rating articles they disapprove of.
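As a much simpler stand-in for a real Bayesian model, the "rating the raters" idea can be illustrated by weighting each rater according to how closely they agree with the ground-truth ratings. Everything here (function names, the 1 / (1 + mean error) weight, the neutral 0.5 for unknown raters) is an assumption for illustration only.

```python
def rater_reliability(rater_votes, ground_truth):
    """rater_votes, ground_truth: dicts mapping page -> rating."""
    errors = [abs(rating - ground_truth[page])
              for page, rating in rater_votes.items() if page in ground_truth]
    if not errors:
        return 0.5  # assumed neutral weight for raters we know nothing about
    return 1.0 / (1.0 + sum(errors) / len(errors))


def reliability_weighted_rating(page, all_votes, ground_truth):
    """all_votes: dict mapping rater -> {page: rating}."""
    total = 0.0
    weight_sum = 0.0
    for votes in all_votes.values():
        if page in votes:
            weight = rater_reliability(votes, ground_truth)
            total += weight * votes[page]
            weight_sum += weight
    return total / weight_sum if weight_sum else None
```

A rater who is far off on the ground-truth articles has less pull on every other article they rate, including the favourites they try to plug.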
Yet another approach might be to offer a "rate a random article" feature. This will both guarantee breadth of coverage from such ratings, and also make ratings of random articles more difficult to ballot-stuff by several orders of magnitude. (Note that calling up an article this way, and then _not_ rating it, would also convey useful information).
-- Neil
A vote system must be able to deal with all kind of trickies. People overvoting, voting ech day, trying to improve their 'wikipedia pagerank', or shut down others'... Maybe you could set a ip-hash so vote only affects if it's different than last vote (or last 5 votes...). Then it could be remplaced by the semisum of both votes of those hash matching ips. Maybe even with a special sensibility for repeated queries from the same ISP (dynamic ips connecting-voting-disconnecting).
About the vote forgiveness on change take into account that most wight changes are those of blanking/reverting so some blank-revert processes could reset the counter although content haven't changed.
I think it's hard to predict exactly what will happen when a feature like this is applied on a large scale, so it might take some sample data before determining which choice is best. And of course, the results will vary between wikis, so it'd probably be best to include a few configurable parameters in the extension (e.g. max votes per user, # of days back into the past to consider votes, discard votes over 5, use a scale of 1-x, etc)...
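A sketch of what such parameters might look like, and how they could be applied when selecting votes to count. The names and defaults are hypothetical, not the extension's actual settings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RatingConfig:
    scale_max: int = 5           # use a scale of 1..scale_max
    max_votes_per_user: int = 1  # newest votes win
    days_back: int = 30          # ignore votes older than this


def countable_votes(votes, now, config):
    """votes: list of (user, timestamp, rating); returns the votes that count."""
    cutoff = now - timedelta(days=config.days_back)
    per_user = {}
    kept = []
    for user, ts, rating in sorted(votes, key=lambda v: v[1], reverse=True):
        # Drop votes that are too old or outside the configured scale.
        if ts < cutoff or not 1 <= rating <= config.scale_max:
            continue
        if per_user.get(user, 0) >= config.max_votes_per_user:
            continue
        per_user[user] = per_user.get(user, 0) + 1
        kept.append((user, ts, rating))
    return kept
```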
P.S. http://wiki16.wikidiy.com/Special:ListRatings -> Fatal error: Call to a member function on a non-object in /var/www/html/wiki16/extensions/RateArticle.php on line 187
Fixed. Thanks.
thank you for your message see you soon
On 8/30/06, Travis Derouin travis@wikihow.com wrote:
You can see an example here on our development server:
http://wiki16.wikidiy.com/Get-a-Better-Deal-on-a-Home-Loan
(username / password: wikihow / wikihow2006) - scroll down to the bottom of the page for the checkmarks.
Seems to work well. Would it be possible to extend it to allow multiple ratings along different axes? It would depend on the individual wiki, but it's very likely to be useful to be able to assess quality of writing separately from comprehensiveness, or importance of topic separately from quality of article, or whatever...
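Supporting several axes mostly means storing an axis name alongside each rating and averaging per axis. A minimal sketch, with the axis names and function name invented for the example:

```python
from collections import defaultdict


def average_by_axis(ratings):
    """ratings: iterable of (axis, score) pairs for one page."""
    buckets = defaultdict(list)
    for axis, score in ratings:
        buckets[axis].append(score)
    return {axis: sum(scores) / len(scores) for axis, scores in buckets.items()}
```

Each wiki could then configure its own list of axes without any change to the storage layout.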
Steve