On Tue, Dec 20, 2016 at 12:45 AM, Quim Gil <qgil(a)wikimedia.org> wrote:
The questions for this session are being crowdsourced
at
http://www.allourideas.org/wikidev17-product-technology-questions. Anyone
can propose questions and vote, anonymously, as many times as they want. At
the moment, we have 25 questions and 451 votes.
An important technical detail: questions posted later also have a good
chance of making it to the top of the list, as long as new voters select
them. The ranking is built from comparisons between questions, not from an
accumulation of votes. For instance, the current top question is in fact
one of the most recently submitted.
Right now the top question has a score of 70 based on 88 votes; the
second-place question has a score of 67 based on a single vote. (This is not
some super-rare accident, either: numbers 8 and 9 on the popularity list
both have 4 votes.)
I argued that All Our Ideas was too experimental to be relied on back when
it was being considered as the voting tool for an early iteration of what
ended up being the Community Tech Wishlist, and I still think that's the case.
Their voting system works like this: they assume that each idea has some
appeal (an arbitrary real number) for each voter, that the appeals for a
given idea are normally distributed, and that when a voter is shown a pair
of questions, their probability of voting a given way is a certain function
of the difference in appeals. They then use various statistical methods to
generate random values for the appeals that match the observed votes. From
those values they calculate, for each question, the probability that a
randomly selected voter would prefer it to a randomly selected alternative;
those probabilities are the questions' scores.
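To make the scoring step concrete, here is a rough sketch (not their actual implementation; the function names, and the choice of the normal CDF as the comparison function, are my assumptions, loosely following the Thurstone-style model described above):

```python
import math

def prefer_prob(appeal_i, appeal_j):
    # Chance that a voter prefers idea i over idea j, modeled as the
    # normal CDF of the appeal difference (a Thurstone-style choice model).
    return 0.5 * (1 + math.erf((appeal_i - appeal_j) / math.sqrt(2)))

def score(appeal, other_appeals):
    # Score of an idea on a 0-100 scale: the probability that a random
    # voter prefers it to a randomly chosen alternative idea.
    probs = [prefer_prob(appeal, other) for other in other_appeals]
    return 100 * sum(probs) / len(probs)
```

With appeals of 1.0, 0.0 and -1.0, the middle idea scores exactly 50. The catch is that in the real system the appeals themselves are random draws constrained by the observed votes, which is where the instability for low-vote questions comes from.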
That means that for some questions the scores can be heavily underspecified
(i.e. mostly the result of the random numbers generated by their algorithm,
not of actual votes); this is especially true for recently submitted
questions, which have very few votes and so end up in an essentially random
position in the ranking. As far as I can see, the journal article [1] where
they present their method doesn't discuss this problem at all. This is not
terribly useful as a real-world ranking model IMO, so I hope that 1) there
will be some human oversight when evaluating the results, and 2) we don't
intend to use this system for any voting that actually matters (getting
weirdly prioritized results for a Q&A session is, of course, not a huge
deal).
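To illustrate the underspecification concern with a toy simulation of my own (not All Our Ideas' algorithm): score a question of perfectly average appeal from only the votes it has received, and the one-vote question swings between the extremes while the 88-vote question stays near 50.

```python
import random
import statistics

def observed_score(n_votes, rng):
    # Toy model: each vote is a fair coin flip, i.e. the question's true
    # appeal equals its opponents', so the "true" score is 50.
    wins = sum(rng.random() < 0.5 for _ in range(n_votes))
    return 100 * wins / n_votes

rng = random.Random(0)
few = [observed_score(1, rng) for _ in range(1000)]    # every score is 0 or 100
many = [observed_score(88, rng) for _ in range(1000)]  # clustered around 50
print(statistics.pstdev(few))   # enormous spread
print(statistics.pstdev(many))  # modest spread
```

Any ranking that mixes both kinds of question is comparing a near-meaningless number against a fairly stable one.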
[1]
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0123483