Hello Everybody,
In the temporary silence after the heated election and Wikipedia research journal debates and discussions (I hope at least the second one continues), I would like to take the opportunity to introduce our new manuscript, titled "Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data" and available at http://arxiv.org/abs/1211.0970.
There is also a rather fair review of this work at http://www.technologyreview.com/view/507076/now-wikipedia-used-to-predict-mo... .
As always, comments, criticism, encouragement, etc., are most welcome (if you are shy, please write to me off-list).
Bests, Taha Yasseri
Dr. Taha Yasseri
---------------------------------------------
www.phy.bme.hu/~yasseri
Department of Theoretical Physics
Institute of Physics
Budapest University of Technology and Economics
Budafoki út 8.
H-1111 Budapest, Hungary
tel: +36 1 463 4110
fax: +36 1 463 3567
---------------------------------------------
I think the idea that Wikipedia activity (or other social media activity) can be used as some kind of predictor of public popularity is probably relatively sound. If something is attracting a lot of public interest, then one might expect a certain proportion of the interested people to be editors of Wikipedia (or users of Twitter, etc.), leading to a corresponding uptick in activity in those spaces. Of course the editors of Wikipedia aren't a "typical" demographic sample, so things that appeal more to the Wikipedia demographic are probably more likely to manifest as Wikipedia activity, but for very popular movies the target market is somewhat similar to the Wikipedia editor demographic, so it probably correlates OK. However, that might account for the lower ability to predict the box office for less popular movies: maybe the audiences for those movies aren't statistically as likely to be Wikipedia editors?
But I am less sure whether there is any practical use for this finding in relation to movies. Where are the Wikipedia editors getting their advance movie information from? Presumably from the marketing activity of the movie itself. Which movies get the big marketing budget? The expected blockbusters. It's something of a self-fulfilling prophecy, I suspect.
From the point of view of the movie makers, there isn't much they can learn from the level of WP activity because from their perspective their money has largely been spent long ago on making the movie. They need to be able to predict its success a couple of years earlier, long before there will be a single edit on WP or any tweeting. I guess my point is that the ability to predict something is really only useful if the prediction can be made in advance of making an important decision. A really exciting result would be the ability to predict stock price movements from WP editing behaviour! We could use the profits from that to fund the journal, which could have a policy of publishing only unaffiliated authors as we would all be retired on our stock market riches! :-)
Kerry
_____
From: wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Taha Yasseri Sent: Wednesday, 7 November 2012 10:34 PM To: Research into Wikimedia content and communities Subject: [Wiki-research-l] Wikipedia Used to Predict Movie Box OfficeRevenues
There is also a rather fair review of this work at http://www.technologyreview.com/view/507076/now-wikipedia-used-to-predict-movie-box-office-revenues.
Thank you Kerry for the comment.
I think the idea that Wikipedia activity (or other social media activity) can be used as some kind of predictor of public popularity is probably relatively sound. If something is attracting a lot of public interest, then one might expect a certain proportion of the interested people to be editors of Wikipedia (or users of Twitter, etc.), leading to a corresponding uptick in activity in those spaces. Of course the editors of Wikipedia aren’t a “typical” demographic sample, so things that appeal more to the Wikipedia demographic are probably more likely to manifest as Wikipedia activity, but for very popular movies the target market is somewhat similar to the Wikipedia editor demographic, so it probably correlates OK. However, that might account for the lower ability to predict the box office for less popular movies – maybe the audiences for those movies aren’t statistically as likely to be Wikipedia editors?
That is one possibility; the other would be the existence of a threshold. We see that for the less successful movies the prediction usually underestimates the box office revenue, suggesting that a movie has to be more popular than some threshold before it evokes Wikipedia activity proportional to its public popularity.
But I am less sure whether there is any practical use for this finding in relation to movies. Where are the Wikipedia editors getting their advance movie information from? Presumably from the marketing activity of the movie itself. Which movies get the big marketing budget? The expected blockbusters. It’s something of a self-fulfilling prophecy I suspect.
I'm not actually sure about this. I guess there are many professional movie followers among Wikipedians who gather information from different, more specialized channels much earlier than the start of advertising campaigns. That is basically the main difference from, and strength compared to, the Twitter model, where the general audience starts to tweet about the movie, most likely prompted by marketing stimuli and only very close to the release time.
From the point of view of the movie makers, there isn’t much they can learn from the level of WP activity because from their perspective their money has largely been spent long ago on making the movie. They need to be able to predict its success a couple of years earlier, long before there will be a single edit on WP or any tweeting. I guess my point is that the ability to predict something is really only useful if the prediction can be made in advance of making an important decision. A really exciting result would be the ability to predict stock price movements from WP editing behaviour! We could use the profits from that to fund the journal, which could have a policy of publishing only unaffiliated authors as we would all be retired on our stock market riches! :-)
Sure, but please note that we are not claiming to be fortune-tellers! We are data people, and cannot have any result before any data is generated. There are movie consulting companies that have models to predict a movie's success at the "idea level", but nobody knows about their methods and performance. The other thing is that we should distinguish between movie makers and movie distributors. As far as I know, it is very important for the movie distribution companies to have an estimate of a movie's success even as late as the first weekend AFTER release. Our model already gives good results one month BEFORE release.
Thank you again for the comments, and I assure you that once I find a really good money predictor, I will definitely retire immediately and leave academia for real life! ;-)
bests, Taha
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
I agree, the movie distributors and the movie theatre owners can probably benefit from predictions made one month out.
Kerry