Dear Wikimedia Analytics team,
My name is Albrecht Wirthmann. I work in the Big Data task team at Eurostat, the European statistical office, which is part of the European Commission. The task team is exploring new data sources to assess their feasibility for producing official statistics. We have been looking at various internet data sources, including Wikimedia. The idea we are currently pursuing is to identify English Wikipedia pages that refer to World Heritage sites and to analyse the number and development of page views of those pages as an indicator of exposure to culture. For this purpose we downloaded the page view files from http://dumps.wikimedia.org/other/pagecounts-ez/. The data should later be included in a pocketbook presenting statistics on culture in the European Union.
I am contacting you to make you aware of our intentions, to ask whether you have any concerns about our project, and, if possible, to have a chat with you and your team about some technical questions and the possibility of getting additional data. We would be interested in page hits by country, so that the statistics we compile can be more specific.
We would be very glad to receive a positive reply and remain at your disposal.
Kind regards,
Albrecht Wirthmann
TF Big Data Eurostat BECH building 5, rue Alphonse Weicker L 2721 Luxembourg Albrecht.Wirthmann@ec.europa.eu Tel +352 4301 33728 Fax +352 4301 34359 http://www.cros-portal.eu/content/big-data
Hi Albrecht,
This list is a good place to talk about your additional data requests. We are going to launch a public pageview API very soon, and that'll be announced on this list. We have some country breakdown reports [1], but we're currently revamping the definition of a "page view", and that'll trickle into these reports by the end of this year.
As for concerns related to your project, we have none. It's public data and all are free to use it as they wish. And your project sounds cool to me :)
[1] https://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryO...
On Tue, Oct 6, 2015 at 5:35 AM, Albrecht.Wirthmann@ec.europa.eu wrote:
[...]
Dear Albrecht,
as an avid Eurostat consumer, let me congratulate you on a very interesting project, which will be of great interest not only to the general population but also to direct consumers of Eurostat such as
* Europeana,
* the European Commission (for policy implications),
* EU governments (some of which supported Wiki Loves Monuments, e.g. for Pompei),
* the Council of Europe (for their support for Wiki Loves Monuments).
Dan already pointed out the best route for you to pursue in the future, but let me point out what data and tools people have needed and made available so far:
* for recent raw data, https://wikitech.wikimedia.org/wiki/Analytics/Data/Pagecounts-all-sites ;
* global usage of World Heritage Sites images: 190k usages for 24k images (via http://tools.wmflabs.org/glamtools/glamorous.php?doit=1&category=World+H... );
* pageviews in a wiki (e.g. English Wikipedia) for all articles classified as world heritage sites on Wikidata: ~3 M/month for ~800 articles (via WDQ claim[1435:9259] in Listeria > PagePile > TreeViews https://tools.wmflabs.org/glamtools/treeviews/?q=%7B%22pagepile%22%3A%22962%... );
* for Europeana, https://meta.wikimedia.org/wiki/Europeana/Stats ;
* for WLM image accesses in all languages, e.g. http://tools.wmflabs.org/glamtools/baglama2/#gid=60&month=201508 ;
* for multilingual pageviews of an individual topic, e.g. https://tools.wmflabs.org/hay/langviews/index.php?url=http://en.wikipedia.or... .
Federico
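For anyone who wants to reproduce the kind of total discussed in this thread from downloaded dump files, here is a minimal Python sketch. It is not the method used by Eurostat or by any of the tools linked above. It assumes each data line has the form "project title count" followed by optional extra fields, and that English Wikipedia is labelled "en.z"; both are assumptions that should be checked against the documentation of the files actually downloaded. The function name sum_views and the sample titles are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def sum_views(lines: Iterable[str], titles: Set[str], project: str = "en.z") -> Counter:
    """Sum view counts per title for one project.

    Assumed line layout (not confirmed in this thread):
    "<project> <title> <count> [extra fields...]".
    """
    totals: Counter = Counter()
    for line in lines:
        if line.startswith("#"):       # skip comment/header lines
            continue
        fields = line.split()
        if len(fields) < 3:            # skip malformed lines
            continue
        proj, title, count = fields[0], fields[1], fields[2]
        if proj != project or title not in titles:
            continue
        try:
            totals[title] += int(count)
        except ValueError:             # count field was not a number
            continue
    return totals


# Synthetic sample data, for illustration only.
sample = [
    "# comment line",
    "en.z Acropolis_of_Athens 1200",
    "en.z Main_Page 999999",
    "de.z Acropolis_of_Athens 300",
    "en.z Cologne_Cathedral 800 A1B2",
    "en.z Acropolis_of_Athens 50",
]
heritage = {"Acropolis_of_Athens", "Cologne_Cathedral"}
views = sum_views(sample, heritage)
print(dict(views))
```

In practice the set of titles would come from a list of World Heritage site articles (for example one exported from Wikidata), and the lines would be read from the decompressed dump file instead of the in-memory sample.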
Albrecht.Wirthmann@ec.europa.eu, 06/10/2015 11:35:
[...]