Yeah, I think let's keep it around and if people need it we can try to vet it more thoroughly. I think Tomasz only cares about the ratio between the different apps and not so much about the total numbers. So let's just go with the new data and make sure that's accurate.
Hi,
we are currently bringing the device property and platform
computations back to life outside of Hadoop. Data for the last few
days has been computed, and the jobs are running.
However, I am not sure about the old data that we have. Should we
blend that in?
* For device properties, I found that
http://stats.wikimedia.org/kraken-public/webrequest/mobile/device/props
seems to contain property data for 2013-03-01 until 2013-05-15.
Since this data already stops in mid-May, I assume there is more data
to blend in (end of May, June, July) stored somewhere else.
Do we have such data?
Do we know whether the above data is good, or is it just a relic from test runs?
* For platform data, I found that
http://stats.wikimedia.org/kraken-public/webrequest/mobile/platform/mobile_platform-daily.tsv
has platform data from 2013-04-14 until 2013-07-20.
However, I am not sure which of this data is valid. Naive, uneducated
plausibility checks fail badly [1].
Do we know if/which data is good?
Do we have better or other sources for the platform job?
Best regards,
Christian
[1] For example, when looking only at the last few data points
for Android on Tuesdays, we get [2]:
2013-04-16: 6438000
2013-04-23: 6300000
2013-04-30: 6559000
2013-05-06: 7267000
2013-05-13: 6954000
2013-05-27: 33335000
2013-06-04: 14388000
2013-06-11: 8563000
2013-06-18: 10241000
2013-06-25: 6896000
2013-07-09: 3454000
2013-07-16: 7206000
The highest value (33M) is roughly 10 times the lowest (3M), within
only three months.
Even if we treat those two data points as outliers (and there are
readings even further out, ranging from 1M to 37M for Android), the
lowest remaining data point is still only about half the highest one.
All on the same weekday!
This looks suspicious.
[2] There is no data for 2013-05-20 and 2013-07-02.
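
To make the plausibility check from [1] easy to repeat, here is a minimal
Python sketch. It only uses the Android readings listed above; the names
READINGS and summarize are made up for illustration and are not part of any
existing job. It computes the highest/lowest ratio, the same ratio with the
two extreme points dropped, and which of the listed dates are not Tuesdays.

```python
from datetime import date

# Android readings copied verbatim from footnote [1].
READINGS = {
    "2013-04-16": 6438000,
    "2013-04-23": 6300000,
    "2013-04-30": 6559000,
    "2013-05-06": 7267000,
    "2013-05-13": 6954000,
    "2013-05-27": 33335000,
    "2013-06-04": 14388000,
    "2013-06-11": 8563000,
    "2013-06-18": 10241000,
    "2013-06-25": 6896000,
    "2013-07-09": 3454000,
    "2013-07-16": 7206000,
}


def summarize(readings):
    """Return simple spread statistics for a {iso_date: count} mapping."""
    values = sorted(readings.values())
    return {
        # Highest reading divided by lowest reading.
        "ratio_all": values[-1] / values[0],
        # Same ratio after dropping the single highest and lowest reading.
        "ratio_trimmed": values[-2] / values[1],
        # Dates that are not Tuesdays (date.weekday(): Monday=0, Tuesday=1).
        "non_tuesdays": sorted(
            d for d in readings if date.fromisoformat(d).weekday() != 1
        ),
    }


summary = summarize(READINGS)
for key, value in summary.items():
    print(key, value)
```

Going by the calendar, the three May dates in the list fall on Mondays rather
than Tuesdays, so either those dates or the weekday label may be off by one.
That does not change the spread between the readings.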
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a Email: christian@quelltextlich.at
4040 Linz, Austria Phone: +43 732 / 26 95 63
Fax: +43 732 / 26 95 63
Homepage: http://quelltextlich.at/
---------------------------------------------------------------