Hi,
recently, we again saw a few million requests to the 'undefined' and 'Undefined' pages. While some requests for Undefined are real, proper requests, by far most of them seem to stem from gone-wrong JavaScript. (See bug 66352 [1].)
While the fact that there is gone-wrong JavaScript is an issue on its own, I am preparing patches to drop [uU]ndefined from the webstatscollector pageview definition [2], and hence no longer count them in the pagecounts-raw [3] and pagecounts-all-sites [4] files.
If you think it's worth keeping them, please let us know soonish. (I'll try to get the changes merged on 2014-10-09.)
Have fun, Christian
[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=66352
[2] https://wikitech.wikimedia.org/wiki/Analytics/Webstatscollector#Used_Page_Vi...
[3] https://wikitech.wikimedia.org/wiki/Analytics/Pagecounts-raw
    http://dumps.wikimedia.org/other/pagecounts-raw/
[4] https://wikitech.wikimedia.org/wiki/Analytics/Pagecounts-all-sites
    http://dumps.wikimedia.org/other/pagecounts-all-sites/
Hi,
On Wed, Oct 08, 2014 at 12:28:38AM +0200, Christian Aistleitner wrote:
> If you think it's worth keeping them, please let us know soonish. (I'll try to get the changes merged on 2014-10-09.)
The changes are ready, but deployment was postponed to next week :-/
Have fun, Christian
Hi,
On Fri, Oct 10, 2014 at 02:25:49AM +0200, quelltextlich e.U. - Christian Aistleitner wrote:
> On Wed, Oct 08, 2014 at 12:28:38AM +0200, Christian Aistleitner wrote:
> > If you think it's worth keeping them, please let us know soonish. (I'll try to get the changes merged on 2014-10-09.)
> The changes are ready, but deployment was postponed to next week :-/
The new webstatscollector implementations have been deployed today (2014-10-15). Thanks ottomata!
The last files with the [uU]ndefined counts and redirect counting are
http://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-10/pagecounts-2014... [1]
http://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-10/projectcounts-2... [1]
http://dumps.wikimedia.org/other/pagecounts-all-sites/2014/2014-10/pagecount...
http://dumps.wikimedia.org/other/pagecounts-all-sites/2014/2014-10/projectco...
The first files without them are
http://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-10/pagecounts-2014... [1]
http://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-10/projectcounts-2... [1]
http://dumps.wikimedia.org/other/pagecounts-all-sites/2014/2014-10/pagecount...
http://dumps.wikimedia.org/other/pagecounts-all-sites/2014/2014-10/projectco...
Have fun, Christian
[1] When restarting collector and filter for the C implementation of webstatscollector, there was a period (<2 minutes) during which the new collector and the old filter were running together. Hence, during this period a few [uU]ndefined and redirects made it into the 20:00:00 file.
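
For anyone who wants to compare files from before and after the change, the [uU]ndefined lines can be filtered out of the older files by hand. What follows is only a minimal sketch, not the webstatscollector code. It assumes the usual space-separated pagecounts line layout (project, page title, request count, bytes) and assumes that an exact match on the title is what should be dropped. The function name and the sample lines are made up for illustration.

```python
import re

# Assumption: only titles that are exactly 'undefined' or 'Undefined' are
# dropped; the actual webstatscollector rule may differ.
UNDEFINED_TITLE = re.compile(r"[uU]ndefined")


def drop_undefined(lines):
    """Yield pagecounts lines whose page title is not [uU]ndefined."""
    for line in lines:
        fields = line.split(" ")
        # Assumed layout: project, page title, request count, bytes.
        if len(fields) >= 2 and UNDEFINED_TITLE.fullmatch(fields[1]):
            continue
        yield line


# Made-up sample lines, not taken from a real file.
sample = [
    "en Main_Page 100 2000",
    "en undefined 50 1000",
    "de Undefined 7 140",
    "en Undefined_behavior 3 60",
]
kept = list(drop_undefined(sample))
```

Titles that merely start with "Undefined" (like the last sample line) are kept, since the match is on the whole title field. This sketch does nothing about the redirect counting mentioned above.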