Yuri/Stas:
This thread is missing the background on what the issues are; if you could forward it, that would be great.
Thanks, though using a distinct User-Agent may be easier for analysis, since those are stored as separate fields, and operating on a separate field is much easier than extracting comments from the query field, e.g. when doing Hive data processing.
X-Analytics is a separate field in our Hive data, and we prefer that information intended for analytics goes there. Please see the docs: https://wikitech.wikimedia.org/wiki/X-Analytics
On Sun, Oct 2, 2016 at 1:32 PM, Yuri Astrakhan yastrakhan@wikimedia.org wrote:
I would highly recommend using the X-Analytics header for this, and establishing one or more "well known" key names. X-Analytics gets parsed into key-value pairs (an object field) by our Varnish/Hadoop infrastructure, whereas the user agent is basically a semi-free-form text string. Also, the user agent cannot be set by a JavaScript client, so we would constantly have to perform two types of analysis: one for requests that came from a "backend" and one for requests made by the browser.
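To illustrate why a key-value header is convenient to analyze, here is a minimal Python sketch of splitting an X-Analytics-style value into pairs. It assumes the `key=value;key=value` layout described on the wikitech page; the function name `parse_x_analytics` and the `tool` key are hypothetical, not an agreed convention, and this is not the parser the Varnish/Hadoop pipeline actually uses.

```python
def parse_x_analytics(header):
    """Split an X-Analytics-style header value into a dict of key-value pairs.

    Assumes semicolon-separated "key=value" items; this is an illustrative
    sketch, not the production parser.
    """
    pairs = {}
    for item in header.split(";"):
        item = item.strip()
        if not item:
            continue  # skip empty segments such as a trailing ";"
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


# "tool" is a hypothetical "well known" key name used only for illustration.
fields = parse_x_analytics("https=1;tool=my-sparql-bot")
tool_name = fields.get("tool")
```

Once the value is available as pairs, selecting requests from one tool is a plain key lookup rather than text matching.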
On Sun, Oct 2, 2016 at 4:28 PM Stas Malyshev smalyshev@wikimedia.org wrote:
Hi!
I'll try to throw in a #TOOL: comment where I can remember using SPARQL, but I'm bound to forget a few...
Thanks, though using a distinct User-Agent may be easier for analysis, since those are stored as separate fields, and operating on a separate field is much easier than extracting comments from the query field, e.g. when doing Hive data processing.
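For comparison, a rough Python sketch of what pulling a `#TOOL:` comment back out of a logged query might involve. It assumes the query is stored URL-encoded and that the comment sits on a line of its own; the function name `tool_from_logged_query` and the exact comment layout are assumptions made for illustration only.

```python
import re
from urllib.parse import unquote_plus

# Matches a line such as "#TOOL: my-sparql-bot" (assumed layout).
TOOL_COMMENT = re.compile(r"^[ \t]*#[ \t]*TOOL:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def tool_from_logged_query(encoded_query):
    """Return the tool name from a URL-encoded SPARQL query, or None."""
    query = unquote_plus(encoded_query)  # undo URL encoding first
    match = TOOL_COMMENT.search(query)
    return match.group(1) if match else None
```

The extra decoding and pattern matching is the overhead that a dedicated field avoids.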
-- Stas Malyshev smalyshev@wikimedia.org
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics