I'm pleased to say we now have the prototype pageviews definition as a UDF!
For those with cluster access:
CREATE TEMPORARY FUNCTION pageview as
'org.wikimedia.analytics.refinery.hive.isPageviewUDF';
...and then just apply it. It outputs a boolean, so you can easily go
WHERE pageview(fields) and treat it as a conditional. Great
success!
What this means for the definition is twofold: it is a lot easier to
test its accuracy, and it is a lot easier to make sure we're all using
the same definition going forward. Once we
have the legacy definition as a UDF, refining and testing will proceed
at great speed, although I encourage anyone with time on their hands
who wants to help out to do some testing of their own :)
--
Oliver Keyes
Research Analyst
Wikimedia Foundation
Hi,
When trying to run the 'eventlogging-devserver' command, as suggested in the
EventLogging guide [1], I get an error [2]. This happens in Vagrant and
also on my local environment. In README.rst from EventLogging/server/ I
found some small dependencies and I installed them.
I also tried this after a fresh install of MediaWiki, and after deleting
the egg files from /vagrant/mediawiki/extensions/EventLogging/server/ but
with the same result.
This is what I have added to my LocalSettings.php [3]. Note that I
have enabled the eventlogging role in Vagrant.
When I verify the setup as described in [4], the output is "ready" and
it gives me an object.
I was wondering if I have a bad setup or if I missed something.
[1]
https://www.mediawiki.org/wiki/Extension:EventLogging/Guide#Installing_the_…
[2] http://pastebin.com/C6bkVpHU
[3] http://pastebin.com/tyRt7mk8
[4] https://www.mediawiki.org/wiki/Extension:EventLogging#Developer_setup
Thank you,
Roxana Necula
Hi,
This is not urgent, but if anyone is interested in troubleshooting a
(probably basic) problem I am having with mysql, I would appreciate it!
I am trying to query the MobileWebClickTracking table, which is ginormous
and keeps timing out or boiling my RAM. So Dario suggested I use screen to
dump a section of the table into a more workable table.
Here is his test query that seemed to work:
*screen mysql -h analytics-store.eqiad.wmnet -B -e "CREATE TABLE
staging.jkatz_foo SELECT * FROM enwiki.user LIMIT 300;"*
I tried to do this on my own (I use analytics-slave instead, as my
credentials don't seem to work on analytics-store), and it doesn't seem to
do anything. Can you try on your end and let me know if you're having any
luck?
Here is the query:
jkatz@bast1001:~$ screen
jkatz@bast1001:~$ mysql -h analytics-slave.eqiad.wmnet -u research
-pJoFjnA90Ajyp -B -e "Insert into staging.jkatz_clicktracking1 Select *
from log.MobileWebClickTracking_5929948 WHERE ('timestamp' between
20141101000000 and 20141130000000) and wiki like 'enwiki';"
When I look at the processes on stat1003, I only see a sleep command--no query:
[image: Inline image 2]
Is the query within that sleep? Anyway, I seem to be creating tables but
they do not have any rows in them. I have double checked that the data
between those dates exists in that table.
Any thoughts?
Best,
J
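One possible cause of the empty tables, offered as a guess and not a
confirmed diagnosis: in the query above, 'timestamp' is wrapped in single
quotes, which SQL treats as a string literal and not as a column name, so
the BETWEEN test compares a constant and can never match. The sketch below
reproduces the effect with Python's built-in sqlite3 module. The table name
"clicks" and its three rows are hypothetical stand-ins, and SQLite's
comparison rules differ from MySQL's in detail, so treat it as an
illustration of the quoting pitfall only.

```python
import sqlite3

# Hypothetical miniature stand-in for log.MobileWebClickTracking_5929948.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE clicks ("timestamp" INTEGER, wiki TEXT)')
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [
        (20141015120000, "enwiki"),  # before the window
        (20141110093000, "enwiki"),  # inside the window
        (20141120180000, "dewiki"),  # inside the window, different wiki
    ],
)

# Single quotes: 'timestamp' is a string literal, the column is never read.
quoted_literal = conn.execute(
    "SELECT COUNT(*) FROM clicks "
    "WHERE ('timestamp' BETWEEN 20141101000000 AND 20141130000000) "
    "AND wiki LIKE 'enwiki'"
).fetchone()[0]

# Backticks (also accepted by MySQL): `timestamp` refers to the column.
column_ref = conn.execute(
    "SELECT COUNT(*) FROM clicks "
    "WHERE (`timestamp` BETWEEN 20141101000000 AND 20141130000000) "
    "AND wiki LIKE 'enwiki'"
).fetchone()[0]

print(quoted_literal, column_ref)
```

If the same thing is happening on analytics-slave, writing the column as
`timestamp` (backticks) or as plain timestamp in the WHERE clause would be
the first thing to try.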
Hi,
TL;DR: If “Angelsberg” does not ring a bell, you're not affected :-)
Otherwise, you probably run machinery against request logs that cares
about filtering requests made by WMF monitoring infrastructure.
Just a heads up that it seems at
https://gerrit.wikimedia.org/r/#/c/182558/
a discussion is starting on how to detect such monitoring requests and
how they might change.
Chime in there, or at least watch its outcome to make sure your
scripts keep on detecting/filtering monitoring requests.
Have fun,
Christian
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian(a)quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
---------------------------------------------------------------