Hi,
just a quick heads up that due to database issues, geowiki currently
cannot update daily with new data.
So pages with daily active editor counts like
http://gp.wmflabs.org/graphs/active_editors_total
http://gp.wmflabs.org/graphs/enwiki_editor_counts
http://gp.wmflabs.org/graphs/frwiki_editor_counts
http://gp.wmflabs.org/graphs/eowiki_editor_counts
[...]
and the private per country breakdowns at
https://stats.wikimedia.org/geowiki-private/
will not see updates until the issue is resolved.
Older data is not affected by the issue. So data up to May 1st is good
to use (with the usual geowiki caveats).
Best regards,
Christian
P.S.: The root issue is not severe, and I guess it can be fixed in the
next couple of days.
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a Email: christian(a)quelltextlich.at
4040 Linz, Austria Phone: +43 732 / 26 95 63
Fax: +43 732 / 26 95 63
Homepage: http://quelltextlich.at/
---------------------------------------------------------------
Hi,
people from gerrit's “Analytics” group [1] currently hold
* Push (including Force Push)
* Push Merge Commit
* Forge Author Identity
* Forge Committer Identity
permissions on “analytics/*” projects in gerrit. But those permissions
have repeatedly gotten in the way, one way or another.
Do we need those permissions for our repos?
If no one objects, I'll start removing them on 2014-04-28.
Best regards,
Christian
[1] https://gerrit.wikimedia.org/r/#/admin/groups/uuid-d34747bee94be39cff54b5fd…
---------------------------------------------------------------
Hi,
TL;DR: When consuming EventLogging data, only rely on the 'log'
database available from m2 replicas, like analytics-store.eqiad.wmnet.
Other representations might not get updated, might not get fix-ups or
may (on purpose) give you unvalidated data.
----------------------------------
Due to the versatile design of EventLogging, its data exists (or
existed) in many different representations, which left me confused
about the data quality to expect from each of them. I also could not
find those expectations publicly documented. After talking about
different aspects with a few people, I want to put my current
understanding up for public discussion.
Please let me know (either in private or on list) if something looks
wrong or does not match your use of EventLogging data.
* MySQL / MariaDB database on m2
This database is the best place to consume EventLogging data from.
Available as 'log' database on m2 replicas, such as
analytics-store.eqiad.wmnet.
Only validated events enter the database.
In case of bugs, this database is the only place that gets fixes like
cleanup of historic data, or live fixes.
* 'all-events' JSON log files [1]
Use this data source only to debug issues around ingestion into the m2
database.
Entries are JSON objects.
Only validated events get written.
In case of bugs, historic data does not get fixed.
* Raw client and server side log files [2]
Use this data source only to debug issues around ingestion into the m2
database.
Entries are parameters to the event.gif's request. They are not
decoded at all.
In case of bugs, historic data does not get fixed. Hot-fixes need not
reach those files either.
* Kafka:
EventLogging data has not been fed into Kafka since 2014-06-12 [3].
The EventLogging data in Kafka had no users.
Turning it on again is tracked in bug 66528 [4].
* MongoDB:
EventLogging data has not been fed into MongoDB since 2014-02-13 [5].
The EventLogging data in MongoDB did not appear to get used.
I am not aware of plans to revive feeding the data into MongoDB.
* ZMQ:
ZMQ is available from vanadium.
In case of bugs, historic data cannot get fixed :-)
Data coming from the forwarders (ports 8421, 8422) is not validated
and need not see hot-fixes.
Data coming from the processors (ports 8521, 8522) and the multiplexer
(port 8600) is validated.
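To make the ZMQ part of the list above easier to apply, here is a minimal sketch of a port lookup. The port numbers and their validated/unvalidated status come from the list; the names ZMQ_PORTS, is_validated and endpoint, the role labels, and the use of the bare host name "vanadium" in the endpoint string are assumptions made for illustration.

```python
# Ports as listed above. The role labels are names chosen for this sketch.
ZMQ_PORTS = {
    8421: ("forwarder", False),    # not validated, need not see hot-fixes
    8422: ("forwarder", False),
    8521: ("processor", True),     # validated
    8522: ("processor", True),
    8600: ("multiplexer", True),   # validated
}


def is_validated(port):
    """Return True if data published on this port is validated."""
    _role, validated = ZMQ_PORTS[port]
    return validated


def endpoint(port, host="vanadium"):
    """Build a ZMQ-style TCP endpoint string (the host name is an assumption)."""
    return f"tcp://{host}:{port}"


for port, (role, validated) in sorted(ZMQ_PORTS.items()):
    status = "validated" if validated else "NOT validated"
    print(endpoint(port), role, status)
```

A consumer that needs validated events would, under these assumptions, subscribe only to ports for which is_validated returns True.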
Have fun,
Christian
[1] Available as
stats1002:/a/eventlogging/archive/all-events.log-$DATE.gz
stats1003:/srv/eventlogging/archive/all-events.log-$DATE.gz
vanadium:/var/log/eventlogging/...
[2] Available as
stats1002:/a/eventlogging/archive/client-side-events.log-$DATE.gz
stats1002:/a/eventlogging/archive/server-side-events.log-$DATE.gz
stats1003:/srv/eventlogging/archive/client-side-events.log-$DATE.gz
stats1003:/srv/eventlogging/archive/server-side-events.log-$DATE.gz
vanadium:/var/log/eventlogging/...
[3] https://git.wikimedia.org/commitdiff/operations%2Fpuppet.git/f85b1dbcd61bbb…
[4] https://bugzilla.wikimedia.org/show_bug.cgi?id=66528
[5] https://git.wikimedia.org/commitdiff/operations%2Fpuppet.git/05b4027973c59b…
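For the raw log files in [2], whose entries are not decoded at all, a small decoding sketch may help when debugging ingestion. It assumes that the parameters of an event.gif request are a percent-encoded JSON object in the query string. That layout, the function name decode_raw_query, and the sample value are assumptions made for illustration and are not taken from the files themselves.

```python
import json
from urllib.parse import unquote


def decode_raw_query(query):
    """Decode a percent-encoded JSON query string into a dict.

    Assumes the whole query string (after an optional leading '?')
    is one percent-encoded JSON object.
    """
    payload = query.lstrip("?")
    return json.loads(unquote(payload))


# Made-up sample entry; real entries may carry more fields around it.
sample = "?%7B%22schema%22%3A%22NavigationTiming%22%2C%22revision%22%3A7494934%7D"
event = decode_raw_query(sample)
print(event["schema"], event["revision"])
```

Since these files are unvalidated, json.loads may raise on malformed entries, so wrapping the call in a try/except is advisable when scanning a whole file.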
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian(a)quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
---------------------------------------------------------------
(to public list and cc-ing Nemo)
Hello,
The last time EventLogging throughput increased, Nemo had to notify us
via e-mail. This is just a brief note to the list to say that we now
have throughput monitoring for EventLogging, and that it is working.
We had a throughput spike today that we shall be investigating.
Thanks,
Nuria
---------- Forwarded message ----------
From: <icinga(a)neon.wikimedia.org>
Date: Wed, Jun 25, 2014 at 4:46 AM
Subject: ** PROBLEM alert - tungsten/Throughput of event logging events is
CRITICAL **
To: nuria(a)wikimedia.org
❤❤❤❤❤ Icinga ❤❤❤❤❤
Notification Type: PROBLEM
Service: Throughput of event logging events
Host: tungsten
Address: 10.64.0.18
State: CRITICAL
Date/Time: Wed Jun 25 02:46:22 UTC 2014
Additional Info:
CRITICAL: 7.14% of data exceeded the critical threshold [500.0]
Love, Icinga
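The "7.14% of data exceeded the critical threshold [500.0]" line can be read as a share of recent datapoints above a limit. The sketch below shows that reading. How the actual check computes its percentage is not stated in the alert, so the function percent_over and the sample series are assumptions for illustration only.

```python
def percent_over(datapoints, threshold):
    """Return the percentage of non-missing datapoints above the threshold."""
    values = [v for v in datapoints if v is not None]
    if not values:
        return 0.0
    over = sum(1 for v in values if v > threshold)
    return 100.0 * over / len(values)


# Made-up series: 13 quiet datapoints and one spike above 500 events.
sample = [120.0] * 13 + [650.0]
print(f"{percent_over(sample, 500.0):.2f}%")  # prints 7.14%
```

Under this reading, a single datapoint out of 14 above the threshold would already be enough to produce the reported figure.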
Hi,
No action required. Just a heads up.
Since there have been some problems with machine capacity, slow
queries, and the resulting slave lag and alarms over the past months,
springle suggested pointing the s[23467]-analytics-slave aliases to the
new analytics-store machine. That should kill some of the problems at
the root.
This migration gets tracked in Bug 66068 [1].
No action is required from you, the change should be transparent.
Keep using the aliases in the same manner as you did up to now.
This email is merely a heads up about the switch.
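For anyone unsure which aliases the shorthand s[23467]-analytics-slave covers, here is a small sketch that expands the bracket notation into individual names. The helper expand_alias_pattern is hypothetical and only handles a single bracketed group of single characters. Any domain suffix the aliases carry is not covered here.

```python
import re


def expand_alias_pattern(pattern):
    """Expand one [..] group of single characters into a list of names."""
    match = re.fullmatch(r"(.*)\[([^\]]+)\](.*)", pattern)
    if match is None:
        return [pattern]
    prefix, choices, suffix = match.groups()
    return [f"{prefix}{choice}{suffix}" for choice in choices]


for alias in expand_alias_pattern("s[23467]-analytics-slave"):
    print(alias)
```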
Have fun,
Christian
[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=66068
---------------------------------------------------------------
Minutes and slides from Monday's quarterly review of the Foundation's
Analytics team are now available at
https://meta.wikimedia.org/wiki/Metrics_and_activities_meetings/Quarterly_r…
On Wed, Dec 19, 2012 at 6:49 PM, Erik Moeller <erik(a)wikimedia.org> wrote:
> Hi folks,
>
> to increase accountability and create more opportunities for course
> corrections and resourcing adjustments as necessary, Sue's asked me
> and Howie Fung to set up a quarterly project evaluation process,
> starting with our highest priority initiatives. These are, according
> to Sue's narrowing focus recommendations which were approved by the
> Board [1]:
>
> - Visual Editor
> - Mobile (mobile contributions + Wikipedia Zero)
> - Editor Engagement (also known as the E2 and E3 teams)
> - Funds Dissemination Committee and expanded grant-making capacity
>
> I'm proposing the following initial schedule:
>
> January:
> - Editor Engagement Experiments
>
> February:
> - Visual Editor
> - Mobile (Contribs + Zero)
>
> March:
> - Editor Engagement Features (Echo, Flow projects)
> - Funds Dissemination Committee
>
> We’ll try doing this on the same day or adjacent to the monthly
> metrics meetings [2], since the team(s) will give a presentation on
> their recent progress, which will help set some context that would
> otherwise need to be covered in the quarterly review itself. This will
> also create open opportunities for feedback and questions.
>
> My goal is to do this in a manner where even though the quarterly
> review meetings themselves are internal, the outcomes are captured as
> meeting minutes and shared publicly, which is why I'm starting this
> discussion on a public list as well. I've created a wiki page here
> which we can use to discuss the concept further:
>
> https://meta.wikimedia.org/wiki/Metrics_and_activities_meetings/Quarterly_r…
>
> The internal review will, at minimum, include:
>
> Sue Gardner
> myself
> Howie Fung
> Team members and relevant director(s)
> Designated minute-taker
>
> So for example, for Visual Editor, the review team would be the Visual
> Editor / Parsoid teams, Sue, me, Howie, Terry, and a minute-taker.
>
> I imagine the structure of the review roughly as follows, with a
> duration of about 2 1/2 hours divided into 25-30 minute blocks:
>
> - Brief team intro and recap of team's activities through the quarter,
> compared with goals
> - Drill into goals and targets: Did we achieve what we said we would?
> - Review of challenges, blockers and successes
> - Discussion of proposed changes (e.g. resourcing, targets) and other
> action items
> - Buffer time, debriefing
>
> Once again, the primary purpose of these reviews is to create improved
> structures for internal accountability, escalation points in cases
> where serious changes are necessary, and transparency to the world.
>
> In addition to these priority initiatives, my recommendation would be
> to conduct quarterly reviews for any activity that requires more than
> a set amount of resources (people/dollars). These additional reviews
> may however be conducted in a more lightweight manner and internally
> to the departments. We’re slowly getting into that habit in
> engineering.
>
> As we pilot this process, the format of the high priority reviews can
> help inform and support reviews across the organization.
>
> Feedback and questions are appreciated.
>
> All best,
> Erik
>
> [1] https://wikimediafoundation.org/wiki/Vote:Narrowing_Focus
> [2] https://meta.wikimedia.org/wiki/Metrics_and_activities_meetings
> --
> Erik Möller
> VP of Engineering and Product Development, Wikimedia Foundation
>
> Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate
--
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB
The "rate" graphs in graphite for each EventLogging schema don't work
anymore: https://graphite.wikimedia.org/
It seems like they died around May 16th:
https://graphite.wikimedia.org/render/?width=586&height=308&_salt=140310005…
Is this a known issue? At the moment I'm doing manual calculations
against the database to see how much I've been able to reduce Media
Viewer's usage of EventLogging. If those graphs worked, it would be a
lot faster to get the results.
Hi,
Columns for country data in EventLogging tables sometimes contain not
only the country code, but also larger chunks of the client cookies,
which may put sensitive data into the tables.
The corresponding bug is
https://bugzilla.wikimedia.org/show_bug.cgi?id=66478
At least the
NavigationTiming
MultimediaViewerNetworkPerformance
schemas are affected, from the end of April 2014 onwards.
If you publish reports exposing country information or aggregates,
please check that the country data does not expose sensitive
information, or take the affected reports down until the issue is
fixed.
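One way to do such a check before publishing is sketched below. It assumes that a clean value is exactly two uppercase ASCII letters, which is an assumption about the column and not something the bug report guarantees. The function name and the sample rows are made up for illustration.

```python
import re

# Assumption: a clean country value is exactly two uppercase ASCII letters.
COUNTRY_CODE = re.compile(r"[A-Z]{2}")


def suspicious_country_values(values):
    """Return the distinct non-empty values that are not a bare two-letter code."""
    return sorted({v for v in values if v and not COUNTRY_CODE.fullmatch(v)})


# Made-up rows; the cookie-like string only mimics the problem described above.
rows = ["AT", "US", "AT; someCookie=abc123", None, "FR"]
print(suspicious_country_values(rows))
```

Any value this returns should be dropped or truncated before it ends up in a public report.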
Sorry for the inconvenience,
Christian
---------------------------------------------------------------
Hi,
while doing some unrelated maintenance work around the EventLogging
tables, I noticed that there are a few tables that look like cruft:
* Appended “_1”
MultimediaViewerNetworkPerformance_7917896_1
(It's only about the table ending in “_1”, not the table without
this suffix)
* zz_MobileWebInfobox_6221064
zz_ModuleStorage_6356853
zz_NavigationTiming_7494934
zz_PageContentSaveComplete_5303086
zz_PageContentSaveComplete_5588433
zz_PageSaveTiming_5557427
zz_TimingData_7254808
(It's only about the tables starting with “zz_”, not the tables without
this prefix)
None of those have received data for some time, and their names look
like they should be removed anyway. Also, they are getting in the way
of maintenance.
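To double-check which tables the two naming patterns above would catch, a sketch like the following can be run over a list of table names. The regular expression is my own reading of the patterns ("zz_" prefix, or an extra "_1" after the revision number), and the function cruft_candidates is hypothetical.

```python
import re

# "zz_" prefix, or "<Schema>_<revision>_1" with an extra "_1" appended.
CRUFT = re.compile(r"^zz_|_\d+_1$")


def cruft_candidates(tables):
    """Return the table names that match one of the two cruft patterns."""
    return [t for t in tables if CRUFT.search(t)]


tables = [
    "MultimediaViewerNetworkPerformance_7917896",
    "MultimediaViewerNetworkPerformance_7917896_1",
    "NavigationTiming_7494934",
    "zz_NavigationTiming_7494934",
]
for name in cruft_candidates(tables):
    print(name)
```

The tables without the prefix or suffix are left alone, matching the notes in the list above.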
If no one speaks up until 2016-06-23, I'll consider them cruft and
start working towards removing them.
Have fun,
Christian
P.S.: This item is getting tracked at
https://bugzilla.wikimedia.org/show_bug.cgi?id=66649
---------------------------------------------------------------