Hey.
I've been looking through the EventLogging database, and it contains
several redundant tables holding test data from before the app launched,
which we're never going to use for anything. These tables are therefore
safe to delete. They don't have many records in them, so deleting them
won't free up much space, but it will at least declutter the database a
bit.
The tables are:
1. MobileWikiAppEdit_8188113
2. MobileWikiAppEdit_8197809
3. MobileWikiAppLogin_8134781
4. MobileWikiAppLogin_8234429
5. MobileWikiAppLogin_9122346
6. MobileWikiAppOperatorCode_8678522
7. MobileWikiAppReadingAction_8233710
Please don't delete anything other than these specific ones!
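If it helps, here is a minimal sketch of how the statements could be generated from a fixed allowlist, so nothing outside the list above can slip in. It assumes a MySQL-compatible server (hence the backtick quoting); the helper name `build_drop_statements` is hypothetical, and the script only prints the SQL rather than running it.

```python
import re

# Table names copied from the list above; nothing else may be dropped.
TABLES_TO_DROP = [
    "MobileWikiAppEdit_8188113",
    "MobileWikiAppEdit_8197809",
    "MobileWikiAppLogin_8134781",
    "MobileWikiAppLogin_8234429",
    "MobileWikiAppLogin_9122346",
    "MobileWikiAppOperatorCode_8678522",
    "MobileWikiAppReadingAction_8233710",
]

# Expected shape: letters, an underscore, then a numeric revision id.
_TABLE_NAME = re.compile(r"^[A-Za-z]+_[0-9]+$")


def build_drop_statements(tables):
    """Return one DROP TABLE statement per table name (hypothetical helper)."""
    statements = []
    for name in tables:
        # Refuse anything that does not look like an EventLogging table name.
        if not _TABLE_NAME.match(name):
            raise ValueError(f"unexpected table name: {name!r}")
        statements.append(f"DROP TABLE IF EXISTS `{name}`;")
    return statements


if __name__ == "__main__":
    # Print only, so the output can be reviewed before anyone runs it.
    for statement in build_drop_statements(TABLES_TO_DROP):
        print(statement)
```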
Thanks,
Dan
--
Dan Garry
Associate Product Manager, Mobile Apps
Wikimedia Foundation
Hi,
Ops changed the password for the "research" user on analytics-store (I
don't know what the context is, I just saw the commit summary on the Ops
ML), which our team was using to generate TSVs for Limn. Maybe it wasn't
the right user to be using, I'm not sure, but one way or another we're
going to need new credentials in order to keep our dashboards working. If
someone has the credentials we should be using, I would be grateful to
receive them through whatever medium is appropriate :)
Thanks!
OK, so we should be all set to set up the log cleanup on stat1002, Andrew.
Toby: Can you reply to the RT ticket giving the green light to do the
cleanup? (#8760)
Thanks
Nuria
On Wed, Nov 5, 2014 at 7:53 AM, Aaron Halfaker <ahalfaker(a)wikimedia.org>
wrote:
> I didn't find that obvious. Thanks Dario. I agree that a 90 day data
> retention window would be long enough to deal with any corrupt/invalid
> data.
>
>
> On Tue, Nov 4, 2014 at 11:12 PM, Dario Taraborelli <
> dtaraborelli(a)wikimedia.org> wrote:
>
>> On Nov 4, 2014, at 2:08 PM, Nuria <nuria(a)wikimedia.org> wrote:
>>
>>
>> No, database records are not affected. It should not impact your work
>> with EL in any way as your findings come from the records in the database.
>> The logs are used mainly for operational purposes by the dev team as
>> maintainers of the system.
>>
>>
>> this obviously means that any missing, corrupted or invalid data from the
>> log DB can only be recovered within the data retention window, which seems
>> reasonable to me.
>>
>> On Nov 4, 2014, at 11:55 AM, Aaron Halfaker <ahalfaker(a)wikimedia.org>
>> wrote:
>>
>> Hey guys,
>>
>> Sorry for the late response, but I'm still not sure what lives in
>> */a/eventlogging/archive/**
>>
>> Will deleting from there affect what logs we have stored in the DB? Is
>> this an intermediate log storage place, a canonical one, etc.?
>>
>> What will we no longer be able to do after it is pruned?
>>
>> -Aaron
>>
>> On Thu, Oct 30, 2014 at 2:35 PM, Nuria Ruiz <nuria(a)wikimedia.org> wrote:
>>
>>> >Also, I'm not clear on the significance of the EL archive directory.
>>> Can you remind me/direct me to documentation?
>>> Well, the logs just record the incoming pipeline of events, we have used
>>> them to troubleshoot operational issues in the past but the bulk of data
>>> analysis in EL happens from data stored on database.
>>>
>>> Some info here:
>>> https://wikitech.wikimedia.org/wiki/EventLogging#Data_storage
>>>
>>>
>>> On Thu, Oct 30, 2014 at 12:22 PM, Aaron Halfaker <
>>> ahalfaker(a)wikimedia.org> wrote:
>>>
>>>> Nuria, can you specify which logs will be trimmed.
>>>>
>>>> Also, I'm not clear on the significance of the EL archive directory.
>>>> Can you remind me/direct me to documentation?
>>>>
>>>> On Thu, Oct 30, 2014 at 12:36 PM, Nuria Ruiz <nuria(a)wikimedia.org>
>>>> wrote:
>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> To comply with our privacy policy we are going to purge logs in 1002
>>>>> that are older than 90 days. Please let us know whether this is an issue.
>>>>> We hope to have these changes done by the end of next week.
>>>>>
>>>>> A concrete example:
>>>>>
>>>>> Logs in, for example, the eventlogging archiving directory:
>>>>>
>>>>> @stat1002:/a/eventlogging/archive$
>>>>>
>>>>>
>>>>> will be restricted to the last 90 days.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Nuria
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Analytics mailing list
>>>>> Analytics(a)lists.wikimedia.org
>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
On October 30, updates were made to how metrics are calculated for the
Vital Signs dashboards [1]. The result is an apparent jump up or down on
some of the metrics starting October 30th. This is because we did not
update the existing historical data. We are planning on recalculating all
the historical data after more work is done on the backend over the next
month.
Here are the changes on the metrics:
- Namespace Edits and Pages Created now include pages in all namespaces and
pages that have been deleted. The plots for these metrics generally show a
step up. On wikis where most of the activity occurs outside namespace '0'
(like Meta and Commons), you can see a dramatic difference in
the data [2].
- We exclude bots in Rolling Active Editor, Rolling Surviving New Active
Editor and Rolling Recurring Old Active Editor. The plots for these
metrics show a small step down. This change is not as conspicuous as the
previous one.
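As a toy illustration of why the first change shows up as a step up, here is a minimal sketch. It assumes the earlier calculation was limited to namespace 0, as the note above suggests; the function name `count_edits` and the sample list are made up for illustration and are not taken from the dashboard code or its data.

```python
def count_edits(edit_namespaces, namespaces=None):
    """Count edits, optionally restricted to the given namespace ids.

    `edit_namespaces` holds one namespace id per edit (hypothetical input).
    """
    if namespaces is None:
        # New behaviour: every namespace counts.
        return len(edit_namespaces)
    allowed = set(namespaces)
    return sum(1 for ns in edit_namespaces if ns in allowed)


# Made-up sample: two edits in namespace 0, four in namespace 6 (File).
sample = [0, 6, 6, 6, 0, 6]

main_only = count_edits(sample, namespaces=[0])  # 2
all_namespaces = count_edits(sample)             # 6

if __name__ == "__main__":
    print(main_only, all_namespaces)
```

On a wiki like the made-up one here, where most edits fall outside namespace 0, the same day of activity yields a much larger number under the new definition.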
If you were relying on this data right now, and need it consistent across
time, please speak up.
[1] https://metrics.wmflabs.org/static/public/dash/
[2]
https://metrics.wmflabs.org/static/public/dash/#projects=commonswiki,metawi…
Hello,
To comply with our privacy policy we are going to purge logs in 1002 that
are older than 90 days. Please let us know whether this is an issue. We
hope to have these changes done by the end of next week.
A concrete example:
Logs in, for example, the eventlogging archiving directory:
@stat1002:/a/eventlogging/archive$
will be restricted to the last 90 days.
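For anyone who wants to see what would fall outside the window, here is a minimal dry-run sketch. It assumes that a file's modification time is a fair proxy for the age of the log it holds, which may not match how the actual cleanup is implemented; the helper name `files_older_than` is hypothetical, and nothing is deleted.

```python
import time
from pathlib import Path

RETENTION_DAYS = 90


def files_older_than(directory, days=RETENTION_DAYS, now=None):
    """Return files directly under `directory` last modified more than `days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Dry run: list candidates only; deletion is left out on purpose.
    archive = Path("/a/eventlogging/archive")
    if archive.is_dir():
        for path in files_older_than(archive):
            print(path)
```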
Thanks,
Nuria
Could we temporarily moderate Aileen, please? This is getting somewhat
ridiculous and cluttering the archives (and my inbox) with automated dross.
(Mandatory pause while I wait for Aileen's autoresponder to prove my point)
--
Oliver Keyes
Research Analyst
Wikimedia Foundation
Hi,
Google Code-In (GCI) will soon take place again - a contest for 13-17
year old students to contribute to free software projects.
Wikimedia wants to take part again.
Last year's GCI results were surprisingly good - see
https://www.mediawiki.org/wiki/Google_Code-in_2013
We need your help:
1) Go to
https://www.mediawiki.org/wiki/Google_Code-in_2014#Mentors.27_corner and
read the information there. If something is unclear, ask!
2) Add yourself to the table of mentors on
https://www.mediawiki.org/wiki/Google_Code-in_2014#Contacting_Wikimedia_men…
- the more mentors are listed the better our chances are that Google
accepts us.
3) Please take ten minutes and go through open recent tickets in
https://bugzilla.wikimedia.org in your area of interest. If you see
self-contained, non-controversial issues with a clear approach which you
can recommend to new developers and would mentor: Add the task to
https://www.mediawiki.org/wiki/Google_Code-in_2014#Proposed_tasks
By Sunday November 12th, we need at least five tasks from each of
these categories (plus some less technical beginner tasks as well):
* Code: Tasks related to writing or refactoring code
* Documentation/Training: Tasks related to creating/editing documents
and helping others learn more - no translation tasks
* Outreach/research: Tasks related to community management,
outreach/marketing, or studying problems and recommending solutions
* Quality Assurance: Tasks related to testing and ensuring code is of
high quality
* User Interface: Tasks related to user experience research or user
interface design and interaction
Google wants every organization to have 100+ tasks available on December
1st. Last year, we had 273 tasks in the end.
Note that you could also create rather generic tasks, for example fixing
two interface messages from the list of dependencies of
https://bugzilla.wikimedia.org/show_bug.cgi?id=38638
Helpful Bugzilla links:
* Reports that were proposed for GCI last year and are still open:
https://bugzilla.wikimedia.org/buglist.cgi?quicksearch=ALL%20whiteboard%3Ag…
* Open Analytics tickets created in the last six months (if I got your
products and components right):
https://bugzilla.wikimedia.org/buglist.cgi?bug_status=UNCONFIRMED&bug_statu…
* 3 existing Analytics "easy" tickets (are they still valid? Are they
really self-contained, non-controversial issues with a clear approach?
Could some of them be GCI tasks that you would mentor? If so, please tag
them as described above!):
https://bugzilla.wikimedia.org/buglist.cgi?bug_status=UNCONFIRMED&bug_statu…
Could you imagine mentoring some of these tasks?
Thank you for your help in reaching out to new contributors and making
GCI a success again! Please ask if you have questions.
Cheers,
andre
PS: And in a future Phabricator world, Bugzilla tickets with the 'easy'
keyword will become Phabricator tasks with the 'easy' project.
--
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/
Hi,
just a quick heads-up that the s1 replication lag on
analytics-store.eqiad.wmnet (i.e. dbstore1002) is currently at
15 hours and increasing.
I filed RT ticket 8792:
https://rt.wikimedia.org/Ticket/Display.html?id=8792
If you need to query s1, you can use s1-analytics-slave as an up-to-date
replica until the other replica is fixed.
Best regards,
Christian
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian(a)quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
---------------------------------------------------------------
Forwarding to Research and Analytics for discussion.
Pine
On Oct 30, 2014 8:13 PM, "MZMcBride" <z(a)mzmcbride.com> wrote:
> Hi.
>
> Splitting this out from the GLAMs/Chapters thread, I continue to regularly
> wonder whether we need stricter guidelines and guidance in the area of
> experimenting on Wikimedians.
>
> Erik mentioned trying to further implement A/B testing in software
> development, but to me that quickly raises consent and trust issues. My
> view is that Wikimedians should be treated as colleagues, not customers.
>
> Of course the stark reality is that A/B testing on users (typically
> readers, not editors) during the annual Wikimedia Foundation fundraiser
> has been a major component of the Wikimedia Foundation's growth.
>
> Worth repeating, from <https://meta.wikimedia.org/wiki/Experiments>:
>
> ---
> Current practices in Web analytics reflect their commercial origins. For
> better or worse, the greatest motor behind the use of Web analytics has
> been the profit interests of online retailers and social networks, for
> whom the user is a commodity. These profit interests have profoundly
> shaped the discourse of Web analytics, setting both the tenor and the tone
> of debate (consider the values implicit in "funnels," a term of art).
>
> A thoughtless application of Web analytics to Wikimedia wikis would import
> a moral outlook that is incompatible with (and, indeed, rightfully
> offensive to) its community. It also wouldn't work well, because neither
> Wikimedia wikis nor their editing communities are for sale. It is
> therefore crucial that technical efforts be accompanied by a process of
> reflection, the goal of which should be to articulate criteria for Web
> analytics that express and promote the broader ambitions of the Wikimedia
> movement and the moral commitments that underlie it.
> ---
>
> I think this about sums it up better than I ever could.
>
> MZMcBride
>
>
>
> _______________________________________________
> Wikimedia-l mailing list, guidelines at:
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
> Wikimedia-l(a)lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> <mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>