Hi,
while taking some jobs off of Hadoop, I realized that some of our Pig
scripts rely on geoip-1.2.9-patch-2-SNAPSHOT.jar, although only the
patch-1 variant is provided in Maven Central [1] and on our Nexus [2].
We also do not have a publicly visible fork of geoip in Gerrit [3].
The patch-2 variant only exists in our Hadoop cluster [4].
How do we build that jar?
What extra patches are in there?
Best regards,
Christian
[1] http://search.maven.org/#browse|1125542321
[2] http://nexus.wmflabs.org/nexus/index.html#nexus-search;quick~geoip
[3] https://gerrit.wikimedia.org/r/#/admin/projects/?filter=geo
[4] See /libs/geoip-1.2.9-patch-2-SNAPSHOT.jar in hadoop
--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a Email: christian(a)quelltextlich.at
4040 Linz, Austria Phone: +43 732 / 26 95 63
Fax: +43 732 / 26 95 63
Homepage: http://quelltextlich.at/
---------------------------------------------------------------
Heya,
We would like to turn off the production version of UMAPI that is running
on stat1001 and accessible through metrics.wikimedia.org.
As far as I know, everybody who uses UMAPI on stat1001 can also access the
instance on stat1 (using the SSH tunnel).
If you are using UMAPI on metrics.wikimedia.org and do not have rights to
access UMAPI on stat1 then please let me know. We plan to switch it off by
the end of this week.
D
Hi,
when doing some basic sanity checks between the output of the existing
zero_country and zero_carrier Pig scripts, it seems that the sum of
the number of requests of the output of zero_country per day is ~40k
larger than for zero_carrier.
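A minimal sketch of such a per-day comparison, assuming the output of
each script has already been loaded as (day, key, requests) rows. The
row layout, the function names and the numbers are made up for
illustration and are not taken from the actual Pig output:

```python
from collections import defaultdict


def daily_totals(rows):
    """Sum the request counts per day for (day, key, requests) rows."""
    totals = defaultdict(int)
    for day, _key, requests in rows:
        totals[day] += requests
    return dict(totals)


def daily_differences(country_rows, carrier_rows):
    """Return per-day (zero_country total - zero_carrier total)."""
    country = daily_totals(country_rows)
    carrier = daily_totals(carrier_rows)
    days = sorted(set(country) | set(carrier))
    return {day: country.get(day, 0) - carrier.get(day, 0) for day in days}


# Made-up sample rows; real rows would come from the scripts' output.
country_rows = [("2013-08-01", "XX", 300), ("2013-08-01", "YY", 200)]
carrier_rows = [("2013-08-01", "carrier-a", 250), ("2013-08-01", "carrier-b", 210)]

print(daily_differences(country_rows, carrier_rows))
```

A positive value for a day means zero_country counted more requests
than zero_carrier on that day.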
First, I've been told that the sum of the number of requests has to
match.
Afterwards, I've been told that this is ok, as zero_country should
hold all of the mobile requests from a country, and zero_carrier is a
drill-down on the specific carriers.
When reading the Pig scripts/Java code, it is obvious that the first
explanation does not match the code. The scripts take completely
different paths through our code base and count completely different
things :-(
However, the latter explanation does not make much sense to me either,
as it's hard to believe that the requests from our zero partners make
up >90% of each country's mobile requests. Besides, this explanation
would not match how we generate the raw log files.
Whom could I ask about what the desired semantics of
zero_{carrier,country} are?
Best regards,
Christian
Greetings,
I am James Hare, president of the Washington, DC chapter. At Wikimania I have been learning about the editor retention data the Wikimedia Foundation has been collecting and analyzing. I was discussing it with Ryan Kaldari and he noted that while the data was available at the national level, it was not yet available at the state level.
How difficult would it be to implement state-level analysis? Would it just be a matter of simply changing the geolocation lookup code, or would it be a very expensive change that would benefit relatively few people? For Wikimedia DC's sake I am interested in data for the District of Columbia, Maryland, Delaware, Virginia, and West Virginia (our defined chapter region).
Regards,
James Hare
Hi, at Wikimania we decided to have this key discussion about technical
community metrics at wikitech-l, since this is where our tech community
can be found. Forwarding here as a heads up.
-------- Original Message --------
Subject: Key community metrics to influence our plans
Date: Mon, 12 Aug 2013 19:21:06 +0200
From: Quim Gil <qgil(a)wikimedia.org>
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
你好, 近排點呀? (Hello, how have you been lately?) [0]
At Wikimania there was an improvised lunch-meeting about community
metrics with Jesús González-Barahona (Bitergia), Sumana, RobLa and
myself. The main conclusion was that http://korma.wmflabs.org needs to
show a small set of key metrics that could drive decisions affecting our
strategy and resources.
Let's agree first on key factors to watch, in the scope of projects
deployed in Wikimedia servers:
* Are the teams more efficient processing contributions?
* Is the share of non-WMF contributions growing?
* Are WMF and non-WMF contributions treated equally?
* Are the attraction and retention of new contributors improving?
* Are we improving the sustainability of our community?
If those are the key factors, these could be the key metrics:
# Who is contributing merged code each quarter? More origins == Better.
## WMF / WMDE / other Wikimedia / companies / OSS projects / independents
## Location of contributors (based on provided data)
# Time to resolve Gerrit changes. Shorter == Better.
## Authored by WMF/WMDE employees vs independent. Should be the same.
## Merged vs rejected. Similar progress?
## Best projects vs bottlenecks.
# Queue of open Gerrit change requests in relation to total amount of
contributions. Shorter == Better.
## Same points as above.
# New contributors with 1 / 2-5 / 6+ changes submitted in the past 3
months. More == Better.
## % of merged / rejected.
## Who is growing / stable / vanishing?
# Age of contributors since they started in the project. Small and
regular decline == Better.
## Number of contributions from each age group in the past quarters.
## WMF / non-WMF ratio for each age group.
Please have your say. I will be updating
http://www.mediawiki.org/wiki/Community_metrics#Key_metrics following
the discussion.
[0] http://wikitravel.org/en/Cantonese_phrasebook#Phrase_list ;)
--
Quim Gil
Technical Contributor Coordinator @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
Heya,
One of the questions I want to answer collaboratively in the coming
weeks is how Limn fits into the overall product offering of the
Analytics team, and which features it should and should not support.
To kickstart this discussion I want to propose some boundaries to Limn's
features just to gauge response and get feedback on what people think Limn
should be. So here we go (remember this is for discussion purposes ATM)
Please have a look at
http://www.mediawiki.org/wiki/Analytics/Limn/Roadmap and add your
suggestions for features that Limn should support but more
importantly tell us what you think Limn should *not* do!
Thx!
D
Hi all,
Wikistats always used a fixed list of namespaces for which
article/edit/editor counts and other metrics should be collected.
This list was pretty short: always ns 0, for commons also 6 and 14 (binaries
and categories), for wikisource 102,104,106, for strategy 106.
As more and more projects create extra namespaces after a community vote,
Wikistats will from now on use the API to determine which namespaces to
include, in order to get and stay in sync.
Countable namespaces are flagged in the api response with: content=""
Example:
api.php?action=query&meta=siteinfo&siprop=namespaces
http://tinyurl.com/mes5l4a
102: cookbooks
110: wikijunior
Here is the latest list of wikis with extra namespaces
http://tinyurl.com/n8thnfl (csv file)
( wb=wikibooks, wk=wiktionary, wn=wikinews, wo=wikivoyage, wp=wikipedia,
wq=wikiquote, ws=wikisource, wv=wikiversity )
Note that namespaces which were always counted will still be included even
when the api does not report them.
See e.g. http://tinyurl.com/mjc55bu (commons)
Here only ns 0 is flagged as content namespace, but we will still include ns
6 and 14.
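The selection rule described above could be sketched roughly as
follows. This is an illustration only, not the actual Wikistats code:
it assumes the siteinfo response was requested with format=json and
already parsed into a dict, and the project keys, function names and
sample response are hypothetical.

```python
# Namespaces that were always counted, per the fixed list above.
# The project keys are hypothetical labels, not Wikistats identifiers.
LEGACY_NAMESPACES = {
    "commons": {6, 14},
    "wikisource": {102, 104, 106},
    "strategy": {106},
}


def content_namespaces(siteinfo):
    """Return ids of namespaces flagged as content in a parsed response."""
    ids = set()
    for ns in siteinfo["query"]["namespaces"].values():
        # The flag is an empty "content" attribute; a boolean False
        # (as some JSON output variants use) means "not content".
        if "content" in ns and ns["content"] is not False:
            ids.add(int(ns["id"]))
    return ids


def countable_namespaces(siteinfo, project):
    """Union of ns 0, the always-counted list and the API-flagged ones."""
    legacy = LEGACY_NAMESPACES.get(project, set())
    return sorted({0} | legacy | content_namespaces(siteinfo))


# Made-up sample responses, loosely modelled on the examples above.
wikibooks_sample = {"query": {"namespaces": {
    "0": {"id": 0, "*": "", "content": ""},
    "1": {"id": 1, "*": "Talk"},
    "102": {"id": 102, "*": "Cookbook", "content": ""},
    "110": {"id": 110, "*": "Wikijunior", "content": ""},
}}}
commons_sample = {"query": {"namespaces": {
    "0": {"id": 0, "*": "", "content": ""},
    "6": {"id": 6, "*": "File"},
    "14": {"id": 14, "*": "Category"},
}}}

print(countable_namespaces(wikibooks_sample, "wikibooks"))
print(countable_namespaces(commons_sample, "commons"))
```

In the commons sample only ns 0 is flagged, but 6 and 14 are still
included because they are on the always-counted list.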
If you see namespaces which are missing, please flag them wherever that
should be done, so they will be picked up on next wikistats run.
Thanks,
Erik Zachte
Hi all,
I'm one of the people working on the Wikimedia tech community metrics
[1] that Quim announced some days ago.
I'm attending WikiSym+OpenSym, in Hong Kong, from Aug 4 to Aug 6, just
the days before Wikimania. If any of you happen to be there, and want to
chat about software development metrics, how they are being implemented
for Wikimedia, or just have a coffee while talking about counting
commits and estimating time-to-fix, please drop me a line.
Saludos,
Jesus.
[1] http://korma.wmflabs.org/ (still pre-beta)
--
Bitergia: http://bitergia.com http://blog.bitergia.com