Hello,
We often have the case of a change to an extension depending on a
pending patch to MediaWiki core. I upgraded our CI scheduler - Zuul - a
couple of weeks ago, and it now supports marking dependencies even
across different repositories.
Why does it matter? To make sure the dependency is fulfilled, one
usually either:
* CR-2s the patch until the change it depends on is merged, or
* writes a test that exercises the required patch in MediaWiki.
With the first solution (no test), once both are merged, nothing
prevents one from cherry-picking a patch without its dependency - for
example for MediaWiki minor releases or Wikimedia deployment branches.
When a test covers the dependency, it fails until the required patch is
merged, which is rather annoying.
Zuul now recognizes the 'Depends-On' header in git commit messages,
similar to 'Change-Id' and 'Bug'. 'Depends-On' takes a change-id as its
parameter, and multiple such headers can be added.
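For example, a commit message footer with two dependencies could look
like this (the change-ids below are made up):

    Bug: T12345
    Change-Id: I6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b
    Depends-On: I0f1e2d3c4b5a69788f9a0b1c2d3e4f5a6b7c8d9e
    Depends-On: Iabcdef0123456789abcdef0123456789abcdef01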
When a patch is proposed in Gerrit, Zuul looks for Gerrit changes
matching the 'Depends-On' headers and verifies whether any are still
open. If so, it crafts git references for the open patches so that all
the dependencies can be tested as if they had been merged.
Real world example
------------------
The ContentTranslation extension is tested together with the Wikidata
one and was not passing the tests. Wikidata created a patch, and we did
not want to merge it until we had confirmed that the ContentTranslation
tests passed properly.
The Wikidata patch is https://gerrit.wikimedia.org/r/#/c/252227/
Change-Id: I0312c23628d706deb507b5534b868480945b6163
On ContentTranslation we indicated the dependency:
https://gerrit.wikimedia.org/r/#/c/252172/1..2//COMMIT_MSG
+ Depends-On: I0312c23628d706deb507b5534b868480945b6163
which is the change-id of the Wikidata patch.
Zuul:
* received the patch for ContentTranslation
* looked up the change-id and found the Wikidata patch
* created git references in both repos pointing to the proper patches
Jenkins:
* had zuul-cloner clone both repos and fetch the references created by
the Zuul service
* ran the tests
* SUCCESS
That confirmed to us that the Wikidata patch actually fixed the issue
for ContentTranslation. Hence we CR+2'd both and everything merged fine.
Please take a moment to read the upstream documentation:
http://docs.openstack.org/infra/zuul/gating.html#cross-repository-dependenc…
Wikidata/ContentTranslation task:
https://phabricator.wikimedia.org/T118263
--
Antoine "hashar" Musso
Dear Wikipedia data/API user,
The WMF’s Engineering, Product and Partnerships teams are conducting a
short survey to help us understand how organizations are pulling and using
data from our projects. This information will inform future features and
improvements to our data tools and APIs.
We would appreciate a few minutes of your time. The survey link below
will take you to a Google Form - there is no need to sign up to fill
out the survey, and it should take no more than 10 minutes.
https://docs.google.com/forms/d/1yUrHzyLABN419RCDbzepjoRWCbaWYV4wbtbKPa95C4…
Thank you for your input and feedback!
Warm wishes,
Sylvia
PS: Apologies for the cross-posting; you might see this note on a couple
of other lists.
--
Sylvia Ventura | Strategic Partnerships | Wikimedia Foundation | +1 (415)
839 6885 x6788
Hi all!
If you are interested in code quality and you are attending the Dev Summit,
please have a look at <https://phabricator.wikimedia.org/T119032>.
I would appreciate any input on which additional sessions you think are
important, or how you would prioritize and group them. Any session proposed at
<https://phabricator.wikimedia.org/tag/wikimedia-developer-summit-2016/> is fair
game, but it should fit into the "code quality" topic somehow (other proposals
will be discussed elsewhere).
Context:
The ArchCom is currently in the process of sorting through the list of
proposed sessions and trying to prioritize and group them. To this end, we have
identified 5 broad topic areas ("Content Format", "Access and APIs",
"Collaboration", "Software Engineering", and "User Interface" - see T119018 for
an overview).
My job is now to figure out which sessions we want in the "Software Engineering"
(aka "code quality") part of the event. I have started to do this at
<https://phabricator.wikimedia.org/T119032>. If you have any thoughts on how
these sessions should be prioritized or grouped, or what is missing, please comment.
Thanks,
Daniel
Hi,
I am Yeongjin Jang, a Ph.D. Student at Georgia Tech.
In our lab (SSLab, https://sslab.gtisc.gatech.edu/),
we are working on a project called B2BWiki,
which enables users to share the contents of Wikipedia through WebRTC
(peer-to-peer sharing).
The website is here: http://b2bwiki.cc.gatech.edu/
The project aims to help Wikipedia by donating computing resources
from the community: users can donate their traffic (by P2P communication)
and storage (IndexedDB) to reduce the load on Wikipedia's servers.
Larger organizations, e.g. schools or companies that have many local
users, can donate a mirror server, similar to the GNU FTP mirrors,
which can bootstrap peer sharing.
The potential benefits we see are the following:
1) Users can easily donate their resources to the community -
just visit the website.
2) Users can get a performance benefit if a page is loaded from
multiple local peers / a local mirror (page load time gets faster!).
3) Wikipedia can reduce its server workload, network traffic, etc.
4) Local network operators can reduce transit traffic
(e.g. the cost caused by delivering the traffic to the outside).
While we are working on enhancing the implementation,
we would like to ask for the opinions of actual Wikipedia developers.
For example, we want to know whether our direction is correct
(will it actually reduce the load?), or whether there are other concerns
we missed that could prevent this system from working as intended.
We really want to do meaningful work that actually helps run Wikipedia!
Please feel free to give us any suggestions, comments, etc.
If you want to express your opinion privately,
please contact sslab(a)cc.gatech.edu.
Thanks,
--- Appendix ---
I have added some detailed information about B2BWiki below.
# Accessing data
When accessing a page on B2BWiki, the browser queries peers first; the
full fallback order is sketched in the code after this list.
1) If there are peers that hold the content, a peer-to-peer download happens.
2) Otherwise, if there is no peer, the client downloads the content
from the mirror server.
3) If the mirror server does not have the content either, it downloads it
from the Wikipedia server (one access for the first download, plus updates).
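A minimal sketch of that fallback order (Python pseudocode; the three
fetch callbacks are hypothetical stand-ins for the real WebRTC and HTTP
transfers):

    from typing import Callable, Iterable, Optional

    def fetch_page(
        title: str,
        peer_fetchers: Iterable[Callable[[str], Optional[bytes]]],
        mirror_fetch: Callable[[str], Optional[bytes]],
        wikipedia_fetch: Callable[[str], bytes],
    ) -> bytes:
        # 1) Try the peers that the lookup server reports for this page.
        for fetch_from_peer in peer_fetchers:
            content = fetch_from_peer(title)
            if content is not None:
                return content
        # 2) No peer had it: fall back to the mirror server.
        content = mirror_fetch(title)
        if content is not None:
            return content
        # 3) Mirror miss: fetch from Wikipedia (the mirror then caches it).
        return wikipedia_fetch(title)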
# Peer lookup
To enable content lookup, we manage a lookup server that holds a
page_name-to-peer map. A client (a user's browser) can query the list
of peers that currently hold the content and select a peer by freshness
(the map stores a hash/timestamp of the content and the top two octets
of each peer's IP address, to figure out whether a peer is local), etc.
# Update, and integrity check
The mirror server updates its content once per day
(this can be configured to once per hour, etc.).
The update check is done using the If-Modified-Since header against the
Wikipedia server. On retrieving content from Wikipedia, the mirror server
stamps a timestamp and a SHA-1 checksum, to ensure the freshness of the
data and its integrity.
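A sketch of that conditional update check, assuming Python's requests
library (URL handling and scheduling are left out):

    from email.utils import formatdate
    from typing import Optional
    import requests

    def refresh(url: str, last_fetch_unixtime: float) -> Optional[bytes]:
        # Conditional GET: the server replies 304 Not Modified if unchanged.
        headers = {"If-Modified-Since": formatdate(last_fetch_unixtime, usegmt=True)}
        resp = requests.get(url, headers=headers)
        if resp.status_code == 304:
            return None              # cached copy is still current
        resp.raise_for_status()
        return resp.content          # new content to stamp and store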
When a client looks up and downloads content from peers,
it compares the SHA-1 checksum of the data
with the checksum from the lookup server.
In this setting, users may get older data
(they can configure how much staleness to tolerate,
e.g. one day, three days, one week, etc.), and
integrity is guaranteed by the mirror/lookup server.
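The client-side check could look roughly like this (a sketch; the
one-day window is just an example of the configurable tolerance):

    import hashlib
    import time

    MAX_AGE_SECONDS = 24 * 3600   # example: tolerate content up to one day old

    def verify(data: bytes, expected_sha1: str, stamped_unixtime: float) -> bool:
        # Integrity: peer-supplied data must match the lookup server's checksum.
        if hashlib.sha1(data).hexdigest() != expected_sha1:
            return False
        # Freshness: reject content older than the configured tolerance.
        return time.time() - stamped_unixtime <= MAX_AGE_SECONDS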
More detailed information can be obtained from the following website.
http://goo.gl/pSNrjR
(URL redirects to SSLab@gatech website)
Please feel free to give us any suggestions, comments, etc.
Thanks,
--
Yeongjin Jang
Hi, here is a simple game that we ask you to play by the end of next
Monday, December 7.
Please help us define the must-have sessions at the Wikimedia Developer
Summit.
Go to https://phabricator.wikimedia.org/tag/wikimedia-developer-summit-2016/
and select your must-have sessions, especially in the areas in which you
are directly involved (scroll to the right to see all columns!!).
Then, go to https://phabricator.wikimedia.org/T119593 and post your
recommendations in a comment. Just like with code review and +1, please
don't recommend your own sessions.
The Architecture Committee, WMF Engineering management, and the Summit
organizers will send their recommendations as +2.
We have 21 slots of 80 minutes for pre-scheduled sessions. If the total of
must-have candidates is 21 or fewer, then great. If it is more, we will have
to make hard calls. Or not so hard, because the rest of the sessions will
still have time and space, only in unconference mode.
https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit_2016#Program
--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
Hi,
I created this ticket: https://phabricator.wikimedia.org/T119878
The basic idea is that it shouldn't be a big problem to compress the
output of the api.php script using some widely available library, like
gzip.
That way the amount of data exchanged between client and server would be
much smaller, and users with slow internet connections might benefit. I
am not sure how much the data would be reduced, but the savings could be
significant in some cases.
Note that I am not proposing a breaking change, rather just an optional
"compression" parameter that could be passed with API requests.
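For illustration, HTTP already provides a generic negotiation mechanism
for response compression; a minimal client-side sketch in Python (using
the requests library; the endpoint URL is just an example):

    import requests

    # Ask the server to gzip the response body via standard HTTP content
    # negotiation. requests decompresses the body transparently.
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={"action": "query", "meta": "siteinfo", "format": "json"},
        headers={"Accept-Encoding": "gzip"},
    )
    print(resp.headers.get("Content-Encoding"))  # "gzip" if the server compressed
    print(len(resp.content))                     # size after decompression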
Hi all,
here is the weekly look at our most important readership metrics (apologies
for the delay). Apart from the usual data, this time there is an additional
chart to illuminate how our mobile readership ratio has developed since
this spring, the iOS app retention stats are back after Apple fixed their
data, and we conclude with some inspiring quotes about climate change
awareness ;)
As laid out earlier
<https://lists.wikimedia.org/pipermail/mobile-l/2015-September/009773.html>,
the main purpose of this report is to raise awareness of how these
metrics are developing, call out the impact of any unusual events in the
preceding week, and facilitate thinking about core metrics in general. We
are still iterating on the presentation and eventually want to create
dashboards for those metrics which are not already available in that
form. Feedback and discussion welcome.
Now to the usual data. (All numbers below are averages for November 16-22,
2015 unless otherwise noted.)
Pageviews
Total: 540 million/day (-0.0% from the previous week)
Context (April 2015-November 2015):
( see also the Vital Signs dashboard
<https://vital-signs.wmflabs.org/#projects=all/metrics=Pageviews>)
The Analytics team improved web crawler detection further last week
<https://meta.wikimedia.org/w/index.php?title=Dashiki%3APageviewsAnnotations…>,
meaning an “optical” (as opposed to real) drop in human pageviews from
November 19 on - though presumably smaller than the one for September
that we reported in the preceding report.
Desktop: 57.2% (previous week: 57.5%)
Mobile web: 41.6% (previous week: 41.3%)
Apps: 1.2% (previous week: 1.2%)
Context (April 2015-November 2015):
These percentages usually don’t change rapidly from week to week. For a
wider perspective, I’m including a chart of the (aggregate) mobile
percentage this time, too. Technically this information is already
contained in the usual chart above. But here we can see even clearer
indications for an impact of the HTTPS-only switchover during June (it
appears to have taken out desktop traffic mainly), as well as the strong
weekly periodicity (higher mobile ratio on weekends). It looks like mobile
won’t overtake desktop anytime soon.
Global North ratio: 77.3% of total pageviews (previous week: 77.6%)
Context (April 2015-November 2015):
New app installations
Android: 30.9k/day (-44.2% from the previous week)
Daily installs per device, from Google Play
Context (last month):
As described in the previous report, the Android Wikipedia app was featured
in the "New
and Updated Apps" section of the Google Play store from November 5-12, and
while the huge overall positive impact on download numbers is obvious,
downloads also decreased markedly afterwards. They seem to be coming back
up a bit now, but we are still waiting for some more data before making a
final estimate of the overall effect, and we have also contacted Google to
see if they can help us illuminate the mechanism behind this apparent effect.
iOS: 4.69k/day (+2.2% from the previous week)
Download numbers from App Annie
Context (last three months):
No news here.
App user retention
Android: 14.8% (previous week: 15.2%)
(Ratio of app installs opened again 7 days after installation, among all
installed during the previous week. 1:100 sample)
Context (last three months):
iOS: 12.0% (previous week: 11.9%)
(Ratio of app installs opened again 7 days after installation, among all
installed during the previous week. From iTunes Connect, opt-in only = ca.
20-30% of all users)
Context (installation dates from October 18-November 15, 2015):
This metric was left out of last week’s report because of inconsistencies.
Indeed, Apple has since issued a correction notice
<http://www.talkingnewmedia.com/2015/11/24/apple-issues-corrected-itunes-con…>.
Unfortunately it looks like the data underlying the report for the week
until November 8 was affected too, so please disregard the iOS retention
figure given in that report.
Unique app users
Android: 1.190 million / day (-2.2% from the previous week)
Context (last three months):
This too will need another look.
iOS: 281k / day (+0.1% from the previous week)
Context (last three months):
No news here.
After publishing this report regularly for a bit over two months, we may
rethink the weekly publication schedule a little - also to keep the
balance between newsworthiness and maintaining general awareness of
long-term developments. In that vein, some inspiring quotes about a weekly
climate change newsletter
<http://www.niemanlab.org/2015/11/climate-change-is-depressing-and-horrible-…>
that begins every issue by reciting the current CO2 concentration in the
atmosphere as a KPI ;)
Ultimately, Meyer said, the newsletter comes out of the idea that “if
you’re worried about something, you should pay regular attention to it.”
“By paying attention to it over time, and watching its texture change over
time, you will come to have ideas about it,” he said. “You will come to
understand it in a new way, and you will contribute in a very small way to
how society addresses this big problem.”
[...]
So it seemed as if a newsletter might be a good way to cover the issue.
[...] “You can get a continuity of storyline,” Meyer said. “You can’t cover
all of everything that’s happening every week in the climate, but you can
watch certain parts develop, and hopefully bring people in over time.” He
leads off the “Macro Trends” section of each issue with the molecules per
million of carbon dioxide in the atmosphere:
The atmosphere is filling with greenhouse gases. The Mauna Loa Observatory
measured an average of 398.51 CO2 molecules per million in the atmosphere
this week. A year ago, it measured 395.84 ppm. Ten years ago, it measured
376.93 ppm.
“What we’re doing now won’t show up in that number for a decade or so,” he
said. “But by reminding myself of it every week, and thinking about its
contours and its direction, that’s a way to stay focused on what matters.”
----
For reference, the queries and source links used are listed below (access
is needed for each). Most of the above charts are available on Commons, too
<https://commons.wikimedia.org/w/index.php?title=Special:ListFiles&offset=20…>
.
hive (wmf)> SELECT SUM(view_count)/7000000 AS avg_daily_views_millions FROM
wmf.projectview_hourly WHERE agent_type = 'user' AND
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) BETWEEN "2015-11-16"
AND "2015-11-22";
hive (wmf)> SELECT year, month, day,
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) as date,
sum(IF(access_method <> 'desktop', view_count, null)) AS mobileviews,
SUM(view_count) AS allviews FROM wmf.projectview_hourly WHERE year=2015 AND
agent_type = 'user' GROUP BY year, month, day ORDER BY year, month, day
LIMIT 1000;
hive (wmf)> SELECT access_method, SUM(view_count)/7 FROM
wmf.projectview_hourly WHERE agent_type = 'user' AND
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) BETWEEN "2015-11-16"
AND "2015-11-22" GROUP BY access_method;
hive (wmf)> SELECT SUM(IF (FIND_IN_SET(country_code,
'AD,AL,AT,AX,BA,BE,BG,CH,CY,CZ,DE,DK,EE,ES,FI,FO,FR,FX,GB,GG,GI,GL,GR,HR,HU,IE,IL,IM,IS,IT,JE,LI,LU,LV,MC,MD,ME,MK,MT,NL,NO,PL,PT,RO,RS,RU,SE,SI,SJ,SK,SM,TR,VA,AU,CA,HK,MO,NZ,JP,SG,KR,TW,US')
> 0, view_count, 0))/SUM(view_count) FROM wmf.projectview_hourly WHERE
agent_type = 'user' AND
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) BETWEEN "2015-11-16"
AND "2015-11-22";
hive (wmf)> SELECT year, month, day,
CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")), SUM(view_count) AS
all, SUM(IF (FIND_IN_SET(country_code,
'AD,AL,AT,AX,BA,BE,BG,CH,CY,CZ,DE,DK,EE,ES,FI,FO,FR,FX,GB,GG,GI,GL,GR,HR,HU,IE,IL,IM,IS,IT,JE,LI,LU,LV,MC,MD,ME,MK,MT,NL,NO,PL,PT,RO,RS,RU,SE,SI,SJ,SK,SM,TR,VA,AU,CA,HK,MO,NZ,JP,SG,KR,TW,US')
> 0, view_count, 0)) AS Global_North_views FROM wmf.projectview_hourly
WHERE year = 2015 AND agent_type='user' GROUP BY year, month, day ORDER BY
year, month, day LIMIT 1000;
https://console.developers.google.com/storage/browser/pubsite_prod_rev_0281…
(“overview”)
https://www.appannie.com/dashboard/252257/item/324715238/downloads/?breakdo…
(select “Total”)
SELECT LEFT(timestamp, 8) AS date, SUM(IF(event_appInstallAgeDays = 0, 1,
0)) AS day0_active, SUM(IF(event_appInstallAgeDays = 7, 1, 0)) AS
day7_active FROM log.MobileWikiAppDailyStats_12637385 WHERE timestamp LIKE
'201511%' AND userAgent LIKE '%-r-%' AND userAgent NOT LIKE '%Googlebot%'
GROUP BY date ORDER BY DATE;
(with the retention rate calculated as day7_active divided by day0_active
from seven days earlier, of course)
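In other words (a tiny Python sketch with made-up numbers, not real data):

    from datetime import date, timedelta

    # day0_active and day7_active keyed by date, as produced by the query above
    day0_active = {date(2015, 11, 9): 21000}          # made-up numbers
    day7_active = {date(2015, 11, 16): 3100}          # made-up numbers

    def retention(d: date) -> float:
        # Installs opened on day 7, relative to installs from seven days earlier.
        return day7_active[d] / day0_active[d - timedelta(days=7)]

    print(f"{retention(date(2015, 11, 16)):.1%}")     # prints "14.8%"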
https://analytics.itunes.apple.com/#/retention?app=324715238
hive (wmf)> SELECT SUM(IF(platform = 'Android',unique_count,0))/7 AS
avg_Android_DAU_last_week, SUM(IF(platform = 'iOS',unique_count,0))/7 AS
avg_iOS_DAU_last_week FROM wmf.mobile_apps_uniques_daily WHERE
CONCAT(year,LPAD(month,2,"0"),LPAD(day,2,"0")) BETWEEN 20151116 AND
20151122;
hive (wmf)> SELECT CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0"))
as date, unique_count AS Android_DAU FROM wmf.mobile_apps_uniques_daily
WHERE platform = 'Android';
hive (wmf)> SELECT CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0"))
as date, unique_count AS iOS_DAU FROM wmf.mobile_apps_uniques_daily WHERE
platform = 'iOS';
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
Hi folks,
This week's RFC review meeting is scheduled for Wednesday, November 25
at 2pm PST (22:00 UTC). Event particulars can be found at
<https://phabricator.wikimedia.org/E92>
The main task this week is to decide what we will define as the minimum
PHP version for MediaWiki 1.27 (the next LTS version). The
viable choices seem to be:
* PHP 5.3 (the status quo) - this version is no longer supported
upstream, and doesn't have widespread support even in conservatively
updated Linux distros.
* PHP 5.4 - this version is no longer supported by The PHP Group, but
is still part of older supported Linux distros (e.g. Debian Wheezy)
* PHP 5.5 - this is the lowest version with reliable LTS support in
major Linux distros
The RFC additionally stipulates some coding standards, since even
though upgrading our version of PHP would make the use of some new
features possible, that doesn't automatically make their use a good
idea. The author broke the feature set up into "encouraged", "tolerated"
and "verboten" categories. Please read the RFC directly for more info on this:
<https://phabricator.wikimedia.org/T118932>
Please comment on T118932 if you have further thoughts to share and/or
please attend the meeting on Wednesday.
Thanks
Rob
https://phabricator.wikimedia.org/T119779
The Graph extension generates different HTML output depending on the
isPreview parser option, but if a user previews a page and then saves it
right away without any changes, the parser reuses the previous output. Is
there a way to force the parser to regenerate on save? Thanks!