Hello Wikimedia's system administrators and developers,
I am a PhD student at Institute IMDEA Networks, Spain. I very much
appreciate the data that you have published in the Wikimedia Report Card.
I am writing to kindly request your cooperation in assisting my research
by sharing some of your data without violating anyone's privacy.
My task is to compile a list of Internet Service Providers (ISPs)
around the globe for 2012. One way of doing this is to map the IP
addresses of the Internet users who visit your websites to the
corresponding Autonomous System Numbers (ASNs) of their ISPs. For this
I would need the IP addresses logged by your web servers in 2012;
however, I am not asking for the IP addresses themselves, being well
aware of the privacy concerns of any website operator. Instead, if you
could run a simple bash script on my behalf (which I will send if you
agree) that maps each IP address recorded in 2012 in your database to
its corresponding ASN, and hand over only the resulting ASN dataset to
me, I would be very grateful for your time and cooperation.
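The core of such a script is a longest-prefix lookup from IP address to ASN. A minimal sketch, assuming a local prefix-to-ASN table (in practice this would be built from a 2012 BGP snapshot such as a RouteViews dump; the entries below are purely illustrative, not authoritative):

```python
import ipaddress

# Hypothetical prefix-to-ASN table; real entries would come from a BGP dump.
PREFIX_TO_ASN = {
    "208.80.152.0/22": 14907,  # illustrative entry
    "91.198.174.0/24": 43821,  # illustrative entry
}

def ip_to_asn(ip):
    """Return the ASN of the longest matching prefix, or None if no match."""
    addr = ipaddress.ip_address(ip)
    best_len, best_asn = -1, None
    for prefix, asn in PREFIX_TO_ASN.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_asn = net.prefixlen, asn
    return best_asn
```

Run over a server's request log, only the resulting ASN values would ever leave the operator's machines, never the raw IPs.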
I await your reply.
Thanks & regards
Pradeep Bangera
PhD student
Institute IMDEA Networks
Madrid, Spain
Ph: +34 914 816 986
Hey,
I am wondering if we have IRC bots that can report changes to specific
extensions (both on Gerrit, i.e. when a comment is made or something is
merged, and on Bugzilla). This would be useful for the #wikimedia-wikidata
and #semantic-mediawiki channels, and possibly others as well.
Cheers
--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
https://bugzilla.wikimedia.org/show_bug.cgi?id=28339 has just been sitting
there stale for quite a while. As a Toolserver user, I know that there is
potential for a lot of useful tools. Who do I need to bribe or murder in
order to facilitate this process?
John
Hi,
This morning I noticed the following reviewed but not yet merged code changes in Gerrit:
Parser issue for HTML definition list
Bug 11748: Handle optionally-closed HTML tags without tidy
2012-04-17
Owner: GWicke
Review: +1 by saper
https://bugzilla.wikimedia.org/11748
https://gerrit.wikimedia.org/r/#/c/5174/
(bug 32381) Allow descending order for list=backlinks, list=embeddedin
and list=imageusage
2012-04-30
Owner : Umherirrender
Review: +1 by Aaron Schulz
https://bugzilla.wikimedia.org/32381
https://gerrit.wikimedia.org/r/#/c/6108/
Upgrade cortado-ovt to newer version (seems to work fine locally)
2012-05-05
Owner : Reedy
Review : +1 by awjrichards
https://gerrit.wikimedia.org/r/#/c/6640/
Would it be interesting to generate an automated report detecting code
submissions more than 45 days old that have at least one +1 review but
are still not merged?
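The filtering logic for such a report is simple once the change records are in hand. A sketch of the core check, assuming the records have already been fetched from Gerrit (the dictionary fields below are invented for illustration; a real script would map them from whatever Gerrit's query interface returns):

```python
from datetime import date

def stale_reviews(changes, today, max_age_days=45):
    """Changes older than max_age_days with at least a +1 but not merged."""
    return [c["subject"] for c in changes
            if not c["merged"]
            and c["max_review"] >= 1
            and (today - c["created"]).days > max_age_days]

# Toy records standing in for real Gerrit query results.
changes = [
    {"subject": "Bug 11748: Handle optionally-closed HTML tags without tidy",
     "created": date(2012, 4, 17), "max_review": 1, "merged": False},
    {"subject": "Some freshly merged change",
     "created": date(2012, 6, 1), "max_review": 2, "merged": True},
]
```

Run daily, this would flag the three changes listed above automatically instead of relying on someone spotting them by hand.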
--
Best Regards,
Sébastien Santoro aka Dereckson
http://www.dereckson.be/
> Absolutely not. We have debated the "show notice to broken browsers"
> thing multiple times--and the answer is always "it's annoying as hell
> when sites do it and it's not our place to do so."
>
> The stance on "supporting crappy old browsers" has largely over time
> turned into--continue supporting all browsers with at least 1% of our
> readers (roughly, I don't believe that number's ever been set in stone).
> Once they are less than 1%, continue supporting unless it's a burden
> to do so and/or makes support for newer browsers impossible. And lastly,
> never purposefully break a browser if you can help it.
Chad, in a couple of years when this number does touch 1%, would there be a
notification for users of such browsers beforehand? I expect there might be
some sort of alert, so that the unsuspecting users are aware that the
problem might be the browser and not the website. So you would still have
something annoying, but seen by 1% rather than the 6% today.
My main point of contention is that the number of people using IE7 without
an alternative option is minuscule, and with some awareness the current 6%
figure would fall to 1% sooner rather than later, which is a good thing for
both developers and users. Otherwise, developers must continue spending
effort on backward compatibility for browsers that do not work as they
should, and there will continue to be users of those browsers, because
everything seems to work fine for them. Is it worth sustaining this cycle
longer than it needs to be?
PS: I will admit I am a newbie here and probably missed the most exciting
IE7 threads :) I'd appreciate it if you could link me to the archives
off-list so I can read up on everyone's arguments. My selfish interest in
this is less time spent checking compatibility and more time trying out new
stuff.
--
j.mp/ArunGanesh
Hi everyone,
As we do more frequent deploys, it's going to become critical that we
get database schema changes correct, and that we do so in a way that
gives us time to prepare for said changes and roll back to old
versions of the software should a deploy go poorly. This applies both
to MediaWiki core and to WMF-deployed extensions.
I'd like to propose that we make the following standard practice:
1. All schema changes must go through a period of being optional.
For example, instead of changing the format of a column, create a new
column, write to both the old column and the new column (when the new
one exists), and deprecate use of the old column. Check that the new
column exists before blindly assuming that it does. Only eliminate support
for the old column after it's clear the schema migration has happened
and there's no chance that we'll need to roll back to the old version
of the software.
2. There might be cases where rule #1 will be prohibitive from a
performance perspective. However, schema changes like that should be
rare to begin with, and should have prominent discussion on this list.
In the case where it's impossible to follow rule #1, it is still
critical to write scripts to roll back to the pre-change state.
3. For anything that involves a schema change to the production dbs,
make sure Asher Feldman (afeldman(a)wikimedia.org) is on the reviewer
list. He's already keeping an eye on this stuff the best he can, but
it's going to be easy for him to miss changes in extensions should
they happen.
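Rule #1's dual-write pattern can be sketched in a few lines. This is a minimal illustration against SQLite, with invented table and column names (`user`, `user_name`, `user_name_v2`) and the "new format" shown as uppercase purely for demonstration:

```python
import sqlite3

def column_exists(conn, table, column):
    """True if `column` is present in `table` (SQLite PRAGMA-based check)."""
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return column in cols

def save_user(conn, user_id, name):
    # Always write the old column; write the new one only if the schema
    # migration has already added it, so the change stays optional and
    # the code works before, during, and after the migration.
    if column_exists(conn, "user", "user_name_v2"):
        conn.execute(
            "UPDATE user SET user_name = ?, user_name_v2 = ? WHERE user_id = ?",
            (name, name.upper(), user_id))
    else:
        conn.execute(
            "UPDATE user SET user_name = ? WHERE user_id = ?",
            (name, user_id))
```

Because reads can fall back to the old column at any point, rolling back the software never leaves the database in a state the old code cannot handle.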
I don't have a strong opinion about whether we need to follow rule #1
above through an iteration of our six month tarball release cycle, but
we at least need to follow it through the two week deployment cycle.
Assuming this seems sensible to everyone, I can update this page accordingly:
http://www.mediawiki.org/wiki/Development_policy
(/me desperately tries to avoid yak shaving and updating the policy
above for Git)
Rob
Hi all,
I've been tasked with setting up a local copy of the English
Wikipedia for researchers - sort of like another Toolserver. I'm not
having much luck, and wondered if anyone has done this recently, and
what approach they used? We only really need the current article text
- history and meta pages aren't needed.
Things I have tried:
1) Downloading and mounting the SQL dumps
No good because they don't contain article text
2) Downloading and mounting other SQL "research dumps" (eg
ftp://ftp.rediris.es/mirror/WKP_research)
No good because they're years out of date
3) Using WikiXRay on the enwiki-latest-pages-meta-history?.xml-.....xml files
No good because they decompress to an astronomical size: I got about
halfway through decompressing them and was already over 7 TB.
Also, WikiXRay appears to be old and out of date (although
interestingly its author Felipe Ortega has just committed to the
gitorious repository[1] on Monday for the first time in over a year)
4) Using MWDumper (http://www.mediawiki.org/wiki/Manual:MWDumper)
No good because it's old and out of date: it only supports export
version 0.3, while the current dumps are version 0.6
5) Using importDump.php on a latest-pages-articles.xml dump [2]
No good because it just spews out 7.6 GB of this output:
PHP Warning: xml_parse(): Unable to call handler in_() in
/usr/share/mediawiki/includes/Import.php on line 437
PHP Warning: xml_parse(): Unable to call handler out_() in
/usr/share/mediawiki/includes/Import.php on line 437
PHP Warning: xml_parse(): Unable to call handler in_() in
/usr/share/mediawiki/includes/Import.php on line 437
PHP Warning: xml_parse(): Unable to call handler in_() in
/usr/share/mediawiki/includes/Import.php on line 437
...
So, any suggestions for approaches that might work? Or suggestions for
fixing the errors in step 5?
Steve
[1] http://gitorious.org/wikixray
[2] http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.b…