Forwarding for wider reach.
Cheers,
CT
---------- Forwarded message ----------
From: Ct Woo <ctwoo(a)wikimedia.org>
Date: Thu, Oct 4, 2012 at 11:24 AM
Subject: Update on Ashburn data center migration/switchover date
To: ops(a)wikimedia.org, WMF Engineering Management <emgt(a)wikimedia.org>
All,
We set a Q2 goal to migrate our primary data center from Tampa to
Ashburn, and earlier communicated that the planned switchover would be
on 15 October 2012.
While we have been making good progress and completing the work on
several key components such as databases, image scalers, Apache
servers, and /home, we have also been encountering issues with the
Swift software (i.e. cross-data-center replication), Swift hardware
failures (a meltdown, actually), Varnish video streaming, and
unexpected network cabling issues with the memcached servers. Resources
have been diverted to resolving those issues, which has caused some
delay in the schedule.
This week, Mark and Asher performed an assessment of the migration
readiness state and concluded that they are increasingly uncomfortable
with meeting the migration date: even if we were to complete the rest
of the outstanding tasks, the migration risks are high, especially since
we will be deploying new technologies (i.e. Redis, and NetApp as a
multimedia store) that are inadequately tested.
Since the Fundraising team is ramping up A/B testing now, and will be
in full-scale fundraising mode this November through December (or
January), we have decided to postpone the switchover until all
the fundraising activities are over.
The migration team will continue to work on completing the
outstanding tasks and readying our Ashburn infrastructure for the
big switchover day. In the meantime, we will set up the Ashburn data
center as a 'warm' backup site, in standby mode, ready to take over
full production traffic within a short period of time, should the
need arise.
Thanks.
CT
Hi everyone,
Just letting everyone know: mediawiki/core is now replicating from
gerrit to github.
https://github.com/mediawiki/core
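If you just want a read-only copy from the mirror, cloning it should be as
simple as (repository name taken from the URL above):

  git clone https://github.com/mediawiki/core.git

Gerrit of course remains the canonical repository; the GitHub side is only
a replica.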
Next step: extensions!
-Chad
Could we have an HTML dump for X amount of money?
Something like a paid feature.
Include the CSS of course.
Also, leave the <math> tags as they are, as those have to be processed by
3rd party libraries.
2012/9/17 Pablo N. Mendes <pablomendes(a)gmail.com>
>
> I also think the HTML dumps would be super useful!
>
> Cheers
> Pablo
> On Sep 17, 2012 8:05 PM, "James L" <james_leaver(a)hotmail.com> wrote:
>
>> I’m all for continuing the HTML wiki dumps that were once done (2007
>> was the last?). Why were these discontinued? They would be more useful
>> than the so-called “XML”.
>>
>> There is no complete solution to processing dumps; the XML is most
>> certainly not XML in its lowest form, and it IS DEFINITELY a moving target!
>>
>> Regards,
>>
>> From: Roberto Flores <f.roberto.isc(a)gmail.com>
>> Sent: Sunday, September 09, 2012 8:07 PM
>> To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
>> Cc: Wikipedia Xmldatadumps-l <xmldatadumps-l(a)lists.wikimedia.org>
>> Subject: Re: [Xmldatadumps-l] [Wikitech-l] HTML wikipedia dumps: Could
>> you please provide them, or make public the code for interpreting templates?
>>
>> Allow me to reply to each point:
>>
>> (By the way, my offline app is called WikiGear Offline:)
>> http://itunes.apple.com/us/app/wikigear-offline/id453614487?mt=8
>>
>> > Templates are dumped just like all other pages are...
>>
>> Yes, but that's only a text description of what the template does.
>> Code must be written to actually process them into HTML.
>> There are tens of thousands of them, and some can't even be programmed by
>> me (e.g., Wiktionary's conjugation templates).
>> If they were already pre-processed into HTML inside the articles'
>> contents, that would solve all of my problems.
>>
>> > What purpose would the dump serve? You don't want to keep the full dump
>> > on the device.
>>
>> I made an indexing program that selects only content articles (namespaces
>> included) and compresses it all to a reasonable size (e.g. about 7 GB for
>> the English Wikipedia).
>>
>> > How would this template API function? What does import mean?
>>
>> By this I mean a set of functions, written in some programming language,
>> to which I could send the template wiki markup and receive back the
>> HTML to display.
>>
>> Wikipedia does this whenever a page is requested, but I don't know the
>> exact mechanism through which it's performed.
>> Maybe you just need to make that code publicly available, and I'll try to
>> make it work with my application somehow.
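For reference, the rendering step described above is exposed through the
regular MediaWiki web API: the action=parse module takes a page (or raw
wikitext) and returns the template-expanded HTML that the site itself
renders. A rough sketch of what calling it could look like, assuming Python
and the public en.wikipedia.org endpoint (not something to run over a whole
dump, but it shows the shape of the interface):

  import json
  import urllib.parse
  import urllib.request

  API = "https://en.wikipedia.org/w/api.php"  # assumed public endpoint

  def parse_to_html(title):
      """Ask MediaWiki to expand templates and render one page to HTML."""
      params = urllib.parse.urlencode({
          "action": "parse",
          "page": title,
          "prop": "text",
          "format": "json",
      })
      # Wikimedia asks API clients to send a descriptive User-Agent.
      req = urllib.request.Request(API + "?" + params,
                                   headers={"User-Agent": "example-script/0.1"})
      with urllib.request.urlopen(req) as resp:
          data = json.loads(resp.read().decode("utf-8"))
      # The rendered HTML (templates already expanded) is under parse.text['*'].
      return data["parse"]["text"]["*"]

  print(parse_to_html("Coffee")[:500])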
>>
>>
>> 2012/9/9 Jeremy Baron <jeremy(a)tuxmachine.com>
>>
>>> On Sun, Sep 9, 2012 at 6:34 PM, Roberto Flores <f.roberto.isc(a)gmail.com>
>>> wrote:
>>> > I have developed an offline Wikipedia, Wikibooks, Wiktionary, etc. app
>>> for
>>> > the iPhone, which does a somewhat decent job at interpreting the wiki
>>> > markup into HTML.
>>> > However, there are too many templates for me to program (not to
>>> mention,
>>> > it's a moving target).
>>> > Without converting these templates, many articles are simply
>>> unreadable and
>>> > useless.
>>>
>>> Templates are dumped just like all other pages are. Have you found
>>> them in the dumps? Which dump are you looking at right now?
>>>
>>> > Could you please provide HTML dumps (I mean, with the templates
>>> > pre-processed into HTML, everything else the same as now) every 3 or 4
>>> > months?
>>>
>>> 3 or 4 month frequency seems unlikely to be useful to many people.
>>> Otherwise no comment.
>>>
>>> > Or alternatively, could you make the template API available so I could
>>> > import it in my program?
>>>
>>> How would this template API function? What does import mean?
>>>
>>> -Jeremy
>>>
>>> _______________________________________________
>>> Wikitech-l mailing list
>>> Wikitech-l(a)lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>>
>>
>> ------------------------------
>> _______________________________________________
>> Xmldatadumps-l mailing list
>> Xmldatadumps-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>>
Hello,
Our scheduled Git+Gerrit session starts in about 40 minutes.
Everything will happen via SIP audioconference and SSH connection.
Please make sure your SIP and SSH clients work!
More information on the setup:
https://www.mediawiki.org/wiki/Git/Workshop
I am already available on SIP as well as on IRC
(#git-gerrit on Freenode) if you would like
to test your setup.
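If you want to sanity-check the SSH side ahead of time, a quick test against
Gerrit looks roughly like this (assuming the standard Gerrit SSH port and
your own Gerrit username in place of USERNAME):

  ssh -p 29418 USERNAME@gerrit.wikimedia.org

A successful connection prints a short welcome message and then closes,
since Gerrit does not provide an interactive shell.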
See you soon!
Marcin Cieślak
(saper)
As of ~11:15AM EDT SPF is deployed for the domain wikimedia.org. Please
let me know ASAP if you discover any issues with mail sent from a
@wikimedia.org address.
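If you want to see what receiving mail servers now see, you can query the
published record directly, e.g.:

  dig +short TXT wikimedia.org

and check that a "v=spf1 ..." string shows up in the output.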
Thanks!
jg
Jeff Green
Operations Engineer, Special Projects
Wikimedia Foundation
149 New Montgomery Street, 3rd Floor
San Francisco, CA 94105
415-839-6885 x6807
jgreen(a)wikimedia.org
P.S. Ops folks, rollback is simply a matter of reverting the wikimedia.org
zone file and running authdns-update. I set the TTL to 10 min just in
case.
---------- Forwarded message ----------
Date: Fri, 28 Sep 2012 11:00:08 -0700 (PDT)
From: Jeff Green <jgreen(a)wikimedia.org>
Reply-To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
To: wmfall(a)lists.wikimedia.org, wikimedia-l(a)lists.wikimedia.org,
wikitech-l(a)lists.wikimedia.org
Subject: [Wikitech-l] SPF (email spoof prevention feature) test-rollout Weds
10/5
I'm planning to deploy Sender Policy Framework (SPF) for the wikimedia.org
domain on Weds October 5. SPF is a framework for validating outgoing mail,
which gives the receiving side useful information for spam filtering. The main
goal is to cause spoofed @wikimedia.org mail to be correctly identified as
such. It should also improve our odds of getting fundraiser mailings into
inboxes rather than spam folders.
The change should not be noticeable, but the most likely problem would be
legitimate @wikimedia.org mail being treated as spam. If you hear of this
happening please let me know.
Technical details are below for anyone interested . . .
Thanks,
jg
Jeff Green
Operations Engineer, Special Projects
Wikimedia Foundation
149 New Montgomery Street, 3rd Floor
San Francisco, CA 94105
jgreen(a)wikimedia.org
. . . . . . .
SPF overview http://en.wikipedia.org/wiki/Sender_Policy_Framework
The October 8 change will be simply a matter of adding a TXT record to the
wikimedia.org DNS zone:
wikimedia.org IN TXT "v=spf1 ip4:91.198.174.0/24 ip4:208.80.152.0/22
ip6:2620:0:860::/46 include:_spf.google.com ip4:74.121.51.111 ?all"
The record is a list of subnets that we identify as senders (all wmf subnets,
google apps, and the fundraiser mailhouse). The "?all" is a "neutral"
policy--it doesn't state either way how mail should be handled.
Eventually we'll probably bump "?all" to a stricter "~all" aka SoftFail, which
tells the receiving side that only mail coming from the listed subnets is
valid. Most ISPs will route 'other' mail to a spam folder based on SoftFail.
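For illustration, the SoftFail version would be the same record with only the
final mechanism changed:

wikimedia.org IN TXT "v=spf1 ip4:91.198.174.0/24 ip4:208.80.152.0/22
ip6:2620:0:860::/46 include:_spf.google.com ip4:74.121.51.111 ~all"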
Please bug me with any questions/comments!
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Hi everyone!
Is it possible to use MediaWiki as a service where the UI lives in a
Facebook app? So that all the editing and viewing takes place on
Facebook, and MediaWiki provides the storage, revision control and lots
of extensions?
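The kind of integration I have in mind would presumably go through the
action API (api.php), with the Facebook app as a thin front end and the
wiki as the storage/revision backend. A minimal read-side sketch, assuming
Python and a wiki at the hypothetical URL https://wiki.example.org/w/api.php
(writes would go through action=edit and need a login plus an edit token):

  import json
  import urllib.parse
  import urllib.request

  API = "https://wiki.example.org/w/api.php"  # hypothetical wiki endpoint

  def fetch_wikitext(title):
      """Fetch the latest wikitext of a page via the MediaWiki action API."""
      params = urllib.parse.urlencode({
          "action": "query",
          "titles": title,
          "prop": "revisions",
          "rvprop": "content",
          "format": "json",
      })
      with urllib.request.urlopen(API + "?" + params) as resp:
          data = json.loads(resp.read().decode("utf-8"))
      # Results are keyed by page id; we asked for a single title.
      page = next(iter(data["query"]["pages"].values()))
      return page["revisions"][0]["*"]

  print(fetch_wikitext("Main Page")[:200])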
-----
Yury Katkov
Dear all,
Starting October 1, 2012, translatewiki.net will drop support for all
MediaWiki extensions it currently supports that still remain in the
Subversion repository svn.wikimedia.org.
Many, if not all, of the extensions that have not been migrated to
git/Gerrit by that time are clearly not that well maintained, and we want
to prevent our translators from spending their time on them. At the moment,
a maximum of 273 extensions are affected.
In case this message makes you wonder what git and Gerrit are all about,
and how you can revive your precious extension, please see the following
links:
* https://www.mediawiki.org/wiki/Git/Tutorial
* https://www.mediawiki.org/wiki/Git/Conversion/Extensions_queue
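Once an extension has been migrated, getting a working copy is a one-liner,
e.g. (MyExtension is a hypothetical name, and the clone URL assumes Gerrit's
anonymous-HTTP layout):

  git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/MyExtension.git

The tutorial linked above covers the day-to-day workflow from there.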
Cheers!
Siebrand
From time to time, we receive a strange warning message
in fenari:/home/wikipedia/log/syslog/apache.log:
Oct 3 01:01:03 10.0.11.59 apache2[20535]: PHP Warning: require() [<a
href='function.require'>function.require</a>]: GC cache entry
'/usr/local/apache/common-local/wmf-config/ExtensionMessages-1.20wmf12.php'
(dev=2049 ino=10248005) was on gc-list for 601 seconds in
/usr/local/apache/common-local/php-1.20wmf12/includes/AutoLoader.php on
line 1150
This issue definitely comes from APC (source code from package
apc-3.1.6-r1). Whenever an item is inserted into the user cache or file
cache, the following function is called:
static void process_pending_removals(apc_cache_t* cache TSRMLS_DC)
{
    slot_t** slot;
    time_t now;

    /* This function scans the list of removed cache entries and deletes any
     * entry whose reference count is zero (indicating that it is no longer
     * being executed) or that has been on the pending list for more than
     * cache->gc_ttl seconds (we issue a warning in the latter case).
     */

    if (!cache->header->deleted_list)
        return;

    slot = &cache->header->deleted_list;
    now = time(0);

    while (*slot != NULL) {
        int gc_sec = cache->gc_ttl ? (now - (*slot)->deletion_time) : 0;

        if ((*slot)->value->ref_count <= 0 || gc_sec > cache->gc_ttl) {
            slot_t* dead = *slot;

            if (dead->value->ref_count > 0) {
                switch (dead->value->type) {
                    case APC_CACHE_ENTRY_FILE:
                        apc_warning("GC cache entry '%s' (dev=%d ino=%d) was on gc-list for %d seconds" TSRMLS_CC,
                            dead->value->data.file.filename,
                            dead->key.data.file.device, dead->key.data.file.inode, gc_sec);
                        break;
                    case APC_CACHE_ENTRY_USER:
                        apc_warning("GC cache entry '%s'was on gc-list for %d seconds" TSRMLS_CC,
                            dead->value->data.user.info, gc_sec);
                        break;
                }
            }
            *slot = dead->next;
            free_slot(dead TSRMLS_CC);
        }
        else {
            slot = &(*slot)->next;
        }
    }
}
From the APC configuration documentation
( http://us.php.net/manual/en/apc.configuration.php#ini.apc.gc-ttl ):
apc.gc_ttl (integer)
The number of seconds that a cache entry may remain on the
garbage-collection list. This value provides a fail-safe in the event that
a server process dies while executing a cached source file; if that source
file is modified, the memory allocated for the old version will not be
reclaimed until this TTL has passed. Set to zero to disable this feature.
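For reference, the knob is an ordinary ini setting; something like the
following (the value here is only an illustration, not a recommendation for
our cluster):

  ; keep dead-but-still-referenced entries on the GC list for up to an hour
  apc.gc_ttl = 3600
  ; apc.gc_ttl = 0 would disable the fail-safe entirely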
We get the messages "GC cache entry '%s' (dev=%d ino=%d) was on gc-list for %d
seconds" or "GC cache entry '%s'was on gc-list for %d seconds" under this
condition:
(gc_sec > cache->gc_ttl) && (dead->value->ref_count > 0)
The first condition means the item was deleted more than apc.gc_ttl seconds
ago and is still on the garbage-collector list.
The second condition means the item is still referenced.
This happens e.g. when a process dies unexpectedly, so the reference count is
never decremented. The item is first active in the APC cache for apc.ttl
seconds, then it is deleted (once there is no further hit on it). From that
point the item sits on the garbage-collector (GC) list and the apc.gc_ttl
timeout is running. Once (now - item_deletion_time) exceeds apc.gc_ttl, the
warning is written and the item is finally flushed completely.
So what should we do?