Hi,
I've been thinking about this for the last week or so because it's becoming
incredibly clear to me that core isn't scaling. It's already taking up over
4GB on the Gerrit box, and this is the primary reason core operations are
slow.
I can think of a couple of things that might help the situation:
- We can repack core on manganese. This should provide a bit of relief,
but won't help long term. Core would have to be read-only for about an hour
or two.
- We can rewrite history (git-filter-branch) to remove some mistakes that
exploded the repo size: binaries that were later removed, things
accidentally checked into ./extensions, etc. This could potentially greatly
reduce object sizes and allow for tighter repacks. The major issue with
history rewriting is that everyone will have to reclone (all sha1s would
change). I've not tested my theory yet.
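A rough, untested sketch of what the two options above might look like as
git invocations, written as small Python helpers that only build the command
lines. The window/depth values and the example path are assumptions made for
illustration, not numbers or paths taken from the actual repository.

```python
import shlex
import subprocess


def size_command():
    """`git count-objects -v` reports pack sizes; handy before/after a repack."""
    return ["git", "count-objects", "-v"]


def repack_command(window=250, depth=250):
    """Build an aggressive `git repack` invocation.

    -a packs everything into a single pack, -d deletes the packs made
    redundant by it, and -f recomputes deltas instead of reusing old ones.
    The window/depth defaults are illustrative guesses, not tuned values.
    """
    return ["git", "repack", "-a", "-d", "-f",
            f"--window={window}", f"--depth={depth}"]


def filter_branch_command(paths):
    """Build a `git filter-branch` call that drops `paths` from every commit.

    `paths` is a hypothetical list of files or directories to purge; the
    real list would have to come from inspecting the history.
    """
    if not paths:
        raise ValueError("need at least one path to remove")
    quoted = " ".join(shlex.quote(p) for p in paths)
    index_filter = "git rm -r --cached --ignore-unmatch -- " + quoted
    return ["git", "filter-branch", "--prune-empty",
            "--index-filter", index_filter, "--", "--all"]


def run(cmd, repo_dir):
    """Run one of the commands above inside `repo_dir` (not called here)."""
    subprocess.run(cmd, cwd=repo_dir, check=True)
```

Something like run(repack_command(), "/path/to/core.git") would then do the
repack. Note that filter-branch keeps the old commits reachable under
refs/original/, so a rewrite would probably not shrink anything until those
refs and the reflogs are cleaned up and the repo is repacked again.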
I'm open to any other ideas that could help core.
-Chad
On Mar 10, 2013 5:15 PM, "Ori Livneh" <ori(a)wikimedia.org> wrote:
Hello,
I'm in the process of re-working mediawiki-vagrant, which is a set of
scripts for provisioning a virtual machine for MediaWiki development. I'm
struggling to identify the best way of fetching mediawiki/core.
An ideal solution would have the following attributes:
- Fast.
- Includes .git metadata, to facilitate contribution of patches.
- Viable on slow network connections.
- Does not require a Gerrit account (to help newcomers get started quickly).
What I tried:
- A shallow (--depth=0) git-clone over HTTPS took around half an hour and
required transferring 272MB, with 200MB taken up by .git/objects/pack.
- The nightlies on integration.mediawiki.org are small (18MB) and easy to
retrieve, but the most recent one is from December, and they don't contain
any .git metadata.
- The snapshots Krinkle maintains on the toolserver are both small and
up-to-date, but they too do not contain any .git metadata.
- The snapshot link on http://www.mediawiki.org/wiki/Download
(https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=snapshot;h=ref…)
just didn't work. It hangs for a while and then spits out HTML.
- Getting a snapshot from GitHub would probably work, but I am loath to
depend on it.
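For reference, the shallow-clone attempt can be sketched as a small Python
helper that builds the git command line. The URL and destination used in the
example are placeholders, not the real Gerrit endpoint, and the default depth
of 1 is an assumption that differs from the --depth=0 reported above.

```python
def shallow_clone_command(url, dest, depth=1):
    """Build a `git clone --depth=N` command that fetches truncated history.

    `url` and `dest` are supplied by the caller; nothing here assumes a
    particular host. The depth must be a positive number of commits.
    """
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return ["git", "clone", f"--depth={depth}", url, dest]


def unshallow_command():
    """Build the command that later converts a shallow clone to a full one."""
    return ["git", "fetch", "--unshallow"]


# Hypothetical usage with a placeholder URL:
example = shallow_clone_command("https://example.org/mediawiki/core.git", "core")
```

A clone made this way still has a .git directory, and the history can be
filled in afterwards with the second command if a full clone turns out to be
needed.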
Does anyone have any suggestions?
--
Ori Livneh
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l