Hello colleagues and shareholders (community :)!
It has been a while since my last review of operations (aka hosting
report) - so I will try to give an overview of some of the things we've been doing =)
First of all, I'd like to thank Mr. Moore for his fabulous law. It
allowed Wikipedia to stay alive - even though we had to grow again in
all directions.
We still have Septembers. Well, it is a nice name to describe the
recurring pattern, which provides Shock and Awe to us - after a
period of stable usage, every autumn the number of users suddenly goes up
and stays there - to let us think we've finally reached some
saturation and will never grow more. Until the next September.
We still have World Events. People rush to us to read about conflicts
and tragedies, joys and celebrations. Sometimes because we had
information for ages, sometimes because it all matured in seconds or
minutes. Nowhere else can a document require that much concurrent
collaboration, and nowhere else can it provide as much value
immediately.
We still have history. From day one of the project, we can see people
going into dramas, discussing, evolving and revolving every idea on
the site. Every edit stays there - accumulating not only final pieces
of information, but the whole process of assembling the content.
We still advance. Tools to facilitate the community get more complex,
we start growing an ecosystem of tools and processes inside and outside
the core software and platform. Users are the actual developers of the
project, core technology just lags behind assisting.
Our operation becomes more and more demanding - and that's quite a bit
of work to handle.
Ok, enough of such poetic introduction :)
== Growth ==
Over the second half of 2006 traffic and requests to our cluster doubled
(actually, that happened in just a few months).
Over 2007 traffic and requests to our cluster doubled.
Pics:
http://www.nedworks.org/~mark/reqstats/trafficstats-yearly.png
http://www.nedworks.org/~mark/reqstats/reqstats-yearly.png
== Hardware expansion ==
Back in September 2006 we had quite a huge load increase, and we went
for capacity expansion, which included:
* 20 new Squid servers ($66k)
* 2 storage servers ($24k)
* 60 application servers ($232k)
The German foundation additionally assisted with purchasing 15 Squid
servers in November for the Amsterdam facility.
Later in January 2007 we added 6 more database servers (for $39k),
three additional application servers for auxiliary tasks (such as
mail), and some network and datacenter gear.
The growth over autumn/winter led us to quite a big ($240k) capacity
expansion back in March, which included:
* 36 very capable 8-core application servers (thank you Moore yet
again :) - that was around $120k
* 20 Squid servers for Tampa facility
* Router for Amsterdam facility
* Additional networking gear (switches, linecards, etc) for Tampa
The only serious capacity increase afterwards was another
'German' (thanks yet again, Verein) batch of 15 Squid servers for
Amsterdam in December 2007.
We do plan to improve on database and storage servers soon - that
would add to the stability of our dump building and processing, as well
as provide better support for various batch jobs.
We have been especially pushy about exploiting warranties on all
servers, and nearly all machines ever purchased are in working state,
doing one or another kind of workload. All the veterans of 2005 are
still running at amazing speeds doing the important jobs :)
Rob joining to help us with datacenter operations has allowed us to have
really nice turnarounds on pretty much all datacenter work - as
volunteer remote hands were no longer available during critical
moments. Oh, and look how tidy the cabling is:
http://flickr.com/photos/midom/2134991985/ !
== Networking ==
This has been mainly in Mark's and River's capable hands - where we
underwent a transition from hosting customer to internet service
provider (or at least - equal peer to ISPs) ourselves. We have our
independent autonomous systems both in Europe and the US - allowing us to
pick the best available connectivity options, resolve routing glitches,
and get free traffic peering at internet exchanges. That provides
quite a lot of flexibility, of course, at the cost of more work and
skills required.
This is also part of an overall strategy of well-managed, powerful
datacenters. Instead of low-efficiency small datacenters scattered
around the world, a core facility like the one in Amsterdam provides high
availability, close proximity to major Internet hubs and carriers,
and is generally in the center of the region's inter-tubes. Though it would
be possible to reach out into multiple donated hosting places, that
would just lead to slower service for our users, and someone would
still have to pay for the bandwidth. As we are pushing nearly 4 Gbps
of traffic, there are not many donors who wouldn't feel such traffic.
== Software ==
There has been lots of overall engineering effort that was often
behind the scenes. Various bits had to be rewritten to act properly
on user activity. The most prominent example of such work is Tim's
rewrite of the parser to more efficiently handle huge template
hierarchies. In the perfect case, users will not see any visible change,
except multiple-factor faster performance at expensive operations.
In past year, lots of activities - how people use customized software
- bots, javascript extensions, etc - have changed performance
profile, and nowadays lots of performance work at backend is to
handle various fresh activities - and anomalies.
One of core activities was polishing caching of our content, so we
could have our application layer to concentrate on most important
process - collaboration, instead of content delivery.
Lots and lots of small things have been added or fixed - though some
developments were quite demanding - like multimedia integration,
which was challenging due to our freedom requirements.
Still, there was constant tradeoff management, as not every feature
was worth the performance sacrifice and costs, and on the other hand
- having the best possible software for collaboration is also
important :) Introducing new features, or migrating them from outside
to the core platform, has always been a serious engineering effort.
Besides, there would be quite a lot of communication - explaining how
things have to be built for them not to collapse at live site,
discussing security implications, change of usage patterns, ...
Of course, MediaWiki is still one of the most actively developed pieces
of web software - and here Brion and Tim lead the volunteers, as well as
spending their days and nights in the code.
At the overall stack, we have worked at every layer - tuning kernels
for our high-performance networking, experimenting with database
software (some servers are running our own fork of MySQL, based on
Google changes), perfecting Squid (Mark and Tim ended up in authors
list) - our web caching software, digging into problems and
specialties of PHP engine. Quite a lot of problems we hit are very
huge-site-specific, and even if other huge shops hit them, we're the
ones that are always free to release our changes and fixes. Still,
colleagues from other shops are willing to assist us too :)
There were lots of tiny architecture tweaks that allowed us to use
resources more efficiently, but none of them were major - pure
engineering all the time. It seems that lately we have stabilized on lots
of things in how Wikipedia works - and it seems to work quite
fluently. Of course, one must mention Jens' keen eye, taking care of
various especially important but easily overlooked things.
River has dedicated lots of attention to supporting the community
tools infrastructure at the Toolserver - and also maintaining
off-site copies of projects.
The site doesn't fall down the very minute nobody is looking at it,
and that is quite an improvement over the years :)
== Notes ==
People have been discussing whether running a popular site is really a
mission of WMF. Well, users created a magnificent resource, we try to
support it, we do what we can. Thanks to everyone involved - though
it has been a far less stressful ride than previous years, still, nice
work. ;-)
== More reading ==
May hurt your eyes: https://wikitech.leuksman.com/view/Server_admin_log
Platform description: http://dammit.lt/uc/workbook2007.pdf
== Disclaimer ==
Some numbers can be wrong, as this review was based not on audit, but
on vague memories :)
--
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]
I think that this begins to hit at the core of the debate.
That said, even what is permitted by law seems to be more restrictive than
what is currently occurring on the English Wikipedia.
Danny
In a message dated 1/7/2008 12:19:42 PM Eastern Standard Time,
rarohde(a)gmail.com writes:
Some people, myself included, want to create the best possible no-cost
encyclopedia. From that point of view, copyleft and free content is a means
to that end.
Other people (including much of the WMF Board apparently) feel creating free
content is an end in itself that justifies sacrificing some encyclopedic
coverage and limiting our exercise of fair use rights to a much narrower set
of circumstances than allowed by law.
I can understand that point of view, even though I don't agree with it.
Greetings and Happy New Year 2008 to all!
This year has been declared by the UN as "International Year of Languages."*
What exactly that will mean depends in part on what UNESCO - which is
charged with coordinating the Year - and in part on what various groups and
individuals dealing with languages and linguistics decide to make of it.
I would like to propose that Wikimedia - which is in many ways on the
cutting edge of multilingual exploitation of the potential of the web, but
which has some language projects slated for deletion after a period of being
"closed" (which I understand also means being placed in an "incubator"
status) - declare for the duration of the IYL (2008) a moratorium on
deletion of language projects.
The moratorium period would also be used to discuss (and implement) new
means to save and develop projects in incubator status, which may involve
any of the following and more:
* A "mentor" or "champion" for each project that is "closed"/"in the
incubator"
** This person would advocate for the project within Wikimedia and outside,
and coordinate efforts on its behalf
* Developing a methodology or set of guidelines for searching for relevant
experts and language bodies that might help with the project in question
* An "incubator" period longer than the currently typical (as I understand
it) one year for languages that meet certain criteria
** The criteria would probably involve the number of speakers
* Develop a project proposal for outside funding to support development of
Wikimedia projects in less-widely spoken languages
A permanent change might also be considered:
* Change in terminology since "close" and "delete" sound equally final to
average users, when in fact a "closed" project still lives
I'm particularly concerned about this issue because some African language
projects are at risk, and I think that part of the problem is that there
need to be new ways of proactively identifying people and resources to save
and develop such editions. I would mention that for example the Afar
Wikipedia is slated for closure (which I understand means it is on the
"incubator" and not deleted), but at the same time Afar has a locale and
there is a project to localize AbiWord in it. That's an interesting
juxtaposition of facts, which is probably not unusual, but is not always (or
perhaps almost never?) noted in discussions on closure/deletion.
Part of the problem is successfully reaching people who are activists or
"mavens" (per Gladwell's Tipping Point) for/in the language who simply are
not connected with Wikipedia or perhaps not even really aware of it or how
it could be useful in their efforts. Setting up a system with something like
a mentor and a longer stay of execution for inactive projects could pay off
with more active projects in more languages sooner - beginning with the ones
that exist but are not yet active.
Part of the problem with closing a project while saying that "well, when
there's a community, they can apply for a new project" is that the bar is
also raised. It is much easier to work with the fact that the Wikipedia
space is already there and get a handful of individuals involved to get it
started than to have to prove the concept and get a group organized to apply
for a new project. Much easier to push start a car with the key in the
ignition than to take away the key until they get a proper repair job done.
Anyway I put this forth for discussion in the spirit of IYL 2008. All the
best.
Don Osborn
Bisharat.net
PanAfriL10n.org
OK, let's change the subject here, as I think this deserves a separate thread.
I agree with Thomas that no responsibilities have been determined yet.
However, let me please make clear that I absolutely disagree with the
idea to have "representatives" of local arbcoms in the meta-arbcom.
Not in any case. Why? Well, because this meta-arbcom would in any case
mainly be dealing with small projects and cases that are
multi-project. Other than expertise, these local arbcoms have nothing
to bring in. I think that for the sake of neutrality it would even be
better not to have local arbcommers in the meta-arbcom. It's either
a different jurisdiction or a higher one. In neither case would it
be desirable to have local arbcom people in the meta-arbcom.
Please let us not get stuck in details here, by the way. The language
is merely a practical issue that the arbcom will have to solve by
itself. I think, however, that for practical reasons language sections
would not be successful. It is not scalable. You can't get 280
language sections, or we would have to hire language miracles here, but I
think they can spend their time much better writing travel guides ;-)
Best regards,
Lodewijk
2008/1/4, Thomas Dalton <thomas.dalton(a)gmail.com>:
> > I am not sure I quite agree. The local arbitrators on say the Portuguese
> > wikipedia might not have been chosen for their familiarity with minor
> > languages in the (former and current) Portuguese colonies, just as an
> > example, which a putative meta arbcom team with a working language
> > of Portuguese, might quite easily be.
>
> I guess that all depends on what responsibilities the meta-arbcom
> would have, which I don't think has been decided on. If it's primarily
> arbitrating disputes (as the name would suggest), the skills needed
> are much the same regardless of the nature of the dispute.
>
> > I don't quite see how a pure english language meta-arbcom would be
> > truer.
>
> It wouldn't be an English Language arbcom, it would be an arbcom that
> uses English as a lingua franca. That's the only way to allow people
> from different languages to work together to resolve issues, which I
> think would be a good feature of a central arbcom.
The mission of Wikimedia is to generate "neutral educational content under a
free content license". The Foundation's resolution from March 2007 states
that EDP use must be minimal, within narrow limits.
Subsequent to the resolution being passed, a number of efforts were
undertaken to limit fair use usage on en.wikipedia. This affected
discographies, episode lists, and character lists. A *huge* number of
debates erupted over these removals. One such debate was covered at
http://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2007-05-07/Fair_u….
The disputes have never ended. For discographies and episode lists, the
debate has simmered down for the most part, with occasional flare ups. For
character lists, the debate is still raging.
What has been the rule of thumb in removing the images is that an image of
the character being used for depiction of that character only is allowable
on that character's particular article, but not on articles collecting
multiple characters into a single article. The rationale here is that if a
character is notable enough for an article, they're notable enough for an
image, and vice versa. Allowances have been made for "cast" type images
showing multiple characters in a single image from the copyright holder (not
montages made by editors).
Nevertheless, the debate has raged endlessly, and has recently exploded. It
stands now on a precipice, and it is highly likely that fair use
inclusionists will 'win' in that per-character images are going to be
permitted on character articles (for example, see
http://en.wikipedia.org/wiki/Hogwarts_students ).
Some discussion exists currently at
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#Fair_u…
and scattered through a variety of sections of
http://en.wikipedia.org/wiki/Wikipedia_talk:Non-free_content
If Wikipedia is truly a free content encyclopedia, if you truly care about
free content, we must limit fair use usage per the Foundation's resolution.
As it stands now, this debate is lost in favor of people who are more
focused on whether something is suitable as a guide than focused on being a
free content resource.
A strong voice from the Foundation would be appreciated, most especially in
favor of a new section added to clarify the local EDP at the second
paragraph of this version of the guideline:
http://en.wikipedia.org/w/index.php?title=Wikipedia:Non-free_content&oldid=…
(paragraph since removed in an edit war)
Thank you,
Hammersoft
Dear Wikimedians,
Wikimedia Commons is happy to announce that the 2007 Picture of the
Year competition will be held soon. Any user who is registered at any
Wikimedia wiki and has more than 200 edits is invited to vote.
The competition is among the 514 images that became Featured Pictures
at Wikimedia Commons between 2007-01-01 and 2007-12-31. There are
literally hundreds of beautiful high quality pictures... please help
us choose the best one!
Voting will be conducted through a tool on the toolserver (to make it
easier to count compared to editing on a wiki). Users can request a
voting token at
<http://commons.wikimedia.org/wiki/Commons:Picture_of_the_Year/2007/Voting>.
You will need to have email enabled for the user account you intend
to vote from. You can only vote once, even if you have multiple
accounts that meet the edit requirement. The voter log will be public
although the actual votes themselves will be private.
Voting starts on January 10th. There are two rounds of voting. In the
first round, you can vote for as many images as you like, regardless
of category. In the final (28 images), you can only vote for one image.
Thanks,
Wikimedia Commons Picture of the Year committee
<http://commons.wikimedia.org/wiki/Commons:Picture_of_the_Year/2007>
--
They've just been waiting in a mountain for the right moment:
http://modernthings.org/
User RichardF from English Wikipedia came over to Wikinews asking if there
was any way to automatically include links to Wikinews articles in
Category/Topic portals on Wikipedia.
With a very short delay, user slackr (also on en.wp) has devised a bot,
which will take the output of DPLs on portal sub-pages on Wikinews and
format it into a template for use on the corresponding WP portal. One
example would be en.wp's television portal.
As this has all been a case of IRC and talk page "who do I ask?" stuff this
possibly hasn't had a lot of publicity. Although, on the Wikinews mailing
list there has been some enthusiasm and a suggestion it would be very useful
on other language project pairs.
I think this has a lot of potential - depending on where it is used. At
least on the English Wikinews many articles are primarily the work of one
contributor; Virginia Tech and the London bombings are obvious exceptions to
this. It also has the obvious advantage that, with an appropriate "update
this list" invitation, it can increase the number of Wikipedia contributors
writing on the Wikinews project.
So far we've had one French contributor say they'd like it on their wiki, I
suspect the Polish would be interested in it too, IIRC they have a deal to
dual-license some news from another source who'll also pick up their
coverage. Not so sure about the "derivative" logo on their front page
though. http://pl.wikinews.org/wiki/Strona_g%C5%82%C3%B3wna .
Brian McNeil
>Hello
>May I suggest that, until he understands that it isn't appropriate to remove
>these links, sysop be removed from him and the links re-added? And maybe
>make one (or more) of these users a sysop to do the usual cleanup.
>Users with 250 contributions or more at Russian Wikibooks:
User:Rubynovich - inactive
User:Karagota - inactive
User:Vladimir Petrov - Not much active recently, possible problems with neutrality due to a fringe theory he writes about
User:Greck - inactive
User:Student - inactive
User:Alexsmail - not much active, last edit in November
User:Imz - last edit about a month ago, probably will suit
User:DenZzz - inactive
User:Dark Magus - hell no, he's about to be permabanned on Russian Wikipedia
User:TRicK BZ - inactive
Apparently, Ramir's behaviour had already pissed off lots of
contributors. Probably, a couple of trusted users from Wikipedia
could be given temporary sysop status until the community is healed
and elects new administrators.
--
Max Semenik mailto:maxsem.wiki@gmail.com