Asher,
Do we know what our numbers are now? That's probably a pretty good baseline to start the discussion from.
p99 banner request latency of 80ms
Fundraising banners? Is that measured from the start of page load, or is it specifically how fast our API requests run?
On the topic of APIs: we should set similar perf goals for API requests and jobs. This gets very subjective, though, because now we're talking about CPU time, memory usage, disk usage, and cache key space usage -- are those in your scope, or are we simply starting the discussion with response times?
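To make that concrete, here's a rough sketch (Python; the job command is made up, not something from this thread) of wrapping a single job run and recording wall time, CPU time, and peak memory with the standard resource module:

#!/usr/bin/env python
"""Rough sketch: wrap one job run and report its resource usage.

The command below is a placeholder, not a real job from this thread.
"""
import resource
import subprocess
import time

JOB_CMD = ["php", "maintenance/someJob.php"]  # hypothetical job invocation

start = time.time()
exit_code = subprocess.call(JOB_CMD)
wall_ms = (time.time() - start) * 1000.0

# Aggregate usage of all child processes this script has waited on.
usage = resource.getrusage(resource.RUSAGE_CHILDREN)

print("exit code:      %d" % exit_code)
print("wall time:      %.1f ms" % wall_ms)
print("cpu time user:  %.1f ms" % (usage.ru_utime * 1000.0))
print("cpu time sys:   %.1f ms" % (usage.ru_stime * 1000.0))
print("peak RSS:       %d kB" % usage.ru_maxrss)  # kilobytes on Linux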
Further down the road, consistency is going to be important (my box will profile differently from someone else's), so this seems like a good candidate for 'yet another' continuous integration test. I can easily see us getting an initial feel for response times in the CI environment. Or maybe we should just continuously hammer the alpha/beta servers...
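Something along these lines could be a starting point -- a rough sketch (Python; the URL, sample count, and budget are all made up) of a CI check that times a batch of requests and fails the build if the observed p99 is over budget:

#!/usr/bin/env python
"""Rough sketch of a CI latency gate: time N requests, fail on a bad p99.

The URL, sample count, and budget are placeholders, not proposals.
"""
import sys
import time
import urllib.request

URL = "http://beta.example.org/wiki/Main_Page"  # hypothetical test target
SAMPLES = 200                                   # hypothetical sample count
P99_BUDGET_MS = 800.0                           # hypothetical budget

latencies = []
for _ in range(SAMPLES):
    start = time.time()
    urllib.request.urlopen(URL).read()
    latencies.append((time.time() - start) * 1000.0)

latencies.sort()
# Nearest-rank p99 of the collected samples.
p99 = latencies[min(len(latencies) - 1, int(len(latencies) * 0.99))]

print("p99 over %d requests: %.1f ms (budget %.1f ms)" % (SAMPLES, p99, P99_BUDGET_MS))
if p99 > P99_BUDGET_MS:
    sys.exit(1)  # non-zero exit marks the CI job as failed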
On deployment, though: currently the only way I know of to see whether something is performing well is to look directly at Graphite. Could Icinga or something similar alert us, presumably via email? Ideally we would be able to set up new metrics as we go (obviously starting with global page loads, but maybe I want to keep an eye on banner render time). I would love to get an email when something I've deployed is under-performing.
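For example, a rough sketch (Python; the Graphite host, metric name, threshold, and addresses are all made up) that pulls recent values from Graphite's render API and sends an email when they exceed a threshold -- Icinga could run the same thing as a plain check script:

#!/usr/bin/env python
"""Rough sketch: email an alert when a Graphite metric goes over a threshold.

The Graphite host, metric name, threshold, and addresses are all placeholders.
"""
import json
import smtplib
import urllib.request
from email.mime.text import MIMEText

GRAPHITE = "http://graphite.example.org"
METRIC = "frontend.banner.render_time.p99"  # made-up metric name
THRESHOLD_MS = 80.0
MAIL_FROM = "perf-alerts@example.org"
MAIL_TO = "deployer@example.org"

# Graphite's render API returns JSON like:
#   [{"target": "...", "datapoints": [[value, timestamp], ...]}]
url = "%s/render?target=%s&from=-15min&format=json" % (GRAPHITE, METRIC)
series = json.loads(urllib.request.urlopen(url).read().decode("utf-8"))
values = [v for v, _ts in series[0]["datapoints"] if v is not None]

if values and max(values) > THRESHOLD_MS:
    body = "%s peaked at %.1f ms in the last 15 minutes (threshold %.1f ms)" % (
        METRIC, max(values), THRESHOLD_MS)
    msg = MIMEText(body)
    msg["Subject"] = "[perf] %s over threshold" % METRIC
    msg["From"] = MAIL_FROM
    msg["To"] = MAIL_TO
    smtplib.SMTP("localhost").send_message(msg)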
~Matt Walker
Wikimedia Foundation
Fundraising Technology Team
On Thu, Mar 21, 2013 at 6:40 PM, Asher Feldman <afeldman@wikimedia.org> wrote:
I'd like to push for a codified set of minimum performance standards that new MediaWiki features must meet before they can be deployed to larger Wikimedia sites such as English Wikipedia, or be considered complete.
These would look like (numbers pulled out of a hat, not actual suggestions):
- p999 (long tail) full page request latency of 2000ms
- p99 page request latency of 800ms
- p90 page request latency of 150ms
- p99 banner request latency of 80ms
- p90 banner request latency of 40ms
- p99 db query latency of 250ms
- p90 db query latency of 50ms
- 1000 write requests/sec (if applicable; write operations must be free from concurrency issues)
- guidelines about degrading gracefully
- specific limits on total resource consumption across the stack per request
- etc..
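(To make targets like these measurable, a minimal nearest-rank percentile sketch in Python, using invented sample data, could look like this:)

#!/usr/bin/env python
"""Rough sketch: nearest-rank percentiles over a batch of latency samples.

The sample data is invented; real numbers would come from request logs or Graphite.
"""
import math

def percentile(samples, p):
    """Return the nearest-rank value below which roughly p percent of samples fall."""
    ordered = sorted(samples)
    rank = int(math.ceil(p / 100.0 * len(ordered)))
    return ordered[max(0, rank - 1)]

latencies_ms = [32, 41, 38, 120, 44, 36, 980, 51, 47, 39]  # invented data

for p in (90, 99, 99.9):
    print("p%-5s %.0f ms" % (p, percentile(latencies_ms, p)))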
Right now, varying amounts of effort are made to highlight potential performance bottlenecks in code review, and engineers are encouraged to profile and optimize their own code. But beyond "is the site still up for everyone / are users complaining on the village pump / am I ranting in IRC", we've offered no guidelines as to what sort of request latency is reasonable or acceptable. If a new feature (like AFTv5 or Flow) turns out not to meet perf standards after deployment, that would be a high-priority bug, and the feature may be disabled depending on the impact, or if it isn't addressed in a reasonable time frame. Obviously standards like this can't be applied to certain existing parts of MediaWiki, but systems other than the parser or preprocessor that don't meet new standards should at least be prioritized for improvement.
Thoughts?
Asher