I have a wiki with a high volume of edits that need to be made (via API), about 2-3 per second. The wiki can't keep up with them; a backlog of edits builds up as, e.g., 10 different bots attempt to make 10 edits simultaneously (all to different pages), and saves take so long that the bots are slow to move on to their next edits.
What's the best way to boost performance, either to reduce the time needed to save edits or to save more edits per second across concurrently running bots? Thanks, -- Starstruck
What's the best way to boost performance ...
That depends on a myriad of factors, such as the OS, web server, database, hardware, etc.
The simplest answer is to increase your hardware resources, meaning if the site has one CPU, give it two. For anything more specific, we would need more details about the server software and hardware.
Hershel
The server is just the localhost that I use for testing: Apache and MySQL running on Ubuntu 18.04 on an HP Elite (3.0 GHz, 4 GB of RAM) https://www.amazon.com/HP-Elite-Professional-Certified-Refurbished/dp/B0094JF1HA. I haven't really figured out what hardware or software I want to use for production, because I haven't done much server administration; I've typically just used a VPS when I needed web hosting, but my needs may have expanded beyond that, since this is going to be a huge wiki, on the same scale as Wikipedia. My main concern at the moment is making page saves, rather than page loads, more efficient: I don't necessarily anticipate a lot of visitors reading the wiki from the Internet, or else I'd be focusing more on things like caching. For now I mostly just want to set up a workable proof of concept.
I've heard that Wikimedia splits their enwiki database up among more than one server; is that how they're able to handle several page saves per second on the master?
I've heard that Wikimedia splits their enwiki database up among more than one server; is that how they're able to handle several page saves per second on the master?
That is correct. See here https://meta.wikimedia.org/wiki/Wikimedia_servers for more details.
It's not split up (sharded) across servers, at least as far as the page and revision tables go. There is one active master at any given time that handles all writes; the current host has 160 GB of memory and 10 physical cores (20 with hyperthreading). The actual revision *content* for all projects is indeed split up across several servers, in an 'external storage' cluster. The current server configuration is available at https://noc.wikimedia.org/conf/highlight.php?file=db-eqiad.php
You can get basic specs as well as load information on these servers by looking up each one in grafana; here's db1067 (the current enwiki master) as an example: https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&var-d...
Ariel
What is your caching setup (e.g. $wgMainCacheType and friends)? Caching probably has more of an effect on read time than save time, but it will also have an effect on save time, probably a significant one. If it's just one server, APCu (i.e. CACHE_ACCEL) is probably the easiest to set up.
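On a single test box like yours, that's essentially a one-line change in LocalSettings.php. A minimal sketch, assuming the php-apcu extension is installed (check with `php -m | grep apcu`):

```php
// LocalSettings.php — use the local PHP opcode/object cache (APCu) as the
// main object cache. Appropriate only when everything runs on one server,
// since APCu is not shared across hosts.
$wgMainCacheType = CACHE_ACCEL;
```

With multiple app servers you would point this at memcached instead, but for a localhost proof of concept CACHE_ACCEL avoids running any extra service.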
There are a number of factors that can affect page save time. In many cases it depends on the content of your page edits (e.g., if your edits embed lots of images, 404 handling can result in significant improvements).
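For the image case, the usual approach is to stop generating thumbnails during the parse and let a 404 handler create them on first request (see Manual:Thumb_handler.php). Roughly — the web-server side is an assumption about your setup:

```php
// LocalSettings.php — skip thumbnail rendering while the page is being
// saved/parsed. Missing thumbnail URLs must then be routed by the web
// server (e.g. an Apache rewrite or 404 handler under images/thumb/)
// to MediaWiki's thumb_handler.php, which generates them on demand.
$wgGenerateThumbnailOnParse = false;
```

This moves thumbnail work out of the save path entirely, which is exactly where your bots are bottlenecked.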
The first step I would suggest is profiling: https://www.mediawiki.org/wiki/Manual:Profiling. That will tell you which part is slow, and we can give more specific advice based on it.
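For recent MediaWiki versions the manual's example boils down to something like the following in LocalSettings.php; it requires the xhprof (or tideways_xhprof) PHP extension, so adjust to whatever your version's manual shows:

```php
// LocalSettings.php — profile every request with XHProf and append the
// result as text to the page output. Requires the php-xhprof extension.
$wgProfiler['class'] = 'ProfilerXhprof';
$wgProfiler['output'] = [ 'text' ];
$wgProfiler['visible'] = true; // show the profile inline instead of as an HTML comment
```

Then make one of your bot edits and look at which functions dominate the wall-clock time of the save request.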
-- Brian