Hello,
Running all tests for MediaWiki and matching what CI/Jenkins is running has been a constant challenge for everyone, myself included. Today I am introducing Quibble, a Python script that clones MediaWiki, sets it up, and runs test commands.
It is a follow-up to the Vienna Hackathon in 2017, where we had a lot of discussion about making the CI jobs reproducible on a local machine and unifying the logic in a single place. Today, I have added a few jobs to mediawiki/core.
An immediate advantage is that they run in Docker containers and start as soon as an execution slot is available. That makes them faster than the old jobs (suffixed with -jessie), which had to wait for a virtual machine to be made available.
A second advantage is that one can exactly reproduce the build on a local computer and even hack on the code to work on a fix.
The setup guide is available from the source repository (integration/quibble.git): https://gerrit.wikimedia.org/r/plugins/gitiles/integration/quibble/
The minimal example would be:
git clone https://gerrit.wikimedia.org/r/p/integration/quibble
cd quibble
python3 -m pip install -e .
quibble
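Once installed, the command line options can be listed with the built-in help. The flags shown below are an assumption about the current version and may change, so double check them against the help output:

  # List the options supported by the installed version
  quibble --help

  # Hypothetical example: use SQLite as the database backend and skip
  # fetching changes from Zuul (option names are assumptions; verify with --help)
  quibble --db sqlite --skip-zuul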
A few more details are available in this post on the QA list: https://lists.wikimedia.org/pipermail/qa/2018-April/002699.html
Please give it a try and send issues, support requests to Phabricator: https://phabricator.wikimedia.org/tag/quibble/
Next week I will polish up support for MediaWiki extensions and skins, and eventually Quibble will take over most of the CI jobs running for MediaWiki-related projects.
Cool. Pardon my novice level understanding of containers and devops. Am I correct in saying that the plan is to use Docker to improve the efficiency of testing for MediaWiki?
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
On 20/04/2018 03:18, Pine W wrote:
Cool. Pardon my novice level understanding of containers and devops. Am I correct in saying that the plan is to use Docker to improve the efficiency of testing for MediaWiki?
It is partly about efficiency. We have jobs running on a pool of virtual machines on top of the OpenStack cluster; the instances are deleted after each build. A piece of software (Nodepool) takes care of the deletion and replenishes the pool by asking for new instances to be started.

The pool is fairly limited in size and rate limited, since any time it issues too many requests the OpenStack cluster gets in trouble. That has been a source of pain and headaches, and is overall quite slow.
Docker addresses that part:
* we provide the frozen environments CI uses to run tests
* anyone can download them with a docker pull
* the Docker container is spawned at the start of the build and typically takes just a few milliseconds to start
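As an example, pulling one of the CI images and starting it locally looks roughly like the commands below. The image name and tag are assumptions here; check the Wikimedia Docker registry for the images actually used by CI:

  # Pull a CI image (name/tag assumed; the registry lists the current ones)
  docker pull docker-registry.wikimedia.org/releng/quibble-stretch:latest

  # Start a throw-away container; it ships with Quibble and the test dependencies
  docker run -it --rm docker-registry.wikimedia.org/releng/quibble-stretch:latest quibble --help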
It is also next to impossible to reproduce a CI build for MediaWiki. There are too many requirements:
* the image used to spawn the virtual machine, which is not publicly available
* parameters being injected by the workflow system (Zuul)
* Jenkins jobs written in YAML (integration/config)
* shell scripts in a git repo (integration/jenkins)
Quibble addresses that second part: it aggregates all the logic and flow in a single script. That is arguably easier to run.
Bonus: the Docker containers contain Quibble. Hence the container has everything one needs to properly reproduce a build.
As for the efficiency, there are a few optimizations left to be done, namely cloning the repositories in parallel and skipping some tests when we know another job already ran them. For example, the JavaScript ESLint check should only be run once.
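As a rough illustration of the parallel clone idea (this is just a sketch with example repository names, not Quibble's actual implementation):

  # Sketch only: clone a list of repositories with up to four clones in flight
  printf '%s\n' mediawiki/core mediawiki/vendor mediawiki/skins/Vector \
    | xargs -P 4 -I {} git clone "https://gerrit.wikimedia.org/r/{}"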
But overall, yes the switch should make it faster to get feedback on changes and to have them merged.
Hi!
A second advantage, is one can exactly reproduce the build on a local computer and even hack code for a fix up.
This is great! CI errors that are not reproducible locally have been a huge annoyance and are very hard to debug. Thanks for making it easier!
Hello,
Quibble is running just fine on MediaWiki core and vendor in the CI context. The latest version is 0.0.11, which is deployed on CI.
I am running it over the weekend against all MediaWiki extensions and skins, and I will file bugs next week. You can already trigger it for your extension or skin: in Gerrit, browse to a change and comment 'check experimental'.
For developers, there are two major issues:
[T191035] MediaWiki core @Database tests fail with sqlite
[T193222] MariaDB on Stretch uses the utf8mb4 character set. Attempting to create a key on VARCHAR(192) or larger would cause: Error: 1071 Specified key was too long; max key length is 767 bytes
Reducing the key length is the obvious solution, and some fields could be converted to ENUM.
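The arithmetic behind that error: with utf8mb4 each character can take up to 4 bytes, so an index over VARCHAR(192) needs 192 × 4 = 768 bytes, one more than the 767-byte limit of the older InnoDB index prefix setting. It can be reproduced with a plain MySQL/MariaDB client on a server still using that limit (table and column names below are made up for the illustration):

  # Illustration only: reproduce error 1071 on a server with the 767-byte limit
  mysql -e "CREATE TABLE t_keylen (name VARCHAR(192), KEY (name))
            ENGINE=InnoDB CHARACTER SET utf8mb4;"
  # ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes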
Some enhancements have been made and bugs fixed:
[Timo Tijhof]
* QUnit is run with MW_SCRIPT_PATH set to '/' instead of ''.
* Run 'npm run selenium-test' instead of 'grunt webdriver:test', which delegates the implementation to MediaWiki developers. T179190
[Antoine Musso]
* Clone the repository set in $ZUUL_PROJECT, typically the repository that had a job triggered by Gerrit/Zuul.
* chromedriver could not be found if it was in /usr/local/bin. Pass $PATH from the OS environment before trying to invoke it. Thanks Željko Filipin.
* Fix git 2.15 support by allowing a more recent version of GitPython. It should be either < 2.1.2 or > 2.1.7; intermediate versions have performance regressions. T193057
* Runs on extensions and supports EXT_DEPENDENCIES (it is easier to just pass the repositories to clone as arguments; see the example after this list).
[Željko Filipin]
* Fix README git clone example. T192239
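Following up on the EXT_DEPENDENCIES note above: passing the repositories to clone as positional arguments would look something like the command below. The extension names are only examples, and the exact invocation may still change while extension support is being polished:

  # Hypothetical local run against an extension and one of its dependencies
  quibble mediawiki/extensions/Cite mediawiki/extensions/ParserFunctions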
Known issues:
[T193164] Lack of documentation / tutorial
[T192644] Fails to initialize MySQL on Ubuntu 17 / MySQL 5.7
[T192132] Lack of license
Happy new yea^W^Wweek-end!
On Fri, Apr 27, 2018 at 5:58 PM, Antoine Musso hashar+wmf@free.fr wrote:
[T193222] MariaDB on Stretch uses the utf8mb4 character set. Attempting to create a key on VARCHAR(192) or larger would cause: Error: 1071 Specified key was too long; max key length is 767 bytes
Reducing the key length is the obvious solution and some fields could use to be converted to ENUM.
Personally, I'd rather we didn't use more enums. They work inconsistently for comparisons and ordering, and they require a schema change any time a new value is needed. It'd probably be better to use NameTableStore instead.
The fact that MediaWiki (I know, I think it is not core itself but some important extensions) doesn't work out of the box on the latest/distro-available versions of MySQL and MariaDB is worrying to me (especially when those were supposed to be the best-supported systems?).
I know there are lots of people involved here, and that each database may have different degrees of support based on its user base, plus there have been a lot of changes in those databases since MySQL 5.0, but I have been crying wolf for over two years already: T112637. Supporting MySQL/MariaDB 5.5 doesn't mean we shouldn't support the latest stable versions too, if the changes are sensible.
Note I am not ranting here, and I am the first who will help anyone fix their code; I just want to urge everybody to:
* Support "real" (4-byte) UTF-8: utf8mb4 in MySQL/MariaDB (default in the latest versions) and start deprecating "fake" (3-byte) UTF-8: utf8 * Check code works as intended in "strict" mode (default in the latest versions), at least regarding testing * Check support for latest unicode standards ("emojies") * Avoiding unsafe writes to the database (non-deterministic statements) like UPDATE... LIMIT without ORDER BY * Add primary keys to all tables
Fixing those will likely reveal many hidden bugs caused by assuming a too-lenient storage system, and will allow better support for clustering solutions.
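For the first two points, a quick way to see what a given server is configured to do (these are plain MySQL/MariaDB client commands, nothing MediaWiki-specific):

  # Is strict mode part of the current SQL mode?
  mysql -e "SELECT @@sql_mode;"

  # What character set and collation does the server default to?
  mysql -e "SHOW VARIABLES LIKE 'character_set_server'; SHOW VARIABLES LIKE 'collation_server';"

  # Strict mode can also be turned on server-wide in my.cnf, for example:
  #   [mysqld]
  #   sql_mode = STRICT_ALL_TABLES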
Anomie: I think you were thinking of (maybe?) abstracting the schema for MediaWiki. Fixing the duality of binary (defining sizes in bytes) vs. UTF-8 (defining sizes in characters) would be an interesting problem to solve. The duality is OK; what I mean is being able to store radically different sizes of content based on that setting.
I am also offering my time to any MediaWiki contributor who does not feel confident enough with SQL/persistent storage systems to make those fixes, if you need support.
On Mon, Apr 30, 2018 at 9:05 AM, Jaime Crespo jcrespo@wikimedia.org wrote:
- Support "real" (4-byte) UTF-8: utf8mb4 in MySQL/MariaDB (default in the
latest versions) and start deprecating "fake" (3-byte) UTF-8: utf8
MediaWiki currently doesn't even try to support UTF-8 in MySQL. The core MySQL schema specifically uses "varbinary" and "blob" types for almost everything.
Ideally we'd change that, but see below.
- Check code works as intended in "strict" mode (default in the latest versions), at least regarding testing
While it's not actually part of "strict mode" (I think), I note that MariaDB 10.1.32 (tested on db1114) with ONLY_FULL_GROUP_BY still seems to have the issues described in https://phabricator.wikimedia.org/T108255#2415773.
Anomie- I think you were thinking on (maybe?) abstracting schema for mediawiki- fixing the duality of binary (defining sizes in bytes) vs. UTF-8 (defining sizes in characters) would be an interesting problem to solve- the duality is ok, what I mean is being able to store radically different size of contents based on that setting.
That would be an interesting problem to solve, but doing so may be difficult. We have a number of fields that are currently defined as varbinary(255) and are fully indexed (i.e. not using a prefix).
- Just changing them to varchar(255) using utf8mb4 makes the index exceed MySQL's column length limit.
- Changing them to varchar(191) to keep within the length limit breaks content in primarily-ASCII languages that is taking advantage of the existing 255-byte limit to store more than 191 codepoints.
- Using a prefixed index makes ORDER BY on the column filesort.
- Or the column length limit can be raised if your installation jumps through some hoops, which seem to be the default in 5.7.7 but not before: innodb_large_prefix (https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_large_prefix) set to ON, innodb_file_format (https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_file_format) set to "Barracuda", innodb_file_per_table (https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_file_per_table) set to ON, and tables created with ROW_FORMAT=DYNAMIC or COMPRESSED. I don't know what MariaDB might have as defaults or requirements in which versions.
The ideal, I suppose, would be to require those hoops be jumped through in order for utf8mb4 mode to be enabled. Then a lot of code in MediaWiki would have to vary based on that mode flag to enforce limits on bytes versus codepoints.
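A sketch of those hoops on a pre-5.7.7 style server, using the MySQL variable names linked above (MariaDB defaults and requirements may differ, as noted):

  # Allow index prefixes larger than 767 bytes (these are the defaults from MySQL 5.7.7 on)
  mysql -e "SET GLOBAL innodb_file_format = 'Barracuda';
            SET GLOBAL innodb_file_per_table = ON;
            SET GLOBAL innodb_large_prefix = ON;"

  # Tables then need a row format that supports the large prefix (table name is illustrative)
  mysql -e "CREATE TABLE t_utf8 (name VARCHAR(255), KEY (name))
            ENGINE=InnoDB ROW_FORMAT=DYNAMIC CHARACTER SET utf8mb4;"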
BTW, for anyone reading this who's interested, the task for that schema abstraction idea is https://phabricator.wikimedia.org/T191231.
MediaWiki currently doesn't even try to support UTF-8
I thought the installer gave the option to choose between binary and utf8 (3-byte)? It is OK if we support UTF-8 through binary fields + custom library collations, but I think the sane approach would be to either move everything to binary or support the most complete collation, not the confusing combination of the two. Note I don't need utf8mb4 to be enabled; I just want MediaWiki to work out of the box on any supported MySQL or MariaDB version, including the latest two of each, even if that means doing some workarounds.
While it's not actually part of "strict mode"
It is not. We can delay the GROUP BY change until MariaDB supports it properly according to the SQL standard and we no longer support any older database version's behaviour. However, strict mode ("don't add corrupt data") is available in all versions and is the default in the latest ones, and it should be enabled at least in testing environments.
innodb_large_prefix cannot be set anymore because it is enabled (hardcoded) automatically in MySQL 8.0.
On Mon, Apr 30, 2018 at 10:57 AM, Jaime Crespo jcrespo@wikimedia.org wrote:
MediaWiki currently doesn't even try to support UTF-8
I thought the installer gave the option to choose between binary and utf8 (3-byte)?
Hmm. Yes, it looks like it does. But if all fields are varbinary, does it matter? Maybe it should be removed from the installer.
There's also a $wgDBmysql5 configuration setting, which controls whether MediaWiki does "SET NAMES 'utf8'" or "SET NAMES 'binary'". I don't know what difference this makes, maybe none since all the columns are varbinary.
innodb_large_prefix cannot be set anymore because it is enabled (hardcoded) automatically in MySQL 8.0.
That's good, once we raise the supported version that far. Currently it looks like we still support 5.5.8, which at least has the setting to enable.
I've created https://phabricator.wikimedia.org/T194125. Whether we enable innodb_large_prefix, migrate to binary only, or reduce the maximum size of indexes, there is work (migrations, installer changes, maintenance) needed for each of the possible solutions.