Tonight's RFC session on #wikimedia-office will be about "Overhaul Interwiki map, unify with Sites and WikiMap", see https://phabricator.wikimedia.org/E171 and https://phabricator.wikimedia.org/T113034
The meeting will start at 21:00 UTC: https://www.timeanddate.com/worldclock/fixedtime.html?msg=RFC+session&iso=20160511T21&p1=1440&ah=1
We have talked about overhauling interwiki before. Today, I would like to revisit the topic, look at the current state of things, and discuss next steps and open questions:

Status
-------
* Please review: //factor storage logic out of Interwiki// https://gerrit.wikimedia.org/r/#/c/250150/ (I7d7424345)

Next Steps
------------
* split CDB from SQL implementation
* implement array-based InterwikiLookup (loads from multiple JSON or PHP files)
  * indexes should be generated on the fly, if not present in the loaded data
  * proposed structure: P3044
  * that InterwikiLookup implementation should also implement SiteLookup. Alternatively, only implement SiteLookup, and provide an adapter (SiteLookupInterwikiLookup) that implements InterwikiLookup on top of a SiteLookup.
* implement maintenance script that can convert between different interwiki representations.
  * use InterwikiLookup for (multiple) input sources (db/files), InterwikiStore for output
  * we want an InterwikiStore that can write the new array structure (as JSON or PHP)
  * we want an InterwikiStore that can write the old CDB structure (as CDB or PHP)
* Provide a config variable for specifying which files to read interwiki info from. If not set, use old settings and old interwiki storage.
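
To make the "array-based lookup with on-the-fly indexes" idea concrete, here is a minimal sketch. It is written in Python for brevity only (MediaWiki itself is PHP), and everything in it is an assumption for illustration: the file layout (`sites`, `prefixes`, `by_prefix`), the class name `ArrayInterwikiLookup`, and the `fetch` method are hypothetical and are not the structure proposed in P3044 or MediaWiki's actual interfaces.

```python
import json
import os
import tempfile


def build_prefix_index(sites):
    """Map each interwiki prefix to the id of the site that declares it."""
    return {
        prefix: site_id
        for site_id, site in sites.items()
        for prefix in site.get("prefixes", [])
    }


class ArrayInterwikiLookup:
    """Hypothetical lookup backed by plain arrays loaded from JSON files."""

    def __init__(self, paths):
        self.sites = {}      # site id -> site record
        self.by_prefix = {}  # interwiki prefix -> site id
        for path in paths:
            with open(path, encoding="utf-8") as f:
                data = json.load(f)
            sites = data.get("sites", {})
            # Files listed later override earlier ones for the same site id.
            self.sites.update(sites)
            # Use the shipped index if present, otherwise generate it now.
            index = data.get("by_prefix")
            if index is None:
                index = build_prefix_index(sites)
            self.by_prefix.update(index)

    def fetch(self, prefix):
        """Return the site record for a prefix, or None if it is unknown."""
        site_id = self.by_prefix.get(prefix)
        return self.sites.get(site_id) if site_id is not None else None


# Demo with two made-up files: one without an index, one with a pre-built one.
with tempfile.TemporaryDirectory() as tmp:
    shared = os.path.join(tmp, "shared.json")
    local = os.path.join(tmp, "local.json")
    with open(shared, "w", encoding="utf-8") as f:
        json.dump({"sites": {"enwiki": {
            "url": "https://en.wikipedia.org/wiki/$1",
            "prefixes": ["en", "wikipedia"]}}}, f)
    with open(local, "w", encoding="utf-8") as f:
        json.dump({"sites": {"mywiki": {
            "url": "https://example.org/wiki/$1",
            "prefixes": ["my"]}},
            "by_prefix": {"my": "mywiki"}}, f)
    lookup = ArrayInterwikiLookup([shared, local])

print(lookup.fetch("en")["url"])
```

Shipping a pre-built index in generated files (and falling back to building it at load time for hand-written ones) is one possible way to read the "generated on the fly, if not present" item; whether that is worth the extra complexity is part of what the session should settle.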

Questions
-----------
* is this a good plan? (see below for rationale)
* how does interwiki/site info relate to local wiki config (wgConf/SiteMatrix/WikiMap)?
* should all information always be loaded? (see also {T114772})
* do we need caching?
* do we need to support new features also for the SQL based InterwikiLookup?
  * needs: interwiki_ids table, interwiki_groups table, and blob field with JSON or an interwiki_props table.
* Should SiteMatrix continue to work based on wgConf, or should it be ported to use Sites? Or combine both? Currently it has [[https://gerrit.wikimedia.org/r/#/c/211119/|problems]] with Wikimedia-specific configurations, e.g. for [[https://meta.wikimedia.org/wiki/Special_language_codes|special language codes]].

Later
-------
* decide on how wikis on the WMF cluster should load their interwiki config
  * proposal: three files: family (shared by e.g. all wikipedias), language (shared by e.g. all english wikis), and local.
* create a script that generates the family, language, and local files for all the wikis (as JSON or PHP) based on config. Should work like dumpInterwiki.
  * check this: generating CDB based on the relevant family/language/local file for a given wiki should return the same CDB as dumpInterwiki for that site.
* create a deployment process that generates PHP files from the checked-in JSON files, for faster loading.
* action=siteinfo&siprop=interwikimap could be ported to Sites and expose more information. Distinction from SiteMatrix is becoming somewhat unclear then.
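
The three-file proposal and the "same result as dumpInterwiki" check can be sketched roughly as follows. This is Python for illustration only, with made-up data; the function names are hypothetical, and the precedence order (local overrides language, which overrides family) is an assumption of this sketch, not something the proposal has settled.

```python
def merge_interwiki_layers(family, language, local):
    """Combine prefix -> URL maps; later layers win on conflicting prefixes."""
    merged = {}
    for layer in (family, language, local):
        merged.update(layer)
    return merged


def differing_prefixes(a, b):
    """Return the sorted prefixes whose entries differ between two maps."""
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))


# Made-up example layers for a single wiki.
family = {"commons": "https://commons.wikimedia.org/wiki/$1"}
language = {"b": "https://en.wikibooks.org/wiki/$1"}
local = {"b": "https://example.org/books/$1"}  # hypothetical local override

merged = merge_interwiki_layers(family, language, local)
```

A consistency check along the lines of the bullet above would then compare `merged` against whatever map the existing dump produces for the same wiki and expect `differing_prefixes` to return an empty list.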
wikitech-l@lists.wikimedia.org