On 02.11.20 at 19:24, Daniel Kinzler wrote:

[Re-posting with fixed links. Thanks for pointing this out Cormac!]

This is the weekly TechCom board review. Remember that there is no meeting on Wednesday, so any discussion should happen via email. For individual RFCs, please keep discussion to the Phabricator tickets.

That's another issue I wanted to raise: Platform Engineering is working on switching ParserCache to JSON. For that, we have to make sure extensions only put JSON-serializable data into ParserOutput objects, via setProperty() and setExtensionData(). We are currently trying to figure out how best to do that for TemplateData.

TemplateData already uses JSON serialization, but then compresses the JSON output, to make the data fit into the page_props table. This results in binary data in ParserOutput, which we can't directly put into JSON. There are several solutions under discussion, e.g.:

* Don't write the data to page_props, treat it as extension data in ParserOutput. Compression would become unnecessary. However, batch loading of the data becomes much slower, since each ParserOutput needs to be loaded from ParserCache. Would it be too slow?

* Apply compression for page_props, but not for the data in ParserOutput. We would have to introduce some kind of serialization mechanism into PageProps and LinksUpdate. Do we want to encourage this use of page_props?

* Introduce a dedicated database table for templatedata. Cleaner, but schema changes and data migration take a long time.

* Put templatedata into the BlobStore, and just the address into page_props. Makes loading slower, maybe even slower than the solution that relies on ParserCache.

* Convert TemplateData to MCR. This is the cleanest solution, but would require us to create an editing interface for templatedata, and migrate existing data out of wikitext. This is a long-term prospect.
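
To make the underlying problem concrete, here is a minimal sketch in Python, used only as a stand-in for the PHP code in question. The zlib compression step and the sample data structure are assumptions for illustration and are not taken from TemplateData itself. It shows that the JSON text is fine to embed in a JSON document, while its compressed form is arbitrary binary that is neither valid UTF-8 nor JSON-serializable:

```python
import json
import zlib

# Hypothetical TemplateData-like structure; the field names are illustrative only.
template_data = {
    "description": "Example template",
    "params": {"1": {"type": "string", "required": True}},
}

json_text = json.dumps(template_data)                  # plain text
compressed = zlib.compress(json_text.encode("utf-8"))  # binary blob


def is_json_serializable(value):
    """Return True if json.dumps accepts the value as-is."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


def is_valid_utf8(blob):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        blob.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_json_serializable(json_text))   # True
print(is_json_serializable(compressed))  # False
print(is_valid_utf8(compressed))         # False
```

The options above differ mainly in where the uncompressed form lives and how expensive it is to load in bulk.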

To unblock migration of ParserCache to JSON, we need at least a temporary solution that can be implemented quickly. A somewhat hacky solution I can see is:

* Detect binary page properties and apply base64 encoding to them when serializing ParserOutput to JSON. This is possible because page properties can only be scalar values, so we can convert them to something like { _encoding_: "base64", data: "34c892ur3d40" } and recognize that structure when decoding. This wouldn't work for data set with setExtensionData(), since that could already be an arbitrary structure.
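
A rough sketch of that wrapping, again in Python as a stand-in for PHP. The function names and the property names in the usage example are hypothetical. Only the { _encoding_, data } shape comes from the proposal above:

```python
import base64
import json

MARKER = "_encoding_"  # marker key from the proposal above


def encode_property(value):
    """Wrap binary scalars in a base64 envelope; pass other scalars through."""
    if isinstance(value, bytes):
        return {MARKER: "base64", "data": base64.b64encode(value).decode("ascii")}
    return value


def decode_property(value):
    """Unwrap a base64 envelope; leave everything else untouched."""
    if isinstance(value, dict) and value.get(MARKER) == "base64":
        return base64.b64decode(value["data"])
    return value


def properties_to_json(props):
    """Serialize a flat mapping of page properties to JSON text."""
    return json.dumps({name: encode_property(v) for name, v in props.items()})


def properties_from_json(text):
    """Inverse of properties_to_json."""
    return {name: decode_property(v) for name, v in json.loads(text).items()}


# Usage: one binary property, two ordinary scalar ones (names are made up).
props = {"templatedata": b"\x1f\x8b\x08\x00\xff", "displaytitle": "Foo", "count": 3}
restored = properties_from_json(properties_to_json(props))
print(restored == props)  # True
```

The round trip is only unambiguous because the original values are scalars: a dict with an "_encoding_" key can only have come from the encoder. For arbitrary structures, a value that happens to have the same shape would be wrongly decoded.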

-- 
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation