On Tuesday, July 12, 2016, Daniel Kinzler daniel.kinzler@wikimedia.de wrote:
On 12.07.2016 at 18:00, Rob Lanphier wrote:
On Tue, Jul 12, 2016 at 1:40 AM, Daniel Kinzler <daniel.kinzler@wikimedia.de> wrote:
The original design of ContentHandler used integer IDs for content models and formats in the DB. A mapping to human readable names is only needed for logging and error messages anyway.
This oversimplifies things greatly. Integer IDs need to be mapped to some well-specified, non-local (global?) identifier for many many purposes (reading exports, writing exports, reading site content, displaying site content for many contexts, etc.).
Yea, sorry. That we only need this for logging is what I assumed back then. Not exposing the numeric ID at all, and using the canonical name in dumps, the API, etc, avoids a lot of trouble (but doesn't come free).
Yes, numeric IDs are internal and ideally never to be exposed. We should've done the same with namespaces but got dragged into compat hell. :)
We need to put a lot of thought into content model management generally. This statement implies managing content models outside of the database is easy.
Well, it's the same as namespaces: they are easy to set up, but also too easy to change, so it's easy to create a mess...
As explained in my earlier response, I now realized that content models differ from namespaces in that they are not really configured by people, but rather registered by extensions. That makes it a lot less awkward to have them in the database. We still have to agree on a good trigger for the registration, but it doesn't seem to be a tricky issue.
Yeah, an auto insert if needed is good in theory, though I worry about write contention on the central mapping table. If no write locks are held in the common case where no insertion is needed, then I think the ideas proposed should work.
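A minimal sketch of the "read first, insert only if missing" lookup described above, using Python's sqlite3 module purely for illustration. The table name content_models, its columns, and the helper names open_registry and get_model_id are assumptions made for this example; they are not the actual MediaWiki schema or API.

```python
import sqlite3


def open_registry():
    """Create an in-memory registry table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE content_models ("
        " model_id INTEGER PRIMARY KEY,"
        " model_name TEXT NOT NULL UNIQUE)"
    )
    return conn


def _lookup(conn, name):
    # Plain read: takes no write lock on the mapping table.
    row = conn.execute(
        "SELECT model_id FROM content_models WHERE model_name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


def get_model_id(conn, name):
    """Return the numeric ID for a canonical model name, registering it if needed."""
    model_id = _lookup(conn, name)
    if model_id is not None:
        return model_id  # common case: no write at all
    # Rare case: the name is unknown, so insert it. OR IGNORE keeps this safe
    # if another writer registered the same name in the meantime.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO content_models (model_name) VALUES (?)", (name,)
        )
    return _lookup(conn, name)
```

Only the first request for a given model name writes to the table; every later call is a single read, which is the property the contention concern hinges on.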
What we still need to figure out is how to solve the chicken-and-egg situation with Multi-Content-Rev. At the moment, I'm thinking this might work:
- introduce content model (and format) registry in the DB, and populate it.
- leave page and revision table as they are for now.
- introduce slots table, use the new content_model (and content_format) table.
- stop using the content model (and format) from the page and revision tables
- drop the content model (and format) from the page and revision tables
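To make the end state of these steps concrete, here is a rough sketch of a slots table that refers to the registry by numeric ID, again using Python's sqlite3 module. Every table and column name (content_models, slots, slot_revision, slot_role, slot_model) and both helper functions are hypothetical and chosen only for illustration; the real schema was still under discussion in this thread.

```python
import sqlite3

# Hypothetical schema: the registry maps names to compact integer IDs,
# and each slot row stores only the integer.
SCHEMA = """
CREATE TABLE content_models (
    model_id   INTEGER PRIMARY KEY,
    model_name TEXT NOT NULL UNIQUE
);
CREATE TABLE slots (
    slot_revision INTEGER NOT NULL,
    slot_role     TEXT NOT NULL,
    slot_model    INTEGER NOT NULL REFERENCES content_models(model_id),
    PRIMARY KEY (slot_revision, slot_role)
);
"""


def build_demo():
    """Build an in-memory database with one revision that has two slots."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO content_models (model_id, model_name) VALUES (?, ?)",
        [(1, "wikitext"), (2, "json")],
    )
    conn.executemany(
        "INSERT INTO slots (slot_revision, slot_role, slot_model) VALUES (?, ?, ?)",
        [(100, "main", 1), (100, "metadata", 2)],
    )
    conn.commit()
    return conn


def slot_model_names(conn, rev_id):
    """Resolve each slot of a revision to its canonical model name via the registry."""
    rows = conn.execute(
        "SELECT s.slot_role, m.model_name"
        " FROM slots s JOIN content_models m ON m.model_id = s.slot_model"
        " WHERE s.slot_revision = ? ORDER BY s.slot_role",
        (rev_id,),
    ).fetchall()
    return dict(rows)
```

In this layout the page and revision tables no longer need their own model and format columns, since the model is resolved per slot through the registry.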
Does that sound like a good plan? Let's for a moment assume we can get slots fully rolled out by the end of the year.
This sounds good to me - lets us introduce a more space efficient model mapping and drop the extra fields from page and rev later.
-- brion
-- Daniel Kinzler Senior Software Developer
Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l