On Tuesday, July 12, 2016, Daniel Kinzler <daniel.kinzler(a)wikimedia.de>
wrote:
On 12.07.2016 at 18:00, Rob Lanphier wrote:
On Tue, Jul 12, 2016 at 1:40 AM, Daniel Kinzler
<daniel.kinzler(a)wikimedia.de> wrote:
The original design of ContentHandler used integer IDs for content models
and formats in the DB. A mapping to human readable names is only needed
for logging and error messages anyway.
This oversimplifies things greatly. Integer IDs need to be mapped to some
well-specified, non-local (global?) identifier for many, many purposes
(reading exports, writing exports, reading site content, displaying site
content for many contexts, etc.)
Yea, sorry. That we only need this for logging is what I assumed back
then. Not exposing the numeric ID at all, and using the canonical name in
dumps, the API, etc., avoids a lot of trouble (but doesn't come free).
Yes, numeric IDs are internal and ideally never to be exposed. We should've
done the same with namespaces but got dragged into compat hell. :)
We need to put a lot of thought into content model management generally.
This statement implies managing content models outside of the database is
easy.
Well, it's the same as namespaces: they are easy to set up, but also too
easy to change, so it's easy to create a mess...
As explained in my earlier response, I now realized that content models
differ from namespaces in that they are not really configured by people,
but rather registered by extensions. That makes it a lot less awkward to
have them in the database. We still have to agree on a good trigger for
the registration, but it doesn't seem to be a tricky issue.
Yeah, an auto-insert if needed is good in theory, though I worry about write
contention on the central mapping table. If no write locks are kept in the
common case where no insertion is needed, then I think the ideas proposed
should work.
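
To illustrate the "no write lock in the common case" point, here is a
minimal sketch of a read-first lookup that only writes when a model name is
seen for the first time. It uses Python's sqlite3 purely for illustration;
the table name content_models, its columns, and the helper get_model_id are
assumptions made up for this example, not an existing MediaWiki schema or
API.

```python
import sqlite3


def get_model_id(conn, name, cache):
    """Return the internal integer ID for a content model name.

    Hypothetical helper: table and column names are illustrative only.
    """
    # Fast path: in-process cache, no database access at all.
    if name in cache:
        return cache[name]

    query = "SELECT model_id FROM content_models WHERE model_name = ?"

    # Common case: a plain read, which needs no write lock.
    row = conn.execute(query, (name,)).fetchone()
    if row is None:
        # Rare case: first use of this model. INSERT OR IGNORE tolerates
        # another writer having registered the same name in the meantime.
        with conn:
            conn.execute(
                "INSERT OR IGNORE INTO content_models (model_name) VALUES (?)",
                (name,),
            )
        row = conn.execute(query, (name,)).fetchone()

    cache[name] = row[0]
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content_models ("
    " model_id INTEGER PRIMARY KEY,"
    " model_name TEXT NOT NULL UNIQUE)"
)

cache = {}
wikitext_id = get_model_id(conn, "wikitext", cache)  # inserts once
json_id = get_model_id(conn, "json", cache)          # inserts once
```

After the first registration, every later lookup for the same name is
served from the cache or by a single SELECT, so the mapping table is only
written to when a new model appears.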
What we still need to figure out is how to solve the chicken-and-egg
situation with Multi-Content-Rev. At the moment, I'm thinking this might
work:
* introduce content model (and format) registry in the DB, and populate it.
* leave page and revision table as they are for now.
* introduce slots table, use the new content_model (and content_format)
  table.
* stop using the content model (and format) from the page and revision
  tables
* drop the content model (and format) from the page and revision tables
Does that sound like a good plan? Let's for a moment assume we can get
slots fully rolled out by the end of the year.
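
The first four steps of the plan above can be sketched roughly as follows,
again using Python's sqlite3 only as a stand-in. All table and column names
here (content_models, slots, slot_role, rev_content_model as a plain text
column) and the helper model_of are assumptions for the sake of the
example, not the actual schema. The final step, dropping the old columns,
is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Existing table, left as it is for now (step 2).
CREATE TABLE revision (
    rev_id INTEGER PRIMARY KEY,
    rev_content_model TEXT
);
INSERT INTO revision VALUES (1, 'wikitext'), (2, 'json'), (3, 'wikitext');

-- Step 1: registry mapping canonical names to compact integer IDs,
-- populated from the names already in use.
CREATE TABLE content_models (
    model_id INTEGER PRIMARY KEY,
    model_name TEXT NOT NULL UNIQUE
);
INSERT INTO content_models (model_name)
    SELECT DISTINCT rev_content_model FROM revision;

-- Step 3: slots reference the registry instead of storing the name.
CREATE TABLE slots (
    slot_rev_id INTEGER NOT NULL,
    slot_role TEXT NOT NULL,
    slot_model_id INTEGER NOT NULL REFERENCES content_models (model_id),
    PRIMARY KEY (slot_rev_id, slot_role)
);
INSERT INTO slots
    SELECT rev_id, 'main', model_id
    FROM revision JOIN content_models ON model_name = rev_content_model;
""")


def model_of(conn, rev_id, role="main"):
    """Step 4: resolve the model via slots and the registry only."""
    row = conn.execute(
        "SELECT model_name FROM slots"
        " JOIN content_models ON model_id = slot_model_id"
        " WHERE slot_rev_id = ? AND slot_role = ?",
        (rev_id, role),
    ).fetchone()
    return row[0] if row else None
```

Once every reader goes through something like model_of, nothing depends on
the model column in the revision table any more, which is what would make
the last step (dropping it) safe.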
This sounds good to me - lets us introduce a more space efficient model
mapping and drop the extra fields from page and rev later.
-- brion
--
Daniel Kinzler
Senior Software Developer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l