As part of Wikidata, I'm rather fundamentally redesigning the namespace logic of MediaWiki, and I'd like to invite input, specifically on:
1) whether I should try to get these changes ready for 1.5, or commit them to a separate branch,
2) whether there are objections or suggestions regarding any of the below.
Simply put, I am moving namespaces from the language files and from Namespace.php into the database. In my present implementation, the namespaces are loaded from the DB on every request, but I will add support for memcached as well.
The goal is to make it easier to change and add namespace names and properties (some names cannot currently be changed at all without editing the Language*.php file, which is impractical when you're using a shared codebase), and to have an arbitrary number of synonyms for any namespace.
The table structures are as follows:
mysql> show columns from namespace;
+-------------------+--------------+------+-----+---------+-------+
| Field             | Type         | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| ns_id             | int(8)       |      | PRI | 0       |       |
| ns_system         | varchar(80)  |      |     | 0       |       |
| ns_subpages       | tinyint(1)   |      |     | 0       |       |
| ns_search_default | tinyint(1)   |      |     | 0       |       |
| ns_target         | varchar(200) | YES  |     | NULL    |       |
+-------------------+--------------+------+-----+---------+-------+
(ns_target is a new feature I'm working on which allows you to specify a default "destination prefix" for any link from within that namespace. This can be an InterWiki link or a namespace. That will be useful in a variety of contexts, for example, within a Wikibooks module, all links could point to pages within the namespace by default; within a Wikidata page, all links could go to Wikipedia by default, etc.)
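A minimal sketch of how a default destination prefix might be applied to a link, assuming that links which already carry a known namespace or InterWiki prefix are left alone. The function name `apply_default_target` and the `known_prefixes` set are hypothetical and are not taken from the MediaWiki code.

```python
def apply_default_target(link, ns_target, known_prefixes):
    """Prepend ns_target to a link that has no recognised prefix.

    link           -- link text as written, e.g. "Chapter 1"
    ns_target      -- default destination prefix of the source namespace, or None
    known_prefixes -- lowercased namespace and InterWiki prefixes (assumed input)
    """
    prefix, sep, _rest = link.partition(":")
    if sep and prefix.strip().lower() in known_prefixes:
        # The author named a destination explicitly; keep it.
        return link
    if not ns_target:
        # No default configured for this namespace.
        return link
    return f"{ns_target}:{link}"
```

With `ns_target` set to "Cookbook", a plain link such as "Chapter 1" would resolve to "Cookbook:Chapter 1", while "Wikipedia:Bread" would stay unchanged.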
mysql> show columns from namespace_names;
+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| ns_id      | int(8)       |      |     | 0       |       |
| ns_name    | varchar(200) |      |     |         |       |
| ns_default | tinyint(1)   |      |     | 0       |       |
+------------+--------------+------+-----+---------+-------+
ns_default is the namespace all other namespace names for that ns_id will redirect to.
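For readers who want to experiment, here is a rough sketch of the two tables and a lookup of the default name. It uses an in-memory SQLite database as a stand-in for MySQL, so the column types are approximations, and the helper names `open_namespace_db` and `default_name` are hypothetical.

```python
import sqlite3

# Approximate SQLite mirror of the two MySQL tables (types are assumptions).
SCHEMA = """
CREATE TABLE namespace (
    ns_id             INTEGER PRIMARY KEY,
    ns_system         TEXT    NOT NULL DEFAULT '0',
    ns_subpages       INTEGER NOT NULL DEFAULT 0,
    ns_search_default INTEGER NOT NULL DEFAULT 0,
    ns_target         TEXT
);
CREATE TABLE namespace_names (
    ns_id      INTEGER NOT NULL DEFAULT 0,
    ns_name    TEXT    NOT NULL DEFAULT '',
    ns_default INTEGER NOT NULL DEFAULT 0
);
"""


def open_namespace_db():
    """Create an empty in-memory database with both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def default_name(conn, ns_id):
    """Return the name flagged ns_default = 1 for ns_id, or None if there is none."""
    row = conn.execute(
        "SELECT ns_name FROM namespace_names WHERE ns_id = ? AND ns_default = 1",
        (ns_id,),
    ).fetchone()
    return row[0] if row else None
```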
I have got the backend fully working and am now working on the namespace manager frontend and the installer code. Essentially, what it means is that you can have stuff like:
mysql> select * from namespace_names where ns_id=6;
+-------+---------+------------+
| ns_id | ns_name | ns_default |
+-------+---------+------------+
|     6 | File    |          1 |
|     6 | Image   |          0 |
|     6 | Video   |          0 |
|     6 | Sound   |          0 |
+-------+---------+------------+
This means that the default namespace name is "File", and that "Image", "Video" and "Sound" redirect to it. No more "Image" prefix for sound files! It's quite a pleasure to see this working. There can be an arbitrary number of namespace synonyms for any namespace.
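The redirect behaviour can be illustrated with a small sketch that rewrites any synonym prefix to the default name. The rows are copied from the listing for ns_id 6; the function names and the case-insensitive matching of prefixes are assumptions made for the example.

```python
# Rows as (ns_id, ns_name, ns_default), taken from the listing for ns_id 6.
ROWS = [
    (6, "File", 1),
    (6, "Image", 0),
    (6, "Video", 0),
    (6, "Sound", 0),
]


def build_lookup(rows):
    """Return (name -> ns_id, ns_id -> default name) dictionaries."""
    by_name = {}
    default = {}
    for ns_id, name, is_default in rows:
        by_name[name.lower()] = ns_id  # assumption: prefixes match case-insensitively
        if is_default:
            default[ns_id] = name
    return by_name, default


def normalize_title(title, rows=ROWS):
    """Rewrite 'Synonym:Page' to 'Default:Page'; leave other titles unchanged."""
    by_name, default = build_lookup(rows)
    prefix, sep, rest = title.partition(":")
    ns_id = by_name.get(prefix.strip().lower())
    if not sep or ns_id is None:
        return title  # no recognised namespace prefix
    return f"{default.get(ns_id, prefix)}:{rest}"
```

Under these assumptions, "Sound:Soundfile.ogg" and "Image:Soundfile.ogg" both normalize to "File:Soundfile.ogg".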
The major part of the effort is getting this to work properly with all language files. Essentially, I intend to import the initial namespace names and English canonical names during installation or upgrade, similar to the way the MediaWiki namespace is imported. I would also like to try to figure out a smooth upgrade procedure for $wgMetaNamespace and $wgExtraNamespaces, as well as the other namespace-related settings.
The namespace manager will have a separate set of permissions which can be atomically assigned to a group. I intend to add some checks here, e.g.:
* don't allow it to delete a namespace that contains pages
* don't allow it to create a namespace if there are existing pages with that prefix (avoid conflicts with pseudo-namespaces)
* warn if there are conflicts with InterWiki prefixes
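The three checks could look roughly like the following. This is only a sketch: pages are modelled as hypothetical `(namespace_id, title)` pairs rather than database rows, namespace 0 is assumed to be the main namespace, and both function names are invented for the example.

```python
def can_delete_namespace(ns_id, pages):
    """Allow deletion only if no page lives in the namespace.

    pages -- iterable of (namespace_id, title) pairs (hypothetical shape)
    """
    return all(page_ns != ns_id for page_ns, _title in pages)


def check_new_namespace(name, pages, interwiki_prefixes):
    """Return (errors, warnings) for a proposed new namespace name."""
    errors, warnings = [], []
    prefix = name.lower() + ":"
    # Main-namespace pages titled "Name:..." form a pseudo-namespace that
    # would be hidden by a real namespace of the same name.
    clashing = [t for ns, t in pages if ns == 0 and t.lower().startswith(prefix)]
    if clashing:
        errors.append(f"{len(clashing)} existing page(s) already use the prefix {name}:")
    # An InterWiki clash is only a warning, as in the list above.
    if name.lower() in {p.lower() for p in interwiki_prefixes}:
        warnings.append(f"{name} is also an InterWiki prefix")
    return errors, warnings
```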
Again, please let me know what you think about this, if there are any questions, and whether I should try to get this ready for 1.5. Are we near/at the feature freeze stage? Is there going to be a REL1_5 branch soon?
Best,
Erik
On 30/05/05, Erik Moeller erik_moeller@gmx.de wrote:
As part of Wikidata, I'm rather fundamentally redesigning the namespace logic of MediaWiki, and I'd like to invite input, specifically on
- whether I should try to get these changes ready for 1.5, or commit
them to a separate branch,
Well, I believe Brion is keen to see 1.5 ready sooner than later - according to http://mail.wikipedia.org/pipermail/wikitech-l/2005-April/028899.html he was hoping it would be in beta a month ago, so landing a change as fundamental as this is unlikely to be popular with him. And for the record, I think he's right - if the unstable branch stays unstable for too long at a time, features that are fully mature take ages to see the light of day.
- whether there are objections or suggestions regarding any of the below.
Simply put, I am moving namespaces from the language files and from Namespace.php into the database. In my present implementation, the namespaces are loaded from the DB on every request, but I will add support for memcached as well.
Seems reasonably sensible so far.
(ns_target is a new feature I'm working on which allows you to specify a default "destination prefix" for any link from within that namespace. This can be an InterWiki link or a namespace. That will be useful in a variety of contexts, for example, within a Wikibooks module, all links could point to pages within the namespace by default; within a Wikidata page, all links could go to Wikipedia by default, etc.)
That sounds like something a lot of people will like - not so much often-requested, as often-misunderstood-that-it-already-works. One thing occurs to me though - can the "main" namespace have synonyms (or, perhaps, be removed altogether)? This is important because in the Wikibooks scenario (see also something like the Mozilla Wiki, which uses namespaces more-or-less in this idiom), having [[foo]] always point to something completely different to [[:foo]] may not be an ideal way of distinguishing those two pages.
This means that the default namespace name is "File", and that "Image", "Video" and "Sound" redirect to it. No more "Image" prefix for sound files! It's quite a pleasure to see this working. There can be an arbitrary number of namespace synonyms for any namespace.
That's another feature that could be very useful, if used carefully - as well as that situation, there are sites where it would be nice for "Project", "Wikipedia", *and* a translated name *all* to work. I've also considered it would be useful to have abbreviated versions of namespace names - so "WP:" could be an alias for "Wikipedia:"; the existing "shortcuts" could be "Wikipedia:VP" etc but still work, and you could type "WP:Glossary" without anyone having created a shortcut for it.
Of course, as with InterWiki prefixes, the danger is that as soon as someone *removes* such a synonym, all links using it instantly break (or, more confusingly, not instantly, because of caching).
The major part of the effort is getting this to work properly with all language files.
This is mildly off-topic, but why do all language files override methods which differ only in a reference to a global array? Why not just have the array a member variable and use inheritance properly? (The reason I discovered this was identifying a hack in LanguageFr.php::getNsIndex() that was accidentally copied to LanguageOc.php, which makes "Wikipedia:" a synonym for the actual translated name.) I believe Special pages do something similar, defining global functions rather than just overriding an inherited one. Is this because of a flaw in PHP's OOP model, or is it just bad design? [Or am I wrong and this is actually *good* design in some way?]
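To make the suggestion concrete, here is a sketch of the "array as a member variable" idea, written in Python rather than PHP. The class and method names loosely echo the language files but are hypothetical, and the namespace names are illustrative values only.

```python
class Language:
    # Namespace names held as class data instead of a global array.
    # The values here are illustrative, not the real MediaWiki tables.
    namespace_names = {0: "", 1: "Talk", 6: "Image"}

    def get_ns_text(self, index):
        """Return the namespace name for an index, or None if unknown."""
        return self.namespace_names.get(index)

    def get_ns_index(self, text):
        """Return the index for a namespace name (case-insensitive), or None."""
        wanted = text.lower()
        for index, name in self.namespace_names.items():
            if name.lower() == wanted:
                return index
        return None


class LanguageFr(Language):
    # Only the data is overridden; both lookup methods are inherited as-is.
    namespace_names = {**Language.namespace_names, 1: "Discuter"}
```

A subclass written this way has no copy of `get_ns_index` into which a language-specific hack could be pasted by accident.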
Rowan Collins wrote:
This is mildly off-topic, but why do all language files override methods which differ only in a reference to a global array? Why not just have the array a member variable and use inheritance properly?
The original design was borken and it hasn't been changed yet.
-- brion vibber (brion @ pobox.com)
On Mon, 2005-05-30 at 15:55 -0700, Brion Vibber wrote:
This is mildly off-topic, but why do all language files override methods which differ only in a reference to a global array? Why not just have the array a member variable and use inheritance properly?
The original design was borken and it hasn't been changed yet.
I think I had a hangover that day...
Rowan:
Well, I believe Brion is keen to see 1.5 ready sooner than later - according to http://mail.wikipedia.org/pipermail/wikitech-l/2005-April/028899.html he was hoping it would be in beta a month ago, so landing a change as fundamental as this is unlikely to be popular with him. And for the record, I think he's right - if the unstable branch stays unstable for too long at a time, features that are fully mature take ages to see the light of day.
That makes sense. Depending on when I'm finished, I'll either commit to a separate branch or ask for 1.5 itself to be branched.
That sounds like something a lot of people will like - not so much often-requested, as often-misunderstood-that-it-already-works. One thing occurs to me though - can the "main" namespace have synonyms (or, perhaps, be removed altogether)? This is important because in the Wikibooks scenario (see also something like the Mozilla Wiki, which uses namespaces more-or-less in this idiom), having [[foo]] always point to something completely different to [[:foo]] may not be an ideal way of distinguishing those two pages.
Interesting point. Redirecting pages with no prefix to a namespace might be possible; would that have the result you want? Are there any particular problems that would have to be dealt with in such a scenario?
Of course, as with InterWiki prefixes, the danger is that as soon as someone *removes* such a synonym, all links using it instantly break (or, more confusingly, not instantly, because of caching).
Yes. One major reason to move namespaces into the DB is that we can do proper validations before mucking about with them -- that is almost impossible when it's all done in variables in the configuration files. For example, it's currently completely possible to inadvertently hide InterWiki prefixes or pseudo-namespaces by adding an extra namespace, or to break links by changing one.
One problem that will have to be dealt with is that a change of the language code will require new namespace names to be used. A "Load from language file" link on Special:Namespaces might be a reasonable temporary solution, but in the long run, it would probably be desirable to move site-wide preferences like this into the DB as well, so we can validate changes and trigger certain effects. I recall there was some work on this a while ago, hopefully it can be revived at some point.
Best,
Erik
Erik Moeller wrote:
Again, please let me know what you think about this, if there are any questions, and whether I should try to get this ready for 1.5. Are we near/at the feature freeze stage? Is there going to be a REL1_5 branch soon?
This is rather too experimental for 1.5 at this point (which will be going live within a week or two).
-- brion vibber (brion @ pobox.com)
Erik Moeller wrote:
ns_default is the namespace all other namespace names for that ns_id will redirect to.
I would have called it "ns_canonical" or even "ns_is_canonical", but with sufficient documentation it shouldn't make much difference :)
This means that the default namespace name is "File", and that "Image", "Video" and "Sound" redirect to it.
How is this going to tie in with the [[Image:...]] syntax being special because it inserts an image rather than linking to it? That needs to be internationalised too. :) Oh, and don't forget about the [[Media:...]] syntax, which is also special.
- don't allow it to delete a namespace that contains pages
- don't allow it to create a namespace if there are existing pages with
that prefix (avoid conflicts with pseudo-namespaces)
Just a suggestion, but if it isn't too much work:
* when deleting a namespace that contains pages, offer to change all those pages' names to be prefixed (i.e. turn it into a pseudo-namespace)?
* when creating a namespace if there are existing pages with that prefix, offer the option of "importing" those pages into the new namespace?
Greetings, Timwi
Timwi:
ns_default is the namespace all other namespace names for that ns_id will redirect to.
I would have called it "ns_canonical" or even "ns_is_canonical", but with sufficient documentation it shouldn't make much difference :)
The behavior is different from what is currently called "canonical". The current canonical namespaces are essentially synonyms which always redirect to the local form. This, on the other hand, is the local form that all synonyms redirect to -- and one of these synonyms can be canonical, i.e. available in all languages by default.
How is this going to tie in with the [[Image:...]] syntax being special because it inserts an image rather than linking to it? That needs to be internationalised too. :) Oh, and don't forget about the [[Media:...]] syntax, which is also special.
In my opinion, any special functionality associated with the image namespace and its synonyms should be unrelated to the name you are using. Instead, the code should check what type of file it is dealing with, and make a decision based on that how to deal with it (embed as sound, image, video). Even different image types are not treated the same way (SVG is rendered to PNG first).
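As a sketch of that idea, the handling can be chosen from the file extension alone, ignoring whichever namespace synonym was typed. The handler names and the extension table below are hypothetical; only the SVG-to-PNG step and the non-embedded sound file reflect behaviour described in this thread.

```python
import os

# Hypothetical mapping from file extension to handling strategy.
HANDLERS = {
    ".png": "embed_image",
    ".jpg": "embed_image",
    ".svg": "render_to_png_then_embed",   # SVG is rendered to PNG first
    ".ogg": "link_to_description_page",   # no embedding for sound yet
}


def choose_handler(title):
    """Pick a handler from the file type; the namespace prefix is ignored."""
    _prefix, _sep, filename = title.rpartition(":")
    extension = os.path.splitext(filename)[1].lower()
    # Unknown types fall back to a plain link to the description page.
    return HANDLERS.get(extension, "link_to_description_page")
```

Here "Sound:soundfile.ogg" and "Image:soundfile.ogg" get the same treatment, because the decision depends only on ".ogg".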
Whether we should create synonyms beyond the simple Image=>File redirect when we don't have special ways of dealing with these file types yet is arguable. The current behavior would be that [[Sound:soundfile.ogg]] would link to the file description page for that file, but not embed it (so would [[Image:soundfile.ogg]]). But it might be enough to have [[File:Soundfile.ogg]] for this purpose for now.
Just a suggestion, but if it isn't too much work:
- when deleting a namespace that contains pages, offer to change all those pages' names to be prefixed (i.e. turn it into a pseudo-namespace)?
Can you give me an example scenario where I might *want* a pseudo-namespace? They mostly seem to be used as a kludge when real namespaces are too hard to make. Personally, I strongly dislike them because they look like namespaces without acting like them; I would rather not create them in some transparent way.
- when creating a namespace if there are existing pages with that prefix, offer the option of "importing" those pages into the new namespace?
Yes, that shouldn't be too hard in the new database schema since there is only one row for each page to update.
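A rough sketch of such an import, assuming a `page` table with `page_namespace` and `page_title` columns and main-namespace pages stored with the pseudo-namespace prefix in their title. It runs against SQLite for illustration; the function name is hypothetical, and title collisions in the target namespace are not handled.

```python
import sqlite3


def import_pseudo_namespace(conn, prefix, ns_id):
    """Move main-namespace pages titled 'Prefix:Rest' into namespace ns_id.

    Each affected page needs exactly one row update: the namespace number
    changes and the prefix is cut off the title. Returns the number of
    pages moved.
    """
    marker = prefix + ":"
    cursor = conn.execute(
        "UPDATE page "
        "SET page_namespace = ?, page_title = substr(page_title, ?) "
        "WHERE page_namespace = 0 AND substr(page_title, 1, ?) = ?",
        (ns_id, len(marker) + 1, len(marker), marker),
    )
    conn.commit()
    return cursor.rowcount
```

A real implementation would also need to check first that no page with the shortened title already exists in the new namespace.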
Best,
Erik
"Erik Moeller" erik_moeller@gmx.de wrote in message news:429C002F.3020207@gmx.de... [snip]
Whether we should create synonyms beyond the simple Image=>File redirect when we don't have special ways of dealing with these file types yet is arguable. The current behavior would be that [[Sound:soundfile.ogg]] would link to the file description page for that file, but not embed it (so would [[Image:soundfile.ogg]]). But it might be enough to have [[File:Soundfile.ogg]] for this purpose for now.
I suppose it would be too much to hope that we could extend the {{}} syntax to images, wherein {{Image:Some-nice_picture.jpg}} embedded the image and [[Image:Some-nice_picture.jpg]] simply linked to the description page?
On 01/06/05, Phil Boswell phil.boswell@gmail.com wrote:
I suppose it would be too much to hope that we could extend the {{}} syntax to images, wherein {{Image:Some-nice_picture.jpg}} embedded the image and [[Image:Some-nice_picture.jpg]] simply linked to the description page?
Well, it might well be too much to hope that all articles on Wikipedia, and all old versions of those articles, could be either converted to obey that rule or somehow activate a "backwards-compatibility" mode.
Besides, while it makes sense that displaying an image is an inclusion, this doesn't actually extend very well to, for instance, sounds - unless we go the route of embedding plugins, sounds will always be more like a fancy link than an "inclusion". (For an example of what I think such a "fancy link" might look like, see my mockup at http://meta.wikimedia.org/wiki/Multimedia#Software_features )
On Wednesday, June 01, 2005 4:01 PM, Rowan Collins rowan.collins@gmail.com wrote:
On 01/06/05, Phil Boswell phil.boswell@gmail.com wrote:
I suppose it would be too much to hope that we could extend the {{}} syntax to images, wherein {{Image:Some-nice_picture.jpg}} embedded the image and [[Image:Some-nice_picture.jpg]] simply linked to the description page?
Well, it might well be too much to hope that all articles on Wikipedia, and all old versions of those articles, could be either converted to obey that rule or somehow activate a "backwards-compatibility" mode.
Besides, while it makes sense that displaying an image is an inclusion, this doesn't actually extend very well to, for instance, sounds - unless we go the route of embedding plugins, sounds will always be more like a fancy link than an "inclusion". (For an example of what I think such a "fancy link" might look like, see my mockup at http://meta.wikimedia.org/wiki/Multimedia#Software_features )
Hmm. If you link to a sound-file with {{Image:}} (or {{Sound:}}) you'd want to transclude it, even if the software won't let you. [[Sound:]] would then be a link to the sound, and the overall manner of links would make more sense.
Yours,
On 01/06/05, James D. Forrester james@jdforrester.org wrote:
Hmm. If you link to a sound-file with {{Image:}} (or {{Sound:}}) you'd want to transclude it, even if the software won't let you. [[Sound:]] would then be a link to the sound, and the overall manner of links would make more sense.
Sorry, I don't follow what you're saying here - are you saying that "transcluding a sound" *is* the same as displaying a specially formatted set of links/player, or that it isn't? Like I say, I can see how "transcluding an image" could mean displaying it inline, but I'm not sure that "transcluding a sound" is really a meaningful concept. At the moment, you can transclude the *description page*, but I don't think anyone would really miss that ability.
Don't get me wrong, I can see the argument for having a different syntax for "display inline" than for "link to", I'm just not 100% convinced that this is logically the same as "transclude from".
On Wednesday, June 01, 2005, at 16:30, Rowan Collins rowan.collins@gmail.com wrote:
On 01/06/05, James D. Forrester james@jdforrester.org wrote:
Hmm. If you link to a sound-file with {{Image:}} (or {{Sound:}}) you'd want to transclude it, even if the software won't let you. [[Sound:]] would then be a link to the sound, and the overall manner of links would make more sense.
Sorry, I don't follow what you're saying here - are you saying that "transcluding a sound" *is* the same as displaying a specially formatted set of links/player, or that it isn't? Like I say, I can see how "transcluding an image" could mean displaying it inline, but I'm not sure that "transcluding a sound" is really a meaningful concept. At the moment, you can transclude the *description page*, but I don't think anyone would really miss that ability.
In short: I'm agreeing with you. :-)
We could do without {{Image:Foo}} importing the description content of media file Foo, and instead use it as a transclusion mechanism for the image. Then [[Image:Foo]] would be a link to the image, the syntax would suddenly make a lot more sense, and there would be happy children frolicking in the fields and all that.
I also agreed with you that having {{Sound:Foo}} would only make sense if you could in some way transclude the audio file Foo; for that to be useful one would need, as you say, some form of special format or inline player. Which would be fun (but obviously isn't a major feature request yet :-)).
Don't get me wrong, I can see the argument for having a different syntax for "display inline" than for "link to", I'm just not 100% convinced that this is logically the same as "transclude from".
Transclusion is the taking of the content of another page and displaying it, umm, inline. No? :-)
Yes, there is a bit of a mess between linking to the item and its description, and [[:Image:Foo]] handles this, but a plain link is much simpler.
Yours,
On Wed, 2005-06-01 at 12:15 +0100, Phil Boswell wrote:
"Erik Moeller" erik_moeller@gmx.de wrote in message news:429C002F.3020207@gmx.de... [snip]
Whether we should create synonyms beyond the simple Image=>File redirect when we don't have special ways of dealing with these file types yet is arguable. The current behavior would be that [[Sound:soundfile.ogg]] would link to the file description page for that file, but not embed it (so would [[Image:soundfile.ogg]]). But it might be enough to have [[File:Soundfile.ogg]] for this purpose for now.
I suppose it would be too much to hope that we could extend the {{}} syntax to images, wherein {{Image:Some-nice_picture.jpg}} embedded the image and [[Image:Some-nice_picture.jpg]] simply linked to the description page?
Actually, that's almost precisely what the syntax I'm working on does; embedded images are, after all, more like inclusions than links, so I use the same syntax for embedded images, inclusions, extensions like math, nowiki, comments, and variables (which are like mini-inclusions).
wikitech-l@lists.wikimedia.org