[Foundation-l] Following the conventions: separating Wikisource

Andrew Gray shimgray at gmail.com
Wed Jun 6 12:27:13 UTC 2007


On 06/06/07, GerardM <gerard.meijssen at gmail.com> wrote:
> Hoi,
> When you look at the details for the HTML it will tell you that the language
> is English. It is obviously not. Technically all content in
> Wikisource.org that is not English should be marked for the language
> that it is.
>
> Having content marked English while it is in actual fact not English means
> that the meta-data of the page is wrong. Having multiple languages within
> the same MediaWiki database is technically a disaster. It is not marked in
> any way what language it is. This is in and of itself bad.

Well, meta seems to manage well enough :-)

Seriously, though, there are projects where a hubbub of
multilinguality is pretty much inevitable - Commons being the obvious
example, even if we just write meta off as internal craziness. Would
it not be simplest to contrive some way of allowing the page content
to dictate the metadata published by mediawiki, rather than declaring
we just can't do it, period? A much more robust long-term solution.

After all, even if we import the entire known corpus, I can't see
ecr.wikisource.org ever consisting of more than a few kilobytes of
text...

-- 
- Andrew Gray
  andrew.gray at dunelm.org.uk

[ecr, for those wondering, is Eteocretan; the en.wp article says we
have fewer than ten identified extant fragments...]


