Just a short note for developers: I've delayed the initialisation of
$wgIP, so you can no longer use that variable, except for debugging.
Instead, use wfGetIP(). It's cached, so feel free to call it regularly.
See the commit notice for more details:
http://mail.wikipedia.org/pipermail/mediawiki-cvs/2005-September/011072.html
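For anyone converting existing code, the change looks roughly like this
(a minimal sketch -- the logging function below is made up for
illustration; only wfGetIP() and wfDebug() are real MediaWiki calls):

  // Old:  global $wgIP;  $ip = $wgIP;
  // New:
  function logRequestIP() {
      // wfGetIP() caches its result, so repeated calls are cheap
      $ip = wfGetIP();
      wfDebug( "Request came from $ip\n" );
  }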
-- Tim Starling
A while ago, Gerard posted this on Meta:
http://meta.wikimedia.org/wiki/Using_Ultimate_Wiktionary_for_Commons
It was a short explanation of how UW could be used to internationalize
categories on the Wikimedia Commons. I've now hacked together a small
mock-up that demonstrates (hopefully) more clearly how this could work
in practice:
http://epov.org/uwd/index.php?title=Tag:Dog&action=edit
(Further demos will be posted on http://epov.org/uwd/ in the coming
weeks and months.)
It should work in Firefox and IE. The only active components are the
radio buttons you can click.
Essentially, what this shows is:
1) A new tag for images of dogs is created. (In this demo, I call
categories "tags", because I hope this will be what they are eventually
called.)
2) The user can choose from the languages they speak to clarify which
language this tag name is written in.
3) Based on the tag name and language, a lookup on UW is performed,
which fetches all the associated meanings for this word.
4) The user selects one of these meanings.
5) Automagically, another lookup is performed to determine the available
translations, if any. After saving the tag, it is then instantly
available under these names in the other languages.
In the demo, the first two meanings have translations available, while
the other two do not.
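To make steps 3-5 concrete, here is a rough sketch of the lookup logic
in PHP. Everything in it is hypothetical -- UW has no public interface
yet, so uw_lookup() and uw_translations() are placeholders for whatever
it eventually exposes:

  $tagName  = 'Dog';   // step 1: the tag being created
  $language = 'en';    // step 2: language chosen by the user

  // Step 3: fetch all meanings UW knows for this word in this language.
  $meanings = uw_lookup( $tagName, $language );

  // Step 4: the user picks one meaning, e.g. "domesticated canine".
  $meaningId = $meanings[0]['id'];

  // Step 5: fetch the translations for that meaning, if any,
  // e.g. array( 'it' => 'cane', 'mi' => 'kurii', 'de' => 'Hund' )
  $translations = uw_translations( $meaningId );

  // Saving the tag stores the meaning id; the visible tag name is then
  // resolved per language from the translations.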
Why is this so powerful? Because, if UW itself is successful and
contains many words, it almost instantly makes the entire media
repository on Commons available to speakers of all languages. (Now,
hopefully, you can see why we've been excited about getting millions of
translations for free from the Logos project.) No need to create many
different tags - just select the right meaning. Furthermore, it builds
bridges from other projects to UW. The language work we are constantly
doing will no longer be redundant, but focused on one place.
A 14-year-old Italian kid can then use the tag "cane" to look for photos
of dogs, while a Maori girl from New Zealand can use "kurii". Moreover,
the same category hierarchy can be used to browse in different languages
(based on user preferences, a fallback hierarchy would be queried to
determine which language should be used when no translation is
available).
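In code, that fallback could look something like this (again purely
illustrative; uw_translations() is the same placeholder as above, and
the preference format is invented):

  // Return the first available label for a meaning, trying the user's
  // preferred languages in order, then English, then the raw tag name.
  function uwResolveTagLabel( $meaningId, $userLanguages, $defaultName ) {
      $labels = uw_translations( $meaningId );
      foreach ( array_merge( $userLanguages, array( 'en' ) ) as $code ) {
          if ( isset( $labels[$code] ) ) {
              return $labels[$code];
          }
      }
      return $defaultName;   // no translation available
  }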
We could also automatically make use of synonyms, plurals and
inflections (though this requires further changes to the category code
beyond internationalization). Given that we are mapping one of multiple
meanings to a single tag, there will be tag collisions -- those will
have to be dealt with through disambiguation. But this is not important:
Try to see the tag name merely as a key to a meaning. What this key is
called is secondary.
The key principle of selecting a meaning and then performing automatic
translations can be used in many different contexts. For example, in
Wikidata, one could use the same principle to internationalize field
names such as "Country", "Flag" and "Population".
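In code terms, a field definition would point at a meaning rather than
at an English string, and the label would be resolved in the viewer's
language with the same fallback as above (hypothetical, as before):

  // A Wikidata field keyed by a UW meaning id instead of a name.
  $field = array(
      'meaning' => 12345,      // hypothetical UW meaning id for "population"
      'type'    => 'integer',
  );
  $label = uwResolveTagLabel( $field['meaning'], $userLanguages,
                              'Population' );   // $userLanguages as above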
This application also shows that UW must contain everything from words
to names to phrases. There is no limit to the scope of it. This makes it
a potentially massively useful tool for both human and machine translation.
The category internationalization functionality will not be part of the
first release of Ultimate Wiktionary, but we believe we can get funding
to work on this later. I believe that UW, in combination with better
tagging features in general, could make our tagging system the most
advanced one available. Flickr, for example, has no localization, is
unlikely to ever get semi-automatic localization, and apparently
supports no synonyms either.
See the demo footnotes for further explanations. Feedback is welcome.
(I'll be away until Wednesday.)
Best,
Erik
User Assarbe (Ain_xaitan(a)yahoo.es) says:
Two weeks ago I left a proposal for a new Wikipedia in Murcian (a language spoken by more than 300,000 people in the south-east of Spain, between Castilian and Catalan). I have already got the support of more than six people, and I am sure more will be interested.
Now I would like to know what the next step is. Is there anything more I can do at the moment?
Please, help me :)
Thanks
Paweł Dembowski wrote:
>>by the way, the frame issue was discussed several times on IRC.
>>It seems most browsers do a total redirect. Only a couple of editors
>>reported the framing issue.
>>I do not know if *we* can do something about this.
>>Ant
>
>I get a frame both in IE and in Firefox.
Could this be because the frame-breaking code in
http://fr.wikipedia.org/skins-1.5/common/wikibits.js is executed in a
<head> context, rather than a <body> context?
http://www.thesitewizard.com/archive/framebreak.shtml specifically
states that the frame-breaker code must be executed in the <body>, not
the <head>.
-- Neil
I did some benchmarking of PHP compiled with gcc 4.0.1. Results and
methodology are described at:
http://wp.wikidev.net/GCC_benchmarking
I eventually settled on -O3 with profile-guided optimisation. This gave
a 5-10% improvement over the old PHP 4.3.11 build on all the benchmarks
except preg_replace. The preg_replace problem is probably because the
benchmark was atypical compared to the profiling data, which was
gathered using refreshLinks.php.
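For reference, the general shape of a profile-guided build is the usual
GCC two-pass dance (this is just the outline, not the exact commands or
configure options used; those are described on the page above):

  # 1. build PHP with profiling instrumentation
  CFLAGS="-O3 -fprofile-generate" ./configure ... && make
  # 2. run a representative workload to collect profile data
  php maintenance/refreshLinks.php
  # 3. rebuild using the collected profiles
  make clean
  CFLAGS="-O3 -fprofile-use" ./configure ... && make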
-- Tim Starling
> I'm afraid you'll have to work off the code (which isn't always 'right')
Never underestimate the convolutedness of parser.php. If I were you, I'd
start simple (like grabbing all wikilinks off a page) and experiment.
Handling wikitext is like black magic at times. I've written some
elementary classes for parsing pages, if you're interested.
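For instance, a first cut at grabbing all wikilinks off a page can be a
single regular expression (a naive sketch -- it ignores nesting, nowiki
sections and all the other edge cases the real parser worries about,
which is exactly the black-magic part):

  // Naive wikilink extraction -- a starting point, not a parser.
  function getWikiLinks( $text ) {
      $links = array();
      if ( preg_match_all( '/\[\[([^|\]]+)(?:\|[^\]]*)?\]\]/',
                           $text, $matches ) ) {
          foreach ( $matches[1] as $target ) {
              $links[] = trim( $target );
          }
      }
      return $links;
  }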
--
Edward Z. Yang Personal: edwardzyang(a)thewritingpot.com
SN:Ambush Commander Website: http://www.thewritingpot.com/
GPGKey:0x869C48DA http://www.thewritingpot.com/gpgpubkey.asc
3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA
There have been some questions recently about public backup dumps (or the
lack thereof). I've been working for the last few days on getting the
dump generation infrastructure up and running in a more consistent,
effective fashion.
Here's what's currently on my plate and the status thereof:
* Title corrections: some of the databases contain invalid page titles
left over from old bugs. This can sometimes break the import or export
process, so I'm writing a fixup script to find and rename them.
STATUS: Finding done, fixing to come.
Should be done with this later today.
* Dump filtering/processing: currently the dump has to run twice to
produce the current-only and all-revisions dump files. I'm working on a
postprocessing tool which will be able to split these two from a single
runthrough, as well as produce a filtered dump with the talk and user
pages removed.
Producing the split versions from one run should also mean that the dump
can run without having replication stopped the whole time.
It can also produce SQL for importing a dump directly into a database in
either 1.4 or 1.5 schema, for those using software based on the old
database layout. (We probably won't be hosting such files on our server
but you can run the program locally to filter XML-to-MySQL.)
STATUS: Mostly done. Some more testing and actually hooking up multiple
simultaneous outputs remains.
Should be done tonight or tomorrow.
* Progress and error reporting: The old backup script was a hacky shell
script with no error detection or recovery, requiring replication to be
stopped manually on a database server and the wiki cluster to be
reconfigured for the duration. If something went awry, maybe nobody
noticed... the hackiness of this is a large part of why we've never just
let it run automatically on a cronjob.
I want to rework this for better automation and to provide useful
indications of what it's doing, where it's up to, and if something went
wrong.
STATUS: Not yet started. Hope to have this done tomorrow or Friday.
* Clean up download.wikimedia.org further, make use of status files left
by the updated backup runner script.
STATUS: Not yet started. (Doesn't have to be up before the backup starts.)
-- brion vibber (brion @ pobox.com)
Howdy,
I'd like to add a spell checker to MediaWiki using the pspell library.
(Pspell is part of PHP and it uses libaspell.) It doesn't help that I'm new to the
MediaWiki code base and PHP isn't exactly my favorite language. (I wouldn't
even call it my _third_ favorite.) Anyway, I'd like to get a little feedback
and advice on where to go from here.
I know a few people have proposed working on spell check before:
http://mail.wikimedia.org/pipermail/wikitech-l/2004-March/021358.html
But, as best I can tell, none of them went anywhere. Does anyone know
what happened to User:Archivist's spellchecker?
Right now I have a proof-of-concept running on my computer. You can see
it at
http://66.205.125.240/spell/index.php/Special:Spellcheck/Main_Page
It is a SpecialPage that reads the article from the database, spell
checks it, lets the user choose the words from the drop-down box, and
then sends a FauxRequest to EditPage. Eventually I'd like to add it to
EditPage, but I started out with a special page so that I did not have
to deal with the complexity of EditPage.
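The pspell part itself is small; the core looks roughly like this (a
simplified sketch rather than the actual special page code -- only
pspell_new(), pspell_check() and pspell_suggest() are real library
calls, and the tokenisation is deliberately crude and ASCII-only, which
ties into the multi-byte question below):

  // Flag misspelled words and collect suggestions for each of them.
  function spellCheckWords( $wikitext, $langCode = 'en' ) {
      $dict    = pspell_new( $langCode );
      $results = array();
      // Crude tokenisation; the real thing has to skip wiki markup.
      $words = preg_split( '/[^A-Za-z\']+/', $wikitext, -1,
                           PREG_SPLIT_NO_EMPTY );
      foreach ( $words as $word ) {
          if ( !pspell_check( $dict, $word ) ) {
              $results[$word] = pspell_suggest( $dict, $word );
          }
      }
      return $results;   // word => array of suggested replacements
  }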
Here's how I'd like the final version to work:
# There's a button at the bottom of an EditPage beside 'Show Preview'
and 'Show Changes' labeled 'Spell Check'.
# When the user clicks 'Spell Check', they get a preview of their edit where
misspelled words are replaced with drop-down boxes.
# The user changes the words they think are misspelled to one of the
suggestions or leaves them as they are. When they click 'Show Preview',
they go back to the preview page.
A few questions:
Do I not need to deal with multi-byte character functions like
mb_substr, since all the languages use UTF-8?
Should the user spell check a preview or the wikitext?
If a word is misspelled in several places, should the user be asked once
for the word, or should the user be asked every time the word appears?
Thanks,
Jeff McGee
Ævar Arnfjörð Bjarmason wrote:
> Modified Files:
> Image.php
> Log Message:
> * Reverting back to 1.115, not inserting {{ and }} automatically means that the
> license selector can be used to insert arbitrary text, not just templates,
> this doesn't break it either since you just have to change the entries in
> MediaWiki:Licenses from e.g:
> * GFDL|GNU Free Documentation License
> to:
> * {{GFDL}}|GNU Free Documentation License
> to get the same functionality as before
No, you can't, because brace replacement happens at wfMsg() time, which
is before the text is picked apart and added to a list.
-- brion vibber (brion @ pobox.com)
Hi, excuse my ignorance... But after searching for information on this for
the past 2 hours, I can't seem to find any FAQ or simple instruction set
on what to do with the new XML dumps provided at
http://download.wikimedia.org/
The sole link provided on the page mentions nothing of the new XML format,
or what to do with it. Searching through this mailing list hasn't shed much
more light. Any help would be very much appreciated (even a pointer to a
page that explains how to import these files into a 1.3.x or 1.4.x
system...)!
Thanks,
John