I *think* I have tracked down the source of several template-related
bugs to Parser.php, line 1763. It caused (some) included templates to
be replaced by the UNIQ_PREFIX constant.
I have changed that in CVS HEAD, and it seems to work correctly now;
however, as I have only recently started bughunting again, I'm not that
familiar with the internals, especially of things like the template
system, the development of which I sorta missed.
Can anyone more familiar please check if I created a mess there? If not,
it will likely fix the "table params as a template" bug, the "__NOTOC__
in template" bug, etc.
I understand that this may not be best suited to this mailing list, so
accept my apologies if that is the case; I have tried extensively to
find a forum to ask this question, however, and this is the most
suitable I could find.
I am a student developer, and I wish to be able to access Wiktionary
content from my app. For example, I am building a standalone app that
looks up definitions for words the user types in; for this I would
very much like to use Wiktionary as my source.
Is there a structured way of accessing the Wiktionary pages that would
allow me to easily add this functionality? And would this be seen as
an improper use of the Wiktionary content? I would certainly attribute
the definitions to Wiktionary if I do manage to build my app.
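For what it's worth, here's roughly what I imagine on my end, assuming
MediaWiki's index.php?action=raw interface also works on Wiktionary
(that's an assumption on my part, and 'example' is just a placeholder
word):

  <?php
  // Hypothetical sketch: fetch the raw wikitext of a Wiktionary entry.
  // Needs allow_url_fopen enabled; returns false if the fetch fails.
  function fetchDefinition( $word ) {
      $url = 'http://en.wiktionary.org/w/index.php?title='
           . urlencode( $word ) . '&action=raw';
      return file_get_contents( $url );
  }

  $text = fetchDefinition( 'example' );
  if ( $text !== false ) {
      echo $text; // raw wikitext; my app would still have to parse it
  }
  ?>

If there's a better-supported way to do this, I'd love to hear it.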
Thanks for any help offered; it's very gratefully received.
Patent nonsense of course refers to the US and Japanese laws which allow software patents. Fortunately the rest of the world isn't (yet) so daft. However, before we get too carried away with banning the world's standard formats, let's consider what banning them would involve:
*JPEG (the Forgent patent; Sony paid, and an unnamed company paid US$15 million). So most current camera photos would be banned. I'm not yet aware of a suitable alternative, though I hope there is one; I doubt there's one with wide enough browser support for our purpose. If you're not aware of this one, search - it'll probably be the top Google result.
*GIF, of course - royalties are still required in much of the world, though not in the US. I assume we'll ban something that's allowed in the US but encumbered elsewhere? I hope not, though - that would be foolish.
*Most video formats: JPEG, MPEG, MP4 (including DivX), Real Media, Microsoft's. I'm aware of some work in this area, but nothing which will meet the needs of our users for at least a year or three.
*Most fonts. Apple has patents on much of the core font hinting technology. We use fonts covered by these patents, and if you've seen the alternatives, you probably won't want to switch to them. See http://www.freetype.org/patents.html for a summary of the issues.
*Probably others - I didn't take more than a few seconds to come up with this list.
So, should we ban the most widely available and backwards-compatible still image, audio, video and font technologies from Wikipedia?
With respect, that would be shooting ourselves in the foot.
We do need to recognise that these are the formats people have, and recognise that, however much we may like Ogg, Windows 95 may not support it. Libraries, schools, offices and many other places prohibit software installation without permission, and we probably don't want to bar those users from our content, nor third-world or poor users who may have older equipment.
We do, of course, want to encourage the use of formats which are less restricted. While doing that we should remember that it's common for patent holders to wait a decade and then spring a trap, even when they have participated in standards meetings where participants were asked to disclose their patents in advance. We do need to remember that the format we think is patent free may not be.
Rather than doing this piecemeal, let's try to find a policy we and our users can live with today - one which encourages people to use new formats without blocking those who don't yet have them. ICANN did something similar yesterday: it announced that IPv6 support in the core DNS servers had started, and said that IPv4 would be supported for about 20 years to allow a smooth transition.
So, I propose for all encumbered formats:
*We accept and use the de facto standards, including JPEG, MP3, MPEG and the Apple-patented type technology.
*Where there is a less encumbered alternative, we encourage people to also make that available and to list the less encumbered format(s) first, to increase awareness of that format.
*Browsers send information on the formats they support. Where we have multiple formats available, we should serve the least encumbered format the client says it can support (see the sketch after this list). This will involve changes to the software, which can, I hope, be done fairly slowly as we develop standards for specifying alternative forms of all multimedia works.
*Where there is a clear advantage in content or in technical terms, we accept the encumbered format until it's clear that the vast majority (at least 99%) of users support the new format seamlessly, just as we set our target for browser support at about that level.
*Where content requires it, we support a format forever. One example is legal cases involving specific media types, where an accurate article will require the use of the exact media type used as evidence or alleged to be infringing. There's simply no substitute for the exact file in this sort of situation.
*When we do phase out a format, we provide a link to the old file and an archive of all such files with information about where they were used, so we don't completely lock out those who are not able to support the new format.
*When we plan to phase out a format, we give at least two years' notice and do not do so until we have alternatives available for all affected content.
*Where we are aware of external links to a specific format, we continue to make that format available for as long as the link exists.
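To make the third point concrete, here's a very rough sketch of the kind of server-side selection I mean, in PHP since that's what the wiki runs on. The function name and the format list are made up for illustration; a real version would need proper Accept-header parsing with q-values:

  // Hypothetical sketch: pick the least encumbered format the
  // client claims to accept, falling back to the de facto standard.
  // $alternatives is ordered from least to most encumbered.
  function pickLeastEncumbered( $alternatives ) {
      $accept = isset( $_SERVER['HTTP_ACCEPT'] )
              ? $_SERVER['HTTP_ACCEPT'] : '';
      foreach ( $alternatives as $mime => $url ) {
          if ( strpos( $accept, $mime ) !== false ) {
              return $url; // crude substring match, no q-values
          }
      }
      return end( $alternatives ); // most widely supported fallback
  }

  $url = pickLeastEncumbered( array(
      'application/ogg' => '/media/clip.ogg', // less encumbered
      'audio/mpeg'      => '/media/clip.mp3', // de facto standard
  ) );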
This approach avoids failing to serve existing clients, promotes the new formats, and provides clear transition times and goals so that reusers can adapt at a predictable pace.
I was tempted to suggest the copyleft approach: use all of the formats the US and Japanese laws encumber, and refuse to serve them in the US and Japan, to encourage people to get those nasty laws changed. I took pity on the people in those places, though. :)
We're making progress. "We" is four people who each solved a bit of the puzzle.
It all started with a server reorg, after which the script had disappeared.
It is back, with proper privileges, synched to the Apaches.
Still the script did not run.
Finally, today we found out that the script stumbles over the new
folder name 'wikipedia.org' in the executable path: it thinks all 60
or so chars after the dot are a file extension. Oops.
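For the curious, it's the classic failure mode: take everything after
the *first* dot in the full path as the extension. A sketch of the
wrong and the fixed logic, in PHP for illustration (the path is made
up, and the actual script may well be in another language):

  $path = '/home/wikipedia.org/scripts/counts.pl'; // hypothetical path

  // Buggy: the first dot it finds is the one in 'wikipedia.org',
  // so everything after it becomes the 'extension'.
  $ext = substr( $path, strpos( $path, '.' ) + 1 ); // 'org/scripts/counts.pl'

  // Fixed: look only at the filename, and at its last dot.
  $file = basename( $path );  // 'counts.pl'
  $dot  = strrpos( $file, '.' );
  $ext  = ( $dot === false ) ? '' : substr( $file, $dot + 1 ); // 'pl'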
I'll patch this tomorrow. After that it will still only run properly
on ISO sites, but I'll add UTF-8 support soon (for the built-in
European extended ASCII character sets).
Later I will add support for the full Unicode character set, using
external fonts, but that will take one or two months, as I'm doing
some other stuff.
If you look at
you will see there's an article with the title:
'''Why SPECT?''' Similar to X-ray Computed Tomography
(CT) or Magnetic Resonance Imaging (MRI), Single Photon Emission
Computed Tomography (SPECT) allows us to visualize functional
information about a patient's specific organ or body system. '''
However, if you click on it, it will say the article doesn't exist.
The reason for this is that in the DB, the article title ends with a
'_'. These article titles are unreachable via the wiki.
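As far as I can tell, that's because the title normalization strips
leading and trailing underscores (which stand in for spaces) before
doing the lookup, so no request can ever match the stored row again.
A simplified sketch of what I believe happens (not the actual
Title.php code):

  // Roughly what title normalization does on every request:
  $title = str_replace( ' ', '_', $title ); // spaces become underscores
  $title = trim( $title, '_' );             // strip leading/trailing ones
  // A request for 'Foo_' is thus looked up as 'Foo', so a row whose
  // cur_title really ends in '_' can never be reached.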
Please could someone delete this entry from the DB directly?
DELETE FROM cur WHERE cur_id=291499
At this moment the bug tracker lists 214 open bugs out of 1098 total
reports; that's an average of about one bug report closed per day since
we started using it in mid-2002.
A lot of the remaining bug reports are duplicates, or refer to problems
that have since been fixed. A handful are really feature requests. A
bunch more are relatively minor problems that just haven't been gotten to...
I'd like to see us cut the open bug reports *in half* to 107 over the
next week. It may sound like a lot (three months' worth of closings!)
but I'm confident we can do it; in just the last day JeLuF and I have
cleared out about 15 bug reports. If we can keep up that rate, we'll
make the goal easily.
To encourage people to pay more attention to the nasty bugses, I've set
up a quickie bot that reports updates to the bug tracker into the
#mediawiki IRC channel.
-- brion vibber (brion @ pobox.com)
Magnus Manske wrote:
> + $isRedirect = false ;
> + if ( trim ( substr ( $s , 0 , 10 ) ) == '#REDIRECT' ) $isRedirect = true ;
This needs to use the keywords so that translations of the "REDIRECT"
keyword work as well. Also, the REDIRECT thingie is not supposed to be
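Something along these lines should handle the localized synonyms, if I
remember the MagicWord interface correctly (worth double-checking
against MagicWord.php):

  // Match '#REDIRECT' and its translations at the start of the text.
  $mwRedir =& MagicWord::get( MAG_REDIRECT );
  $isRedirect = $mwRedir->matchStart( $s );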
As some of you know, a company will release a snapshot of the German
Wikipedia on CD. They will use a dump from September 1, less than one
month from now.
We (de.wikipedia) would very much like to get the validation feature up
and running ASAP, even though it is only in the early stages (but
functional), so we can harvest WikiPower (tm) to sort out the junk, er,
the "articles" that should rather not go on CD.
Now for the scary part: Can we soon go live with what is CVS HEAD, or is
there something utterly unstable in it?
Also, any help with the validation feature will be greatly appreciated.
The current state of affairs can be seen at http://test.wikipedia.org
Shaihulud and I investigated the recent load-balancing weirdness a bit
and came across some slow ping times to machines on the secondary
switch. Two averages of 250 pings from suda, run at the same time:
maurus: rtt min/avg/max/mdev = 0.138/2.063/25.486/4.660 ms
coronelli: rtt min/avg/max/mdev = 0.125/0.328/2.023/0.213 ms
More samples between various hosts also showed that the really slow ping
times only occur between machines on the first switch and the newer ones
on the secondary one (all hanging off a single port on the primary
switch).
The load balancing is based on ICP response times, so network
congestion can temporarily override it. I'm not sure if the GigE switch
has already been ordered; if not, I'd recommend ordering it as soon as
possible.