Then we shall never agree. I believe it's pretty much
accepted that the approach of "use the environment
variable if it's available, guess where we think it is
when it's not" is a good one. Of course, the env
variable is best: that's why it was added. What I fail to
understand is why trying to guess the path when the
variable isn't around is such a bad idea.
In my mind, the logic should be: 1) try the variable, 2)
try to guess the path if #1 isn't set, and 3) fail.
You seem to want to take out step 2, which makes
zero sense to me. Most installs never touch the default
directory structure, and should be able to fall back to
that just fine.
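Roughly, something like this (just a sketch; the file check and the die()
message are only one way to do step 3):

$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) {
	// Step 2: guess. Most installs keep the default layout, so the script's
	// position in the tree tells us where the install root should be.
	$IP = dirname( __FILE__ ) . '/../..';
}
if ( !file_exists( "$IP/maintenance/commandLine.inc" ) ) {
	// Step 3: neither the variable nor the guess worked; fail loudly.
	die( "Cannot find MediaWiki; please set MW_INSTALL_PATH.\n" );
}
require_once( "$IP/maintenance/commandLine.inc" );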
-Chad
On Aug 24, 2009 1:30 PM, "dan nessett" <dnessett(a)yahoo.com> wrote:
--- On Mon, 8/24/09, Chad <innocentkiller(a)gmail.com> wrote: > Why skip
trying to find the location?...
That's a reasonable question, stating in another way the useful maxim, "if
it ain't broke, don't fix it." The problem is I think it's "broke".
Here is my take on the pros/cons of leaving things unchanged:
Pros:
* Some administrators are used to simply typing the line php <utility>.php.
Making them type:
MW_INSTALL_PATH=/var/wiki/mediawiki php <utility>.php
would be inconvenient.
In answer to this, for MW installations running on Unix, it is pretty
simple to define a shell alias for "MW_INSTALL_PATH=/var/wiki/mediawiki php" and put the
definition into .bash_profile (or the appropriate shell initialization
script). This is a one-time effort, so the change isn't as onerous as it
might seem. I assume there is a similar tactic available for Windows
systems.
Cons:
* The use of file-position-dependent code is a problem during development
and much less of a problem during installation and production (as you
suggest). Right now there are ~400 sub-directories in the extensions
directory. It seems to me a reorganization of the extensions directory would
help clarify the relationship between individual extensions and the
core. For example, having two subdirectories, one for CLI utilities and
another for hook-based extensions, would clarify the role each extension
plays. However, currently there are 29 extensions where $IP is set using the
relative position of the file in the MW directory structure (a couple of
other extensions set $IP based on MW_INSTALL_PATH); see the sketch after
this list. Reorganizing the directory structure has the potential to break them.
* CLI utilities are moved around for reasons other than a reorganization of
the extensions directory. For example, as I understand it, DumpHTML was
moved from maintenance/ to extensions/. dumpHTML.php sets $IP based on its
relative position in the distribution tree. It was a happy coincidence that
when it was moved, its relative position didn't change. However, it is
unreasonable to think such reclassifications will always be as fortunate.
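To illustrate the first con, here is the pattern those extensions typically
use (SomeExtension is hypothetical; only the path depth matters):

// extensions/SomeExtension/SomeExtension.php: two levels below the MW root
$IP = dirname( __FILE__ ) . '/../..';

// If the file were moved, e.g. to extensions/cli/SomeExtension/SomeExtension.php,
// the hard-coded guess would point at extensions/ instead of the MW root
// unless it were changed to:
$IP = dirname( __FILE__ ) . '/../../..';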
Since the cons outweigh the pros, I remain convinced that the change I
suggested (using die()) improves the code.
Dan
Wikimania’s classic “Hacking Days” event is back, and better than ever
as the Wikimania Codeathon will be open throughout the entire conference
this year:
http://wikimania2009.wikimedia.org/wiki/Wikimania_Codeathon
Based on the success of April’s Developer Meet-up in Berlin, we’re
starting with an “unconference”-style planning session to let attendees
break out into common working groups.
The fun begins at 10am Tuesday, August 25 (note — this is the day before
the main conference begins). The coding room will remain open throughout
the rest of the conference, so folks can pop in and out between other
sessions.
We’ll be in Room F at the Centro Cultural, which should be nicely
spacious. Non-developers are welcome during the conference if you just
need a quiet place to sit and check on your sockpuppet accounts. ;)
There’ll be a wrap-up presentation in Room 3 at 14:00 Friday, August 28
to give folks a chance to do mini-talks on what they’ve been working on.
If you’re planning to attend, either in person or virtually via IRC,
please add yourself to the list on the wiki to help coordinate:
http://wikimania2009.wikimedia.org/wiki/Wikimania_Codeathon#I.27m_interested
-- brion vibber (brion @ wikimedia.org)
CTO, Wikimedia Foundation
San Francisco
Back in May 2004, Gabriel Wicke was creating a neat new skin called
Monobook. Unlike the old skins, it used good semantic markup with CSS
2 for style. Gabriel made sure to test in a lot of browsers and made
up files full of extensive fixes for browsers that had problems.
One such browser was the default KDE browser, Konqueror. Even
relatively web-savvy people have barely heard of it, let alone used it.
He was nice and checked whether it worked properly in his new skin
anyway. Since it didn't, he committed a quick fix in r3532 to
eliminate some horizontal scrollbars. Then everyone forgot about it,
because nobody uses KHTML.
It turns out there was a slight problem with his fix. He loaded it
based on this code:
var is_khtml = (navigator.vendor == 'KDE' || ( document.childNodes &&
!document.all && !navigator.taintEnabled ));
The problem here is pretty straightforward. A bug fix is being
loaded, without checking to see whether the bug exists. The fix is
loaded for all versions of KHTML past, present, and *future*. If the
KHTML devs fixed the bug, then they'd have a bogus stylesheet being
loaded that would mess up their display, and they couldn't do anything
about it.
Well, nobody much used or uses KHTML. But it just so happens that in
2003, Apple debuted a new web browser based on a fork of KHTML. And
in 2008, Google debuted another browser based on the same rendering
engine. And if you add them together, they now have 6% market share
or more. And we've still been serving them this broken KHTML fixes
file for something that was fixed eons ago.
Just recently, in WebKit r47255, they changed their code to better
match other browsers' handling of "almost standards mode". They
removed some quirk that was allowing them to render correctly despite
the bogus CSS we were serving them. And so suddenly they're faced
with the prospect of having to use a site-specific hack ("if path ends
in /KHTMLFixes.css, ignore the file") because we screwed up. See
their bug here: <https://bugs.webkit.org/show_bug.cgi?id=28350>
I had already killed KHTMLFixes.css in r53141, but it's still in every
MediaWiki release since 1.5. And this isn't the only time this has
happened. A while back someone committed some fixes for Opera RTL.
They loaded the fixes for, yes, Opera version 9 or greater, or some
similar check. When I checked on Opera 9.6, I found that the fix was
degrading display, not improving it.
Sometimes we need to do browser sniffing of some kind, because
sometimes browsers don't implement standards properly. There are two
ways to do it that are okay:
1) Capability testing. If possible, just check directly whether the
browser can do it. This works best with JS functionality, for
instance in getElementsByClassName in wikibits.js:
if ( typeof( oElm.getElementsByClassName ) == "function" ) {
/* Use a native implementation where possible FF3, Saf3.2, Opera 9.5 */
It can also be used in other cases sometimes. For instance, in r53347
I made this change:
- // TODO: better css2 incompatibility detection here
- if(is_opera || is_khtml || navigator.userAgent.toLowerCase().indexOf('firefox/1')!=-1){
- return 30; // opera&konqueror & old firefox don't understand overflow-x, estimate scrollbar width
+ // For browsers that don't understand overflow-x, estimate scrollbar width
+ if(typeof document.body.style.overflowX != "string"){
+ return 30;
Instead of using a hardcoded list of browsers that didn't support
overflow-x, I checked whether the overflowX property existed. This
isn't totally foolproof, but it sure beats assuming that no future
version of Opera or KHTML will support overflow-x. (I'm pretty sure
both already do, in fact.)
2) "Version <= X." If it's not reasonable to check capabilities, then
at least allow browser implementers to fix their bugs in future
versions. If you find that all current versions of Firefox do
something or other incorrectly, then don't serve incorrect content to
all versions of Firefox. In that case, during Firefox 3.6
development, they'll find out that their improvements to standards
compliance cause Wikipedia to break! Instead, serve incorrect content
to Firefox 3.5 or less, and standard markup to all greater versions.
That way, during 3.6 development, they'll find out that their
*failure* to comply with standards causes Wikipedia to break. With
any luck, that will encourage them to fix the problem instead of
punishing them. It's not as good as being able to automatically serve
the right content if they haven't fixed things, but it's better than
serving bad content forever.
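For example (a server-side sketch only; the cutoff version and the stylesheet
name are hypothetical, and this is not how wikibits.js actually does it):

$ua = isset( $_SERVER['HTTP_USER_AGENT'] ) ? $_SERVER['HTTP_USER_AGENT'] : '';
// Serve the workaround only to versions known to have the bug; later
// versions (which may have fixed it) get standard markup by default.
if ( preg_match( '!Firefox/(\d+\.\d+)!', $ua, $m ) && version_compare( $m[1], '3.6', '<' ) ) {
	// load the hypothetical firefoxFixes.css here
}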
I tried to remove some browser-sniffing from wikibits.js, but there's
undoubtedly some I missed. Especially with the large amounts of JS
being added recently for usability/new upload/etc., could everyone
*please* check to make sure that there are no broken browser checks
being committed? This kind of thing hurts our users in the long term
(especially third parties who don't upgrade so often), and is really
unfair to browser developers who are trying to improve their standards
compliance. Thanks!
I am looking into the feasibility of writing a comprehensive parser regression test (CPRT). Before writing code, I thought I would try to get some idea of how well such a tool would perform and what gotchas might pop up. An easy first step is to run dumpHTML and capture some data and statistics.
I tried to run the version of dumpHTML in r54724, but it failed. So, I went back to 1.14 and ran that version against a small personal wiki database I have. I did this to get an idea of what structures dumpHTML produces and also to get some performance data with which to project runtime/resource usage.
I ran dumpHTML twice using the same MW version and same database. I then diff'd the two directories produced. One would expect no differences, but that expectation is wrong. I got a bunch of diffs of the following form (I have put a newline between the two file names to shorten the line length):
diff -r HTML_Dump/articles/d/n/e/User~Dnessett_Bref_Examples_Example1_Chapter_1_4083.html
HTML_Dump2/articles/d/n/e/User~Dnessett_Bref_Examples_Example1_Chapter_1_4083.html
77,78c77,78
< Post-expand include size: 16145/2097152 bytes
< Template argument size: 12139/2097152 bytes
---
> Post-expand include size: 16235/2097152 bytes
> Template argument size: 12151/2097152 bytes
I looked at one of the HTML files to see where these differences appear. They occur in an HTML comment:
<!--
NewPP limit report
Preprocessor node count: 1891/1000000
Post-expand include size: 16145/2097152 bytes
Template argument size: 12139/2097152 bytes
Expensive parser function count: 0/100
-->
Does anyone have an idea of what this is for? Is there any way to configure MW so it isn't produced?
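(If it can't be switched off, one workaround for a regression test would be to strip the report before diffing. A rough sketch, assuming the report always appears as a single HTML comment starting with "NewPP limit report":)

function stripLimitReport( $html ) {
	// Remove the NewPP limit report comment so it doesn't show up as a
	// spurious difference between two otherwise identical dumps.
	return preg_replace( '/<!--\s*NewPP limit report.*?-->/s', '', $html );
}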
I will post some performance data later.
Dan
When I try and download the Wikipedia Mobile app on my 1st gen iPod
touch it won't let me - saying it requires a microphone... Unless it's
a voice command only UI, this is probably not intended ;)
- Trevor
Sent from my iPod
I've been using epeg on a NAS device as it can make thumbnails
rather quickly (faster than convert). Though its result might not look
as sharp as the result from ImageMagick, I think it can be used as an
intermediate format when converting large JPEGs.
If you want to test out epeg, we are keeping a debian package at
http://update.excito.org/pool/main/e/epeg/
Following is some test output from my desktop computer (images are attached;
the original can be found on Commons):
[0:0][azatoth@azabox img]$ time epeg Boucher\,\ Francois\ -\
Landscape\ Near\ Beauvais.jpg epeg_direct.jpg -w 800 -w 600 -m 800 -q
100
real 0m3.957s
user 0m3.460s
sys 0m0.076s
[0:0][azatoth@azabox img]$ time epeg Boucher\,\ Francois\ -\
Landscape\ Near\ Beauvais.jpg epeg_tmp.jpg -w 1600 -w 1200 -m 1600 -q
100
real 0m5.896s
user 0m5.120s
sys 0m0.088s
[0:0][azatoth@azabox img]$ time convert epeg_tmp.jpg -quality 80
-thumbnail 800x -depth 8 epeg_convert.jpg
real 0m0.942s
user 0m0.696s
sys 0m0.032s
[0:0][azatoth@azabox img]$ time convert Boucher\,\ Francois\ -\
Landscape\ Near\ Beauvais.jpg -quality 80 -thumbnail 800x -depth 8
convert_direct.jpg
real 0m12.007s
user 0m9.929s
sys 0m0.800s
[0:0][azatoth@azabox img]$ ll Boucher\,\ Francois\ -\ Landscape\ Near\
Beauvais.jpg
-rw-r--r-- 1 azatoth azatoth 104554102 8 aug 04.16 Boucher, Francois
- Landscape Near Beauvais.jpg
[0:0][azatoth@azabox img]$ identify Boucher\,\ Francois\ -\ Landscape\
Near\ Beauvais.jpg
Boucher, Francois - Landscape Near Beauvais.jpg JPEG 12384x10064
12384x10064+0+0 8-bit DirectClass 99.71mb
[0:0][azatoth@azabox img]$
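Putting the two passes together, a hypothetical PHP wrapper might look like
this (the commands just mirror the test runs above; file names and sizes are
placeholders):

// Pass 1: fast, rough epeg downscale to roughly 2x the target size.
// Pass 2: convert produces the final, sharper 800px thumbnail.
$src = escapeshellarg( 'original.jpg' );
$tmp = escapeshellarg( 'epeg_tmp.jpg' );
$dst = escapeshellarg( 'thumb.jpg' );
shell_exec( "epeg $src $tmp -w 1600 -m 1600 -q 100" );
shell_exec( "convert $tmp -quality 80 -thumbnail 800x -depth 8 $dst" );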
--
/Carl Fürstenberg <azatoth(a)gmail.com>
So. I checked out a copy of phase3 and extensions to start investigating the feasibility of a comprehensive parser regression test. After getting the working copy downloaded, I did what I usually do - blow away the extensions directory stub that comes with phase3 and soft link the downloaded copy of extensions in its place. I then began familiarizing myself with DumpHTML by starting it up in a debugger. Guess what happened.
It fell over. Why? Because DumpHTML is yet another software module that computes the value of $IP. So what? Well, DumpHTML.php is located in ../extensions/DumpHTML. At lines 57-59 it executes:
$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) {
	$IP = dirname(__FILE__).'/../..';
}
This works on a deployed version of MW, since the extensions directory is embedded in /phase3. But in a development checkout, where /extensions is a separate directory beside /phase3, '/../..' does not get you to phase3; it gets you to the MW root. So, when you execute the next line:
require_once( $IP."/maintenance/commandLine.inc" );
DumpHTML fails.
Of course, since I am going to change DumpHTML anyway, I can move it to /phase3/maintenance, change the '/../..' to '/..', and get on with it. But, for someone attempting to fix bugs in DumpHTML, code that relies on knowing where DumpHTML.php sits in the distribution tree is an issue.
Dan
One of the first problems to solve in developing the proposed CPRT is how to call a function with the same name in two different MW distributions. I can think of three ways: 1) use the namespace facility of PHP 5.3, 2) use threads, or 3) use separate processes and IPC. Since MAMP supports none of these, I am off building an AMP installation from scratch.
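For what it's worth, here is a rough sketch of option 3 (separate processes and IPC); render.php is an assumed helper script that would read wikitext on stdin and print the parser's HTML, not something that ships with MW:

// Option 3 sketch: run each MediaWiki's parser in its own PHP process, so the
// two code bases never share identifiers.
function renderWith( $installPath, $wikitext ) {
	$cmd = 'php ' . escapeshellarg( $installPath . '/render.php' );
	$spec = array( 0 => array( 'pipe', 'r' ), 1 => array( 'pipe', 'w' ) );
	$proc = proc_open( $cmd, $spec, $pipes );
	fwrite( $pipes[0], $wikitext );
	fclose( $pipes[0] );
	$html = stream_get_contents( $pipes[1] );
	fclose( $pipes[1] );
	proc_close( $proc );
	return $html;
}

$wikitext = "'''Hello''' world";
$oldHtml = renderWith( '/var/wiki/mediawiki-1.14', $wikitext );
$newHtml = renderWith( '/var/wiki/phase3', $wikitext );
// diff $oldHtml against $newHtml here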
Some questions:
* Are there other ways to solve the identifier collision problem?
* Are some of the options I mention unsuitable for a MW CPRT? E.g., currently MW only assumes PHP 5.0, and requiring 5.3 may unacceptably constrain the user base.
* Is MW thread safe?
Dan