On Thu, Jul 27, 2006 at 12:05:57AM -0600, Chad Perrin wrote:
Please don't construe this statement as a veiled snipe at PHP (or Java, or any other language). It's just an observation of fact.
I should help stab someone for choosing PHP, too, for that matter -- but we work with what we've got.
Unthreaded: in a clear field, Chad, what *would* you have implemented MediaWiki in? And why?
(Thread-kill is your friend, folks...)
Cheers, -- jra
Jay R. Ashworth wrote:
Unthreaded: in a clear field, Chad, what *would* you have implemented MediaWiki in? And why?
I can't speak for other people, but I would do the parser in C (using lex/yacc) and the rest in Perl.
For the parser, I'm sure the "why" is self-evident. For the rest, the answer to "why" unfortunately includes "I don't know any Python or Ruby". All I know is that PHP is unsuitable for anything larger than a quick hack.
Timwi
Timwi wrote:
For the parser, I'm sure the "why" is self-evident. For the rest, the answer to "why" unfortunately includes "I don't know any Python or Ruby". All I know is that PHP is unsuitable for anything larger than a quick hack.
Delighted to see that you've called Yahoo! a quick hack. ;-)
On Thu, Jul 27, 2006 at 08:40:35PM -0400, Edward Z. Yang wrote:
Timwi wrote:
For the parser, I'm sure the "why" is self-evident. For the rest, the answer to "why" unfortunately includes "I don't know any Python or Ruby". All I know is that PHP is unsuitable for anything larger than a quick hack.
Delighted to see that you've called Yahoo! a quick hack. ;-)
He said it was "unsuitable for a quick hack", not that people don't use it for anything more than a quick hack.
I (mostly) agree.
Chad Perrin wrote:
He said it was "unsuitable for a quick hack", not that people don't use it for anything more than a quick hack.
I (mostly) agree.
Not sure if I'm interpreting you correctly: while you're picking holes in my logic (correct on all counts), you agree with the basic gist?
On Fri, Jul 28, 2006 at 07:26:15PM -0400, Edward Z. Yang wrote:
Not sure if I'm interpreting you correctly: while you're picking holes in my logic (correct on all counts), you agree with the basic gist?
I mostly agree that PHP is unsuitable for anything larger than a quick hack. That doesn't mean it doesn't get used for things larger than a quick hack. That's the point I was trying to make.
On Fri, Sep 01, 2006 at 05:04:15PM -0700, Jeff Carr wrote:
On 07/28/06 16:33, Chad Perrin wrote:
I mostly agree that PHP is unsuitable for anything larger than a quick hack. That doesn't mean it doesn't get used for things larger than a quick hack. That's the point I was trying to make.
Unfortunately your point is wrong.
Your argument is compelling. I yield the field to your superior debate acumen.
. . . months later.
(Timwi timwi@gmx.net):
Unthreaded: in a clear field, Chad, what *would* you have implemented MediaWiki in? And why?
I can't speak for other people, but I would do the parser in C (using lex/yacc) and the rest in Perl.
For the parser, I'm sure the "why" is self-evident. For the rest, the answer to "why" unfortunately includes "I don't know any Python or Ruby". All I know is that PHP is unsuitable for anything larger than a quick hack.
Just for the record, I probably would have done the whole thing in Java except that PHP had two advantages I couldn't ignore: first, it let me get the whole thing up and running in 2-3 weeks at a time when the existing wiki was in serious meltdown and had to be fixed quickly; second, it allowed me to replace some of the existing code incrementally for easier testing.
On Sat, Aug 26, 2006 at 11:15:15PM -0500, Lee Daniel Crocker wrote:
Just for the record, I probably would have done the whole thing in Java except that PHP had two advantages I couldn't ignore: first, it let me get the whole thing up and running in 2-3 weeks at a time when the existing wiki was in serious meltdown and had to be fixed quickly; second, it allowed me to replace some of the existing code incrementally for easier testing.
Well let us all give thanks for meltdown, then. :-)
Not the person I expected a reply from, but welcome back, I guess.
Cheers, -- jr "if you were, y'know, gone" a
Unthreaded: in a clear field, Chad, what *would* you have implemented MediaWiki in?
Dunno about Chad, but I would have implemented it in Perl, of course. With perhaps bison and Inline::* where needed.
And why?
That's a loaded question. :) More experienced people familiar with the language available for development. Namespaces. A mature database API. No php.ini mess. Unicode. Lexical variables. Real hashes. "use strict". Consistent naming, use of case, and return values. The ability to use qq{}. Perldoc[1]. Real references and data structures. Good comparison operators. XS. True object orientation.
However, PHP is what we got, and MediaWiki is pretty well written and head and shoulders above 99% of the PHP apps out there. Once I finish Postgres support for MediaWiki, I'll be converting it to Perl. Just don't hold your breath. :)
[1] As I'm writing this, www.php.net appears to be down.
-- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200607272132 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Greg Sabino Mullane wrote:
[1] As I'm writing this, www.php.net appears to be down.
Down for me too, but the mirrors are working fine. Personally speaking, I think that PHP's manual is /a lot/ better than Perl's.
Everyone knows that Mediawiki should have been written in Javascript. The current implementation is a mess.
=D
On 7/27/06, Edward Z. Yang edwardzyang@thewritingpot.com wrote:
Greg Sabino Mullane wrote:
[1] As I'm writing this, www.php.net appears to be down.
Down for me too, but the mirrors are working fine. Personally speaking, I think that PHP's manual is /a lot/ better than Perl's.
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
On Thu, Jul 27, 2006 at 08:03:00PM -0700, mboverload wrote:
Everyone knows that Mediawiki should have been written in Javascript. The current implementation is a mess.
Frankly, if it was written end-to-end in Javascript, I (and millions of others) would probably use something else.
On 7/27/06, Chad Perrin perrin@apotheon.com wrote:
On Thu, Jul 27, 2006 at 08:03:00PM -0700, mboverload wrote:
Everyone knows that Mediawiki should have been written in Javascript. The current implementation is a mess.
Frankly, if it was written end-to-end in Javascript, I (and millions of others) would probably use something else.
Don't take this as an insult, because I'm really not sure, but I was really joking. Building MediaWiki in JavaScript is a joke; it's impossible... right? I mean, how would JavaScript write to the database? Maybe you thought I was talking about AJAX?
- mboverload
Don't take this as an insult, because I'm really not sure, but I was really joking. Building MediaWiki in JavaScript is a joke; it's impossible... right? I mean, how would JavaScript write to the database? Maybe you thought I was talking about AJAX?
You haven't heard about bittorrent? There will be no need for the database!!!
(BTW, server side javascript, and various mutations, exists too).
On Fri, Jul 28, 2006 at 01:42:45PM +0300, Domas Mituzas wrote:
Don't take this as an insult, because I'm really not sure, but I was really joking. Building MediaWiki in JavaScript is a joke; it's impossible... right? I mean, how would JavaScript write to the database? Maybe you thought I was talking about AJAX?
You haven't heard about bittorrent? There will be no need for the database!!!
(BTW, server side javascript, and various mutations, exists too).
Eww. After reading that, I think my brain needs a shower.
Chad Perrin wrote:
On Fri, Jul 28, 2006 at 01:42:45PM +0300, Domas Mituzas wrote:
(BTW, server side javascript, and various mutations, exists too).
Eww. After reading that, I think my brain needs a shower.
Modern JavaScript (okay, ECMAScript) is actually a pretty nice language. It's just the browser implementations that generally suck. But it's certainly no worse than PHP.
On Fri, Jul 28, 2006 at 06:05:31PM +0300, Ilmari Karonen wrote:
Chad Perrin wrote:
On Fri, Jul 28, 2006 at 01:42:45PM +0300, Domas Mituzas wrote:
(BTW, server side javascript, and various mutations, exists too).
Eww. After reading that, I think my brain needs a shower.
Modern JavaScript (okay, ECMAScript) is actually a pretty nice language. It's just the browser implementations that generally suck. But it's certainly no worse than PHP.
Oh, it's almost certainly better than PHP. My brain still needs a shower -- unless this means that all PHP will now magically be replaced with server-side Javascript, in which case we might have a net win.
"Domas Mituzas" wrote:
Don't take this as an insult, because I'm really not sure, but I was really joking. Building MediaWiki in JavaScript is a joke; it's impossible... right? I mean, how would JavaScript write to the database? Maybe you thought I was talking about AJAX?
You haven't heard about bittorrent? There will be no need for the database!!!
Well, this was meant as a joke, but it exists! It's a mix of wiki and peer-to-peer. You pass to your contacts the articles you think are good and block the bad ones. Complete anarchy, impractical for Wikipedia; we would lose our benevolent dictator ;-)
Platonides wrote:
Well, this was meant as a joke, but it exists! It's a mix of wiki and peer-to-peer. You pass to your contacts the articles you think are good and block the bad ones. Complete anarchy, impractical for Wikipedia; we would lose our benevolent dictator ;-)
Hoi, Actually, the software for Wikipedia in a peer-to-peer environment has already largely been written. The part that still needs doing is the modelling of the distribution given an evolving demand for content. To do this, a GRID network is available to emulate the network and the algorithms involved. Running this emulation requires traffic data, and there is a tool that can collect our traffic data in real time. What remains is the implementation of all this... It can be implemented once there is a decision that such an exercise is not only of great academic interest but also has great potential benefit.
PS this tool can also give us statistics of the traffic to our services ..
Thanks, GerardM
On 7/28/06, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, Actually, the software for Wikipedia in a peer to peer environment has already largely been written..
Really? Where can I download it?
On Fri, Jul 28, 2006 at 03:39:35AM -0700, mboverload wrote:
Don't take this as an insult, because I'm really not sure, but I was really joking. Building MediaWiki in JavaScript is a joke; it's impossible... right? I mean, how would JavaScript write to the database? Maybe you thought I was talking about AJAX?
I had the impression it was probably a joke, but decided to answer the hypothetical unasked question anyway -- and it would be "impossible" to write the whole thing as it currently exists in Javascript, with current implementations of Javascript, but it's not impossible to create an online encyclopedia without any server-side scripting for the website.
On Fri, Jul 28, 2006 at 01:34:31AM -0000, Greg Sabino Mullane wrote:
Unthreaded: in a clear field, Chad, what *would* you have implemented MediaWiki in?
Dunno about Chad, but I would have implemented it in Perl, of course. With perhaps bison and Inline::* where needed.
That's actually pretty much my answer.
And why?
That's a loaded question. :) More experienced people familiar with the language available for development. Namespaces. A mature database API. No php.ini mess. Unicode. Lexical variables. Real hashes. "use strict". Consistent naming, use of case, and return values. The ability to use qq{}. Perldoc[1]. Real references and data structures. Good comparison operators. XS. True object orientation.
Those are pretty much my reasons. Are you preplagiarizing me? (ahem)
However, PHP is what we got, and MediaWiki is pretty well written and head and shoulders above 99% of the PHP apps out there. Once I finish Postgres support for MediaWiki, I'll be converting it to Perl. Just don't hold your breath. :)
Ditto that first sentence. Cheers and applause for the second. I wasn't planning to, with regard to the third.
As for original content, rather than just metoos . . .
Another option, besides Perl, that appeals to me is Ruby -- and for many of the same reasons as Perl (though it lags in some areas, such as number of available developers and volume of existing code that could be used). It also has some benefits that distinguish it from Perl, such as a far better syntax for its object model and a tendency to encourage readable code more readily (that doesn't mean you can't write equally readable Perl, just that the language's syntax tends to "encourage" it more than Perl's). Both languages have thoroughly excellent regex engines, with Perl's having perhaps a slight advantage, easily made up by Ruby's facility with iteration. On the other hand, there's the simple fact that Perl execution performance kicks butt all over Ruby, and perhaps every other high-level, reasonably dynamic, comparable language, for most purposes.
If I knew enough of some Lisp to be functional (ha ha, 'scuse the pun), I might lean in that direction as well.
I like the proposed idea of writing one or two core, high-load components in C, or even (if we're really adventurous) something better performing like Ada, though that's probably really pushing it. Since I don't much enjoy looking at C code, though, I might just ask someone else to write the C, so I guess *I* in particular might not implement any of it in C. Eh.
Perl really strikes me as the clear winner, overall, with Ruby a close second about a hair's-breadth behind it.
Chad Perrin wrote:
Perl really strikes me as the clear winner
I'm really reluctant to get involved in this discussion, but I'll pitch in here. In practice, I've found that the only way to keep shared Perl codebases from turning into a heaping pile of craptitude is to have them hacked on exclusively by people with pretty deep Perl expertise, which tends to not be the case for (most) open source projects that aren't centered around the language. Your mileage may certainly vary.
FWIW, after reaching the second to last stage of Nat Torkington's 'Seven stages of a Perl programmer,' and using Perl as a primary language for more years than I care to count, I've personally found Zen in Python.
On 7/28/06, Ivan Krstic krstic@solarsail.hcs.harvard.edu wrote:
FWIW, after reaching the second to last stage of Nat Torkington's 'Seven stages of a Perl programmer,' and using Perl as a primary language for more years than I care to count, I've personally found Zen in Python.
I googled this and found references to a seven stages by Tom Christiansen...
http://prometheus.frii.com/~gnat/yapc/2000-stages/slide1.html
According to that, you have not yet rewritten major parts of the compiler, or become on first-name terms with Larry's wife.
Steve
Steve Bennett wrote:
I googled this and found references to a seven stages by Tom Christiansen...
Gnat gave the (relatively famous) YAPC talk, though the original 'seven stages' observation was by Tom, yes.
english
Ivan Krstic krstic@solarsail.hcs.harvard.edu wrote: Steve Bennett wrote:
I googled this and found references to a seven stages by Tom Christiansen...
Gnat gave the (relatively famous) YAPC talk, though the original 'seven stages' observation was by Tom, yes.
Hello, I would like to learn the English language for strictly professional reasons.
djafer3107 maatallah djafer3107@yahoo.fr wrote: english
Ivan Krstic wrote: Steve Bennett wrote:
I googled this and found references to a seven stages by Tom Christiansen...
Gnat gave the (relatively famous) YAPC talk, though the original 'seven stages' observation was by Tom, yes.
Moin,
On Friday 28 July 2006 08:44, Chad Perrin wrote: [snip quite a bit]
Perl really strikes me as the clear winner, overall, with Ruby a close second about a hair's-breadth behind it.
<offtopic>
I don't know much about Ruby, except that I heard its Unicode support is really lacking. Which pretty much rules it out for any serious text processing in this age :-D
However, the entire Ruby project always struck me as a me-too-lets-reinvent-the-wheel-and-this-time-make-it-rounder project, like so many others (*cough*Perl6*cough*).
Yes, Perl5 has some problems, like carrying baggage from a decade or two that nobody really needs anymore, but I am not sure that yet-another-interpreted-language (that is only 60..90% complete, undertested etc) is the real answer. It just fragments the coder base even more.
We have way too many programming languages already.
Best wishes,
Tels
-- Signed on Fri Jul 28 12:35:59 2006 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"Wo die Schoschonen schön wohnen." ("Where the Shoshone live so finely.")
On 7/28/06, Carlos angus@quovadis.com.ar wrote:
Tels wrote:
I don't know much about Ruby, except that I heard its Unicode support is really lacking. Which pretty much rules it out for any serious text processing in this age :-D
Does MediaWiki do any "serious text processing"?
In case you didn't know, MediaWiki works with several non-English languages, which are more or less impossible to support unless your programming language understands Unicode.
henna
henna wrote:
On 7/28/06, Carlos angus@quovadis.com.ar wrote:
Tels wrote:
I don't know much about Ruby, except that I heard its Unicode support is really lacking. Which pretty much rules it out for any serious text processing in this age :-D
Does MediaWiki do any "serious text processing"?
In case you didn't know, MediaWiki works with several non-English languages, which are more or less impossible to support unless your programming language understands Unicode.
And how is it possible now?
Tip: investigate before answering (search for 'xc0' in the code tree).
"Carlos" wrote:
In case you didn't know, MediaWiki works with several non-English languages, which are more or less impossible to support unless your programming language understands Unicode.
And how is it possible now?
Tip: investigate before answering (search for 'xc0' in the code tree).
MediaWiki uses PHP's binary strings, in which you can store anything, Unicode or not.
Unicode support in PHP is also lacking; you can't save a PHP script as UTF-8 or even UTF-16. But a programming language more or less does not even need to support it, as Unicode is designed to be compatible with older charsets. You can also handle strings as binary data, no problem then.
For PHP there is also a multibyte extension, and there are some functions available which allow you to convert between ISO-8859-X and UTF-8 (for XML support, e.g.). I don't know how far MediaWiki makes use of such features, as it uses UTF-8 only, I guess, which is compatible, and therefore there is no need for PHP to be "compliant" or something like that.
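For the curious, the ISO-8859-X to UTF-8 conversion being described (what PHP exposes via iconv() or the mbstring functions) can be sketched in a few lines; Python here purely for illustration, since it handles both encodings in its standard library:

```python
# "café" in Latin-1 is a single 0xE9 byte for the accented character;
# the same character in UTF-8 becomes the two-byte sequence C3 A9.
latin1 = "café".encode("iso-8859-1")                # b'caf\xe9'
utf8 = latin1.decode("iso-8859-1").encode("utf-8")  # b'caf\xc3\xa9'
print(utf8.decode("utf-8"))                         # café
```

The round trip is lossless because every ISO-8859-1 code point has a Unicode equivalent, which is the compatibility the post refers to.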
On Mon, Jul 31, 2006 at 04:46:58PM +0200, Warhog (aja Julian Fleischer) wrote:
Unicode support in PHP is also lacking; you can't save a PHP script as UTF-8 or even UTF-16. But a programming language more or less does not even need to support it, as Unicode is designed to be compatible with older charsets. You can also handle strings as binary data, no problem then.
For PHP there is also a multibyte extension, and there are some functions available which allow you to convert between ISO-8859-X and UTF-8 (for XML support, e.g.). I don't know how far MediaWiki makes use of such features, as it uses UTF-8 only, I guess, which is compatible, and therefore there is no need for PHP to be "compliant" or something like that.
PHP is supposedly planning to incorporate Python's ICU, which has some reasonable Unicode support for regexen, at some point in the future. Ruby is reportedly integrating Oniguruma (a regular expression engine) by the end of the year, which will apparently provide substantial Unicode support -- though Oniguruma can be used now as an external library, of course, and someone started working on ICU support for Ruby a while ago too (as an external library -- though of course it's an external library in Python too). Perl, of course, probably has several dozen ways to support Unicode in CPAN.
. . . but as far as I'm aware, there's no such thing as a language that provides full native Unicode support. The best we could do is use an external library, which is something you can do with Ruby anyway.
On 7/31/06, Chad Perrin perrin@apotheon.com wrote:
. . . but as far as I'm aware, there's no such thing as a language that provides full native Unicode support.
It appears PHP 6 will: http://www.zend.com/zend/week/php-unicode-design.txt.
Chad Perrin wrote:
PHP is supposedly planning to incorporate Python's ICU, which has some reasonable Unicode support for regexen, at some point in the future.
PHP already has unicode regex support, because PCRE has had it for some time and PHP just bundles that. In fact, the simplest way to split a UTF-8 string by character in PHP 4-5 with no mbstring is to do preg_match_all('/./u',...). MediaWiki uses this on occasion.
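The same per-character split can also be done without a regex engine at all, relying on the UTF-8 property that continuation bytes always match the bit pattern 10xxxxxx (presumably what the 'xc0' masks mentioned earlier in the thread exploit). A minimal illustration in Python, working on raw bytes the way PHP's binary strings do:

```python
def split_utf8_chars(data: bytes) -> list[bytes]:
    # A byte starts a new character iff (b & 0xC0) != 0x80,
    # i.e. it is not a 10xxxxxx continuation byte.
    chars = []
    for b in data:
        if (b & 0xC0) != 0x80 or not chars:
            chars.append(bytes([b]))
        else:
            chars[-1] += bytes([b])  # glue continuation byte onto current char
    return chars

print(split_utf8_chars("naïve".encode("utf-8")))
# [b'n', b'a', b'\xc3\xaf', b'v', b'e']
```

This is exactly why UTF-8 can be processed by a language with no native Unicode type: character boundaries are recoverable from the bytes alone.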
In PHP 6, they are moving to a 16-bit character type (not sure if it's UTF-16 or UCS-2), with a distinct binary string type. If "unicode semantics" are enabled, string literals will be unicode by default, and all the usual string operations would be character-wise. I dare say this would cause some backwards compatibility problems for applications such as MediaWiki.
PHP 6 requires ICU for its internal unicode support, but I'm not sure to what extent they will be providing interfaces to ICU's more complex functions. Note that ICU is not "Python's ICU", it's a library written by IBM which is natively C, C++ and Java. There is a set of swig wrappers to bind the C++ API to Python.
-- Tim Starling
On Fri, Jul 28, 2006 at 12:40:00PM +0200, Tels wrote:
On Friday 28 July 2006 08:44, Chad Perrin wrote: [snip quite a bit]
Perl really strikes me as the clear winner, overall, with Ruby a close second about a hair's-breadth behind it.
<offtopic>
I don't know much about Ruby, except that I heard its Unicode support is really lacking. Which pretty much rules it out for any serious text processing in this age :-D
"Really lacking" seems to be a bit too vehement for the current state of affairs. As far as I'm aware, regex support is excellent. Some kernel methods don't handle Unicode transformations as well as they could, though the problem isn't in Unicode support so much as in localization (handling capitalization in German and Turkish, for instance, which are a touch quirky by English language standards). I'm pretty sure Ruby isn't the only language to run into localization issues from time to time, and it may be that Ruby runs into them more often because of its wealth of convenient string operation methods that are often lacking in other languages.
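The German and Turkish quirks mentioned above are easy to demonstrate concretely; Python is used here only for illustration, since the underlying Unicode casing rules are language-independent:

```python
# German: the sharp s uppercases to "SS", so case round-trips are lossy.
print("straße".upper())    # STRASSE
print("STRASSE".lower())   # strasse -- the ß does not come back
# casefold() exists precisely for caseless matching across such quirks:
print("straße".casefold() == "strasse")  # True
# Turkish: under Turkish rules "I" should lowercase to dotless "ı",
# but built-in case mapping is locale-independent, so you get:
print("I".lower())         # i  (wrong for Turkish text)
```

Handling these correctly needs locale-aware tailoring (e.g. an ICU-style library), which is the localization problem rather than a Unicode storage problem.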
However, the entire Ruby project always struck me as a me-too-lets-reinvent-the-wheel-and-this-time-make-it-rounder project, like so many others (*cough*Perl6*cough*).
1. That could be said of EVERY language. I suppose we could just rewrite MediaWiki in Assembly language, after all.
2. Ruby is quite NOT a "metoo" language. It's older than most realize: spending most of its formative years in Japan and only gaining Western recognition with the advent of Rails does not equate to a lifespan measurable entirely in this century. It also offers a lot of capabilities and conveniences not found in many other languages. It may be the closest thing we've got to a Lisp with an imperative/OO syntax, for instance.
3. Perl 6 is an attempt to address some very real problems, not just a "reinvent and make it rounder" effort as you so easily dismiss it. The fact it's vaporware doesn't change the fact that, if it ever completes, it's likely to be an excellent and needed language. On the other hand, I suppose all we really need is a hex editor so we're not reinventing wheels.
Yes, Perl5 has some problems, like carrying baggage from a decade or two that nobody really needs anymore, but I am not sure that yet-another-interpreted-language (that is only 60..90% complete, undertested etc) is the real answer. It just fragments the coder base even more.
Did you perhaps mean "99.60..99.90%" there?
We have way too many programming languages already.
Yeah, progress sucks. Ahem.
On Fri, Jul 28, 2006 at 12:44:33AM -0600, Chad Perrin wrote:
I like the proposed idea of writing one or two core, high-load components in C, or even (if we're really adventurous) something better performing like Ada, though that's probably really pushing it. Since I don't much enjoy looking at C code, though, I might just ask someone else to write the C, so I guess *I* in particular might not implement any of it in C. Eh.
We actually do this. The diff engine exists as a PHP plugin written in C (or C++, which is nearly the same). Tim is working on another plugin that will do much faster translations of UTF-8 strings from e.g. traditional to simplified Chinese.
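In case it helps picture the approach: script conversion of this sort generally amounts to longest-match replacement over a mapping table. A toy sketch in Python, with made-up table entries, purely to show the shape of the algorithm (the real plugin is C operating on UTF-8 byte strings, and MediaWiki's actual tables are far larger):

```python
# Illustrative traditional -> simplified entries; NOT MediaWiki's real tables.
T2S = {"體": "体", "計算機": "计算机"}

def convert(text: str, table: dict[str, str]) -> str:
    out, i = [], 0
    keys = sorted(table, key=len, reverse=True)  # prefer the longest match
    while i < len(text):
        for k in keys:
            if text.startswith(k, i):
                out.append(table[k]); i += len(k); break
        else:
            out.append(text[i]); i += 1              # no mapping: copy as-is
    return "".join(out)

print(convert("計算機好", T2S))  # 计算机好
```

Multi-character keys are why longest-match matters: converting character by character would mangle compound words that map as a unit.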
Regards,
jens
On Fri, Jul 28, 2006 at 06:39:12PM +0200, Jens Frank wrote:
On Fri, Jul 28, 2006 at 12:44:33AM -0600, Chad Perrin wrote:
I like the proposed idea of writing one or two core, high-load components in C, or even (if we're really adventurous) something better performing like Ada, though that's probably really pushing it. Since I don't much enjoy looking at C code, though, I might just ask someone else to write the C, so I guess *I* in particular might not implement any of it in C. Eh.
We actually do this. The diff engine exists as a PHP plugin written in C (or C++, which is nearly the same).
. . . aside from the fact that it often exhibits an at least arithmetic increase in execution time as compared with straight C.
Sorry, minor, largely pointless quibble. There's a bit of a language execution time debate going on somewhere else that has absorbed me recently. The subject is sorta stuck in my head at the moment.
On Fri, Jul 28, 2006 at 10:59:40AM -0600, Chad Perrin wrote:
. . . aside from the fact that it often exhibits an at least arithmetic increase in execution time as compared with straight C.
Sorry, minor, largely pointless quibble.
It's not pointless when you're putting out >100Mbps continuously.
Cheers, -- jra
Chad Perrin wrote:
On Fri, Jul 28, 2006 at 06:39:12PM +0200, Jens Frank wrote:
[...]
We actually do this. The diff engine exists as a PHP plugin written in C (or C++, which is nearly the same).
. . . aside from the fact that it often exhibits an at least arithmetic increase in execution time as compared with straight C.
Sorry, minor, largely pointless quibble. There's a bit of a language execution time debate going on somewhere else that has absorbed me recently. The subject is sorta stuck in my head at the moment.
Through my aborted PhD, I have a couple of years experience writing highly efficient C++ programs. The aforementioned wikidiff2 extension does no memory allocation during the computational phase, instead of substring operations it passes around start and end iterators. There are no virtual functions. The only remaining performance hit is the need to pass "this" pointers and dereference them, although that's a feature of structured C programming as well, and provides a great deal of flexibility. The performance penalty can be controlled through the extensive use of inline functions.
There are certain programming styles which are popular in C++, which do produce a significant performance hit. That's not an issue when your main goal is to produce high-performance code -- in that case you can easily avoid the pitfalls, and the remaining optimisation issues are largely the same as in C.
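The iterator-passing style described above can be sketched in miniature; plain integer indices stand in for C++ iterators here (Python used only for illustration):

```python
# Find the length of the common prefix of two string ranges without
# creating any substring copies: only (start, end) bounds are passed,
# mirroring the begin/end iterator pairs wikidiff2 passes around.
def common_prefix_len(a: str, b: str, a_start: int, a_end: int,
                      b_start: int, b_end: int) -> int:
    n = 0
    while (a_start + n < a_end and b_start + n < b_end
           and a[a_start + n] == b[b_start + n]):
        n += 1
    return n

print(common_prefix_len("hello world", "help me", 0, 11, 0, 7))  # 3
```

In C++ the payoff is that no allocation happens in the hot loop; the same discipline is what keeps the computational phase of the diff allocation-free.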
-- Tim Starling
On the subject of diff efficiency, which button is more expensive, Show Preview or Show Changes? I have an ultraparanoid bot which "presses" one or both before actually submitting, and it occurs to me that it would be marginally easier on the servers if the bot used the cheaper one first. (Though of course only if substantial numbers of its doublechecks failed, which they typically don't.)
Show changes has to compare two versions, so I would guess that's the bigger hog
On 7/30/06, Steve Summit scs@eskimo.com wrote:
On the subject of diff efficiency, which button is more expensive, Show Preview or Show Changes? I have an ultraparanoid bot which "presses" one or both before actually submitting, and it occurs to me that it would be marginally easier on the servers if the bot used the cheaper one first. (Though of course only if substantial numbers of its doublechecks failed, which they typically don't.)
Steve Summit wrote:
On the subject of diff efficiency, which button is more expensive, Show Preview or Show Changes? I have an ultraparanoid bot which "presses" one or both before actually submitting, and it occurs to me that it would be marginally easier on the servers if the bot used the cheaper one first. (Though of course only if substantial numbers of its doublechecks failed, which they typically don't.)
For whole pages, generating a diff takes about 2ms plus 40ms to load the text to diff against, rendering a page takes about 800ms. So it's cheaper to use "show changes" by a factor of 20. In fact, we've been considering suppressing the "current version" display from diff pages by default, especially for large pages, since rendering the current version constitutes the vast majority of the time it takes to generate those pages. These figures are averages based on the latest profiling; some pages take significantly longer than 800ms.
-- Tim Starling
On 7/30/06, Tim Starling t.starling@physics.unimelb.edu.au wrote:
For whole pages, generating a diff takes about 2ms plus 40ms to load the text to diff against, rendering a page takes about 800ms.
800ms is consistent with what I measured when I was trying to parse all past revisions... but I thought I was doing something wrong at the time, because I couldn't imagine that it was that slow.
On 7/31/06, Tim Starling t.starling@physics.unimelb.edu.au wrote:
For whole pages, generating a diff takes about 2ms plus 40ms to load the text to diff against, rendering a page takes about 800ms. So it's cheaper to use "show changes" by a factor of 20. In fact, we've been considering suppressing the "current version" display from diff pages by default, especially for large pages, since rendering the current version constitutes the vast majority of the time it takes to generate those pages. These figures are averages based on latest profiling, some pages take significantly longer than 800ms.
Now, if only the rendering all took place in JavaScript...!
:)
Steve
Tim Starling wrote:
For whole pages, generating a diff takes about 2ms plus 40ms to load the text to diff against, rendering a page takes about 800ms.
I'd gotten the impression that rendering was slower. Thanks for confirming.
So it's cheaper to use "show changes" by a factor of 20.
That much!
In fact, we've been considering suppressing the "current version" display from diff pages...
But until you do, I guess the point (i.e. which button a conscientious bot or user should press first) is moot, since the slow preview button takes 800 ms, and the fast diff button takes 40+800ms.
On 7/31/06, Steve Summit scs@eskimo.com wrote:
But until you do, I guess the point (i.e. which button a conscientious bot or user should press first) is moot, since the slow preview button takes 800 ms, and the fast diff button takes 40+800ms.
Nope, because "show changes" doesn't render the page. It just shows you the diff.
Simetrical wrote:
On 7/31/06, Steve Summit scs@eskimo.com wrote:
Tim Starling wrote:
In fact, we've been considering suppressing the "current version" display from diff pages...
But until you do, I guess the point (i.e. which button a conscientious bot or user should press first) is moot, since the slow preview button takes 800 ms, and the fast diff button takes 40+800ms.
Nope, because "show changes" doesn't render the page. It just shows you the diff.
(smacks forehead)
I fell for Tim's sly misdirection utterly. :-) He was talking about, and just there I was thinking of, the diffs run off of the history page, not the edit page.
How 'bout we get Cray to donate a supercomputer and we just have that do the diffs. =D
On 7/31/06, Steve Summit scs@eskimo.com wrote:
Simetrical wrote:
On 7/31/06, Steve Summit scs@eskimo.com wrote:
Tim Starling wrote:
In fact, we've been considering suppressing the "current version" display from diff pages...
But until you do, I guess the point (i.e. which button a conscientious bot or user should press first) is moot, since the slow preview button takes 800 ms, and the fast diff button takes 40+800ms.
Nope, because "show changes" doesn't render the page. It just shows you the diff.
(smacks forehead)
I fell for Tim's sly misdirection utterly. :-) He was talking about, and just there I was thinking of, the diffs run off of the history page, not the edit page.
On 8/1/06, mboverload mboverload@gmail.com wrote:
How 'bout we get Cray to donate a supercomputer and we just have that do the diffs. =D
We'll get a few and give them typewriters, then they can go ahead and write the encyclopaedia. Easier to clean up after than all those monkeys.
Hi!
Dunno about Chad, but I would have implemented it in Perl, of course. With perhaps bison and Inline::* where needed.
Do I have still time to suggest ObjectPerlthonasskelisp++#.net?
Anyway, languages/environments used by bigger web shops:
PHP (Yahoo, Wikipedia!!!)
Python (parts of Google?)
Java (um, around)
ColdFusion (MySpace :-)
Amazon has SOA with stuff around implemented in nearly all languages... Quite nice approach is Mono with different languages inside (like.. Python engine by Microsoft ;-)

As for personal preferences, I'd run away from Perl (did drop it 5 years ago), and use Python (with psyco \o/ )
BR, Domas
On Fri, Jul 28, 2006 at 10:46:36AM +0300, Domas Mituzas wrote:
Hi!
Dunno about Chad, but I would have implemented it in Perl, of course. With perhaps bison and Inline::* where needed.
Do I have still time to suggest ObjectPerlthonasskelisp++#.net?
Anyway, languages/environments used by bigger web shops:
PHP (Yahoo, Wikipedia!!!)
Python (parts of Google?)
Java (um, around)
ColdFusion (MySpace :-)
Perl (Slashdot)
Don't forget where the term "the slashdot effect" originated.
Amazon has SOA with stuff around implemented in nearly all languages... Quite nice approach is Mono with different languages inside (like.. Python engine by Microsoft ;-) As for personal preferences, I'd run away from Perl (did drop it 5 years ago), and use Python (with psyco \o/ )
Python makes my eyes bleed. I'd be much happier with Perl -- ESPECIALLY for anything involving regexen. Holy mudder o' gob, but Python's regex syntax is a fork in the eye, comparable in heinousness to PHP's.
On 7/28/06, Domas Mituzas midom.lists@gmail.com wrote:
Hi!
Dunno about Chad, but I would have implemented it in Perl, of course. With perhaps bison and Inline::* where needed.
Do I have still time to suggest ObjectPerlthonasskelisp++#.net?
Anyway, languages/environments used by bigger web shops:
PHP (Yahoo, Wikipedia!!!)
Python (parts of Google?)
google does indeed use python for stuff, as well as C if I'm not mistaken
henna
henna wrote:
google does indeed use python for stuff, as well as C if I'm not mistaken
Java, Python, C++ are the three most common languages at Google, in that order, according to Greg Stein in '05.
On Fri, Jul 28, 2006 at 10:46:36AM +0300, Domas Mituzas wrote:
Do I have still time to suggest ObjectPerlthonasskelisp++#.net?
That's sorta like the language name version of Chriskwaanzukkah, right?
Python (parts of Google?)
And this is the dog that didn't bark; I was a bit surprised that we didn't hear any more comments about Python as a possibility.
Ok, second question: how much leverage would we *lose* if we weren't in PHP? (Isn't some of our caching PHP-specific?)
Cheers, -- jra
On Fri, Jul 28, 2006 at 11:41:42AM -0400, Jay R. Ashworth wrote:
On Fri, Jul 28, 2006 at 10:46:36AM +0300, Domas Mituzas wrote:
Do I have still time to suggest ObjectPerlthonasskelisp++#.net?
That's sorta like the language name version of Chriskwaanzukkah, right?
I think that's "Chrismahannukwanzaaka", actually.
Python (parts of Google?)
And this is the dog that didn't bark; I was a bit surprised that we didn't hear any more comments about Python as a possibility.
I'm glad we didn't.
By the way . . . When I mentioned Slashdot as a Perl example, I apparently shot too low. I'd forgotten about Amazon.com, LiveJournal, IMDB, and del.icio.us, among others.
On 7/27/06, Jay R. Ashworth jra@baylink.com wrote:
Unthreaded: in a clear field, Chad, what *would* you have implemented MediaWiki in? And why?
Much of recent development and administration has focused on caching, clustering, failover, load balancing, and so on. It seems to me that the decision to use a ready-made application server like JBoss, Resin or Zope (or WebSphere if you want proprietary) or a different database server would make a greater difference for long term deployment of a very large scale wiki farm like Wikimedia than the choice of a particular programming language (though of course one may imply the other).
That being said, from the rough numbers I've seen about similarly sized sites like eBay or Amazon.com, which typically use such application server architectures, we are running on a ridiculously small amount of hardware. For instance, eBay in 2004 was running with 200 database backend servers [1] -- I don't think you'll find detailed specs for these, but according to [2] we're talking about big Sun machines. Of course, we're not even close to Amazon.com's or eBay's reliability.
Nevertheless, it seems clear that our "roll your own" approach, while more intensive in developer work, can save significantly on hardware. It's also interesting to compare Flickr's technological evolution, which is quite similar to our own: http://www.ludicorp.com/flickr/zend-talk.ppt
Is similar information available about Yahoo!'s setup?
One major downside is that such "perpetually customized" setups can become very complex and hard to replicate quite quickly.
It's also important to emphasize that _Wikimedia_ (as opposed to MediaWiki) runs on more than just PHP. In fact, there's probably not a single mainstream programming language that hasn't been used somewhere on the Wikimedia servers. Brion seems to greatly enjoy experimenting with new languages, and even MediaWiki itself comes with an OCaml extension. :-) It's certainly a rich learning environment.
Erik
[1] http://www.eweek.com/article2/0,1895,1640310,00.asp [2] http://www.sun.com/service/about/success/ebay.xml
Hi!
Much of recent development and administration has focused on caching, clustering, failover, load balancing, and so on.
This is more a matter of architecture than of the software used.
It seems to me that the decision to use a ready-made application server like JBoss, Resin or Zope (or WebSphere if you want proprietary) or a different database server would make a greater difference for long term deployment of a very large scale wiki farm like Wikimedia than the choice of a particular programming language (though of course one may imply the other).
I'm not aware of any huge-scale environments, that would be running on Jboss, Resin or Zope (or Websphere). Application servers with all the integration magic are required mostly for complex applications that have hundreds of thousands of developers ;)
Regarding databases - here again, architecture imposes what you use. Right now our architecture consists of:
a) Small replicated core database sets (per-language)
b) Pools of replicated text storage nodes
c) A pool of lossy in-memory hash store nodes
We will probably also be adding a
d) Pool of fully clustered storage, for session objects.
What we can introduce is different storage paradigms for different objects; here we choose software that works and is easy to maintain.
That being said, from the rough numbers I've seen about similarly sized sites like eBay or Amazon.com, which typically use such application server architectures, we are running on a ridiculously
I'm not totally compatible with all enterprise software, but at least Amazon is using a pretty lightweight setup, with most of the stuff being routed to 'services' around, SOAP, WSDL, yadda yadda. I'm not sure they need a full-blown app server for this at the front.
Nevertheless, it seems clear that our "roll your own" approach, while more intensive in developer work, can save significantly on hardware. It's also interesting to compare Flickr's technological evolution, which is quite similar to our own: http://www.ludicorp.com/flickr/zend-talk.ppt
I'm not sure it takes us longer to roll our own stuff, than it would take to leverage all the other solutions.
Is similar information available about Yahoo!'s setup?
They're quite similar to us, it is just that they have more redundancy over multiple datacenters (and have practice of serving from multiple datacenters at the same time too).
with new languages, and even MediaWiki itself comes with an OCaml extension. :-) It's certainly a rich learning environment.
Well, yes, we even have direct PHP extensions written in C++/C, and there's various outside-of-mediawiki code in boo, python, C#, perl too ;-)
On 7/28/06, Domas Mituzas midom.lists@gmail.com wrote:
I'm not aware of any huge-scale environments, that would be running on Jboss, Resin or Zope (or Websphere).
The open source ones are fairly untested in _huge_-scale environments AFAICT (then again, so was PHP until recently). But for "enterprise level" scalability, see http://www.zope.com/customers/case_studies.html http://www.jboss.com/customers/index
eBay is "powered by WebSphere" and probably a good comparison as a dynamic web application. They're still getting more traffic than we do. ;-)
I'm not totally compatible with all enterprise software, but at least Amazon is using pretty lightweight setup with most of stuff being routed to 'services' around, SOAP, WSDL, yadda yadda. I'm not sure if they need full-blown app server for this at the front.
What I do know is that they're making a bloody mess out of all those aggregated web services. A little tagging here, a little wiki there, here's some recommendations, and a wishlist and a puppy, too ...
Erik
Erik Moeller wrote:
On 7/27/06, Jay R. Ashworth jra@baylink.com wrote:
Unthreaded: in a clear field, Chad, what *would* you have implemented MediaWiki in? And why?
Much of recent development and administration has focused on caching, clustering, failover, load balancing, and so on. It seems to me that the decision to use a ready-made application server like JBoss, Resin or Zope (or WebSphere if you want proprietary) or a different database server would make a greater difference for long term deployment of a very large scale wiki farm like Wikimedia than the choice of a particular programming language (though of course one may imply the other).
One downside to the application-server approach is that, unless carefully written so that use of the application server was optional, it would significantly complicate the reuse potential of MediaWiki by smaller-scale operations. Now, making MediaWiki maximally reusable open-source software is a secondary goal to using it ourselves, but insofar as it's the only reliable renderer of our content, it's not entirely irrelevant.
The current setup has the nice benefit that it is almost ludicrously easy to configure: The only prerequisites are the very standard MySQL and PHP, and then you just untar the source files and run config.php.
-Mark
On Fri, Jul 28, 2006 at 05:36:38PM -0400, Delirium wrote:
The current setup has the nice benefit that it is almost ludicrously easy to configure: The only prerequisites are the very standard MySQL and PHP, and then you just untar the source files and run config.php.
. . . and that is just awesome.
Moin,
On Saturday 29 July 2006 00:41, Chad Perrin wrote:
On Fri, Jul 28, 2006 at 05:36:38PM -0400, Delirium wrote:
The current setup has the nice benefit that it is almost ludicrously easy to configure: The only prerequisites are the very standard MySQL and PHP, and then you just untar the source files and run config.php.
. . . and that is just awesome.
Indeed. It made me painfully aware of how complicated most software is to install, especially the software written by me. Taking a slice out of that and making software easier to install and use is now one of my goals, and MediaWiki is a role model for that.
best wishes,
tels
- -- Signed on Sat Jul 29 02:10:52 2006 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"Some spammers have this warped idea that their freedom of speech is guaranteed all the way into my hard drive, but it is my firm belief that their rights end at my firewall." -- Nigel Featherston