Hi all,
as a few of you might know, I just wrote a complete (well, almost) Wikipedia software as a PHP script!
It has all essential wiki features like article editing, version management, user management, subpages, etc. Additionally, its data storage is entirely MySQL-based (fast!), it has a file upload tool, and more goodies are coming soon. Maybe best of all, the script and database are prepared to support some kind of editor/superuser functionality for "locking" pages, as is currently being discussed.
Now to the bad sides (yes, there are some...)
- I don't have a server to host it yet. Maybe I can run it on the Nupedia server sometimes. So, no trying yet, sorry.
- The parser (to convert the source text into readable stuff) is very basic. I copied the HomePage and the SandBox from wikipedia, and they look about the same, but this is where the bugs will be.
- Currently, I don't have a means to convert wikipedia to MySQL automatically, which would have to be done if (IF!) this script ever gets used.
Just letting you know there's an early but working alternative ready...
Magnus
How difficult do you think it will be to import all the existing wikipedia data into your version?
What's your take on the general availability of PHP skills on the net, versus Perl skills?
I've been talking to Clifford Adams about the future of Wikipedia and UseModWiki, and he's given his blessing to a Wikipedia-centric fork of his code. Some of the design goals for UseModWiki, namely that it be easy to install, etc., aren't necessarily consistent with some of the more highly specialized needs of Wikipedia. He suggests that we find someone (Magnus? :-)) to take over the forked version, and that we could set up a modern CVS, etc.
I'm reluctant to go the PHP route, partially because I don't personally know much about PHP, but if you can convince me that it's sufficiently superior to a mod_perl or perl fastcgi solution, I'm thinking that we could go your route. Your code could be wikipedia-centric, and we could all learn to help you with it.
--Jimbo
[Wikipedia-l] To manage your subscription to this list, please go here: http://www.nupedia.com/mailman/listinfo/wikipedia-l
I downloaded one of these tarballs, but didn't look at it in detail, so I'm uncertain how complicated that would be. But if we're to move to MySQL anyway, either with my solution or a new CGI script, conversion is unavoidable.
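The conversion step Magnus describes could look roughly like this. The table schema and page layout here are purely hypothetical, and sqlite3 stands in for MySQL so the sketch stays self-contained:

```python
# Hypothetical sketch of converting wiki pages (title -> source text) into
# database rows, one row per article. Schema names are illustrative only;
# sqlite3 is used here in place of MySQL to keep the example runnable.
import sqlite3

def convert(pages, db):
    """Insert every wiki page as a row in an 'article' table."""
    cur = db.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS article "
                "(title TEXT PRIMARY KEY, text TEXT)")
    for title, text in pages.items():
        cur.execute("INSERT OR REPLACE INTO article VALUES (?, ?)",
                    (title, text))
    db.commit()

db = sqlite3.connect(":memory:")
convert({"HomePage": "Welcome!", "SandBox": "Play here."}, db)
print(db.execute("SELECT COUNT(*) FROM article").fetchone()[0])  # 2
```

A real converter would also have to carry over the page histories, not just the current text.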
I guess there are enough people out there who know some PHP, and it's easy to learn (easier than CGI, I think). As you know, Nupedia is PHP-based, and it works quite well. The script could also become a wikipedia "article" itself, which can be copied to a real file after an upgrade or error correction. It might be a good idea to limit write access to that "article", though. That could make a CVS unnecessary.
Which is another thing I need to know. Should I implement a "rights management"? If yes, how? Probably group-based, with an "editor" group giving out special rights? And limits that can be changed for each page, or should that go by page name, like "/Talk is open to everyone, /Lock can only be changed by editors"? Please give me input on this!
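The by-page-name variant Magnus floats could be sketched like this. All names here ("everyone", "editor", the "/Talk" and "/Lock" suffixes) are taken from his question or invented for illustration, not actual script behavior:

```python
# Hypothetical sketch of group-based rights keyed on page-name suffixes.
EDIT_RULES = {
    "/Talk": {"everyone"},   # talk subpages open to all users
    "/Lock": {"editor"},     # lock subpages only changeable by editors
}
DEFAULT_GROUPS = {"everyone"}  # ordinary pages stay open by default

def can_edit(page, user_groups):
    """Return True if any of the user's groups may edit this page."""
    for suffix, allowed in EDIT_RULES.items():
        if page.endswith(suffix):
            return bool(allowed & user_groups)
    return bool(DEFAULT_GROUPS & user_groups)

print(can_edit("HomePage/Talk", {"everyone"}))            # True
print(can_edit("HomePage/Lock", {"everyone"}))            # False
print(can_edit("HomePage/Lock", {"everyone", "editor"}))  # True
```

The appeal of going by page name is that no per-page limit storage is needed; the downside is that the policy is hard-coded rather than editable.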
BTW, who should I contact for testing my script and the DB on a "real" server? Toan? And should I make the script public now, so everyone can take a look? As a wiki article, or post it here on the list?
Oh, so many questions...
Finally, I'm not sure I can do the long-term maintenance of the script myself. I'll work on the rights management and bug-fixing until it runs without bugs, of course, and can fix minor things when they pop up. But in a few weeks/months, I'll be quite busy at university, and can't promise anything there.
Magnus
-----Original Message-----
From: wikipedia-l-admin@nupedia.com [mailto:wikipedia-l-admin@nupedia.com] On Behalf Of Jimmy Wales
Sent: Friday, August 24, 2001 11:30 PM
To: wikipedia-l@nupedia.com
Subject: Re: [Wikipedia-l] PHP Wikipedia
How difficult do you think it will be to import all the existing wikipedia data into your version?
What's your take on the general availability of PHP skills on the net, versus Perl skills?
I've been talking to Clifford Adams about the future of Wikipedia and UseModWiki, and he's given his blessing to a Wikipedia-centric fork of his code. Some of the design goals for UseModWiki, namely that it be easy to install, etc., aren't necessarily consistent with some of the more highly specialized needs of Wikipedia. He suggests that we find someone (Magnus? :-)) to take over the forked version, and that we could set up a modern CVS, etc.
I'm reluctant to go the PHP route, partially because I don't personally know much about PHP, but if you can convince me that it's sufficiently superior to a mod_perl or perl fastcgi solution, I'm thinking that we could go your route. Your code could be wikipedia-centric, and we could all learn to help you with it.
--Jimbo
On Fri, 24 Aug 2001, Magnus Manske wrote:
I guess there are enough people out there who know some PHP, and it's easy to learn (easier than CGI, I think). As you know, Nupedia is PHP-based, and it works quite well. The script could also become a wikipedia "article" itself, which can be copied to a real file after an upgrade or error correction. It might be a good idea to limit write access to that "article", though. That could make a CVS unnecessary.
But if we had the CVS, we could test the code more easily, couldn't we? Teach me, o master.
Which is another thing I need to know. Should I implement a "rights management"? If yes, how? Probably group-based, with an "editor" group giving out special rights? And limits that can be changed for each page, or should that go by page name, like "/Talk is open to everyone, /Lock can only be changed by editors"? Please give me input on this!
Basically, if you can swing it, I think each page should have its own *automatic* talk page that is not counted among the articles. This has all sorts of advantages. Then, eventually, we could automatically port all the [[Foo/Talk]] contents to the automatic-talk-page contents (if the latter didn't already have some contents). (We wouldn't want to do that by hand.)
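The one-time port Larry suggests could be sketched like this, with plain dictionaries standing in for the page store (the function name and data layout are hypothetical):

```python
# Hypothetical sketch of porting [[Foo/Talk]] subpages into dedicated
# automatic talk pages, skipping any talk page that already has contents.
def port_talk_subpages(articles, talk_pages):
    """articles: {title: text}, including 'Foo/Talk' subpages.
    talk_pages: {title: text} for the automatic talk pages."""
    for title in list(articles):
        if title.endswith("/Talk"):
            base = title[: -len("/Talk")]
            if not talk_pages.get(base):  # don't clobber existing talk
                talk_pages[base] = articles.pop(title)

articles = {"Foo": "article", "Foo/Talk": "old discussion", "Bar": "text"}
talk = {}
port_talk_subpages(articles, talk)
print(talk)      # {'Foo': 'old discussion'}
print(articles)  # {'Foo': 'article', 'Bar': 'text'}
```

This also shows why the talk pages shouldn't be counted among the articles: after the port, the subpage disappears from the article store entirely.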
Finally, I'm not sure I can do the long-term maintenance of the script myself. I'll work on the rights management and bug-fixing until it runs without bugs, of course, and can fix minor things when they pop up. But in a few weeks/months, I'll be quite busy at university, and can't promise anything there.
I think that's fantastic, and no problem.
Larry
Magnus Manske wrote:
I downloaded one of these tarballs, but didn't look at it in detail, so I'm uncertain how complicated that would be. But if we're to move to MySQL anyway, either with my solution or a new CGI script, conversion is unavoidable.
Agreed. I'm not sure why we want to move to MySQL, at least I'm not sure why *in particular*, although I understand the general desirability of a "real" database.
Which is another thing I need to know. Should I implement a "rights management"? If yes, how? Probably group-based, with an "editor" group giving out special rights? And limits that can be changed for each page, or should that go by page name, like "/Talk is open to everyone, /Lock can only be changed by editors"? Please give me input on this!
I should think that by default all pages should be open to everyone. If and when we need locking capability, we can see what we need it for, and add it in a limited way for that.
However, I personally will fight hard against the idea of locking *any* pages, with the *possible* exception of some "meta" pages which talk about how the software works, etc. Even there, I don't see a good reason to lock the pages.
BTW, who should I contact for testing my script and the DB on a "real" server? Toan? And should I make the script public now, so everyone can take a look? As a wiki article, or post it here on the list?
Contact jasonr@aristotle.bomis.com. I'll have him help you.
Yes, let's have a look at the script, it sounds fun!
Finally, I'm not sure I can do the long-term maintenance of the script myself. I'll work on the rights management and bug-fixing until it runs without bugs, of course, and can fix minor things when they pop up. But in a few weeks/months, I'll be quite busy at university, and can't promise anything there.
*Nod* I understand. Since you wrote this initial version, and if we go with it as a replacement for UseMod, I'm sure you'll be the main guy to oversee it for the first few weeks. If it is easy enough for other people to grasp what is going on, it should be no problem for someone to step up to the plate to cover the role of maintainer.
On Fri, 24 Aug 2001, Magnus Manske wrote:
It has all essential wiki features like article editing, version management, user management, subpages, etc.
Sounds fantastic. (Just don't support sub-sub-n-pages, please.)
Additionally, its data storage is entirely MySQL-based (fast!), it has a file upload tool, and more goodies are coming soon.
As far as I'm concerned, the question is: how soon can we test this out?
We don't know when the New York Times article is going to appear, but my understanding is that it could be as early as next Tuesday. Probably not that early, but it's possible. We might just get a one paragraph mention and only a little more traffic--but we might be TOTALLY INUNDATED with traffic. We've got to be ready. I'm prepared to work this weekend helping in whatever way I can, if others are up to it. (I have to get vacation days built up for my honeymoon anyway. :-) )
Maybe best of all, the script and database are prepared to support some kind of editor/superuser functionality for "locking" pages, as is currently being discussed.
I think that having that functionality isn't a bad idea, but I agree with Jimbo that we should not lock down pages. If anyone wants to debate this precise point again, I'm game. If you don't understand why we're so adamantly opposed to it, I think I can make it clearer than I've made it so far.
No, the best thing of all would be something that can instantly and *without loss of data or format* transfer Wikipedia pages from UseModWiki into Wikipedia code. We could call this Encyclode (encyclopedia + code). On the other hand, that might be lame. That's just my idea. :-)
Now to the bad sides (yes, there are some...)
- I don't have a server to host it yet. Maybe I can run it on the Nupedia server sometimes. So, no trying yet, sorry.
Jimbo mentioned the possibility of a CVS system--wouldn't take long to set up. Unfortunately, it's Friday afternoon...
- The parser (to convert the source text into readable stuff) is very basic. I copied the HomePage and the SandBox from wikipedia, and they look about the same, but this is where the bugs will be.
Well, you mentioned something about being able to use the UseModWiki parser. Could you? Would that be a way to preserve all the old UseModWiki formatting?
Just letting you know there's an early but working alternative ready...
Thanks, Magnus. I'm eager to see it.
Larry
lsanger@ross.bomis.com wrote:
Subject: Re: [Wikipedia-l] PHP Wikipedia
We don't know when the New York Times article is going to appear, but my understanding is that it could be as early as next Tuesday.
Then Jimbo wrote:
There's NO REASON to think that we need to hurry this because of the New York Times article.
I don't claim title to any special status as a Wikipedian, but I do closely follow Wikipedia-L and RecentChanges. This is the first I've heard of the NYT doing a piece on Wikipedia. Would someone in the know mind filling us proletarians in, please? :-)
<>< Tim
I thought I said something about this...maybe not.
I was interviewed by a NYT reporter on Tuesday or Wednesday. Jimbo told me he was going to be (perhaps already had been) interviewed as well. The subject of the article is collaborative websites. The reporter seemed to take an interest in wikis generally. I can't tell you how prominently Wikipedia might feature in the article--it might just be one or two sentences, but it seems odd that the reporter would call me first and interview me for something like 45 minutes if it were just going to be one or two sentences--and that he'd want to interview Jimbo as well. My totally unconfirmed suspicion is that Wikipedia might be *very* prominently placed in a long feature article.
If so, Wikipedia is going to be flooded with traffic--at least as much as with the Slashdotting. Very possibly more, because it's quite possible that other news sources will pick the story up from the NYT, as often happens. This happened when Newsbytes and Computerworld picked up the Nupedia story last year.
It's totally within the realm of possibility that we will have an order of magnitude greater amount of traffic than we had during the Slashdotting. Now, the server can probably handle even that (right, Jimbo?) but I have my worries about the software.
Larry
lsanger@ross.bomis.com wrote:
I was interviewed by a NYT reporter on Tuesday or Wednesday. Jimbo told me he was going to be (perhaps already had been) interviewed as well.
Yes. The reporter clarified that he is a freelancer, but the piece will appear in the NYT. He suggested to me that Wikipedia might well be the focus of the article.
If so, Wikipedia is going to be flooded with traffic--at least as much as with the Slashdotting.
I don't necessarily agree with this. The NYT may be powerful, but it isn't exactly slashdot. :-)
It's totally within the realm of possibility that we will have an order of magnitude greater amount of traffic than we had during the Slashdotting. Now, the server can probably handle even that (right, Jimbo?) but I have my worries about the software.
Yes, the server can handle that, but we'll probably have edit lock problems. We have a cron job to clear those, though, and I plan to make the instructions for people to clear it themselves clearer before the article appears.
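The cron job Jimbo mentions would do something along these lines. UseModWiki-style wikis take a lock directory while writing a page, and a crashed request can leave it behind; the cleanup removes locks older than some threshold. The paths, names, and timeout below are all hypothetical:

```python
# Hypothetical sketch of clearing stale edit locks, as a cron job might.
import os, tempfile, time

LOCK_TIMEOUT = 600  # seconds; a healthy lock should never live this long

def clear_stale_locks(lock_dir, now=None):
    """Remove lock subdirectories older than LOCK_TIMEOUT; return their names."""
    now = now or time.time()
    cleared = []
    for name in os.listdir(lock_dir):
        path = os.path.join(lock_dir, name)
        if os.path.isdir(path) and now - os.path.getmtime(path) > LOCK_TIMEOUT:
            os.rmdir(path)
            cleared.append(name)
    return cleared

# Demonstration: one fresh lock survives, one backdated lock is cleared.
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, "main_lock"))
stale = os.path.join(base, "old_lock")
os.mkdir(stale)
os.utime(stale, (time.time() - 3600, time.time() - 3600))  # pretend 1h old
print(clear_stale_locks(base))  # ['old_lock']
```

Letting ordinary users trigger the same cleanup is what the "clear it themselves" instructions would amount to.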
Also, we have a fastcgi version of the software ready to run if that becomes necessary. fastcgi is many many many times faster than cgi.
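The speed difference Jimbo describes can be shown in miniature: plain CGI starts a fresh interpreter process for every request, while FastCGI keeps one process alive and handles requests in a loop. This toy comparison uses a trivial stand-in handler; the absolute numbers are illustrative only:

```python
# Toy illustration of CGI (process per request) vs FastCGI (persistent
# process) overhead. The "handler" is a stand-in, not real wiki code.
import subprocess, sys, time

def handle(request):
    return "Hello, " + request

# "CGI": spawn a brand-new interpreter for each of 5 requests.
start = time.perf_counter()
for _ in range(5):
    subprocess.run([sys.executable, "-c", "print('Hello, req')"],
                   capture_output=True, check=True)
cgi_time = time.perf_counter() - start

# "FastCGI": one persistent process serves all 5 requests in-process.
start = time.perf_counter()
for _ in range(5):
    handle("req")
fcgi_time = time.perf_counter() - start

print(f"per-request spawn: {cgi_time:.3f}s, persistent loop: {fcgi_time:.6f}s")
```

For a Perl CGI script the per-request cost also includes recompiling the whole script, which is why the conversion pays off so dramatically.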
--Jimbo
Jimbo wrote:
Also, we have a fastcgi version of the software ready to run if that becomes necessary. fastcgi is many many many times faster than cgi.
So what's the downside of doing it BEFORE it "becomes necessary"?
<>< Tim
Not much. It's just a matter of doing it. I'll try to get to it this weekend.
On Fri, 24 Aug 2001, Jimmy Wales wrote:
If so, Wikipedia is going to be flooded with traffic--at least as much as with the Slashdotting.
I don't necessarily agree with this. The NYT may be powerful, but it isn't exactly slashdot. :-)
Yeah, but what if the article gets Slashdotted? :-) --Larry
lsanger@ross.bomis.com wrote:
No, the best thing of all would be something that can instantly and *without loss of data or format* transfer Wikipedia pages from UseModWiki into Wikipedia code. We could call this Encyclode (encyclopedia + code). On the other hand, that might be lame. That's just my idea. :-)
My lame name suggestion is Pediawiki. "Which wiki software does wikipedia run? Wikipedia runs Pediawiki! It's the wiki software designed for encyclopedias!"
Jimbo mentioned the possibility of a CVS system--wouldn't take long to set up. Unfortunately, it's Friday afternoon...
Yes, probably Monday morning we can set this up.
There's NO REASON to think that we need to hurry this because of the New York Times article. Our current site will easily survive the load of that, especially after I finish speeding up the search engine (later this afternoon, cross your fingers).
On Fri, 24 Aug 2001, Jimmy Wales wrote:
lsanger@ross.bomis.com wrote:
No, the best thing of all would be something that can instantly and *without loss of data or format* transfer Wikipedia pages from UseModWiki into Wikipedia code. We could call this Encyclode (encyclopedia + code). On the other hand, that might be lame. That's just my idea. :-)
My lame name suggestion is Pediawiki. "Which wiki software does wikipedia run? Wikipedia runs Pediawiki! It's the wiki software designed for encyclopedias!"
Huh, I have to admit that my suggestion is lamer than yours.
Jimbo mentioned the possibility of a CVS system--wouldn't take long to set up. Unfortunately, it's Friday afternoon...
Yes, probably Monday morning we can set this up.
Magnus seemed not to like the idea, though. I'd like to know why, Magnus...
There's NO REASON to think that we need to hurry this because of the New York Times article. Our current site will easily survive the load of that, especially after I finish speeding up the search engine (later this afternoon, cross your fingers).
Well, here's what I'm worried about:
(1) No huge deal, but...it's possible that the RecentChanges page is going to be, well, even huger than during the recent Slashdotting. That would suck, because it would be incredibly unwieldy and messy. But would that be a problem, other than that no one could possibly look at even a significant fraction of all that? (Which I myself argued isn't that huge of a problem, in a recent column...)
(2) Edit lock problems might increase if traffic increases radically. What can we do about this? It's still a reasonably serious problem right now--I have removed a half-dozen edit locks today.
(3) The search engine--Jimbo has got that handled, thanks.
(4) Isn't it possible that, with a lot of traffic, the whole website will just become unusably slow because it's based on text files? I am asking out of pure ignorance and blind fear, and need to be educated on this point. If we were using a database-driven website (and adequate server bandwidth, which won't be a problem), I'm assuming this *wouldn't* be a problem.
Larry
lsanger@ross.bomis.com wrote:
(4) Isn't it possible that, with a lot of traffic, the whole website will just become unusably slow because it's based on text files? I am asking out of pure ignorance and blind fear, and need to be educated on this point. If we were using a database-driven website (and adequate server bandwidth, which won't be a problem), I'm assuming this *wouldn't* be a problem.
Bomis handles roughly 500 times the traffic of wikipedia with text files. The text files are not a bottleneck. They get cached by the filesystem.
The fact that this is a cgi script, rather than a fastcgi script, is a bottleneck, but conversion from cgi to fastcgi is (nearly) trivial, and I've got that ready to go.
Also, the server running wikipedia has 512 meg of memory now, but we can quickly upgrade that to 2 gig if necessary. (And memory is frighteningly cheap these days.) Additional memory will be automatically used by the filesystem to cache frequently accessed files, etc., etc.
It's all good. My only concern is the edit locks.
On Fri, 24 Aug 2001, Jimmy Wales wrote:
Wow, I'm glad to hear that the reporter might be featuring Wikipedia. Great!
The fact that this is a cgi script, rather than a fastcgi script, is a bottleneck, but conversion from cgi to fastcgi is (nearly) trivial, and I've got that ready to go.
Very cool!
Also, the server running wikipedia has 512 meg of memory now, but we can quickly upgrade that to 2 gig if necessary. (And memory is frighteningly cheap these days.) Additional memory will be automatically used by the filesystem to cache frequently accessed files, etc., etc.
It's all good. My only concern is the edit locks.
OK. Well, is there any way to take care of that problem by rewriting the software? What the hell IS an edit lock, anyway?
Showing my technical ignorance again, Larry