Jason is activating php.wikipedia.com for the script to test, which should
be working later today. So, soon you can flood me with bug reports ;)
Some points that were mentioned on the list while I was asleep:
- Larry, I don't oppose CVS as such, I just thought why bother...
So, I wouldn't mind a CVS at all.
- Edit locks : I thought they'd protect a page that is edited for a certain
time, e.g., 5 minutes, so there won't be two edits of the same text at the
same time. Now that I know it's only for writing, I'm glad not to have
wasted time implementing such a thing in my script ;)
The MySQL server will take care of the write-at-the-same-time problem, for
sure.
- /Talk pages : Changing the standard text for new documents so they'd have
a /Talk page should do it, right? I could also have the parser look for
"/Talk" and append it if necessary in a "top-level" article.
- Conversion to SQL format : The easiest way I can think of is a script that
goes through all articles in the current wikipedia and generates a complete
article text in chronological order (oldest first). After each "version" is
generated, a variant of my script can store it in the DB. That would ensure
identical data. Anyone to write a "generation" script?
- Lame names : How about "Aide-Pikiw" (wikipedia spelled backwards)? That
must be the lamest, for sure? ;)
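The "generation" idea above could be sketched roughly as follows: walk each article's versions oldest-first and hand every complete text to the importer, so the new database reproduces the old wiki's history exactly. All names here are hypothetical (the real target was MySQL and Magnus's script was PHP; sqlite3 just keeps this Python sketch self-contained):

```python
import sqlite3

def import_article_history(conn, title, versions):
    """Store every version of an article, oldest first, so the new
    database ends up with a history identical to the old wiki's.
    `versions` is a list of (timestamp, text) pairs in chronological
    order, as produced by the hypothetical "generation" script."""
    cur = conn.cursor()
    for ts, text in versions:
        cur.execute(
            "INSERT INTO article (title, timestamp, text) VALUES (?, ?, ?)",
            (title, ts, text),
        )
    conn.commit()

# Minimal demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, "
    "timestamp TEXT, text TEXT)"
)
import_article_history(conn, "SandBox", [
    ("2001-08-01", "first version"),
    ("2001-08-02", "second version"),
])
rows = conn.execute(
    "SELECT text FROM article WHERE title = ? ORDER BY timestamp",
    ("SandBox",),
).fetchall()
print([r[0] for r in rows])
```

Because each version is inserted in order, a later "diff" or "history" view can be reconstructed from the rows alone.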
Magnus
Hi all,
as a few of you might know, I just wrote a complete (well, almost) Wikipedia
software as a PHP script!
It has all essential wiki features like article editing, version management,
user management, subpages, etc.
Additionally, its data storage is completely MySQL (fast!), it has a file
upload tool, and some other goodies coming soon.
Maybe best of all, script and database are prepared to support some kind of
editor/superuser functionality for "locking" pages, as is currently being
discussed.
Now to the bad sides (yes, there are some...)
- I don't have a server to host it yet. Maybe I can run it on the Nupedia
server sometimes. So, no trying yet, sorry.
- The parser (to convert the source text into readable stuff) is very basic.
I copied the HomePage and the SandBox from wikipedia, and they look about
the same, but this is where the bugs will be.
- Currently, I don't have a means to convert wikipedia to MySQL
automatically, which is what would have to be done if (IF!) this script ever
gets used.
Just letting you know there's an early but working alternative ready...
Magnus
I can't speak to the general availability of PHP skills on the net, except
to say I've seen statistics showing that PHP is the single most common mod
for Apache, with an installed base of 850,000 servers -- about three times
mod_perl's 250,000. Those statistics come from www.securityspace.com. The
exact page is:
http://www.securityspace.com/s_survey/data/man.200107/apachemods.html
I have some basic skills with PHP and MySQL, and some experience with
PostgreSQL. I'm no uberhacker, but I can throw together PHP code a lot
faster than Perl CGI, so I think PHP is a reasonable way to go.
I'll look into it, but I seem to remember seeing another PHP-based WikiWiki
project out there, so we may have an existing code base to work with.
(Though it looks like Magnus may have done this for us as well...)
I think MySQL will be fine for a project like this, no matter whether we
choose PHP or mod_perl, but we should take a look at what we are likely to
need in the future before making a commitment to a particular database
backend.
-----Original Message-----
From: Jimmy Wales [mailto:jwales@bomis.com]
Sent: Friday, August 24, 2001 5:30 PM
To: wikipedia-l(a)nupedia.com
Subject: Re: [Wikipedia-l] PHP Wikipedia
How difficult do you think it will be to import all the existing wikipedia
data into your version?
What's your take on the general availability of PHP skills on the net,
versus Perl skills?
I've been talking to Clifford Adams about the future of Wikipedia and
UseModWiki, and he's given his blessing to a Wikipedia-centric fork of his
code. Some of the design goals for UseModWiki, namely that it be easy to
install, etc., aren't necessarily consistent with some of the more highly
specialized needs of Wikipedia. He suggests that we find someone
(Magnus? :-)) to take over the forked version, and that we could set up a
modern CVS, etc.
I'm reluctant to go the PHP route, partially because I don't personally
know much about PHP, but if you can convince me that it's sufficiently
superior to a mod_perl or perl fastcgi solution, I'm thinking that we could
go your route. Your code could be wikipedia-centric, and we could all learn
to help you with it.
--Jimbo
Magnus Manske wrote:
> Hi all,
>
> as a few of you might know, I just wrote a complete (well, almost)
Wikipedia
> software as a PHP script!
>
> It has all essential wiki features like article editing, version
management,
> user management, subpages, etc.
> Additionally, its data storage is completely MySQL (fast!), it has a file
> upload tool, some other goodies soon.
> Maybe best of all, script and database are prepared to support some kind
of
> editor/superuser functionality for "locking" pages, as it is currently
> discussed.
>
> Now to the bad sides (yes, there are some...)
> - I don't have a server to host it yet. Maybe I can run it on the Nupedia
> server sometimes. So, no trying yet, sorry.
> - The parser (to convert the source text into readable stuff) is very
> basic. I copied the HomePage and the SandBox from wikipedia, and they look
> about the same, but this is where the bugs will be.
> - Currently, I don't have a means to convert wikipedia to MySQL
> automatically, which is what would have to be done if (IF!) this script
> ever gets used.
>
> Just letting you know there's an early but working alternative ready...
>
> Magnus
>
> [Wikipedia-l]
> To manage your subscription to this list, please go here:
> http://www.nupedia.com/mailman/listinfo/wikipedia-l
--
*************************************************
* http://www.nupedia.com/ *
* The Ever Expanding Free Encyclopedia *
*************************************************
hey,
I recently read that Google has implemented a system that learns,
to some approximate degree, how often a site is changed and then starts to
reindex that site according to what it has learned. I don't know much
more about it than that, but if this rumor is true, then the problem with
them only indexing wikipedia once a month should go away in time. Someone
could write and ask them about this.
me
> Message: 7
> Date: Fri, 24 Aug 2001 15:35:31 +0200 (MET DST)
> From: Axel Boldt <axel(a)uni-paderborn.de>
> To: wikipedia-l(a)nupedia.com
> Subject: [Wikipedia-l] Searching with Google
> Reply-To: wikipedia-l(a)nupedia.com
>
> Hi,
>
> I think Tim's Google trick is really cool and until we have a better
> search engine ourselves, we should definitely put a search box like
> his on our home page.
>
> The only downside is that Google indexes our site only once a month.
>
> On my sites, I have in the past used http://htdig.org. The search
> results are just as fast and precise as google's, but you control when
> and what you index. A cron script once a night should be fine.
>
> Axel
What I'm seeing with wikipedia is that Google does index the site once
a month, but that the new parts don't show up in a search until a
month after that.
kq
Hi,
I think Tim's Google trick is really cool and until we have a better
search engine ourselves, we should definitely put a search box like
his on our home page.
The only downside is that Google indexes our site only once a month.
On my sites, I have in the past used http://htdig.org. The search
results are just as fast and precise as google's, but you control when
and what you index. A cron script once a night should be fine.
Axel
On Saturday 24 August 2002 12:01 pm, Ed wrote:
> I'm sure we're all sorry (er, um, glad!) to hear about Mav's new job.
Thanks for the sentiment but I'm still at the same job :-) -- except now I
have 4 days a week of night class in addition to my full time job.
> I've been greeting new users occasionally,
> and as of today I've started
> copying Mav's meet-and-greet template.
Great! Although, if you can, try to spice the greeting up a bit -- I'm
overly logical and it shows in my greeting.
>I can't promise 3 hours a day of
>anti-vandalism patrol, but I do
>browse Recent Changes frequently.
Yeah, 3 hours is a bit too long -- it would be nice to have a feature whereby
an asterisk (or whatever) appears next to any edit in Recent Changes that has
not been viewed by a logged-in user (or better yet a sysop). This would
majorly cut down on duplication of vandalism and copyright infringement
patrol effort by militia members.
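The asterisk idea boils down to a "patrolled" flag per recent-changes entry that a logged-in user can set, with unpatrolled edits rendered with a marker. A minimal sketch of that bookkeeping, with all table and function names invented (the real software was Perl/PHP; this Python/sqlite3 version only illustrates the idea):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_changes (id INTEGER PRIMARY KEY, "
    "page TEXT, editor TEXT, patrolled INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO recent_changes (page, editor) VALUES (?, ?)",
    [("HomePage", "1.2.3.4"), ("SandBox", "Mav")],
)

def mark_patrolled(change_id):
    # A logged-in user (or sysop) has looked at this edit.
    conn.execute(
        "UPDATE recent_changes SET patrolled = 1 WHERE id = ?",
        (change_id,),
    )

def render_recent_changes():
    # Edits nobody has reviewed yet get a leading asterisk, so
    # patrollers don't duplicate each other's work.
    rows = conn.execute(
        "SELECT page, editor, patrolled FROM recent_changes ORDER BY id"
    ).fetchall()
    return ["%s%s (%s)" % ("" if patrolled else "* ", page, editor)
            for page, editor, patrolled in rows]

mark_patrolled(1)
print(render_recent_changes())
```

Once the first reviewer marks an edit, the asterisk disappears for everyone, which is exactly the de-duplication the feature is after.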
Also, for me at least, it would be a /major/ speed improvement to have the
option of having Recent Changes in table format again -- it takes me
/forever/ to scan user names and IPs the way it is now.
> I enjoy encouraging newbies with praise and guidance, and I don't mind
> cleaning up after the uninitiated. But if they refuse to listen, I won't
> hesitate to make 'em stand in the corner (see [[child time-out]]).
>
> Ed Poor
In spite of our differences of opinion in social and political matters, I
think you are a great people person and well suited to help with the job of
greeting and guiding.
-- Daniel Mayer (aka mav)
[I'm a bit late in the thread ...]
Stephen Gilbert <sgilbert(a)nbnet.nb.ca> writes:
> Until then, is there any reason why we couldn't use FTP to upload images?
> An anonymous FTP account allowing us to upload to, say,
> www.wikipedia.com/images would do the trick, and we could simply link the
> pictures from there.
Opening Wikipedia up to unchecked binary upload would also let in a lot
of things that we'd rather not have. Remember that when ftp was
most popular, /incoming directories were invariably (ab)used
for the illegal exchange of copyrighted material. This has mostly been
mitigated by removing the public's read access to the upload directory,
and probably also by the proliferation of gratis web hosters,
where warez can still easily be stored for a few days as well. So I
don't know if this would become a problem.
In true WikiWiki spirit, people should be able to delete/modify
uploaded material as well. This could keep us abreast of the warez
problem, since our editorial staff (aka everybody) could wipe out any
occurrence of these. On the other hand, to prevent mischief, we would
probably want to keep past revisions of binaries as well, so that
people could undo changes, or merge them. Unfortunately, here the
warez pop up again.
Binaries, even when they replace a thousand words, usually take more
space than a thousand words. Therefore, resources may also pose a
problem: disk space, access speed, network bandwidth.
While I recognize the fact that images (maybe sounds, much less
animations) genuinely complement some entries (hey, one of the two
pictures in /images/ came from me!), I argue for a very sparing use of
these. The encyclopedias I knew and loved all had only a few images. I
know that images are becoming more common, and the "multimedia
encyclopedia" is the current craze. Call me old-fashioned in that
respect, but I think the advantages of these are by far over-hyped.
As an afterthought, I would want Wikipedia to stay as useful as
possible to people who can't view pretty pictures (because their
hardware won't show them properly, or because they lack eyesight). Add
pictures if you want, but don't replace explanatory text with them.
To wrap up, these concerns make me believe that the current state of
affairs, with images trickling in slowly (actually, the installation of
my image was not slow!), is fine. Optimisations of the process, like
setting up a write-only ftp area for people to use instead of mail,
are of course good.
Implementing unchecked binary upload, and keeping it until it becomes
a nuisance, is also an option. You will surely survive any potential
"I told you so" calls from me. The main loss will be the implementor's
time, then.
--
Robbe
I wrote:
>If there's a long discussion between two people that is slightly
>off-topic, I would hope that someone steps in and asks them politely
>(and perhaps firmly, if necessary) to continue it by private e-mail.
etc.
Actually, even if the discussion gets up to a regular 20 posts a day
(the Great Technocracy Discussion at its peak), it might not even be
much of a problem considering the number of edits being made now each
day, especially if we can implement some way of notifying people of
certain changes they ''want'' to know about. Still, though, I'd hate
to see something out of hand like that on wikipedia, even on a user's
page....
>3) If there's a long discussion between two people, and it's slightly off
>topic, feel free to move it to a new page, and replace it with a link to
>the new page (a sub page of one of the involved parties is a good idea.)
If there's a long discussion between two people that is slightly
off-topic, I would hope that someone steps in and asks them politely
(and perhaps firmly, if necessary) to continue it by private e-mail.
I don't expect this to happen; and I bring it up only because of my
experience on various listservs, one of which involved a diehard
capitalist and a diehard technocratist: the two of them would bat
points back and forth endlessly, rebutting but not considering, in the
middle of a listserv that was ostensibly dedicated to sustainable
living. After several months, about a third of the people had
filtered their messages to delete anything with the word "technocracy"
in the subject, and at least another third had left the list. The two
both balked at the suggestion that they continue the mess privately,
seeming to think it a violation of their 1st amendment rights not to
be able to hog donated bandwidth with chatter that only 2 people out of 300
wanted to hear. I hope that the issue never comes up on wikipedia. :-)
KQ