I just got word back from the editor of the encyclopedia site. He told me that
in a few days they'd be revising the site with more abilities. One of them will
be a complete download of the text, so no need for wget just yet.
Another problem besides typos is that a lot of the information is not correct, and the remainder takes a long time to verify. I had this problem with the 2 or 3 paragraphs I added on Algeria--it took several hours to verify what I did add. So unless you're loaded with time and patience, or are an expert in the field, the articles probably should not be added at all, IMO.
kq
You Wrote:
>> > On Thu, 14 Mar 2002 16:20:41 kband(a)www.llamacom.com wrote:
>> > > >
>> > > > Someone published the 1911 britannica online:
>> > > >
>> > > > http://1911encyclopedia.org/
>> > > >
etc.
> > I understand there may be technical difficulties in reducing the minimum,
> > but we should consider it a flaw if we have a four-letter minimum,
> > or even a three-letter or two-letter minimum.
> >
I'd also consider the error message given when searching for short words
a Bad Idea.
[!! SYNTAX ERROR: the word 'the' is too short, the index requires at
least 4 characters]
AND "cunctuator"
I think it would be better if words shorter than four letters (or stop
words) were ignored, and a more polite error given if it can't search.
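A minimal sketch of what I mean, in Python rather than the site's PHP. The
four-character minimum comes from the error message above; the stop word
list, the function names, and the message wording are made up for
illustration and are not what the search code actually does.

```python
MIN_LENGTH = 4  # index minimum, taken from the error message quoted above
STOP_WORDS = {"the", "and", "with", "from"}  # illustrative list only


def split_query(query, min_length=MIN_LENGTH, stop_words=STOP_WORDS):
    """Split a query into (searchable, ignored) lists of lowercased words."""
    searchable, ignored = [], []
    for word in query.split():
        cleaned = word.strip("\"'").lower()
        if not cleaned:
            continue
        if len(cleaned) < min_length or cleaned in stop_words:
            ignored.append(cleaned)   # skip quietly instead of failing
        else:
            searchable.append(cleaned)
    return searchable, ignored


def polite_message(query):
    """Describe what will be searched, without a SYNTAX ERROR banner."""
    searchable, ignored = split_query(query)
    if not searchable:
        return ("Sorry, these words are too short or too common to search: "
                + ", ".join(ignored))
    message = "Searching for: " + ", ".join(searchable)
    if ignored:
        message += " (ignored: " + ", ".join(ignored) + ")"
    return message
```

With this, a query like the one above would still search for "cunctuator"
and only mention that "the" and "and" were skipped.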
Dragon Dave.
At the very least, there should be a convenient way to search with Google.
The problem with Google is that it does not update very often. What's
currently in the cache is from January 26.
I noticed this problem as well, when I tried to search for "Ian". (-:
Ian Monroe
http://mlug.missouri.edu/~eean/
On Sat, 9 Mar 2002, kband(a)www.llamacom.com wrote:
> As someone has previously pointed out, short word searches (such as
> Ur and Oz) are important. There shouldn't be a letter minimum, if
> at all possible.
> We should probably use google as a good guide for usability standards.
> Google lets you search for single letters, for example.
>
> I understand there may be technical difficulties in reducing the minimum,
> but we should consider it a flaw if we have a four-letter minimum,
> or even a three-letter or two-letter minimum.
>
> I've cc'ed wikipedia-l because this is a policy discussion as much as a
> technical discussion.
>
> --tc
> [Wikipedia-l]
> To manage your subscription to this list, please go here:
> http://www.nupedia.com/mailman/listinfo/wikipedia-l
>
As someone has previously pointed out, short word searches (such as
Ur and Oz) are important. There shouldn't be a letter minimum, if
at all possible.
We should probably use google as a good guide for usability standards.
Google lets you search for single letters, for example.
I understand there may be technical difficulties in reducing the minimum,
but we should consider it a flaw if we have a four-letter minimum,
or even a three-letter or two-letter minimum.
I've cc'ed wikipedia-l because this is a policy discussion as much as a
technical discussion.
--tc
Um, when you get down to it, the PHP does produce HTML. Couldn't you just
have the mirrors display static HTML pages?
It would be a new feature in the PHP code - it would have to produce pages
with only relative links, except for the edit features, which would link
directly to wikipedia.com. The mirrors would also need some way of being
told what to download and when.
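To make the link idea concrete, here is a rough sketch in Python (not the
PHP it would really have to be written in). The URL layout is an
assumption: I am pretending articles live under /wiki/Title and that edit
links carry action=edit in the query string. The names MAIN_SITE,
ARTICLE_PREFIX, mirror_href and rewrite_links are invented for the example.

```python
import re
from urllib.parse import urlparse, parse_qs

MAIN_SITE = "http://www.wikipedia.com"   # where edits must still happen
MAIN_HOSTS = ("", "www.wikipedia.com")   # "" means a site-relative link
ARTICLE_PREFIX = "/wiki/"                # assumed article URL layout

HREF = re.compile(r'href="([^"]*)"')


def mirror_href(href):
    """Return the link target a static mirror page should use."""
    parsed = urlparse(href)
    if parse_qs(parsed.query).get("action") == ["edit"]:
        # Edit links always point at the main site, never at the mirror.
        return href if parsed.netloc else MAIN_SITE + href
    if parsed.netloc in MAIN_HOSTS and parsed.path.startswith(ARTICLE_PREFIX):
        # Article links become relative links to sibling static files.
        return parsed.path[len(ARTICLE_PREFIX):] + ".html"
    return href  # external links are left alone


def rewrite_links(html):
    """Rewrite every href="..." attribute in a rendered page."""
    return HREF.sub(lambda m: 'href="%s"' % mirror_href(m.group(1)), html)
```

A real version would have to deal with titles containing slashes and with
escaped ampersands in the HTML, which this sketch ignores.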
Ian
> The existing PHP code will be useless in generating a static site
> suitable for mirroring.
>
> Well, I do have the wt2txt utility, which will convert articles into
> DocBook XML, which is easily publishable in many formats including
> PDF, PostScript, text, and good old html.
>
> Since I'm already working on very much the same problem, there's no
> need for you to duplicate work. My code is all under the GPL. So if
> any programmers want to help with this, talk to me.
>
> --
> David C. Merrill http://www.lupercalia.net
> Linux Documentation Project david(a)lupercalia.net
> Collection Editor & Coordinator http://www.linuxdoc.org
>
> I wonder if in part why so many people are angry at Microsoft is not just
> because their products frustrate them so much, but also because this
> frustration is ignored. The computer makes people feel like they are
> dummies, when in fact it is the computer that is stupid.
> --MIT Associate Professor Rosalind Picard
>
After some tweaking with character set headers, title character
limitations and mysterious high-byte problems, the Japanese wikipedia
should now be useable:
http://ja.wikipedia.com/
(At some point it and the other non-English wikipedias will be converted
to the new software, but there is still work to be done before that's
ready - see the thread "International upgrades" on wikitech-l.)
-- brion vibber (brion @ pobox.com)
Much more practical would be a mirroring system. Bomis could continue to do
the main wikipedia.com where updates would take place and the
authoritative version would reside, and mirrors could automatically
download new versions of pages. There could be more than one tier of
mirrors to lift even more bandwidth and CPU time (which appears to be a
bigger problem) off of Bomis.
The mirrors wouldn't have to use php-wikipedia; they could just have
static pages (though they could have "edit this" links, which would take
you to wikipedia.com).
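One way the "download new versions" step might work, sketched in Python.
This assumes the main site publishes a manifest mapping each page title to
a revision counter; no such manifest exists today, and the function name
pages_to_fetch is invented for the example.

```python
def pages_to_fetch(master_manifest, mirror_manifest):
    """Return the titles a mirror is missing or holds an older copy of.

    Both arguments map page title -> integer revision counter
    (a hypothetical format, not something wikipedia.com provides).
    """
    stale = []
    for title, master_revision in master_manifest.items():
        # A page the mirror has never seen counts as revision -1.
        if mirror_manifest.get(title, -1) < master_revision:
            stale.append(title)
    return sorted(stale)


# Example: the mirror is behind on Algeria and has never fetched Oz.
master = {"Algeria": 3, "Ur": 1, "Oz": 2}
mirror = {"Algeria": 2, "Ur": 1}
print(pages_to_fetch(master, mirror))
```

A second-tier mirror could run the same comparison against a first-tier
mirror's manifest instead of the main site's, so only the first tier ever
hits Bomis.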
Granted, I don't have the time to put the code where my mouth is.
Ian Monroe
http://mlug.missouri.edu/~eean/
On Sun, 3 Mar 2002, kband(a)www.llamacom.com wrote:
> > Why would we want to distribute hosting? And for that matter, what's
> > wrong with advertising, provided it's done as subtly as possible, and
> > not from various merchants of death ;-) (which I still think is for
> > Bomis to decide -- I may feel very proprietary about a lot of stuff
> > here, but I still remember I'm on somebody else's playground. Jimbo et
> > al. have been very cool about trying to get input, but business
> > decisions should belong to them)
>
> It's a matter of scaling. Hosting is currently the only bottleneck
> in the wikipedia process, other than possible problems in organization
> of material. If the popularity of Wikipedia doubles, the hosting needs
> to double--with distributed hosting, that would happen automatically.
>
> -tc
>
Why all the discussion about advertising, distributed hosting and mirroring? It all
seems a bit Alice in Wonderland.
You already have a good site easily able to handle the load, which is only moderate,
from what I can see.
You have a well-disposed site manager in the form of Jimmy Wales who has
generously declared himself happy to continue running and paying for it. The marginal
cost of running an extra host to support something the size of Wikipedia is not great,
so you don't need advertising anytime soon. I mean the whole thing will sit on $100 or
less worth of disk, for heaven's sake!
But you've just lost your EDITOR-IN-CHIEF! You need an EDITOR first and foremost.
If you can't get an editor, you need to work out how to set up a group to keep the
vision and the standards going, or the lights will start going out. Which would be a
shame.
Graham
Email: grahamc(a)patia.com
Web : http://patia.com/grahamc
>
> Message: 1
> From: kband(a)www.llamacom.com
> Subject: Re: [Wikipedia-l] Everything in the last digest
> To: wikipedia-l(a)nupedia.com
> Date: Sun, 3 Mar 2002 21:54:40 -0600 (CST)
> Reply-To: wikipedia-l(a)nupedia.com
>
> > Why would we want to distribute hosting? And for that matter, what's
> > wrong with advertising, provided it's done as subtly as possible, and
> > not from various merchants of death ;-) (which I still think is for
> > Bomis to decide -- I may feel very proprietary about a lot of stuff
> > here, but I still remember I'm on somebody else's playground. Jimbo et
> > al. have been very cool about trying to get input, but business
> > decisions should belong to them)
>
> It's a matter of scaling. Hosting is currently the only bottleneck
> in the wikipedia process, other than possible problems in organization
> of material. If the popularity of Wikipedia doubles, the hosting needs
> to double--with distributed hosting, that would happen automatically.
>
> -tc
>
> --__--__--
>
I understand that, Cunc, but wouldn't Bomis having
larger, perhaps more redundant (and mirrored) servers
work as well? I've never dealt with wiki technology,
but my last tech job was with a firm that did online
transaction management. We had three active servers
up at all times (web, SQL, and mail), and were always
running checks so that we could add more servers as
soon as it was necessary. It's very expensive, but
isn't that where the advertising comes in? I know
remote networking is viable, but is it practical? It
just doesn't seem to be a sensible solution long-term.
Can you imagine having to deal with migrations, etc.,
every time someone decides to back out? Ugh.
---JHK