The "newcodebase" directory in CVS now contains the codebase I've been
working on. The structure is complete and the basic functions of user
logins, page viewing, and editing are filled in. Filling in the rest
should be a pretty straightforward case of fleshing out functions that
are already in place with modified code from the old codebase. It looks
like it will fix a lot of problems, and it's a real miser on database
accesses.
Both codebases are running on my server (the old code is at
http://www.piclab.com/wiki/wiki.phtml, and the new code is at
http://www.piclab.com/newwiki/wiki.phtml, each with a March dump of
the Wikipedia database). I'd again like to stress that coders can use
that for testing before installation.
--
Lee Daniel Crocker <lee(a)piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
What is going on?
Warning: Supplied argument is not a valid MySQL result resource in
/home/wiki-newest/work-http/wikiPage.php on line 215
Warning: Supplied argument is not a valid MySQL result resource in
/home/wiki-newest/work-http/wikiPage.php on line 219
Also, pages aren't updating from [Link]? to working links when I create the
pages.
E.g. I made the [[tobacco industry]] link on the Cigarette entry, then made
the Tobacco industry page, but the Cigarette entry still displays [tobacco
industry]?.
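My guess is that lines 215 and 219 pass the return value of a failed
mysql_query() straight into a mysql_fetch_* or mysql_num_rows() call;
when the query fails, that value is boolean false rather than a result
resource, which is exactly what produces this warning. A hypothetical
sketch of the guard that avoids it (this is not the actual wikiPage.php
code, and the table/column names and connection details are invented):

  <?php
  // Hypothetical sketch only -- invented table/column names and
  // placeholder connection details.
  mysql_connect('localhost', 'wikiuser', 'password');
  mysql_select_db('wikidb');

  $res = mysql_query("SELECT page_text FROM pages WHERE title='Cigarette'");
  if ($res === false) {
      // The query itself failed, so $res is false, not a result resource;
      // report the error instead of handing false to mysql_fetch_row().
      print 'Query failed: ' . mysql_error();
  } elseif ($row = mysql_fetch_row($res)) {
      $text = $row[0];
  }
  ?>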
> Certainly, it's good for spiders to hit 'Recent Changes', and often.
Why? The spider doesn't know that the pages on RecentChanges have had
recent changes. It's just a list of links, like special:allpages or
special:ShortPages.
Maybe one could add
/wiki/special:RecentChanges&
to the robots.txt file; that way, the spider would fetch only one copy
of RecentChanges, not 14.
Axel
>A robots.txt could easily be set up to disallow
>/wiki/special%3ARecentChanges (and various case variations). That only
>stops _nice_ spiders, of course.
>History links would need to be changed to be sufficiently
>distinguishable, for instance using
>/wiki.phtml?title=Foo&action=history
>etc; then ban /wiki.phtml.
I think we should do that ASAP. Let's close the whole special:
namespace, &action=edit, &action=history, &diff=yes and &oldID stuff
to spiders. None of this is of any value to the spiders anyway.
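With the edit/history/diff links moved under wiki.phtml as suggested
above, the robots.txt entries might look roughly like this (the exact
paths are illustrative; robots.txt only matches by URL prefix, which is
why the query-string actions have to be funnelled through one blockable
prefix first):

  User-agent: *
  Disallow: /wiki.phtml
  Disallow: /wiki/special:
  Disallow: /wiki/Special:
  Disallow: /wiki/special%3A
  Disallow: /wiki/Special%3A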
Axel
Hi!
I know you guys are busy with speeding up wikipedia and killing bugs.
But I just wanted to remind you that the German Wikipedians are asking
more and more frequently for the new PHP script. The bugs in the UseMod
script are annoying, and the new features would make life so much easier.
If you find the time to set up a test site for us, that would be very
nice :-)
Thanks,
Kurt
On wikipedia-l, Chuck Smith wrote:
> I COMPLETELY disagree with this. Let the robots crawl
> everything. It's better that someone finds one of our
> Talk or User pages and cruises on over to our main
> site than to simply find a completely different website!
I think search robots should be stopped from indexing the editing
pages. Google has already indexed this URL
http://www.wikipedia.com/wiki/Wikipedia:Help&action=edit
and I don't see the point in that. Look here for a list,
http://www.google.com/search?q=+site:www.wikipedia.com+editing+wikipedia
This can be stopped by inserting this in the HTML <head>:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
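Presumably the tag would only be wanted on the edit/history/diff views,
not on ordinary article views. A hypothetical sketch of how the wiki
script could emit it conditionally (the $action variable and the exact
conditions are assumptions, not the script's real internals):

  <?php
  // Hypothetical sketch: emit the robots tag only for non-view requests.
  $action = isset($_GET['action']) ? $_GET['action'] : 'view';
  if ($action != 'view' || isset($_GET['diff']) || isset($_GET['oldID'])) {
      print "<META NAME=\"ROBOTS\" CONTENT=\"NOINDEX, NOFOLLOW\">\n";
  }
  ?>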
--
Lars Aronsson (lars(a)aronsson.se)
tel +46-70-7891609
http://aronsson.se/  http://elektrosmog.nu/  http://susning.nu/
I contacted Magnus and Brion earlier this week to let them know
that I had set up accounts for them on my server which they can
use to test wiki software changes. It has a fairly large database
dump installed, and plenty of space to experiment, and all the
software needed to develop and test.
Brion hasn't contacted me, but Magnus tells me the setup is
basically OK, so I'll make the same offer to other developers:
my server is available as a testbed; please take advantage of
it. It would be GREAT if some enterprising developer who didn't
want to muck with the wikipedia software itself would take on
writing some testing scripts--just some code that pounds on a
running wiki for a while and exercises its functions. We could
then run that against the running wiki on my site before
installing any software change. It's no big deal if my site
goes down now and then.
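Roughly the kind of thing meant: a loop that hammers page views against
the test installation and reports failures. This is only a sketch; the
base URL and page titles are placeholders, and a real test script would
also exercise editing, history, searching, and so on:

  <?php
  // Rough sketch of a wiki-pounding script; URL and titles are placeholders.
  $base   = 'http://www.piclab.com/newwiki/wiki.phtml?title=';
  $titles = array('Main_Page', 'Cigarette', 'Tobacco_industry');
  $errors = 0;

  for ($i = 0; $i < 500; $i++) {
      $title = $titles[array_rand($titles)];
      $page  = @file_get_contents($base . $title);
      if ($page === false || strlen($page) == 0) {
          $errors++;
          print "FAILED: $title\n";
      }
  }
  print "Done: $errors failures out of 500 requests.\n";
  ?>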
I also gave Magnus and Brion access to the new reorganized
software base I've been working on, and I may open that up to
a wider audience soon. It's doing the basic functions of
displaying pages with links and all the wiki formatting, and
handles user login and prefs. I still need to move the code
for page editing/history/etc., many special pages, and some
more internationalization over. It's going remarkably fast,
but it will still take another week or two.
The new software should be easier to understand, faster, and
more scalable, and I've also included plenty of debugging and
logging hooks that will make it easier to diagnose problems.
Just keeping everyone up to date.
--
Lee Daniel Crocker <lee(a)piclab.com> <http://www.piclab.com/lee/>
"All inventions or works of authorship original to me, herein and past,
are placed irrevocably in the public domain, and may be used or modified
for any purpose, without permission, attribution, or notification."--LDC
Right now, I'm seeing nice and fast responses, except every
once in a while everything slows to a halt. If that's due to our
script, then there must be some really bad, really rare special
function somewhere. I doubt that.
Maybe the slowdowns are due to spiders that hit our site and request
several pages at once, in parallel, like many of these multithreaded
programs do. I read somewhere that, for this very reason, everything2.com
has disallowed spiders completely and doesn't even allow Google to index
their site anymore.
Maybe we should search the server logs for several rapid requests from
the same IP, and try to correlate those to load averages?
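As a starting point, something like the following could tally requests
per client IP from the access log; correlating the bursts with load
averages would still need the timestamps, and the log path and format
here are assumptions about the server setup:

  <?php
  // Rough sketch: count requests per client IP in an Apache-style log
  // where the IP is the first whitespace-separated field.
  $counts = array();
  $log = fopen('/var/log/apache/access_log', 'r');
  while (($line = fgets($log)) !== false) {
      $ip = strtok($line, ' ');
      if (!isset($counts[$ip])) $counts[$ip] = 0;
      $counts[$ip]++;
  }
  fclose($log);

  arsort($counts);                 // busiest clients first
  $shown = 0;
  foreach ($counts as $ip => $n) {
      print "$ip\t$n\n";           // candidates for parallel-fetching spiders
      if (++$shown >= 10) break;
  }
  ?>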
Axel
Axel Boldt <axel(a)uni-paderborn.de> writes:
> Right now, I'm seeing nice and fast responses, except every
> once in a while everything slows to a halt. If that's due to our
> script, then there must be some really bad, really rare special
> function somewhere. I doubt that.
It's a punt, and I've no real evidence ... but ...
when the 'pedia comes back after some slowtime, there often seems to be a
Wikipedia:Unsuccessful searches (date); [Unsuccessful search for foobar]
entry at the top of RecentChanges.
I might be imagining it, I certainly can't reproduce it at will.
--
Gareth Owen