Bryce (just realized that I had replied just to you rather than the list): I sympathize, but posting on this list frankly isn't going to get anything done. I do only a little work directly on the server; Jason's the man who will have to do this, if this or something like it is what we'll end up doing, exactly. What Jason might not know (so, I've cc'd him) is how seriously you all take this. I certainly would like to see Wikipedia's content easily useable--it will get a lot of new links back to Wikipedia and it will indeed make the project more credible, as someone very correctly said. I have wanted this to be done for months, but, well--our programmers are very busy with projects that actually make money. :-/
Larry
You Wrote:
I think what people are trying to politely say is that you may be in violation of your license...
My suggestion to get around this with a minimum of time expended would be to set up a cron job to tarball the wiki databases once a week.
Login as www-data or nobody or root or whomever:
    $ crontab -e
    # Weekly Sunday 5am tarball
    0 5 * * 0 /usr/local/bin/tarball_wiki
And the script itself:
    #!/bin/sh
    ## tarball_wiki script
    tar cf /tmp/wiki.tar /home/www-data/wiki_db /home/www-data/cgi-bin/wiki.pl
    # gzip replaces /tmp/wiki.tar with /tmp/wiki.tar.gz
    gzip /tmp/wiki.tar
    # drop the oldest copy, demote last week's, then publish the new one
    rm -f /home/www-data/html/wiki-prev.tar.gz
    mv /home/www-data/html/wiki.tar.gz /home/www-data/html/wiki-prev.tar.gz
    mv /tmp/wiki.tar.gz /home/www-data/html/wiki.tar.gz
This will keep the current plus previous week's tarball. Obviously, you'll have to fiddle the paths to match however your system is organized. You might need to add some chmod/chown commands if you do this as a user other than the web user.
Anyway, IANAL, but I think this little script would get you off the hook regarding the transparency issue.
Bryce
On Wed, 15 Aug 2001 sanger1@nupedia.com wrote:
I agree completely as well. You all must realize, though, that the amount of (expensive) paid programming labor Bomis can devote to useful and even essential features like this is less than we would all like. It would be ideal if some programmers would step up to the plate and actually help bring some of these proposals into being. I for one would be absolutely delighted.
Larry
You Wrote:
On Tue, Aug 14, 2001 at 09:56:47AM +0200, Robert Bihlmeyer wrote:
Jan Hidders hidders@win.tue.nl writes:
Because the GFDL allows you to download everything and start your own server.
Well the GFDL also wants transparent copy to be easily available. I don't consider spidering wikipedia to be an option open to the "man from the street".
FWIW I certainly agree with that, and there should certainly be an easy way to download the complete Wikipedia. So you can also add my "pretty please" :-)
-- Jan Hidders
[Wikipedia-l] To manage your subscription to this list, please go here: http://www.nupedia.com/mailman/listinfo/wikipedia-l
On Wed, 15 Aug 2001 sanger1@nupedia.com wrote:
... but, well--our programmers are very busy with projects that actually make money. :-/
(And the rest of us programmers are not?)
Wikipedia gains nupedia a good deal of attention. You also have said that there may be articles which can go into nupedia. I further imagine you will find other ways to profit off wikipedia more directly; I know you have expended thought in this area.
The reason many people got involved (at the very least, *me*) was the willingness to hold the content under the GFDL. As I see it, what folks are asking is simply to deliver on the promises made at the outset, and frankly it's rather frightening to encounter any resistance to requests like these. (Like a car manufacturer refusing to do warranty work because their employees are busy making new sales.)
This is NOT going to take anyone more than maybe an hour to do. It isn't really even programming, per se; any unix sysadmin should be comfortable crontabbing tarballs of a website. And it IS tied up with being able to continue making money; call it a marketing or legal requirement if nothing else.
Anyway, I'm a little miffed - you said your programmers were too busy, and that you'd appreciate it if someone would supply code to do it. I responded with some code that would do the minimum needed to comply (just change the paths). I was anticipating a thank you, but instead "It does no good to post it here"? Well, call me confused. ;-)
Bryce, you are quite right to be miffed. I would be miffed if I were you. I share your confusion, too. I had no idea that it was such a small operation (remember, I'm not a programmer, so you must explain things simply and clearly to me, if you want me to understand). There is absolutely no *resistance* to the request--unless inertia counts as resistance...
Nupedia isn't making any money either. In fact, Wikipedia is much closer to the point where it might (somehow) make money for Bomis (and therefore, potentially, for anyone else who wants to use the content for profit).
Anyway, remember, Wikipedia, like Nupedia, is a volunteer project--so it's not surprising (though regrettable) that essential features like this should be rooted-for and planned by volunteers. Really, we should have done this long ago.
Larry
----- Original Message ----- From: "Bryce Harrington" bryce@neptune.net To: wikipedia-l@nupedia.com Cc: jasonr@bomis.com Sent: Wednesday, August 15, 2001 11:29 AM Subject: Re: Re: [Wikipedia-l] Wikipedia teamwork
On Wed, 15 Aug 2001 sanger1@nupedia.com wrote:
... but, well--our programmers are very busy with projects that actually make money. :-/
(And the rest of us programmers are not?)
Wikipedia gains nupedia a good deal of attention. You also have said that there may be articles which can go into nupedia. I further imagine you will find other ways to profit off wikipedia more directly; I know you have expended thought in this area.
The reason many people got involved (at the very least, *me*) was the willingness to hold the content under the GFDL. As I see it, what folks are asking is simply to deliver on the promises made at the outset, and frankly it's rather frightening to encounter any resistance to requests like these. (Like a car manufacturer refusing to do warranty work because their employees are busy making new sales.)
This is NOT going to take anyone more than maybe an hour to do. It isn't really even programming, per se; any unix sysadmin should be comfortable crontabbing tarballs of a website. And it IS tied up with being able to continue making money; call it a marketing or legal requirement if nothing else.
Anyway, I'm a little miffed - you said your programmers were too busy, and that you'd appreciate it if someone would supply code to do it. I responded with some code that would do the minimum needed to comply (just change the paths). I was anticipating a thank you, but instead "It does no good to post it here"? Well, call me confused. ;-)
On Wed, 15 Aug 2001, Bryce Harrington wrote:
On Wed, 15 Aug 2001 sanger1@nupedia.com wrote:
... but, well--our programmers are very busy with projects that actually make money. :-/
(And the rest of us programmers are not?)
The reason many people got involved (at the very least, *me*) was the willingness to hold the content under the GFDL.
Another point while I'm in rant mode. ;-)
There is a disincentive to doing coding work for Wikipedia without having a transparent copy of the site (i.e., a tarball of the whole thing) to work on, offline. Similarly, if there is a feeling that the whole thing belongs to Bomis and that they are unwilling to share the underlying code, then there will be a similar unwillingness to assist with enhancing the underlying code. Just to pull one example out of the hat - imagine if I wanted to create a German version of wikipedia by programmatically running the whole site through Babelfish.
Furthermore, while none of us have any wish to see Bomis go out of business, we must admit that these days this is not unheard of. If Bomis were to go under, and we did not have a backup of the site that the community could resurrect quickly and easily, then there is a question whether wikipedia would exist if Bomis did not. And that is probably scary to some people when you stop and think about it.
Anyway, my guess is that 99% of the desire for the site backup is for one or both of the above reasons. And I should think both of these reasons are also important to Bomis.
Bryce
This is NOT going to take anyone more than maybe an hour to do. It
Alert: this is meant somewhat tongue-in-cheek....
This is one of those "code phrases" that immediately puts any programmer / sysadmin / manager on the defensive, regardless of its truth or the current workload.
I worked in a college computing services area (typically underfunded and understaffed) for years, and the line of faculty, staff, and students who each wanted their "only a few minutes to do" project handled "right now" would look like the infinite regression of images in two facing mirrors.
This is *not* an argument for procrastination or other delay -- I have no idea what Jason's work load is -- this is just a sympathizing note for him (grin).
loh
On Wed, 15 Aug 2001, Larry Olin Horn wrote:
This is NOT going to take anyone more than maybe an hour to do. It
This is one of those "code phrases" that immediately puts any programmer / sysadmin / manager on the defensive, regardless of its truth or the current workload.
*grin* 'tis true. My bad. Sorry jason. (Tho I bet he did get it done within an hour.)
"What do you mean your code can't do <insert something that seems simple but isn't>? Just use <insert irrelevant bit of tech they've heard of but never used>. Next problem!"
"When I said we needed it ASAP before, you were able to do it in a week, so this time you ought to be able to do it in half a week, right?"
"No, we will never need to <insert feature they'll need next month>"
"If you can't do it in the given time, then let's find someone to help you." Uhhnnngg.
"I don't care if it isn't done, we're screwed if it isn't delivered NOW."
I worked in a college computing services area (typically underfunded and understaffed) for years, and the line of faculty, staff, and students who each wanted their "only a few minutes to do" project handled "right now" would look like the infinite regression of images in two facing mirrors.
This is *not* an argument for procrastination or other delay -- I have no idea what Jason's work load is -- this is just a sympathizing note for him (grin).
Yup, apologies again. I guess I was just trying to address the concern that it would take a significant amount of time. I could have chosen better phrasing, though.
Btw, browsing through the distro, I see that static versions of the pages *are* available: /wiki/lib-http/db/wiki/html/ Cool! :-)
Bryce
Bryce Harrington wrote:
Anyway, I'm a little miffed - you said your programmers were too busy, and that you'd appreciate it if someone would supply code to do it. I responded with some code that would do the minimum needed to comply (just change the paths). I was anticipating a thank you, but instead "It does no good to post it here"? Well, call me confused. ;-)
I understand and appreciate your confusion. I was on vacation last week, and Larry doesn't really have the means to make things happen that I do. He also doesn't know much about technical matters, so he probably didn't realize that you had already supplied code, etc., etc.
I'm back now, and we'll have this resolved pronto.
There is now a tarball of each of the wiki sites that are running as "Wikipedia". Go to the homepage http://www.wikipedia.com to find the link, or go to the downloads directly at http://www.wikipedia.com/tarballs. These are just tarballs of the entire directory that the sites are served from.
Jason
Thanks very much, Jason!
Larry
----- Original Message ----- From: "Jason Richey" jasonr@bomis.com To: wikipedia-l@nupedia.com Sent: Wednesday, August 15, 2001 2:44 PM Subject: Re: Re: [Wikipedia-l] Wikipedia teamwork
There is now a tarball of each of the wiki sites that are running as "Wikipedia". Go to the homepage http://www.wikipedia.com to find the link, or go to the downloads directly at http://www.wikipedia.com/tarballs. These are just tarballs of the entire directory that the sites are served from.
Jason
Cool! (/me gets downloading)
On Wed, 15 Aug 2001, Tim Chambers wrote:
Cool! (/me gets downloading)
I thought I'd do it now, too, but 55MB is a big tarball to download. Gulp.
But thanks for the blazingly fast response, Jason!
If you want *just* the html files and can deal with bzip2, I've created a tarball of just that, here: http://www.osdlab.org/~bryce/wikipedia_html.tar.bz2 (4.3M)
Bryce
If you want *just* the html files and can deal with bzip2, I've created a tarball of just that, here: http://www.osdlab.org/~bryce/wikipedia_html.tar.bz2 (4.3M)
Thanks, Bryce!
BTW -- other list readers, the bzip2 home page is http://sources.redhat.com/bzip2
I downloaded the executable to my PC and it worked fine.
<>< [[tbc]]
I've been rather overwhelmed with non-wiki work recently, so I'm way behind on my mail. I noticed the tarball announcement, and I think you'll want to modify the tarballs slightly to remove the user subdirectory.
The reason to remove the user subdirectory is that it contains the user cookie data, including the "randkey" value which validates the user cookies. With these files users could impersonate other users and have the same UserID number. This probably isn't a big deal, since I don't think too many people are looking at the UserID numbers (which appear in the popup when you move the mouse over a username in RecentChanges).
Probably the easiest way is to use the tar option to exclude the user directory. Another approach would be to move the user dir and change the $UserDir variable in the wiki.
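The first approach might look something like the sketch below. It assumes GNU tar (for --exclude) and that the cookie data lives in a subdirectory literally named "user" at the top of the served directory; the function name and any paths are made up for illustration and would need adjusting to the real layout.

```shell
#!/bin/sh
# Sketch only: make_public_tarball is a hypothetical helper, and "user" is
# assumed to be the name of the subdirectory holding the cookie data.

make_public_tarball() {
    wiki_dir=$1   # directory the site is served from (assumed layout)
    out=$2        # gzip-compressed tarball to write, outside wiki_dir
    parent=$(dirname "$wiki_dir")
    name=$(basename "$wiki_dir")
    # Archive the whole directory relative to its parent, but skip the
    # user subdirectory (and everything under it) so randkey stays private.
    tar -czf "$out" --exclude="$name/user" -C "$parent" "$name"
}

# Example call (paths are placeholders):
#   make_public_tarball /home/www-data/wiki /home/www-data/html/wiki.tar.gz
```

This leaves the live site untouched, which is probably why it is less work than moving the directory and changing $UserDir.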
I hope to get less busy in the next few weeks, and help out more with wiki issues. Wikipedia has definitely exceeded my expectations--thanks to you and all the other people at Wikipedia/Bomis.
--Cliff