Wikitech-l February 2003

wikitech-l@lists.wikimedia.org

54 participants
118 discussions

Start a nNew thread

Re: Blocked another leech

by Jan Hidders

There is an Apache module for that: http://dominia.org/djao/limitipconn.html -- Jan Hidders

21 years, 2 months

Blocked another leech

by Brion Vibber

80.2.170.93 public2-bror2-5-cust93.bror.broadband.ntl.com Using "WebStripper/2.19". I've blocked both the UA string "WebStripper" (permanently) and the IP (will clear it after a few days). Is there some kind of rate-of-connections-per-IP throttling we could do with Apache? -- brion vibber (brion @ pobox.com)

21 years, 2 months

Offline-/CD-Reader

by Magnus Manske

I have (despite some protests :-) begun to write an offline wikipedia reader. It is C++, using wxWindows, and the free Dev-C++ development environment. I was pleasently surprised to find that wxWindows not only contains a built-in HTML display component, but also a zipfile input stream, so it can read a file directly from a packed archive, which makes it perfect for CD-ROM use (no need to unpack the thing). An offline reader for pre-made HTML pages could be up in a matter of days. But I'd like to aim higher. 1. The power of C++ would make it perfect for parsing the wiki code. 2. The parser will be an object (of course), which could be used in a Phase IV C++ software. 3. It could be used as a fast client-side reader, instead of a web browser, for wikipedia pages. It could load the wiki code from the live site without the server having to render it. 4. It could serve as a special wikipedia offline (or client-side) editor. I'll try to implement some of #1 before going public, though. Magnus

21 years, 2 months

[brion@pobox.com: Re: [Wikipedia-l] Problem with "User Contributions"]

by Jimmy Wales

Forwaded from wikipedia-l... this seems like a good example to me of something that could be cached. If someone requests "User Contributions" then if they've calculated it within the last 24 hours, say, then they should be given the cached version and told that a new one will be generated after midnight, or similar. That is, unless people use their User Contributions for some realtime work. ----- Forwarded message from Brion Vibber <brion(a)pobox.com> ----- From: Brion Vibber <brion(a)pobox.com> Date: 01 Feb 2003 17:19:30 -0800 To: wikipedia-l <wikipedia-l(a)wikipedia.org> Subject: Re: [Wikipedia-l] Problem with "User Contributions" On sab, 2003-02-01 at 14:33, Zoe wrote: > For several days now, when I've clicked on "User Contributions" on my > user page, Wikipedia has churned for several minutes, and then > displayed the browser's "The page cannot be displayed". I can > sometimes Refresh and eventually it comes up, but today, that's been > completely unsuccessful. That's cause you edit too much, Zoe. ;) Seriously, though, yes. Certain operations that involve checking large numbers of old page revisions are sometimes excruciatingly slow of late. This is hitting: * User contributions * History of oft-edited pages with hundreds of revisions (Village pump, mav's talk page, current events, talk:main page, vandalism alerts) * Diff to last edit on the same. (This involves sorting to find the most recent edit, and that seems to be holding us up.) Until this is resolved, I'd like to ask that you _don't_ hit refresh when one of these is churning away like mad, but rather just let it go. The first query keeps running for a while, and it's just not going to go anywhere until it's done (or I or Magnus or someone logs in and kills the query), and a second request might just make it worse. :( -- brion vibber (brion @ pobox.com) ----- End forwarded message -----

21 years, 2 months

Size of database dump

by Magnus Manske

Part of my offline reader is the functionality to split a database dump into a zillion files, one for each article. (That function will only be used on the *coding* side, e.g., on the wikipedia server when generating a CD-ROM version). So I downloaded the zipped German database, which unzips to ~33MB. Imagine my surprise when the article files (article namespace only!) came to less than 14MB (~16.000 files). It seems that in each database dump, we have the "search indexed" article text as well, which contains the same text as the article, but without special chars. Can we not dump that field next time? It would reduce the file size (and download time) by about 50%! Magnus

21 years, 2 months

[Fwd: BUG: InnoDB ORDER BY DESC may hang in 4.0.10]

by Brion Vibber

Oh, lovely. Perhaps I should hold off on that upgrade just a bit... -- brion vibber (brion @ pobox.com) -----Forwarded Message----- From: Heikki Tuuri <Heikki.Tuuri(a)innodb.com> To: mysql(a)lists.mysql.com Subject: BUG: InnoDB ORDER BY DESC may hang in 4.0.10 Date: 07 Feb 2003 02:40:40 +0200 Hi! A rather serious bug was introduced to 4.0.10 in connection of another bug fix. If you have a composite key (col1, col2) in an InnoDB table, then a query of type SELECT ... FROM ... WHERE col1 = x ORDER BY col2 DESC; may hang in an infinite loop. The fix is in 4.0.11. Best regards, Heikki Tuuri Innobase Oy --- InnoDB - transactions, hot backup, and foreign key support for MySQL See http://www.innodb.com, download MySQL-Max from http://www.mysql.com sql query

21 years, 2 months

Hardware inventory

by Jimmy Wales

Jason and I are taking stock of our hardware, and I'm going to find a secondary machine to devote exclusively to doing apache for wikipedia, i.e. with no other websites on it or anything. I'll loan the machine to the Wikipedia Foundation until the Foundation has money to buy a new machine later on this year. We'll keep the MYSQL where it is, on the powerful machine. The new machine will be no slouch, either. Today is Friday, and I think we'll have to wait for Jason to take a trip to San Diego next week sometime (or the week following) to get this all setup. (The machine I have in mind is actually in need of minor repair right now.) By having this new machine be exclusively wikipedia, I can give the developers access to it, which is a good thing. This will *not* involve a "failover to read-only" mechanism, I guess, but then, it's still going to be a major improvement -- such a mechanism is really a band-aid on a fundamental problem, anyway. ------ Lots of people think it's a good thing to set up mirror servers all over the Internet. It's really not that simple. There are issues of organizational trust with user data, issues with network latency, etc. Some things should be decentralized, some things should be centralized.

21 years, 2 months

LALR?

by Jason Richey

I haven't heard anything new about getting a more powerful parser together. Have I missed anything? -- "Jason C. Richey" <jasonr(a)bomis.com>

21 years, 2 months

quick software question...

by Jason Richey

Does the current software have a quick mechanism to make the site read-only? Also, is there anything that would make it easy to have "articles and searching only". That is, no user logins or anything fancy, just the ability to read the articles and search through them. -- "Jason C. Richey" <jasonr(a)bomis.com>

21 years, 2 months

Re: Giving Jan access

by Sheldon Rampton

Jimbo wrote: >Ian Gilfillan, author of Mastering MySQL 4, has also volunteered to >help, and if he wants access, we can give it to him too, even though I >don't know him. He wrote a book, so he's legit. There's no better source of legitimacy, in my eyes. -- -------------------------------- | Sheldon Rampton | Editor, PR Watch (www.prwatch.org) | Author of books including: | Friends In Deed: The Story of US-Nicaragua Sister Cities | Toxic Sludge Is Good For You | Mad Cow USA | Trust Us, We're Experts --------------------------------

21 years, 2 months

← Newer
1
...
5
6
7
8
9
10
11
12
Older →

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Wikitech-l February 2003