Wikitech-l November 2002

wikitech-l@lists.wikimedia.org

46 participants
134 discussions

by Tomasz Wegrzanowski

There are four issues standing between the Polish Wikipedia and the phase 3 software.

The first is automatic conversion of &#codes;, which requires just a small patch that I have already sent. The second is support for H1 (don't say that it is reserved for page titles; CSS already treats H1 and H1.pagetitle differently).

The other two are MySQL issues:
* MySQL doesn't search UTF-8 correctly.
* Wikipedia should be mirrorable, and MySQL database dumps are not a really convenient way to do that.

AFAIR MySQL 4.1 is supposed to fix the first, and there is already a patch that fixes it, so could you investigate that? Mirroring by downloading dumps is very inconvenient; making nightly patches of the dump file available is the bare minimum. In the longer term, a better solution should be developed.

Anyway, I'm for setting up the final installation as soon as the &#code; and H1 issues are fixed. The MySQL issues may be hard to fix, and nothing really critical would happen if they were fixed a few weeks later.
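The &#code; conversion mentioned above can be illustrated with a minimal sketch. This is not the patch that was sent (its contents are not shown in this archive); the function names `decode_numeric_refs` and `to_utf8` are hypothetical, and leaving markup-significant characters escaped is an assumption about what a wiki would want.

```python
import re

# Matches decimal (&#322;) and hexadecimal (&#x142;) numeric character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_refs(text: str) -> str:
    """Replace numeric character references with the characters they name."""

    def _replace(match: "re.Match[str]") -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave out-of-range or surrogate values untouched rather than guess.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        # Assumption: references to markup-significant characters stay escaped,
        # so that decoding cannot change how the page is parsed.
        if chr(code_point) in "<>&\"'":
            return match.group(0)
        return chr(code_point)

    return _NUMERIC_REF.sub(_replace, text)


def to_utf8(text: str) -> bytes:
    """Decode the references, then encode the result as UTF-8 bytes for storage."""
    return decode_numeric_refs(text).encode("utf-8")
```

For example, `decode_numeric_refs("&#321;&#243;d&#378;")` yields the four-character city name "Łódź", while `"a &#60; b"` is returned unchanged.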

21 years, 5 months

Quickbar patch

by Erik Moeller

Hi,

attached is a patch for the navigation sidebar. It changes the link names as described in my previous message, and it fixes two problems:

- The "Protected page" text was shown twice.
- The Watchlist link was shown for users who were not logged in, although only logged-in users can use it.

Regards,
Erik

-- FOKUS - Fraunhofer Institute for Open Communication Systems Project BerliOS - http://www.berlios.de

21 years, 5 months

Re: [Wikipedia-l] question about html templates

by Brion VIBBER

(cc'ing to wikitech-l)

Jonathan Walther wrote:
> Where are the templates stored that are used in generating the HTML
> pages? For instance, the templates for the header and footer for each
> article; the format of the recent changes page; that sort of thing. Is
> it hard coded into the source?

Yes. It's kind of ugly that way. ;)

Language-specific text is stored away in arrays in the Language**.php files, but that contains relatively little markup. See Skin.php and its fellows for most of the layout; also OutputPage.php for some bits.

(And feel free to use the wikitech-l list for technical discussion of the code; wikipedia-l is fairly high traffic as it is.)

Jonathan Walther wrote:
> Hi. Question about Wiki version control. Am I correct in believing
> that every revision of an article is stored in the database, in full?

Yes. Old revisions sit in the 'old' table, every blessed one. It's a big table. (In theory this could be made more efficient in various ways: compression, diffs, etc.)

> Also, looking through the php source, I'm seeing what look like a lot
> of MySQLisms that are hard to clean up, but if fixed could mean
> tremendous speedups with Postgres. That's entirely apart from the
> benefit of running the VACUUM program every night so the database
> optimizes itself for the data access patterns that it sees.

Mmmm, please do!

> I would like to compliment the coders on a really clean codebase. The
> code is a pleasure to read and tweak.

Send all compliments on the current codebase to Lee Daniel Crocker. He da man!

> Not nice to do major changes on, but I doubt if that was ever
> intended for the code anyway. Postgres support isn't a major change,
> btw.
> There is one minor point; it's a very nice thing to have the sql stuff
> abstracted out into its own .sql file. I refer to things like
> buildTables.php, and the like. Code and SQL don't mix too well; makes
> it harder to hunt down bugs or make modifications in either one. For
> instance, getting rid of MySQLisms...

Yes, it might not be a bad idea to break out the queries that way, so as much as possible you can just drop in an alternate file or two and run with a different database backend.

-- brion vibber (brion @ pobox.com)
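The idea of breaking the queries out so that an alternate set can be dropped in per database backend might look roughly like the sketch below. It is written in Python rather than the PHP of the actual codebase, and everything in it is hypothetical: the `QUERY_SETS` registry, the `get_query` helper, and the `article`/`title` names. The one concrete difference it illustrates is that MySQL accepts `LIMIT offset, count`, whereas PostgreSQL expects `LIMIT count OFFSET offset`.

```python
from string import Template

# One named query per backend; in a real setup each set could live in its own file.
QUERY_SETS = {
    "mysql": {
        "page_of_titles": Template(
            "SELECT title FROM article ORDER BY title LIMIT $offset, $count"
        ),
    },
    "postgres": {
        "page_of_titles": Template(
            "SELECT title FROM article ORDER BY title LIMIT $count OFFSET $offset"
        ),
    },
}


def get_query(backend: str, name: str, **params: int) -> str:
    """Look up a named query for the given backend and fill in integer parameters."""
    try:
        template = QUERY_SETS[backend][name]
    except KeyError:
        raise KeyError(f"no query {name!r} for backend {backend!r}") from None
    # Only integers are substituted as text here; string values should go
    # through the database driver's own parameter binding instead.
    return template.substitute({key: int(value) for key, value in params.items()})
```

Calling code then asks for `get_query(backend, "page_of_titles", offset=20, count=10)` and never contains backend-specific SQL itself.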

21 years, 5 months

Re: [Wikipedia-l] proposal to speed up Wikipedia

by Brion VIBBER

(moving to the wikitech-l list; see sign-up and archive page at http://www.wikipedia.org/mailman/listinfo/wikitech-l )

Jonathan Walther wrote:
> I've done some work at converting the Wikipedia to Postgres, but am not
> there yet. So, let's put that aside for now.

Great! I did get postgresql installed on my machine, but got bogged down in details of converting the table definitions and various interface behaviors. Someone with prior experience working with postgres would be a big help there.

> It seems that the wiki "source" is "interpreted" into html every single
> time someone accesses a link. That seems like a lot of overhead.
> Given that for every time a change is made to the wiki source to a page,
> several people "view" it, why not just regenerate the html only when
> changes are made, and store it? It would take more storage space, but
> should be MUCH faster. And if storage is an issue, I can donate some
> hard drives...

We used to cache in the phase II days on the old server. This was removed for two reasons:

1) Wiki->HTML rendering is still pretty darn fast, particularly with our new dedicated server; database contention seems to be our main problem during high-load periods.
2) We had problems keeping the cache consistent with the old code.

On number 2, I would certainly welcome an improved cache subsystem that's designed right from the ground up. The old one was hacked in as a "crap! the system's unusably slow, let's hack in some improved code".

On number 1, note that LinkCache::addLink() does a brief query on the cur table for every link when rendering a wikipage. These could probably be consolidated somehow or other. (Note that this does not apply to Recentchanges, which loads everything in a big chunk.)

> The savings on the Recent Changes page alone should work wonders.

On the English wikipedia, Recentchanges is loaded at default options about 3000 times per day; the number of edits per day is a similar figure, and every edit means the page has to change to reflect it. Caching the rendered display wouldn't seem to save significantly over rerendering it on each view.

-- brion vibber (brion @ pobox.com)
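One way the per-link queries mentioned above could be consolidated is to collect every link target on a page first and check them all in a single query. The sketch below is an illustration only, in Python with an in-memory SQLite database rather than the PHP/MySQL code under discussion; the `existing_titles` function and the `title` column are hypothetical, and only the `cur` table name comes from the message.

```python
import sqlite3
from typing import Iterable, Set


def existing_titles(conn: sqlite3.Connection, titles: Iterable[str]) -> Set[str]:
    """Return the subset of titles that exist, using one query instead of one per link."""
    unique = sorted(set(titles))
    if not unique:
        return set()
    # One placeholder per distinct title; values are bound, never interpolated.
    placeholders = ", ".join("?" for _ in unique)
    rows = conn.execute(
        f"SELECT title FROM cur WHERE title IN ({placeholders})", unique
    )
    return {row[0] for row in rows}


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in for the cur table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cur (title TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO cur (title) VALUES (?)", [("Main_Page",), ("Poland",)]
    )
    return conn
```

A renderer would then mark each link as existing or missing by set membership. Very long pages may need the title list split into chunks, since databases limit the number of bound parameters per statement.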

21 years, 5 months

Re: [Wikipedia-l] Blocked, insulted, and pissed

by Erik Moeller

Hi,

I'd like to suggest two solutions to the blocking of dynamic IP addresses or proxies that may affect innocent users.

Solution #1: IP address blocks should expire after n days unless renewed by someone. That way, instead of forgetting to unblock people, at worst we forget to re-block them. In my opinion, it's better to fail to punish someone effectively than to punish someone who's innocent.

Solution #2: We should give blocked users a way to regain access to the site, namely by creating an account. I don't know if this is currently possible, but it should be. We can block accounts a lot more easily than IP addresses. So we could basically say on the block page:

"Because IP addresses cannot be reliably linked to individuals, it may be that you receive this message in error. In that case, or if you want to change your behavior, please create an account and sign in, and you can continue to use Wikipedia."

We might still reserve complete IP & account bans for those who abuse the account "backdoor", but this should be the exception, not the rule. This would make our security softer, and hopefully more effective.

Regards,
Erik
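The two proposals can be sketched together as follows. This is an illustration under stated assumptions, not existing Wikipedia code: the `Block` class, the `may_edit` function, and the 7-day value standing in for "n days" are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict

# The "n days" from Solution #1; 7 is an arbitrary placeholder value.
BLOCK_LIFETIME = timedelta(days=7)


@dataclass
class Block:
    ip: str
    last_renewed: datetime

    def is_active(self, now: datetime) -> bool:
        """A block lapses on its own once it has gone unrenewed for BLOCK_LIFETIME."""
        return now - self.last_renewed < BLOCK_LIFETIME

    def renew(self, now: datetime) -> None:
        """An administrator renewing the block restarts the expiry clock."""
        self.last_renewed = now


def may_edit(ip: str, blocks: Dict[str, Block], now: datetime, logged_in: bool = False) -> bool:
    """Decide whether a request from this IP address may edit."""
    block = blocks.get(ip)
    if block is None or not block.is_active(now):
        return True
    # Solution #2: a signed-in account bypasses a plain IP block.
    return logged_in
```

A complete IP and account ban for people who abuse the account route would need a separate flag on the block, which this sketch leaves out.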

21 years, 5 months

Improving sidebar

by Erik Moeller

I'm not 100% happy with the sidebar:

Main page
Recent changes
Watch list
Current events
--------------------
Edit this page
Watch this page
Move this page
Talk page | Subject page
History
What links here
Watch links
--------------------
Upload
Bug reports
Special pages

The links are perfectly OK, but I have problems with the words used and sometimes with the positioning. I suggest the following sidebar:

Main Page
Recent changes
My watchlist
Random page
Current events
--------------------
Edit this page
Watch this page
Move this page
Discuss this page | View article
Older versions
What links here?
Link history
--------------------
Upload file
Special pages
Bug reports

Explanations:

- "Watch list" should be "My watchlist" to make clear that this is not a page that is the same for all users, like the other links in this section of the sidebar.
- "Talk page" should be "Discuss this page" to use the same imperative style as the other links above it. Since users don't have to create "/Talk" links anymore, it's not necessary that we use the actual word "Talk" anywhere but in the URLs.
- "Subject page" should be "View article" or "Back to article". "Subject page" is really ambiguous and hard to understand; I searched several times for a way back to the article because I didn't find an obvious link.
- "History" should be "Older versions" or "Page history" to be more obvious. Most people are not familiar with the concept of article histories.
- "What links here" needs a question mark.
- "Watch links" should be "Link history", "Related changes" or something else. "Watch links" suggests that this will add the links to my watchlist, which it doesn't do.
- "Upload" should be "Upload file". True, a bit redundant, but more familiar this way.
- "Bug reports" should be at the bottom, as this is the least relevant.

Your thoughts? I'll be glad to patch this, although it should be easy enough for someone with CVS access to do it on their own.

Regards,
Erik

21 years, 5 months

Number of articles suggestion

by Erik Moeller

Hi,

since the site stats are conveniently stored in the site_stats table, I suggest subtracting the number of articles created by the Ram-Man bot (US Census city information) from the total number of articles (NOA).

Why? The NOA is primarily interesting as a measure of our collaborative progress. This is important for ourselves and for others. Personally, I've had several discussions about Wikipedia where I was reluctant to cite the NOA because of the high number of machine-generated articles; others probably feel the same. I therefore believe we should generally exclude autogenerated articles (we can change the wording on Main_Page to reflect this).

As it would be a 5 minute task for anyone with access to the db, is there any reason not to do it?

Regards,
Erik
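The subtraction proposed above amounts to something like the following sketch. Only the `site_stats` table name and the Ram-Man account come from the message; the `total_articles` column, the `article_creators` table, and both function names are hypothetical stand-ins, since the real schema is not shown here.

```python
import sqlite3

BOT_USER = "Ram-Man"  # the bot account named in the message


def human_article_count(conn: sqlite3.Connection) -> int:
    """Stored article total minus the articles created by the bot."""
    (total,) = conn.execute("SELECT total_articles FROM site_stats").fetchone()
    (by_bot,) = conn.execute(
        "SELECT COUNT(*) FROM article_creators WHERE creator = ?", (BOT_USER,)
    ).fetchone()
    return total - by_bot


def make_demo_db() -> sqlite3.Connection:
    """Tiny in-memory stand-in with a hypothetical schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE site_stats (total_articles INTEGER)")
    conn.execute("CREATE TABLE article_creators (title TEXT, creator TEXT)")
    rows = [
        ("Example_town_A", BOT_USER),
        ("Example_town_B", BOT_USER),
        ("Example_town_C", BOT_USER),
        ("Example_essay", "SomeEditor"),
        ("Example_biography", "AnotherEditor"),
    ]
    conn.executemany("INSERT INTO article_creators VALUES (?, ?)", rows)
    conn.execute("INSERT INTO site_stats VALUES (?)", (len(rows),))
    return conn
```

The adjusted figure could either be computed on demand like this or written back to the stats table once, whichever the people with database access prefer.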

21 years, 5 months

Wikipedia is stuck

by Pierre Abbat

Wikipedia has been stuck for the past twelve minutes or so. Can someone unstick it, or correlate it with some time-consuming query?

phma

21 years, 5 months

Lag

by Toby Bartels

I've been getting lag for about the past hour that seems to be especially pronounced when saving articles. Otherwise, it's slower than usual but still tolerable. Saving, however, takes half a dozen minutes. -- Toby

21 years, 5 months

Automatic &#codes; to UTF8 conversion patch

by Tomasz Wegrzanowski

Patch for Polish Wikipedia.

21 years, 5 months
