Wikitech-l December 2003

wikitech-l@lists.wikimedia.org

84 participants
155 discussions

Re: [Wikitech-l] Re: Proposed table restructuring
by user_Jamesday 21 Dec '03

Timwi wrote:

> * The other thing I like to mention is LiveJournal. Their database backend is pretty impressive and handles the load of almost a million active users. They have never even dreamt of placing journal entries, audio posts or user pictures into files in a file system. The way they have it now they can easily create more database clusters and move users (and their data) around between clusters using a little Perl script. With a file system, that would be quite a bit more difficult.

Not quite. LiveJournal identified blob transfer from their databases as a bottleneck and moved the blobs out to remove it, as part of their wider work on getting the databases out of bottleneck status, which in general meant shifting database things into memcached caches whenever possible. This move started in earnest in early November. LJ blobs (images and audio) are now on a NetApp box filesystem, completely apart from their user database cluster machines.

You can see the overall architecture here:
http://www.livejournal.com/community/lj_backend/974.html

A description of the move out of the database machines is here:
http://www.livejournal.com/community/lj_backend/502.html

Without the third-party Akamai servers they would be serving via their blob component, which would cache in memcached, loading into that from the NetApp filesystem if not in the cache already.

Addressing your concerns about integrity: the way to do that is to back up the images, then the database, then any new images. With all image names incorporating timestamps, that will ensure that the images matching the database are available, at the cost of keeping some extra images: those deleted before the backup and new ones created during it.

LJ caches data into memcached, with the memcached servers (sometimes several of them on one machine) residing on the web servers (page builders), because the web servers are CPU-bound and have RAM to spare. That lets the machines do double duty.

Like memcached, the blob component is open source.
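
The read path described above (look in memcached first, fall back to the NetApp filesystem on a miss) is the usual read-through cache pattern. Below is a minimal Python sketch of that pattern only; it is not LiveJournal's blob component. A plain dict stands in for memcached, a temporary directory stands in for the NetApp mount, and the `BlobStore` class, its methods and the file name are all hypothetical.

```python
import os
import tempfile


class BlobStore:
    """Read-through cache in front of a directory of blob files (hypothetical)."""

    def __init__(self, root, cache=None):
        self.root = root                              # stand-in for the NetApp mount
        self.cache = {} if cache is None else cache   # stand-in for memcached
        self.disk_reads = 0                           # counts cache misses, for illustration

    def get(self, name):
        # 1. Try the cache first.
        data = self.cache.get(name)
        if data is not None:
            return data
        # 2. On a miss, load from the filesystem and populate the cache.
        with open(os.path.join(self.root, name), "rb") as f:
            data = f.read()
        self.disk_reads += 1
        self.cache[name] = data
        return data


def demo():
    """Fetch the same blob twice; only the first fetch touches the disk."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "userpic-1.png"), "wb") as f:
            f.write(b"fake image bytes")
        store = BlobStore(root)
        first = store.get("userpic-1.png")    # miss: read from disk
        second = store.get("userpic-1.png")   # hit: served from the cache
        return first, second, store.disk_reads
```

A real deployment would also need cache expiry and invalidation when a blob is replaced, which this sketch leaves out.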

Edit conflicts on section editing
by zocky 21 Dec '03

If we treated a section's path as its ID, we could get rid of most conflicts when editing sections.

''Disclaimer: this should work fairly well for two users in conflict. I'm not sure how scalable it is or needs to be.''

Construct paths from heading texts, separated by #. Where needed, add disambiguating characters. Use the path to identify sections when editing. Provide users with an option to edit either the section with its subsections (so the smallest needed part is edited when restructuring) or just the text of the section (to edit even less when appropriate).

So, users U1 and U2 are concurrently editing sections S1 and S2, and then U1 saves the page with his version of S1. U2 tries to save the page with his edit of S2, but gets an edit conflict. What's to be done?

1) If S1 and S2 are the same, show the diff and let U2 handle it.
2) If S1 and S2 are disjoint (i.e. neither is included in the other), there is no conflict. Overwrite S2 with U2's version and save.
3) If S1 and S2 are not disjoint, we call the larger one (S) and the smaller one (s). Check if (s) exists in the newest version of (S).
3.1) If (s) doesn't exist, present U2 with the diff page for (S).
3.2) If (s) exists, and U1 changed its contents, present U2 with the diff.
3.3) If (s) exists, and the user who edited (S) did not change its contents, replace it with the version of the user who edited (s). (This is quite automagical, so if conservative, present U2 with the diff.)
3.4) When U2 finally saves the page, record the version as an edit of (S) to ensure correct behaviour.

Additional tuning may be needed:

Maybe show a separate diff for (s), and show the source of (s) in the version where it was directly chosen for editing.

What happens when U1 changed the title of the section (or the ordering of sections with the same name) but left the contents alone? Should we bother to check for it? Maybe keep CRCs of section contents? (They can be kept in the source: filter them out on display and edit, calculate new ones and store on save.)

~~~~ zocky
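
The case analysis above can be sketched roughly as follows. This is an illustration under stated assumptions, not MediaWiki code: all function names are hypothetical, the `~2` suffix is just one possible choice of disambiguating characters, and case 3.2 is read here as the complement of 3.3 (the editor of (S) changed the contents of (s)).

```python
import zlib


def build_paths(headings):
    """headings: list of (level, title) in document order -> list of '#'-joined paths."""
    stack, counts, paths = [], {}, []
    for level, title in headings:
        # Drop headings at the same or a deeper level; what remains is the parent chain.
        while stack and stack[-1][0] >= level:
            stack.pop()
        base = f"{stack[-1][1]}#{title}" if stack else title
        # Disambiguate repeated paths with a numeric suffix (an assumption).
        counts[base] = counts.get(base, 0) + 1
        path = base if counts[base] == 1 else f"{base}~{counts[base]}"
        stack.append((level, path))
        paths.append(path)
    return paths


def relation(p1, p2):
    """Classify two section paths as 'same', 'nested' or 'disjoint'."""
    if p1 == p2:
        return "same"
    if p1.startswith(p2 + "#") or p2.startswith(p1 + "#"):
        return "nested"
    return "disjoint"


def section_crc(text):
    """CRC of a section's contents, usable to detect whether it changed."""
    return zlib.crc32(text.encode("utf-8"))


def resolve(u1_path, u2_path, s_exists=True, s_changed_in_S=False):
    """Return the action for U2's save, following cases 1 to 3.3 above."""
    rel = relation(u1_path, u2_path)
    if rel == "same":
        return "show diff"                  # case 1
    if rel == "disjoint":
        return "save without conflict"      # case 2
    if not s_exists:
        return "show diff for (S)"          # case 3.1
    if s_changed_in_S:
        return "show diff"                  # case 3.2 (as read here)
    return "merge (s) into (S)"             # case 3.3
```

For example, `build_paths([(1, "History"), (2, "Early"), (1, "Notes"), (1, "Notes")])` gives `History`, `History#Early`, `Notes` and `Notes~2`. Heading texts that themselves contain `#` or `~` would need escaping, which the sketch ignores.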

Proposal for deadlinks
by Jeroenvrp 21 Dec '03

Hi, Please visit http://meta.wikipedia.org/wiki/Proposal_for_dead_links . Cheers, Jeroenvrp

Wikipedia with more than one language
by Daniel Mayer 21 Dec '03

Some projects, like Wikibooks and Wikisource (not to mention Meta), don't have major naming conflict problems. But it would help a great deal if it were possible for users to set their preferred interface language in their preferences and for language category tags to set the interface language as well (so anons visiting a page with a French language tag will have their interface switch to French; &lang=fr in the URL would trigger the same thing as well). -- mav
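
One way the three signals mentioned above (a `lang` URL parameter, a user preference and a page language tag) could be combined is sketched below. The message does not specify a precedence, so the order used here (URL parameter, then user preference, then page tag, then a default) is an assumption, as are the function name, the `SUPPORTED` set and the example URLs.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative set of interface languages; not taken from the message.
SUPPORTED = {"en", "fr", "de"}


def interface_language(url, user_pref=None, page_lang_tag=None, default="en"):
    """Pick an interface language from the URL, the user's preference or the page tag."""
    # parse_qs maps each parameter to a list of values; take the first "lang" if present.
    query_lang = parse_qs(urlparse(url).query).get("lang", [None])[0]
    # Assumed precedence: explicit URL parameter, then preference, then page tag.
    for candidate in (query_lang, user_pref, page_lang_tag):
        if candidate in SUPPORTED:
            return candidate
    return default
```

With this ordering, an anonymous visitor (no preference) on a page tagged French gets `fr`, while a logged-in user with a German preference keeps `de` unless the URL carries `lang=fr`.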

RE: [Wikitech-l] FW: [WikiEN-l] Wikibot, murder and Black Panther redirects
by Poor, Edmund W 21 Dec '03

Thanks for taking such quick action, Andre. :-) Ed Poor

Wikipedia with more than one language
by Markus Baumeister 21 Dec '03

Hello,

I'm currently playing around with Wikipedia to get a version with German and English entries running on my Zaurus. As far as I currently understand the code, having entries from two languages would require two MySQL databases, two HTTP directories on the server, and cross-references between these wikis via the "en" and "de" prefixes in the interwiki database table.

Is this correct, or is there a way to have two languages with only one database and one copy of the MediaWiki code?

BTW, http://wikipedia.sourceforge.net/ still names mediawiki-20031118.tar.gz as the most recent version, which got me wondering about the speed of rebuildlinks.php ;-)

Kind regards
Markus
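
To make the cross-reference idea concrete, here is a rough Python sketch of how a prefix lookup of this kind could work. It is not MediaWiki's code: a dict stands in for the interwiki table, the `localhost` URLs and the `resolve_link` function are hypothetical, and the `$1` placeholder follows the convention used in MediaWiki interwiki URLs as far as I know.

```python
from urllib.parse import quote

# Stand-in for the interwiki table: prefix -> URL pattern with a $1 placeholder.
# Both URLs are made up for the two-directory setup described above.
INTERWIKI = {
    "en": "http://localhost/en/wiki.phtml?title=$1",
    "de": "http://localhost/de/wiki.phtml?title=$1",
}


def resolve_link(target, local_prefix):
    """Return the URL for an interwiki link, or None if the link is local."""
    prefix, sep, title = target.partition(":")
    if sep and prefix in INTERWIKI and prefix != local_prefix:
        # Titles use underscores for spaces; quote() escapes the rest.
        return INTERWIKI[prefix].replace("$1", quote(title.replace(" ", "_")))
    return None
```

On the English wiki, `resolve_link("de:Berlin", "en")` points at the German installation, while a plain `Berlin` stays local.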

Wikipedia : MediaWiki
by zocky 21 Dec '03

Just out of curiosity, what is the current thinking among developers? Is it more:

A) MediaWiki is a general purpose wiki infrastructure, which also runs Wikipedia. Developers of MediaWiki should concentrate on making the best possible software and editors of Wikipedia should take advantage of new features as they're provided.

B) MediaWiki is support infrastructure for Wikipedia. Editors of Wikipedia should decide what kind of behaviour and which features Wikipedia needs and developers of MediaWiki should implement them.

~~~~ zocky

[ken@dobruskin.com: Forbidden access to wikipedia server]
by Jimmy Wales 21 Dec '03

Ken, I'm forwarding your inquiry to wikitech-l.

----- Forwarded message from Ken Dobruskin <ken(a)dobruskin.com> -----

From: Ken Dobruskin <ken(a)dobruskin.com>
Date: Fri, 19 Dec 2003 12:00:02 +0100 (CET)
To: Jimbo Wales <jwales(a)bomis.com>
Subject: Forbidden access to wikipedia server

Greetings!

After reading the Wikipedia GFDL and policy on robots, I thought I'd try an experiment, absolutely in line with said policies. However when I tried to access the site from my server it failed with a 403.

User-agent: Python-urllib/1.15
IP address: 216.28.158.40

Sorry if I should be asking this to the mailing list, but I did see a word about access issues to be addressed to a site admin. Would appreciate your advice.

Best 'net regards,
Ken

----- End forwarded message -----
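
The message does not say why the server answered 403. If, and this is only an assumption, the default `Python-urllib` User-Agent string is what gets rejected, a script can identify itself explicitly. The sketch below uses the modern `urllib.request` module rather than the 2003-era `urllib`; the bot name, contact address and helper names are hypothetical.

```python
import urllib.request


def build_request(url, agent="ExampleBot/0.1 (contact: you@example.org)"):
    """Build a request that carries a descriptive User-Agent instead of the default."""
    return urllib.request.Request(url, headers={"User-Agent": agent})


def fetch(url):
    """Fetch a page with the custom User-Agent; raises urllib.error.HTTPError on a 403."""
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return resp.read()
```

Whether this is enough depends on the site's robot policy, which may also block by IP address or request rate.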

Please disregard previous message from me
by zocky 20 Dec '03

It is old and has been dealt with.

Linux security updates
by Brion Vibber 19 Dec '03

I topped off the rpm-based security updates on pliny and larousse, and I *think* I got the updates installed on geoffrin via yast2. I also gave them more current kernels, which shouldn't have the recently publicized kernel vulnerability (the latest Red Hat-provided one for larousse, and a compiled 2.4.23 for pliny; I assume the SuSE updates should also be current).

They'll need to be rebooted for the kernel upgrades to take effect. Larousse and pliny are theoretically both set to boot to the new kernels on the next boot only, so if they won't come up, a second reboot should bring back the old kernel.

Since this involves, conservatively, some 30 minutes of downtime (possibly rather more if there are problems), I'd rather not do it in the middle of the day. Note also that geoffrin cannot yet be remotely rebooted if it can't be logged into.

-- brion vibber (brion @ pobox.com)
