I've consolidated some of the watchlist access code into
WatchedItem.php and added memcached support for it. This should reduce
database hits on logged-in page views; previously the code checked the
watchlist table (twice!) on every logged-in page render.
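For the curious, the caching pattern is roughly the following (a
minimal Python sketch of the idea, not the actual PHP in
WatchedItem.php; db_is_watched() is a hypothetical stand-in for the
real watchlist table query):

    # Assumes a memcached daemon on localhost and the python-memcached
    # client library. Illustrative only.
    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])

    def db_is_watched(user_id, title):
        # Hypothetical placeholder for the SQL lookup against the
        # watchlist table.
        return False

    def is_watched(user_id, title):
        # memcached keys may not contain spaces, hence the underscores
        key = 'watched:%d:%s' % (user_id, title.replace(' ', '_'))
        cached = mc.get(key)
        if cached is not None:
            return cached == '1'   # cache hit: no db round trip
        watched = db_is_watched(user_id, title)
        mc.set(key, '1' if watched else '0', time=3600)
        return watched

A warm cache entry turns the old two-checks-per-render into a single
memcached get; the entry just has to be invalidated (or left to
expire) when the user edits their watchlist.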
(Also a note: main development is currently going on in the stable
branch and is focused on speed, security, and bug fixes, not features.
A lot of fixes will need to be forward-ported to the dev branch at some
point.)
-- brion vibber (brion @ pobox.com)
Some of you might have seen that I wrote a bot to edit Wikipedia.
Actually it is not a bot per se, but a framework for writing wikipedia
bots. I have used this framework to give all the nl: year pages a new
and consistent layout, and currently its two greatest hits are
automatically adding consistent interwiki links and semi-automatically
disambiguating links to disambiguation pages.
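To make the interwiki part concrete: the heart of such a bot is just
comparing the language links a page already carries against those
found on the corresponding pages in the other languages. A minimal
Python sketch of that comparison (illustrative only, not the
framework's actual code):

    # Extract interwiki links like [[de:Titel]] from wikitext and
    # report which languages are missing relative to the links
    # gathered from the other language versions of the same article.
    import re

    INTERWIKI = re.compile(r'\[\[([a-z]{2,3}):([^\]]+)\]\]')

    def interwiki_links(wikitext):
        # language code -> target title found in the page text
        return dict(INTERWIKI.findall(wikitext))

    def missing_languages(wikitext, known):
        # 'known' maps language codes to titles collected elsewhere
        have = interwiki_links(wikitext)
        return dict((lang, title) for lang, title in known.items()
                    if lang not in have)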
I am running these robots on the nl: wikipedia only, and I think that
is the way it should stay. My idea is that if a language wikipedia
agrees to have a robot sniff around, it should be handled and monitored
by someone who is a very regular visitor of that language's wikipedia.
Now of course I am getting into trouble with this: other people want
to run this software. The first was Andre Engels; I just gave him a
copy of the code. Then there was Christian List on da:, and he got a
copy too. But then more people working on wikipedia started asking for
the code, and recently also people running other instances of the
wikipedia software for other projects. I am quite fond of free
software, but here I am hesitating a bit: if I give out the code to a
growing select few, maintenance is going to be a nightmare for me.
Now, I could just start an open source project at sourceforge to
develop it further collaboratively. BUT: I am afraid of the power of
this software, and the damage it could do if it ends up in the wrong
hands (vandals). On the other hand, it is not very difficult to write a
vandalizing robot for wikipedia, and I haven't heard of anyone doing it
so far; why would that start just because a decent piece of robot
software became freely available?
I'd like to start a discussion on this list to help me find a solution
to this. What do wikipedia developers and admins think about it?
Regards,
Rob Hooft
--
Rob W.W. Hooft || rob(a)hooft.net || http://www.hooft.net/people/rob/
Hi
I want to raise the issue that the inter-language link situation at the moment
is a big mess:
Please have a look at Rob Hooft's five pages listing articles on the
English wikipedia that do not have inter-language links to
corresponding articles in other languages:
http://en.wikipedia.org/wiki/User_talk:Rob_Hooft/page1 (pages 1 through 5).
There are literally thousands of such links missing. Multiply this by
roughly two to factor in the missing links in all the other languages
(the English wikipedia has roughly half the total number of articles),
and you can see the magnitude of the problem.
The correct alphabetical ordering of links is also necessary, and adds more
work.
Now, there are a couple of solutions to this problem (please add your own):
1) Just continue as we have been doing, while asking people to make an
effort to add inter-language links.
Pro: doesn't require any mediawiki changes
Con: wastes a _massive_ amount of man-hours that, needless to say,
could be better employed
2) Deploy the robot that Rob wrote (and is using on the Dutch wikipedia) on
all wikipedias. To my knowledge this robot searches for missing links
and adds them automatically.
Pro: no mediawiki changes necessary
Con: at least one person per language is needed to run/administer the robot
3) Create a unified inter-language link field (similar to what I have proposed
for pictures/images). This would have a list of languages in which a
particular article is available.
Pro: No need for robot traffic that eats up man-hours, bandwidth, CPU,
RAM, and disk. No massive waste of man-hours from just continuing as we
are.
Con: mediawiki development of such a feature.
Another obvious problem is that the record for each article will need
a unique identifier which is referenced from each language version of
that article. Two possible approaches:
a) just settle on the English title. I know this is language-ism and
will piss a bunch of people off.
b) the unique identifier is a number. The problem here is obvious: if
you write a new article, you will 'have' to check whether the article
has already been written in another language and therefore already has
a unique number.
Actually, now that I think about it, option (b) sounds pretty stupid.
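Whatever identifier we settle on, the data model behind option 3 is
simple. A rough Python sketch, purely illustrative, with a made-up
numeric identifier standing in until (a) or (b) is decided:

    # One record per concept, mapping language codes to local titles.
    # Not working MediaWiki code.
    article_links = {
        4711: {                  # hypothetical unique identifier
            'en': 'Netherlands',
            'nl': 'Nederland',
            'da': 'Holland',
        },
    }

    def languages_for(concept_id):
        # All languages the article exists in, sorted alphabetically,
        # which also takes care of the link-ordering problem for free.
        return sorted(article_links.get(concept_id, {}))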
Thoughts?
Best,
Sascha Noyes
aka snoyes
--
Please encrypt all email. Public key available from
www.pantropy.net/snoyes.asc
Eureka! I've got it!!
Give each page a "links complete" bit.
The first time the page is loaded, we check all the links, then set the
bit ON.
After that, any page change (creation, deletion, etc.) WHICH AFFECTS
THIS PAGE would then turn the bit OFF.
For example, a page with 20 links. Half of them are to existing (blue)
articles. Half are to missing (red) articles. We load the page, figure
out the colors, cache the page, and set the bit ON.
Next time someone loads the page, we check the bit. If it's still on,
we serve the cached copy. (Cold cash! Cash and carry! We get a
substantial discount!!!)
But --
If anyone creates a new page, we take some extra time to trace all the
"pages that link here" and turn off each affected page's "links
complete" bit. (We do not have to re-render the page, though. That can
wait.)
Likewise, if anyone deletes a page.
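In other words (a minimal Python sketch of the idea, nothing like the
real MediaWiki internals):

    # Each cached page carries a "links complete" bit. Creating or
    # deleting an article clears the bit on every page that links to
    # it; those pages re-render (and re-color their links) on their
    # next view.
    rendered_cache = {}   # title -> (html, links_complete_bit)
    backlinks = {}        # title -> set of titles linking to it

    def serve(title, render):
        entry = rendered_cache.get(title)
        if entry is not None and entry[1]:
            return entry[0]         # bit still ON: serve cached copy
        html = render(title)        # re-check all links, re-render
        rendered_cache[title] = (html, True)
        return html

    def page_created_or_deleted(title):
        # Turn the bit OFF on each affected page. No re-render yet;
        # that waits until somebody actually loads the page.
        for t in backlinks.get(title, ()):
            if t in rendered_cache:
                html, _ = rendered_cache[t]
                rendered_cache[t] = (html, False)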
(patting self on back)
Uncle Ed Poor
Resident genius, tame creationist, and brainwashed cult member ^_^
Just thinking in terms of minimizing the number of directions we've got
to go when mirroring and synchronizing things across servers... We
currently keep page data in a database, but also cache some rendered
pages on the local filesystem. We keep information about TeX fragments
in a database, but cache the rasterized images in the local filesystem.
Uploaded files are a little different: we keep track of them in a
database, but the essential files themselves cannot be reconstructed
from the database alone. Thus if we run multiple web servers from one
database, they fight over the files, needing either a shared networked
filesystem -- inaccessible if the server dies -- or for all but one to
be read-only and periodically update from the live site.
Transparent load-balancing will require that any web server that you
stumble upon should be as good as any other and provide the same stuff,
so locking off all but one (as we've done from time to time when
experimenting with a second 'en2' access point) is less than ideal.
Hypothetically we could have a notification system where the server
that makes a change sends updates to all other servers, but this raises
a question of resynchronization if a member of the cluster is taken out
of service and brought back in -- or if two make conflicting changes
simultaneously.
It might be simplest to just store uploads directly as blobs in the
database. When a request comes in to serve an uploaded file, we can
cache it in the local filesystem (or memory) if we like, to speed
access and allow read-only access with the db server down. This allows
resynchronization by checking cache-time validity, and lets images be
backed up along with the rest of the db.
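The serving path would look something like this (a minimal Python
sketch of the pattern, not real MediaWiki code; the uploads table and
its lookup are hypothetical placeholders):

    # Serve an upload from the local filesystem cache when possible;
    # on a miss, pull the blob from the database and populate the
    # cache for next time.
    import os

    CACHE_DIR = '/tmp/upload-cache'   # hypothetical path
    UPLOADS = {}                      # stands in for the db blob table

    def get_blob(name):
        # Placeholder for something like
        # "SELECT data FROM uploads WHERE name = ...".
        return UPLOADS[name]

    def serve_upload(name):
        # Real code would have to sanitize 'name' against path
        # traversal before joining it onto the cache directory.
        path = os.path.join(CACHE_DIR, name)
        if os.path.exists(path):
            with open(path, 'rb') as f:
                return f.read()       # cache hit: db never touched
        data = get_blob(name)         # cache miss: one db round trip
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(path, 'wb') as f:
            f.write(data)
        return data

A web server with a warm cache keeps serving files even with the db
down, and resynchronization reduces to comparing cache timestamps
against the database.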
Terrible idea? Good idea? Anyone have experience doing this kind of
thing?
Typical files are in the tens or low hundreds of kilobytes, with some
ranging up to the maximum allowed 2MB. The total size of uploads we
have is comparable to the total size of the current-version page texts,
and much smaller than the total old-revision page texts we've already
got in the db.
-- brion vibber (brion @ pobox.com)
PRIVATE - OFFLIST
Brion,
I worry that some vandal might learn from the Porsche incident and
replace some innocent picture with goatse.cx -- on some prominent page
like Mother Theresa or George Bush.
I hope we can close this vulnerability quickly (I'm more concerned about
this, than about my fanciful scheme for speeding up the database).
Ed Poor (PRIVATE - OFFLIST)
I would like to have my watchlist publicly viewable, and I would also
like to maintain (discreetly) a private, non-viewable watchlist.
Additionally, I would like to have a publicly EDITABLE watchlist:
like, "hey Ed, please watch this page!"
I'd even like to have multiple watchlists, each with:
* a name
* a setting for publicly viewable, or hidden
* a setting for publicly editable, or personally controlled
I need an NPOV watchlist, a "users fighting" watchlist, and a "global
warming" watchlist. Something like the sketch below would cover it.
How soon can we get this stuff, Tim/Brion/Magnus and crew? ^_^
Ed Poor
Um, Brion, I plead insanity. I didn't understand your summary of what
the software does.
I *think* you're saying the software already does what I, in my
Archimedean joy, had suggested.
Ed "dripping wet" Poor