Hello,
Due to a problem in one of our core routers in our Tampa cluster, we need
to perform some network maintenance tomorrow, Friday July 31st, around
12:00 UTC. We will be performing a software upgrade and reboot of the
router. This should not take more than a few minutes if everything goes
well. Unfortunately this means that practically all sites and services
will be down during that time.
For those interested: one of the line cards in the router failed earlier
this week. A replacement has arrived, but does not boot up correctly
after hot plugging. Because we want to upgrade the firmware anyway, we
will reboot the entire box.
Cheers,
--
Mark Bergsma <mark(a)wikimedia.org>
System & Network Administrator, Wikimedia Foundation
I decided to investigate how well parserTests exercises the MW code. So, I threw together a couple of MacGyver tools that use xdebug's code coverage capability and analyzed the results. The results are very, very preliminary, but I thought I would get them out so others can look them over. In the next couple of days I hope to post more detailed results and the tools themselves on the MediaWiki wiki. (If someone could tell me the appropriate page to use, that would be useful. Otherwise, I will just create a page in my own namespace.)
The statistics (again very preliminary) are:
Number of files exercised: 141
Number of lines in those files: 85606
Lines covered: 59489
Lines not covered: 26117
Percentage covered: 0.694916244188
So, parserTests is getting (at best) about 70% code coverage. This is better than I expected, but it still means parserTests does not test 26117 lines of code. What I mean by "at best" is that xdebug just notes whether a line of code is visited; it doesn't do any logic analysis on which branches are taken. Furthermore, parserTests may not visit some files that are critical to the operation of the MW software. Obviously, xdebug can only gather statistics on visited files.
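For anyone curious about the mechanics, here is a minimal sketch of an xdebug-based line-coverage harness; this is not the actual tool described above, and the commented-out entry point and output format are only illustrative assumptions:

<?php
// Ask xdebug to record executable-but-unexecuted lines (status -1) as well
// as the lines that actually ran (status 1).
xdebug_start_code_coverage(XDEBUG_CC_UNUSED);

// Run whatever should be exercised, e.g. the parser test suite.
// require 'maintenance/parserTests.php';   // illustrative placeholder

$coverage = xdebug_get_code_coverage();
xdebug_stop_code_coverage();

$total = 0;
$covered = 0;
foreach ($coverage as $file => $lines) {
    foreach ($lines as $lineNo => $status) {
        $total++;
        if ($status > 0) {   // line was executed at least once
            $covered++;
        }
    }
}

printf("Number of files exercised: %d\n", count($coverage));
printf("Lines covered: %d  Lines not covered: %d  Percentage covered: %f\n",
    $covered, $total - $covered, $covered / $total);

As noted above, this only tells you whether a line was reached at all; it says nothing about which branches or input combinations were exercised.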
I want to emphasize that there may be errors in these results due to bad assumptions on my part or bad coding. However, it is a place to start.
I haven't seen an announcement go out, but something seems to be
broken on secure.wikimedia.org again, at least talking to en.wp.
--
-george william herbert
george.herbert(a)gmail.com
I had a great talk today with Google open web standards evangelist Brad
Neuberg about one of his projects -- a high quality SVG-in-Flash adapter
which allows SVG graphics to be shown inline, complete with scripted
interactivity, in the vast majority of Internet Explorer clients.
Using Flash to implement open standards in older browsers always makes
me feel subversively happy... ;)
There are a lot of possibilities with SVG, which become much more
exciting if we can do in-browser previews and interactivity for a
majority of users. I've written up some notes on the tech blog:
http://techblog.wikimedia.org/2009/07/svg-for-all-with-flash/
-- brion
A couple quick notes I tossed up on the tech blog:
http://techblog.wikimedia.org/2009/07/intermittent-media-server-load-proble…
Domas thinks it's related to this problem with ZFS snapshots badly
affecting NFS server performance in some cases:
http://www.opensolaris.org/jive/thread.jspa?messageID=64379
Actual load from clients doesn't seem problematic, but the NFS horror
can cause things to time out badly, which sometimes affects the main
apaches as well as the image scalers. (Especially when, say, deleting a
category of 100 image pages. :)
We've got it behaving reasonably well at the moment, but we'll want to
keep an eye on things until we've reduced the coupling between things a
bit...
-- brion
It already has wmf hacks :) IMHO, it's not a branch
outside people should use.
-Chad
On Jul 17, 2009 1:51 AM, "Tisza Gergő" <gtisza(a)gmail.com> wrote:
Brion Vibber <brion <at> wikimedia.org> writes: > Since we seem to routinely
have our deployment li...
Will this work for other installations too, or can it contain WMF-specific
hacks?
Hi!
Currently, when an editor creates an interwiki link on some wiki project
using [[Lang:Project:Page|Text]], readers get a link to "Page" on
"Project" (in the "Lang" language), and "Text" is shown as the link text.
Besides this:
1) The "title" attribute of the <a> element is equals to (something
not very intuitive for readers): [[Lang:Project:Page]].
2) If "Text" is equals to "Page", and we want/need to show only the
"Page" text, it is needed to use [[Lang:Project:Page|Page]] (although
for local links [[A very long page name|A very long page name]] can be
abbreviated to [[A very long page name]])
3) When the target page doesn't exists, the reader go to a page
showing the system message "MediaWiki:Noarticletext".
I have some considerations:
1) What if the title attribute could be something more descriptive,
like "Search at Project (in Lang) for pages related to 'Page'"?
The exact text could be translated for each language (or according
to the target project), in such a way that "Page", the real name of
"Project" (Wikipedia instead of "w", etc.) and the real name of
"Language" could be variables $1, $2, etc.;
2) For example, if on a wikibook page we need several links to
Wiktionary and Wikinews, we would have to double the length of each
link (the number of characters to be typed) just to show only the
"Page" text. This is also bad for edit summaries, where we cannot use
long text. Perhaps another _short_ syntax could be used to show "Page"
instead of "Lang:Project:Page" without having to type the (possibly
very long) "Page" twice (maybe the prefixed colon
[[:Lang:Project:Page]]?), in order to facilitate the integration
between the content of the projects;
3) Compare the following:
3.1) http://en.wikipedia.org/wiki/Education_collaboration
3.2) http://en.wikipedia.org/wiki/Special:Search/education_collaboration
and note that:
3.3) If the article "Education collaboration" exists, both links
carry the reader to it.
3.4) If not, 3.1 could be frustrating for a reader who clicked the
link expecting to get (immediately) more information related to what
he was reading (remember, not every reader wants to become a
collaborator, and that is fine). There are various reasons why the
reader might get the "Noarticletext" message, as in 3.1:
3.4.1) If the editor who created the interwiki link wrongly typed
"colaboration" in his edit, the reader will get:
http://en.wikipedia.org/wiki/Education_colaboration
Note that in this case, something like 3.2 is better than 3.1, for the reader:
http://en.wikipedia.org/wiki/Special:Search/education_colaboration
I mean, he gets a "Did you mean: education collaboration" message and
still has some related links shown in the search results. While the
reader might have preferred to go straight to a Wikipedia article about
education and collaboration, and because of a typo (which could take a
long time to be corrected) he didn't, he could be happy to find
another article in the results (maybe one more interesting than the
article suggested by the editor who created the "wrong" link).
3.4.2) When writing, the editor simply imagines "There could be
something about this at Wikipedia, and some news at Wikinews, let me
create [[w:whis|this]] and [[n:whis|this]] links...", but instead of
searching Wikipedia and Wikinews for the exact text to put in each
link, he prefers to make a link using "keywords" like "education" and/or
"collaboration". But these [interwiki] links are not "red [interwiki]
links", so they could stay exactly as they were created for a long
time, and bother some readers in the meantime...
3.4.3) Some editor searches for "Education And Collaboration" at
Wikipedia and finds a page (say, "Education and collaboration"). So he
creates a link (by copy and paste) pointing to "Education And
Collaboration", which does not exist at Wikipedia because of the
capital "A" and "C". Then, although the search engine is
case-insensitive and the editor found the article when he _searched_
for it at Wikipedia, the reader who _clicks_ the link will not go
directly to the article.
3.4.4) And so on...
These enhancements can be achieved easily by means of a
template with code similar to this:
---- Begin of Template:Wikt ----
[[wikt:Special:Search/{{{1}}}|<span title="Search at Wiktionary the
meaning of '{{{1}}}'">{{{2|{{{1}}}}}}</span>]]
---- End of Template:Wikt ----
then, we can use:
* {{wikt|word|text}} instead of [[wikt:word|text]]
* {{wikt|word}} instead of [[wikt:word|word]]
and each of the resulting links points to
* http://en.wiktionary.org/wiki/Special:Search/word
instead of
* http://en.wiktionary.org/wiki/word
Note that with "Special:Search/" links, the reader still gets a red
link for easily create the page (if he wants to became an editor)
when no result is found.
I would like to hear from you if, besides the need of use a template
syntax just for create links, is there any other (technical/practical)
disadvantages (advantages?) of using such behavior in the interwiki
links (with or without using templates), and also if any of them is
easy (and/or of interest) to implement internally in the MediaWiki.
Thanks,
Helder
Georgi Kobilarov wrote:
>> In this particular one, it's two articles about the
>> same
>> topic, but there could be some cases where the two articles are about
>> something different.
>>
>
> Yes, such as http://en.wikipedia.org/wiki/FROG
> and http://en.wikipedia.org/wiki/Frog
>
> I agree that this can be annoying. One has to make sure not to lose the
> case information (as happened to me with lookup.dbpedia.org once, hence
> merging FROG and Frog).
>
> But what do you suggest to do about that, Paul? Should Wikipedia make URLs
> case-insensitive and then enforce disambiguation with ()?
>
If (wikipedia) were my site, I'd do two things:
(i) map all case-variant forms to a single form (New yOrK cITy -> New
York City); "FROG" gets renamed to "FROG Cipher" or "Frog (Cipher)"
(ii) do a permanent redirect from variant forms to the canonical form
I think what dbpedia is doing is reasonable considering the situation.
My own system for handling generic databases has both a VARBINARY
and VARCHAR field for dbpedia URLs/labels. It does a case-insensitive
lookup first, and if that fails, looks at the alternatives that turn
up. It's also got some heuristics for dealing with redirects,
disambiguation, and all that. In the big picture I see "naming and
identity" as a specific functional module for this kind of system...
Hi All,
Commons.wikimedia.org is growing and provides a quite complete set
of media files, including a lot of interesting historical documents.
Contributors are relying on the availability and persistence of
commons.wikimedia.org, but currently the full export is only
available on download.wikimedia.org (OK, not today ;-).
I was wondering if it would be possible to allow web robots to access
http://upload.wikimedia.org/wikipedia/commons/ to gather and mirror
the media files. As this is pure HTTP, the mirroring could benefit from
the caching mechanisms of HTTP objects (instead of having a large dump
containing all the media files, which is more difficult to cache/update).
Maybe this could allow a more distributed backup approach to ensure
the resilience of commons.wikimedia.org?
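As a rough illustration of the caching idea (not an existing tool; the function name, URL handling and local paths are only placeholders), a mirroring script could use HTTP conditional requests so that unchanged objects are never re-downloaded:

<?php
// Re-fetch a file only if the server says it changed since our local copy
// (conditional GET via If-Modified-Since).
function mirrorFile($url, $localPath) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    if (file_exists($localPath)) {
        // Ask for 304 Not Modified if our copy is still current.
        curl_setopt($ch, CURLOPT_TIMECONDITION, CURL_TIMECOND_IFMODSINCE);
        curl_setopt($ch, CURLOPT_TIMEVALUE, filemtime($localPath));
    }

    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($status === 200 && $body !== false) {
        file_put_contents($localPath, $body);   // new or updated object
    }
    // A 304 response means the local copy is up to date; nothing to do.
}

The same conditional requests are what intermediate HTTP caches can answer, which is what makes per-object mirroring cheaper to keep current than refreshing one huge dump.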
Thanks a lot for your work,
adulau
--
-- Alexandre Dulaunoy (adulau) -- http://www.foo.be/
-- http://www.foo.be/cgi-bin/wiki.pl/Diary
-- "Knowledge can create problems, it is not through ignorance
-- that we can solve them" Isaac Asimov