Wikitech-l May 2007

wikitech-l@lists.wikimedia.org

89 participants
135 discussions

Re: [Wikitech-l] [MediaWiki-CVS] SVN: [22600] trunk/phase3/includes

by Rob Church

On 31/05/07, aaron(a)svn.wikimedia.org <aaron(a)svn.wikimedia.org> wrote:
> Revision: 22600
> Author: aaron
> Date: 2007-05-31 09:01:26 -0700 (Thu, 31 May 2007)
>
> Log Message:
> -----------
> *Add BeforeGalleryFindFile, TitleLinkUpdatesAfterCompletion, BeforeParserFetchTemplateAndtitle, BeforeParserMakeImageLinkObj, BeforeParserrenderImageGallery; make parser outputs and output page record images -> timestamps used and templates -> revision ids

* Release notes
* Hook documentation file

Rob Church

16 years, 11 months

wikihow gets a new look

by Travis Derouin

We upgraded to 1.9 yesterday and launched a new design if anyone is interested in checking it out: http://www.wikihow.com/Main-Page We hit the 20,000 article mark just yesterday as well, and should pass 5 million unique readers for the month of May. Thanks, Travis

16 years, 11 months

Customising sidebar for non/logged-in users

by Brianna Laugher

Hi, There's no way at present to customise the sidebar display depending on whether the user is logged in, is there? Would it be possible, given traffic/caching/whatever? It would certainly be handy to be able to give more relevant links to different audiences. cheers, Brianna user:pfctdayelise

16 years, 11 months

MediaWiki automated test run failure 2007-05-31

by brion＠pobox.com

An automated run of parserTests.php showed the following failures:

This is MediaWiki version 1.11alpha (r22594).

Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...

30 previously passing test(s) now FAILING! :(

* Template with thumb image (with link in description) [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Simple image [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Right-aligned image [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Image with caption [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Image with frame and link [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Frameless image caption with a free URL [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Thumbnail image caption with a free URL [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 1887: A ISBN with a thumbnail [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 1887: A RFC with a thumbnail [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 1887: A mailto link with a thumbnail [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 1887: A <math> with a thumbnail- we don't render math in the parsertests by default, so math is not stripped and turns up as escaped <math> tags. [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 648: Frameless image caption with a link [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 648: Frameless image caption with a link (suffix) [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 648: Frameless image caption with an interwiki link [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 648: Frameless image caption with a piped interwiki link [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Escape HTML special chars in image alt text [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 499: Alt text should have &#1234;, not &amp;1234; [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Image caption containing another image [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Image caption containing a newline [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Bug 3090: External links other than http: in image captions [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 1219 URL next to image (good) [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* BUG 1219 URL next to image (broken) [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Media link [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Media link with text [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Media link with nasty text fixme: doBlockLevels won't wrap this in a paragraph because it contains a div [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Media link to nonexistent file (bug 1702) [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Centre-aligned image [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* None-aligned image [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* Width + Height sized image (using px) (height is ignored) [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]
* <references> after <gallery> (bug 6164) [Introduced between 30-May-2007 07:15:22, 1.11alpha (r22553) and 31-May-2007 07:15:31, 1.11alpha (r22594)]

18 still FAILING test(s) :(

* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Fuzz testing: image with bogus manual thumbnail [Introduced between 08-Apr-2007 07:15:22, 1.10alpha (r21099) and 25-Apr-2007 07:15:46, 1.10alpha (r21547)]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]

Passed 465 of 513 tests (90.64%)... 48 tests failed!

16 years, 11 months

A cross-lingual Wikipedia search engine

by Tian-Jian "Barabbas" Jiang＠Gmail

Hi all, My friend Prof. Wu and his students developed Wikigazer ( http://wil.csie.cyut.edu.tw/Wikigazer.php?hl=en ), a cross-lingual Wikipedia search engine based on Lucene. I would like to know whether you find it useful, and whether it is fast enough from your location. Thank you! Best Regards, /Mike/

16 years, 11 months

Re: [Wikitech-l] lucene search 2.0 test webinterface

by Tian-Jian "Barabbas" Jiang＠Gmail

Hi Robert,

Yes, there are multiple queries. In my scenario, "precision first" usually implies that the number of returned results is limited. Users may not have the patience either to wait for responses or to read through pages of results. That's why I prefer a sequential process to a parallel one: I can guess a small and possibly precise result set first, then query for more if the result set seems too small, i.e. the recall is not high enough.

For example, a query in Chinese applies a word-based analyzer first, with a limit of, say, 1000:

    static int m_limit = 1000;

    // First pass: word-based query, capped at m_limit hits
    Query query = _a_word_based_Chinese_query_here_;
    ArrayList<MyResult> resultList = new ArrayList<MyResult>();
    TopDocs topDocs = m_standardSearcher.search(query, (Filter)null, m_limit);
    for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
        // Keep each document together with its score
        Document doc = m_standardSearcher.doc(scoreDoc.doc);
        float score = scoreDoc.score;
        MyResult aResult = new MyResult(doc, score);
        resultList.add(aResult);
    }

If the size of resultList does not reach 1000, another, character-based query is fired to get more results, up to (1000 - current size). It's a very simple heuristic and proved to be fast enough on a single P4 2GHz machine with 2GB RAM, serving a 3GB Lucene index. Results are returned within 1 second on average.

The problem with multiple, parallel, or distributed Lucene queries is that merged scores may not be comparable, especially when the indexes use different tokenization strategies. You may also be interested in http://issues.apache.org/jira/browse/NUTCH-92 , http://hellonline.com/blog/?p=55 , and http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg12709.html

Thank you!

Cheers,
/Mike/

Robert Stojnic wrote:
>
> Hm, wouldn't that require running multiple queries for a single user
> query? If I'm understanding it correctly it refines search by trying
> different queries, and merges the results?
> For the wikipedia system, speed is of utmost importance, since it's
> a high traffic site, and has very few resources (compared to other
> sites of same traffic).
>
> r.
>
> On 5/23/07, *Tian-Jian Barabbas Jiang@Gmail* <barabbas(a)gmail.com
> <mailto:barabbas@gmail.com>> wrote:
>
>     Although I bet you have already done it, here's my 2 cents:
>     I usually adapt a concept to my IR system:
>     Precision first, Recall next.
>     For example, my system may do exact match first, get the results from
>
>     searcher.doc(topDocs.scoreDocs[i].doc)
>
>     and save them externally.
>     It allows me to merge some more partial matched results later.
>     Apparently these can be done by something like parallel
>     queries, but I like to merge them sequentially by myself.
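The "precision first, recall next" fallback described in this message can be sketched without Lucene as follows. This is only an illustration under stated assumptions: the class `SequentialSearch`, its `search` method, and the two searcher functions are hypothetical stand-ins (they are not Lucene APIs and not the poster's actual code), and document ids are plain strings. Unlike the message, which asks the second query for only (limit - current size) hits, this sketch asks for up to `limit` and skips duplicates, so that overlapping hits do not leave the result short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

public class SequentialSearch {

    /**
     * Runs the precise searcher first. Only if it yields fewer than
     * `limit` hits is the broader searcher run to fill the remaining slots.
     * Each searcher maps (query, maxHits) to an ordered list of document ids.
     * Both searchers are hypothetical placeholders for real index lookups.
     */
    public static List<String> search(
            String query,
            int limit,
            BiFunction<String, Integer, List<String>> preciseSearcher,
            BiFunction<String, Integer, List<String>> broadSearcher) {
        // LinkedHashSet keeps first-seen order and silently drops duplicates.
        Set<String> merged = new LinkedHashSet<>();
        for (String id : preciseSearcher.apply(query, limit)) {
            if (merged.size() >= limit) {
                break;
            }
            merged.add(id);
        }
        // Second pass only when the precise pass left room.
        if (merged.size() < limit) {
            for (String id : broadSearcher.apply(query, limit)) {
                if (merged.size() >= limit) {
                    break;
                }
                merged.add(id);
            }
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Canned results standing in for a word-based and a character-based index.
        BiFunction<String, Integer, List<String>> precise =
                (q, n) -> Arrays.asList("doc1", "doc2");
        BiFunction<String, Integer, List<String>> broad =
                (q, n) -> Arrays.asList("doc2", "doc3", "doc4", "doc5");
        System.out.println(search("query", 4, precise, broad));
        // prints [doc1, doc2, doc3, doc4]
    }
}
```

Because precise hits are always placed ahead of broad hits, the sketch never compares scores across the two indexes, which sidesteps the score-merging concern raised in the message at the cost of a fixed ordering between the two passes.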

16 years, 11 months

Re: [Wikitech-l] lucene search 2.0 test webinterface

by Tian-Jian "Barabbas" Jiang＠Gmail

Hi all,

Tian-Jian "Barabbas" Jiang@Gmail wrote:
> It's a very simple heuristic and proved to be fast enough on single
> P4 2GHz
> machine with 2GB RAM, which served for a 3GB Lucene index file. Results
> returned within 1 sec, in average.

BTW, in case you are not satisfied with 1 sec, there is some reasonable room for improvement:

1. My environment is, actually, lame. It is a Windows 2003 box, without the benefit of an inode-like file system for Lucene's index file format. Worse, due to the lack of storage space, some secondary index files may be located on separate machines and accessed via SLOW Windows network disks.

2. There's neither pagination nor caching in my system.

I bet anyone who has a FreeBSD 6.1 server with nice pagination will get much better performance than me. ;-)

Regards,
/Mike/
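As a rough illustration of the second point, pagination and caching can be layered in front of any searcher. The sketch below is an assumption-laden example, not the poster's system: `PagedResultCache`, its `page` method, and the searcher function are hypothetical names, and the cache is a small least-recently-used map built on the standard `LinkedHashMap`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PagedResultCache {
    private final Function<String, List<String>> searcher;
    private final Map<String, List<String>> cache;

    /**
     * @param maxEntries number of distinct queries whose results are kept
     * @param searcher   hypothetical function running the real (slow) search
     */
    public PagedResultCache(final int maxEntries,
                            Function<String, List<String>> searcher) {
        this.searcher = searcher;
        // accessOrder=true orders entries from least to most recently used,
        // so removeEldestEntry evicts the least recently used query.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns one page of results, running the searcher only on a cache miss. */
    public synchronized List<String> page(String query, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        List<String> all = cache.get(query);
        if (all == null) {
            all = searcher.apply(query);
            cache.put(query, all);
        }
        // Clamp both bounds so a page past the end is simply empty.
        int from = Math.min(pageIndex * pageSize, all.size());
        int to = Math.min(from + pageSize, all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```

With this arrangement a reader paging through results for the same query triggers only one underlying search, and only one page of documents needs to be rendered per request.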

16 years, 11 months

MediaWiki automated test run failure 2007-05-30

by brion＠pobox.com

An automated run of parserTests.php showed the following failures:

This is MediaWiki version 1.11alpha (r22553).

Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...

18 still FAILING test(s) :(

* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Fuzz testing: image with bogus manual thumbnail [Introduced between 08-Apr-2007 07:15:22, 1.10alpha (r21099) and 25-Apr-2007 07:15:46, 1.10alpha (r21547)]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]

Passed 495 of 513 tests (96.49%)... 18 tests failed!

16 years, 11 months

Wikisource in Old English language closed

by River Tarnell

with the resolution of bug #10059 [0], the Old English Wikisource has been locked to edits. this was proposed at [1] and received broad agreement. - river. [0] http://bugzilla.wikimedia.org/show_bug.cgi?id=10059 [1] http://meta.wikimedia.org/wiki/Proposals_for_closing_projects/Closure_of_Ol…

16 years, 11 months

New subversion account

by Brion Vibber

Added SVN account for Daniel Cannon (amidaniel), working on API and other stuff. :)

-- brion vibber (brion @ wikimedia.org)

16 years, 11 months
