Hi,
Is there a way to keep the autonumbering of external links in the right
order when I call the parser from within an XML-style tag extension?
The parser first numbers all the links that I have it parse inside the
extension, and only afterwards handles the "normal" links that appear on
the page. As a result the numbering is out of order on the output page
(although there are no gaps or duplicate numbers).
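For reference, here is a minimal sketch of what I mean; the tag name and
function names are illustrative, not my actual code:

    $wgExtensionFunctions[] = 'wfExampleTagSetup';

    function wfExampleTagSetup() {
        global $wgParser;
        $wgParser->setHook( 'example', 'wfExampleTagRender' );
    }

    function wfExampleTagRender( $input, $args, &$parser ) {
        // Parsing the tag body here assigns autonumbers to its external
        // links immediately, before the links in the surrounding page
        // text have been seen.
        return $parser->recursiveTagParse( $input );
    }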
Thank you very much.
Greets
Christoph
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.11alpha (r21819).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
1 new PASSING test(s) :)
* TOC regression (bug 9764) [Has never failed]
18 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Fuzz testing: image with bogus manual thumbnail [Introduced between 08-Apr-2007 07:15:22, 1.10alpha (r21099) and 25-Apr-2007 07:15:46, 1.10alpha (r21547)]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 494 of 512 tests (96.48%)... 18 tests failed!
Unicode's bidirectional algorithm often fails where there are RTL
characters, LTR characters and neutrals such as punctuation in the same
paragraph. Often this can be fixed by a liberal sprinkling of either the RLM
character (in base RTL text) or the LRM character (in base LTR text).
Putting these characters directly into the article text makes such changes
difficult to review and edit, since they are invisible in the edit box in
major browsers. A better solution is to use HTML's &lrm; and &rlm;
character entities.
By happy coincidence, &lrm; has roughly the same effect in the edit box as
it does in display, because the Latin characters "lrm" are of strong
left-to-right type, just like the control character they represent. The
same is not so for &rlm;: in cases where &rlm; is used, the text remains
broken on edit while being fixed on display. Here's an example:
http://he.wikipedia.org/wiki/ACID
What I propose is that someone should come up with a translation of "rlm"
into Hebrew, Arabic or both, and that we should implement this artificial
character entity in the MediaWiki parser.
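To sketch the idea (the spellings and names below are placeholders, not a
finished implementation): the entity decoder would consult an alias table
that maps the localized names onto the standard ones, something like:

    // Placeholder spellings; the real translations are exactly what
    // needs to be decided. The variable name is illustrative.
    $wgHtmlEntityAliases = array(
        'רלמ' => 'rlm', // Hebrew
        'لرم' => 'lrm', // Arabic
    );

    // Called on each &name; candidate before the normal entity lookup.
    function wfResolveEntityAlias( $name ) {
        global $wgHtmlEntityAliases;
        return isset( $wgHtmlEntityAliases[$name] )
            ? $wgHtmlEntityAliases[$name]
            : $name;
    }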
-- Tim Starling
Hello,
Would someone be able to run a database query for me to retrieve a list of new Wikipedia articles created on February 8, 2007, on http://en.wikipedia.org? Or could you tell me whom I should contact about this?
Any help would be greatly appreciated.
Thanks,
Katherine
Deathphoenix wrote:
> I have had some success with Lancaster University. I originally slapped one of
> their proxies with a 6 month AO block due to persistent, long term
> vandalism, but one of the sysadmins contacted me and told me they have XFF
> headers. After some fruitful discussion/negotiation, I removed the block and
> put up a header on the talk pages for their four proxies asking anyone who
> blocks the IP (or issues a warning) to also send an email to their abuse
> email, or to ask me to send an email. FYI, I have links to the four proxies
> at [[User talk:Deathphoenix/Lancaster]] (the IP talk page header is at
> [[User:Deathphoenix/Lancaster]]).
>
[snip]
>
> My suggestions for the school network admins and staff would be:
>
> 1. Implement XFF headers and make sure students have to log in using a
> unique user ID (easiest would be based on student number) before using
> school computers.
On the subject of XFF ("X-Forwarded-For") headers, I'd like to note a
few important technical details that one should keep in mind:
1. Having a proxy provide XFF headers isn't enough; the address of the
proxy also needs to be added to the list of trusted proxies that
Wikimedia servers will accept such headers from. That's because such
headers would otherwise be trivially easy to fake. To get an address
added to the list, you can post a request on [[meta:Talk:XFF project]]
or contact a developer with shell access (such as Tim Starling, who's
been doing most of the work on the XFF project) directly.
2. One of the requirements for getting a proxy added to the trusted list
is that the individual computers behind it have public IP addresses of
their own. If the school network is using [[private IP addresses]]
internally, XFF headers won't help.
3. Once the address of a proxy has been added to the trusted XFF list,
no edits should be seen from that address ever again, and blocking the
address of the proxy should have no effect. That's because, as far as
MediaWiki is concerned, the edits made via that proxy will no longer be
seen as coming from the proxy, but from the IP address of the computer
behind the proxy (a sketch of this logic appears below, after point 4).
I'll repeat that, since it's important: Once a proxy is on the trusted
XFF list, *any blocks on it will have no effect*.
4. If the computers behind the proxy are public workstations in, say, a
school computer lab, XFF headers may not help prevent vandalism much.
By making edits from different workstations appear to come from
different IPs, they may reduce the collateral damage from blocking one
workstation; but if the vandals can simply switch to another computer,
this may end up doing more harm than good. At best, XFF headers may make
tracking down the vandals easier, provided the school requires users to log
in to workstations and keeps logs of who used which workstation when; this
is often the case at college-level schools, but much less so at high
schools or even elementary schools.
That last point is also important: to catch vandals, it's not enough
that students log in; it's also necessary to keep a log of who used
which workstation when _and_ to make said log available to whoever is
tasked with handling network abuse issues. Of course, there are
significant privacy issues here that need to be considered too.
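To make point 3 concrete, here is a simplified sketch of the trust-walk
involved. This is not MediaWiki's actual code, and wfIsTrustedProxy()
stands in for whatever trusted-list lookup is really used:

    // Walk the X-Forwarded-For chain from right to left, stepping past
    // an address only if we trust it; an untrusted hop could have put
    // anything into the header, so we stop there.
    function wfClientIpFromXff( $remoteAddr, $xffHeader ) {
        $chain = array_map( 'trim', explode( ',', $xffHeader ) );
        $client = $remoteAddr;
        for ( $i = count( $chain ) - 1; $i >= 0; $i-- ) {
            if ( !wfIsTrustedProxy( $client ) ) {
                break;
            }
            $client = $chain[$i];
        }
        return $client;
    }

    // Hypothetical lookup against the trusted proxy list.
    function wfIsTrustedProxy( $ip ) {
        global $wgTrustedXffProxies;
        return in_array( $ip, $wgTrustedXffProxies );
    }

This is also why blocking a listed proxy has no effect: its address is
always stepped past and never returned as the client.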
So, to summarize, XFF headers are only useful for catching school
vandals if the school has:
1. their proxy/ies listed in the trusted XFF list,
2. public IP addresses for each workstation,
3. workstations requiring students to log in to use them,
4. a log of who was using which workstation when, and
5. a person with access to said log who can handle complaints.
Of course, it should go without saying that the contact information for
the person or department responsible for handling net abuse issues must
also be easy to find, if it's to do anyone any good.
(This is all based on my understanding of the XFF implementation in
MediaWiki as it was when I last looked at it. If you find any incorrect
or outdated information above, please correct me. To increase the odds
of this happening, I've crossposted this to wikitech-l in addition to
wikien-l.)
--
Ilmari Karonen
Dear wikitechnicians,
I'm looking for historical statistics on article views per day on the
English Wikipedia. I've spent a good bit of time wading around Wikipedia
and Wikimedia but have begun to feel like I'm going in circles.
I have located:
http://stats.wikimedia.org/EN/TablesUsagePageRequest.htm
which is pretty ideal but is missing data from November 2005 to present.
I have also located:
http://en.wikipedia.org/wiki/Wikipedia:Awareness_statistics
which contains a "page views per million" statistic that has the
necessary long-term reach, but it covers all Wikipedia traffic and is
relative to a somewhat mysteriously selected user sample.
I would be very appreciative of suggestions or pointers to information.
This will inform ongoing research here at the University of Minnesota.
Many thanks,
Reid
The interwiki map is the subject of a long and stupid discussion at
present; for a précis, see the discussion at:
http://meta.wikimedia.org/wiki/Talk:Interwiki_map#Inclusion_criteria_clarif…
Now, the question is: Do we have any way to gather usage statistics
for interwiki links? How can we tell when a link is not in fact being
used? If a link is to be removed from the interwiki map, how can we be
sure to fix the damage?
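Absent anything built in, one rough approach would be to scan a
pages-articles dump and tally how often each interwiki prefix actually
appears. A sketch (the filename and prefix list are illustrative; the
prefix list is what keeps ordinary namespace links like [[Category:...]]
out of the counts):

    $prefixes = array_flip( array( 'wiktionary', 'wikibooks', 'commons', 'meta' ) );
    $counts = array();
    $fh = fopen( 'enwiki-pages-articles.xml', 'r' );
    while ( ( $line = fgets( $fh ) ) !== false ) {
        if ( preg_match_all( '/\[\[([A-Za-z][A-Za-z-]*):/', $line, $m ) ) {
            foreach ( $m[1] as $p ) {
                $p = strtolower( $p );
                if ( isset( $prefixes[$p] ) ) {
                    $counts[$p] = isset( $counts[$p] ) ? $counts[$p] + 1 : 1;
                }
            }
        }
    }
    fclose( $fh );
    arsort( $counts );
    print_r( $counts );

Of course, this only shows whether a prefix appears in wikitext, not
whether anyone follows the links; click-through data would need log access.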
- d.
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.11alpha (r21791).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
18 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Fuzz testing: image with bogus manual thumbnail [Introduced between 08-Apr-2007 07:15:22, 1.10alpha (r21099) and 25-Apr-2007 07:15:46, 1.10alpha (r21547)]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 493 of 511 tests (96.48%)... 18 tests failed!
Guys, this is really amazing! I've never seen such quick response. Thank you
very much. It makes a huge difference - the Chinese dump alone is up from
250MB 7zipped in December to 346MB 7zipped in April - that's a lot of new
knowledge!! :)
All I could ask for now is that you consider at least making a version of
the English dump (which is at least twice the size of the next biggest
wiki) containing only article pages.
Thank you so much
Stian
Hello,
I am conducting a research project on Wikipedia as part of my master's program, and I'm wondering if there is a way to access a list of new articles/pages created on a specific date. I tried using the New Pages page (http://en.wikipedia.org/wiki/Special:Newpages) to go back in time, but I need to go back to February 2007, and considering how many pages are added each day, this method is rather time-consuming and imprecise. Is there a way to retrieve a history of new pages created on a specific date?
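(For reference, a sketch of the kind of query this would take for someone
with database access, using the standard MediaWiki schema: a page's
creation time is the timestamp of its earliest revision. Pages deleted
since then won't show up, and the scan is expensive.)

    $dbr = wfGetDB( DB_SLAVE );
    $sql = "SELECT page_namespace, page_title,
                   MIN(rev_timestamp) AS created
            FROM page
            JOIN revision ON rev_page = page_id
            WHERE page_namespace = 0
            GROUP BY page_id
            HAVING created LIKE '20070208%'";
    $res = $dbr->query( $sql );
    while ( $row = $dbr->fetchObject( $res ) ) {
        print "$row->page_title\t$row->created\n";
    }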
Thanks,
Katherine