Hi,
Is there a way to keep the autonumbering of external links in the right
order when I call the parser from within an XML-style tag extension?
The parser first numbers all the links that I have it parse inside the
extension, and only afterwards handles the "normal" links that appear on
the page. As a result the numbering is out of order on the output page
(although there are no gaps or duplicate numbers).
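For reference, here is a minimal sketch of what I mean; the tag name and
function names are illustrative, not my actual code:

    $wgExtensionFunctions[] = 'wfExampleTagSetup';

    function wfExampleTagSetup() {
        global $wgParser;
        $wgParser->setHook( 'example', 'wfExampleTagRender' );
    }

    function wfExampleTagRender( $input, $args, &$parser ) {
        // Parsing the tag body here assigns autonumbers to its external
        // links immediately, before the links in the surrounding page
        // text have been seen.
        return $parser->recursiveTagParse( $input );
    }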
Thank you very much.
Greets
Christoph
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.11alpha (r21819).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
1 new PASSING test(s) :)
* TOC regression (bug 9764) [Has never failed]
18 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Fuzz testing: image with bogus manual thumbnail [Introduced between 08-Apr-2007 07:15:22, 1.10alpha (r21099) and 25-Apr-2007 07:15:46, 1.10alpha (r21547)]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 494 of 512 tests (96.48%)... 18 tests failed!
Unicode's bidirectional algorithm often fails where there are RTL
characters, LTR characters and neutrals such as punctuation in the same
paragraph. Often this can be fixed by a liberal sprinkling of either the RLM
character (in base RTL text) or the LRM character (in base LTR text).
Putting these characters directly into the article text makes such changes
difficult to review and edit, since they are invisible in the edit box in
major browsers. A better solution is to use HTML's &lrm; and &rlm;
character entities.
By happy coincidence, &lrm; has roughly the same effect in the edit box as
it does in display, because the Latin characters "lrm" are of strong
left-to-right type, just like the control character they represent. The
same is not so for &rlm;: in cases where &rlm; is used, the text remains
broken on edit while being fixed on display. Here's an example:
http://he.wikipedia.org/wiki/ACID
What I propose is that someone should come up with a translation of "rlm"
into Hebrew, Arabic or both, and that we should implement this artificial
character entity in the MediaWiki parser.
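To sketch the idea (the spellings and names below are placeholders, not a
finished implementation): the entity decoder would consult an alias table
that maps the localized names onto the standard ones, something like:

    // Placeholder spellings; the real translations are exactly what
    // needs to be decided. The variable name is illustrative.
    $wgHtmlEntityAliases = array(
        'רלמ' => 'rlm', // Hebrew
        'لرم' => 'lrm', // Arabic
    );

    // Called on each &name; candidate before the normal entity lookup.
    function wfResolveEntityAlias( $name ) {
        global $wgHtmlEntityAliases;
        return isset( $wgHtmlEntityAliases[$name] )
            ? $wgHtmlEntityAliases[$name]
            : $name;
    }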
-- Tim Starling
Hello,
Would someone be able to run a database query for me to retrieve a list of new Wikipedia articles created on February 8, 2007, on http://en.wikipedia.org? Or could you tell me whom I should contact about this?
Any help would be greatly appreciated.
Thanks,
Katherine
Deathphoenix wrote:
> I have had some success with Lancaster University. I originally slapped one of
> their proxies with a 6 month AO block due to persistent, long term
> vandalism, but one of the sysadmins contacted me and told me they have XFF
> headers. After some fruitful discussion/negotiation, I removed the block and
> put up a header on the talk pages for their four proxies asking anyone who
> blocks the IP (or issues a warning) to also send an email to their abuse
> email, or to ask me to send an email. FYI, I have links to the four proxies
> at [[User talk:Deathphoenix/Lancaster]] (the IP talk page header is at
> [[User:Deathphoenix/Lancaster]]).
>
[snip]
>
> My suggestions for the school network admins and staff would be:
>
> 1. Implement XFF headers and make sure students have to log in using a
> unique user ID (easiest would be based on student number) before using
> school computers.
On the subject of XFF ("X-Forwarded-For") headers, I'd like to note a
few important technical details that one should keep in mind:
1. Having a proxy provide XFF headers isn't enough; the address of the
proxy also needs to be added to the list of trusted proxies that
Wikimedia servers will accept such headers from. That's because such
headers would otherwise be trivially easy to fake. To get an address
added to the list, you can post a request on [[meta:Talk:XFF project]]
or contact a developer with shell access (such as Tim Starling, who's
been doing most of the work on the XFF project) directly.
2. One of the requirements for getting a proxy added to the trusted list
is that the individual computers behind it have public IP addresses of
their own. If the school network is using [[private IP addresses]]
internally, XFF headers won't help.
3. Once the address of a proxy has been added to the trusted XFF list,
no edits should be seen from that address ever again, and blocking the
address of the proxy should have no effect. That's because, as far as
MediaWiki is concerned, the edits made via that proxy will no longer be
seen as coming from the proxy, but from the IP address of the computer
behind the proxy (a sketch of this logic appears below, after point 4).
I'll repeat that, since it's important: Once a proxy is on the trusted
XFF list, *any blocks on it will have no effect*.
4. If the computers behind the proxy are public workstations in, say, a
school computer lab, XFF headers may not help prevent vandalism much.
By making edits from different workstations appear to come from
different IPs, they may reduce the collateral damage from blocking one
workstation; but if the vandals can simply switch to another computer,
this may end up doing more harm than good. At best, XFF headers may make
tracking down the vandals easier, provided the school requires users to log
in to workstations and keeps logs of who used which workstation when; this
is often the case at college-level schools, but much less so at high
schools or even elementary schools.
That last point is also important: to catch vandals, it's not enough
that students log in; it's also necessary to keep a log of who used
which workstation when _and_ to make said log available to whoever is
tasked with handling network abuse issues. Of course, there are
significant privacy issues here that need to be considered too.
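To make point 3 concrete, here is a simplified sketch of the trust-walk
involved. This is not MediaWiki's actual code, and wfIsTrustedProxy()
stands in for whatever trusted-list lookup is really used:

    // Walk the X-Forwarded-For chain from right to left, stepping past
    // an address only if we trust it; an untrusted hop could have put
    // anything into the header, so we stop there.
    function wfClientIpFromXff( $remoteAddr, $xffHeader ) {
        $chain = array_map( 'trim', explode( ',', $xffHeader ) );
        $client = $remoteAddr;
        for ( $i = count( $chain ) - 1; $i >= 0; $i-- ) {
            if ( !wfIsTrustedProxy( $client ) ) {
                break;
            }
            $client = $chain[$i];
        }
        return $client;
    }

    // Hypothetical lookup against the trusted proxy list.
    function wfIsTrustedProxy( $ip ) {
        global $wgTrustedXffProxies;
        return in_array( $ip, $wgTrustedXffProxies );
    }

This is also why blocking a listed proxy has no effect: its address is
always stepped past and never returned as the client.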
So, to summarize, XFF headers are only useful for catching school
vandals if the school has:
1. their proxy/ies listed in the trusted XFF list,
2. public IP addresses for each workstation,
3. workstations requiring students to log in to use them,
4. a log of who was using which workstation when, and
5. a person with access to said log who can handle complaints.
Of course, it should go without saying that the contact information for
the person or department responsible for handling net abuse issues must
also be easy to find, if it's to do anyone any good.
(This is all based on my understanding of the XFF implementation in
MediaWiki as it was when I last looked at it. If you find any incorrect
or outdated information above, please correct me. To increase the odds
of this happening, I've crossposted this to wikitech-l in addition to
wikien-l.)
--
Ilmari Karonen
Dear wikitechnicians,
I'm looking for historical statistics on article views per day on the
English Wikipedia. I've spent a good bit of time wading around Wikipedia
and Wikimedia but have begun to feel like I'm going in circles.
I have located:
http://stats.wikimedia.org/EN/TablesUsagePageRequest.htm
which is pretty ideal but is missing data from November 2005 to present.
I have also located:
http://en.wikipedia.org/wiki/Wikipedia:Awareness_statistics
which contains a "page views per million" statistic that has the
necessary long-term reach, but it covers all Wikipedia traffic and is
relative to a somewhat mysteriously selected user sample.
I would be very appreciative of suggestions or pointers to information.
This will inform ongoing research here at the University of Minnesota.
Many thanks,
Reid
The interwiki map is the subject of a long and stupid discussion at
present; for a précis, see the discussion at:
http://meta.wikimedia.org/wiki/Talk:Interwiki_map#Inclusion_criteria_clarif…
Now, the question is: Do we have any way to gather usage statistics
for interwiki links? How can we tell when a link is not in fact being
used? If a link is to be removed from the interwiki map, how can we be
sure to fix the damage?
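Absent anything built in, one rough approach would be to scan a
pages-articles dump and tally how often each interwiki prefix actually
appears. A sketch (the filename and prefix list are illustrative; the
prefix list is what keeps ordinary namespace links like [[Category:...]]
out of the counts):

    $prefixes = array_flip( array( 'wiktionary', 'wikibooks', 'commons', 'meta' ) );
    $counts = array();
    $fh = fopen( 'enwiki-pages-articles.xml', 'r' );
    while ( ( $line = fgets( $fh ) ) !== false ) {
        if ( preg_match_all( '/\[\[([A-Za-z][A-Za-z-]*):/', $line, $m ) ) {
            foreach ( $m[1] as $p ) {
                $p = strtolower( $p );
                if ( isset( $prefixes[$p] ) ) {
                    $counts[$p] = isset( $counts[$p] ) ? $counts[$p] + 1 : 1;
                }
            }
        }
    }
    fclose( $fh );
    arsort( $counts );
    print_r( $counts );

Of course, this only shows whether a prefix appears in wikitext, not
whether anyone follows the links; click-through data would need log access.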
- d.
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.11alpha (r21791).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
18 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Fuzz testing: image with bogus manual thumbnail [Introduced between 08-Apr-2007 07:15:22, 1.10alpha (r21099) and 25-Apr-2007 07:15:46, 1.10alpha (r21547)]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 493 of 511 tests (96.48%)... 18 tests failed!
Guys, this is really amazing! I've never seen such quick response. Thank you
very much. It makes a huge difference - the Chinese dump alone is up from
250MB 7zipped in December to 346MB 7zipped in April - that's a lot of new
knowledge!! :)
All I could ask for now is that you consider at least making a version of
the English dump (which is at least twice the size of the next biggest
wiki) containing only article pages.
Thank you so much
Stian
Hello,
I am conducting a research project on Wikipedia as part of my master's program, and I'm wondering if there is a way to access a list of new articles/pages created on a specific date. I tried using the New Pages page (http://en.wikipedia.org/wiki/Special:Newpages) to go back in time, but I need to go back to February 2007, and considering how many pages are added each day, this method is rather time-consuming and imprecise. Is there a way to retrieve a history of new pages created on a specific date?
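(For reference, a sketch of the kind of query this would take for someone
with database access, using the standard MediaWiki schema: a page's
creation time is the timestamp of its earliest revision. Pages deleted
since then won't show up, and the scan is expensive.)

    $dbr = wfGetDB( DB_SLAVE );
    $sql = "SELECT page_namespace, page_title,
                   MIN(rev_timestamp) AS created
            FROM page
            JOIN revision ON rev_page = page_id
            WHERE page_namespace = 0
            GROUP BY page_id
            HAVING created LIKE '20070208%'";
    $res = $dbr->query( $sql );
    while ( $row = $dbr->fetchObject( $res ) ) {
        print "$row->page_title\t$row->created\n";
    }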
Thanks,
Katherine