Hi,
for some time now we have had RevisionDelete, which allows deleting
specific revisions of a page without deleting the whole page. The
feature is currently enabled for Oversight on WMF wikis, and the
community wants it enabled for regular administrators[1].
The only bug currently blocking this request is
https://bugzilla.wikimedia.org/show_bug.cgi?id=20186
which requests an easy method to filter history and contributions lists
for RevisionDeleted edits.
I partially fixed that bug in r58153, but as far as I know there is a
problem with the current database layout:
To filter for revision-deleted edits, one has to walk the entire
history/contributions list and check each individual revision to see
whether it is rev_deleted. That clearly isn't efficient, and it fails
for pages that have a large history.
As far as I know, a database change is needed to make filtering for
RevisionDeleted edits efficient on large pages (I'm no DB expert, though).
Any suggestions on this problem?
Regards,
Church of emacs
[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=18780
https://bugzilla.wikimedia.org/show_bug.cgi?id=19697
Hi there,
we have been using MW 1.14.1/59249 (Lenny, Apache 2, MySQL 5.0.51a-24,
PHP 5.2.6-1) since January without major problems:
http://wiki-de.genealogy.net
Since last week, however, we have been suffering from strange MediaWiki
behavior, and we cannot work out what causes it or what triggered it:
(Some!) system messages from the MediaWiki namespace are shown
literally as <MESSAGE_NAME>,
e.g.:
- the title of the main page: <pagetitle> and <pagetitle-view-mainpage>
or shown the same way in the article text area,
e.g.:
- on Special:Login: <Loginstart> and <Loginend>
Another phenomenon: MediaWiki:Signature was no longer set, so from one
day to the next ~~~~ began to produce <signature>.
This happens for system messages that were emptied, or that have been
untouched since installation/upgrades.
But shouldn't those be delivered by the system with default values?!
We did not perform any maintenance, in particular no database or
language maintenance.
$wgUseDatabaseMessages is still true.
Creating/editing the broken messages heals the situation, but we don't
know how many more are lurking out there to annoy us.
Any suggestions? Crying for help ;-) and thanks in advance
Uwe (Baumbach)
U.Baumbach(a)web.de
--
Does Wikipedia have a sitemap? If so, what is the URL? If not, do they
work with search engines in some other way, so that the engines don't
have to crawl everything all the time?
When a new paragraph is inserted, the diff doesn't discover that the
previous first paragraph is now the second; it reports much larger
changes than actually happened. Why is that? How can it be fixed?
I'm talking about Wikipedia now. Are there different
implementations of diff in various instances of MediaWiki?
How is it implemented? Using UNIX/Linux diff, wdiff, or some other
algorithm?
Here is an example, where a bullet list of works (discography) was
enhanced,
http://sv.wikipedia.org/w/index.php?title=Staffan_M%C3%A5rtensson&diff=1052…
As you can see, the Brahms Clarinet Sonatas were pushed from 1st to
2nd position, but the diff reports this as a total change, while the
record label (Channel Sound) is reported as unchanged text.
Yes, the phrase "med Erik Lanninger" was also changed to "Med E
Lanninger", but that is a much smaller change than the one reported.
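As far as I know, Wikipedia's diff is MediaWiki's own line-based implementation (an LCS-style algorithm, not UNIX diff or wdiff), though I can't say which variant runs where. The behaviour you describe is what any line-based LCS diff produces when a shifted line was also slightly edited: it no longer matches any line exactly, so it is reported as a full remove-plus-add. A small illustration with Python's stdlib difflib (the list items are made up, modelled on your example):

```python
import difflib

old = [
    "* Brahms: Clarinet Sonatas, med Erik Lanninger (Channel Sound)",
    "* Weber: Clarinet Quintet (Channel Sound)",
]
new = [
    "* Nielsen: Clarinet Concerto (Channel Sound)",
    "* Brahms: Clarinet Sonatas, Med E Lanninger (Channel Sound)",
    "* Weber: Clarinet Quintet (Channel Sound)",
]

# A line-based LCS diff only anchors on lines that match exactly, so the
# slightly edited Brahms line is reported as removed and re-added, even
# though it merely moved down one position.
diff = list(difflib.unified_diff(old, new, lineterm=""))
for line in diff:
    print(line)
```

The unchanged Weber line is the only anchor the algorithm finds; everything before it becomes one replace block.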
At my website runeberg.org, where scanned books are proofread,
I have implemented the diff function using wdiff with some
extra features. An example is shown here,
http://runeberg.org/rc.pl?action=diff&src=nfbf/0734
Since a common edit is to change "word" to "<b>word</b>", I want
changes in XML-like markup to be reported separately, which you
can see is the case at the bottom of that diff. But wdiff looks
strictly at whitespace, so I had to modify this. The quite naive
and non-optimized (but working) Perl code looks like this (yes,
versions are maintained by plain old RCS):
# A change from "foo bar" to "<b>foo bar" is seen by wdiff as a
# change of the word "foo" into "<b>foo". But we want to see this
# as the addition of the HTML/XML tag "<b>". To this effect, we
# pad spaces around all "<" and ">" in the original text versions,
# i.e. " <b> foo bar" before calling wdiff. The output from wdiff
# will be " <span><b></span> foo bar", where the padding spaces
# are outside of the <span> tags. This has to be taken into
# consideration when removing the space padding, below.
my $cmd = "umask 2"
    . " && co -p1.$rev1 $filename 2>/dev/null | sed 's/</ </g;s/>/> /g' >$tmp1"
    . " && co -p1.$rev2 $filename 2>/dev/null | sed 's/</ </g;s/>/> /g' >$tmp2"
    . " && wdiff -n -s -w '<span class=\"del\">' -x '</span>' "
    . " -y '<span class=\"ins\">' -z '</span>' $tmp1 $tmp2 |";
if (open(FILE, $cmd)) {
    local $/ = undef;
    $diff = <FILE>;
    close(FILE);
} else {
    debug_log("rc.pl: Failed with $cmd");
}
$diff = html_encode($diff);
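For anyone who wants to try the same trick without wdiff and RCS, here is a rough Python rendering of the padding idea, using the stdlib SequenceMatcher as the word-level differ (the del/ins class names match the Perl code above; everything else is a simplification):

```python
from difflib import SequenceMatcher

def pad_tags(text):
    # Pad spaces around "<" and ">" so tags become words of their own,
    # mirroring the sed 's/</ </g;s/>/> /g' step in the Perl version.
    return text.replace("<", " <").replace(">", "> ")

def word_diff(old, new):
    """Word-level diff with <span> markup, wdiff-style."""
    a, b = pad_tags(old).split(), pad_tags(new).split()
    out = []
    for op, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if op == "equal":
            out.extend(a[i1:i2])
        if op in ("delete", "replace"):
            out.append('<span class="del">' + " ".join(a[i1:i2]) + "</span>")
        if op in ("insert", "replace"):
            out.append('<span class="ins">' + " ".join(b[j1:j2]) + "</span>")
    return " ".join(out)

print(word_diff("foo bar", "<b>foo bar"))
# → <span class="ins"><b></span> foo bar
```

As in the Perl version, the padding makes the added "<b>" show up as its own insertion instead of turning "foo" into "<b>foo".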
Hope this was helpful.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se
Hi all
The Memento Project <http://www.mementoweb.org/> (including the Los Alamos
National Laboratory (!) featuring Herbert Van de Sompel of OpenURL fame) is
proposing a new HTTP header, X-Accept-Datetime, to fetch old versions of a web
resource. They already wrote a MediaWiki extension for this
<http://www.mediawiki.org/wiki/Extension:Memento> - which would of course be
particularly interesting for use on Wikipedia.
Do you think we could have this for the Wikimedia projects? I think that
would be very nice indeed. I recall that ways to look at last week's main
page have been discussed before, and I see several issues:
* The timestamp isn't a unique identifier; multiple revisions *might* have
the same timestamp. We need a tiebreaker (rev_id would be the obvious choice).
* Templates and images also need to be "time warped". It seems the
extension does not address this at the moment. For flagged revisions we do
have such a mechanism, right? Could that be used here?
* The Squids would need to know about the new header and bypass the cache
when it is used.
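The tiebreaker in the first point is straightforward; a minimal Python sketch, assuming we have (rev_id, timestamp) pairs for a page (the data here is made up): take the latest revision at or before the requested datetime, and among equal timestamps the highest rev_id.

```python
from datetime import datetime

# Hypothetical (rev_id, timestamp) pairs for one page; revisions 102 and
# 103 were saved within the same second.
revisions = [
    (101, datetime(2009, 11, 1, 12, 0, 0)),
    (102, datetime(2009, 11, 5, 9, 30, 0)),
    (103, datetime(2009, 11, 5, 9, 30, 0)),
    (104, datetime(2009, 11, 20, 18, 0, 0)),
]

def revision_at(revisions, when):
    """Latest revision at or before `when`; rev_id breaks timestamp ties."""
    candidates = [r for r in revisions if r[1] <= when]
    if not candidates:
        return None  # the page did not exist yet at that time
    return max(candidates, key=lambda r: (r[1], r[0]))

print(revision_at(revisions, datetime(2009, 11, 10)))
```

The same (timestamp, rev_id) ordering would have to be applied consistently when time-warping templates and images, or the rendered page mixes revisions.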
So, what do you think? What would it take? Can we point them to the missing bits?
-- daniel
Hi,
I have an extension called BugSquish, which I have been happily using on
MediaWiki 1.6.10 for quite a long time. I am also aware of other people
using it on later versions, but cannot cite specific version numbers where
it is known to work. The code works by performing a regex replacement on
the text passed into the ParserAfterStrip hook function that I have set
up, striking out links to bugs that have been marked as fixed.
On MW 1.6 this correctly handles <nowiki> and <pre> tags, in that text
within these tags is not parsed by the extension.
On MW 1.14 and above, the text within <nowiki> tags is passed to the
extension and ends up having the regex applied to it, though it is
subsequently rendered as plain text by the engine (so the page ends up
filled with HTML/CSS gobbledygook, rendered literally).
I am not sure at which revision this change took place.
First question: Is this a bug or a deliberate change in functionality, or
have I been mis-using the hook all along?
Second question: Assuming this is not a bug, how should I rewrite the code
to make it behave as it used to?
The current code for the extension is available here, if you want to test:
http://www.kennel17.co.uk/testwiki/BugSquish
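One hook-independent way to get the old behaviour back is to mask the protected sections yourself before running the regex, then restore them afterwards. A Python sketch of the idea (the extension itself is PHP; the bug-link markup, marker format, and function names here are all hypothetical):

```python
import re

MARK = "\x7fUNIQ%08x\x7f"  # unique placeholder, similar in spirit to strip markers

def link_bugs(text, fixed_bugs):
    """Linkify 'bug N', striking fixed ones, leaving <nowiki>/<pre> untouched."""
    saved = []
    def shelve(m):
        saved.append(m.group(0))
        return MARK % (len(saved) - 1)
    # 1. Hide the protected sections behind placeholders.
    text = re.sub(r"<(nowiki|pre)>.*?</\1>", shelve, text, flags=re.S | re.I)
    # 2. Apply the real transformation to what is left.
    def bug(m):
        n = int(m.group(1))
        link = '<a href="/bug/%d">bug %d</a>' % (n, n)
        return "<s>%s</s>" % link if n in fixed_bugs else link
    text = re.sub(r"bug #?(\d+)", bug, text)
    # 3. Put the protected sections back, untouched.
    for i, s in enumerate(saved):
        text = text.replace(MARK % i, s)
    return text

print(link_bugs("See bug 7 and <nowiki>bug 8</nowiki>.", {7}))
```

Whether the parser now delivers <nowiki> content to your hook raw or already stripped, this shield-and-restore pattern keeps the regex away from it either way.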
Cheers,
- Mark Clements (HappyDog)
Hi all,
There has been some downtime this morning (about 15 minutes) due to a
software update.
I pushed a software update, and immediately servers started crashing
according to Nagios. Looking at Ganglia, it looks like the familiar
issue where scap pushes a few 4-CPU Apaches into swap; they then crash
and come back a few minutes later. This time, however, a key memcached
node evidently fell over, causing a database overload and leaving the
site mostly inaccessible for about ten minutes.
I prepared to revert the software update, but determined that the
problem was not the update itself, and that another scap would only
exacerbate the issue. The problem resolved itself spontaneously.
We need to fix things up so the scap script is less liable to push
machines into swap :)
--
Andrew Garrett
agarrett(a)wikimedia.org
http://werdn.us/