I just tried uploading a 12MB TIFF (a scan directly from the Library
of Congress) to Commons. It waited until the whole 12MB had uploaded,
of course, to tell me it didn't want it.
1. Is there any reason TIFF is off the allowed media types list?
2. Is there any way for the software to say "no" earlier in the
process of making a huge upload?
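To make question 2 concrete, here is a minimal sketch of the kind of check that could run before any file data is read. It is not how MediaWiki handles uploads; the function name `precheck_upload`, its parameters, and the example limits are all hypothetical, and a real check would still have to inspect the file contents afterwards, since a file name can lie.

```python
import os


def precheck_upload(filename, declared_size, allowed_extensions, max_bytes):
    """Return a rejection reason, or None if the upload may proceed.

    Only looks at metadata that is known before the file body is sent:
    the file name and the size the client declares.
    """
    # Normalise the extension: "scan.TIFF" -> "tiff"
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in allowed_extensions:
        return f"file type '.{extension}' is not allowed"
    if declared_size > max_bytes:
        return f"file is {declared_size} bytes; the limit is {max_bytes}"
    return None


# Hypothetical settings: TIFF is not on the allowed list.
reason = precheck_upload("scan.TIFF", 12 * 1024 * 1024,
                         {"png", "jpg", "gif"}, 20 * 1024 * 1024)
print(reason)
```

Whether the server can act on such a check before the body arrives depends on the web server and client, which this sketch does not address.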
- d.
3713 lines changed, by my count, which makes it probably one of the
biggest commits ever that weren't orchestrated primarily with sed
(yes, you all know who to look at there). I glanced over like half of
it before giving up. I found about four definite errors, so it was
probably worth it, although it seemed to take an inordinately long
time. Good luck to Brion on reviewing it properly. :)
On 10/1/07, aaron(a)svn.wikimedia.org <aaron(a)svn.wikimedia.org> wrote:
> $this->mContent = $revision->userCan( Revision::DELETED_TEXT ) ? $revision->getRawText() : "";
> //$this->mContent = $revision->getText();
> + $this->mContent = $revision->revText(); // Loads if user is allowed
Is there any reason to set it using $revision->getRawText() and then
immediately afterwards overwrite it with $revision->revText()? You
should delete the old line (and the commented-out one while you're at
it).
> + $supress = "<tr><td> </td><td>";
> + $supress .= Xml::checkLabel( wfMsg( 'revdelete-suppress' ), 'wpSuppress', 'wpSuppress', false, array( 'tabindex' => '2' ) );
> + $supress .= "</td></tr>";
> + } else {
> + $supress = '';
> ...
> + $supress
Should be $suppress, with two p's.
> + // Bitfields to further supress the content
Here too, although this is just a comment.
> + } else {
> + $bitfield = 'rev_deleted';
> + }
Uh, storing a string in a variable called $bitfield, which is later
stored in a TINYINT?
> - return $f;
> + return "<tt>$f</tt>";
This should be in CSS, not inline styling.
> - } elseif( $rc_namespace == NS_SPECIAL ) {
> + } else if( $rc_namespace == NS_SPECIAL ) {
You know, "else if" is slower than "elseif". There's no reason to
deliberately remove dedicated language constructs in favor of compound
constructs just because that's how they do it in C.
> + $r .= ' ';
Surely there's a better way of doing whatever you're trying to accomplish.
> + * Generate HTML for the equivilant of a spacer image for tables
"equivalent"
> + * @access private
Use the private keyword for this. We shouldn't be using @access now
that we're using PHP 5 and have an explicit keyword to that effect.
> + function spacerColumn() {
> + return '<td width="12"></td>';
> + }
This is the sort of thing CSS should be used for. Give the cells a
class or id and some padding. You don't need extra columns that
contain no content.
> + // Adds a few spaces
> + function spacerIndent() {
> + return ' ';
> + }
There has *got* to be a better way of doing whatever you're trying to do.
> +// For deleted images, gererally were all versions of the image are discarded
> $wgFileStore = array();
> $wgFileStore['deleted']['directory'] = false;// Defaults to $wgUploadDirectory/deleted
> $wgFileStore['deleted']['url'] = null; // Private
> $wgFileStore['deleted']['hash'] = 3; // 3-level subdirectory split
The added comment doesn't make any sense to me. What's it supposed to mean?
> +// To hide usernames
> +$wgGroupPermissions['oversight']['hideuser'] = true;
> +// To see hidden revs and unhide revs hidden from Sysops
> +$wgGroupPermissions['oversight']['hiderevision'] = true;
> +// For private log access
> +$wgGroupPermissions['oversight']['oversight'] = true;
Is the Oversight extension being merged into core?
> + $rdel = ''; $ldel = '';
$rdel = $ldel = ''; is more conventional, although that's nitpicking.
> + if ( $this->mOldRev && $this->mOldRev->isDeleted(Revision::DELETED_TEXT) ) {
> + wfIncrStats( 'diff_uncacheable' );
> + } else if ( $this->mNewRev && $this->mNewRev->isDeleted(Revision::DELETED_TEXT) ) {
> + wfIncrStats( 'diff_uncacheable' );
There's no reason to have two different if clauses with the same
effect. The conditions should be or'd instead.
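As a language-neutral illustration of or'ing the conditions, here is a small Python sketch. The `Rev` class and the value of `DELETED_TEXT` are stand-ins invented for the example, not the real Revision class.

```python
DELETED_TEXT = 1  # assumed value, for illustration only


class Rev:
    """Hypothetical stand-in for a revision with a deletion bitfield."""

    def __init__(self, deleted_bits):
        self.deleted_bits = deleted_bits

    def is_deleted(self, field):
        return (self.deleted_bits & field) == field


def diff_is_uncacheable(old_rev, new_rev):
    # One combined condition instead of two branches with the same body.
    return any(
        rev is not None and rev.is_deleted(DELETED_TEXT)
        for rev in (old_rev, new_rev)
    )
```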
> + $this->mOldPagetitle = htmlspecialchars( wfMsg( 'revisionasof', $t ) );
Could just use wfMsgHTML().
> + if( ($bitfield & $field) == $field ) {
A stylistic issue, but 'if( $bitfield & $field ) {' is simpler (and equivalent as long as $field is a single bit).
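The two forms only agree when the field being tested is a single bit. A short Python sketch of the difference, with flag values that are assumptions chosen for the example:

```python
# Assumed flag values, for illustration only.
DELETED_TEXT = 1
DELETED_COMMENT = 2


def has_all(bitfield, field):
    """Equivalent of ($bitfield & $field) == $field: every bit in field is set."""
    return (bitfield & field) == field


def has_any(bitfield, field):
    """Equivalent of a bare $bitfield & $field test: at least one bit is set."""
    return bool(bitfield & field)


single = DELETED_TEXT
combined = DELETED_TEXT | DELETED_COMMENT

# Single-bit field: both tests agree.
print(has_all(DELETED_TEXT, single), has_any(DELETED_TEXT, single))
# Multi-bit field with only one bit set in the bitfield: they disagree.
print(has_all(DELETED_TEXT, combined), has_any(DELETED_TEXT, combined))
```

So the shorter form is a safe simplification only if callers never pass a combination of flags.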
> + /**
> + * As we use the same small set of messages in various methods and that
> + * they are called often, we call them once and save them in $this->message
> + */
Is this actually necessary? Surely this is done on a lower level, by
the implementation of the wfMsg() functions.
> + * @private
You should use the PHP keyword for this, again. At the very least,
use the correct Doxygen markup. :)
> + function newRowFromID( $logid ) {
> + $fname = 'LogReader::newFromTitle';
That seems slightly . . . misleading. How about you just use __METHOD__?
> + /**
> + * As we use the same small set of messages in various methods and that
> + * they are called often, we call them once and save them in $this->message
> + */
Same comment as before.
> + * @private
> + */
> + function showhideLinks( $s, $title ) {
As above.
> + if( self::isDeleted($s,self::DELETED_ACTION) )
> + return $revert;
Clearer to say return ''; here.
> + if( $this->flags & self::NO_ACTION_LINK ) {
> + return $revert;
> + }
And here. The two should be combined into a single conditional, and
the initialization of $revert moved after them.
> + $revert = $this->skin->userToolLinks( 1, $s->log_title );
Surely you don't want the tool links to be hardcoded for user #1.
> + /**
> + * As we use the same small set of messages in various methods and that
> + * they are called often, we call them once and save them in $this->message
> + */
Again. At the very least this should be made a globally-accessible
function of some kind, not cut-and-pasted for every class.
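A rough sketch of what a single shared helper could look like, written in Python for brevity. Everything here is hypothetical: the `_MESSAGES` table stands in for the real message store, the message texts are placeholders, and the `<key>` fallback is just a convention chosen for the example.

```python
from functools import lru_cache

# Hypothetical message table; the texts are placeholders.
_MESSAGES = {
    "revdelete-suppress": "(text of revdelete-suppress)",
    "mergehistory-list": "(text of mergehistory-list)",
}


@lru_cache(maxsize=None)
def get_message(key):
    """Look a message up once; repeated calls are served from the cache."""
    return _MESSAGES.get(key, f"<{key}>")


def preload_messages(*keys):
    """Shared helper: fetch a small set of messages into a dict."""
    return {key: get_message(key) for key in keys}
```

Each class would then call the one helper with the keys it needs instead of carrying its own copy of the loop.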
> + $wgOut->addHTML( "<h2 id=\"mergehistory\">" . wfMsgHtml( "mergehistory-list" ) . "</h2>\n" );
New id's should start with "mw-" to help avoid conflicts.
> + if( $this->mTimestamp && $this->mTimestamp >= $maxtimestamp ) {
Why are you checking $this->mTimestamp if immediately afterward you
require it be at least $maxtimestamp (assuming $maxtimestamp is
positive)?
> + $wgOut->addHtml( wfMsg('mergehistory-fail') );
You probably want addWikiText() there.
> + $maxtimestamp = $maxtimestamp ? $maxtimestamp : 0;
> + $this->maxTimestamp = $maxtimestamp;
How about: $this->maxTimestamp = intval( $maxtimestamp );
NOTE: Many of these errors are self-referencing #REDIRECT statements which cause database corruption if not applied in the right order.
In short, the DUMPS are broken again.
Jeff
----- Original Message -----
From: Jeffrey Vernon Merkey
To: wikitech-l(a)wikimedia.org
Sent: Monday, October 01, 2007 1:11 PM
Subject: Duplicate Articles in Database Dump -- MYSQL errors
The last two runs of database dumps are severely broken and have a large number of duplicate titles which cause mysql to throw errors. I have managed to get some good translation runs and full imports, but not without a lot of work, and a lot of wasted time.
How about we fix the database dumps for those of us who need them and stop introducing breakage?
Thanks. Latest errors and duplicate titles attached. I did not attach the full listing: there are hundreds of errors, and it would exceed the mail server's size limit for a single message.
The 20070802 dumps do not seem to have this problem as badly.
ERROR 1062 (23000) at line 96187: Duplicate entry '0-Nyquist_theorem' for key 2
ERROR 1062 (23000) at line 113658: Duplicate entry '0-Urysohn_lemma' for key 2
ERROR 1062 (23000) at line 114417: Duplicate entry '0-Turner_syndrome' for key 2
ERROR 1062 (23000) at line 116050: Duplicate entry '0-World_fair' for key 2
ERROR 1062 (23000) at line 125448: Duplicate entry '0-Dining_cryptographer_protocol' for key 2
ERROR 1062 (23000) at line 132820: Duplicate entry '0-Rothmund-Thompson_syndrome' for key 2
ERROR 1062 (23000) at line 136377: Duplicate entry '0-Hansen_disease' for key 2
ERROR 1062 (23000) at line 139130: Duplicate entry '0-Wilson_disease' for key 2
ERROR 1062 (23000) at line 147900: Duplicate entry '0-Falkner_Island' for key 2
ERROR 1062 (23000) at line 170901: Duplicate entry '0-Microsoft_.NET' for key 2
ERROR 1062 (23000) at line 184074: Duplicate entry '0-Ohm_Law' for key 2
ERROR 1062 (23000) at line 204307: Duplicate entry '0-Kaposi_Sarcoma' for key 2
ERROR 1062 (23000) at line 257406: Duplicate entry '0-Bell_inequality' for key 2
ERROR 1062 (23000) at line 289396: Duplicate entry '0-Hollywood_Walk_of_Fame' for key 2
ERROR 1062 (23000) at line 343974: Duplicate entry '0-Sgt._Pepper_Lonely_Hearts_Club_Band' for key 2
ERROR 1062 (23000) at line 350134: Duplicate entry '0-Boltzmann_constant' for key 2
ERROR 1062 (23000) at line 361687: Duplicate entry '0-Gauss_theorem' for key 2
ERROR 1062 (23000) at line 369568: Duplicate entry '0-Charles_law' for key 2
ERROR 1062 (23000) at line 374524: Duplicate entry '0-Lavender_Blue' for key 2
ERROR 1062 (23000) at line 375710: Duplicate entry '0-DeMorgan' for key 2
ERROR 1062 (23000) at line 378180: Duplicate entry '0-Kaposi_sarcoma' for key 2
ERROR 1062 (23000) at line 383521: Duplicate entry '0-Down_syndrome' for key 2
ERROR 1062 (23000) at line 384832: Duplicate entry '0-Maslow_hierarchy_of_needs' for key 2
ERROR 1062 (23000) at line 389129: Duplicate entry '0-Japan_copyright_law' for key 2
ERROR 1062 (23000) at line 397556: Duplicate entry '0-Sainsbury' for key 2
ERROR 1062 (23000) at line 409151: Duplicate entry '0-St._John' for key 2
ERROR 1062 (23000) at line 413091: Duplicate entry '0-Milgram_experiment' for key 2
ERROR 1062 (23000) at line 418176: Duplicate entry '0-Hudson_Bay_Company' for key 2
ERROR 1062 (23000) at line 419421: Duplicate entry '0-Tourette_syndrome' for key 2
ERROR 1062 (23000) at line 423098: Duplicate entry '0-Benny_Goodman_Orchestra' for key 2
ERROR 1062 (23000) at line 427487: Duplicate entry '0-Long_John_Silver' for key 2
ERROR 1062 (23000) at line 432944: Duplicate entry '0-Schroedinger_equation' for key 2
ERROR 1062 (23000) at line 442032: Duplicate entry '0-Hölder_inequality' for key 2
ERROR 1062 (23000) at line 444903: Duplicate entry '0-Jay_Treaty' for key 2
ERROR 1062 (23000) at line 446458: Duplicate entry '0-Back_River' for key 2
ERROR 1062 (23000) at line 453610: Duplicate entry '0-1980' for key 2
ERROR 1062 (23000) at line 478337: Duplicate entry '0-Hudson_Bay' for key 2
ERROR 1062 (23000) at line 479826: Duplicate entry '0-Hilbert_basis_theorem' for key 2
ERROR 1062 (23000) at line 484443: Duplicate entry '0-Saint_Mary' for key 2
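For anyone wanting to tally these, a minimal Python sketch that pulls the line number and title out of each error line. It assumes the log format is exactly as shown above and that the duplicated entry is a namespace number and a title joined by the first hyphen; the function name is made up for the example.

```python
import re

# Matches lines such as:
# ERROR 1062 (23000) at line 96187: Duplicate entry '0-Nyquist_theorem' for key 2
ERROR_RE = re.compile(
    r"ERROR 1062 \(23000\) at line (\d+): Duplicate entry '(.*)' for key (\d+)"
)


def parse_duplicate_errors(log_text):
    """Return (sql_line, namespace, title, key) for each duplicate-entry error."""
    entries = []
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            line_no, entry, key = match.groups()
            # Assumption: the part before the first "-" is the namespace.
            namespace, _, title = entry.partition("-")
            entries.append((int(line_no), namespace, title, int(key)))
    return entries


sample = (
    "ERROR 1062 (23000) at line 96187: Duplicate entry '0-Nyquist_theorem' for key 2\n"
    "ERROR 1062 (23000) at line 132820: Duplicate entry '0-Rothmund-Thompson_syndrome' for key 2\n"
)
for row in parse_duplicate_errors(sample):
    print(row)
```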
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.12alpha (r26302).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
Reading tests from "extensions/LabeledSectionTransclusion/lstParserTests.txt"...
17 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://lists.wikimedia.org/mailman/htdig/wikitech-l/2006-April/022293.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 527 of 544 tests (96.88%)... 17 tests failed!
Hi,
I have a question that concerns access to my wikipedia user accounts. I have
user accounts at the English, Slovenian, German and Spanish wikipedias under
the username "Jalen" ("Jalen1" on the German wikipedia). My access to these
accounts has been blocked due to weak passwords. I had my e-mail address
provided on the Spanish wikipedia but not on the other three wikipedias, hence
I am unable to restore access to the accounts by myself.
My e-mail address is scythus at volja.net. I am also subscribed to the
wikitech-l list under the same username ("Jalen") and the same e-mail address
("scythus(a)volja.net").
Would it be legitimate to request that my access to the user accounts on en:,
de: and sl: wikis be restored?
A bureaucrat at my native (sl:) wiki can confirm my identity since he has
previously communicated with me through the above-mentioned e-mail address (he
was the first person I contacted for troubleshooting, but since bureaucrats
cannot restore access to user accounts I was told to contact developers) and has
also seen my IP address which is 84.52.134.168.
To further prove my identity, I have saved the confirmation mail I received on
opening my account on the Spanish wikipedia, where the original IP address is
stated.
I would also be satisfied if only the user account on my native (sl:) wiki could
be restored since, as I have said, a bureaucrat there knows me and can confirm
that the username ("Jalen"), the e-mail address ("scythus(a)volja.net") and the IP
address belong to one and the same person.
I would greatly appreciate any response or assistance.
Regards,
Jalen
Feature request: It would be extremely useful if a user's watchlist
included entries when previously watchlisted pages are deleted.
There are some ways around that, manually or using a bot to create a
links page and look for red things, but having it appear in the main
watchlist would be better...
(popping this on my list of things to play around with coding, as well...)
--
-george william herbert
george.herbert(a)gmail.com
Dear All,
I posted at http://trust.cse.ucsc.edu/Code a tiny bit of code that enables
you to split a Wikipedia .xml dump into n-page chunks, for a given n. The
chunks are then immediately (on the fly) compressed with a compression
algorithm you can choose (default: gzip).
We are using this to split a dump, to be able to analyze it in pieces in a
more manageable way. We hope the code is useful to others as well. (It is a
tiny and trivial piece of code, btw).
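For readers who want the general idea without fetching the code, here is a minimal Python sketch of the same approach. It is not the code posted at the URL above. It assumes each `<page>` and `</page>` tag sits on its own line, it always uses gzip, and it drops everything outside the page elements, so the chunks are not standalone XML documents. The function name and file naming scheme are made up for the example.

```python
import gzip
import os


def split_dump(lines, pages_per_chunk, out_dir, prefix="chunk"):
    """Write the <page> elements found in `lines` into gzip files of n pages each.

    Returns the list of chunk paths that were written.
    """
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    paths = []
    out = None            # currently open chunk, if any
    pages_in_chunk = 0
    in_page = False
    for line in lines:
        stripped = line.strip()
        if stripped == "<page>":
            in_page = True
            if out is None:
                # Start a new chunk lazily, when its first page begins.
                path = os.path.join(out_dir, f"{prefix}-{len(paths):05d}.xml.gz")
                out = gzip.open(path, "wt", encoding="utf-8")
                paths.append(path)
        if in_page:
            out.write(line)   # compressed on the fly
        if stripped == "</page>" and in_page:
            in_page = False
            pages_in_chunk += 1
            if pages_in_chunk == pages_per_chunk:
                out.close()
                out = None
                pages_in_chunk = 0
    if out is not None:
        out.close()
    return paths
```

Because the input is any iterable of lines, an open file object for the dump can be passed in directly and the dump is never held in memory as a whole.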
Luca
On 10/1/07, David Gerard <dgerard(a)gmail.com> wrote:
> (What's the US patent exposure from having an MPEG-to-Theora converter
> on Wikimedia servers? Would running one on the toolserver be safe
> enough?)
I think that even better would be a client-side transcoder which uses
the codecs installed on the client: then it becomes "if you can play it in
Windows Media, you can upload it using this tool" ... It gets us out of
the business of tracking the codec du jour and is useful for people who
want to do things other than preparing files for us.
This is pretty much already possible within the QT framework with
XiphQT (http://xiph.org/quicktime/) ... any video app that uses QT
codecs can just directly save Ogg/Theora+Vorbis files.