Hi.
A question about dumpHTML.inc:
    function getFriendlyName( $name ) {
        global $wgLang;
        # Replace illegal characters for Windows paths with underscores
        $friendlyName = strtr( $name, '/\\*?"<>|~', '_________' );

        # Work out lower case form. We assume we're on a system with case-insensitive
        # filenames, so unless the case is of a special form, we have to disambiguate
        if ( function_exists( 'mb_strtolower' ) ) {
            $lowerCase = $wgLang->ucfirst( mb_strtolower( $name ) );
        } else {
            $lowerCase = ucfirst( strtolower( $name ) );
        }

        # Make it mostly unique
        if ( $lowerCase != $friendlyName ) {
            $friendlyName .= '_' . substr(md5( $name ), 0, 4);
        }
        # Handle colon specially by replacing it with tilde
        # Thus we reduce the number of paths with hashes appended
        $friendlyName = str_replace( ':', '~', $friendlyName );

        return $friendlyName;
    }
How can this last str_replace "reduce the number of paths with hashes appended"? The hash is appended _before_ this line runs. Sorry for my ignorance.
Sven
Sven Hartrumpf wrote:
> How can this last str_replace "reduce the number of paths with hashes appended"? The hash is appended _before_ this line runs. Sorry for my ignorance.
> Sven
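I think it reduces the number of paths with hashes appended compared with handling the ':' in the initial strtr. $lowerCase is computed from the original $name, so any character that strtr replaces makes $friendlyName differ from it and triggers the hash. ':' is a very common character in MediaWiki page names because of the namespace prefix, so if it were replaced up front, lots and lots of pages would fail the $lowerCase != $friendlyName check and get a hash appended.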
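To make the comparison concrete, here is a minimal sketch of the two orderings side by side. It is not the code from dumpHTML.inc: both function names are hypothetical, the $wgLang branch is left out in favour of the plain ucfirst/strtolower fallback, and the second variant (colon mapped to '_' in the initial strtr) is only my guess at what the alternative would have looked like. The input 'Help:contents' is an illustrative name chosen so that the case check passes.

```php
<?php
// Ordering as posted: ':' is left alone until after the hash check.
// (Hypothetical name; uses the non-mb fallback branch only.)
function friendlyNameColonLast( $name ) {
    $friendlyName = strtr( $name, '/\\*?"<>|~', '_________' );
    $lowerCase = ucfirst( strtolower( $name ) );
    if ( $lowerCase != $friendlyName ) {
        $friendlyName .= '_' . substr( md5( $name ), 0, 4 );
    }
    return str_replace( ':', '~', $friendlyName );
}

// Assumed alternative: ':' is replaced together with the other
// characters, so it already differs from $lowerCase at the check.
function friendlyNameColonFirst( $name ) {
    $friendlyName = strtr( $name, '/\\*?"<>|~:', '__________' );
    $lowerCase = ucfirst( strtolower( $name ) );
    if ( $lowerCase != $friendlyName ) {
        $friendlyName .= '_' . substr( md5( $name ), 0, 4 );
    }
    return $friendlyName;
}

// No hash: strtr changed nothing, so the check passes.
echo friendlyNameColonLast( 'Help:contents' ), "\n";  // Help~contents
// Hash appended: 'Help_contents' != 'Help:contents'.
echo friendlyNameColonFirst( 'Help:contents' ), "\n";
```

Note that a name with any other capital letter, such as 'Help:Contents', gets the hash in both variants, because the case check fails on its own; the ordering only helps names that would otherwise pass it.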
mediawiki-l@lists.wikimedia.org