On Wed, Jul 9, 2008 at 5:11 PM, <vasilievvv(a)svn.wikimedia.org> wrote:
> Log Message:
> -----------
> * Forbid files with * and ? to be uploaded under Windows (it caused internal errors since such characters are illegal there)
It seems like it would be a better idea to be consistent across
platforms here. Otherwise you're just going to cause trouble for
portability; for instance, Windows users would be unable to easily use
an image dump from Wikimedia, or from other Unix-based MediaWiki
installations.
However, we don't *really* have to use the same name in the filesystem
as we use as a title. This seems to me like it would be better
implemented by mangling the filename somehow. The invalid Windows/DOS
characters are supposedly:
? [ ] / \ = + < > : ; " ,
Of those, I think the following are currently legal in image names
(before your commit):
? \ = + : ; " ,
Each of these could be replaced in the filesystem by some character,
or combination of characters, that Windows will accept and that is
invalid in image names anyway. For instance, you could replace them with
{question} {backslash} {equals} {plus} {colon} {semicolon} {quote}
{comma}; these will work correctly because {} are illegal in page
titles but legal in Windows filenames. (But they could push filenames
over the filename length limit, so more creative substitutes might be a
better idea.) This way the rules for image titles remain unchanged,
which is nice because a lot of those characters are quite handy to
have in titles.
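
To make the mangling idea concrete, here is a rough sketch. The function
names and the FS_TOKENS constant are made up for illustration, not
existing MediaWiki code, and the token spellings are just the ones
suggested above. It leans on the assumption that { and } can never
appear in a title, which is what makes the reverse mapping unambiguous.

```php
<?php
// Hypothetical mapping from characters Windows rejects to brace tokens.
// Covers only the characters listed above as currently legal in titles.
const FS_TOKENS = [
	'?'  => '{question}',
	'\\' => '{backslash}',
	'='  => '{equals}',
	'+'  => '{plus}',
	':'  => '{colon}',
	';'  => '{semicolon}',
	'"'  => '{quote}',
	','  => '{comma}',
];

// Title -> name to use on disk. strtr() with an array never rescans
// its own replacements, so the tokens are not mangled a second time.
function mangleFileName( string $title ): string {
	return strtr( $title, FS_TOKENS );
}

// Name on disk -> original title. Safe only under the assumption that
// titles cannot contain { or } themselves.
function unmangleFileName( string $stored ): string {
	return strtr( $stored, array_flip( FS_TOKENS ) );
}
```

With this, a title like What?.png would be stored as
What{question}.png and mapped back on read. It does nothing about the
length problem mentioned above.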
(Googled sources actually conflict as to the exact list of prohibited
characters. Some say * is prohibited, some don't mention it. Same
for |. ^ is apparently supposed to be illegal in FAT, according to
one source, and there are other restrictions, like no trailing space
or period, and a list of reserved names like "com1" and "nul".
Probably the list varies across different versions, but it's a lot
bigger than just ? and *, anyway.)
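
The non-character restrictions could be checked along these lines. This
is only a sketch: looksPortable() is a hypothetical name, and the
reserved-name list is deliberately partial, since the sources disagree
on the details. It treats a reserved name followed by an extension
(nul.txt) as reserved too.

```php
<?php
// Partial, assumed list of reserved DOS device names; a real check
// would need the complete set for the Windows versions it targets.
const RESERVED_NAMES = [ 'con', 'prn', 'aux', 'nul', 'com1', 'lpt1' ];

// Returns false for names that hit the restrictions discussed above:
// empty, trailing space or period, or a reserved device name.
function looksPortable( string $name ): bool {
	if ( $name === '' ) {
		return false;
	}
	$last = substr( $name, -1 );
	if ( $last === ' ' || $last === '.' ) {
		return false;
	}
	// Compare the part before the first dot, case-insensitively.
	$base = strtolower( explode( '.', $name, 2 )[0] );
	return !in_array( $base, RESERVED_NAMES, true );
}
```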
> * Forbid files to be moved to invalid filenames
This might be more cleanly implemented by making invalid filenames
invalid titles in the Image namespace. That would make things
somewhat simpler by keeping the check with the rest of title
validation, where people would expect to find it. It also
makes sense to prohibit image pages from existing when it's not
possible for an image of that title to exist. (But projects will need
to be checked for pages that will become invalid under this scheme, of
course, perhaps using a maintenance script.)
> +/**
> + * Checks filename for validity
> + * @param mixed $title Filename or title to check
> + */
> +function wfIsValidFileName( $name ) {
Surely this shouldn't be a global function, but a static method of
something? Or even a non-static method of something.
> + elseif( wfIsWindows() && ( in_string( '*', $name ) || in_string( '?', $name ) ) )
> + return false;
> . . .
> + if( wfIsWindows() )
> + $filtered = preg_replace ( "/[*?]/", '-', $filtered );
Magic constants here. You have a list of blacklisted characters
scattered across multiple files, which is bad: the copies could become
inconsistent over time.
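
One way to keep the list in a single place, and to get away from the
global function at the same time, might look like the sketch below.
FileNameRules and its methods are hypothetical names, and $isWindows is
passed in as a stand-in for wfIsWindows() so the example runs on its
own. It covers only the Windows branch of the check, with the same two
characters the commit handles.

```php
<?php
class FileNameRules {
	// Single source of truth for the blacklisted characters.
	const WINDOWS_ILLEGAL = '*?';

	// True if the name contains none of the blacklisted characters,
	// or if we are not on Windows at all.
	public static function isValid( string $name, bool $isWindows ): bool {
		return !$isWindows || strpbrk( $name, self::WINDOWS_ILLEGAL ) === false;
	}

	// Replaces each blacklisted character with '-', building the
	// character class from the same constant isValid() uses.
	public static function filter( string $name, bool $isWindows ): string {
		if ( !$isWindows ) {
			return $name;
		}
		$class = '/[' . preg_quote( self::WINDOWS_ILLEGAL, '/' ) . ']/';
		return preg_replace( $class, '-', $name );
	}
}
```

Adding a character to WINDOWS_ILLEGAL then changes both the validity
check and the filter together.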