Why the numbered subfolders? And what do they mean?
Pretty funny, someone else (Brion) fixed the problem while I was looking at it. The file just magically appeared while I was editing the wiki page to check things out. I added a newline. :-)
Using subdirectory trees fixes an ancient UNIX problem that happens when there are several hundred files in one directory -- if this is allowed to happen, the system will sllloooooowwwww ddooooooowwwwwnnnn every time the full directory is accessed.
Brian
Quoting bwilson@clickshift.com, from the post of Wed, 08 Sep:
> Using subdirectory trees fixes an ancient UNIX problem that happens when there are several hundred files in one directory -- if this is allowed to happen, the system will sllloooooowwwww ddooooooowwwwwnnnn every time the full directory is accessed.
yo! don't generalize! my ReiserFS deals quite fine with tens of thousands of files in a directory, and probably more :-)
>> Using subdirectory trees fixes an ancient UNIX problem that happens when there are several hundred files in one directory -- if this is allowed to happen, the system will sllloooooowwwww ddooooooowwwwwnnnn every time the full directory is accessed.
> yo! don't generalize! my ReiserFS deals quite fine with tens of thousands of files in a directory, and probably more :-)
Yo! Hence my use of the word "ancient".
This problem still exists in ext2fs, and ext2fs is still the default filesystem on most Linux distributions. The MediaWiki folk are smart enough not to force everyone to use ReiserFS should they not want to. ;-)
Brian
heidialyssa wrote:
> Why the numbered subfolders? And what do they mean?
The uploads are broken up into separate smaller subdirectories so there are fewer files in each directory. (This matters more if you're a big site with tens of thousands of uploaded files.) There are a couple of reasons for this:
* Some filesystems don't handle large numbers of files in a directory very efficiently. Smaller directories are faster to access.
* In theory you could break up the upload area across multiple drives/partitions if necessary.
* Doing a directory list on ten thousand files is kind of annoying. :) Smaller directories are easier to peek at when debugging.
The numbers are derived from a cryptographic hash of the filename and don't mean anything in particular; they're just there to divide things up evenly (whereas something like using the first letter of the filename would cause large clusters on some letters).
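To make that concrete, here is a minimal sketch of how such a scheme could work. It is only an illustration: the choice of MD5, the two-level layout, and the names hashed_upload_path and top_level_counts are assumptions made for the example, not the actual MediaWiki code.

```python
import hashlib
from collections import Counter
from pathlib import PurePosixPath


def hashed_upload_path(filename: str, levels: int = 2) -> PurePosixPath:
    """Return a relative path like <h1>/<h1h2>/<filename>.

    Assumption: the subdirectory names are leading hex digits of an MD5
    hash of the filename. The real hash and depth may differ.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Level 1 uses the first hex digit, level 2 the first two, and so on.
    parts = [digest[: i + 1] for i in range(levels)]
    return PurePosixPath(*parts, filename)


def top_level_counts(filenames) -> Counter:
    """Count how many files land in each top-level subdirectory."""
    return Counter(hashed_upload_path(name).parts[0] for name in filenames)


if __name__ == "__main__":
    print(hashed_upload_path("Example.jpg"))
    # Hypothetical filenames that all start with the same letter; bucketing
    # by first letter would put every one of them in a single directory.
    names = [f"Image_{i}.png" for i in range(16000)]
    for bucket, count in sorted(top_level_counts(names).items()):
        print(bucket, count)
```

Because the directory name depends only on the filename, the same file always maps to the same place, and filenames that all begin with the same letter still spread across all sixteen top-level directories.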
One problem, however, is that if PHP is running in "safe mode", its restrictions sometimes stop the subdirectories from working.
-- brion vibber (brion @ pobox.com)
mediawiki-l@lists.wikimedia.org