The next thing to tackle in the new codebase is the image directory.
The current implementation is hard to maintain and clearly not
scalable, so I'll outline my proposed implementation here and ask
for feedback.
Also, I'd like it if there were some convenient way for me to grab a
tarball of Wikipedia's image directory--maybe an FTP logon?
Anyway, here's the implementation: First, I've already implemented
[[image:name.jpg]] links, so that users won't have to know about the
internal image directory structure, and so that wikitext will be more
transportable.
I think the internal directory structure should be de-flattened at
least one level. We can't count on all installations using ReiserFS,
so a single directory with hundreds of files is a performance
problem. It might even pay to go two levels. This means that images
will be placed under directories based on the first letter or two of
their name, e.g., "/upload/c/chessboard.png" or
even "/upload/c/ch/chessboard.png".
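
A rough sketch of that layout, in Python purely for illustration. The
function name image_path, the "levels" parameter, and lowercasing the
directory key are all assumptions of the sketch, not part of the
actual implementation:

```python
import posixpath


def image_path(name, levels=1, root="/upload"):
    """Return the sharded storage path for an image name.

    levels=1 gives /upload/c/chessboard.png
    levels=2 gives /upload/c/ch/chessboard.png
    """
    if not name:
        raise ValueError("image name must not be empty")
    # Assumption: directory names are lowercased so that "Chess.png" and
    # "chess.png" land in the same directory.
    key = name.lower()
    # Each level uses one more leading character of the name.
    parts = [key[:i] for i in range(1, levels + 1)]
    return posixpath.join(root, *parts, name)


print(image_path("chessboard.png"))
print(image_path("chessboard.png", levels=2))
```

Going to two levels only changes the "levels" argument, so the choice
could stay a per-installation setting.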
The database conversion tool will need a function to find all the
existing image links and convert them to the new syntax.
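
One possible shape for that conversion step, as a Python sketch. It
assumes, purely for illustration, that old-style links are bare URLs
pointing into a flat /upload/ directory; the real old format may
differ, and the pattern, the extension list, and the function name
convert_image_links are hypothetical:

```python
import re

# Assumed old format: a bare URL ending in /upload/<file>.<ext>.
OLD_IMAGE_URL = re.compile(
    r"https?://[^\s\[\]]+?/upload/([^\s/\[\]]+\.(?:png|jpe?g|gif))\b",
    re.IGNORECASE,
)


def convert_image_links(wikitext):
    """Rewrite old-style image URLs as [[image:name]] links."""
    return OLD_IMAGE_URL.sub(r"[[image:\1]]", wikitext)


old = "The board: http://example.org/upload/chessboard.png is shown here."
print(convert_image_links(old))
```

Text with no matching URL passes through unchanged, so the function
can be run over every article without a prior filtering pass.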
I've added a database table which contains for each image its name,
size, the time it was uploaded, and by whom. So we'll be able to get
a listing of images sorted by name to make maintenance possible.
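
For discussion, here is what such a table and the sorted listing might
look like, sketched with Python's sqlite3 module. The table and column
names are guesses for the sake of the example and need not match the
real schema, and the sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE image (
        name        TEXT PRIMARY KEY,   -- file name as used in [[image:...]]
        size        INTEGER NOT NULL,   -- size in bytes
        uploaded_at TEXT NOT NULL,      -- upload timestamp
        uploaded_by TEXT NOT NULL       -- uploading user
    )
    """
)
# Illustrative rows only.
conn.executemany(
    "INSERT INTO image VALUES (?, ?, ?, ?)",
    [
        ("zebra.jpg", 48213, "2002-07-01 12:00:00", "ExampleUser"),
        ("chessboard.png", 9120, "2002-07-02 09:30:00", "AnotherUser"),
    ],
)


def list_images(conn):
    """Return all image names sorted by name."""
    rows = conn.execute("SELECT name FROM image ORDER BY name")
    return [row[0] for row in rows]


print(list_images(conn))
```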
The upload function should enforce some reasonable file name
conventions, and it should warn the user who tries to upload unsupported
file formats or very large files (though perhaps the user should be
able to override these warnings).
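
A minimal sketch of how those checks could be split into hard errors
(naming conventions) and overridable warnings (format and size). The
allowed extensions, the size limit, the name pattern, and the function
names are all placeholder assumptions, not decided policy:

```python
import os
import re

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}  # assumed list
MAX_SIZE_BYTES = 1_000_000                              # assumed limit
# Assumed convention: letters, digits, dot, underscore, and hyphen only.
SAFE_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def check_upload(name, size_bytes):
    """Return (errors, warnings) for a proposed upload."""
    errors, warnings = [], []
    if not SAFE_NAME.match(name):
        errors.append("file name contains disallowed characters")
    ext = os.path.splitext(name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        warnings.append("unsupported file format: " + (ext or "(none)"))
    if size_bytes > MAX_SIZE_BYTES:
        warnings.append("file is very large")
    return errors, warnings


def may_upload(name, size_bytes, override_warnings=False):
    """Errors always block; warnings block unless the user overrides."""
    errors, warnings = check_upload(name, size_bytes)
    return not errors and (override_warnings or not warnings)
```

With these placeholder settings, may_upload("movie.avi", 10) is refused
because of the format warning, while passing override_warnings=True
lets it through; a name such as "bad name?.png" is refused either way.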
It would be _very_ nice to have a feature to search for pages that
use a particular image. I can't think of any way to do that but a
fulltext search, but possibly that could be made reasonable by
enforcing naming conventions (for example, requiring a minimum
length, forbidding problematic characters).
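
To make the idea concrete, here is a Python sketch of the brute-force
version: a linear scan over page text standing in for the database's
fulltext search. The function name pages_using_image and the
dict-of-pages input are assumptions made for the example:

```python
import re


def pages_using_image(pages, image_name):
    """Return titles of pages whose wikitext links to image_name.

    pages maps page title to wikitext. Matching the whole
    [[image:...]] link, rather than the bare file name, avoids hits on
    pages that merely mention the name in running text.
    """
    pattern = re.compile(
        r"\[\[image:" + re.escape(image_name) + r"\]\]", re.IGNORECASE
    )
    return sorted(
        title for title, text in pages.items() if pattern.search(text)
    )


pages = {
    "Chess": "The board: [[image:chessboard.png]]",
    "Checkers": "No picture, but chessboard.png is mentioned in passing.",
}
print(pages_using_image(pages, "chessboard.png"))
```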
Any other issues I haven't thought of?