I plan to merge the file backend branch next Monday (PST).
Overview: FileRepo was refactored to use storage paths instead of file system paths. A storage path looks like "mwstore://backend/container/rel_path_to_file". This is somewhat similar to FileRepo virtual URLs, which look like "mwrepo://repo/zone/rel_path_to_file" (though virtual URLs are URL-encoded, of course).
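To make the storage path layout concrete, here is a minimal sketch that splits such a path into its backend, container, and relative path. The function name splitStoragePathSketch() and the sample backend/container names are made up for illustration; this is not code from the branch.

```php
<?php
// Hypothetical helper, written for illustration; not taken from the branch.
// Splits "mwstore://backend/container/rel_path" into its three parts,
// or returns null if the string does not match that layout.
function splitStoragePathSketch( $storagePath ) {
	$prefix = 'mwstore://';
	if ( strncmp( $storagePath, $prefix, strlen( $prefix ) ) !== 0 ) {
		return null;
	}
	// Limit to 3 pieces so slashes inside the relative path are preserved.
	$parts = explode( '/', substr( $storagePath, strlen( $prefix ) ), 3 );
	if ( count( $parts ) !== 3 || $parts[0] === '' || $parts[1] === '' ) {
		return null;
	}
	return array(
		'backend' => $parts[0],
		'container' => $parts[1],
		'rel' => $parts[2],
	);
}

// Example path; the backend and container names are assumptions.
print_r( splitStoragePathSketch( 'mwstore://local-backend/public/a/ab/Example.png' ) );
```

A plain FS path such as /var/www/images/a/ab/Example.png does not carry the mwstore:// prefix, so the sketch returns null for it.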
Some functions, like storeBatch(), still allow FS paths as sources. The important breaking changes are in functions like File::getPath(), which now return storage paths instead of file system paths. The append-related functions were removed, since we now use concatenate instead (already added to trunk in r104687).
The main goal is to abstract storage away so that various backends (FS, Swift, S3, Azure, ...) can be supported. Our current use of NFS for thumbnails is not sustainable in the short term, nor is its use for source files in the long term. Beyond being a single point of failure, NFS doesn't scale very well. With new features like chunked uploads and TimedMediaHandler, we hope to actually have serious video content in the future, which will require a better storage medium.
Other changes:
- Media handler code was minimally affected, as the transform tools are based on FS file reads/output anyway. However, File::transform() will copy the output (if any) to the final storage path destination.
- Upload code was minimally affected too. Initial uploads still work with temp FS source files and call performUpload(). Stash-based uploads still store virtual URLs in the DB to track the uploaded files (from the initial attempt). When the user finishes and uploads from the stash, the usual performUpload() function is called on a local FS copy. Chunked uploads likewise use keys that determine virtual URLs, which use the FileRepo::concatenate() function to create a new storage file. The usual performUpload() function is called on a local FS copy of the file. Improvements could still be made here.
- Minor changes to img_auth.php/thumb.php were also required.
- Thumb handler code was recently added to /trunk; this can eventually be used to replace our custom thumb-handler.php script on our NFS thumbnail cache server.
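As a rough illustration of what concatenating chunks means, the sketch below joins a list of local chunk files into one destination file, in order. It only uses plain PHP file functions on local temp files; concatenateChunksSketch() is a hypothetical name, and this does not show how FileRepo::concatenate() is actually implemented.

```php
<?php
// Hypothetical helper for illustration only; it works on local files and
// is not how FileRepo::concatenate() is implemented.
function concatenateChunksSketch( array $chunkPaths, $destPath ) {
	$out = fopen( $destPath, 'wb' );
	if ( $out === false ) {
		return false;
	}
	foreach ( $chunkPaths as $chunkPath ) {
		$in = fopen( $chunkPath, 'rb' );
		if ( $in === false ) {
			fclose( $out );
			return false;
		}
		// Append this chunk's bytes after whatever was written so far.
		stream_copy_to_stream( $in, $out );
		fclose( $in );
	}
	fclose( $out );
	return true;
}

// Usage with two throwaway temp files standing in for uploaded chunks.
$a = tempnam( sys_get_temp_dir(), 'chunk' );
$b = tempnam( sys_get_temp_dir(), 'chunk' );
file_put_contents( $a, 'first-' );
file_put_contents( $b, 'second' );
$dest = tempnam( sys_get_temp_dir(), 'whole' );
concatenateChunksSketch( array( $a, $b ), $dest );
echo file_get_contents( $dest ), "\n";
```

Building the final file in one pass from ordered chunks is what lets the separate append-style calls go away.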
Breakage: Typically, the more a module makes use of FileRepo and virtual URLs, the less likely it is to break. Even calling File::getPath() and using that as a source to FileRepo::store() will happen to still work. Things like:
a) filemtime( $file->getPath() )
b) copy( $file->getPath(), ... )
c) StreamFile::stream( $file->getPath() )
...will be broken. You will see errors about PHP not finding a wrapper for 'mwstore'.
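The failure mode can be checked in plain PHP: FS functions only understand schemes that have a registered stream wrapper, and 'mwstore' is not one of them. The sketch below guards an FS call with a prefix check; looksLikeStoragePath() and the sample path are hypothetical, and how an extension should obtain a local copy is left out because it depends on the backend API.

```php
<?php
// Hypothetical helper, not part of MediaWiki: detects storage paths by prefix.
function looksLikeStoragePath( $path ) {
	return strncmp( $path, 'mwstore://', 10 ) === 0;
}

// Example path; the backend and container names are made up.
$path = 'mwstore://local-backend/public/a/ab/Example.png';

// PHP can only open schemes listed by stream_get_wrappers();
// 'mwstore' is not registered there in a stock PHP install.
$hasWrapper = in_array( 'mwstore', stream_get_wrappers(), true );

if ( looksLikeStoragePath( $path ) && !$hasWrapper ) {
	// filemtime(), copy(), etc. would fail here with a wrapper error.
	echo "Storage path: get a local copy from the backend before using FS functions\n";
} else {
	echo filemtime( $path ), "\n";
}
```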
For example, ConfirmAccount and NSFileRepo will need updating. Since I wrote the former, it may provide an example for any updates needed. Such extensions will want to use FileRepo with an FSFileBackend and handle storage paths properly. If done correctly, the end-user won't notice anything on upgrade.
All core unit tests pass on my local machine.
End-users: Once bugs are ironed out, nothing should really change for end-users. Setup.php will automatically create backwards-compatible FSFileBackend containers for repositories. There aren't really any user-facing features in this rewrite.
-- View this message in context: http://wikimedia.7.n6.nabble.com/FileBackend-branch-merge-tp1799672p1799672.... Sent from the Wikipedia Developers mailing list archive at Nabble.com.
Very exciting. I can hardly wait until Monday to get my hands on this and try it out on Azure. That's been the missing piece for me. Everything else in MW 18.5 works for me on Azure, except paging beyond the first page on some of the special pages (e.g. Most Wanted). That is due to a SQL Server PDO bug that degrades scrollable cursors to forward-only cursors when using the Common Table Expressions (CTEs) that I introduced to compensate for the lack of LIMIT/OFFSET syntax in T-SQL for SQL Azure.
On Fri, Dec 16, 2011 at 4:40 PM, Aaron S. aschulz4587@gmail.com wrote:
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l