In order to support viewing deleted files (currently a blocker bug in SwiftMedia), I'm going to refactor File::getPath() into a new public function File::getLocalPath(), which will return an instance of a new class TempLocalPath, which will have two methods: getPath(), and close(). This class will own a local copy of the file. When it goes out of scope or its close() method is called (same thing), any resources held by the class will be freed.
With the upcoming FileBackend class and subclasses, this class will be a requirement. Since I need it anyway, I may as well do the work now to create it. File::getPath() will remain as a call, but it will throw an exception if SwiftMedia is installed. When I get finished, its only uses will be by extension writers who have chosen not to publish their code in our SVN.
On 19 October 2011 20:41, Russell Nelson russnelson@gmail.com wrote:
In order to support viewing deleted files (currently a blocker bug in SwiftMedia), I'm going to refactor File::getPath() into a new public function File::getLocalPath(), which will return an instance of a new class TempLocalPath, which will have two methods: getPath(), and close(). This class will own a local copy of the file. When it goes out of scope or its close() method is called (same thing), any resources held by the class will be freed.
With the upcoming FileBackend class and subclasses, this class will be a requirement. Since I need it anyway, I may as well do the work now to create it. File::getPath() will remain as a call, but it will throw an exception if SwiftMedia is installed. When I get finished, its only uses will be by extension writers who have chosen not to publish their code in our SVN.
Is there a more standardised name than "close"? "dispose" is pretty common in compiled languages; do we have any sort of standard for that behaviour within MW? If not, is this a good opportunity to create one?
Other than that it sounds like a good plan to me.
--HM
On Wed, Oct 19, 2011 at 12:55 PM, Happy Melon happy.melon.wiki@gmail.comwrote:
On 19 October 2011 20:41, Russell Nelson russnelson@gmail.com wrote:
In order to support viewing deleted files (currently a blocker bug in SwiftMedia), I'm going to refactor File::getPath() into a new public function File::getLocalPath(), which will return an instance of a new class TempLocalPath, which will have two methods: getPath(), and close(). This class will own a local copy of the file. When it goes out of scope or its close() method is called (same thing), any resources held by the class will be freed.
Is there a more standardised name than "close"? "dispose" is pretty common in compiled languages; do we have any sort of standard for that behaviour within MW? If not, is this a good opportunity to create one?
We have free() on things like database result wrappers; I'd probably stick with that. Generally though nobody has to call these explicitly as the runtime's reference counting auto-destroys objects once they fall out of use, and the destructor calls it for us.
As Tim noted, the particular case in Special:Undelete is probably an example of old code using a lazy interface due to old assumptions, and should actually be changed to use a proper File reference, and have a method to call from the File object / via the FileRepo/FileBackend that wraps streaming output so it doesn't actually need to do a file checkout.
This didn't come up as much for ForeignAPIRepo because we do fewer direct operations on files from non-local repositories: we leave metadata lookups and thumbnailing to the foreign site, so don't have to pass the file into MediaHandler's shell-outs. We don't stream restricted or deleted files out because we don't manage their deletion/undeletion or security.
On Wed, Oct 19, 2011 at 3:06 PM, Platonides Platonides@gmail.com wrote:
Why is this needed? A FileSwift class which needed to fetch the file from Swift could acquire it when calling getPath(), and have it deleted on FileSwift destruction (getPath() is usually called from several methods, so it isn't practical to acquire and destroy every time).
I would broadly warn against unnecessarily checking files out to a local filesystem: while this is fairly cheap with tiny JPEGs and PNG diagrams, and feels cheap on modern computers for several-megabyte photos, it's still potentially very expensive for multi-hundred-meg and multi-gigabyte video files.
Any time we are copying to a local file, we should be sure that we *really mean it*.
A getPath() that implicitly copies files makes me uneasy... I might actually recommend putting a deprecation warning on it to help find and reduce these.
A more explicit check-out / discard interface will let us keep existing code like our thumbnailing that does need to deal with local files, without hiding that it's a potentially expensive operation. And that can encourage us to move things towards interfaces that don't need to check files out, but can read them as streams or skip expensive original-file checks in the first place when they're not really needed.
-- brion
On Thu, Oct 20, 2011 at 2:39 PM, Brion Vibber brion@pobox.com wrote:
As Tim noted, the particular case in Special:Undelete is probably an example of old code using a lazy interface due to old assumptions, and should actually be changed to use a proper File reference, and have a method to call from the File object / via the FileRepo/FileBackend that wraps streaming output so it doesn't actually need to do a file checkout.
I went through all the public (stored in our svn) instances of StreamFile::stream aka wfStream:
./includes/specials/SpecialRevisiondelete.php:319: StreamFile::stream( $path );
Okay: We can use $oimage (an instance of File), which comes from newFromArchiveName().
./includes/specials/SpecialUndelete.php:1011: StreamFile::stream( $path );
Okay: showFile() contains the call. It is only called from a context with a File. It's not being passed in now, but that's easily changed.
./img_auth.php:102:StreamFile::stream( $filename, array( 'Cache-Control: private', 'Vary: Cookie' ) );
No File: $filename is derived from PATH_INFO, and a File is never created.
./thumb.php:127: StreamFile::stream( $thumbPath, $headers );
Okay: $thumbPath came from $img->getThumbPath($params), which will fetch a local copy.
./thumb.php:156: StreamFile::stream( $thumb->getPath(), $headers );
No Change: $thumb is an instance of MediaTransformOutput, not File. Its $thumb->getPath() returns the path to the file in the local filesystem which was written by File::transform(). This actually works as-is because the file doesn't get stored in the media store.
./extensions/ConfirmAccount/ConfirmAccount_body.php:396: wfStreamFile( $path );
No File: calls getZonePath and computes the filename itself. Note that it's not using the same hashing scheme as anything else, although it most closely resembles deleted files. Even though it is stored in the same place as media files, it cannot be accessed as if it were a File because the hashing is different. Probably should be accessing the FileBackend directly.
./extensions/ConfirmAccount/UserCredentials_body.php:199: wfStreamFile( $path );
No File: calls getZonePath and computes the filename itself. Note that it's not using the same hashing scheme as anything else, although it most closely resembles deleted files. Even though it is stored in the same place as media files, it cannot be accessed as if it were a File because the hashing is different. Probably should be accessing the FileBackend directly.
./extensions/ConfirmEdit/FancyCaptcha.class.php:196: wfStreamFile( $file );
No File: Has its own top-level directory and hashing level. Not suitable for access as a File. Probably should be accessing the FileBackend directly.
El 19/10/11 21:41, Russell Nelson escribió:
In order to support viewing deleted files (currently a blocker bug in SwiftMedia), I'm going to refactor File::getPath() into a new public function File::getLocalPath(), which will return an instance of a new class TempLocalPath, which will have two methods: getPath(), and close(). This class will own a local copy of the file. When it goes out of scope or its close() method is called (same thing), any resources held by the class will be freed.
With the upcoming FileBackend class and subclasses, this class will be a requirement. Since I need it anyway, I may as well do the work now to create it. File::getPath() will remain as a call, but it will throw an exception if SwiftMedia is installed. When I get finished, its only uses will be by extension writers who have chosen not to publish their code in our SVN.
Why is this needed? A FileSwift class which needed to fetch the file from Swift could acquire it when calling getPath(), and have it deleted on FileSwift destruction (getPath() is usually called from several methods, so it isn't practical to acquire and destroy every time).
On 19 October 2011 23:06, Platonides Platonides@gmail.com wrote:
El 19/10/11 21:41, Russell Nelson escribió:
In order to support viewing deleted files (currently a blocker bug in SwiftMedia), I'm going to refactor File::getPath() into a new public function File::getLocalPath(), which will return an instance of a new class TempLocalPath, which will have two methods: getPath(), and close(). This class will own a local copy of the file. When it goes out of scope or its close() method is called (same thing), any resources held by the class will be freed.
With the upcoming FileBackend class and subclasses, this class will be a requirement. Since I need it anyway, I may as well do the work now to create it. File::getPath() will remain as a call, but it will throw an exception if SwiftMedia is installed. When I get finished, its only uses will be by extension writers who have chosen not to publish their code in our SVN.
Why is this needed? A FileSwift class which needed to fetch the file from Swift could acquire it when calling getPath(), and have it deleted on FileSwift destruction (getPath() is usually called from several methods, so it isn't practical to acquire and destroy every time).
Being able to have files as resources which can be made temporarily available and then destroyed is valuable for deleted images even outside SwiftMedia, though, no? The current system where we serve images through index.php just to add authentication is not pretty...
--HM
This is needed because some code asks for a path to the file without having a File. For example SpecialUndelete::showFile(). Basically encapsulation punch-through. This is a clean-up. If you disagree, of course feel free to propose something else.
On Wed, Oct 19, 2011 at 6:06 PM, Platonides Platonides@gmail.com wrote:
El 19/10/11 21:41, Russell Nelson escribió:
In order to support viewing deleted files (currently a blocker bug in SwiftMedia), I'm going to refactor File::getPath() into a new public function File::getLocalPath(), which will return an instance of a new class TempLocalPath, which will have two methods: getPath(), and close(). This class will own a local copy of the file. When it goes out of scope or its close() method is called (same thing), any resources held by the class will be freed.
>
With the upcoming FileBackend class and subclasses, this class will be a requirement. Since I need it anyway, I may as well do the work now to create it. File::getPath() will remain as a call, but it will throw an exception if SwiftMedia is installed. When I get finished, its only uses will be by extension writers who have chosen not to publish their code in our SVN.
Why is this needed? A FileSwift class which needed to fetch the file from Swift could acquire it when calling getPath(), and have it deleted on FileSwift destruction (getPath() is usually called from several methods, so it isn't practical to acquire and destroy every time).
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Russell Nelson wrote:
This is needed because some code asks for a path to the file without having a File. For example SpecialUndelete::showFile(). Basically encapsulation punch-through. This is a clean-up. If you disagree, of course feel free to propose something else.
They should be changed to fetch a File (eg. a FileDeleted class), but I don't see why that follows that we need a TempLocalPath class.
On Wed, Oct 19, 2011 at 7:00 PM, Platonides Platonides@gmail.com wrote:
Russell Nelson wrote:
This is needed because some code asks for a path to the file without having a File. For example SpecialUndelete::showFile(). Basically encapsulation punch-through. This is a clean-up. If you disagree, of course feel free to propose something else.
They should be changed to fetch a File (eg. a FileDeleted class), but I don't see why that follows that we need a TempLocalPath class.
SwiftMedia files are stored in an object store (as are Microsoft Azure). Some functions need local access to these files. getPath() simply promises to hand over the path to the files if it were stored in the local filesystem.
On 20/10/11 06:41, Russell Nelson wrote:
In order to support viewing deleted files (currently a blocker bug in SwiftMedia), I'm going to refactor File::getPath() into a new public function File::getLocalPath(), which will return an instance of a new class TempLocalPath, which will have two methods: getPath(), and close(). This class will own a local copy of the file. When it goes out of scope or its close() method is called (same thing), any resources held by the class will be freed.
With the upcoming FileBackend class and subclasses, this class will be a requirement. Since I need it anyway, I may as well do the work now to create it. File::getPath() will remain as a call, but it will throw an exception if SwiftMedia is installed. When I get finished, its only uses will be by extension writers who have chosen not to publish their code in our SVN.
File::getPath() should return false if a local path is not available, not throw an exception. This is the documented behaviour, and is already implemented by ForeignAPIFile. Any File::getPath() callers which don't guard against this should be changed to do so.
The wfStreamFile() callers should call FileRepo::streamFile(), not File::getLocalPath(), as we previously discussed.
Making a local copy of a file is probably not the most efficient way to stream the file out to the client, for any backend. So streaming should be done by the repo, not by a UI module like Special:Undelete.
Finalising a true streaming interface between the FileRepo and the FileBackend is not essential in the short term, it can be done later to reduce the scope of the current FileBackend project. But it's important that external callers use a FileRepo interface which is flexible and allows for future performance improvements.
-- Tim Starling
wikitech-l@lists.wikimedia.org