On Thu, 23 Sep 2010 20:13:23 -0700, Brion Vibber wrote:
On Thu, Sep 23, 2010 at 7:19 PM, Dan Nessett dnessett@yahoo.com wrote:
I appreciate your recent help, so I am going to ignore the tone of your last message and focus on the issues. While a test run can set up, use, and then delete the temporary resources it needs (e.g., the db, images directory, etc.), you really haven't answered the question I posed. If a test run ends abnormally, it will not delete those resources. There has to be a way to garbage collect orphaned dbs, images directories, and cache entries.
Any introductory Unix sysadmin handbook will include examples of shell scripts to find old directories and remove them, etc. For that matter you could simply delete *all* the databases and files on the test machine every day before test runs start, and not spend even a second of effort worrying about cleaning up individual runs.
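For instance, a daily cron job along these lines would do. The testwiki_ database prefix and the /srv/testruns path are just placeholders, and the mysql client is assumed to authenticate via ~/.my.cnf or socket auth:

    #!/bin/sh
    # Drop every leftover test database (testwiki_ prefix is a placeholder).
    mysql -N -e "SHOW DATABASES LIKE 'testwiki\_%'" | while read db; do
        mysql -e "DROP DATABASE \`$db\`"
    done
    # Remove test-run working directories more than a day old
    # (the /srv/testruns path is a placeholder).
    find /srv/testruns -mindepth 1 -maxdepth 1 -type d -mtime +0 -exec rm -rf {} +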
Since each test database is a fresh slate, there is no shared state between runs -- there is *no* need to clean up immediately between runs or between test sets.
My personal view is that we should start out simple (as you originally suggested) with a set of fixed URLs that are used serially by test runs. Implementing this is probably the easiest option and would allow us to get something up and running quickly. This approach doesn't require significant development, although it does require a way to control access to the URLs so test runs don't step on each other.
What you suggest is more difficult and harder to implement than creating a fresh database for each test run, and gives no clear benefit in exchange.
Keep it simple by *not* implementing this idea of a fixed set of URLs which must be locked and multiplexed. Creating a fresh database & directory for each run does not require any additional development. It does not require devising any access control. It does not require devising a special way to clean up resources or restore state.
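Concretely, the per-run setup can be as small as this sketch; the run-id scheme and paths are placeholders, not anything the current test harness provides:

    #!/bin/sh
    # Give each run its own throwaway database and upload directory; nothing
    # from a previous run is reused, so no per-run cleanup is required.
    RUN_ID="testrun_$(date +%Y%m%d_%H%M%S)_$$"
    mysql -e "CREATE DATABASE \`$RUN_ID\`"
    mkdir -p "/srv/testruns/$RUN_ID/images"
    # ... generate a LocalSettings.php pointing at $RUN_ID and run the tests ...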
-- brion
I am genuinely sorry that you feel obliged to couch your last two replies in an offensive manner. You have done some very good work on the MediaWiki code and deserve a great deal of credit for it.
It is clear you do not understand what I am proposing (that is, what I proposed after I accepted your suggestion to keep things simple):
+ Every test run does create a fresh database (and a fresh images directory). It does this by first dropping the database associated with the previous run, recursively deleting the phase3 directory holding the previous run's code, checking out the revision under test (or, if that is judged too expensive, holding the revisions in tar files and untarring them into the directory), adjusting things so that the wiki will work (e.g., recursively chmoding the images directory so it is writable), and installing a LocalSettings file so that things like ImageMagick, texvc, etc. are locatable and the global variables are set appropriately. All of this is done in the directory associated with the fixed URL (a sketch follows this list).
+ Before each test suite in a regression test runs, it prepares the wiki. For example, if a prerequisite for the suite is the availability of an image, it uploads one before starting.
+ The regression test can be guarded by writing a lock file in the images directory (which protects all of the code and data directories). When the regression test completes, the lock file is removed and the next regression test can start. If there is only one test driver application running, the lock file is unnecessary. If the test driver application crashes for some reason, a simple utility can sweep the directory structures associated with the URLs and remove everything. A sketch of a single run, including this guard, follows.
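Here is a rough sketch of a single run against the fixed URL. All paths, the database name, and the repository URL are illustrative assumptions, and populating the fresh database is only indicated by a comment:

    #!/bin/sh
    WIKI_DIR=/srv/testwiki      # the directory the fixed URL maps to
    DB=testwiki                 # database name used by the installed LocalSettings.php
    REV="$1"                    # revision under test

    # If another regression test still holds the lock, back off (only needed
    # when more than one test driver can run at once).
    if [ -e "$WIKI_DIR/phase3/images/testrun.lock" ]; then
        echo "a test run is already in progress" >&2
        exit 1
    fi

    # Tear down whatever the previous run left behind.
    mysql -e "DROP DATABASE IF EXISTS \`$DB\`"
    rm -rf "$WIKI_DIR/phase3"

    # Install the code for this run (a checkout, or untarring a cached tarball).
    svn checkout -q -r "$REV" \
        http://svn.wikimedia.org/svnroot/mediawiki/trunk/phase3 "$WIKI_DIR/phase3"
    chmod -R a+w "$WIKI_DIR/phase3/images"
    cp /srv/testconfig/LocalSettings.php "$WIKI_DIR/phase3/LocalSettings.php"
    mysql -e "CREATE DATABASE \`$DB\`"
    # (populate the new database here, e.g. by loading a schema dump)

    # Guard the run; a crashed driver leaves the lock behind, and the sweep
    # utility simply removes $WIKI_DIR/phase3, taking the stale lock with it.
    touch "$WIKI_DIR/phase3/images/testrun.lock"
    # ... run the test suites against the fixed URL here ...
    rm -f "$WIKI_DIR/phase3/images/testrun.lock"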
While this is only a sketch, it is obviously simpler than attempting to set up a test run using the Wikipedia wiki-family scheme. The previous run's db, images directory, etc. are deleted at the beginning of the next run that uses the fixed URL and its associated directory space. There is no need to "Configure your DNS with a wildcard A record, and apache with a server alias (like ServerAlias *.yourdomain.net)", something that may not be possible at some sites. There is no need to dynamically edit LocalSettings to fix up the upload-directory global variable.
As for cleaning up between runs, this simply ensures that the database server doesn't become clogged with extraneous databases, that directory space is used efficiently, and that memcached doesn't hold useless data. While it may be possible to get by with cleaning up every 24 hours, the fact that the fresh-wiki installation process cleans up these resources by default means such a global sweep is completely unnecessary.