On Sat, Oct 11, 2008 at 9:25 AM, Stephen Bain <stephen.bain@gmail.com> wrote:
He's right, your analogy is misguided. You're placing importance on the time between the beginning of the round of dumps until the completion of the dump for a particular project, when what's really important is the time between dumps for that particular project.
Say that dumping project A takes 3 days, project B 8 days and project C one day. Your argument seems to be that since project C is the fastest to complete, it should be dumped first. But your argument doesn't take into account that these are not one-off transactions, but repeated ones.
If the dumping cycle is repeated monthly, starting on the first of the month, then project A will get their dump on the 4th, B on the 12th and C not until the 13th of the month. But C's previous dump will have been on the 13th of the preceding month, so just like A and B, C had to wait a month for their dump.
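To make the arithmetic in that example concrete, here is a small sketch. It assumes the dumps run back to back on a single thread starting on the 1st; the function name `completion_days` is made up for illustration.

```python
def completion_days(durations, start_day=1):
    """Day of the month on which each dump finishes when the dumps
    run one after another, in the order given."""
    day = start_day
    finished = {}
    for name, days in durations.items():
        day += days
        finished[name] = day
    return finished

# Order from the example: A (3 days), B (8 days), C (1 day).
print(completion_days({"A": 3, "B": 8, "C": 1}))  # {'A': 4, 'B': 12, 'C': 13}

# Shortest first: C gets an earlier date, but the date is the same every
# month, so the gap between C's successive dumps is still one cycle.
print(completion_days({"C": 1, "A": 3, "B": 8}))  # {'C': 2, 'A': 5, 'B': 13}
```

Reordering changes the date on which each project gets its dump, not the interval between that project's dumps, as long as the cycle length stays fixed.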
There is a difference, depending where you make the split:
Say Project A takes 1 month to dump, Project B takes 1 month to dump, and 30 other projects take 1 day each to dump. Say you're CPU bound and have 2 CPUs, so you run 2 threads.
Method 1: Project A and Project B run in parallel, one per thread, and take 1 month to complete. Then the rest of the projects, split across both threads, take 1/2 month to complete. Time between successive dumps is 1.5 months for every project.
Method 2: One thread dumps Project A for 1 month while the other thread completes the 30 other projects. Then the first thread dumps Project B for 1 month while the other thread completes the 30 other projects again. Time between successive dumps is 2 months for projects A and B, and 1 month for the rest of the projects.
On the other hand, if the 30 "other projects" took 4 days each to complete, the times would be 3 months by method 1 (1 month for A and B, then 2 months for the rest across both threads), and 2 months/4 months by method 2. It all depends where you make the split, and it's not clear to me which split is the most "fair".
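A rough sketch of the two methods, for anyone who wants to try other numbers. It assumes a 30-day month, exactly 2 threads, and small projects that divide evenly between the threads; the function names are made up for illustration.

```python
DAYS_PER_MONTH = 30  # assumption: treat a month as 30 days

def method1(big_days, small_days, n_small, threads=2):
    """Big projects run in parallel first, then all threads share the
    small projects. Assumes no more big projects than threads.
    Returns (big interval, small interval) in months."""
    cycle = max(big_days) + small_days * n_small / threads
    return cycle / DAYS_PER_MONTH, cycle / DAYS_PER_MONTH

def method2(big_days, small_days, n_small):
    """One thread loops over the big projects while the other loops
    over the small ones. Returns (big interval, small interval) in months."""
    big_cycle = sum(big_days)
    small_cycle = small_days * n_small
    return big_cycle / DAYS_PER_MONTH, small_cycle / DAYS_PER_MONTH

for small_days in (1, 4):
    print(small_days,
          method1([30, 30], small_days, 30),
          method2([30, 30], small_days, 30))
# 1 (1.5, 1.5) (2.0, 1.0)
# 4 (3.0, 3.0) (2.0, 4.0)
```

By this arithmetic, with 1-day small projects method 2 is better for the small projects and worse for A and B; with 4-day small projects it is the other way around, since method 1 comes out at 3 months for everyone.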
Anthony