On Mon, Aug 15, 2011 at 4:16 PM, Russell N. Nelson - rnnelson <rnnelson@clarkson.edu> wrote:
> Exactly what I propose. Keep a list of files and their sizes, so that when somebody asks for a range, you can skip files up until you get to the range they've requested. Not worrying about new or already-downloaded changed files, or deleted files. You're not getting a "current" copy of the files, you're getting a copy of the files that were available when you started your download. Minus the deleted files, which by policy we shouldn't be handing out anyway.
Except the ones that weren't deleted when you started your download, I presume? Otherwise you've now got an inconsistent data set. And of course for anything that has changed, you'll want to make sure you can access the original version, not the new version, or else the size or contents will be wrong and you'll end up sending bad info.
> rsync doesn't have the MW database to consult for changes.
That's an implementation detail, isn't it? GNU tar doesn't either.
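For concreteness, here is a rough sketch of the "list of files and their sizes" idea from above. It assumes the download is a plain concatenation of file bodies (no tar headers or padding), which is a simplification, and the names build_offsets and locate_range and the file names are made up for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(manifest):
    """manifest: list of (name, size) pairs, frozen when the download started.

    Returns (starts, total): the starting byte offset of each file in the
    concatenated stream, and the total stream length.
    """
    ends = list(accumulate(size for _, size in manifest))
    starts = [0] + ends[:-1]
    return starts, (ends[-1] if ends else 0)


def locate_range(manifest, start, end):
    """Map stream bytes [start, end) to a list of (name, offset, length)."""
    starts, total = build_offsets(manifest)
    start = max(start, 0)
    end = min(end, total)
    if start >= end:
        return []
    # Skip straight to the file containing `start` without opening earlier ones.
    i = bisect_right(starts, start) - 1
    pos = start
    pieces = []
    while pos < end and i < len(manifest):
        name, size = manifest[i]
        offset = pos - starts[i]
        length = min(size - offset, end - pos)
        if length > 0:
            pieces.append((name, offset, length))
        pos += length
        i += 1
    return pieces


manifest = [("a.png", 100), ("b.png", 50), ("c.png", 200)]
print(locate_range(manifest, 120, 160))
```

The lookup only works if every file still has the size recorded in the list, so a server built this way would have to serve the original version of anything that changed, or fail the request, rather than send whatever is on disk now.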
-- brion