Gregory Maxwell said:
"Is there a useful way to subset Wikipedia?" I believe that there are many useful ways, but I believe that all of the best ways require us to have some idea of the level of notability of each article.
If you can define it algorithmically, have at it! If not, hauling perfectly verifiable, NPOV articles before VfD and saying "does not establish notability" may be one way of doing it, but you may not get a particularly accurate answer because the threat of execution tends to force opinions. Perhaps a notability project might be more popular, as long as notability is divorced from deletability (if not, the notability project would acquire a rather bad reputation).
We're more limited by distribution media: DVD gives us 4.5 gigs. Of course we could span multiple disks, but then the user will likely have to copy it all onto the computer. The more disks we span, the greater the cost of reproducing the material.
I believe the DVD versions that have been mooted tend to stick to the opening section, and omit the detail. With categories, it's possible to select interesting categories, and there are various other projects in progress to subset the data. They adopt a "build up" approach rather than "start with the entire database and delete stuff item by item".
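For what it's worth, the "build up" idea can be sketched in a few lines. This is only a rough illustration, not how any of those projects actually work: the Article record, the build_subset function, the single-category-per-article simplification and the byte counts are all made up for the example. It walks the articles, keeps the opening sections of those in the chosen categories, and stops adding anything that would push the total past the size budget.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Article(NamedTuple):
    # Hypothetical record; real articles can belong to many categories.
    title: str
    category: str
    lead_bytes: int  # size of the opening section only


def build_subset(
    articles: Iterable[Article],
    wanted_categories: Iterable[str],
    budget_bytes: int,
) -> Tuple[List[str], int]:
    """Start from nothing and add articles from wanted categories while they fit."""
    wanted = set(wanted_categories)
    chosen: List[str] = []
    used = 0
    for article in articles:
        if article.category not in wanted:
            continue  # never considered, rather than deleted afterwards
        if used + article.lead_bytes > budget_bytes:
            continue  # does not fit; a smaller article later on still might
        chosen.append(article.title)
        used += article.lead_bytes
    return chosen, used


# Made-up sample data, purely for illustration.
sample = [
    Article("Oxygen", "Chemistry", 400),
    Article("Some game", "Games", 900),
    Article("Carbon", "Chemistry", 700),
    Article("Paris", "Geography", 300),
]
titles, used = build_subset(sample, ["Chemistry", "Geography"], 1000)
print(titles, used)
```

The order in which articles are visited decides what gets in when space runs short, so a real selection would presumably want to sort by some measure of importance first, which brings us back to the notability question.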