On Sun, Aug 2, 2009 at 6:29 PM, Michael Dale <mdale@wikimedia.org> wrote:
[snip]
two quick points.
- you don't have to re-upload the whole video just the sha1 or some
sort of hash of the assigned chunk.
But each re-encoder must download the source material.
I agree that uploads aren't much of an issue.
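For illustration, the hash-only confirmation described above might look like the following minimal Python sketch. The function names and the two-report threshold are hypothetical, and it assumes the encoder is deterministic, so that two honest clients produce byte-identical output for the same chunk; nothing in this thread establishes that.

```python
import hashlib
from collections import Counter
from typing import Iterable


def chunk_digest(encoded_chunk: bytes) -> str:
    """Return the hex SHA-1 of an encoded chunk (what a client would report)."""
    return hashlib.sha1(encoded_chunk).hexdigest()


def chunk_confirmed(reported_digests: Iterable[str], required: int = 2) -> bool:
    """True if at least `required` reports agree on one digest.

    `required` is an assumed policy knob, not something specified in the thread.
    """
    counts = Counter(reported_digests)
    if not counts:
        return False
    # most_common(1) yields the digest with the highest number of reports
    _, top_count = counts.most_common(1)[0]
    return top_count >= required


if __name__ == "__main__":
    a = chunk_digest(b"encoded bytes from client A")
    b = chunk_digest(b"encoded bytes from client A")  # second client, same output
    print(chunk_confirmed([a, b]))
```

Only the 40-character digest travels back to the server, but every client that computes one still had to fetch the source chunk first.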
[snip]
other random clients that are encoding other pieces would make abuse very difficult... at the cost of a few small http requests after the encode is done, and at a cost of slightly more CPU cycles of the computing pool.
Is >2x slightly? (Greater because some clients will abort/fail.)
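To put a rough number on that, here is a back-of-the-envelope sketch. It assumes each chunk needs k matching encodes and that each attempt independently aborts or fails with probability f; both parameters and the function name are illustrative, not measured values.

```python
def expected_attempts(confirmations: int, failure_rate: float) -> float:
    """Expected encode attempts per chunk to collect `confirmations` successes.

    Assumes independent attempts that each fail with probability
    `failure_rate` (a hypothetical figure), so the expected number of
    attempts is confirmations / (1 - failure_rate).
    """
    if confirmations < 1:
        raise ValueError("confirmations must be at least 1")
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return confirmations / (1.0 - failure_rate)


if __name__ == "__main__":
    # Two confirmations per chunk, with an assumed 20% abort/fail rate.
    print(expected_attempts(2, 0.2))
```

With no failures at all the multiplier is exactly 2; any nonzero failure rate pushes it above 2, and each of those attempts presumably also means another download of the source material.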
Even that leaves open the risk that a single troublemaker will register a few accounts and confirm their own blocks. You can fight that too, but it's an arms race with no end. I have no doubt that the problem can be made tolerably rare, but at what cost?
I don't think it's all that acceptable to significantly increase the resources used for the operation of the site just for the sake of pushing the capital and energy costs onto third parties, especially when it appears that the cost to Wikimedia will not decrease (but instead be shifted from equipment cost to bandwidth and developer time).
[snip]
We need to start exploring the bittorrent integration anyway to distribute the bandwidth cost on the distribution side. So this work would lead us in a good direction as well.
http://lists.wikimedia.org/pipermail/wikitech-l/2009-April/042656.html
I'm troubled that Wikimedia is suddenly so interested in all these cost externalizations which will dramatically increase the total cost but push those costs off onto (sometimes unwilling) third parties.
Tech spending by the Wikimedia Foundation is a fairly small portion of the budget, enough that it has drawn some criticism. Behaving in the most efficient manner is laudable and the WMF has done excellently on this front in the past. Behaving in an inefficient manner in order to externalize costs is, in my view, deplorable and something which should be avoided.
Has some organizational problem arisen within Wikimedia which has made it unreasonably difficult to obtain computing resources, but easy to burn bandwidth and development time? I'm struggling to understand why development-intensive externalization measures are being regarded as first choice solutions, and invented ahead of the production deployment of basic functionality.