On 6/8/06, Erik Moeller <eloquence@gmail.com> wrote:
> I wrote specifications, a budget and a full proposal for Kennisnet in February. This proposal named the developers, the roles and responsibilities, and the duration of development. The proposal, as written in February, is at http://scireview.de/temp/instantcommons_2.odt
I have some questions from a technical perspective.
1. What is the purpose of the XML-RPC subsystem? Since the images are fetched via HTTP in any case, why not simply perform the existence test via plain HTTP as well? If that were done, no modifications would be needed on the commons side. Without more explanation, the addition of an XML-RPC subsystem looks like an unnecessary exercise in complexity.
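For illustration, a minimal sketch of the plain-HTTP existence test I have in mind. It assumes MediaWiki's hashed upload directory layout; the exact URL scheme here is my assumption, not something taken from the proposal:

    import hashlib
    import urllib.error
    import urllib.parse
    import urllib.request

    COMMONS_UPLOAD_BASE = "http://upload.wikimedia.org/wikipedia/commons"

    def commons_image_exists(filename: str) -> bool:
        """Probe for a file on commons with a plain HTTP HEAD request --
        no XML-RPC endpoint, and no changes needed on the commons side."""
        # MediaWiki stores uploads under <md5[0]>/<md5[0:2]>/<name>,
        # where the MD5 is taken over the name with spaces as underscores.
        name = filename.replace(" ", "_")
        digest = hashlib.md5(name.encode("utf-8")).hexdigest()
        url = (f"{COMMONS_UPLOAD_BASE}/{digest[0]}/{digest[:2]}/"
               f"{urllib.parse.quote(name)}")
        request = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(request) as response:
                return response.status == 200
        except urllib.error.HTTPError:
            return False

A HEAD request transfers no body, so the existence test costs commons a single cheap request and nothing more.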
2. It would appear from the specification that there is no facility for caching failures, so if a page contains a non-existent image, commons will be polled every time the page is rendered. A number of mechanisms could reduce this, from negative caching to periodically updated bloom filters of commons filenames; a negative-cache sketch follows below.
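To make the negative-caching option concrete, a rough sketch (the class name and TTL are made up for illustration):

    import time

    class NegativeCache:
        """Remember recent lookup failures so a page containing a
        non-existent image doesn't poll commons on every render."""

        def __init__(self, ttl_seconds: float = 3600.0):
            self.ttl = ttl_seconds
            self._misses: dict[str, float] = {}

        def known_missing(self, filename: str) -> bool:
            """True if this filename failed a lookup within the TTL."""
            expiry = self._misses.get(filename)
            if expiry is None:
                return False
            if time.monotonic() > expiry:
                del self._misses[filename]  # stale entry; allow a re-check
                return False
            return True

        def record_miss(self, filename: str) -> None:
            self._misses[filename] = time.monotonic() + self.ttl

The wiki would consult known_missing() before any network round trip and call record_miss() on a 404. A bloom filter over the set of commons filenames, refreshed periodically, would go further still: it answers "definitely absent" with no round trip at all, at the cost of occasional "maybe present" false positives that fall through to a real lookup.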
3. The proposed method of interwiki transclusion doesn't appear fully formed enough to determine whether it will be strong enough to prevent InstantCommons from accidentally becoming an automated system for license violations. In particular, there doesn't appear to be any strong assurance that attribution and license data will *always* be available when the image is available.
4. Although copyright concerns are mentioned, they don't seem to be explored in depth. Commons hosts a huge number of copyright violations today, and I think the expectation that small wiki operators will run a deletion script is unreasonable. This area should be explored in depth, perhaps by making only certain images on commons available for automated replication.
5. If the remote wiki will download the full image in all cases, what is the purpose of burdening commons with the additional transfer and storage costs of thumbnail generation? Yes, if a size has been used before, commons will already have it, but other wikis are likely to request many sizes that are never used on Wikimedia sites, and even in the best case the thumbnails still cost us additional bandwidth. The remote side will need to perform its own thumbnailing for large images on the image description page in any case, so it could just as well scale everything locally, as sketched below.
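Local scaling on the remote side is cheap; a sketch using PIL/Pillow (the library choice is mine, purely for illustration):

    from PIL import Image

    def make_local_thumbnail(original_path: str, thumb_path: str,
                             target_width: int) -> None:
        """Scale a downloaded original locally rather than asking commons
        to generate, transfer, and store yet another thumbnail size."""
        with Image.open(original_path) as img:
            height = max(1, round(img.height * target_width / img.width))
            img.resize((target_width, height), Image.LANCZOS).save(thumb_path)

Once the full-size original has been fetched, every further size can be produced without touching commons at all.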
6. How will this handle SVGs, which are widely used on commons but not supported by MediaWiki out of the box? Our SVG support requires a modified version of librsvg in order to operate securely.
There are also some more complex issues, such as where the 5,000 EUR fee comes from for what is, overall, such a simple feature set. But I don't want to create a flood of comments initially.