On 6/8/06, Erik Moeller <eloquence(a)gmail.com> wrote:
> I wrote specifications, a budget and a full proposal for Kennisnet in
> February. This proposal named the developers, the roles and
> responsibilities, and the duration of development. The proposal, as
> written in February, is at

I have some questions from a technical perspective.
1. What is the purpose of the XML-RPC subsystem? Since the images
are fetched via HTTP in any case, why not simply perform the existence
test via plain HTTP as well? If that were done, no modifications
would be needed on the commons side. Without more explanation, the
addition of the XML-RPC subsystem seems like an unnecessary exercise.
2. It would appear from the specification that there is no facility
for caching failures. So if a page contains a non-existent image,
commons will be polled every time the page is rendered. A number of
mechanisms would be possible to reduce this, from negative caching
to periodically updated filename bloom filters.
3. The proposed method of interwiki transclusion doesn't appear fully
formed enough to determine if it will be sufficiently strong to
prevent instant commons from accidentally becoming an automated system
for license violations. In particular there doesn't appear to be any
strong assurance that attribution and license data will *always* be
available when the image is available.
4. Although copyright concerns are mentioned, they don't seem to be
explored in depth. Commons has a huge number of copyright violations
on it today. I think the expectation that small wiki operators will
run a deletion script is unreasonable, and that this area needs much
more attention. One option would be to make only certain images on
commons available for automated replication.
5. If the remote wiki will download the full image in all cases, what
is the purpose of burdening commons with the additional transfer and
storage costs of their thumbnail generation? Yes, if that size has
been used before commons will have it... but it's quite likely that
other wikis will use a large number of sizes which are not used on
Wikimedia sites, and even in the best case they still cost us
additional bandwidth. The far side will still need to perform
thumbnailing for the image page for large images in any case.
6. How will this address SVGs, which are widely used on commons but
not supported by mediawiki out of the box? Our support for SVGs
requires a modified version of librsvg in order to operate securely.
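
To make points 1 and 2 concrete, here is a minimal Python sketch of
what I have in mind: an existence test done with a plain HTTP HEAD
request, fronted by a negative cache so that a missing image is not
re-polled on every render. None of this comes from the specification.
The names (http_image_exists, NegativeCache, image_available), the
one-hour expiry and the treatment of 404 as "missing" are all
assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def http_image_exists(url, timeout=5.0, opener=urllib.request.urlopen):
    """Existence test over plain HTTP: send HEAD, treat 404 as missing.

    `opener` is injectable only so the sketch can be exercised offline.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other errors are not proof of absence, so do not cache them


class NegativeCache:
    """Remembers failed lookups for a limited time (hypothetical helper)."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._misses = {}  # url -> time at which the miss expires

    def record_miss(self, url):
        self._misses[url] = self._clock() + self._ttl

    def is_known_missing(self, url):
        expiry = self._misses.get(url)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # The entry is stale: forget it so the next lookup re-polls.
            del self._misses[url]
            return False
        return True


def image_available(url, cache, exists=http_image_exists):
    """Poll the remote side only when no unexpired miss is cached."""
    if cache.is_known_missing(url):
        return False
    found = exists(url)
    if not found:
        cache.record_miss(url)
    return found
```

With something along these lines, a page that references a
non-existent image costs one remote request per expiry period rather
than one per render, and nothing needs to change on the commons side.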
There are also some more complex issues, like where the 5,000 EUR fee
comes from for what is, overall, such a simple feature set. But I
don't want to create a flood of comments initially.