On 09-09-2010, Thursday, at 20:08 +0200, Roan Kattouw wrote:
2010/9/8 David Gerard dgerard@gmail.com:
This is something that's been a problem for years now.
I do not think there is any sort of deliberate intent. However, keeping the data close is a way to proprietise a wiki even if it's free content, so making it easy to fork is an important attitude to maintain.
I realise this is difficult when the devs have to work as hard as possible just to keep everything from falling over ...
That's right, there is no deliberate intent and it's really a lack of people on the ops side (dumps are an ops thing, not a dev thing, and devs generally can't do much to help here). WMF is also not "ignoring" requests to provide image dumps, it just hasn't gotten around to setting them up yet; presumably, this is because text dumps aren't running smoothly yet (I'd appreciate a reply from Ariel Glenn to get the facts here, but since Ariel is out of the country I may or may not get my wish).
It's true that the dumps situation is still a problem, but you (OP) should assume some good faith here rather than accusing the WMF of ignoring you, not earning the community's trust or even trying to usurp Wikipedia. You're right, you are being paranoid.
I am not thinking about image dumps at all. I am concentrating on the regular XML dumps, which have been in sorry shape for various reasons ever since I started as a volunteer in the community adding content. (Note that I am not laying blame for the sorry state; that's not the point.)
For the rest of September I'll be fooling with these parallel runs until I get something that seems to perform well. For the next 5-6 days I'm out of action on them, but after that it's back to the grind. Today, though I should have been working on something else, I spent the day crunching some numbers and trying to figure out what better chunk sizes ought to be. Since the earlier articles have by far the bulk of the revisions, chunks that each cover the same number of pages end up badly unbalanced, and it turns out I need to write some code to deal with that. Anyways, either I'm (mostly) hard at work on this problem or I'm secretly plotting to run off with all the old copies of wikipedia to Bermuda and retire.... :-P
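To make the chunk-size point concrete, here is a rough sketch of the idea, not the actual dump code. It assumes we already have a per-page revision count in page-ID order (the list `rev_counts` and the function name `chunk_by_revisions` are made up for illustration) and cuts the page range so that each chunk carries roughly the same number of revisions rather than the same number of pages.

```python
from typing import List, Sequence, Tuple


def chunk_by_revisions(rev_counts: Sequence[int], n_chunks: int) -> List[Tuple[int, int]]:
    """Split pages (in ID order) into contiguous ranges with roughly equal
    revision totals. Returns (start, end) index pairs, end exclusive.

    Hypothetical helper: rev_counts[i] is the number of revisions of the
    i-th page; how those counts are obtained is outside this sketch.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    total = sum(rev_counts)
    chunks: List[Tuple[int, int]] = []
    start = 0
    running = 0

    for i, count in enumerate(rev_counts):
        running += count
        # Cumulative revision total at which the current chunk should end.
        boundary = total * (len(chunks) + 1) / n_chunks
        # Leave the last chunk open so it absorbs whatever remains.
        if running >= boundary and len(chunks) < n_chunks - 1:
            chunks.append((start, i + 1))
            start = i + 1

    if start < len(rev_counts):
        chunks.append((start, len(rev_counts)))
    return chunks


if __name__ == "__main__":
    # Toy numbers only: early pages have far more revisions than late ones.
    counts = [100, 50, 25, 10, 5, 5, 3, 2]
    print(chunk_by_revisions(counts, 2))  # [(0, 1), (1, 8)]
```

With the toy numbers the first chunk holds a single old page and the second holds the other seven, which is the kind of lopsided page split that balancing by revisions produces. A single enormous page can still blow out one chunk, so this is only a starting point.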
Off of the dumps page on wikitech http://wikitech.wikimedia.org/view/Dumps there's a link to a page where I'm starting to keep updates, now that there is an actual run going. I may shoot this run and restart this piece in a few days, but what the heck, at least there's some information there. Also there's a link to a wish list for the XML dumps; if the image dumps aren't listed there please add them. I'm not going to try to think about how feasible or not they might be right now though, brain too full.
Happy trails,
Ariel