Platonides wrote:
On 29/01/12 22:48, MZMcBride wrote:
Hi.
Is there a list of (current) impediments to forking a Wikimedia wiki?
replicating parser,
The parser is publicly available. We have no hidden tricks.
available dumps,
Dumps are running quite well currently. That shouldn't be a problem.
Image access,
Downloading images is slightly harder. The big problem is not getting the files but the huge number of them: the total size is large, which translates into the disk space needed to store them plus the bandwidth to download them. If you weren't interested in really forking the images, $wgUseInstantCommons could be used.
replicating user table, etc.
All user data deemed private is not available. Basically password hashes, watchlists and some preferences.
I appreciate the inline replies, but these were just ideas off the top of my head. ;-) I was asking about a more thorough review.
Also, replicating the parser is fairly difficult. Link existence checks, interwikis, image rendering with foreign repos, extension tags (math support, hiero support, syntax highlighting), Tidy interaction, etc. make actual replication very difficult. You can approximate, though.
Apart from the parser, depending on the level to which you want to replicate, certain parts of public data still aren't dumped, I think. The user table isn't dumped publicly, as I recall, not even in sanitized form. So you'd need a lot of API requests or the Toolserver there. These are the types of impediments it'd be nice to document...
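
To give a rough idea of what "a lot of API requests" looks like for the user table, here is a minimal sketch in Python. It pages through the API's list=allusers module and only ever sees the public fields (user ID and name), nothing private. The names iter_all_users, http_fetch and API_URL are made up for illustration, and the loop assumes the API's current "continue" format for continuation, so treat it as a sketch rather than a tested harvesting tool.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

# Assumption: any MediaWiki api.php endpoint; this one is only an example.
API_URL = "https://en.wikipedia.org/w/api.php"


def iter_all_users(fetch: Callable[[Dict[str, str]], dict],
                   limit: int = 500) -> Iterator[dict]:
    """Yield public user records, following API continuation until done.

    `fetch` takes the query parameters and returns the decoded JSON reply;
    it is injected so the paging logic can be exercised without a network.
    """
    base = {
        "action": "query",
        "list": "allusers",
        "aulimit": str(limit),
        "format": "json",
    }
    cont: Dict[str, str] = {}
    while True:
        data = fetch({**base, **cont})
        for user in data.get("query", {}).get("allusers", []):
            yield user
        if "continue" not in data:
            break  # no further batches
        # Send the continuation values back unchanged with the next request.
        cont = {key: str(value) for key, value in data["continue"].items()}


def http_fetch(params: Dict[str, str]) -> dict:
    """Hypothetical fetcher: one GET request per batch."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "fork-audit-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Something like `for user in iter_all_users(http_fetch): ...` would then walk the whole list, one request per batch, which on a large wiki is exactly the kind of cost a public sanitized dump would avoid.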
MZMcBride