-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Felipe Ortega wrote:
> Yesterday, I was moving around mysqldump files of our processed
> databases from parsed Wikipedia dumps, and this simple question came
> to my mind.
>
> Is there any special reason to use an "ad-hoc" XML schema for
> Wikipedia dumps?

1) The format is relatively stable, unlike our database schema.
2) Our databases are spread over dozens of servers, in mixes of internal
binary compression formats whose interpretation is dependent on our
configuration and custom code.
3) Our internal databases mix public and private information, which we
have to separate for external dumps. Thus only completely public tables
are dumped with mysqldump.
Thus, we use a stable, safe data schema for public page dumps. Dumping
raw SQL of these tables would be unstable, insecure, and useless for
most people.
- -- brion vibber (brion @ wikimedia.org)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkgI7FcACgkQwRnhpk1wk46jWwCfSEAayLMoFIokCrEMuvdlcBUC
ht4An3M+t1Xo0kjv6vS6NRTOsYkYPi+G
=2bU3
-----END PGP SIGNATURE-----