Hello. I was advised to try this email address for assistance with the
wiki dump torrents.
I have been using my Synology file server to share the wiki dump torrents
for several years without issues, and would like to keep hosting the
torrents. Unfortunately, the current English wiki dump (Jan 2023) seems to
use torrent padding and is not compatible with Synology. I have spoken
with Synology; this issue has been outstanding for a while, and they do
not anticipate it being fixed any time soon. I am writing to ask whether
you could release the next torrent dump without padding, or whether you
know of another workaround.
I have never had an issue with any of the torrent dumps until this Jan 2023
version.
Thank you for the help.
Chris
On Thu, Feb 23, 2023, 07:30 Hannah Okwelum <hokwelum(a)wikimedia.org> wrote:
> Hello Chris,
>
>
> Thanks for hosting the dumps.
>
>
> Unfortunately, we are not involved with the hosting or formatting of
> torrent-based dumps. You could send an email to
> xmldatadumps-l(a)lists.wikimedia.org to get more audience to weigh in on
> your question. You might also get more information on the meta page
> (https://meta.wikimedia.org/wiki/Data_dump_torrents) about the dumps.
>
>
> This may not be helpful, but you could check the talk page for any
> useful information on contacting someone who knows more about
> torrent-based dumps.
>
>
> Thank you.
>
> Hannah Okwelum
>
> On Wed, Feb 22, 2023 at 5:31 PM Chris Couch <chris(a)couch.ca> wrote:
>
>> Hello. I've been helping host the torrent wiki dumps for several years
>> now. The most recent English torrent for 01 Jan 2023 is having problems on
>> my Synology storage system. From what I can figure out, the newest wiki
>> torrent has padding in the file structure to align the files with the
>> blocks. Unfortunately, this method of torrenting is incompatible with the
>> Synology torrent app. I spoke to Synology, and this compatibility request
>> has been outstanding for a while, and doesn't seem to be a priority for
>> them to fix. Wondering if you know of a workaround, as I would prefer to
>> continue hosting the wiki dumps.
>>
>> Thank you in advance.
>>
>> Chris
>>
>
Hi, I would like to request custom dump(s) of all (local) images on these
wikis: enwiktionary, eswiktionary, frwiktionary, dewiktionary, enwikiquote,
eswikiquote, frwikiquote, dewikiquote, enwikibooks, eswikibooks,
frwikibooks, dewikibooks. Also, if possible, it would be helpful (but not
required) to have a dump of just the Commons images that are used on
those wikis.
For context on this e-mail, see:
https://meta.wikimedia.org/wiki/Wikimedia_Forum#Help_with_old_dumps and
https://meta.wikimedia.org/wiki/User_talk:Xaosflux#Images
Hi,
I am interested in performing analysis on recently created pages on English
Wikipedia.
One way to find recently created pages is to download a meta-history file
for English Wikipedia and filter the XML, looking for pages
where the oldest revision is within the desired timespan.
Since this requires a library to parse through XML string data, I would
imagine this is much slower than a database query. Is page revision data
available in one of the SQL dumps which I could query for this use case?
Looking at the exported tables list
<https://meta.wikimedia.org/wiki/Data_dumps/What%27s_available_for_download#…>,
it does not look like it is. Maybe this is intentional?
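The XML filtering approach described above (keeping pages whose oldest
revision falls inside a time window) can be sketched in Python with the
standard library's streaming parser. This is only a minimal sketch under
stated assumptions: the function name `recently_created_pages` and the tiny
in-memory sample are hypothetical, and the sample contains only the
`<page>`, `<title>`, `<revision>` and `<timestamp>` elements that the sketch
reads, not the full export schema.

```python
import io
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def parse_ts(text):
    """Parse a timestamp such as '2023-01-15T12:00:00Z' as UTC."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def recently_created_pages(source, start, end):
    """Yield (title, created) for pages whose oldest revision is in [start, end)."""
    title, oldest = None, None
    # iterparse streams the file, so the whole dump is never held in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "timestamp":
            # Do not assume revisions are sorted; track the minimum seen.
            ts = parse_ts(elem.text)
            if oldest is None or ts < oldest:
                oldest = ts
        elif name == "revision":
            elem.clear()  # discard revision content once its timestamp is read
        elif name == "page":
            if oldest is not None and start <= oldest < end:
                yield title, oldest
            title, oldest = None, None
            elem.clear()


# Hypothetical two-page sample standing in for a real meta-history file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Old page</title>
    <revision><timestamp>2010-05-01T00:00:00Z</timestamp></revision>
    <revision><timestamp>2023-01-10T00:00:00Z</timestamp></revision>
  </page>
  <page><title>New page</title>
    <revision><timestamp>2023-01-12T08:30:00Z</timestamp></revision>
  </page>
</mediawiki>"""

start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 2, 1, tzinfo=timezone.utc)
found = list(recently_created_pages(io.BytesIO(SAMPLE), start, end))
print(found)
```

For a real dump, the in-memory sample could be replaced by a file object
opened with `bz2.open(path, "rb")`, where `path` points at a downloaded
meta-history file. The scan still has to read every revision, so it does
not remove the speed concern raised above.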
Thanks,
Eric Andrew Lewis
ericandrewlewis.com
+1 610 715 8560
Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20230101 full revision history content run.
We are currently dumping 961 projects in total.
---------------------
Stats for shnwiki on date 20230101
Total size of page content dump files for articles, current content only:
103,580,699
Total size of page content dump files for all pages, current content only:
104,183,851
Total size of page content dump files for all pages, all revisions:
512,285,216
---------------------
Stats for enwiki on date 20230101
Total size of page content dump files for articles, current content only:
92,383,133,517
Total size of page content dump files for all pages, current content only:
192,221,073,263
Total size of page content dump files for all pages, all revisions:
25,978,785,403,079
---------------------
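For readers who prefer human-readable units, the figures above can be
converted with a few lines of Python. This is a minimal sketch that assumes
the totals are byte counts (the report does not state the unit); the helper
name `human_size` is hypothetical.

```python
def human_size(num_bytes):
    """Format a byte count using binary (1024-based) units."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that fits, or at the largest unit available.
        if size < 1024 or unit == "TiB":
            return f"{size:.2f} {unit}"
        size /= 1024


# enwiki figures copied from the 20230101 report above.
enwiki = {
    "articles, current content only": 92_383_133_517,
    "all pages, current content only": 192_221_073_263,
    "all pages, all revisions": 25_978_785_403_079,
}

for label, value in enwiki.items():
    print(f"{label}: {human_size(value)}")
```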
Sincerely,
Your friendly Wikimedia Dump Info Collector