Hi,
Two questions:
1) Is there any bot running which can use the IA upload tool to transfer files from Internet Archive to Commons? I see lots and lots of public domain files in IA that are not present in Commons. It's next to impossible to do this manually.
2) Is there any bot running which can create index pages in the respective language Wikisources whenever PDF or DjVu files are uploaded from IA?
If not, can such bot accounts be created?
Regards
Hi Bodhisattwa, regarding your first proposal, I'm not sure I understand the advantage of having hundreds of thousands of books in Commons right now. Commons manages metadata poorly; Internet Archive is much more efficient. I find that the added value comes when the book is in Wikisource, but that requires people doing the work: human curation. So I find the IA upload tool perfect in this regard: it helps you do it quickly, when you want to do it.
For the second bot, I think the problem is simply that every Wikisource is different, and creating good index pages is more art than science. Moreover, there is the problem I mentioned above.
But your situation is probably different: if you want to populate a *new* Wikisource, what you say makes sense. That would be a bot "on demand": you give it a list of IA identifiers, and it does all the work.
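As a rough illustration of the "on demand" idea, the sketch below turns a list of IA identifiers into a queue of jobs. It only builds URLs and makes no network calls. The function name `build_jobs` and the job layout are hypothetical, and the assumption that each item holds a file named `<identifier>.djvu` will not hold for every item.

```python
from urllib.parse import quote

# Public Internet Archive URL patterns for item metadata and file download.
METADATA_URL = "https://archive.org/metadata/{identifier}"
DOWNLOAD_URL = "https://archive.org/download/{identifier}/{filename}"


def build_jobs(identifiers, extension="djvu"):
    """Turn a list of IA identifiers into upload jobs (no network access).

    Assumption: the file to transfer is named "<identifier>.<extension>".
    """
    jobs = []
    seen = set()
    for raw in identifiers:
        identifier = raw.strip()
        if not identifier or identifier in seen:
            continue  # skip blank entries and duplicates
        seen.add(identifier)
        filename = f"{identifier}.{extension}"
        jobs.append({
            "identifier": identifier,
            "metadata_url": METADATA_URL.format(identifier=quote(identifier)),
            "file_url": DOWNLOAD_URL.format(
                identifier=quote(identifier), filename=quote(filename)
            ),
        })
    return jobs
```

A real bot would still have to fetch each `metadata_url`, check which files the item actually contains, and hand the result to a person for curation.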
Aubrey
Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Bodhisattwa Mandal, 11/01/2016 10:06:
- Is there any bot running which can use the IA upload tool to transfer files from Internet Archive to Commons? I see lots and lots of public domain files in IA but they are not present in Commons. It's next to impossible to be done manually.
There is https://tools.wmflabs.org/ia-upload (documented at https://commons.wikimedia.org/wiki/Commons:Upload_tools#Internet_Archive too), do you mean that's too hard to use manually as well?
Nemo
P.S.: Please use a descriptive subject line.
Hi,
Aubrey, I understand.
Nemo, yeah, I was talking about a bot which can use the said IA upload tool.
And no, it's not so hard to use it manually. Just asking. No issues. :-)
Regards,
Bodhisattwa Mandal, 11/01/2016 11:12:
Nemo, yeah, I was talking about a bot which can use the said IA upload tool.
OK. I wrote such a bot: https://github.com/nemobis/BEIC/blob/master/BEIC-ia2commons.rb . Most of the work is in finding and parsing the metadata. The Internet Archive metadata is not good enough for Italian Wikisource; maybe other Wikisources are less demanding.
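To give a feel for the metadata work, here is a minimal sketch in Python (not the Ruby script linked above, and not its method). It maps an IA-style metadata dictionary onto a Commons {{Book}} template and reports which fields are empty. The helper names are hypothetical, the choice of template parameters is an assumption, and the sketch relies on IA metadata values being either a single string or a list of strings.

```python
def first_value(value):
    """Return one string from a field that may be a string, a list, or missing."""
    if isinstance(value, list):
        return str(value[0]).strip() if value else ""
    return str(value).strip() if value is not None else ""


def book_wikitext(meta):
    """Render a {{Book}} template from an IA-style metadata dict.

    Returns (wikitext, missing); `missing` lists the fields left empty,
    which a human should fill in before upload.
    Note: only the first creator is kept, so co-authors are dropped here.
    """
    fields = {
        "Author": first_value(meta.get("creator")),
        "Title": first_value(meta.get("title")),
        "Publisher": first_value(meta.get("publisher")),
        "Date": first_value(meta.get("date")),
        "Language": first_value(meta.get("language")),
    }
    identifier = first_value(meta.get("identifier"))
    fields["Source"] = (
        f"https://archive.org/details/{identifier}" if identifier else ""
    )
    missing = [name for name, value in fields.items() if not value]
    lines = ["{{Book"]
    lines += [f" |{name} = {value}" for name, value in fields.items()]
    lines.append("}}")
    return "\n".join(lines), missing
```

The `missing` list is the point of the sketch: whatever the bot cannot fill in reliably goes back to a person rather than into Commons.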
There are also some open issues (https://github.com/tpt/ia-upload/issues) because of which I wouldn't advise very big batch uploads for now.
Nemo
There have been a couple of responses.
To me, nothing has metadata excellent enough to allow automated import of files to Commons. To do it well, the data at IA needs to be updated, with splits; it is good data, but it is not excellent data. There are regularly errors or significant omissions, poor scans, etc. I simply would not rely on the data being suitably accurate for automated addition; it all needs review.
To get a bot to do the process well we would need:
1) Wikidata well populated with reviewed metadata from IA, including IA links, via a review process.
2) A bot that can load files to Commons using IA metadata and the improved data at WD. This would include adding creator templates for the authors in the book template.
3) A bot to create the pertinent Index page at the relevant wiki.
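The three requirements above can be read as an ordered pipeline in which nothing is uploaded before its metadata has been reviewed. The sketch below models only that ordering; the `Record` class, its field names, and the step labels are hypothetical, and the name of the Index namespace is passed in because it differs between Wikisources.

```python
from dataclasses import dataclass


@dataclass
class Record:
    identifier: str            # IA identifier
    filename: str              # file name to use on Commons
    reviewed: bool = False     # step 1: metadata reviewed (e.g. on Wikidata)
    on_commons: bool = False   # step 2: file uploaded to Commons
    has_index: bool = False    # step 3: Index page exists on the target wiki


def next_step(record):
    """Return the first pending step; review always comes before upload."""
    if not record.reviewed:
        return "review metadata"
    if not record.on_commons:
        return "upload to Commons"
    if not record.has_index:
        return "create Index page"
    return "done"


def index_title(record, namespace="Index"):
    """Build the Index page title; the namespace name depends on the wiki."""
    return f"{namespace}:{record.filename}"
```

Keeping the review flag as the first gate reflects the argument of this message: the import itself is easy, and the data that goes with it is the part that needs a person.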
So, to me, I would much prefer to see the means to enter accurate data into Wikidata from IA, and THEN have the ready means from WD to put into action the transfer and creation processes at Commons and the xxWS (be it a button governed by a JavaScript gadget). This puts us ahead of the game on bad or missing data in WD, which is still a burden and a significant weakness in the system (well, it is for enWS).
[Note that with an accurate {{book}} template, any administrator at Commons can import files directly from Commons in a very quick process without the use of any bot. As there are already bots out there that can import files, it is not a technical issue for an administrator bot to do it. The issue is not the import; it is the data that goes with the import.]
Regards, Billinghurst