James Hare wrote:
I'll be blunt: I will be using the toolset to upload millions of files. Taking that into consideration: what kind of marginal cost are we looking at having an external tool interfacing with the API instead of something built directly into the software? These are media files, not byte-sized edits to Wikidata. Also, how is uploading files—even large numbers of them—not a core function of a media repository?
Hi.
I personally think we must give a lot more thought to the broad strategy being used here. For example, if you want to upload millions of files, why not put them on a hard drive and ship them? It will be dramatically faster and a lot less wasteful of bandwidth and time. In my opinion, we need to figure out what the actual scope of this tool is and then build around that, recognizing that the scope probably doesn't (or shouldn't) include the ability to import a nation's archives into Wikimedia Commons.
There's also a larger conversation that needs to happen about whether Wikimedia Commons is ready to accept such large media donations. Yes, I realize that people have been bulk-uploading to Commons for years now, but that doesn't mean the practice is acceptable or sustainable; it's just currently tolerated. There are both technical costs (disk space comes to mind) and social costs (flooding a wiki with content that can't be reviewed or curated in a timely manner) to understand and account for.
MZMcBride