Well, that really depends on the data. There are lots of printed datasets, and if someone has those online and can no longer host them, then we might want to harvest some, if not all, of it. I am thinking of datasets of large collections of <whatever>. I recall that not long ago a museum of music records became defunct and was looking for a home for its database. We couldn't do anything for them then, but we could put it in Mix-n-Match today (assuming the data is all published material that is considered a reliable source, yada yada...).

On Fri, Jul 3, 2015 at 2:10 PM, Andrew Gray <andrew.gray@dunelm.org.uk> wrote:
On 1 July 2015 at 22:51, Quim Gil <qgil@wikimedia.org> wrote:

> * Where to publish entire datasets... Something tells me that this is not
> the most urgent and important problem that we have, but the community
> definitely knows better, so correct me if I'm wrong. I think our main use

I would agree with this. There has historically been a lot of
vagueness around the word "data", and a lot of vague suggestions in
the early days when Wikidata was still being created... and as a
result people sometimes get the impression that Wikidata intends to be
a kind of generalised data repository. This is a bit like assuming
Wikipedia will take anything that's got words :-)

I wonder if it would be good to identify a couple of good, reliable
repositories we can encourage people to use for this sort of material?
This means that even if we have to say to a potential partner "sorry,
this isn't what we want", we can still give them advice on how to get
it released and available in the most appropriate way. Better than a
frustrating back-and-forth...


- Andrew Gray

Wikidata mailing list