Hi Robert,
TBH I asked the question as NPOV as possible because I have my own bias. By
stating it in general terms I hope that the conversation isn't forced in
any particular direction.
There are technical obstacles to reusing external datasets: how do you
ensure that the external site cannot inject malicious code into visitors'
browsers, how do you cache the data, and what happens when the source data
changes or is no longer available? That doesn't mean they cannot be
overcome. In general I also tend to prefer a complete data management
solution, because that is what we do in all our projects: we only use
files stored in Commons (images, videos, books, sound...), with no
exceptions, or at least none that I know of.
OTOH, is it practical to import and standardize the data?
I appreciate your thoughts. If you could write them on the talk page too,
that would be great. And if you think the RFC should distinguish the three
aspects of datasets, please feel free to edit it so that we can address
each one of them individually.
Cheers,
Micru
On Thu, May 15, 2014 at 9:48 PM, Robert Rohde <rarohde(a)gmail.com> wrote:
Micru,
There are several related aspects of datasets, which I would enumerate as:
1) Storing / archiving datasets
2) Editing / manipulating datasets
3) Using excerpts (e.g. specific data) from datasets
Each of these involves a different but related set of tools.
It isn't entirely clear to me, but I think the question you started
with is aimed at how we might use excerpts from externally managed
datasets. For example, having a way to pull data from CKAN and have
it appear in a Wikipedia article? That would remove steps one and two
from immediate consideration, as someone else would be responsible for
maintaining the data. On the other hand, the responses so far seem
more aimed at question one, i.e. where / how would Wikimedia best
store datasets.
Personally, I think all parts of the question are ultimately
important, as I would love for Wikimedia to have a complete data
management solution. But am I correct in thinking that you asked the
question primarily out of a desire to think about how we could use
externally managed datasets?
-Robert Rohde
On Thu, May 15, 2014 at 2:25 AM, David Cuenca <dacuetu(a)gmail.com> wrote:
Hi,
During the Zürich Hackathon I met several people who were looking for
ways to integrate external open datasets into our projects (mainly
Wikipedia, Wikidata). Since Wikidata is not the right tool to manage them
(reasons explained in the RFC as discussed during the Wikidata session), I
thought it would be useful to centralize the discussion about potential
requirements, needs, and how to approach this new, changing landscape that
didn't exist a few years ago.
You will find more details here
https://meta.wikimedia.org/wiki/Requests_for_comment/How_to_deal_with_open_…
Your comments, thoughts and ideas are appreciated!
Cheers,
Micru
_______________________________________________
Wikimedia-l mailing list, guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l(a)lists.wikimedia.org
Unsubscribe:
https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
<mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
--
Etiamsi omnes, ego non