I don't know, because I'm not the developer of the app and my knowledge in
this area is limited. For many years now, I have been collecting data
(information, photos, coordinates) about the cemetery. I've published
everything on Commons, Wikidata and OSM, so developers can do something
smart with that ;)
Pyb
Hi everyone,
Some exciting news here. The Open Data Awards' finalist lists were
recently published on their website. Wikidata has been listed as a finalist
in two different categories which are the Open Data Innovation Award and
the Open Data Publisher Award. Lydia
<http://www.wikidata.org/wiki/User:Lydia_Pintscher_(WMDE)> and Magnus
<http://www.wikidata.org/wiki/User:Magnus_Manske> will be representing
Wikidata at the gala dinner where the winner of each category will be
announced live. I will be standing in as a backup should Lydia be unable to
attend the award dinner but let's wish Lydia and Magnus a good time and
keep our fingers crossed that Wikidata will win at least one of the two
categories we've been nominated for. As Lydia would say - the entire
community is awesome for working to help build Wikidata to where it is, and
this is as much all of our work as it is the development team's, for
helping build and innovate the way free knowledge is shared within the
mission of the Wikimedia Foundation.
Thanks,
John Lewis
I thought folks might like to know that every human gene (according to the
United States National Center for Biotechnology Information) now has a
representative entity on Wikidata. I hope that these are the seeds for
some amazing applications in biology and medicine.
Well done Andra and ProteinBoxBot!
For example:
Here is one (of approximately 40,000) called "spinocerebellar ataxia 37"
https://www.wikidata.org/wiki/Q18081265
-Ben
Hi everybody,
With the Structured Data for Commons project about to move into high
gear, it seems to me that there's something the Wikidata community needs
to have a serious discussion about, before APIs start getting designed
and set in stone.
Specifically: when should an object have an item with its own Q-number
created for it on Wikidata? What are the limits? (Are there any limits?)
The position so far seems to be essentially that a Wikidata item has
only been created when an object either already has a fully-fledged
Wikipedia article written for it, or reasonably could have.
So objects that aren't particularly notable typically have not had
Wikidata items made for them.
Indeed, practically the first message Lydia sent to me when I started
trying to work on Commons and Wikidata was to underline to me that
Wikidata objects should generally not be created for individual Commons
files.
But, if I'm reading the initial plans and API thoughts of the Multimedia
team correctly, eg
https://commons.wikimedia.org/w/index.php?title=File%3AStructured_Data_-_Sl…
and
https://docs.google.com/document/d/1tzwGtXRyK3o2ZEfc85RJ978znRdrf9EkqdJ0zVj…
there seems to be the key assumption that, for any image that contains
information relating to something beyond the immediate photograph or
scan, there will be some kind of 'original work' item on main Wikidata
that the file page will be able to reference, such that the 'original
work' Wikidata item will be able to act as a place to locate any
information specifically relating to the original work.
Now in many ways this is a very clean division to be able to make. It
removes any question of having to judge "notability"; and it removes any
ambiguity or diversity of where information might be located -- if the
information relates to the original work, then it will be stored on
Wikidata.
But it would appear to imply a potentially *huge* increase in the
inclusion criteria for Wikidata, and the number of Wikidata items
potentially creatable.
So it seems appropriate that the Wikidata community should discuss and
sign off just what should and should not be considered appropriate,
before things get much further.
For example, a year ago the British Library released 1 million
illustrations from out-of-copyright books, which increasingly have been
uploaded to Commons. Recently the Internet Archive has announced plans
to release a further 12 million, with more images either already
uploading or to follow from other major repositories including eg the
NYPL, the Smithsonian, the Wellcome Foundation, etc, etc.
How many of these images, all scanned from old originals, are going to
need new Q-numbers for those originals? Is this okay? Or are some of
them too much?
For example, for maps, cf this data schema
https://docs.google.com/spreadsheets/d/1Hn8VQ1rBgXj3avkUktjychEhluLQQJl5v6W…
, each map sheet will have a separate Northernmost, Southernmost,
Easternmost, Westernmost bounding co-ordinates. Does that mean each map
sheet should have its own Wikidata item?
For book illustrations, perhaps it would be enough just to reference
the edition of the book. But if individual illustrations have their own
artist and engraver details, does that mean the illustration needs to
have its own Wikidata item? Similarly, if the same engraving has
appeared in many books, is that also a sign that it should have its own
Wikidata item?
What about old photographs, or old postcards, similarly? When should
these have their own Wikidata item? If they have their own known
creator, and creation date, then is it most simple just to give them a
Wikidata item, so that such information about an original underlying
work is always looked for on Wikidata? What if multiple copies of the
same postcard or photograph are known, published or re-published at
different times? But the potential number of old postcards and
photographs, like the potential number of old engravings, is *huge*.
What if an engraving was re-issued in different "states" (eg a
re-issued engraving of a place might have been modified if a tower had
been built)? When should these get different items?
At
https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Visual_arts#Wikidat…
where I raised some of these issues a couple of weeks ago, there has
even been the suggestion that particular individual impressions of an
engraving might deserve their own separate items; or even everything
with a separate accession number, so if a museum had three copies of an
engraving, we would make three separate items, each carrying their own
accession number, identifying the accession number that belonged to a
particular File.
(See also other sections at
https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Visual_arts for
further relevant discussions on how to represent often quite complicated
relations with Wikidata properties).
With enough items, we could re-create and represent essentially the
entire FRBR tree.
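To make the scale concrete, here is a minimal sketch of the four FRBR levels (Work, Expression, Manifestation, Item) as nested records. The mapping of an engraving's "states" to Expressions and of museum copies to Items is only an assumption for illustration, and every label and accession number below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    # A single physical copy, e.g. one impression held by a museum.
    accession_number: str


@dataclass
class Manifestation:
    # A particular printing or publication of an expression.
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    # Assumed here to correspond to one "state" of an engraving.
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    label: str
    expressions: List[Expression] = field(default_factory=list)


def count_nodes(work: Work) -> int:
    """Count every node in the tree, i.e. how many separate
    identifiers would be needed if each level got its own one."""
    total = 1  # the work itself
    for expression in work.expressions:
        total += 1
        for manifestation in expression.manifestations:
            total += 1 + len(manifestation.items)
    return total


# Hypothetical example: one engraving, two states, three museum copies.
engraving = Work("engraving of a town", [
    Expression("first state", [
        Manifestation("first printing",
                      [Item("A.1"), Item("A.2"), Item("A.3")]),
    ]),
    Expression("second state (tower added)", [
        Manifestation("later printing"),
    ]),
])

print(count_nodes(engraving))
```

Even this small example needs eight identifiers for what started as a single engraving, which is the kind of multiplication at stake.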
We could do this. We may even need to do this, if the MM team's outline for
Commons is to be implemented in its apparent current form.
But it seems to me that we shouldn't just sleepwalk into it.
It does seem to me that this does represent (at least potentially) a
*very* large expansion in the number of items, and widening of the
inclusion criteria, for what Wikidata is going to encompass.
I'm not saying it isn't the right thing to do, but given the potential
scale of the implications, I do think it is something we do need to have
properly worked through as a community, and confirmed that it is indeed
what we *want* to do.
All best,
James.
(Note that this is a slightly different discussion, though related, to
the one I raised a few weeks ago as to whether Commons categories -- eg
for particular sets of scans -- should necessarily have their own
Q-number on Wikidata. Or whether some -- eg some intersection
categories -- should just have an item on Commons data. But it's
clearly related: is the simplest thing just to put items for everything
on Wikidata? Or does one try to keep Wikidata lean, and no larger than
it absolutely needs to be; albeit then having to cope with the
complexity that some categories would have a Q-number, and some would not?)
There are major problems using redirects as sitelinks. The top one is that
they do not always point to the concept they should, and even if they do,
there is no guarantee that this redirect will keep pointing to the same
place (normally to a section of another article), since the section title
can change.
Wikipedia supports section labelling:
https://en.wikipedia.org/wiki/Help:Labeled_section_transclusion
It would be nice if someone from wd-dev could take a look and see if this
could be the basis to link items with a stable section identifier in a WP
article. Or if that is not possible or far from ideal, then see what other
approaches are available.
Thanks,
Micru
On Mon, Oct 20, 2014 at 1:21 PM, James Heald <j.heald(a)ucl.ac.uk> wrote:
> Well actually, we *do* support redirects. One just has to be a bit crafty
> in how one creates them.
>
> Do you have a problem with that?
>
> If so, what is your problem?
>
> -- James.
>
>
>
> On 20/10/2014 11:45, Gerard Meijssen wrote:
>
>> Hoi,
>> We do not support redirects. We do not support paragraphs. Wikidata is not
>> designed to support either.
>> Thanks,
>> GerardM
>>
>> On 20 October 2014 10:39, rupert THURNER <rupert.thurner(a)gmail.com>
>> wrote:
>>
>>> Gerard how do you, within wikidata, properly handle the case where an
>>> article is there on enwp, and a paragraph and a redirect to it is there
>>> on dewp?
>>>
>>> Rupert
>>> On Oct 18, 2014 1:21 PM, "Gerard Meijssen" <gerard.meijssen(a)gmail.com>
>>> wrote:
>>>
>>>> Hoi,
>>>> As you correctly quote, "one of the requirements is an article". So what
>>>> is your point?
>>>> Thanks,
>>>> GerardM
>>>>
>>>> On 18 October 2014 12:52, John Lewis <johnflewis93(a)gmail.com> wrote:
>>>>
>>>> On Saturday, 18 October 2014, Gerard Meijssen <
>>>>> gerard.meijssen(a)gmail.com>
>>>>> wrote:
>>>>>
>>>>> One of the requirements is an article.
>>>>>>
>>>>>>
>>>>> One of three requirements. Only one has to be true for an item to be
>>>>> notable. Please could you stop taking this out of context, making it look
>>>>> like Wikidata requires articles for items.
>>>>>
>>>>> John Lewis
>>>>>
>>>>>
>>>>> --
>>>>> John Lewis
>>>>>
>>>>> _______________________________________________
>>>>> Wikidata-l mailing list
>>>>> Wikidata-l(a)lists.wikimedia.org
>>>>> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
--
Etiamsi omnes, ego non
Different keys can still be found in the actual XML dump
wikidatawiki-20141009-pages-articles.xml.bz2.
This bug/feature is also present in the current dump with history.

page_id  wd_id   keys
111      Q15     ['aliases', 'claims', 'descriptions', 'id', 'labels', 'sitelinks', 'type']
137      Q24     ['aliases', 'claims', 'description', 'entity', 'label', 'links']
31500    Q28119  ['aliases', 'description', 'entity', 'label', 'links']
225144   ?       ['entity', 'redirect']
3916689  P6      ['aliases', 'claims', 'datatype', 'descriptions', 'id', 'labels', 'type']
3916937  P10     ['aliases', 'claims', 'datatype', 'description', 'entity', 'label']
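
For anyone who has to read this mixed dump anyway, here is a rough sketch of how the two key sets could be told apart. The old-to-new key mapping is only inferred from the key lists above, the function names are made up for illustration, and only the top-level keys are renamed: the values may well be shaped differently between the two formats and are not converted here.

```python
# Assumed correspondence between old and new top-level keys,
# inferred only from the key lists in the table above.
OLD_TO_NEW = {
    "label": "labels",
    "description": "descriptions",
    "links": "sitelinks",
    "entity": "id",
}


def detect_format(record: dict) -> str:
    """Classify a decoded page record by its top-level keys."""
    if "redirect" in record:
        return "redirect"
    if "id" in record and "type" in record:
        return "new"
    if "entity" in record:
        return "old"
    return "unknown"


def rename_keys(record: dict) -> dict:
    """Rename old-style top-level keys to their assumed new names.
    Values are passed through unchanged."""
    return {OLD_TO_NEW.get(key, key): value for key, value in record.items()}


# Hypothetical records carrying only the keys seen in the table.
q15 = dict.fromkeys(
    ["aliases", "claims", "descriptions", "id", "labels", "sitelinks", "type"])
q24 = dict.fromkeys(
    ["aliases", "claims", "description", "entity", "label", "links"])

print(detect_format(q15), detect_format(q24))
print(sorted(rename_keys(q24)))
```

Redirect pages are checked first because they also carry an 'entity' key and would otherwise be counted as old-format records.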
Lukas
On Thu 09.10.2014 19:32, Lydia Pintscher wrote:
> On Thu, Oct 9, 2014 at 3:19 PM, Magnus Manske
> <magnusmanske(a)googlemail.com> wrote:
>> I managed to do the task at hand by switching to JSON dumps (because that's
>> the new, officially supported, long-term-stable Wikidata dump format, right?
>> Right???), so no hurry there.
>>
>> Maybe the XML dump process was run in the middle of the switch to the new
>> format, or got a stale cache for some items?
>
> It looks like the switch happened in the middle of a dump creation so
> this one is half old and half new format mixed. The ones after that
> should be all new format. And yay for switching to JSON!
>
>
> Cheers
> Lydia
>