On 02.08.2016 22:28, Yuri Astrakhan wrote:
Is there a way we could have more than just the number of language
links? E.g., the number of incoming links from other Wikipedia pages?
One could have other data added to the store, but this may be more work
depending on what you want. You ask about links from "wikipedia pages".
If you really mean this (and not Wikidata items), then this would be a
lot of work, since one would have to update the RDF whenever any
Wikipedia page changes. I guess we do not have the infrastructure for
doing this in a live update mode. Also note that the number of these links is
different in each language, so one would have to store many numbers.
Overall, this link count would really be (meta)data about Wikipedia
pages and their relations, and not so much about Wikidata. I think you
could get such Wikipedia-specific data from DBpedia, but I am not sure
how well their live endpoint keeps track of this data (since it is
tricky). Maybe an offline solution that combines RDF dumps is the most
practical approach for now if you really need this data.
Even storing the number of incoming links (properties) from other
Wikidata items would actually be tricky. Currently, the RDF data about
each item only depends on the content of this item's Wikidata page. The
number of inlinks depends on other Wikidata pages, and therefore it is
much more work to keep it up to date when there are edits.
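To illustrate the dependency described above, here is a minimal sketch in Python. The item ids and links are made-up toy data, and `inlink_counts` is a hypothetical helper, not part of any Wikidata tooling. It only shows that an edit to one item changes the inlink count that would have to be stored for a different item.

```python
# Hypothetical toy data: each item maps to the set of items its statements link to.
items = {
    "Q1": {"Q2", "Q3"},
    "Q2": {"Q3"},
    "Q3": set(),
}


def inlink_counts(items):
    """Count, for every item, how many other items link to it."""
    counts = {qid: 0 for qid in items}
    for source, targets in items.items():
        for target in targets:
            if target != source:  # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts


before = inlink_counts(items)

# Edit only Q1: remove its link to Q3.
items["Q1"].discard("Q3")
after = inlink_counts(items)

# The data that changed belongs to Q3, although Q3's own page was not edited.
changed = sorted(q for q in after if after[q] != before[q])
print(before["Q3"], after["Q3"], changed)
```

In this toy setting the edit to Q1 forces an update of the count for Q3, which is the extra bookkeeping that a per-item RDF export does not need today.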
Markus
On Aug 2, 2016 10:41 PM, "Markus Kroetzsch"
<markus.kroetzsch@tu-dresden.de>
wrote:
On 02.08.2016 20:59, Daniel Kinzler wrote:
Am 02.08.2016 um 20:19 schrieb Markus Kroetzsch:
Oh, there is a little misunderstanding here. I have not suggested
creating a property "number of sitelinks in this document". What I
propose instead is to create a property "number of sitelinks for the
document associated with this entity". The domain of this suggested
property is entity. The advantage of this proposal over the thing
that you understood is that it makes queries much simpler, since you
usually want to sort items by this value, not documents. One could
also have a property for number of sitelinks per document, but I
don't think it has such a clear use case.
"number of sitelinks for the document associated with this entity"
strikes me as semantically odd, which was the point of my earlier
mail. I'd much rather have "number of sitelinks in this document".
You are right that the primary use would be to "rank" items, and that
it would be more convenient to have the count associated directly
with the item (the entity), but I fear it will lead to a blurring of
the line between information about the entity and information about
the document. That is already a common point of confusion, and I'd
rather keep that separation very clear. I also don't think that one
level of indirection would be horribly complicated.
To me it's just natural to include the sitelink info on the same
level as we provide a timestamp or revision id: for the document.
I just proposed the simple and straightforward way to solve the
practical problem at hand. It leads to shorter, more readable
queries that execute faster. (I don't claim originality for this; it
is the obvious solution to the problem and most people would arrive
at exactly the same conclusion.)
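The difference between the two encodings can be sketched with toy triples in Python. Everything here is an assumption for illustration: the predicate names `sitelinks` and `about`, the `doc:` identifiers, and the counts are invented placeholders, not the actual RDF vocabulary or real data. The point is only that ranking needs one pattern in the entity-level encoding and a join over two patterns in the document-level encoding.

```python
# Toy triples; identifiers, predicate names, and counts are illustrative placeholders.

# Encoding A: the count is attached to the entity itself.
entity_level = [
    ("Q90", "sitelinks", 250),
    ("Q64", "sitelinks", 230),
]

# Encoding B: the count is attached to the document, which points to the entity.
document_level = [
    ("doc:Q90", "about", "Q90"),
    ("doc:Q90", "sitelinks", 250),
    ("doc:Q64", "about", "Q64"),
    ("doc:Q64", "sitelinks", 230),
]


def rank_entity_level(triples):
    """One pattern: ?item sitelinks ?n."""
    rows = [(s, o) for s, p, o in triples if p == "sitelinks"]
    return sorted(rows, key=lambda r: r[1], reverse=True)


def rank_document_level(triples):
    """Two patterns joined on ?doc: ?doc about ?item . ?doc sitelinks ?n."""
    about = {s: o for s, p, o in triples if p == "about"}
    rows = [(about[s], o) for s, p, o in triples
            if p == "sitelinks" and s in about]
    return sorted(rows, key=lambda r: r[1], reverse=True)


print(rank_entity_level(entity_level))
print(rank_document_level(document_level))
```

Both functions produce the same ranking; the second one just needs the extra lookup from document to entity, which is the one level of indirection under discussion.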
Your concern is based on the assumption that there is some kind of
psychological effect that a particular RDF encoding would have on
users. I don't think that there is any such effect. Our users will
not confuse the city of Paris with an RDF document just because of
some data in the RDF store.
Markus
--
Prof. Dr. Markus Kroetzsch
Knowledge-Based Systems Group
Faculty of Computer Science
TU Dresden
+49 351 463 38486
https://iccl.inf.tu-dresden.de/web/KBS/en
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata