On 03.08.2016 02:49, Stas Malyshev wrote:
Hi!
Oh, there is a little misunderstanding here. I have not suggested to
create a property "number of sitelinks in this document". What I propose
instead is to create a property "number of sitelinks for the document
associated with this entity". The domain of this suggested property is
I think this is covered by
https://phabricator.wikimedia.org/T129046 -
which seeks to add page props (which already have sitelinks count I
think but we can define any that we want) to RDF. I kind of neglected it
due to the lack of demand, but it should not be that hard to do.
If you think it is best to implement a more general feature that adds
even more properties, then I am sure nobody will complain, but it sounds
like more work to me. The number I was asking for is something that you
can easily compute from the data that you process already. You can also
compute the number in a SPARQL query from the RDF. It is a completely
redundant piece of information. Its only purpose is to make SPARQL
queries that currently time out run fast. In databases, such things are
called "materialized views".
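To make the idea concrete, here is a minimal sketch of such a materialized count, computed purely from triples that are already being processed. It assumes sitelinks show up as schema:about triples pointing from an article URL to the entity; the predicate name `SITELINK_COUNT` and the function name are made up for illustration and are not part of the actual RDF export.

```python
from collections import Counter

SCHEMA_ABOUT = "http://schema.org/about"
# Hypothetical predicate for the materialized value (an assumption, not a real vocabulary term).
SITELINK_COUNT = "http://example.org/sitelinkCount"


def materialize_sitelink_counts(triples):
    """Derive one (entity, SITELINK_COUNT, n) triple per entity from existing triples."""
    # Every schema:about triple is one sitelink for the entity in object position.
    counts = Counter(o for s, p, o in triples if p == SCHEMA_ABOUT)
    return [(entity, SITELINK_COUNT, n) for entity, n in sorted(counts.items())]


triples = [
    ("https://en.wikipedia.org/wiki/Douglas_Adams", SCHEMA_ABOUT, "http://www.wikidata.org/entity/Q42"),
    ("https://de.wikipedia.org/wiki/Douglas_Adams", SCHEMA_ABOUT, "http://www.wikidata.org/entity/Q42"),
    ("https://en.wikipedia.org/wiki/Earth", SCHEMA_ABOUT, "http://www.wikidata.org/entity/Q2"),
]

# The derived triples carry no new information; they only cache an aggregate.
derived = materialize_sitelink_counts(triples)
```

A query can then filter or sort on the stored number directly instead of counting sitelinks for every entity at query time.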
This leads to a slightly different perspective than the one you'd have
in T129046. By adding page props, you want to add "new" information from
another source, and questions like data modelling etc. come to the fore.
With a materialized view, you just add some query results back to the
database for technical reasons that are specific to the database. The
two motivations might lead to different requirements at some point
(e.g., if you want to add another materialized query result to the RDF
you may have to extend page props, which involves more dependencies than
if you just extend the RDF converter).
Markus