On 03.08.2016 02:49, Stas Malyshev wrote:
Hi!
Oh, there is a little misunderstanding here. I have not suggested creating a property "number of sitelinks in this document". What I propose instead is to create a property "number of sitelinks for the document associated with this entity". The domain of this suggested property is
I think this is covered by https://phabricator.wikimedia.org/T129046, which seeks to add page props to RDF (page props already include a sitelink count, I think, but we can define any that we want). I have somewhat neglected it due to lack of demand, but it should not be that hard to do.
If you think it is best to implement a more general feature that adds even more properties, then I am sure nobody will complain, but it sounds like more work to me. The number I was asking for is something that you can easily compute from the data that you already process. You can also compute it in a SPARQL query from the RDF, so it is a completely redundant piece of information. Its only purpose is to make SPARQL queries that currently time out run fast. In databases, such things are called "materialized views".
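To illustrate what "materializing" this number could mean, here is a minimal Python sketch. It assumes sitelinks show up in the dump as `schema:about` triples pointing from an article to its entity, and it uses a made-up property name, `ex:sitelinkCount`, and made-up sample triples; none of this is meant to describe the actual RDF converter.

```python
from collections import Counter

SCHEMA_ABOUT = "schema:about"
# Hypothetical property name, used only for this illustration.
SITELINK_COUNT = "ex:sitelinkCount"


def materialize_sitelink_counts(triples):
    """Return one extra (entity, SITELINK_COUNT, n) triple per entity.

    The count is derived purely from triples that are already present,
    so the result is redundant data: a materialized view.
    """
    counts = Counter(obj for (_subj, pred, obj) in triples if pred == SCHEMA_ABOUT)
    return [(entity, SITELINK_COUNT, n) for entity, n in sorted(counts.items())]


# Illustrative sample data (not taken from a real dump).
triples = [
    ("enwiki:Douglas_Adams", "schema:about", "wd:Q42"),
    ("dewiki:Douglas_Adams", "schema:about", "wd:Q42"),
    ("enwiki:Berlin", "schema:about", "wd:Q64"),
    ("wd:Q42", "rdfs:label", "Douglas Adams"),
]

extra = materialize_sitelink_counts(triples)
for triple in extra:
    print(triple)
```

A query that filters or sorts by sitelink count could then read the stored number directly instead of grouping and counting all `schema:about` triples at query time.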
This leads to a slightly different perspective than the one you'd have in T129046. By adding page props, you want to add "new" information from another source, and questions such as data modelling come to the fore. With a materialized view, you just add some query results back to the database for technical reasons that are specific to the database. The two motivations might lead to different requirements at some point (e.g., if you want to add another materialized query result to the RDF, you may have to extend page props, which involves more dependencies than if you just extend the RDF converter).
Markus