Hi Tom,

Would you have any insight into why it would have been counterproductive to transform this conceptual model (hyper-relational) into a physical model of tables and relationships (relational)?
Could it have been something to do with the expressiveness of the model or query language, performance for querying and updating, or something else?
Was it just an architectural decision inherited from other MediaWiki page content storage, as Laurence mentioned?
I've read some articles indicating that a (pure) RDF model would not be suitable for Wikidata, as reification approaches significantly increase the number of triples and make SPARQL queries more complex and verbose.
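
To make the triple-count concern concrete, here is a minimal sketch using the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The `ex:` identifiers and the `reify` helper are hypothetical placeholders, and this is only one of several reification approaches, not necessarily the one used by any Wikidata export.

```python
def reify(stmt_id, subject, predicate, obj, qualifiers):
    """Expand one qualified statement into plain triples (standard RDF reification)."""
    triples = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", subject),
        (stmt_id, "rdf:predicate", predicate),
        (stmt_id, "rdf:object", obj),
    ]
    # Each qualifier becomes one more triple hanging off the statement node.
    triples += [(stmt_id, q_pred, q_val) for q_pred, q_val in qualifiers]
    return triples


# Hypothetical statement: one main value plus two qualifiers.
triples = reify(
    "ex:stmt1",
    "ex:person",
    "ex:educatedAt",
    "ex:college",
    [("ex:startTime", "1971"), ("ex:endTime", "1974")],
)
print(len(triples))  # 6 triples for a single hyper-relational statement
```

A query that only wants the main value then has to go through the statement node (subject, predicate, object patterns) instead of matching a single triple, which is where the extra verbosity comes from.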

Veronica


On Thu, 7 Apr 2022 at 08:37, Thomas Arrow <thomas.arrow_ext@wikimedia.de> wrote:
Hi Veronica,

I think for this purpose it's most useful to know then that the primary way we store the data is in a single JSON file for each "Entity": that is each Item, Property, Lexeme and so on. In short "Or even if the entire model is stored as a JSON document, one for each entity." is correct.

This is transformed into various other forms for reuse, for example Turtle (ttl) or additional MySQL/MariaDB tables.

Thanks for the interest!
Tom

On Tue, 5 Apr 2022 at 14:52, Veronica Santos <versant.2612@gmail.com> wrote:
Hi Tom,
Thank you for your reply.
I am a PhD student researching graph databases, knowledge graphs, and so on.
Since Wikidata is one of the biggest hyper-relational KGs that I have found, I want to learn about its data modeling process.


On Tue, 5 Apr 2022 at 9:06 AM, Thomas Arrow <thomas.arrow_ext@wikimedia.de> wrote:
Hi,

I think Laurence already did a great job of describing the situation but I can try and reiterate.

> Or even if the entire model is stored as a JSON document, one for each entity.
Yes, this is the case: this is the primary data store. There are also secondary, duplicate stores of this data in MySQL.

There is no single document describing this because we do not officially provide a stable interface for this data model. That is, we do not expect software other than Wikibase to touch these tables.

Perhaps it would be useful to ask what you would hope to do with this mapping?

Cheers,
Tom

On Mon, 4 Apr 2022 at 23:40, Veronica Santos <versant.2612@gmail.com> wrote:
Thank you Laurence. I will check these references carefully.
I had hoped there was some kind of mapping from the Conceptual Model (Wikidata Data Model) to the Physical Model (Wikibase relational schema).

On Fri, 1 Apr 2022 at 21:58, Laurence Parry <greenreaper@hotmail.com> wrote:
Hi Versant,

Some information you're looking for is stored within the Wikibase documentation, notably:
and 
but it must be read with reference to the main MediaWiki documentation, e.g.
https://mediawiki.org/wiki/Manual:Revision_table

As I understand it, primary entity storage is the same as other MediaWiki page content storage, but using JSON blobs, with slightly different [content] types for items and properties - the formats described at the second URL you gave.

Wikibase does not attempt to fully decompose an entity into SQL normal form, as you might expect from a typical SQL data store. The same applies to references, qualifiers, etc. A corollary of this is that it is inadvisable to store too much data on any single entity if it can be avoided, as it requires JSON parsing and creation proportional to the entity size, even if just one aspect is modified.
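
A minimal sketch of the cost described above, assuming a heavily simplified, hypothetical entity shape (real Wikibase JSON has more fields) and a hypothetical `set_label` helper. It only illustrates that changing one small aspect still means parsing and re-serializing the whole blob.

```python
import json


def set_label(blob: str, lang: str, text: str) -> str:
    """Change one label; the whole blob is still parsed and re-serialized."""
    entity = json.loads(blob)  # parse cost grows with entity size
    entity.setdefault("labels", {})[lang] = {"language": lang, "value": text}
    return json.dumps(entity)  # serialization cost grows with entity size


# Hypothetical entities: one nearly empty, one with many statements.
small = json.dumps({"id": "Q1", "labels": {}, "claims": {}})
large = json.dumps({
    "id": "Q2",
    "labels": {},
    "claims": {f"P{i}": [{"value": "x" * 50}] for i in range(1000)},
})

updated_small = set_label(small, "en", "example")
updated_large = set_label(large, "en", "example")
```

Both calls make the same one-label edit, but the second has to read and write every statement on the entity to do it.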

There are however secondary tables to support entities' different roles within Wikibase's editing and display interface (e.g. property information required to turn external links into URLs), and for performance - some shared between items and properties where requirements overlap, such as for labels; somewhat analogous to indexes on the JSON, some of which are updated after the blob itself. (If PostgreSQL's hstore or jsonb had been used, I'd imagine some being real indexes or at least automatically-derived fields.)
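
The "index on the JSON" analogy can be sketched roughly as follows, using SQLite in memory. The table names (`entity_blob`, `label_lookup`) and functions are hypothetical and do not reflect the actual Wikibase schema; the sketch also refreshes the secondary table in the same step as the blob, whereas some of the real secondary tables are updated afterwards.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Primary store: one JSON blob per entity (hypothetical table).
conn.execute("CREATE TABLE entity_blob (entity_id TEXT PRIMARY KEY, content TEXT)")
# Secondary store: labels extracted from the blob so they can be searched.
conn.execute("CREATE TABLE label_lookup (entity_id TEXT, lang TEXT, label TEXT)")
conn.execute("CREATE INDEX label_lookup_idx ON label_lookup (lang, label)")


def save_entity(entity: dict) -> None:
    """Write the blob, then rebuild the derived label rows for that entity."""
    eid = entity["id"]
    conn.execute(
        "INSERT OR REPLACE INTO entity_blob VALUES (?, ?)",
        (eid, json.dumps(entity)),
    )
    conn.execute("DELETE FROM label_lookup WHERE entity_id = ?", (eid,))
    conn.executemany(
        "INSERT INTO label_lookup VALUES (?, ?, ?)",
        [(eid, lang, text) for lang, text in entity.get("labels", {}).items()],
    )
    conn.commit()


def find_by_label(lang: str, label: str) -> list:
    """Look up entity IDs by label without parsing any JSON."""
    rows = conn.execute(
        "SELECT entity_id FROM label_lookup WHERE lang = ? AND label = ? "
        "ORDER BY entity_id",
        (lang, label),
    )
    return [row[0] for row in rows]


# Items and properties share the same label table in this sketch.
save_entity({"id": "Q1", "labels": {"en": "apple", "pt": "maçã"}})
save_entity({"id": "P1", "labels": {"en": "colour"}})
```

Here `find_by_label("en", "apple")` answers from the secondary table alone, which is the role such tables play alongside the blob.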

This storage is separate from any representation in WDQS, which derives from higher-level access to Wikibase entities and is more suitable for querying details otherwise stored within JSON. I believe this is described more in https://mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format - although said format may well differ from what is actually inside Blazegraph.

Hopefully people more experienced with the internals could chip in if you have more specific questions, after considering the above.
--
Laurence 'GreenReaper' Parry - WBUG

From: versant.2612@gmail.com <versant.2612@gmail.com>
Sent: Friday, April 1, 2022 11:34:53 PM
To: wikibaseug@lists.wikimedia.org <wikibaseug@lists.wikimedia.org>
Subject: [Wikibase] Wikibase relational schema
 
Is there any document that contains the mapping of the Wikidata Data Model onto a Wikibase relational schema?

I want to know, for example, whether Item is mapped to one MariaDB table and Property to another, or whether both are mapped to a single table (as Entity); whether ItemDescriptions has a separate table; how many tables support the Statement structure, including references and qualifiers; which column datatype is used to represent an IRI; and so on. Or even if the entire model is stored as a JSON document, one for each entity.

https://www.mediawiki.org/wiki/Wikibase/DataModel
https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_json.html
_______________________________________________
Wikibaseug mailing list -- wikibaseug@lists.wikimedia.org
To unsubscribe send an email to wikibaseug-leave@lists.wikimedia.org