Hello,
I was checking out Wikidata and was wondering if it would be a good fit for a website I wanted to make to crowdsource data about retail products, storing properties like product name, description, UPC/GTIN, MPN, manufacturer, color, size, and so on.
I take it that due to Wikidata's Wikipedia notability requirement I'd have to operate my own Wikibase instance separate from Wikidata? In that case, is it still possible to integrate with Wikidata's ontologies, or do I have to have my own completely separate ontology from scratch (I'd hate to have to reinvent the real basic properties and constraints)? Are there similar projects I could look at to get an idea how to partially-fork Wikidata in this way?
Another thing I'm wondering about is how I would integrate data that wouldn't necessarily fit into the product data ontology, like customer reviews of the product, or sale offers (offers having their own properties like price, availability, condition, and hyperlink to seller's site) - things that aren't inherent characteristics of the item and change often. I was wondering if it would be easier to have a "wrapper" website that stores this data separately, while still integrating with the core product data from Wikibase. Does anyone have any experience with, or references to, projects doing an integration like that? I'm wondering what the easiest way to integrate the two would be - connect directly to the MySQL database, sync databases with hooks, SPARQL, etc.
Also, some of the data for this website I'd wish to populate by crawling online retail stores and manufacturers and performing edits with a bot. Some of these sites provide schema.org metadata, so I was wondering if that makes integration with Wikidata/Wikibase any easier, or do I still have to do some kind of manual mapping process between the two.
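To make the mapping question concrete, here is a minimal sketch of what a hand-written mapping step might look like for a bot. Everything in it is an assumption for illustration: the `SCHEMA_TO_WIKIDATA` table, the `map_product` helper, and in particular the property IDs (P3962, P176, P462), which should be checked on wikidata.org before use.

```python
import json

# Hypothetical, hand-maintained mapping from schema.org Product keys to
# Wikidata property IDs. The IDs are assumptions to verify, not a reference.
SCHEMA_TO_WIKIDATA = {
    "gtin13": "P3962",       # assumed: Global Trade Item Number
    "manufacturer": "P176",  # assumed: manufacturer
    "color": "P462",         # assumed: color
}

# Keys that would become the entity's label and description, not claims.
HANDLED_AS_TEXT = {"name", "description"}


def map_product(jsonld: str) -> dict:
    """Split a schema.org Product JSON-LD string into label, description,
    mapped claims, and the list of keys that still need a manual decision."""
    data = json.loads(jsonld)
    if data.get("@type") != "Product":
        raise ValueError("expected a schema.org Product object")

    claims = {}
    for key, pid in SCHEMA_TO_WIKIDATA.items():
        value = data.get(key)
        # Nested objects such as an Organization carry their text in "name".
        if isinstance(value, dict):
            value = value.get("name")
        if value is not None:
            claims[pid] = value

    # Anything not covered by the table is reported instead of dropped.
    unmapped = sorted(
        key for key in data
        if not key.startswith("@")
        and key not in SCHEMA_TO_WIKIDATA
        and key not in HANDLED_AS_TEXT
    )
    return {
        "label": data.get("name"),
        "description": data.get("description"),
        "claims": claims,
        "unmapped": unmapped,
    }
```

Even in this toy form, string values such as a manufacturer name would presumably still have to be reconciled to an existing item before an edit is made, and keys like `mpn` end up in `unmapped` until someone decides where they belong.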
Thanks for your patience with this braindump as I'm new to Wikidata and still trying to wrap my head around things. I did the Wikidata tutorials, messed around with a local Wikibase install using Docker Compose, and did a lot of clicking around Wikidata and reading about ontologies, but it feels like I've just barely scratched the surface!
Thank you, Abe Voelker
Hello Abe and welcome!
I'm working on inventaire.io https://inventaire.io, which might be the closest existing thing to what you're describing: for the needs of the book-sharing webapp, we maintain an open bibliographic database that uses Wikidata vocabulary and extends Wikidata https://wiki.inventaire.io/wiki/Data?lang=en with entries that don't meet the notability requirements and/or were automatically generated from data found on the web and couldn't be reconciled with existing entities on Wikidata. We build edition data primarily around ISBNs, which are part of GTINs. This wasn't built with Wikibase but with ad hoc software (see repo http://github.com/inventaire/inventaire/), as Wikibase federation wasn't ready at all when we started and is still missing some critical pieces today, but we are considering moving the bibliographic data into a dedicated federated Wikibase instance at some point https://github.com/inventaire/inventaire/issues/186. The rest of the data (users, inventories, transactions, maybe reviews https://trello.com/c/uwdkvGl1/114-book-reviews in the future) would keep its current form (documents in CouchDB databases without any relation to the Wikidata data model).
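As a small illustration of why ISBNs and GTINs can share tooling, here is a minimal sketch of the GS1 mod-10 check digit, which ISBN-13 and the GTIN-8/12/13/14 formats have in common. The function names are made up for this example and this is not code from the Inventaire repository.

```python
def gtin_check_digit(body: str) -> int:
    """Compute the GS1 mod-10 check digit for a GTIN body
    (all digits of the code except the final check digit)."""
    if not (body.isascii() and body.isdigit()):
        raise ValueError("GTIN body must contain only ASCII digits")
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost body digit.
    for position, char in enumerate(reversed(body)):
        total += int(char) * (3 if position % 2 == 0 else 1)
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Return True if code is an 8-, 12-, 13- or 14-digit GTIN (an ISBN-13
    included) whose last digit matches the computed check digit."""
    digits = code.replace("-", "").replace(" ", "")
    if not (digits.isascii() and digits.isdigit()):
        return False
    if len(digits) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(digits[:-1]) == int(digits[-1])
```

With this, an ISBN-13 such as `978-0-306-40615-7` and a 12-digit UPC such as `036000291452` go through the same `is_valid_gtin` check, which is one reason a product database can key both books and other goods on the same kind of identifier.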
So, to answer your question: I don't think Wikidata is the place to crowdsource data about retail products, but I'm convinced a database doing this should use Wikidata vocabulary! And just as we are glad that WikiProject_Books https://www.wikidata.org/wiki/Wikidata:WikiProject_Books and WikiCite https://www.wikidata.org/wiki/Wikidata:WikiCite/Roadmap exist to work on a consistent (*cough* almost *cough*) data model on books that we can reuse within Inventaire https://inventaire.github.io/entities-map/, there are several projects in or around Wikidata with which such a project could/should work:
- the WikiObject https://meta.wikimedia.org/wiki/WikiObject sister project proposal: you've got to check that out. Quico, the main contributor (in cc), has been doing quite some research on this very closely related project
- Wikidata:WikiProject_Companies https://www.wikidata.org/wiki/Wikidata:WikiProject_Companies
- Wikidata:WikiProject_Materials https://www.wikidata.org/wiki/Wikidata:WikiProject_Materials
- OpenFoodFacts http://openfoodfacts.org/, which also has to deal with GTINs and product properties, and which has expressed interest in getting more integrated with Wikidata https://en.wiki.openfoodfacts.org/Structured_Data/Wikidata
Also of interest:
- Open Product Data http://product-open-data.com/, a project that shared your idea but couldn't get the momentum(?)
- the GoodRelations http://www.heppnetz.de/projects/goodrelations/ ontology
- OpenCorporates https://opencorporates.com/, unfortunately not so open from what I could tell
I have been dreaming of such a database for a while now (see my (now old ><) articles P2P Resources Management https://maxlath.eu/articles/p2p-rm/, Wikidata and the apt-get of things https://maxlath.eu/articles/wikidata-and-the-apt-get-of-things/, Mapping resources using open knowledge https://maxlath.eu/articles/mapping-resources-using-open-knowledge/), and extending Inventaire to things other than books has always been in the category of possible futures, so I would be more than happy to hear more about any progress on this :)
Bests,
Maxime Lathuilière maxlath.eu http://maxlath.eu - twitter https://twitter.com/maxlath - mastodon http://mastodon.social/@maxlath - User:Maxlath https://www.wikidata.org/wiki/User:Maxlath inventaire.io https://inventaire.io - roadmap https://trello.com/b/0lKcsZDj/inventaire-roadmap - code https://github.com/inventaire/inventaire - mastodon https://mamot.fr/@inventaire - twitter https://twitter.com/inventaire_io - facebook https://facebook.com/inventaire.io for personal emails use max@maxlath.eu instead
On 19/09/2018 at 21:49, Abe Voelker wrote: [...]
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Maxime,
Wow, I greatly appreciate you taking the time to write that very thorough response!
Indeed inventaire.io looks very similar to what I'd like to do, and the data model page you linked to is very explanatory and helpful - thank you! I've been so torn because, as you say, using Wikidata vocabulary seems like the correct way to go, and I do really like Wikidata's editor (especially being able to add a reference for every statement). However, I'm not a PHP developer, nor do I have any experience administering MediaWiki software, so it would be a very steep hill for me to climb to customize Wikibase to suit my needs. I may instead end up writing some ad hoc software while trying to conform to Wikidata vocabulary as you say, with a view to making integration with a Wikibase instance easier later on.
Also, I know I made it sound general in my initial email by saying I wanted to catalog "retail products" but truth be told I'm actually only really interested in a specific niche of products - firearms and related accessories. Sorry if that was misleading but I thought being overly-specific might be distracting from describing what I was trying to do. In any case conforming to or integrating with a larger "Wikidata for products" is still aligned with my interest so thank you for that info and links.
You've given me a lot to pore over and ruminate on. I also see on your Wikidata page you've authored some useful tools that I will have to check out. Thanks again for taking the time; if I make anything useful I will share.
Best Regards, Abe
On Wed, Sep 19, 2018 at 5:17 PM Maxime Lathuilière groups@maxlath.eu wrote: [...]
Hey Abe, Maxime, cool that you've been looking into how to use Wikibase / Wikidata for something like a retail products / firearms database. I could definitely understand why you're considering that.
I suppose just using the vocabulary and writing your own software is probably the best way to go. I don't have any experience running my own Wikibase + MediaWiki instance, but I could imagine it might be a bit overkill if all you want is a database of firearms with metadata.
Of course, what you could do is try to make integration with Wikidata as easy as possible. There are a lot of items and properties that you could reuse, like manufacturers, colours, specific types of guns that all have a Q-number. If you link those up in your own database you could, in theory, get all the hard work from the volunteers (like translations and the like) without having to do the work yourself. The API is very useful for something like that.
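As a rough idea of what reusing those labels could look like, here is a minimal sketch built on the `wbgetentities` action of the Wikidata API. It only builds the request URL and parses an already-decoded JSON response; the actual HTTP call, caching and error handling are left out, and the helper names are made up for this example.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_labels_url(qids, languages):
    """Build a wbgetentities URL asking only for labels in the given languages."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(qids),
        "props": "labels",
        "languages": "|".join(languages),
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def extract_labels(response: dict) -> dict:
    """Turn a decoded wbgetentities response into {qid: {language: label}}."""
    labels = {}
    for qid, entity in response.get("entities", {}).items():
        labels[qid] = {
            lang: label["value"]
            for lang, label in entity.get("labels", {}).items()
        }
    return labels
```

A local record would then only need to store the Q-number (for a manufacturer, a colour, a type of gun) and could fill in display names per language from `extract_labels` when needed.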
You could even provide your own API, and as long as you provide permanent identifiers and a machine-readable format for your items, data might even flow back to Wikidata.
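To sketch what "permanent identifiers and a machine-readable format" might mean in practice, here is one possible shape for a record served by such an API. The base URI, the `ProductRecord` class, the field names and the `sameAs` key are all hypothetical choices for illustration; only the `http://www.wikidata.org/entity/` prefix is Wikidata's actual entity URI form.

```python
import json
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical base for permanent identifiers; once published it should not change.
BASE_URI = "https://example.org/entity/"
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


@dataclass
class ProductRecord:
    local_id: str                       # stable local identifier, never reused
    label: str
    claims: dict = field(default_factory=dict)  # keyed by Wikidata property ID
    wikidata_id: Optional[str] = None   # set once the item exists on Wikidata

    def to_json(self) -> str:
        """Serialize the record with a permanent URI and an optional
        link to the matching Wikidata entity."""
        doc = {
            "uri": BASE_URI + self.local_id,
            "label": self.label,
            "claims": self.claims,
        }
        if self.wikidata_id is not None:
            doc["sameAs"] = WIKIDATA_ENTITY + self.wikidata_id
        return json.dumps(doc, sort_keys=True)
```

Keeping claims keyed by Wikidata property IDs is a design choice that should make a later import into Wikidata or a Wikibase instance more mechanical, though the exact format would depend on what consumers expect.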
Kind regards, -- Hay
On Thu, Sep 20, 2018 at 7:16 PM Abe Voelker abe@abevoelker.com wrote: [...]