Dear All,
although in Italy these data (even the basic data) are normally not available from the chambers of commerce, there is some open data from which we could extract several identifiers. Of course these are biased toward the suppliers of public administrations, because contracting with the PA is the trigger for being listed in this open data.
In the context of a broader effort to upload this kind of data to Wikidata, such as the one that seems to be emerging from this thread, the firm which I manage may be willing to contribute about half a million pairs of labels and VAT IDs. It's a relatively thin dataset, in the sense that you just have the name of the firm and the VAT ID, and possibly a link to a portal we're building in which you may gather additional information about the activity of this firm with the Italian public administration. But, as I was mentioning, Italian firm data are quite rare (they are not even available on OpenCorporates.com).
By the way, https://www.wikidata.org/wiki/Property:P3608 (EU VAT number) already exists and may provide a sufficient identifier in most cases, since the country ISO code (e.g. IT for Italy) + the national VAT ID usually generates the EU VAT number (the actual algorithm may be a bit more complex, but it's documented). (That said, there are also national identifiers for which it may be worth creating properties, such as the registration number at the national chambers of commerce, etc.)
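To make the "country code + national VAT ID" idea concrete, here is a minimal sketch in Python. The function names are hypothetical and not part of any Wikidata tooling. It assumes the Luhn-style check digit commonly described for the 11-digit Italian partita IVA, and it handles only one of the known exceptions to the simple rule (Greece uses the prefix EL rather than its ISO code GR). The sample number is made up for illustration.

```python
import re

# Assumption: Greece is the only override handled here; its VAT prefix
# is "EL" rather than the ISO 3166 code "GR".
VAT_PREFIX_OVERRIDES = {"GR": "EL"}


def italian_vat_checksum_ok(national_id: str) -> bool:
    """Luhn-style check for an 11-digit Italian VAT ID (assumed rule)."""
    if not re.fullmatch(r"\d{11}", national_id):
        return False
    total = 0
    for i, ch in enumerate(national_id):
        d = int(ch)
        if i % 2 == 1:  # every second digit (2nd, 4th, ...) is doubled
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def eu_vat_number(iso_country: str, national_id: str) -> str:
    """Prefix the national VAT ID with the country code (hypothetical helper)."""
    country = iso_country.upper()
    prefix = VAT_PREFIX_OVERRIDES.get(country, country)
    cleaned = re.sub(r"[\s.\-]", "", national_id)  # drop spaces, dots, dashes
    return prefix + cleaned


if __name__ == "__main__":
    sample = "12345678903"  # made-up number that passes the check
    if italian_vat_checksum_ok(sample):
        print(eu_vat_number("it", sample))
```

A value built this way could be used as the P3608 value for an item, after checking the format rules of the country concerned.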
About the value of these data on Wikidata, starting from our use case, I think that having permanent URIs for all firms on Wikidata would provide, for instance, great value for several anti-corruption projects around the world. (This could also provide a place to trace some international links among companies, which are not always readily available today.) That said, I perfectly understand the concerns of Andra in terms of scalability and maintenance, and this is one of the reasons I did not think of donating these data to Wikidata so far.
I'll try to follow these discussions, but please - Sebastian or others - feel free to ping me if the project goes on and you want to include these Italian data.
Best,
Federico
On Mon, Oct 16, 2017 at 10:25 AM, Andra Waagmeester andra@micelio.be wrote:
A dataset of comparable size on Belgian enterprises is available. With the same objective of enriching Wikidata with enterprise data, I recently proposed the following property: https://www.wikidata.org/wiki/Wikidata:Property_proposal/NACE_code
However, after some talks with others in the Wikidata community, I have recently had second thoughts on whether a full dump of this type of dataset is a valuable enrichment of Wikidata. Adding 2 million items with additional statements per item would be quite an enlargement of Wikidata. If we bot-added all businesses of both Belgium and Germany, we would have 4 million new items, which would currently account for 10% of all of Wikidata. I am not sure what this would mean in terms of scalability, or whether it would cause any scalability issues.
Maybe a use-case-driven approach would be more appropriate here. We could think of a bot that would draw on the trade registers of the different countries when a specific use case justifies the inclusion of trade data.
Just my 2cts
Cheers,
Andra
On Mon, Oct 16, 2017 at 9:48 AM, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de> wrote:
Thanks, done.
https://www.wikidata.org/wiki/Wikidata:Project_chat#Handelsregister
On 15.10.2017 22:10, Yaroslav Blanter wrote:
Hi Sebastian,
I would say the best way is to file a request for the permissions for the bot
https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot
and possibly leave a message on the Project Chat
https://www.wikidata.org/wiki/Wikidata:Project_chat
Cheers Yaroslav
On Sun, Oct 15, 2017 at 9:44 AM, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de> wrote:
Hi all,
the German business registry contains roughly 2.2 million organisations. Some information is paid, but other information is public, i.e. the info you get by searching at the following URL and clicking on UT (see example below):
https://www.handelsregister.de/rp_web/mask.do?Typ=e
I would like to add this to Wikidata, either by crawling or by raising money to use crowdsourcing platforms like CrowdFlower or Amazon Mechanical Turk.
It should meet notability criterion 2: https://www.wikidata.org/wiki/Wikidata:Notability
- It refers to an instance of a *clearly identifiable conceptual or material entity*. The entity must be notable, in the sense that it *can be described using serious and publicly available references*. If there is no item about you yet, you are probably not notable.
The reference is the official German business registry, which is serious and public. Organisations are also by definition clearly identifiable legal entities.
How can I get clearance to proceed on this?
All the best, Sebastian
Entity data
Saxony District court *Leipzig HRB 32853* – A&A Dienstleistungsgesellschaft mbH
Legal status: Gesellschaft mit beschränkter Haftung
Capital: 25.000,00 EUR
Date of entry: 29/08/2016 (When entering date of entry, wrong data input can occur due to system failures!)
Date of removal: -
Balance sheet available: -
Address (subject to correction): A&A Dienstleistungsgesellschaft mbH, Prager Straße 38-40, 04317 Leipzig
-- All the best, Sebastian Hellmann
Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata