Hi,
I would like to join the thread I found in the archive: https://lists.wikimedia.org/pipermail/wikidata//2017-October/011259.html
I have worked in contextual research to facilitate knowledge transfer.
One of the domains I would like to address is the visualisation of economic networks.
I am aiming for an impact on the governance of innovation and on transparency over who controls economic networks. I would also like to allow SMEs and private citizens to build their own analytics and help prevent cases of collusion.
Information about business profiles is currently a premium service provided by specialised private corporations. Much of the information about companies is public, but there is a lack of open data policy.
I would like to fill this gap and contribute to feeding Wikidata as a repository, either in bulk or as a collective action. As a design thinker I could contribute by designing processes for filling in data, such as applications that facilitate the process.
*Is there any guidance or clearance for initiatives like this?*
I am happy to read about similar interest from Germany, Belgium and Italy, and I would like to connect.
I read that feeding Wikidata with corporate information would significantly increase its size. Still, I think the benefit of enabling queries for public governance would be to distribute the governance of economic data more widely.
Aside from public services like: https://www.gov.uk/government/organisations/companies-house
I would like to allow data-visualisation researchers (like myself) to uncover for the public results like: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0025995
Such results rely on private partnerships to access corporate databases, so the findings cannot be queried by the public.
*Is there a specific Wikidata policy to comply with when feeding in data from website scrapers?*
As a start, the URI of sites with a good reputation could act as an *identifier*. I believe that scraping would be legitimate because the "facts" behind the properties listed below are public, and because the organisations that collate the data provide services (such as professional communities, or services augmented with private data) that would not be in competition with building a repository.
In a way, I see Wikidata as a possibility for indexing data that can be useful to search engines and discovery engines, and indexing data is an activity that such services run daily. I believe that enabling public transparency would enhance open-data services.
Properties I would be interested in are:
- TEAM (founders)
- DESCRIPTION (corporate description of products and services)
- INVESTORS (corporate and private equity)
- EMPLOYEES / INCUBATORS / ADVISORS (personal information available as public information on the web)
- PARTICIPATED COMPANIES
- DATE of acquisition of, or participation in, companies
- CAPITAL (if available, or in ranges)
- VAT NUMBER (or registry number)
- ADDRESS
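To make the list above more concrete, here is a minimal sketch of how one such business profile could be held as a record before it is mapped to Wikidata properties. All class and field names are hypothetical, the `name` field is an assumption that is not in the list, and nothing here reflects an existing Wikidata or OpenCorporates schema.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional, Tuple


@dataclass
class Participation:
    """A stake held in another company (hypothetical structure)."""
    company: str
    date: Optional[str] = None  # ISO date of acquisition or participation, if known


@dataclass
class CompanyProfile:
    """One business profile, with one field per property of interest."""
    name: str  # assumption: a display name, not part of the original list
    description: str = ""                                   # DESCRIPTION
    team: List[str] = field(default_factory=list)           # TEAM (founders)
    investors: List[str] = field(default_factory=list)      # INVESTORS
    people: List[str] = field(default_factory=list)         # EMPLOYEES / INCUBATORS / ADVISORS
    participations: List[Participation] = field(default_factory=list)  # PARTICIPATED COMPANIES + DATE
    capital_range: Optional[Tuple[int, int]] = None         # CAPITAL as a (low, high) range
    registry_number: Optional[str] = None                   # VAT NUMBER or registry number
    address: Optional[str] = None                           # ADDRESS

    def missing(self) -> List[str]:
        """Return the names of the fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Made-up example values, only to show which gaps a contributor would still need to fill.
profile = CompanyProfile(name="Example Ltd", team=["A. Founder"], registry_number="00000000")
print(profile.missing())
```

A helper like `missing()` could drive a crowd-sourcing form that only asks contributors for the fields that are still empty.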
Any other ideas for fetching the business profiles of companies? The information should be publicly available in some form, since each corporation reports to its company registry, and there are already private companies offering analytics over business profiles.
Luigi
Hi Luigi,
Have you looked at https://opencorporates.com ?
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Hi Thad,
It is a really great project. Let me quote some of Sebastian's points:
> # regarding Opencorporates
> I have a critical opinion with Opencorporates. It appears to be
> open, but you actually can not get the data. If somebody has a
> data dump, please forward to me. Thanks.
>
> More on top, I consider Opencorporates a danger to open data. It
> appears to push open availability of data, but then it is limited
> to open licenses. Usefulness is limited as there are no free dumps
> and no possibility to duplicate it effectively. Wikipedia and
> Wikidata provide dumps and an API for exactly this reason.
> Every time somebody wants to create an open organisation dataset
> with no barriers, the existence of Opencorporates is blocking this.
I think that being able to run analyses in bulk is important.
Some data in OpenCorporates is incomplete, such as founders, capital raised and investors, even though some information is fed in by users. Currently most of the data is about the US and NZ, and I'd like to see the EU better represented.
I would like to be able to visualise a network of companies and their participations, and to build bipartite graphs between persons and companies. I will try to reach out to them about cooperating on such a project.
Do you have connections with them?
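To illustrate what I mean by a bipartite graph between persons and companies, here is a minimal sketch in plain Python. The function names and the sample links are hypothetical, and a real dataset would come from a register dump rather than a hand-written list.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def build_bipartite(links: Iterable[Tuple[str, str]]):
    """Build both sides of a person-company graph from (person, company) pairs."""
    person_to_companies: Dict[str, Set[str]] = defaultdict(set)
    company_to_persons: Dict[str, Set[str]] = defaultdict(set)
    for person, company in links:
        person_to_companies[person].add(company)
        company_to_persons[company].add(person)
    return dict(person_to_companies), dict(company_to_persons)


def shared_person_pairs(person_to_companies: Dict[str, Set[str]]):
    """Project the graph onto companies: pairs of companies linked by at least one person."""
    pairs: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for person, companies in person_to_companies.items():
        for a, b in combinations(sorted(companies), 2):
            pairs[(a, b)].add(person)
    return dict(pairs)


# Made-up example data: Alice sits in two companies, Bob in one.
links = [("Alice", "Acme"), ("Alice", "Beta"), ("Bob", "Beta")]
persons, companies = build_bipartite(links)
print(shared_person_pairs(persons))
```

The projection onto companies is the part that would matter for spotting possible collusion, since it shows which companies are connected through the same people.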
On Thu, Oct 19, 2017 at 2:17 PM, Thad Guidry thadguidry@gmail.com wrote:
> Hi Luigi,
> Have you looked at https://opencorporates.com ?
> Thad +ThadGuidry https://www.google.com/+ThadGuidry
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
No connections to Opencorporates, sorry.
The good news is that the data sources in Opencorporates (the Registers) are accessible to you...sometimes in dump format.
https://opencorporates.com/registers
Hope that helps you further in your research and needs. I am not saying it's easy :)
-Thad
Hi Luigi,
I favour cooperation with OpenCorporates instead of independently adding lots of company records to Wikidata. Sure, there are parallel strategies, but any effort should also include OpenCorporates to some degree.
OpenCorporates is licensed under ODbL (just added this referenced statement to Q7095760) and we have property P1320 to link Wikidata and OpenCorporates. A first step would be to align
https://opencorporates.com/registers
https://en.wikipedia.org/wiki/List_of_company_registers
Right now we have 18 instances of company register (Q1394657) and its subclasses explicitly classified as such in Wikidata.
These items should be linked with the registers listed at OpenCorporates, e.g.
UK Companies House (Q257303) = https://opencorporates.com/registers/270
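As a rough sketch of that alignment step, the following builds a SPARQL query for the register items and tracks which of them already have an OpenCorporates register URL. The function names are hypothetical, the query is only constructed and not sent anywhere, and the single alignment entry is the one given above.

```python
from typing import Dict, Iterable, List

COMPANY_REGISTER_QID = "Q1394657"  # "company register", as cited above


def register_query(class_qid: str = COMPANY_REGISTER_QID) -> str:
    """Return a SPARQL query listing instances of a class and of its subclasses."""
    return (
        "SELECT ?register ?registerLabel WHERE {\n"
        f"  ?register wdt:P31/wdt:P279* wd:{class_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def unaligned(wikidata_qids: Iterable[str], alignment: Dict[str, str]) -> List[str]:
    """Return the register items that have no OpenCorporates register URL yet."""
    return sorted(q for q in wikidata_qids if q not in alignment)


# The only pair given in this thread; further pairs would have to be curated by hand.
alignment = {"Q257303": "https://opencorporates.com/registers/270"}

print(register_query())
```

The query text could be pasted into the Wikidata Query Service, and the resulting QIDs passed to `unaligned()` to see which registers still need a match.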
I've also noticed that OpenCorporates has a field for "Identifiers" where Wikidata QIDs may be included to have two-way links between the two datasets.
Anyway, better contact https://opencorporates.com/info/contributing at least to let them know about your plans.
Cheers, Jakob
Is there any RDF dump available of OpenCorporates data? Or even any dump at all? Their licensing terms are ambiguous... They say it's released under ODbL, but if I want to use the data I have to ask permission and they will decide if I can use it for free or if I have to pay a fee :/
Sent: Wednesday, October 25, 2017 at 9:44 AM
From: "Jakob Voß" Jakob.Voss@gbv.de
To: wikidata@lists.wikimedia.org
Subject: Re: [Wikidata] Kickstartet: Adding 2.2 million German organisations to Wikidata
> Hi Luigi,
> I favour cooperation with OpenCorporates instead of independently adding lots of company records to Wikidata. Sure, there are parallel strategies, but any effort should also include OpenCorporates to some degree.
> OpenCorporates is licensed under ODbL (just added this referenced statement to Q7095760) and we have property P1320 to link Wikidata and OpenCorporates. A first step would be to align
> https://opencorporates.com/registers
> https://en.wikipedia.org/wiki/List_of_company_registers
> Right now we have 18 instances of company register (Q1394657) and its subclasses explicitly classified as such in Wikidata.
> These items should be linked with the registers listed at OpenCorporates, e.g.
> UK Companies House (Q257303) = https://opencorporates.com/registers/270
> I've also noticed that OpenCorporates has a field for "Identifiers" where Wikidata QIDs may be included to have two-way links between the two datasets.
> Anyway, better contact https://opencorporates.com/info/contributing at least to let them know about your plans.
> Cheers, Jakob
> -- Jakob Voß jakob.voss@gbv.de Verbundzentrale des GBV (VZG) / Common Library Network Platz der Goettinger Sieben 1, 37073 Göttingen, Germany +49 (0)551 39-10242, http://www.gbv.de/
Laura,
Talk to OpenCorporates and ask those questions yourself. Get involved! :)
-Thad +ThadGuidry https://plus.google.com/+ThadGuidry
On Wed, Oct 25, 2017 at 3:22 AM Laura Morales lauretas@mail.com wrote:
> Is there any RDF dump available of OpenCorporates data? Or even any dump at all? Their licensing terms are ambiguous... They say it's released under ODbL, but if I want to use the data I have to ask permission and they will decide if I can use it for free or if I have to pay a fee :/
OK, just asked. Their reply was that they "reserves the right under paragraph 3.3 of ODbL to release the database under different terms", which is to say their data is NOT free because they want to control how and where the data is used. Are we starting to see "free vs open" all over again, this time with data instead of software?
Sent: Wednesday, October 25, 2017 at 5:06 PM
From: "Thad Guidry" thadguidry@gmail.com
To: "Discussion list for the Wikidata project." wikidata@lists.wikimedia.org
Subject: Re: [Wikidata] Kickstartet: Adding 2.2 million German organisations to Wikidata
> Laura,
> Talk to OpenCorporates and ask those questions yourself. Get involved ! :)
> -Thad +ThadGuidry https://plus.google.com/+ThadGuidry
I think Laura raised a very good point here. One broader question:
Is the Wikidata team thinking about moving their dataset onto a *blockchain*? That, I believe, would incentivise people to participate, maintain, and even craft useful things with clear licenses (eventually profiting based on the utility of a thing).
Again, the possibility to compute or process things that are of public utility or relevant to governance depends on the accessibility of data. Otherwise it depends on the very large computational, legal and negotiating power of a few stakeholders.
Making data accessible to multiple stakeholders and public audiences, or at least mirroring the metadata, would allow sane "competition" and collaboration in governance, with concrete applications that save $$$, like preventing collusion between companies.
I also tried to connect with OpenCorporates, both their research team and their CEO. No answer so far.
On Thu, Oct 26, 2017 at 2:40 PM, Laura Morales lauretas@mail.com wrote:
> OK, just asked. Their reply was that they "reserves the right under paragraph 3.3 of ODbL to release the database under different terms", which is to say their data is NOT free because they want to control how and where the data is used. Are we starting to see "free vs open" all over again, this time with data instead of software?
Laura Morales wrote:
> OK, just asked. Their reply was that they "reserves the right under paragraph 3.3 of ODbL to release the database under different terms", which is to say their data is NOT free because they want to control how and where the data is used. Are we starting to see "free vs open" all over again, this time with data instead of software?
This means we could re-publish the data openly once we actually get it, but they make it hard to get their data :-(
I'd still try to be open about OpenCorporates and keep on asking them. If they don't switch to more open data sharing, they will likely be replaced, that's for sure. So work independently from OpenCorporates but stay compatible, unless they actively refuse to work with Wikidata in any way.
Cheers, Jakob
As a general question, *is it discouraged or encouraged to mirror corporate data on Wikidata as a public repository?*
*Could you provide a bullet list of reasons why it would be discouraged?*
*How does the decision process work?*
Referring to https://www.wikidata.org/wiki/Wikidata:Introduction:
> Data is entered and maintained *by Wikidata editors*, who decide on the rules of content creation and management.
Referring to:
> *A secondary database.* Wikidata records not just statements, but also their sources, and connections to other databases.
*According to Wikidata editors, is it possible to index web sources and collate their data on WD?* How do they deal with bulk uploads or pieces of data that may be provided by users?
Indexing the web does not require agreements; the web crawlers of search engines work exactly like that.
Here, a crowd of people can coordinate themselves to create a *consistent* database.
I believe consistency is key to serving *anyone in the world.*
Anyone can use Wikidata for any number of different ways by using its application programming interface.
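As a small illustration of that access path, here is a sketch that builds a request URL for the `wbgetentities` action of the Wikidata API. The helper name is hypothetical, the choice of `props` and `languages` is just one possible configuration, and the code only constructs the URL without fetching it.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def entity_url(qid: str, languages: str = "en") -> str:
    """Build a wbgetentities request URL for one item (labels and claims only)."""
    params = {
        "action": "wbgetentities",
        "ids": qid,
        "props": "labels|claims",
        "languages": languages,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Q257303 is the item for UK Companies House mentioned earlier in this thread.
print(entity_url("Q257303"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the item as JSON, which is the kind of input a visualisation tool could start from.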
I think applications that have "value" in the sense of corporate datasets can be built on data that includes business profiles and ownership of participated or subsidiary companies, as well as the stakeholders who participate in the business.
Imagine a minimised version of Bloomberg or Bureau van Dijk, free to serve *anyone in the world.*
*****
I think I could contribute in three ways:
- collecting data
- designing test applications to facilitate crowd-sourced addition of data
- providing a simplified guide to using Wikidata properties for a specific case (a kind of infographic, though I would need very clear guidance on the entities and properties for corporations).
On Fri, Oct 27, 2017 at 9:14 AM, Jakob Voß Jakob.Voss@gbv.de wrote:
> This means we could re-publish the data openly once we actually get it, but they make it hard to get their data :-(
> I'd still try to be open about OpenCorporates and keep on asking them. If they don't switch to more open data sharing, they will likely be replaced, that's for sure. So work independently from OpenCorporates but stay compatible, unless they actively refuse to work with Wikidata in any way.
> Cheers, Jakob