I have noticed a lack of actual data in Wikidata representations of
Wikipedia lists.
For example,
List of countries by tax revenue to GDP ratio
https://en.wikipedia.org/wiki/List_of_countries_by_tax_revenue_to_GDP_ratio
to
List of countries by tax revenue as percentage of GDP (Q2529105)
https://www.wikidata.org/wiki/Q2529105
Is there currently an effort in the Wikidata community to
transform these lists into Wikibase items and, last but not least,
produce RDF representations for WDQS?
best,
Marco
--
---
Marco Neumann
KONA
Magnus, or anyone else who may be able to advise:
I'd like to add the Photographers' Identities Catalog (PIC) entries to Mix
n Match. I have about 128,000 entries for photographers in PIC, of which I
have already matched ~14,000 to Wikidata entries. My PIC IDs are already in
the corresponding Wikidata entries. I assume I should remove these from the
file before I upload it to Mix n Match, but wanted to check first.
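Stripping the already-matched entries before upload could look like the sketch below. The one-PIC-ID-per-first-column TSV layout and the file names are assumptions for illustration, not the actual PIC export format:

```python
import csv

def filter_unmatched(entries_path, matched_ids, out_path):
    """Copy only the rows whose PIC ID is not already matched to Wikidata.

    entries_path: TSV with the PIC ID in the first column (assumed layout).
    matched_ids:  set of PIC IDs already present on Wikidata items.
    Returns the number of rows kept.
    """
    with open(entries_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t")
        kept = 0
        for row in reader:
            if row and row[0] not in matched_ids:
                writer.writerow(row)
                kept += 1
    return kept
```

The set of matched IDs could come from a WDQS query for the PIC ID property, so the filter always reflects the current state of Wikidata.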
Thanks in advance,
David
David Lowe | The New York Public Library
Specialist II, Photography Collection
Photographers' Identities Catalog <http://pic.nypl.org>
Hi Sebastian,
This is huge! It will cover almost all currently existing German companies. Many of these will have similar names, so preparing for disambiguation is a concern.
A good approach would be: proposing a property for an external identifier, loading the data into Mix-n-match, creating links for companies already in Wikidata, and adding the rest (or perhaps only part of them - I'm not sure if having all of them in Wikidata makes sense, but that's another discussion), preferably with location and/or sector of trade in the description field.
I’ve tried to figure out what could be used as the key for an external identifier property. However, it looks like the registry does not offer any (persistent) URL to its entries. So for looking up a company, there are apparently two options:
- conducting an extended search for the exact string “A&A Dienstleistungsgesellschaft mbH“
- copying the register number “32853”, selecting the court (Leipzig) from the corresponding dropdown list, and searching for that
Both ways are not very intuitive, even if we can provide a link to the search form. This makes for a weak connection to the source of information. More importantly, it makes disambiguation in Mix-n-match difficult. This applies to the preparation of your initial load (you would not want to create duplicates), but even more so to everybody else who wants to match his or her data later on. Being forced to search for entries manually, in a cumbersome way, to disambiguate against a new, possibly large and rich dataset is, in my eyes, not something we want to impose on future contributors. And often the free information they find in the registry (formal name, register number, legal form, address) will not easily match the information they have (common name, location, perhaps founding date, and most importantly sector of trade), so disambiguation may still be difficult.
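To make the composite-key idea concrete, here is a small sketch of deriving an identifier value and a Mix-n-match catalogue line from the registry's natural key. The "Court RegisterType Number" format and the three-column id/name/description import layout are assumptions for illustration, not an agreed scheme:

```python
def make_key(court, register_type, number):
    """Compose a candidate identifier from the registry's natural key.

    The 'Court RegisterType Number' format mirrors how entries are cited
    (e.g. 'Leipzig HRB 32853'); it is an assumption, not an official scheme.
    """
    return f"{court} {register_type} {number}"

def mixnmatch_line(court, register_type, number, name, location, sector=""):
    """One tab-separated catalogue line: id, name, description.

    Location and sector of trade go into the description, as suggested
    above, to help future contributors disambiguate similar names.
    """
    desc = ", ".join(part for part in (location, sector) if part)
    return "\t".join([make_key(court, register_type, number), name, desc])
```

Since the registry offers no persistent URLs, such a composite key is probably the only stable handle a property could store.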
Have you checked which parts of the accessible information shown below can legally be crawled and added to external databases such as Wikidata?
Cheers, Joachim
--
Joachim Neubert
ZBW – German National Library of Economics
Leibniz Information Centre for Economics
Neuer Jungfernstieg 21
20354 Hamburg
Phone +49-42834-462
From: Wikidata [mailto:wikidata-bounces@lists.wikimedia.org] On behalf of Sebastian Hellmann
Sent: Sunday, 15 October 2017 09:45
To: wikidata(a)lists.wikimedia.org<mailto:wikidata@lists.wikimedia.org>
Subject: [Wikidata] Kickstarter: Adding 2.2 million German organisations to Wikidata
Hi all,
the German business registry contains roughly 2.2 million organisations. Some information is paid, but other parts are public, i.e. the info you find when searching at the link below and clicking on UT (see example below):
https://www.handelsregister.de/rp_web/mask.do?Typ=e
I would like to add this to Wikidata, either by crawling or by raising money to use crowdsourcing platforms like CrowdFlower or Amazon Mechanical Turk.
It should meet notability criterion 2: https://www.wikidata.org/wiki/Wikidata:Notability
2. It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references. If there is no item about you yet, you are probably not notable.
The reference is the official German business registry, which is serious and public. Organisations are also by definition clearly identifiable legal entities.
How can I get clearance to proceed on this?
All the best,
Sebastian
Entity data
Saxony District court Leipzig HRB 32853 – A&A Dienstleistungsgesellschaft mbH
Legal status:
Gesellschaft mit beschränkter Haftung
Capital:
25.000,00 EUR
Date of entry:
29/08/2016
(When entering date of entry, wrong data input can occur due to system failures!)
Date of removal:
-
Balance sheet available:
-
Address (subject to correction):
A&A Dienstleistungsgesellschaft mbH
Prager Straße 38-40
04317 Leipzig
--
All the best,
Sebastian Hellmann
Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org
(cross-posting to Wikitech and Wikidata; Wikidata is getting this due to
the general interest in data sets and databases)
Greetings,
The Platform Scoring Team at the Wikimedia Foundation is developing an
auditing tool for ORES, the Judgement and Dialogue Engine (JADE) [0]. It is
largely inspired by the false-positive reporting work that currently occurs
by hand on wiki pages. JADE's purpose is to provide human oversight of
ORES's AI work.
The team is seeking input as development begins in two technical
areas: the database schema [1] and the implementation strategy
[2]. If you have thoughts or experience to share in how to best set these
up, please have a look over the pages. Questions, comments, concerns are
welcome on the talk pages [3][4].
Feedback and stories on using ORES in production are also welcome, to help
give the team general ideas about areas of concern and/or improvement that
JADE should keep in mind [5].
Thank you for your time, see you on the wikis.
0. https://www.mediawiki.org/wiki/JADE
1. https://www.mediawiki.org/wiki/JADE/Schema
2. https://www.mediawiki.org/wiki/JADE/Implementations
3. https://www.mediawiki.org/wiki/Talk:JADE/Schema
4. https://www.mediawiki.org/wiki/Talk:JADE/Implementations
5. https://www.mediawiki.org/wiki/Talk:JADE
--
Keegan Peterzell
Technical Collaboration Specialist
Wikimedia Foundation
Google Code-in is an annual contest for 13-17 year old students. It
will take place from November 28 to January 17 and is not only about coding tasks.
While we wait to hear whether Wikimedia will be accepted:
* You have small, self-contained bugs you'd like to see fixed?
* Your documentation needs specific improvements?
* Your user interface has small design issues?
* Your Outreachy/Summer of Code project welcomes small tweaks?
* You'd enjoy helping someone port your template to Lua?
* Your gadget code uses some deprecated API calls?
* You have tasks in mind that welcome some research?
Also note that "Beginner tasks" (e.g. "Set up Vagrant" etc) and
"generic" tasks are very welcome (e.g. "Choose & fix 2 PHP7 issues
from the list in https://phabricator.wikimedia.org/T120336 ").
Because we will need hundreds of tasks. :)
And we also have more than 400 unassigned open 'easy' tasks listed:
https://phabricator.wikimedia.org/maniphest/query/HCyOonSbFn.z/#R
Would you be willing to mentor some of those in your area?
Please take a moment to find / update [Phabricator etc.] tasks in your
project(s) which would take an experienced contributor 2-3 hours. Check
https://www.mediawiki.org/wiki/Google_Code-in/Mentors
and please ask if you have any questions!
For some achievements from last round, see
https://blog.wikimedia.org/2017/02/03/google-code-in/
Thanks!,
andre
--
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/
Hi,
Regarding question 1,
the DBpedia ontology is available at http://wiki.dbpedia.org/services-resources/ontology , with a link to the ontology file at http://downloads.dbpedia.org/2014/dbpedia_2014.owl.bz2
You can see at [1] that there are terms in Arabic.
HTH
Best,
Ghislain
[1] http://lov.okfn.org/dataset/lov/vocabs/dbpedia-owl
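One way to check for Arabic terms yourself is to count rdfs:label literals per language tag in the downloaded OWL file. A minimal sketch using only the standard library; the SAMPLE document below is an invented miniature in the style of the DBpedia ontology dump, and the real input would be the decompressed dbpedia_2014.owl file:

```python
import xml.etree.ElementTree as ET

RDFS_LABEL = "{http://www.w3.org/2000/01/rdf-schema#}label"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def labels_by_language(rdfxml):
    """Count rdfs:label literals per xml:lang tag in an RDF/XML document."""
    counts = {}
    for label in ET.fromstring(rdfxml).iter(RDFS_LABEL):
        lang = label.get(XML_LANG, "und")  # 'und' = undetermined language
        counts[lang] = counts.get(lang, 0) + 1
    return counts

# Invented miniature sample; not actual DBpedia ontology content.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
                     xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://dbpedia.org/ontology/Person">
    <rdfs:label xml:lang="en">person</rdfs:label>
    <rdfs:label xml:lang="ar">\u0634\u062e\u0635</rdfs:label>
  </owl:Class>
</rdf:RDF>"""
```

Running this over the full dump would show which languages have label coverage, which also answers whether the English ontology can be connected to Arabic RDF: the class and property URIs are language-independent, only the labels are localized.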
> On 12 Oct 2017 at 22:48, Aman Slama <eng.slamaa(a)yahoo.com> wrote:
>
> From chapter 10 of the book "A Developer's Guide to the Semantic Web", this quote: "It is legal to create RDF statements without using any ontology at all, but the resources described by these statements will not have any precisely defined meanings; they will not be able to participate in any reasoning process or take advantage of any benefit from aggregation with other resources; these RDF statements will simply remain isolated, with fairly limited value. Generating RDF statements without using an ontology was a major drawback in the early versions of DBpedia's extractor."
> 1. Now, how can I know whether an ontology exists for the DBpedia versions in different languages?
> 2. Ontology and localized DBpedia - does an ontology for Arabic exist?
> 3. If it does not exist, what are the steps of ontology creation?
> 4. Can I use and connect the English ontology with Arabic RDF?
>
>
> Best Regards
> Eng.Amany Slamaa
> Teaching Assistant
> Computer and Information Science
> Nile institute for technology sciences
>
---------------------------------------
Ghislain A. Atemezing, Ph.D
Mail: ghislain.atemezing(a)gmail.com
Web: https://w3id.org/people/gatemezing <http://www.atemezing.org/>
Twitter: @gatemezing
About Me: https://about.me/ghislain.atemezing
Thanks Andra and Magnus,
It is good to know that the query works fine; I was worried about the
number of records.
@Andra, in case of any doubt, would the GeneWiki team be our contact? Of
course we would try the list first, but I think we need some kind of direct
contact with the team behind the pages we are linking to.
So, please let me know how to proceed. Should we keep this discussion
here, or move it to the GeneWiki team's mails? I do not want to misuse the
list for something that might not be of broad interest.
Regards,
------------------------------
Message: 5
Date: Mon, 9 Oct 2017 15:31:02 +0200
From: Andra Waagmeester <andra(a)micelio.be>
To: "Discussion list for the Wikidata project."
<wikidata(a)lists.wikimedia.org>
Cc: Andrew Su <asu(a)scripps.edu>, Gregory Stupp <gstupp(a)scripps.edu>
Subject: Re: [Wikidata] Linking to wikidata pages
Message-ID:
<CAMNM0fWyWeuUhHTmQc9bKGyXANC64O3NSRfK=UzmadgZ9Sc8xg(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Leyla,
I just tried that query without limits, and the response time is quite
reasonable. I don't see any problem with downloading it directly from
Wikidata through this SPARQL query.
We, the GeneWiki team, regularly maintain protein items on Wikidata
with our family of bots. Happy to discuss any cross-pollination.
Cheers,
Andra Waagmeester
On Mon, Oct 9, 2017 at 3:15 PM, Leyla Garcia <ljgarcia(a)ebi.ac.uk> wrote:
Hi all,
We would like to link from our pages to Wikidata pages. Who should we
contact in this regard? We would need a contact other than the mailing
list, if possible.
I also want to make sure we will not disturb the SPARQL endpoint service
with our query. We want to retrieve all pages pointing to a UniProt entry,
regardless of the taxon. So far we have this query:
SELECT ?item ?itemLabel ?UniProt_ID ?taxonID WHERE {
  ?item wdt:P352 ?UniProt_ID ;
        wdt:P703 ?taxon .
  ?taxon wdt:P31 wd:Q16521 ;
         wdt:P685 ?taxonID .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
I tried it with a limit of 100 and it worked fine, but I am wondering
what the recommended way would be if we want them all.
By the way, what would be the way to build that query using the query
helper? I did not manage to, so I wrote it manually.
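For bulk retrieval without a LIMIT, a minimal sketch of calling the public WDQS endpoint over HTTP from Python. The endpoint URL is the standard one; the bot name and contact address in the User-Agent are placeholders (WDQS asks clients to identify themselves in the User-Agent header):

```python
import json
import urllib.parse
import urllib.request

WDQS = "https://query.wikidata.org/sparql"

def build_url(query):
    """Encode a SPARQL query for the Wikidata Query Service, JSON results."""
    return WDQS + "?" + urllib.parse.urlencode({"query": query, "format": "json"})

def run_query(query, user_agent="example-bot/0.1 (contact@example.org)"):
    """Execute the query and return the result bindings.

    The User-Agent value is a placeholder; use your own tool name and
    contact address, as the service's usage policy requests.
    """
    req = urllib.request.Request(build_url(query),
                                 headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["results"]["bindings"]
```

Since the full result set is large, a gentler alternative is to page through it with LIMIT/OFFSET in the query string and sleep between requests.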
Regards,
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata