Yep,
Please note that RDFSlice extracts a subset of the dump, that is, only the triples that contain the property you are looking for. Here are three example SPARQL queries:
PS: you can try them at https://query.wikidata.org.
*For your example:*
SELECT * WHERE { <http://www.wikidata.org/entity/Q1652291> <http://schema.org/description> ?o . FILTER(lang(?o) = 'en') . }
*For all English bios:*
SELECT * WHERE { ?s <http://schema.org/description> ?o . FILTER(lang(?o) = 'en') . }
*For bios in all languages (for your example item):*
SELECT * WHERE { <http://www.wikidata.org/entity/Q1652291> <http://schema.org/description> ?o . }
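In case it helps, here is a minimal Python sketch that builds these queries from an item ID and a language code, plus the URL you would request to run one against the query service. The function names (`description_query`, `request_url`) are made up for illustration; the `/sparql` endpoint with `query` and `format` parameters is how the public query service is normally called, but treat that as an assumption and check the service documentation before relying on it.

```python
from urllib.parse import urlencode

SCHEMA_DESCRIPTION = "http://schema.org/description"
ENTITY_PREFIX = "http://www.wikidata.org/entity/"
# Assumed endpoint of the public query service.
ENDPOINT = "https://query.wikidata.org/sparql"


def description_query(item_id=None, lang=None):
    """Build a SPARQL query for item descriptions ("bios").

    item_id: e.g. "Q1652291"; None means all items (?s).
    lang:    e.g. "en"; None means all languages.
    """
    subject = f"<{ENTITY_PREFIX}{item_id}>" if item_id else "?s"
    lang_filter = f" FILTER(lang(?o) = '{lang}') ." if lang else ""
    return f"SELECT * WHERE {{ {subject} <{SCHEMA_DESCRIPTION}> ?o .{lang_filter} }}"


def request_url(query):
    """Return a GET URL that asks the endpoint for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    print(description_query("Q1652291", "en"))  # your example
    print(description_query(lang="en"))         # all English bios
    print(description_query("Q1652291"))        # all languages, one item
    print(request_url(description_query("Q1652291", "en")))
```

The second query touches every item, so it may well time out on the public service; that one is better run against the dump.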
best, Edgard
On Mon, Feb 1, 2016 at 4:34 AM, Hampton Snowball <hamptonsnowball@gmail.com> wrote:
Thanks. I see it requires constructing a query to only extract the data you want. E.g. the graph pattern:
<graphPatterns> - desired query, e.g. "SELECT * WHERE {?s ?p ?o}" or graph pattern e.g. "{?s ?p ?o}"
Since I don't know how to construct queries, could you tell me the proper query to extract the short bio from all the pages (English Wikipedia, maybe other Wikipedias)?
For example, from https://www.wikidata.org/wiki/Q1652291:
"Turkish female given name" https://en.wikipedia.org/wiki/H%C3%BClya and optionally https://de.wikipedia.org/wiki/H%C3%BClya
Thanks in advance!
On Sun, Jan 31, 2016 at 3:53 PM, Edgard Marx <marx@informatik.uni-leipzig.de> wrote:
Hey, you can simply use RDFSlice (https://bitbucket.org/emarx/rdfslice/overview) directly on the dump file (https://dumps.wikimedia.org/wikidatawiki/entities/20160125/).
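To illustrate the idea of slicing a dump (this is not RDFSlice itself, just a rough Python sketch of the same principle), the snippet below streams a gzipped file line by line and keeps only the description triples in one language, so the full dataset never has to fit in memory. It assumes the dump is in N-Triples format, one triple per line; the file name in the usage note is hypothetical, and a Turtle or JSON dump would need converting first.

```python
import gzip

DESCRIPTION = "<http://schema.org/description>"


def is_wanted(line, lang="en"):
    """True if an N-Triples line is a schema:description literal in `lang`."""
    parts = line.split(" ", 2)  # subject, predicate, rest of the line
    if len(parts) < 3 or parts[1] != DESCRIPTION:
        return False
    # The object is a literal that ends with "@<lang> followed by " ."
    return parts[2].rstrip().endswith(f'"@{lang} .')


def slice_lines(lines, lang="en"):
    """Yield only the matching triples from any iterable of lines."""
    for line in lines:
        if is_wanted(line, lang):
            yield line


def slice_file(path, lang="en"):
    """Stream a gzipped N-Triples file (assumed format) and yield matches."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from slice_lines(handle, lang)
```

Usage would look something like `for triple in slice_file("dump.nt.gz"): print(triple, end="")`, where `dump.nt.gz` stands in for whatever file you downloaded.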
best, Edgard
On Sun, Jan 31, 2016 at 7:43 PM, Hampton Snowball <hamptonsnowball@gmail.com> wrote:
Hello,
I am interested in a subset of Wikidata and I am trying to find the best way to get it without downloading a larger dataset than necessary.
Is there a way to get just the "bios" that appear on the Wikidata pages below the name of the person or organization, as well as the link to the English Wikipedia page, or to all Wikipedia pages?
For example, from https://www.wikidata.org/wiki/Q1652291:
"Turkish female given name" https://en.wikipedia.org/wiki/H%C3%BClya and optionally https://de.wikipedia.org/wiki/H%C3%BClya
I know there is SPARQL, and this list previously helped me construct a query, but some requests seem to time out when they cover a large amount of data, so I am not sure it would work here.
I know the dumps contain the full dataset, but I am not sure whether any subset dumps are available or whether there is a better way of grabbing this data.
Thanks in advance, HS
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata