Excellent, I did some tests and after a few cycles I have already identified and classified several articles.

I will have a look at your script in the next few days, but I already have a question: the number of iterations is based on the total number of articles. How do you know that number?

--
Fabrizio

On Sat, 15 Dec 2018 at 10:18, Egon Willighagen <egon.willighagen@gmail.com> wrote:

The approach I use is the following, see this (Bioclipse/Groovy) script: https://gist.github.com/egonw/ca4c348b9a2d1116efcdb55fa85dd158

It takes advantage of a combination of a Blazegraph SPARQL trick and breaking things up into batches of a certain size:

SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE { ?art wdt:P31 wd:Q13442814 }
  LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}

where "$concept" is my search word in the title, and $batchSize and $offset take care of the batching by the script. This script creates QuickStatements.

Mind you, I manually check the created statements, because in my domain (biochem) a simple search results in false positives, hence the "blacklist" in the script :)

Egon

On Sat, Dec 15, 2018 at 10:13 AM Fabrizio Carrai <fabrizio.carrai@gmail.com> wrote:

Thanks Matthias,

that's a pity. Your suggestion relies on the items being well characterized, which, at the time of writing, is rather poor for my area of interest.

Would it be an idea to download all the "scholarly articles", select locally for the keyword of interest (e.g. "microgravity") and set the property P921 for all of them? QuickStatements may be helpful for the last step; any suggestions for other tools?

Thanks
Fabrizio

On Fri, 14 Dec 2018 at 22:16, Matthias Erfurth <erfurth@gmx.de> wrote:

Hi Fabrizio,

unfortunately you can't full-text search all the scholarly articles; you should rather work with indexed properties, so you can query for other articles with microgravity as main subject ... With the ajax based wikidata search

SELECT ?item
WHERE {
?item wdt:P31 wd:Q13442814;
wdt:P921 wd:Q48655.
}

Best regards,
ciao matthias

Sent: Friday, 14 December 2018 at 18:55
From: "Fabrizio Carrai" <fabrizio.carrai@gmail.com>
To: "Discussion list for the Wikidata project" <wikidata@lists.wikimedia.org>
Subject: Re: [Wikidata] Query on scholarly article fails

Thanks again to Ettore, but I immediately ran into another timeout when I just added a FILTER to find all the articles with the word "biokis" in the title:

SELECT ?istanza_di ?instanza_diLabel WHERE {
  ?istanza_di wdt:P31 wd:Q13442814.
  ?istanza_di rdfs:label ?instanza_diLabel.
  FILTER((LANG(?instanza_diLabel)) = "en").
  FILTER(CONTAINS(LCASE(?instanza_diLabel), "biokis"))
}
LIMIT 100

At least one article should be returned, but I got a timeout.

Thanks to anybody who can help
Fabrizio

On Fri, 14 Dec 2018 at 10:12, Ettore RIZZA <ettorerizza@gmail.com> wrote:

Hello Fabrizio,

It seems that the problem comes from SERVICE wikibase:label. As said in another discussion, the query executes in less than one second if you rewrite it in this way.

Cheers,
Ettore Rizza

On Fri, 14 Dec 2018 at 09:59, Fabrizio Carrai <fabrizio.carrai@gmail.com> wrote:

Hello all,

the following query ends with a timeout:

SELECT ?istanza_di ?istanza_diLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?istanza_di wdt:P31 wd:Q13442814.
}
LIMIT 10

Can anybody explain why?

Thanks in advance
--
Fabrizio
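
As a rough illustration of the batching Egon describes in this thread, the Python sketch below fills the $batchSize, $offset, $concept and $conceptQ placeholders of his query for each batch and formats a P921 ("main subject") statement for QuickStatements. It is not Egon's Bioclipse/Groovy script: the function names, the batch size and the total article count are assumptions made for illustration, and the sketch only builds query strings without contacting the query service.

```python
from string import Template

# Query text taken from the thread; the $-placeholders are filled in per batch.
QUERY_TEMPLATE = Template("""\
SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE { ?art wdt:P31 wd:Q13442814 }
  LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}
""")


def batch_offsets(total, batch_size):
    """Return the OFFSET values needed to cover `total` items in batches."""
    return list(range(0, total, batch_size))


def build_queries(concept, concept_q, total, batch_size):
    """Build one SPARQL query string per batch (hypothetical helper)."""
    return [
        QUERY_TEMPLATE.substitute(
            batchSize=batch_size,
            offset=offset,
            concept=concept.lower(),  # the query lowercases titles, so match that
            conceptQ=concept_q,
        )
        for offset in batch_offsets(total, batch_size)
    ]


def to_quickstatement(article_qid, concept_q):
    """Format one 'main subject' (P921) statement as a tab-separated line."""
    return f"{article_qid}\tP921\t{concept_q}"


if __name__ == "__main__":
    # Assumed numbers: the real total of scholarly articles is not given here.
    queries = build_queries("microgravity", "Q48655", total=250_000, batch_size=100_000)
    print(queries[0])
```

Each generated query would still have to be sent to the SPARQL endpoint, and, as Egon notes, the resulting statements checked by hand before uploading them.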
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
--
Hi, do you like citation networks? Already 51% of all citations are available for innovative new uses. Join me in asking the American Chemical Society to join the Initiative for Open Citations too. SpringerNature, the RSC and many others already did.

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/u/egonwillighagen