I estimate the total by rounding up from the DOI/PubMed ID counts shown on https://tools.wmflabs.org/scholia/
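As a rough sketch of that arithmetic (the total and the batch size below are made-up placeholders, not figures taken from Scholia), the number of iterations is a ceiling division of the estimated article count by the batch size:

```python
def number_of_batches(total_articles: int, batch_size: int) -> int:
    """Return how many LIMIT/OFFSET batches are needed to cover all articles."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division: a partially filled last batch still counts.
    return (total_articles + batch_size - 1) // batch_size


# Hypothetical numbers, for illustration only.
print(number_of_batches(1_000_001, 500_000))
```

Overestimating the total only costs a few empty batches at the end, so rounding up is the safe direction.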

Egon

On Sat, Dec 15, 2018 at 3:03 PM Fabrizio Carrai <fabrizio.carrai@gmail.com> wrote:
Excellent, I ran some tests and after a few cycles I had already identified and classified several articles.
I will have a look at your script in the next few days, but I already have a question: the number of iterations is based on the total number of articles; how do you know that number?

---
Fabrizio

On Sat, Dec 15, 2018 at 10:18 AM Egon Willighagen <egon.willighagen@gmail.com> wrote:

The approach I use is the following, see this (Bioclipse/Groovy) script: https://gist.github.com/egonw/ca4c348b9a2d1116efcdb55fa85dd158

It combines a Blazegraph SPARQL trick (a named subquery) with breaking things up into batches of a certain size:

SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE {
    ?art wdt:P31 wd:Q13442814
  } LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}
where "$concept" is my search word in the title, and $batchSize and $offset take care of the batching by the script. This script creates QuickStatements. 
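To illustrate the batching, here is a minimal Python sketch that only builds the per-batch query strings; it is not the Bioclipse/Groovy script from the gist. The function name `batched_queries` and the total of 1,200,000 are assumptions made for the example; "microgravity" and Q48655 are taken from elsewhere in this thread.

```python
from string import Template

# Template mirroring the query above; the $-placeholders are filled in per batch.
QUERY_TEMPLATE = Template("""\
SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE {
    ?art wdt:P31 wd:Q13442814
  } LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}
""")


def batched_queries(concept, concept_q, total, batch_size):
    """Yield one query string per batch, with offsets 0, batch_size, 2*batch_size, ..."""
    for offset in range(0, total, batch_size):
        yield QUERY_TEMPLATE.substitute(
            batchSize=batch_size,
            offset=offset,
            conceptQ=concept_q,
            concept=concept.lower(),  # the query lowercases titles, so match in lowercase
        )


# Hypothetical total; in practice it would be an estimate of the article count.
queries = list(batched_queries("microgravity", "Q48655", total=1_200_000, batch_size=500_000))
print(len(queries))
```

Each generated string can then be sent to the query service one at a time, so that no single request has to scan every scholarly article.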

Mind you, I manually check the created statements, because in my domain (biochem) a simple search results in false positives, hence the "blacklist" in the script :)

Egon

On Sat, Dec 15, 2018 at 10:13 AM Fabrizio Carrai <fabrizio.carrai@gmail.com> wrote:
Thanks Matthias,
that's a pity. Your suggestion relies on the items already being well characterized, which, at the time of writing, is not the case for my area of interest.
Would it be an idea to download all the "scholarly articles", select locally for the keyword of interest (e.g. "microgravity"), and set the property P921 for all of them? QuickStatements may be helpful for the last step; any suggestions for other tools?
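A minimal sketch of the last two steps, assuming the articles are already available locally as (QID, title) pairs: it filters titles by keyword and emits tab-separated QuickStatements lines that add P921. The function name `quickstatements_for`, the optional `exclude` list, and the sample QIDs and titles are all invented for illustration; Q48655 is the main-subject item used in the query quoted in this thread.

```python
def quickstatements_for(articles, keyword, subject_q, exclude=()):
    """Build tab-separated QuickStatements lines adding P921 to matching articles.

    articles: iterable of (qid, title) pairs, e.g. parsed from a local download.
    exclude:  optional terms that mark a title as a false positive.
    """
    keyword = keyword.lower()
    lines = []
    for qid, title in articles:
        lowered = title.lower()
        if keyword not in lowered:
            continue  # keyword not in the title
        if any(term.lower() in lowered for term in exclude):
            continue  # title contains an excluded term
        lines.append(f"{qid}\tP921\t{subject_q}")
    return lines


# Placeholder items; these QIDs and titles are invented for the example.
sample = [
    ("Q1000001", "Plant growth under microgravity conditions"),
    ("Q1000002", "A survey of soil bacteria"),
    ("Q1000003", "Simulated Microgravity and bone loss"),
]
print("\n".join(quickstatements_for(sample, "microgravity", "Q48655")))
```

The resulting lines would still need a manual check before being pasted into QuickStatements, since a plain substring match can pick up unrelated titles.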

Thanks
Fabrizio

On Fri, Dec 14, 2018 at 10:16 PM Matthias Erfurth <erfurth@gmx.de> wrote:
Hi Fabrizio,
unfortunately you can't full-text search all the scholarly articles; you are better off working with indexed properties, so
you can query for other articles with microgravity as main subject ... With the AJAX-based Wikidata search
 
SELECT ?item
WHERE {
    ?item wdt:P31 wd:Q13442814;
          wdt:P921 wd:Q48655.
}
 
Best regards,
 
ciao matthias
 
 
Sent: Friday, 14 December 2018 at 18:55
From: "Fabrizio Carrai" <fabrizio.carrai@gmail.com>
To: "Discussion list for the Wikidata project" <wikidata@lists.wikimedia.org>
Subject: Re: [Wikidata] Query on scholarly article fails
Thanks again to Ettore, but I immediately ran into another timeout when I added a FILTER to find all the articles with the word "biokis" in the title
 
SELECT ?istanza_di ?instanza_diLabel WHERE {
  ?istanza_di wdt:P31 wd:Q13442814.
  ?istanza_di rdfs:label ?instanza_diLabel.
  FILTER((LANG(?instanza_diLabel)) = "en").
  FILTER(CONTAINS(LCASE(?instanza_diLabel), "biokis"))
}
LIMIT 100
 
At least one article should be returned: 
but I got a timeout.
 
Thanks to anybody that can help
 
Fabrizio
 
 
On Fri, Dec 14, 2018 at 10:12 AM Ettore RIZZA <ettorerizza@gmail.com> wrote:
Hello Fabrizio, 
 
It seems that the problem comes from SERVICE wikibase:label. As said in another discussion, the query executes in less than one second if you rewrite it in this way.
 
Cheers,
 
Ettore Rizza
 
On Fri, Dec 14, 2018 at 9:59 AM Fabrizio Carrai <fabrizio.carrai@gmail.com> wrote:
Hello all,
the following query ends with a timeout:
 
SELECT ?istanza_di ?istanza_diLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?istanza_di wdt:P31 wd:Q13442814.
}
LIMIT 10
 
Can anybody explain why ?
Thanks in advance
 
--
Fabrizio
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata

--
Hi, do you like citation networks? Already 51% of all citations are available for innovative new uses. Join me in asking the American Chemical Society to join the Initiative for Open Citations too. SpringerNature, the RSC and many others already did.

-----
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/u/egonwillighagen