I just ran Max's one-liner over one of the dump files, and it worked
smoothly. Not sure where the best place would be to store such things,
so I simply put it in my sandbox for now:
.
d.
On Tue, Aug 7, 2018 at 6:06 PM David Cuenca Tudela <dacuetu(a)gmail.com> wrote:
If someone could post the 10 (or 50!) most popular items, I would really
appreciate it :-)
Cheers,
Micru
On Tue, Aug 7, 2018 at 5:59 PM Maximilian Marx <maximilian.marx(a)tu-dresden.de>
wrote:
Hi,
On Tue, 7 Aug 2018 17:37:34 +0200, Markus Kroetzsch
<markus.kroetzsch(a)tu-dresden.de> said:
If you want a sorted list of "most popular" items, this is a bit more
work and would require at least some Python script, or some less
obvious combination of sed (extracting all URLs of entities) and
sort.
zgrep -Eoe '%3Chttp%3A%2F%2Fwww\.wikidata\.org%2Fentity%2FQ[1-9][0-9]*%3E' \
  dump.gz | cut -d 'Q' -f 2 | cut -d '%' -f 1 | sort | uniq -c | sort -nr
should do the trick.
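For just the top 10 (or 50), the same pipeline can be capped with head. A minimal sketch, assuming a POSIX shell: the function name top_items and its two arguments are hypothetical, and gzip -dc piped into grep stands in for zgrep.

```shell
# Hypothetical helper: print the N most frequently mentioned item numbers.
# $1 = gzip-compressed dump file, $2 = how many items to show (default 10).
top_items() {
    # Decompress and extract every URL-encoded entity URI; [0-9]* also
    # matches single-digit IDs.
    gzip -dc "$1" \
        | grep -Eo '%3Chttp%3A%2F%2Fwww\.wikidata\.org%2Fentity%2FQ[1-9][0-9]*%3E' \
        | cut -d 'Q' -f 2 | cut -d '%' -f 1 \
        | sort | uniq -c | sort -nr \
        | head -n "${2:-10}"
}
```

For example, `top_items dump.gz 50` would print the 50 most frequent item numbers, each preceded by its count.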
Best,
Maximilian
--
Dipl.-Math. Maximilian Marx
Knowledge-Based Systems Group
Faculty of Computer Science
TU Dresden
+49 351 463 43510
https://kbs.inf.tu-dresden.de/max
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
--
Etiamsi omnes, ego non