Hi All!
I'm trying to figure out possible ways to launch the MediaWiki + Wikibase
software to allow collaborative creation of wiki pages and a
corresponding knowledge graph.
As far as I understand, it is possible to configure a single installation
of MediaWiki with the Wikibase extension, and have all wiki pages in the
Main namespace, like https://example.org/wiki/ , and all graph items in the
namespace https://example.org/wiki/Item:
I want to build something more similar to the Wikipedia-Wikidata pair --
wiki pages under https://wiki.example.org/wiki/ and the Wikibase graph
under https://graph.example.org/wiki/ . Am I right that I have to
launch two instances of MediaWiki for that, one without the Wikibase
extension and one with it?
Or is there a simpler way to configure the system to get such namespace
structure?
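To make my question concrete, this is roughly the two-instance setup I have
in mind. It is only a sketch: the paths and domains are examples, and the
WikibaseClient setting names (repoUrl, repoArticlePath, repoScriptPath) are
taken from my reading of the extension docs, so please correct me if I got
them wrong:

# On graph.example.org: a MediaWiki install acting as the Wikibase repository.
cat >> /var/www/graph/LocalSettings.php <<'PHP'
wfLoadExtension( 'WikibaseRepository', "$IP/extensions/Wikibase/extension-repo.json" );
PHP

# On wiki.example.org: a plain MediaWiki install that only uses the graph,
# via the Wikibase client extension pointed at the repository.
cat >> /var/www/wiki/LocalSettings.php <<'PHP'
wfLoadExtension( 'WikibaseClient', "$IP/extensions/Wikibase/extension-client.json" );
$wgWBClientSettings['repoUrl'] = 'https://graph.example.org';
$wgWBClientSettings['repoArticlePath'] = '/wiki/$1';
$wgWBClientSettings['repoScriptPath'] = '/w';
PHP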
Thank you for your help!
Victor Agroskin
Dear all,
I'm loading the whole Wikidata dataset into Blazegraph on a high-performance
computer. I gave 120 GB of RAM and 3 processing cores to the job.
After almost 24 hours of loading, the "wikidata.jnl" file is only 28 GB in
size. Initially the process was fast, but the loading speed has decreased as
the file has grown. I notice that only 14 GB of RAM is being used. I have
already implemented the recommendations given in
https://github.com/blazegraph/database/wiki/IOOptimization . Do you have any
other recommendations to increase the loading speed?
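For reference, beyond the page above I have been experimenting with the
write-path properties below. The property names come from the Blazegraph
wiki; the values are just what I tried, not recommendations:

# Appended to RWStore.properties before starting the load.
cat >> RWStore.properties <<'EOF'
# Keep more hot B+Tree nodes in memory during the bulk load:
com.bigdata.btree.writeRetentionQueue.capacity=4000
# Use more write cache buffers to smooth random writes to the journal:
com.bigdata.journal.AbstractJournal.writeCacheBufferCount=1000
EOF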
Leandro
Hi,
I have been studying Wikidata for the last few weeks in order to perform an
analysis. My main interest is the history and the discussion data of every
entity. I was wondering whether this information is included in the JSON
files here, https://dumps.wikimedia.org/wikidatawiki/entities/ , or whether
there are separate files for it. If so, where can I download them?
Thanks in advance.
Best wishes,
Elisavet
Hi,
I put a "wikidata.jnl" file of almost 60 GB size in the Blazegraph root
directory. When I run a query like "select ?s ?p ?o where {?s ?p ?o} limit
10" through the Blazegraph's query tab I get no results at all. Do I need
to do something for Blazegraph to recognize the database file?
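From the Blazegraph docs my understanding is that the server opens whatever
journal RWStore.properties names (the default file name is blazegraph.jnl,
not wikidata.jnl), and that queries only see the namespace the data was
loaded into (e.g. wdq rather than the default kb). So I suspect I need
something like the sketch below; the property and system-property names are
my reading of the docs, so please correct me:

# Tell Blazegraph which journal file to open:
cat >> RWStore.properties <<'EOF'
com.bigdata.journal.AbstractJournal.file=wikidata.jnl
EOF
# Start the server against that property file, then pick the right
# namespace (e.g. wdq) in the workbench before querying:
java -server -Xmx4g -Dbigdata.propertyFile=RWStore.properties -jar blazegraph.jar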
Leandro
Hi,
I have downloaded the precompiled Blazegraph distribution from [1]. I also
made the optimizations indicated at [2].
For the loading process I'm following the instructions given in the
"getting-started.md" file that comes in the "docs" folder of the compiled
distribution [1]. That means:
1- Munge the data with:
   ./munge.sh -f data/wikidata-20150427-all-BETA.ttl.gz -d data/split -l en -s
2- Start the loading process with:
   ./loadRestAPI.sh -n wdq -d `pwd`/data/split
The loading process then starts at a rate of 84352. However, the rate has
progressively gone down to 3362 after 36 hours of loading.
I'm running the process on an HPC with SSDs, and I'm giving the loading
process 3 cores and 120 GB of RAM. On the other hand, I notice that the
average processor usage doesn't go above 1.6 cores and the maximum RAM
usage is 14 GB.
I also saw [3], and I'm running the load natively (without containers). The
one difference from [3] is that I've reduced the JVM heap to 4 GB, as [2]
suggested.
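One thing I am considering, given that RAM is barely used, is raising the
heap back up just for the bulk load. My copy of runBlazegraph.sh seems to
read a HEAP_SIZE variable, though that may be version-specific:

# Restart Blazegraph with a larger heap, then resume the load.
# HEAP_SIZE is read by runBlazegraph.sh in my copy of the
# wikidata-query-rdf distribution (verify in yours).
HEAP_SIZE=16g ./runBlazegraph.sh &
./loadRestAPI.sh -n wdq -d "$(pwd)/data/split"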
Besides that, what else could I do to improve the loading performance?
Thanks,
Leandro
[1]
http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.wikidata.query.rdf%2…
[2] https://github.com/blazegraph/database/wiki/IOOptimization
[3]
https://addshore.com/2019/10/your-own-wikidata-query-service-with-no-limits…
Dear all,
I am a researcher at Hasselt University doing research on Query Reverse
Engineering in the context of the Semantic Web [1]. I think that the
Wikidata dataset could be the ideal one on which to test the algorithms I
have developed. However, due to the limitations of the public SPARQL
endpoint [2], I cannot do this online, so I am setting up a standalone
instance. I realize that with my current computing power it is not possible
to load the dataset into my local Blazegraph instance. For these reasons, I
kindly request your assistance so that I can download a Blazegraph instance
with the dataset already loaded in it.
Kind regards,
Leandro Tabares Martín
[1]
https://www.uhasselt.be/UH/Research-groups/en-projecten_DOC/en-project_deta…
[2] https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual
*Apologies for cross-posting*
Hello all,
The Wikidata development team is currently doing some research to better
understand how people access and reuse Wikidata’s data from the code of
their applications and tools (for example through APIs), and how we can
improve the tools to make your workflows easier.
We are running a short survey to gather more information from people who
build tools based on Wikidata’s data. If you would like to participate,
please use this link
<https://docs.google.com/forms/d/e/1FAIpQLSfJ-I_Ib2EOuRVG4XfeUazhXTvgKsjcKhA…>
(Google Forms, estimated fill-in time: 5 minutes). If you don’t want to use
Google Forms, you can also send me a private email with your answers. We
would love to get as many answers as possible before June 9th.
The data will be collected anonymously and will only be shared in
aggregated form.
If you have any questions, feel free to reach out to me directly.
Cheers,
--
Mohammed Sadat Abdulai
*Community Communications Manager for Wikidata/Wikibase*
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Thank you!
The KG data built in my project will ultimately be used by people more
accustomed to Semantic Web-style IRIs. They will come from established SW
or OWL communities, sometimes with their own standards for IRIs. And they'd
like to have them dereferenceable! They can map their ontologies or add
other IDs as needed; I just want to make their lives a bit easier and avoid
some unnecessary discussion.
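Concretely, I expect we will end up pointing the repository's concept base
URI at our own domain, so the entity IRIs are stable and dereference there.
This is only a sketch: conceptBaseUri is the setting name I found in the
Wikibase repo documentation, and the domain is just an example:

# Make entity IRIs live under our own domain (appended to the repo's
# LocalSettings.php):
cat >> LocalSettings.php <<'PHP'
$wgWBRepoSettings['conceptBaseUri'] = 'https://graph.example.org/entity/';
PHP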