Dear all,
I am happy to announce the very first release of Wikidata Toolkit [1], the Java library for programming with Wikidata and Wikibase. This initial release can download and parse Wikidata dump files for you, so as to process all Wikidata content in a streaming fashion. An example program is provided [2]. The library can also be used with MediaWiki dumps generated by other Wikibase installations (if you happen to work in EAGLE ;-).
Maven users can get the library directly from Maven Central (see [1]); this is the preferred method of installation. There is also an all-in-one JAR at github [3] and of course the sources [4].
Version 0.1.0 is of course alpha, but the code that we have is already well-tested and well-documented. Improvements that are planned for the next release include:
* Faster and more robust loading of Wikibase dumps
* Support for various serialization formats, such as JSON and RDF
* Initial support for Wikibase API access
Nevertheless, you can already give it a try now. In later releases, it is also planned to support more advanced processing after loading, especially for storing and querying the data.
Feedback is welcome. Developers are also invited to contribute via github.
Cheers,
Markus
[1] https://www.mediawiki.org/wiki/Wikidata_Toolkit
[2] https://github.com/Wikidata/Wikidata-Toolkit/blob/v0.1.0/wdtk-examples/src/m...
[3] https://github.com/Wikidata/Wikidata-Toolkit/releases (you'll also need to install the third party dependencies manually when using this)
[4] https://github.com/Wikidata/Wikidata-Toolkit/
On Mon, Mar 31, 2014 at 3:47 PM, Markus Krötzsch markus@semantic-mediawiki.org wrote:
[...]
Congrats, Markus! Great to see a first release. Want to post something on Wikidata:Project chat too?
Cheers Lydia
Congratulations, Markus!!
Thanks so much for developing these digital tools.
Scott
On 31/03/14 22:46, Lydia Pintscher wrote:
Congrats, Markus! Great to see a first release. Want to post something on Wikidata:Project chat too?
Yes, good point. Posted the message there too now.
Cheers
Markus
I was trying to use this, but my Java is a bit rusty. How do I run the DumpProcessingExample?
I did the following steps:
git clone https://github.com/Wikidata/Wikidata-Toolkit
cd Wikidata-Toolkit
mvn install
mvn test
Now, how do I start DumpProcessingExample?
Sorry for being a bit dense here.
Cheers, Denny
Hi Denny,
There's a "DumpProcessingExample" (https://github.com/Wikidata/Wikidata-Toolkit/blob/v0.1.0/wdtk-examples/src/main/java/org/wikidata/wdtk/examples/DumpProcessingExample.java) here: https://www.mediawiki.org/wiki/Wikidata_Toolkit
Cheers, Scott
On 08.04.2014 at 23:34, Denny Vrandečić wrote:
Now, how do I start DumpProcessingExample?
Looks like you are supposed to run it from Eclipse.
It would be very useful if maven would generate a jar with all dependencies for the examples, or if there was a shell script that would allow us to run classes without the need to specify the full class path.
Finding out how to get all the libs you need into the classpath is one of the major annoyances of java...
-- daniel
Hoi,
What is the relevance of these tools when you have to have specialised environments to use them?
Thanks,
GerardM
Hi Gerard.
On 09/04/14 10:54, Gerard Meijssen wrote:
Hoi,
What is the relevance of these tools when you have to have specialised environments to use them?
Not sure what you mean. Wikidata Toolkit doesn't have any requirements other than plain old Java to run.
Nevertheless, we'd also like to support people who are using some of the common Java development tools that are around, especially the free ones. Currently, we only have instructions for Eclipse users, but we could extend this. Which tools do you normally use to develop Java?
Cheers
Markus
Hi Denny, hi Daniel, hi all,
Welcome to Java :-) (more useful answers below)
On 09/04/14 10:41, Daniel Kinzler wrote:
It would be very useful if maven would generate a jar with all dependencies for the examples, or if there was a shell script that would allow us to run classes without the need to specify the full class path.
Finding out how to get all the libs you need into the classpath is one of the major annoyances of java...
First, the quick answer to Denny:
"""
Change to the directory of the example module (wdtk-examples), then run:
mvn exec:java -Dexec.mainClass="org.wikidata.wdtk.examples.DumpProcessingExample"
"""
Anyway, this is not something one would normally do. Some clarifications seem to be needed here.
The examples we have are not applications that you would ever want to run. They don't do anything really useful (yet). The purpose of the examples is to give Java developers a practical, well, example of how to use the Wikidata Toolkit library. Since Wikidata Toolkit is a library for Java, it is somehow presumed that you already know how to run a Java application when you look into it ;-) Every Java developer has some preferred way of doing this, usually through her IDE.
If you are new to Java, I would recommend that you start with a little tutorial (there are plenty on the Web) and use Eclipse (there are other good IDEs for Java, but Eclipse is a safe bet if you don't know your needs yet). We have detailed instructions on the homepage on how to install git and maven support for Eclipse so that it can do everything for you [1]. An Eclipse user could run the examples, for example, by: right click on file -> Run as -> Java application.
The "major annoyance" that Daniel remarked on is simply that there is no single main application here. There can (and will) be many examples, so naturally you have to tell Java which program to run. Doing this on the command line is not convenient due to the long strings, but then again that is not something you would normally do (and if you do, you could create a one-line script with the command I gave above ;-).
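A minimal sketch of such a script, assuming it is saved in the root of the Wikidata-Toolkit checkout. The file name run-example.sh and the helper names main_class and run_example are made up for illustration; only the mvn exec:java command itself comes from this thread.

```shell
#!/bin/sh
# run-example.sh (hypothetical name): wrapper around the mvn command above.
# Assumes it is started from the root of the Wikidata-Toolkit checkout.

# Print the fully qualified class name of an example;
# defaults to DumpProcessingExample when no name is given.
main_class() {
    printf '%s\n' "org.wikidata.wdtk.examples.${1:-DumpProcessingExample}"
}

# Change to the example module and let Maven assemble the classpath.
run_example() {
    (cd wdtk-examples && mvn exec:java "-Dexec.mainClass=$(main_class "$1")")
}

# Run only when an example name is passed, e.g.:
#   ./run-example.sh DumpProcessingExample
if [ "$#" -gt 0 ]; then
    run_example "$1"
fi
```

Called without an argument the script does nothing, so it can also be sourced to reuse the two functions. Passing the class name as an argument means the same script should keep working as more examples are added to the module.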
In the future, we might have actual applications that are useful as stand-alone tools. When this happens, we will of course provide suitable stand-alone packages where you only need to run a single "main" file to start the tool. This is also what people would do who use the library to develop their own stand-alone applications. But if you want to work with a multi-module library, you probably don't want an all-in-one package that uses third party dependencies that you don't want or need.
Anyway, all hints on packaging are really appreciated -- our goal is to provide packages that are most useful to you. We just started with the obvious standard packages but we could create more specific bundles for relevant applications. As we move along in alpha phase, these things might still change quite a bit, of course ... ;-)
Cheers
Markus
[1] https://www.mediawiki.org/wiki/Wikidata_Toolkit/Eclipse_setup
On 09/04/14 13:18, Markus Krötzsch wrote:
Hi Denny, hi Daniel, hi all,
Welcome to Java :-) (more useful answers below)
Following popular demand ;-), I have now created a new documentation section "Beginner's guide" [1] that takes you step-by-step through setting up your very first Maven project in Eclipse and configuring it to use Wikidata Toolkit.
A relevant remark that had been missing on the documentation pages is that you really must use Java 1.7 or above. Probably a no-brainer for most users today, but annoying if you happen to run on older Java versions (which won't make sense of the code without being able to tell you the real reason).
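To get a clear message up front instead of confusing compile errors, a build or start script could check the Java version first. The following is only a sketch: the function name java_version_ok is made up, and it assumes version strings of the usual forms "1.7.0_55" (older numbering) or "11.0.2" (newer numbering).

```shell
#!/bin/sh
# Succeed if a Java version string (e.g. "1.7.0_55" or "11.0.2")
# denotes Java 1.7 or newer. The function name is an illustrative assumption.
java_version_ok() {
    major=${1%%.*}            # text before the first dot
    major=${major%%[!0-9]*}   # drop suffixes such as "-ea"
    rest=${1#*.}              # text after the first dot
    minor=${rest%%.*}
    minor=${minor%%[!0-9]*}
    # Newer numbering (9, 11, 17, ...) has a major number above 1;
    # older numbering is 1.x, where x must be at least 7.
    [ "${major:-0}" -gt 1 ] || { [ "${major:-0}" -eq 1 ] && [ "${minor:-0}" -ge 7 ]; }
}
```

The version string itself can usually be taken from the first line of `java -version 2>&1`, which on common JDKs contains the word version followed by the number in double quotes; a script would extract that quoted part and pass it to the function, printing a hint such as "Java 1.7 or above is required" when the check fails.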
Cheers,
Markus
[1] https://www.mediawiki.org/wiki/Wikidata_Toolkit#Beginner.27s_guide
Hi Markus,
On Wed Apr 09 2014 at 4:18:50 AM, Markus Krötzsch < markus@semantic-mediawiki.org> wrote:
Change to the directory of the example module (wdtk-examples), then run:
mvn exec:java -Dexec.mainClass="org.wikidata.wdtk.examples.DumpProcessingExample"
Thanks, that is exactly what I needed! :)
I understand that WDTK is a library to be used in your own applications, but I am often not patient enough to go and code up a whole app myself in a new dev environment before I see that the thing is running. So being able to start and run the example application is super useful for my motivation, because now I can go ahead and tinker with it while it is running, and iteratively change it to what I want.
Thanks again for the prompt and useful answer! It works like a charm now!
Cheers, Denny