Also, the problem most SPARQL backend developers worried about was not Wikidata's size, but its dynamicity: not the number of triples, but the frequency of edits. And we did talk to many of those people.


On Thu, Feb 19, 2015, 07:05 Markus Krötzsch <markus@semantic-mediawiki.org> wrote:
Hi Paul,

Re RDF*/SPARQL*: could you send a link? Someone has really made an
effort to find the least googleable terminology here ;-)

Re relying on standards: I think this argument is missing the point. If
you look at what developers in Wikidata are concerned with, it is 90%+
interface and internal data workflow. This would be exactly the same no
matter which data standard you would use. All the challenges of
providing a usable UI and a stable API would remain the same, since a
data encoding standard does not help with any of this. If you have
followed some of the recent discussion on the DBpedia mailing list about
the UIs they have there, you can see that Wikidata is already in a very
good position in comparison when it comes to exposing data to humans
(thanks to Magnus, of course ;-). RDF is great but there are many
problems that it does not even try to solve (rightly so). These problems
seem to be dominant in the Wikidata world right now.

This said, we are in a great position to adopt new standards as they
come along. I agree with you on the obvious relationships between
Wikidata statements and the property graph model. We are well aware of
this. Graph databases are being considered for providing query solutions
to Wikidata, and we are considering setting up a SPARQL endpoint for our
existing RDF as well. Overall, I don't see a reason why we should not
embrace all of these technologies as they suit our purpose, even if they
were not available yet when Wikidata was first conceived.

Re "It is also exciting that vendors are getting on board with this and
we are going to be seeing some stuff that is crazy scalable (way past 10^12
facts on commodity hardware) very soon." [which vendors?] [citation
needed] ;-) We would be very interested in learning about such
technologies. After the recent end of Titan, the discussion of query
answering backends is still ongoing.

Cheers,

Markus


On 18.02.2015 21:25, Paul Houle wrote:
> What bugs me about it is that Wikidata has gone down the same road as
> Freebase and Neo4J in the sense of developing something ad-hoc that is
> not well understood.
>
> I understand the motivations that led there, because there are
> requirements to meet that standards don't necessarily satisfy, plus
> Wikidata really is doing ambitious things in the sense of capturing
> provenance information.
>
> Perhaps it has come a little too late to help with Wikidata but it seems
> to me that RDF* and SPARQL* have a lot to offer for "data wikis" in that
> you can view data as plain ordinary RDF and query with SPARQL but you
> can also attach provenance and other metadata in a sane way with sweet
> syntax for writing it in Turtle or querying it in other ways.
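For concreteness, a minimal sketch of the Turtle-style RDF* syntax being described here. The entity and property identifiers are illustrative only, not actual Wikidata vocabulary:

```turtle
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix ex: <http://example.org/> .

# An ordinary RDF triple, queryable with plain SPARQL:
wd:Q42 ex:educatedAt ex:StJohnsCollege .

# With RDF*, the triple itself can be the subject of further
# triples, e.g. to attach provenance or reference metadata:
<< wd:Q42 ex:educatedAt ex:StJohnsCollege >> ex:statedIn ex:SomeSource .
```

The point Paul makes above is that the data remains readable as plain RDF while the annotation rides along in the embedded-triple syntax.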
>
> Another way of thinking about it is that RDF* is formalizing the
> property graph model which has always been ad hoc in products like
> Neo4J. I can say that knowing the algebra you are implementing
> helps a lot in getting the tools to work right. So you not only have
> SPARQL queries as a possibility, but also languages like Gremlin and
> Cypher, and this is all pretty exciting. It is also exciting that
> vendors are getting on board with this and we are going to be seeing
> some stuff that is crazy scalable (way past 10^12 facts on commodity
> hardware) very soon.
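On the query side, SPARQL* lets such annotations be matched directly. A hedged sketch, using the same illustrative vocabulary as above:

```sparql
PREFIX ex: <http://example.org/>

# Retrieve statements together with the source each one is stated in,
# by matching the embedded triple itself as a query pattern
SELECT ?person ?school ?source WHERE {
  << ?person ex:educatedAt ?school >> ex:statedIn ?source .
}
```

This is the "querying it in other ways" mentioned earlier: the provenance triple is addressed with the same pattern syntax as ordinary triples.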
>
>
>
>
> On Tue, Feb 17, 2015 at 12:20 PM, Jeroen De Dauw <jeroendedauw@gmail.com> wrote:
>
>     Hey,
>
>     As Lydia mentioned, we obviously do not actively discourage outside
>     contributions, and will gladly listen to suggestions on how we can
>     do better. Beyond that, we are actively taking steps to make it
>     easier for developers not already part of the community to start
>     contributing.
>
>     For instance, we created a website about our software itself [0],
>     which lists the MediaWiki extensions and the different libraries [1]
>     we created. For most of our libraries, you can just clone the code
>     and run composer install. And then you're all set. You can make
>     changes, run the tests, and submit them back. This is perhaps a
>     different workflow than what you as a MediaWiki developer are used
>     to, though quite a bit simpler. Furthermore, we've been quite
>     progressive in adopting
>     practices and tools from the wider PHP community.
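The workflow Jeroen describes amounts to something like the following. The repository name is just one example of a Wikibase library, and the test runner is assumed to be PHPUnit installed as a Composer dev dependency:

```shell
# Clone one of the Wikibase libraries (example repository name)
git clone https://github.com/wmde/Diff.git
cd Diff

# Install dependencies and set up autoloading
composer install

# Run the test suite before and after making your changes
vendor/bin/phpunit
```

No MediaWiki installation is required for the standalone libraries, which is what makes this simpler than the usual extension workflow.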
>
>     I definitely do not disagree with you that some things could, and
>     should, be improved. Like you I'd like to see the Wikibase git
>     repository and naming of the extensions be aligned more, since it
>     indeed is confusing. Increased API stability, especially the
>     JavaScript one, is something else on my wish-list, amongst a lot of
>     other things. There are always reasons why things are the way
>     they are now and why they have not improved yet. So I suggest
>     looking at specific pain points and seeing how things can be
>     improved there.
>     This will get us much further than looking at the general state,
>     concluding people do not want third party contributions, and then
>     protesting against that.
>
>     [0] http://wikiba.se/
>     [1] http://wikiba.se/components/
>
>     Cheers
>
>     --
>     Jeroen De Dauw - http://www.bn2vs.com
>     Software craftsmanship advocate
>     Evil software architect at Wikimedia Germany
>     ~=[,,_,,]:3
>
>     _______________________________________________
>     Wikidata-l mailing list
>     Wikidata-l@lists.wikimedia.org
>     https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
>
>
>
> --
> Paul Houle
> Expert on Freebase, DBpedia, Hadoop and RDF
> (607) 539 6254    paul.houle on Skype    ontology2@gmail.com
> http://legalentityidentifier.info/lei/lookup
>
>
>

