On 12/22/16 3:37 AM, Ruben Verborgh wrote:
Hi Kingsley,
will see a substantial increase in server costs when they try to host that same data as a public SPARQL HTTP service.
Again subjective.
No, that's not subjective, that's perfectly measurable. And that's exactly what we did in our research.
That doesn't negate the fact that your world view is subjective. You've started this whole thing on a fuzzy premise. For instance, why do you think SPARQL exists, and how have you arrived at the conclusion that it is some kind of Semantic Web frontier?
SPARQL Query Services are just one of many data definition and manipulation services available to HTTP network users (public or private) working with RDF relations.
In some cases, service providers use SPARQL to facilitate and/or complement Linked Open Data publishing efforts.
The problem with the SPARQL protocol as an API is that the per-request cost is a) higher and b) much more variable than any other API.
A Protocol isn't the same thing as an Application Programming Interface (API), in my world view. APIs provide interaction abstraction over protocols.
ODBC and JDBC are APIs for building applications against RDBMS engines that interact with relations represented as tables, using SQL (and in the case of Virtuoso, SQL, SPARQL, and the SPASQL hybrid). Those APIs include abstractions over TCP/IP and other protocols. Jena, Sesame, Redland, and others provide APIs that offer similar functionality to the aforementioned, with regard to RDF triple and quad stores.
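For instance, here is a minimal sketch using Apache Jena's ARQ API (the DBpedia document URL is just an illustrative assumption); the application code talks to the API, and the API hides whether the data lives in an in-memory model, a persistent triple/quad store, or behind a remote service:

    import org.apache.jena.query.*;
    import org.apache.jena.rdf.model.*;

    public class JenaApiSketch {
        public static void main(String[] args) {
            // Load some RDF into an in-memory model; the same query code
            // would work unchanged against a persistent triple/quad store.
            Model model = ModelFactory.createDefaultModel();
            model.read("https://dbpedia.org/data/Berlin.ttl", "TTL");

            String sparql =
                "SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Berlin> ?p ?o } LIMIT 5";
            try (QueryExecution qe = QueryExecutionFactory.create(sparql, model)) {
                ResultSet results = qe.execSelect();
                while (results.hasNext()) {
                    QuerySolution row = results.next();
                    System.out.println(row.get("p") + "  " + row.get("o"));
                }
            }
        }
    }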
The SPARQL Protocol extends HTTP with an ability to include SPARQL queries and solutions as part of its request and response payloads.
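Concretely, a SPARQL Protocol request is just an HTTP request carrying the query text, and the response carries the solution sequence in a negotiated results format. A rough sketch with Java's built-in HttpClient, assuming the public DBpedia endpoint URL purely as an example:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SparqlProtocolSketch {
        public static void main(String[] args) throws Exception {
            String query =
                "SELECT ?type WHERE { <http://dbpedia.org/resource/Berlin> a ?type } LIMIT 5";

            // SPARQL 1.1 Protocol: the query travels in the HTTP request body
            // (Content-Type: application/sparql-query) and the solutions come
            // back in the response body (here, SPARQL Results JSON).
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://dbpedia.org/sparql"))
                    .header("Content-Type", "application/sparql-query")
                    .header("Accept", "application/sparql-results+json")
                    .POST(HttpRequest.BodyPublishers.ofString(query))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }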
IMHO, your position is based on a claim that isn't being made by SPARQL-compliant product providers. I continue to sense some confusion about how SPARQL was used and spoken about in the early days of the LOD community, e.g., claims that there's no Linked Data without a SPARQL endpoint, or without the use of SPARQL, etc.
The SPARQL Query Language, Protocol, and Results Serialization Formats are simply tools, like many others, that can be used to solve a variety of problems. Nobody ever claimed (as far as I know) that the SPARQL composite is (or was) a "silver bullet".
Everywhere else on the Web, APIs shield data consumers from the backend, limiting the per-request complexity. That's why they thrive and SPARQL endpoints don't.
See my comment above. Your characterization is inaccurate.
Don't get me wrong, I'm happy with every highly available SPARQL endpoint out there. Wikidata and DBpedia are awesome. It's just that there are too few and I see cost as a major factor there.
It's hard to understand the statement above. Fundamentally, Wikidata and DBpedia have addressed specific challenges, and the inability of others to emulate them (in your world view) has little to do with SPARQL and everything to do with motivation, engineering capability, and general experience with RDBMS technology.
You are implying that cost vs. benefit analyses don't drive decisions to put services on the Web; of course they do.
Quite the contrary, I am arguing that — and this is subjective — because cost/benefit analyses drive decisions on the Web, we will never have substantially more SPARQL endpoints on the public Web than we have now. They're just too expensive.
As with the statement you made prior to this one, I am struggling to understand your point. You can't simply throw "too expensive" at something and decide that's definitive for everyone. That simply isn't the route to a coherent pitch.
You are taking the world view of a niche and declaring it universal. What entity profile (in this case: Person or Organization) would find this endeavor expensive? A student, an academic institution, a commercial company, a government?
Federation is where I think public SPARQL endpoints will fail, so it will be worthwhile to see what happens.
Really? Then you will ultimately be surprised on that front too!
I really really hope so. If one day, machines can execute queries on the Web as well as we can, I'd be really happy.
I still don't really understand what you mean by "as well as we can". All I've seen thus far is a pitch about availability that is justifiably slow, combined with an inability to deal with complex queries. I also notice that you don't say much about:
1. change sensitivity, and
2. actual data loading and deployment time, in a rapidly changing world increasingly driven by data.
My way to reach that is lightweight interfaces, but if it is possible with heavyweight interfaces, all the better.
Again, heavyweight and lightweight are totally subjective characterizations :)
Best,
Ruben