I'm interested in using Wikibase as a database for elected and appointed
officials at a very local level - down to a level where they don't even
meet Wikidata's relaxed notability guidelines (Wikidata isn't interested in
who the members of a small town's zoning commission were 10 years ago). I
don't have anything online yet; I'm just playing with the Docker container
versions on my laptop to make sure I know what I'm doing (many thanks to
the folks who put those containers together - being able to type two
commands and get a working Wikibase instance is amazing).
Beyond Wikibase being solid knowledge-base management software, I am
especially interested in connecting to Wikidata for two reasons:
1. Wikidata has a pretty well-established ontology for my domain - the
properties and constraints that describe elected offices, lengths of terms,
details about politicians, membership of people in those offices, etc.: all
of that likely applies directly to my small-scale database as well, and if
there's something else I need, it's likely Wikidata will need it too, and I
can get a community of linked-data modelers to figure it out with me.
2. There is overlap between my dataset and what Wikidata is tracking - the
members of the town's zoning commission might not be interesting to
Wikidata, but Wikidata almost certainly has an item for the town. And
there's some overlap between the people: long before she was a United
States Senator, Tammy Baldwin was a member of the local county board in the
late 1980s, so I'd like to be able to link into that. Finally, the
constraints on which domains property values are allowed to come from are
an important dataset and are useful in my database as well.
For me, the most important thing is rock-solid backup and restore, with
detailed, no-question-too-dumb documentation. I'm terrified of putting
together a database, having it blow up, and having to reconstruct it. What
especially makes me nervous is that Q and P ids are assigned by Wikibase
but are also used externally - so if I screw up so badly that I have to
completely re-import all of my data, and I'm not careful, the Qid for an
officeholder might change when I re-load it, and anyone else with a query
using that Qid will be out of luck.
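Something like this little sketch is what I have in mind for sanity-checking a re-import - the entity shape loosely follows Wikibase's standard JSON (id plus labels), but the helper names and the idea of keying on the English label are entirely mine:

```python
# Hypothetical sketch: compare two Wikibase entity dumps (lists of entities
# with "id" and "labels", as in Wikibase's JSON output) and report items
# whose Q-id changed after a re-import, keyed on the English label.

def qid_by_label(entities):
    """Map each entity's English label to its id (e.g. 'Q42')."""
    return {
        e["labels"]["en"]["value"]: e["id"]
        for e in entities
        if "en" in e.get("labels", {})
    }

def changed_qids(before, after):
    """Return {label: (old_qid, new_qid)} for labels whose id moved."""
    old, new = qid_by_label(before), qid_by_label(after)
    return {
        label: (old[label], new[label])
        for label in old.keys() & new.keys()
        if old[label] != new[label]
    }
```

If the re-import shifted an officeholder's item, this would surface it before anyone's external queries break.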
It'd be especially nice to have an example backup of a very small site
posted on the web somewhere - a set of example "fixtures" with a handful of
items and properties that could optionally be used in conjunction with the
Docker containers to verify that you've got everything up and running
end-to-end, with sample queries and expected output. Given how easy Docker
makes it to blow everything away and start over, it'd be very nice to be
able to bring up a site, modify the data to experiment, and, if I feel I've
gotten myself into trouble, delete it and start over.
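To make the idea concrete, here's roughly what I imagine a tiny fixtures file and loader looking like - the fixture format and names here are invented for illustration, though the API action it builds parameters for (action=wbeditentity with new=item) is Wikibase's real entity-editing endpoint:

```python
import json

# Hypothetical sketch: a tiny fixtures list, plus a helper that turns each
# fixture into the POST parameters for Wikibase's wbeditentity API action.
# (The fixture shape is invented; an actual loader would also POST these
# to the wiki's api.php with an edit token.)

FIXTURES = [
    {"label": "Zoning Commission", "description": "town land-use board"},
    {"label": "Jane Doe", "description": "commission member, 2008-2012"},
]

def editentity_params(fixture):
    """Build the POST parameters for creating one item via wbeditentity."""
    data = {
        "labels": {"en": {"language": "en", "value": fixture["label"]}},
        "descriptions": {
            "en": {"language": "en", "value": fixture["description"]}
        },
    }
    return {
        "action": "wbeditentity",
        "new": "item",
        "format": "json",
        "data": json.dumps(data),
    }
```

A fixtures file like this, shipped alongside the containers, would double as the "expected output" half of an end-to-end smoke test.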
I would echo Laura's interest in optimizing server resources - for funding,
I'm just going to eat the costs with a couple of VMs in the cloud (I'm
counting on being able to do it for about $100/month, but I don't know if
that's realistic), so the smaller the footprint the better (while still
maintaining some HA/disaster-recovery capability, or at least the ability
to restore quickly if a VM crashes hard - I think I'm OK if my site goes
down for a while until something reboots, but I don't want to lose data).
Other things I'm interested in are more federation support and examples, so
I can more easily reuse properties and items from Wikidata. For performance
reasons, I think I'd want to import most of them into my instance directly,
rather than use literal federation, where queries on my site make network
calls back to wikidata.org - instead, I'd like the Wikidata data imported
into my instance, in its own namespace to keep it separate, with a way to
keep it up to date. I'd also like to import only a subset of Wikidata - I
want all the properties and constraints around P39 (position held), and I'm
going to use them frequently, but I'd rather not import 20 gigs of data
about genomes or fungus taxa. I'm not quite sure how to do that - I don't
think Wikidata separates neatly into a "core Wikidata" and "everything
else", so I'd guess I just keep recursively walking the graph and pulling
in the things I need.
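The recursive walk I'm picturing is something like this sketch - the entity shape is heavily simplified from Wikibase's real JSON (which nests referenced ids under mainsnak/datavalue), and the function names are mine:

```python
# Hypothetical sketch of the recursive walk: starting from a seed (say P39),
# follow every entity referenced in claims and collect the closure of ids to
# import. Entities here are simplified to {"claims": {pid: [values]}}.

def referenced_ids(entity):
    """Yield the P/Q ids referenced by an entity's (simplified) claims."""
    for prop, values in entity.get("claims", {}).items():
        yield prop
        for value in values:
            if isinstance(value, str) and value[:1] in ("P", "Q"):
                yield value

def closure(seed_ids, fetch):
    """Walk the graph from seed_ids; fetch(id) returns the entity JSON."""
    seen, todo = set(), list(seed_ids)
    while todo:
        eid = todo.pop()
        if eid in seen:
            continue
        seen.add(eid)
        todo.extend(referenced_ids(fetch(eid)))
    return seen
```

In practice I'd expect to cap the depth or whitelist properties, or the closure from a well-connected seed would pull in half of Wikidata anyway.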
-Erik
On Mon, Nov 19, 2018 at 5:10 AM Laura Hale <laura(a)fanhistory.com> wrote:
Hi,
Wikibase is amazing software, and if some of the tools being developed for
Wikidata can be repurposed for Wikibase, its potential grows even further.
(In the same way that extensions and ease of upgrading made MediaWiki a
much more desirable CMS than it might otherwise have been.)
I was hoping people could share what their projects are - a little bit
about each project and its purpose. In this context, I was hoping people
might be willing to share what their priorities are as they relate to
Wikibase and meeting project goals.
In case you don't know, Miguel and I are involved with ParaSports Data. ParaSports
Data is a disability and disability sports knowledge base that can serve as
a powerful resource for academics, NGOs and other stakeholders in the area
of disability rights and disability sports. ParaSports Data was conceived
in 2016 as a resource for structured data about disability and disability
sport, including facts like dates, performance results, disability
population sizes, event information, and classification-related data. Using
Wikipedia and Wikidata as a model, it seeks to be the largest single
knowledge base about disability and disability sport, allowing stakeholders
to search, analyze, and re-use this data to draw awareness to disability,
disability sports, and human rights as an extension of those two
things. The purpose of this project is to create a data set of Paralympic,
Deaflympic, Special Olympics, other disability sports, and general
disability information for use by researchers.
As it relates to Wikibase, our immediate needs and concerns around these
needs probably fall into the following four areas:
1. Upgrading MediaWiki, Wikibase, and the query engine, and installing
Extension:OAuth and QuickStatements. This is not explained anywhere; the
best process as we understand it is outlined at
https://www.researchgate.net/publication/329028278_Wikibase_Upgrade_Workflow
. This makes it scary to do on our own, as there is no established process,
little documentation, and it is hard to locate others who have successfully
done this.
2. Better optimizing the Wikibase software and query engine so they use
fewer server resources. (This partly ties into point three.)
3. Identifying funding sources to pay for our installation as I currently
pay out of pocket for all hosting, and this is not a long term solution.
4. Improving bulk data import on Wikibase to reduce errors and the need to
merge items after the fact. This is because either the Wikibase software or
the way we bulk import matches only on exact item names. If, while bulk
creating items, an item's description differs from the existing one, a new
item is created. Statements are then added to the lowest Q number when
adding statements based on an item-name match.
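A minimal sketch of the behavior point 4 is asking for, assuming an importer that keys on label alone rather than on label plus description (everything here - the function, the store shape, the Q-id counter - is illustrative, not QuickStatements' actual logic):

```python
# Hypothetical sketch: import (label, description) rows against an existing
# store of items, matching by label only. If the importer instead required
# the description to match exactly, a reworded description would create a
# duplicate item that has to be merged later.

def import_items(rows, store, next_qid):
    """Import rows into store, reusing existing items that match by label.

    store: dict mapping qid -> {"label": ..., "description": ...}
    next_qid: callable returning a fresh Q-id for genuinely new items.
    """
    by_label = {item["label"]: qid for qid, item in store.items()}
    for label, description in rows:
        if label in by_label:
            continue  # reuse the existing item instead of creating a duplicate
        qid = next_qid()
        store[qid] = {"label": label, "description": description}
        by_label[qid and label] = qid
    return store
```

Of course, label-only matching has the opposite risk - conflating two distinct people who share a name - so a real importer would likely want a configurable match key.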
What are your priorities as it comes to your own installations? :)
--
twitter: purplepopple
_______________________________________________
Wikibaseug mailing list
Wikibaseug(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibaseug