Hoi,
Anders, I am afraid that the approach you describe is one where the perfect
is the enemy of the good.
Wikidata is full of imperfections. It is incomplete and often badly wrong.
How about prime ministers of the United Kingdom who have been dead for
centuries being listed as actors in several movies?
When the data that are to be used in stubs or articles are uploaded to
Wikidata you can as happily improve them in Wikidata as anywhere else. What
is possible is to have tools on Wikidata like Reasonator that will help you
get to grips with the consistency.
When you state that it takes months to get the properties into Wikidata that
you need for your project, I find that a big problem at Wikidata and
definitely something that needs attention. When you know which set of
properties you need, you can propose them as a batch and make it
obvious that they have to be considered together. As you know, properties
of the quantity type have been released, so this is a good time to make
your proposals.
Most of all, Wikidata is a wiki. You indicate that the official sources
need work. Wikidata is a good place to work on this. If it is a year's
work for you to beaver away at, you may find more people to work with you
on solutions at Wikidata.
Thanks,
GerardM
On 5 February 2014 13:21, Anders Wennersten <mail(a)anderswennersten.se> wrote:
Thanks for your input!
I agree that with Wikidata we can generate article content
semi-automatically without the controversy we have seen so far.
But our experience is that it takes much more time than expected to get
Wikidata operational for the data we want to get into it.
For the data we are working with right now, Swedish entities such as
administrative units, towns, parishes, lakes etc., we have found:
1. Before we load Wikidata we must have the identities correct on svwp:
the name, the official "entity number" and the coordinates. We thought this
would be simple, but we find that it takes much longer than anticipated, as
we want this data to be 99.8% correct, not 98% as we are used to (if we
do not load Wikidata with top-quality data, it cannot be recommended as a
general source of info for all language versions). And we find loads of
problems, errors and ambiguous data in the sources we use from our
authorities (besides typing errors in wp). No one has ever scrutinized this
official data as we Wikipedians are doing now. And just for the basic 10000
entities, it will take our group of five or six up to a year to get this
right.
2. Before we then load data into Wikidata we must have identified the
correct properties and in many cases get new ones in place. It is just a few
weeks since it became possible to enter populations, and an important
property like geoshape is still far away. And for any new property unique to
our project we have to work through the Wikidata definition process, which
can easily take 6-9 months for a single property. All of this must of course
be ready before we start the actual loading.
3. The actual load into Wikidata is then quite straightforward (by bot). But
to make use of the data in Wikidata we need to have new templates in place
in our language version. And here we find that for many data items we need
modules written in Lua so that the templates can handle the data and
present it correctly in the articles.
4. After loading Wikidata we need to work through our articles so that they
base their data on Wikidata. Some of this can be done without problems by
using templates and going through the articles with bots. But we also expect
that several articles will need manual adjustment to get everything correct
(like facts residing in the text portion that should now be taken from
Wikidata).
But if we come this far, our articles will be perfect and we can produce a
set of software, like templates and modules, that makes implementation of
these data in other language versions very easy.
Another lesson is that we will actually not be able to achieve our goal
without the support of technical expertise (like knowledge of Lua, or of how
to write data interfaces to external data providers). Right now we are
discussing with our local chapter whether they can provide technical
expertise when ours is not enough; we are after all Wikipedians, not tech
wizards.
And we miss having colleagues on other language versions to discuss this
with; it is very complex.
Anders
Gerard Meijssen wrote on 2014-02-05 12:16:
Hoi,
At Wikidata the number of items and the associated data is growing
steadily. We are dealing with the aftermath of some bots and to be honest,
that is also very much the name of the game.
An example: many species have been added in the ceb, nl and sv Wikipedias
and it would be wonderful if the "parent taxon" were included [1] for all of
them. This is now happening in a "one at a time" fashion.
What is also happening is that new information is being added to Wikidata
from external sources. I blogged about this [2] and in my opinion this is
fabulous. What is so great is that any Wikipedia that includes "Wikidata
search" in its extended search already benefits. Every community can
choose to add stub articles based on the information in Wikidata.
In my opinion data that has some relevance can be included in Wikidata
particularly when it is rich in statements and references to external
sources. With great information in Wikidata, it becomes possible to use it
to build even more extensive stub articles. Such things are starting to
happen.
Bot created information is controversial in many Wikipedias. It is not in
Wikidata. Very welcome is all the data that enriches the items we already
know. Very welcome is the data on the things we do not yet know but
appreciate as relevant.
Thanks,
GerardM
[1]
http://ultimategerardm.blogspot.nl/2014/01/taxonomy-where-there-is-nothing.html
[2]
http://ultimategerardm.blogspot.nl/2014/02/wikidata-ntf4-human-gene.html
On 4 February 2014 09:31, Anders Wennersten <mail(a)anderswennersten.se> wrote:
Nemo has found this wiki, which I find very interesting [1]. It contains
1.68 million articles and seems to be a copy of the articles from the
Lithuanian Wikipedia plus some 1.5 million bot-generated articles, with a
focus on species (I know from Lsjbot that there are at least some 1.3 M
species articles that can be sourced from reliable databases).
The effort seems to have been made by just a few Lithuanian Wikipedians with
the right technical skills and insight into Wikipedia; they are probably
also active on ltwp [2].
For me it is a reminder of what will happen if we continue to be sceptical
of bot generation of articles with correct info and verified sources.
Creative people will do it anyway, and then outside Wikipedia, which could
make Wikipedia redundant in the same way Wikipedia has made the old
paper-based encyclopedias redundant. The online encyclopedia offering the
most knowledge to its readers will survive, and a bot-generated, verified
article contains more knowledge than no article on the subject. Also note
that the most active now are languages like Vietnamese and Lithuanian, with
small communities that are all aware it would take eons if these articles
were to be created manually.
I would like the movement and the upcoming strategy to take a proactive
stand on semi-automated articles.
On sv:wp we have had this focus since last August, including upload to
Wikidata as part of the article generation. We have found the inclusion
of Wikidata much more complex than we anticipated. We thought half a year
would be enough to "get a set of items with proper 100% quality data into
Wikidata", but we now think it will take something like two years for just
a small set of 10000 articles :( This has not changed our belief in this
approach, but we would certainly appreciate it if there were other entities
doing the same with whom we could exchange experience (or a central
initiative).
Anders
[1]
Start page
http://lietuvai.lt/wiki/Pagrindinis_puslapis
Latest changes
http://lietuvai.lt/wiki/Specialus:Naujausi_puslapiai
For a random article press Atsitiktinis puslapis
<http://lietuvai.lt/wiki/Specialus:Atsitiktinis_puslapis/Straipsnis>
[2]
ltwp
https://lt.wikipedia.org/wiki/Pagrindinis_puslapis
_______________________________________________
Wikimedia-l mailing list
Wikimedia-l(a)lists.wikimedia.org
Unsubscribe:
https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
<mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>