Hoi Anders, I am afraid that the approach you describe is one where perfection is the enemy of the good.
Wikidata is full of imperfections. It is incomplete and often wrong: how about prime ministers of the United Kingdom who have been dead for centuries featuring as actors in several movies?
Once the data that are to be used in stubs or articles have been uploaded to Wikidata, you can improve them there just as happily as anywhere else. It is also possible to have tools on Wikidata, like Reasonator, that help you get to grips with consistency.
When you state that it takes months to get the properties you need for your project into Wikidata, I consider that a big problem at Wikidata and definitely something that needs attention. When you know which set of properties you need, you can propose them as a batch and make it obvious that they have to be considered together. As you know, the quantity type of properties has been released, so this is a good time to make your proposals.
Most of all, Wikidata is a wiki. You indicate that the official sources need work; Wikidata is a good place to work on this. When it is a year's work for you to beaver away at, you may find more people at Wikidata to work with you on solutions. Thanks, GerardM
On 5 February 2014 13:21, Anders Wennersten mail@anderswennersten.se wrote:
Thanks for your input!
I agree that with Wikidata we can generate article content semi-automatically without the controversy we have seen so far.
But what we have learned is that it takes much more time than expected to get Wikidata operational for the data we want to put into it.
For the data we are working with right now, Swedish entities like administrative units, towns, parishes, lakes etc., we have found:

1. Before we load Wikidata we must have the identities correct on svwp: the name, the official "entity number" and the coordinates. We thought this would be simple, but it takes much longer than anticipated because we want this data to be 99.8% correct, not the 98% we are used to (if we do not load Wikidata with top quality, it cannot be recommended as a general source of info for all versions). And we find loads of problems, errors and ambiguous data in the sources we get from our authorities (besides typing errors in wp). No one has ever scrutinized this official data the way we Wikipedians are doing now. Just for the basic 10,000 entities it will take our group of five or six people up to a year to get this right.

2. Before we then load data into Wikidata we must have identified the correct properties and in many cases get new ones in place. It is just a few weeks since it became possible to enter populations, and an important property like geoshape is still far away. For the new properties unique to our project we have to work through the Wikidata property definition process, which can easily take 6-9 months for a single property. All of this must of course be ready before we start the actual loading.

3. The actual load into Wikidata is then quite straightforward (by bot). But to make use of the data in Wikidata we need new templates in place in our language version, and for many data items we need modules written in Lua so that the templates can handle the data and present it correctly in the articles (a rough sketch follows below).

4. After loading Wikidata we need to work through our articles so that they base their data on Wikidata. Some of this can be done without problems using templates and going through the articles with bots, but we also expect that several articles will need manual adjustment to get everything correct (like fact data residing in the text portion that should now be taken from Wikidata).
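To give an idea of what such a Lua module involves, here is a rough sketch that pulls a population value from the Wikidata item connected to the current page. It assumes the Wikibase client library (mw.wikibase) is enabled on the wiki and that population is stored under property P1082; the exact claim layout may differ between Wikibase versions, so treat it as a sketch rather than a finished module:

local p = {}

-- Return the population of the item connected to the current page,
-- or an empty string when no usable statement is found.
function p.population(frame)
    local entity = mw.wikibase.getEntity()        -- item linked to this article
    if not entity or not entity.claims then
        return ''
    end
    local statements = entity.claims['P1082']     -- P1082 = population (assumed)
    if not statements or not statements[1] then
        return ''
    end
    local snak = statements[1].mainsnak
    if snak.snaktype ~= 'value' then
        return ''
    end
    -- Quantity values keep the number in the "amount" field, e.g. "+97532".
    return (snak.datavalue.value.amount:gsub('^%+', ''))
end

return p

If that code lived in a page called Module:Population (a hypothetical name), an infobox template would call it with {{#invoke:Population|population}} and fall back to a locally stored value whenever the result is empty.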
But if we get this far, our articles will be perfect, and we can produce a set of software, like templates and modules, that makes implementing these data in other language versions very easy.
Another lesson is that we will not actually be able to achieve our goal without the support of technical expertise (like knowledge of Lua and how to write data interfaces to external data providers). Right now we are discussing with our local chapter whether they can provide technical expertise where ours is not enough; we are, after all, Wikipedians, not tech wizards.
And we miss having colleagues on other language versions to discuss this with; it is very complex.
Anders
Gerard Meijssen wrote on 2014-02-05 12:16:
Hoi,
At Wikidata the number of items and the associated data are growing steadily. We are dealing with the aftermath of some bots and, to be honest, that is very much the name of the game.
An example: many species have been added to the ceb, nl and sv Wikipedias, and it would be wonderful if the "parent taxon" were included [1] for all of them. This is now happening one item at a time.
What is also happening is that new information is being added to Wikidata from external sources. I blogged about this [2] and in my opinion it is fabulous. What is so great is that any Wikipedia that adds "Wikidata search" to its extended search already benefits. Every community can choose to add stub articles based on the information in Wikidata.
In my opinion data that has some relevance can be included in Wikidata particularly when it is rich in statements and references to external sources. With great information in Wikidata, it becomes possible to use it to build even more extensive stub articles. Such things are starting to happen.
Bot-created information is controversial in many Wikipedias. It is not in Wikidata. All the data that enriches the items we already know is very welcome, and so is data on things we do not yet know but consider relevant. Thanks, GerardM
[1] http://ultimategerardm.blogspot.nl/2014/01/taxonomy-where-there-is-nothing.html
[2] http://ultimategerardm.blogspot.nl/2014/02/wikidata-ntf4-human-gene.html
On 4 February 2014 09:31, Anders Wennersten mail@anderswennersten.se wrote:
Nemo has found this wiki, which I find very interesting [1]. It contains 1.68 million articles and seems to be a copy of articles from the Lithuanian Wikipedia plus some 1.5 million bot-generated articles, with a focus on species (I know from Lsjbot that there are at least some 1.3 million species articles to be found in reliable databases).
The effort seems to have been carried out by just a few Lithuanian Wikipedians with the right technical skill and insight into Wikipedia; they are probably also active on ltwp [2].
For me it is a reminder of what will happen if we continue to be sceptical of bot generation of articles with correct info from verified sources. Creative people will do it anyway, and then outside Wikipedia, which could make Wikipedia redundant in the same way Wikipedia has made the old paper-based encyclopedias redundant. The online encyclopedia offering the most knowledge to its readers will survive, and a bot-generated, verified article contains more knowledge than no article on the subject. Also note that the most active now are languages like Vietnamese and Lithuanian, with small communities that all know it would take eons if these articles were expected to be created manually.
I really would like the movement and the upcoming strategy to take a proactive stand on semi-automated articles.
On sv:wp we have had this focus since last August, including uploading to Wikidata as part of the article generation. We have found the inclusion of Wikidata much more complex than we anticipated. We thought half a year would be enough to "get a set of items with proper 100% quality data into Wikidata", but we now think it will take something like two years for just a small set of 10,000 articles :( This has not changed our belief in this approach, but we would certainly appreciate it if there were other entities doing the same with whom we could exchange experience (or a central initiative).
Anders
[1] Start page: http://lietuvai.lt/wiki/Pagrindinis_puslapis
Latest changes: http://lietuvai.lt/wiki/Specialus:Naujausi_puslapiai
For a random article press "Atsitiktinis puslapis": http://lietuvai.lt/wiki/Specialus:Atsitiktinis_puslapis/Straipsnis
[2] ltwp: https://lt.wikipedia.org/wiki/Pagrindinis_puslapis
_______________________________________________
Wikimedia-l mailing list
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe