Thanks for your input!
I agree that with Wikidata we can generate article content semi-automatically, without the controversy we have seen so far.
But our experience is that it takes much more time than expected to get Wikidata operational for the data we want to get into it.
For the data we are working with just now, Swedish entities like administrative units, towns, parishes, lakes etc., we have found:

1. Before we load Wikidata we must have the identities correct on svwp, both the name, the official "entity number" and the coordinates. We thought this would be simple, but we find that it takes much longer than anticipated, as we want this data to be 99.8% correct, not the 98% we are used to (if we do not load Wikidata with top quality data, it cannot be recommended as a general source of info for all language versions). And we find loads of problems, errors and ambiguous data in the sources we get from our authorities (besides typing errors in Wikipedia). No one has ever scrutinized this official data as we Wikipedians are doing now. And just for the basic 10,000 entities, it will take our group of five or six up to a year to get this right.

2. Before we then load the data into Wikidata we must have identified the correct properties and in many cases get new ones in place. It is just a few weeks since it became possible to enter populations, and an important property like geoshape is still far away. And for the new properties unique to our project we have to work through the Wikidata property definition process, which can easily take 6-9 months for a single property. All of this must of course be ready before we start the actual loading.

3. The actual load into Wikidata is then quite straightforward (by bot). But to make use of the data in Wikidata we need new templates in place in our language version. And here we find that for many data items we need modules written in Lua, so the template can handle the data and present it correctly in the articles (see the sketch after this list).

4. After loading Wikidata we need to work through our articles so that they base their data on Wikidata. Some of this can be done without problems using templates and going through the articles with bots. But we also expect that several articles will need manual adjustments to get everything correct (like fact data residing in the text portion that should now be taken from Wikidata).
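To make point 3 a bit more concrete, here is a minimal sketch of the kind of Lua module we mean. It is only an illustration under some assumptions: it uses the Scribunto Wikibase client (mw.wikibase) and the population property P1082 purely as an example, and the exact claim layout should be checked against the current client before a template relies on it.

local p = {}

-- Sketch: fetch the population from the Wikidata item connected to the
-- current article. P1082 (population) is only an example property id.
function p.population( frame )
    local entity = mw.wikibase.getEntity()
    if not entity or not entity.claims then
        return ''                        -- no linked item: let the template fall back to a local value
    end
    local statements = entity.claims['P1082']
    if not statements or #statements == 0 then
        return ''                        -- the item has no population statement
    end
    local snak = statements[1].mainsnak  -- take the first statement only
    if snak.snaktype ~= 'value' then
        return ''                        -- "unknown value" or "no value"
    end
    -- quantity values come as a table; the amount is a string like "+12345"
    return tostring( snak.datavalue.value.amount )
end

return p

A template would then call it with something like {{#invoke:Population|population}} (the module name is just an example) and could keep a local parameter as a fallback while the articles in point 4 are being converted.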
But if we get that far, our articles will be perfect, and we can produce a set of software, like templates and modules, that makes implementing this data in other language versions very easy.
Another lesson is that we will not be able to achieve our goal without the support of technical expertise (such as knowledge of Lua and how to write data interfaces to external data providers). Right now we are discussing with our local chapter whether they can provide technical expertise when ours is not enough; we are, after all, Wikipedians, not tech wizards.
And we miss having colleagues on other language versions to discuss with; it is all very complex.
Anders
Gerard Meijssen wrote 2014-02-05 12:16:
Hi,
At Wikidata the number of items and the associated data is growing steadily. We are dealing with the aftermath of some bots and to be honest, that is also very much the name of the game.
An example: many species have been added in the ceb, nl and sv Wikipedias, and it would be wonderful if the "parent taxon" were included [1] for all of them. This is now happening in a "one at a time" fashion.
What is also happening is that new information is added to Wikidata from external sources. I blogged about this [2] and in my opinion it is fabulous. What is so great is that any Wikipedia that adds "Wikidata search" to its extended search already benefits. Every community can choose to add stub articles based on the information in Wikidata.
In my opinion data that has some relevance can be included in Wikidata particularly when it is rich in statements and references to external sources. With great information in Wikidata, it becomes possible to use it to build even more extensive stub articles. Such things are starting to happen.
Bot-created information is controversial in many Wikipedias. It is not in Wikidata. Very welcome is all the data that enriches the items we already know. Very welcome is the data on the things we do not yet know but appreciate as relevant. Thanks, GerardM
[1] http://ultimategerardm.blogspot.nl/2014/01/taxonomy-where-there-is-nothing.h... [2] http://ultimategerardm.blogspot.nl/2014/02/wikidata-ntf4-human-gene.html
On 4 February 2014 09:31, Anders Wennersten mail@anderswennersten.se wrote:
Nemo has found this wiki, which I find very interesting [1]. It contains 1.68 million articles and seems to be a copy of articles from the Lithuanian Wikipedia plus some 1.5 million bot-generated articles, with a focus on species (I know from Lsjbot that there are at least some 1.3 million articles on species to be found from reliable databases).
The effort seems to have been made by just a few Lithuanian Wikipedians with the right technical skills and insight into Wikipedia; they are probably also active on ltwp [2].
For me it is a reminder of what will happen if we continue to be sceptical of bot-generation of articles with correct info and verified sources. Creative people will do it anyway, and then outside Wikipedia, which could make Wikipedia redundant in the same way Wikipedia has made the old paper-based encyclopedias redundant. The online encyclopedia offering the most knowledge to readers will survive, and bot-generated verified articles contain more knowledge than no article on the subject. Also note that the most active now are languages like Vietnamese and Lithuanian, with small communities all aware that it would take eons if these articles were expected to be created manually.
I would like the movement and the upcoming strategy to take a proactive stand on semi-automated articles.
On sv:wp we have had this focus since last August, including upload to Wikidata as part of the article generation. We have found the inclusion of Wikidata much more complex than we anticipated. We thought half a year would be enough to "get a set of items with proper 100% quality data into Wikidata", but we now think it will take something like two years for just a small set of 10,000 articles :( This has not changed our belief in this approach, but we would certainly appreciate it if there were other entities doing the same with whom we could exchange experience (or a central initiative).
Anders
[1] Start page http://lietuvai.lt/wiki/Pagrindinis_puslapis
Latest changes http://lietuvai.lt/wiki/Specialus:Naujausi_puslapiai
For a random article press "Atsitiktinis puslapis" http://lietuvai.lt/wiki/Specialus:Atsitiktinis_puslapis/Straipsnis
[2] ltwp https://lt.wikipedia.org/wiki/Pagrindinis_puslapis