Hoi,
At Wikidata we often find issues with data imported from a Wikipedia. Lists of these issues have been produced on the Wikipedia involved, and arguably they point to quality problems in Wikipedia, or in Wikidata for that matter. So far hardly anything has resulted from such outreach.
When Wikipedia is a black box that does not communicate with the outside world, at some stage the situation becomes toxic. At this moment there are already people at Wikidata who argue that we should not bother about Wikipedia quality because, in their view, Wikipedians do not care about their own quality.
Arguably, known issues with quality are the easiest to solve.
There are many ways to approach this subject. It is indeed a quality issue for both Wikidata and Wikipedia. It can also be seen as a research issue: how do we deal with quality, and how do such mechanisms function, if at all?
I blogged about it.
Thanks,
GerardM
http://ultimategerardm.blogspot.nl/2015/11/what-kind-of-box-is-wikipedia.html
Gerard,
I think this was always the case. Most Wikidatans are as at home on Wikipedia as they are on Commons. The issue you describe also happened to Commons: both communities feel the other is less focussed on quality. Many Commonists spend hours on high-quality images, and these are rarely picked up by Wikipedia unless a Commonist notices and adds them in their own language.
There is no requirement for Wikipedians to get to know any other project, and this is normal wiki behavior. We don't want anyone to feel pressured to do anything they feel uncomfortable doing. It's already difficult to get Wikipedians to do small tasks like adding categories to their articles. The list of things necessary to create an acceptable article on Wikipedia just seems to get longer and longer, while the associated work on illustrations or data for that article is not even mentioned in current AfC policies on Wikipedia. I have thought about this, but I still think we need to cut down the list of things necessary to make new short articles on Wikipedia, not extend it.
So in summary, I think that what you describe is normal, predictable behavior for a "Wikipedia support" project such as Commons or Wikidata. This will change as more and more external users find out that Commons and Wikidata are valuable resources in and of themselves. This is already the case for many GLAMs, which have found collaborations with Commons to be valuable experiences. I have high hopes this will become the case for Wikidata as well.
Jane
On Fri, Nov 20, 2015 at 8:18 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
[...]
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
My experience is that pretty much all Wikimedians care about quality, though some have different, even diametrically opposed views as to what quality means and which things are cosmetic or crucial.
My experience of the sadly dormant death anomaly project https://meta.wikimedia.org/wiki/Death_anomalies_table was that people react positively to being told "here is a list of anomalies on your language Wikipedia", especially if those anomalies are relatively serious. My experience of editing in many different languages is that Wikipedians appreciate someone who improves articles, even if you don't speak their language. Dismissing any of our thousand wikis as a "black box" is, I think, less helpful.
One of the great opportunities of Wikidata is to do the sort of data-driven anomaly finding that we pioneered with the death anomalies report. But we always need to remember that there are cultural differences between wikis, and not just in such things as the age at which we assume people are dead. Diplomacy is a useful skill in cross-wiki work.
~~~~
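The data-driven anomaly finding described above can be sketched roughly as follows. This is only an illustration of the idea, not how the death anomalies report was actually built: the record layout, the function name `find_death_anomalies`, and the sample entries are all hypothetical.

```python
from collections import defaultdict


def find_death_anomalies(records):
    """Return items that some wikis list as living and others as dead.

    `records` is an iterable of (item_id, wiki, is_dead) tuples; this layout
    is an assumption made for the sketch.
    """
    status_by_item = defaultdict(dict)
    for item_id, wiki, is_dead in records:
        status_by_item[item_id][wiki] = is_dead

    anomalies = {}
    for item_id, statuses in status_by_item.items():
        # An anomaly is any item on which the wikis disagree.
        if len(set(statuses.values())) > 1:
            anomalies[item_id] = {
                "dead_on": sorted(w for w, dead in statuses.items() if dead),
                "living_on": sorted(w for w, dead in statuses.items() if not dead),
            }
    return anomalies


# Hypothetical sample data, for illustration only.
sample = [
    ("person-1", "enwiki", False),
    ("person-1", "dewiki", True),
    ("person-2", "enwiki", True),
    ("person-2", "frwiki", True),
]
report = find_death_anomalies(sample)
```

A report like this only says that the wikis disagree; deciding which wiki is right still needs a human with sources.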
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
For what it’s worth, I just took a look at:
https://en.wikipedia.org/wiki/Wikipedia:Database_reports/Living_people_on_EN...
being the first time I’ve heard about it. The problem I face with that list of people is that there is no way that I can filter it to people within the categories I work in. If I knew there were people from Queensland in the list, I’d happily fix them up (my knowledge of sources for Queensland is good) but I don’t intend to spend time on Swedish poets or military leaders in Senegal (because if you start thinking you can curate the whole of Wikipedia, that way lies madness). Whereas every now and again, I do help out with:
https://en.wikipedia.org/wiki/Category:Queensland_articles_missing_geocoordi...
precisely because it’s within my sphere of interest and expertise. I’m good with Queensland geo-locations. Similarly, I also do some disambiguation using
http://dispenser.homenet.org/~dispenser/cgi-bin/watchlist_points.py
because that filters articles to those on my watchlist.
If we could provide tools to filter these Wikipedia-wide lists/categories of needed work by:
* Categories
* Projects
* Watchlists
I think it is much more likely people would help out, because they could focus on articles on topics they care about.
Kerry
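The filtering Kerry asks for amounts to intersecting a wiki-wide worklist with the pages an editor cares about. A minimal sketch, assuming both lists are already available as page titles (the function names and sample titles are hypothetical, and fetching the lists from a category, project, or watchlist is left out):

```python
def normalise(title):
    # MediaWiki treats underscores and spaces in page titles as equivalent.
    return title.replace("_", " ").strip()


def filter_worklist(worklist, scope_titles):
    """Keep only the worklist pages that fall inside an editor's scope.

    `scope_titles` could come from a category, a WikiProject's article list,
    or a watchlist; how it is obtained is outside this sketch.
    """
    scope = {normalise(t) for t in scope_titles}
    return [title for title in worklist if normalise(title) in scope]


# Hypothetical titles, for illustration only.
worklist = ["Example person A", "Example_person_B", "Example person C"]
my_scope = ["Example person B", "Example place"]
todo = filter_worklist(worklist, my_scope)
```

The original worklist order is preserved, so a report sorted by severity stays sorted after filtering.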
From: Wiki-research-l [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of WereSpielChequers
Sent: Friday, 20 November 2015 11:31 PM
To: Research into Wikimedia content and communities <wiki-research-l@lists.wikimedia.org>
Cc: WikiData-l <wikidata-l@lists.wikimedia.org>; Wikimedia Mailing List <wikimedia-l@lists.wikimedia.org>
Subject: Re: [Wiki-research-l] Quality issues
[...]
Hoi,
Kerry, using Wikidata it is actually possible to filter for the people from Queensland. The point, however, is that it takes Wikidata and a query to produce this. Using the wonderful tools produced by Magnus, it is relatively easy to do.
Wikidata is a great tool for combining things; however, it is only in combination with other sources that its data becomes relevant in a workflow kind of way. Its data can help any source to do comparisons. In the process both Wikidata and those sources will become a better mousetrap for quality information.
Yes, there are cultural differences. However, we can agree that quality is important and that we need to work towards more data, and from there towards data of a higher quality.
Thanks,
GerardM
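As a rough idea of what "Wikidata and a query" could look like, here is a sketch that builds a SPARQL query for people born somewhere within a region. It swaps in the public Wikidata SPARQL endpoint rather than the tools mentioned above, reads "from Queensland" as "place of birth located in Queensland" (an assumption), and assumes Q36074 is the Queensland item, which should be checked before use. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def people_born_in(region_qid, limit=100):
    """Build a SPARQL query for humans born in a place within `region_qid`."""
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P31 wd:Q5 ;          # instance of: human
          wdt:P19 ?birthplace .    # place of birth
  ?birthplace wdt:P131* wd:{region_qid} .  # located in, followed transitively
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


def run_query(query):
    """Send the query to the endpoint and return the result rows (needs network)."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": "worklist-sketch/0.1 (example)"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


# Q36074 is assumed here to be the Wikidata item for Queensland; verify before use.
query = people_born_in("Q36074", limit=50)
```

The resulting list of people could then be intersected with a Wikipedia worklist to give a Queensland-only to-do list.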
On 20 November 2015 at 22:23, Kerry Raymond kerry.raymond@gmail.com wrote:
[...]
Gerard
I am sure it is possible, but if nobody provides an easy way to do it (or detailed instructions on how to do it yourself), I am unsurprised that ordinary editors don’t bother. I am not a Wikidata contributor; I don’t know how any of it works. I don’t know Lua. I hear constant claims of how wonderful it is, but never get any information on how to get started. I have looked on the Wikidata main pages but come up empty-handed. I cannot even find the “upload” for the spreadsheets (it’s not on the left-hand toolbar, and it doesn’t come up in a Help search). I’m not the first Australian to ask the question about the census data, but all we get is “it’s possible”, never “this is how to do it”. Where can we ask that question and actually get an answer?
We had hoped to get some funding for an Australian conference, at which we wanted someone involved with Wikidata to give a keynote address and a tutorial/workshop on this census data issue, but WMF knocked us back on the funding so it didn’t happen. However, we could probably use some WMAU funding to bring out an individual to give a workshop.
Kerry
From: Gerard Meijssen [mailto:gerard.meijssen@gmail.com]
Sent: Saturday, 21 November 2015 8:41 AM
To: Kerry Raymond <kerry.raymond@gmail.com>; Research into Wikimedia content and communities <wiki-research-l@lists.wikimedia.org>
Subject: Re: [Wiki-research-l] Quality issues
[...]
Gerard
Can you provide some URLs for these lists and blog postings please?
I think part of the problem may be that the information never reaches “ordinary editors”. Communication channels on our projects are very poor. I read article talk pages and the Australian Wikipedians Noticeboard, but not a lot of other places.
However, I have a problem and I wonder if Wikidata can help with it. We have a census in Australia every 5 years, and the population data from the most recent census (2011) is a standard item in every lede and infobox for any Australian place (town/suburb/locality) article on en.WP at least. However, maintaining that information is a massive, tedious manual task. As a consequence, we still have lots of articles with 2006 census data while the 2016 census is coming at us like a freight train. The 2016 census will be the first one done primarily online (normally we fill out a long paper form, so there are months of data entry which delay the release of the data) and the data will be released around mid-2017. Now, all this population data is available as spreadsheets under a CC-BY license.
My question is this. Can we upload these spreadsheets into Wikidata and then create some kind of template on en.WP which can extract that data from Wikidata? I am thinking something like:
{{CensusAUlatest|QLD|Childers}}
Which we could embed in, say, the lede and which would produce something like
In the 2016 Australian census, Childers reported a population of 12,345. <ref>….</ref>
Where the 12,345 (and probably some components of the citation) would be extracted from the 2016 spreadsheet entry for Childers. I’ve asked a few people if this is possible to automate in this way and I get the standard response “it might be but I don’t know enough about Wikidata”.
We have a similar problem with climate data, where again we can probably obtain spreadsheets with the data under a suitable license, if only we had a way to incorporate it into articles automatically, without the current massive manual effort.
Do you have any advice for us? I am sure we are not the only nation with this census problem, although I realise that in some countries the data may not be released in suitable formats or with suitable licenses.
Kerry
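What the proposed {{CensusAUlatest|QLD|Childers}} template would need to do can be sketched outside the wiki as follows. This is an illustration of the logic only, not a working template: it assumes the spreadsheet is exported as CSV with hypothetical column names `place`, `year`, and `population`, and it uses placeholder figures that echo the made-up 12,345 above.

```python
import csv
import io


def latest_census_sentence(place, csv_text):
    """Render a lede sentence from the newest census row for `place`.

    Assumes CSV columns named place, year, population (hypothetical names).
    Returns None when the place does not appear in the data.
    """
    rows = [r for r in csv.DictReader(io.StringIO(csv_text)) if r["place"] == place]
    if not rows:
        return None
    newest = max(rows, key=lambda r: int(r["year"]))
    population = int(newest["population"])
    return (f"In the {newest['year']} Australian census, {place} "
            f"reported a population of {population:,}.")


# Placeholder figures, not real census results.
SAMPLE = """place,year,population
Childers,2011,11111
Childers,2016,12345
Exampleville,2016,999
"""
sentence = latest_census_sentence("Childers", SAMPLE)
```

Because the sentence is built from whichever row has the latest year, adding the next census to the data would update every article that uses it without further edits.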
From: Wiki-research-l [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Gerard Meijssen
Sent: Friday, 20 November 2015 5:18 PM
To: Wikimedia Mailing List <wikimedia-l@lists.wikimedia.org>; Research into Wikimedia content and communities <wiki-research-l@lists.wikimedia.org>; WikiData-l <wikidata-l@lists.wikimedia.org>
Subject: [Wiki-research-l] Quality issues
[...]
Hoi,
Yes.
Data in Wikidata can be dated, and the latest data can be marked as current. As I understand it, Lua can be used to retrieve the latest data from Wikidata. So yes, you can upload the census data to Wikidata and use templates in any Wikipedia to show the latest data for any and all Australian settlements.
I am not the right guy to ask for the Lua code; that is why I included Wikidata-l.
Thanks,
GerardM
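The idea of dated values with the latest one treated as current can be illustrated as below. On-wiki this would be written in Lua, as noted above; Python is used here only to show the selection logic. The statement shape is a simplified, hypothetical stand-in for Wikidata statements carrying a "point in time" (P585) qualifier, and the figures are placeholders.

```python
def latest_dated_value(statements):
    """Return the value whose 'point_in_time' is most recent.

    Each statement is a dict such as {"value": 100, "point_in_time": "2011-08-09"};
    this is a simplified, hypothetical shape, not Wikidata's real JSON.
    Undated statements are ignored.
    """
    dated = [s for s in statements if s.get("point_in_time")]
    if not dated:
        return None
    # ISO 8601 dates of equal length sort chronologically as plain strings.
    return max(dated, key=lambda s: s["point_in_time"])["value"]


# Placeholder population figures for one settlement.
populations = [
    {"value": 100, "point_in_time": "2006-08-08"},
    {"value": 120, "point_in_time": "2011-08-09"},
    {"value": 90},  # no date recorded
]
current = latest_dated_value(populations)
```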
On 20 November 2015 at 22:44, Kerry Raymond kerry.raymond@gmail.com wrote:
[...]
Kerry,
Yes, it is possible to extract data with templates. A very simple demonstration (and about my limit of template-wrangling...) is: go to a random Queensland city article, https://en.wikipedia.org/wiki/Bundaberg , and preview (don't save!) an edit with "{{#property:p1082}}" pasted in. Property P1082 is "population", and you'll see that it shows the 2014 population as specified at https://www.wikidata.org/wiki/Q185404 (Bundaberg).
Also, notice those 3 tiny vertical squares next to each value at Wikidata (see http://i.imgur.com/8h6pGoQ.png ). Those are the "rank" ( https://www.wikidata.org/wiki/Help:Ranking ) and only one of them can be marked as "preferred"; for the Bundaberg population that is the most recent, 2014 value. Hence it knows which of the many historical population numbers to use.
There are a few live examples of data extraction in https://en.wikipedia.org/wiki/Category:Templates_using_data_from_Wikidata - a good one is Template:Infobox_anatomy. If you check the wikitext at https://en.wikipedia.org/wiki/Skin you'll see that no specific values are given for 3 of the parameters that show in the infobox; they all come from Wikidata.
At Enwiki, the place to request help with the creation of specific templates is https://en.wikipedia.org/wiki/Wikipedia:Requested_templates
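The rank mechanism described above can be sketched as a small selection rule: if any statement is marked "preferred", use only those; otherwise fall back to the "normal" ones, and never use "deprecated" ones. This is an illustration under the assumption that the parser function picks values this way; the statement shape, function name, and figures are hypothetical.

```python
def best_values(statements):
    """Mimic rank-based selection: preferred statements win, else normal ones.

    Deprecated statements are never returned.
    """
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s["value"] for s in statements if s["rank"] == "normal"]


# Placeholder population history for one item.
history = [
    {"value": 100, "rank": "normal"},
    {"value": 120, "rank": "preferred"},
    {"value": 7, "rank": "deprecated"},
]
shown = best_values(history)
```

With no preferred statement at all, every normal value would be returned, which is one reason marking the newest figure as preferred matters for infoboxes.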
Currently, Wikidata advises using its data only within other templates (such as infoboxes and authority control boxes) and not in prose. This is also often discussed individually in each community; e.g. Enwiki last discussed it exhaustively in 2013, with relevant conclusions at https://en.wikipedia.org/wiki/Wikipedia:Wikidata#Inserting_Wikidata_values_i... - the talk page there would be another good place to ask for help or clarification.
Adding info to Wikidata is sadly not as simple as uploading a spreadsheet. It is done one change at a time, though this is often sped along via bots/scripts/tools. I would suggest asking https://en.wikipedia.org/wiki/User:Mattinbgn , who added the population data to Bundaberg (see https://www.wikidata.org/w/index.php?title=Q185404&action=history ), how they did it. If that editor doesn't know how to semi-automate the process for bulk data, then ask for help at https://www.wikidata.org/wiki/Wikidata:Bot_requests (or perhaps skim old requests https://www.wikidata.org/wiki/Special:Search?search=population&prefix=Wikidata%3ABot+requests%2FArchive%2F&fulltext=Search+the+archives&fulltext=Search if so inclined). The very short overview is https://www.wikidata.org/wiki/Wikidata:Data_donation
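Since the data goes in one change at a time, a bulk import is mostly a matter of turning each spreadsheet row into one planned edit that a bot or tool can then apply. A minimal sketch of that preparation step follows; it does not talk to Wikidata at all. The CSV column names, the edit dictionary layout, the population figures, and the census date are hypothetical; only the Bundaberg item Q185404 and property P1082 come from the message above.

```python
import csv
import io


def census_rows_to_edits(csv_text, item_ids, census_date):
    """Turn spreadsheet rows into one planned edit per place.

    `item_ids` maps place names to Wikidata item IDs; rows with no known item
    are returned separately so a person can match them by hand.
    """
    edits, unmatched = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        qid = item_ids.get(row["place"])
        if qid is None:
            unmatched.append(row["place"])
            continue
        edits.append({
            "item": qid,
            "property": "P1082",  # population
            "value": int(row["population"]),
            "point_in_time": census_date,
        })
    return edits, unmatched


# Placeholder figures and a placeholder date; not real census data.
SHEET = """place,population
Bundaberg,50000
Exampleville,999
"""
edits, unmatched = census_rows_to_edits(SHEET, {"Bundaberg": "Q185404"}, "2016-01-01")
```

Keeping the unmatched places in their own list makes the manual part of the job visible up front, which is useful when asking for help at the bot requests page.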
Hope that helps, or at least leads you towards clearer answers.
Quiddity
(n.b. this is all with my volunteer hat on (coming from the wrong email account, but the one subscribed to this list) and might contain errors (corrections appreciated). I just have an amateur enthusiasm for Wikidata, and look forward to the time when infoboxes at all Wikipedias contain up-to-date statistics with minimal redundant effort. :-)
On Fri, Nov 20, 2015 at 1:44 PM, Kerry Raymond kerry.raymond@gmail.com wrote:
[...]