FYI, useful new stats :)
We might want to build a directory of reports generated on ToolLabs somewhere in the analytics hub on mediawiki.org.
Erik
---------- Forwarded message ----------
From: Gerard Meijssen gerard.meijssen@gmail.com
Date: Thu, Oct 17, 2013 at 10:26 PM
Subject: [Wikidata-l] Statistics
To: WikiData-l wikidata-l@lists.wikimedia.org
Hoi,
I do not know if you have seen the statistics compiled by Magnus [1]. They are up to date and useful.
I blogged about it [2]. As far as I am concerned, the biggest challenge we face is the lack of labels. With 280+ languages represented in Wikidata, the statistics clearly demonstrate that Wikidata, as it stands, is useless for most languages. Please tell me that I am wrong and explain why.
Thanks,
GerardM
[1] http://tools.wmflabs.org/wikidata-todo/stats.php
[2] http://ultimategerardm.blogspot.nl/2013/10/statistics-for-wikidata.html
_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
On Thu, Oct 17, 2013 at 11:56 PM, Erik Moeller erik@wikimedia.org wrote:
> FYI, useful new stats :)
> We might want to build a directory of reports generated on ToolLabs somewhere in the analytics hub on mediawiki.org.
The dataviz is nice, but I don't really see how this is useful to anybody, except maybe the Wikidata community doing navel gazing.
Generally speaking, statements are hardly used in other Wikimedia projects (for example, in most Wikipedia infoboxes) and thus aren't really visible to most Wikimedia readers. It doesn't seem like there's much incentive to improve verifiability of statements in Wikidata when they're not useful to anybody except maybe Google's Knowledge Graph. :P
A much more useful visualization would be the proportion of statements and other data from Wikidata actually referenced in places visible to users. Wikidata is not a project that is useful by itself, even if it were 100% perfectly verified by references. It only becomes useful through visibility to information consumers.
On Fri, Oct 18, 2013 at 12:16 AM, Steven Walling swalling@wikimedia.org wrote:
> Generally speaking, statements are hardly used in other Wikimedia projects (for example, in most Wikipedia infoboxes) and thus aren't really visible to most Wikimedia readers. It doesn't seem like there's much incentive to improve verifiability of statements in Wikidata when they're not useful to anybody except maybe Google's Knowledge Graph. :P
> A much more useful visualization would be the proportion of statements and other data from Wikidata actually referenced in places visible to users. Wikidata is not a project that is useful by itself, even if it were 100% perfectly verified by references. It only becomes useful through visibility to information consumers.
I think everyone agrees that the next step for Wikidata is expansion into greater use in Wikipedia. As far as I can tell, the biggest reason this hasn't happened yet isn't community reluctance, but simply lack of support for critical data types (especially numbers) that are needed for mapping entire infoboxes against Wikidata items. Given that those types are still missing, tracking the current usage would likely not be hugely revelatory yet. That said, it'd be good for Wikidata to provide a mechanism for tracking usage automatically -- there are some manually maintained categories in different languages, but I doubt they paint a complete picture:
https://en.wikipedia.org/wiki/Category:Templates_using_data_from_Wikidata
If an automatic tracking mechanism existed, I agree that this would be one of the most valuable things to report as a key performance indicator for the project as a whole. In the absence of automatic tracking, an Erik Zachte style approach of parsing the dumps might at least be used to generate some interim reports.
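As a rough illustration of the interim dump-parsing approach, here is a minimal Python sketch. It assumes the pages have already been extracted from a dump as (title, wikitext) pairs, and that Wikidata usage shows up in the wikitext as a {{#property:...}} parser function call. The function name wikidata_usage and the input shape are hypothetical, and any other way of pulling in Wikidata values would not be caught.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: a page "uses Wikidata" if its wikitext calls the
# {{#property:...}} parser function. Other access paths are not detected.
PROPERTY_CALL = re.compile(r"\{\{\s*#property\s*:", re.IGNORECASE)


def wikidata_usage(pages: Iterable[Tuple[str, str]]) -> Tuple[List[str], float]:
    """Return the titles that call #property and their share of all pages."""
    total = 0
    using: List[str] = []
    for title, text in pages:
        total += 1
        if PROPERTY_CALL.search(text):
            using.append(title)
    # Avoid dividing by zero when the input is empty.
    share = len(using) / total if total else 0.0
    return using, share
```

Run per wiki and per dump date, the resulting share could serve as a stopgap figure until usage is tracked automatically.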
I don't, however, agree with parts of your characterization, as 1) the growth of the Wikidata project itself does depend on metrics that reflect its internal characteristics, and 2) Wikidata's growth is useful even if we don't yet see adoption at the level of templates, as it leads to the development of applications like http://tools.wmflabs.org/wikidata-todo/tempo_spatial_display.html , which could be turned into Wikipedia-embeddable content without a huge amount of effort (and are also independently useful).
Erik
On Fri, Oct 18, 2013 at 12:28 AM, Erik Moeller erik@wikimedia.org wrote:
> I think everyone agrees that the next step for Wikidata is expansion into greater use in Wikipedia. As far as I can tell, the biggest reason this hasn't happened yet isn't community reluctance, but simply lack of support for critical data types (especially numbers) that are needed for mapping entire infoboxes against Wikidata items.
That's not the case, to my understanding.
As explained in https://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Wikidata_Phase_... the most recent consensus on English Wikipedia is that, regardless of data type, the community only supports including material from Wikidata when it doesn't already exist in Wikipedia. Basically that means that the status quo on enwiki is that you're not allowed to replace local data with references to Wikidata, in any part of article content or templates.
Now, that's from April, and consensus can change. But there are still serious problems in my view. For instance: we're still replicating the Commons problem, where people want to edit something that appears locally relevant, but requires them to go to Yet Another Wiki. Part of that UX problem will be solved by unification of accounts across the wikis, and it's not as big a deal with interwiki links, which are tertiary information and pretty advanced. But when it comes to any template content that appears as part of articles... well, it pretty clearly *is* held up by community reluctance on a number of fronts, not just for lack of comprehensive representation of current data types.
On Fri, Oct 18, 2013 at 12:54 AM, Steven Walling swalling@wikimedia.org wrote:
> As explained in https://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Wikidata_Phase_... the most recent consensus on English Wikipedia is that, regardless of data type, the community only supports including material from Wikidata when it doesn't already exist in Wikipedia. Basically that means that the status quo on enwiki is that you're not allowed to replace local data with references to Wikidata, in any part of article content or templates.
(Getting a bit OT and we may want to move that particular thread to the relevant on-wiki page.)
I remember that RFC, and that was not my interpretation - rather, I interpret it as saying:
1) It's fine to go ahead and modify Template:Infobox_country to pull {{{capital}}} from Wikidata provided a locally specified value overrides the Wikidata one (this can be done at the template level by checking whether a value has been set for a given parameter).
2) It's fine to cautiously start removing explicit value specifications for the {{{capital}}} parameter in pages that call {{Infobox country}}.
Admittedly, 2) is a bit ambiguous in the RFC; Marc-Andre as the closer may be able to clarify whether I got it right.
But if you look at {{Infobox country}}'s actual parameters, you'll see that very many of them cannot be represented in Wikidata currently.
You're spot on regarding the UX issues of course, but there's also a UX benefit while the majority of editing is still done in wikitext -- getting rid of parameter/value clutter simplifies the wikitext for users who have no interest in infobox editing.
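The override rule in 1) can be sketched in a few lines of Python. This only illustrates the precedence logic and is not how the template itself is written; the function name resolve_param and the two dictionaries standing in for the template call and the Wikidata item are hypothetical.

```python
from typing import Mapping, Optional


def resolve_param(
    name: str,
    local_params: Mapping[str, str],
    wikidata_values: Mapping[str, str],
) -> Optional[str]:
    """Prefer a locally specified value; fall back to Wikidata otherwise."""
    local = local_params.get(name)
    # A parameter that is absent or left blank counts as "not set locally".
    if local is not None and local.strip():
        return local.strip()
    # No usable local value: use the Wikidata value, or None if there is none.
    return wikidata_values.get(name)
```

With this precedence, removing the explicit capital value from a page, as in 2), changes what is displayed only if the Wikidata value differs from the local value that was removed.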
Erik
On 10/18/2013 04:09 AM, Erik Moeller wrote:
> 2) It's fine to cautiously start removing explicit value specifications for the {{{capital}}} parameter in pages that call {{Infobox country}}.
> Admittedly, 2) is a bit ambiguous in the RFC; Marc-Andre as the closer may be able to clarify whether I got it right.
IMO, that specific aspect wasn't discussed clearly enough to say which way the consensus goes.
Definitely, there is consensus that locally specified values /must/ override Wikidata-derived ones when they are present; but there was very little discussion about whether it was appropriate to remove that local value or when it would be okay to do so.
Wearing my community hat, I'd recommend that this be proposed to the community first; possibly framing it as "look, it's been working pretty darn well; it makes sense to start defaulting to Wikidata where uncontroversial, since the ability to override it with a local value remains in case of issues."
-- Marc
Erik Moeller, 18/10/2013 08:56:
> FYI, useful new stats :)
> We might want to build a directory of reports generated on ToolLabs somewhere in the analytics hub on mediawiki.org.
I'm not aware of this analytics hub, but I'm collecting on https://meta.wikimedia.org/wiki/Statistics links to the scattered statistics outside the main stats.wikimedia.org.
Nemo