I think it's still some time off, but several open source Watson-like machine reading tools are coming along, any one of which could be put to the task of interest here. They could do more than that, but it's a start.
On Sat, Oct 25, 2014 at 1:41 PM, Kerry Raymond kerry.raymond@gmail.com wrote:
I think it's pointless to argue over what we mean by "quality" or "well written" in general. It is fair to say that there are a lot of mechanically derivable metrics for articles including:
- number of citations
- number of unique citations
- article length
- density of citations, unique citations relative to article length
- ditto for photos, infoboxes, navbox, categories etc
- linguistic analysis like sentence length, Flesch-Kincaid readability scores
- Age of article
- Number of editors
- Number of page views
- Density of ...
- number of reverts
- reverts per editor/year/etc ..
- number of inbound links, number of outbound links, number of redlinks
- manual quality assessments (usually in project tags)
- presence of "issue" tags, e.g. refimprove, citation needed, etc
It seems to me that if we had a tool that could generate a wide range of these sorts of metrics, folks could then put their own algorithm over the top to compute and weight whatever combination of them makes sense for their particular purpose.
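To make the idea concrete, here is a minimal Python sketch, assuming the raw counts have already been collected for an article by some other tool. The names `ArticleMetrics`, `derived_metrics` and `weighted_score`, the chosen subset of metrics, and the sample weights are all hypothetical and only meant to illustrate putting your own weighting over a shared set of metrics.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ArticleMetrics:
    """Hypothetical container for a small subset of the raw counts listed above."""
    citations: int
    unique_citations: int
    length_words: int
    editors: int
    reverts: int
    issue_tags: int


def derived_metrics(m: ArticleMetrics) -> Dict[str, float]:
    """Turn raw counts into densities and ratios (guarding against division by zero)."""
    words = max(m.length_words, 1)
    editors = max(m.editors, 1)
    return {
        "citation_density": m.citations / words,
        "unique_citation_density": m.unique_citations / words,
        "reverts_per_editor": m.reverts / editors,
        "issue_tags": float(m.issue_tags),
    }


def weighted_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine metrics with caller-supplied weights.

    A weight for a metric that was not computed contributes 0.
    """
    return sum(w * metrics.get(name, 0.0) for name, w in weights.items())


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    article = ArticleMetrics(citations=30, unique_citations=24,
                             length_words=3000, editors=12,
                             reverts=6, issue_tags=1)
    # Each user picks the weights that suit their own purpose.
    my_weights = {"citation_density": 100.0,
                  "reverts_per_editor": -1.0,
                  "issue_tags": -0.5}
    print(weighted_score(derived_metrics(article), my_weights))
```

The point of keeping `weighted_score` separate is that the metric-generating tool stays neutral about what "quality" means, while each user supplies a different `weights` dictionary.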
Kerry
-----Original Message-----
From: wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Ziko van Dijk
Sent: Saturday, 25 October 2014 11:28 PM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Tool to find poorly written articles
Okay. What do you think of the wikibu tool from Switzerland? It treats the number of editors, readers, etc. as indicators of quality, or at least as a basis for discussion.
Kind regards Ziko
http://www.wikibu.ch/search.php?search=Frankfurter+Nationalversammlung
2014-10-25 14:44 GMT+02:00 Ditty Mathew dittyvkm@gmail.com:
Hi Ziko,
You are right. But if an article has very little content, or few references, few edits, few images, few links, etc., it is of poor quality. Based on these factors, we can to some extent determine the quality of an article.
with regards
Ditty
On Sat, Oct 25, 2014 at 8:23 AM, Ziko van Dijk zvandijk@gmail.com wrote:
Hello Ditty,
It is difficult for me to understand your question if you are not more specific about what you consider a "poorly written article". "Poorly" can refer here to many different things, like readability, grammar, balance, statements supported by 'sources', good division of knowledge over several articles etc.
I think that software tools can only give a hint, but the judgement (how "good" is an article) can be made only by a human, on the basis of concrete criteria for what is meant by "good", and for what target group. I tend to say that some Wikipedia articles are "good" for experts but at the same time unsuitable for the general public.
E.g., a software tool can count the words per sentence, but long sentences are not necessarily good or bad by themselves.
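A minimal sketch of such a count in Python, assuming naively that a sentence ends at '.', '!' or '?' followed by whitespace (so abbreviations such as "e.g." will be miscounted). The function names `words_per_sentence` and `mean_sentence_length` are made up for illustration.

```python
import re
from typing import List


def words_per_sentence(text: str) -> List[int]:
    """Return the word count of each sentence, using a naive sentence split."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def mean_sentence_length(text: str) -> float:
    """Average words per sentence; 0.0 for empty text."""
    counts = words_per_sentence(text)
    return sum(counts) / len(counts) if counts else 0.0
```

The number it returns is only a hint: it says how long the sentences are, not whether that length suits the article's target group.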
Etc. :-)
Kind regards Ziko
2014-10-25 1:47 GMT+02:00 Joe Corneli holtzermann17@gmail.com:
On Sat, Oct 25 2014, WereSpielChequers wrote:
And just to add to the complexity of James' comments; there are some people who think that a general interest encyclopaedia should be written for a general audience. So articles with long sentences should be improved by rewriting into more but shorter sentences,
How about an even simpler version of the problem: an encyclopedia written by robots for robots. I speak, of course, of DBPedia. We could equally ask, what makes for quality entries there?
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l