I think it's still some time off, but several open source, Watson-like machine reading tools are coming along, any one of which could be put to the task of interest here.  They could do more than that, but it's a start.


On Sat, Oct 25, 2014 at 1:41 PM, Kerry Raymond <kerry.raymond@gmail.com> wrote:
I think it's pointless to argue over what we mean by "quality" or "well
written" in general. It is fair to say that there are a lot of mechanically
derivable metrics for articles including:

* number of citations
* number of unique citations
* article length
* density of citations and of unique citations relative to article length
* ditto for photos, infoboxes, navboxes, categories etc.
* linguistic analysis like sentence length, Flesch-Kincaid readability
scores
* Age of article
* Number of editors
* Number of page views
* Density of ...
* number of reverts
* reverts per editor/year/etc ..
* number of inbound links, number of outbound links, number of redlinks
* manual quality assessments (usually in project tags)
* presence of "issue" tags, e.g. refimprove, citation needed, etc
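
To illustrate how mechanical a few of these are, here is a minimal Python
sketch that derives some of them from raw wikitext. The function name
article_metrics, the regular expressions and the choice of metrics are all
assumptions made for illustration; real wikitext has many more citation and
template forms than these patterns cover.

```python
import re

def article_metrics(wikitext: str) -> dict:
    """Crude metrics from raw wikitext (illustrative, not a real parser)."""
    # Every <ref ...> use counts as a citation, including reuses of a named ref.
    citations = len(re.findall(r"<ref[\s>/]", wikitext))
    # Distinct footnote bodies approximate the number of unique citations.
    bodies = re.findall(r"<ref\b[^>/]*>(.*?)</ref>", wikitext, flags=re.S)
    unique_citations = len({body.strip() for body in bodies})
    # Two example "issue" templates; the list is an assumption, extend as needed.
    issue_tags = len(re.findall(
        r"\{\{\s*(?:refimprove|citation needed)\b", wikitext, flags=re.I))
    # Strip footnotes and templates to approximate the readable prose.
    prose = re.sub(r"<ref\b[^>/]*>.*?</ref>", " ", wikitext, flags=re.S)
    prose = re.sub(r"<ref\b[^>]*/>", " ", prose)
    prose = re.sub(r"\{\{.*?\}\}", " ", prose, flags=re.S)
    words = re.findall(r"\w+", prose)
    sentences = [s for s in re.split(r"[.!?]+", prose) if s.strip()]
    return {
        "citations": citations,
        "unique_citations": unique_citations,
        "issue_tags": issue_tags,
        "words": len(words),
        # Guard against empty articles to avoid division by zero.
        "citation_density": citations / max(len(words), 1),
        "mean_sentence_length": len(words) / max(len(sentences), 1),
    }
```

Metrics such as page views, reverts or number of editors would have to come
from revision history and traffic data rather than from the article text.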

It seems to me that if we had a tool that could generate a wide range of
these sorts of metrics, folks could then put their own algorithm over the top
to compute and weight whatever combination of them makes sense for their
particular purpose.
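
One possible shape for such an "algorithm over the top" is a plain weighted
sum, sketched below in Python. The function name weighted_score, the metric
names and the weights are hypothetical; in practice the raw metrics sit on
very different scales and would probably need normalising first.

```python
def weighted_score(metrics: dict, weights: dict) -> float:
    """Combine per-article metrics into one score using caller-chosen weights.

    Metrics without a weight are ignored; weighted metrics that are missing
    from the article's data count as zero.
    """
    return sum(weight * metrics.get(name, 0.0)
               for name, weight in weights.items())


# Hypothetical example: reward citations, penalise reverts and issue tags.
example_metrics = {"citations": 12, "reverts": 3, "issue_tags": 2}
example_weights = {"citations": 1.0, "reverts": -0.5, "issue_tags": -2.0}
print(weighted_score(example_metrics, example_weights))
```

Each user would supply their own weights, so the same set of metrics could
serve different notions of "quality".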

Kerry

-----Original Message-----
From: wiki-research-l-bounces@lists.wikimedia.org
[mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Ziko van
Dijk
Sent: Saturday, 25 October 2014 11:28 PM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Tool to find poorly written articles

Okay. What do you think of the wikibu tool from Switzerland? It
treats the number of editors, readers etc. as indicators of
quality, or at least as a basis for discussion.
Kind regards
Ziko

http://www.wikibu.ch/search.php?search=Frankfurter+Nationalversammlung

2014-10-25 14:44 GMT+02:00 Ditty Mathew <dittyvkm@gmail.com>:
> Hi Ziko,
>
> You are right. But if an article has very little content, or has few
> references, few edits, few images, few links etc., the article is of
> poor quality. Based on these factors, we can to some extent determine
> the quality of an article.
>
> with regards
>
> Ditty
>
> On Sat, Oct 25, 2014 at 8:23 AM, Ziko van Dijk <zvandijk@gmail.com> wrote:
>>
>> Hello Ditty,
>>
>> It is difficult for me to understand your question if you are not more
>> specific about what you consider a "poorly written article". "Poorly" can
>> refer here to many different things, like readability, grammar,
>> balance, statements supported by 'sources', good division of knowledge
>> over several articles etc.
>>
>> I think that software tools can only give a hint, but the judgement
>> (how "good" is an article) can be made only by a human, on the basis
>> of concrete criteria for what is meant by "good", and for what target
>> group. I tend to say that some Wikipedia articles are "good" for
>> experts but at the same time unsuitable for the general public.
>>
>> E.g., a software tool can count the words per sentence, but long
>> sentences are not necessarily good or bad by themselves.
>>
>> Etc. :-)
>>
>> Kind regards
>> Ziko
>>
>> 2014-10-25 1:47 GMT+02:00 Joe Corneli <holtzermann17@gmail.com>:
>> >
>> > On Sat, Oct 25 2014, WereSpielChequers wrote:
>> >
>> >> And just to add to the complexity of James' comments; there are some
>> >> people who think that a general interest encyclopaedia should be
>> >> written for a general audience. So articles with long sentences should
>> >> be improved by rewriting into more but shorter sentences,
>> >
>> > How about an even simpler version of the problem: an encyclopedia
>> > written by robots for robots.  I speak, of course, of DBPedia.  We could
>> > equally ask, what makes for quality entries there?
>> >
>> > _______________________________________________
>> > Wiki-research-l mailing list
>> > Wiki-research-l@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l