No subject


Tue Mar 15 17:42:23 UTC 2011


On Fri, Apr 29, 2011 at 7:58 AM, Manish Goregaokar
<manishsmail at gmail.com> wrote:

> >  1. Select 200 random articles.
> >  2. Get the top contributors for each of them.
> >  3. Get the edit counts for those contributors.
>
> I think he has the list/s of 200 articles, and does not want random ones.
> Plus, he doesn't want the editcounts, he wants their top edited articles,
> with the editcount per article.
>
> My personal opinion is that this HAS to be done via php (though I can't
> comment on server load).
> Use php-mysql to determine the list of top contributors per given article,
> then loop for each contributor, and give *his* top edited articles...
> Shouldn't be hard, though you might want to clarify what you mean by
> "top". (Top 3? More than X edits? More than X% edits per
> day/week/month/beginning of time? More than X% edits of the top editor?).

Thanks again for the info. Yes, this is basically correct. I am looking to
collect this info based on 100 articles from the Wikipedia science series.
If the data proves relatively easy to collect, I would like to collect data
on all articles in the science series, which is around 200 articles. Top
contributors for me are those with 10 or more edits in the sampled article
from the science series. For the sake of clarity, here is a short sample of
the data I'm looking for.

From the "science" article http://en.wikipedia.org/wiki/Science

Clicking "view history" and then "contributors" gives a ranked list of all
contributors in order of most edits.

http://toolserver.org/~daniel/WikiSense/Contributors.php?wikilang=en&wikifam=.wikipedia.org&grouped=on&page=Science

The top three editors (let's call them A, B, and C) currently have 445, 73
and 70 edits respectively. Clicking on contributor "A" to see their user
page and then "user contributions" from the toolbox shows all their edits.
For example, they have several edits to the articles "intelligent design"
and "southern poverty law center", etc., and user "B" has edits to
"rock formations" and "human evolution". I would like to count the
frequency of all these edits across the top users for the sampled (e.g.
science) articles, sorted by article title.

I don't know what the best way to arrange the data would be, but below is a
Google Docs spreadsheet that roughly shows what I think it would look like.

http://goo.gl/VIWd6

If the Query Service seems the best approach (is this done using the
php-mysql referenced above, or is it a different process?) then I will go
ahead and create a task on https://jira.toolserver.org/browse/DBQ. If this
is not the best or correct way to go, any guidance is appreciated.

Thanks.

-- 
Jim
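As a rough illustration of the tally described in the message above (top
contributors with 10 or more edits in a sampled article, then their edits
counted per article and sorted by title), here is a minimal Python sketch.
It is only a sketch under stated assumptions: the function names
`top_contributors` and `edits_by_article` are hypothetical, the edit rows are
invented sample data rather than real counts, and in practice the
(article, user) pairs would have to come from the revision history by
whatever means the Toolserver allows.

```python
from collections import Counter, defaultdict


def top_contributors(edits, article, min_edits=10):
    """Return {user: edit_count} for users with at least min_edits edits to article."""
    counts = Counter(user for title, user in edits if title == article)
    return {user: n for user, n in counts.items() if n >= min_edits}


def edits_by_article(edits, users):
    """Count edits per article for the given users, keyed by article title (sorted)."""
    table = defaultdict(Counter)
    for title, user in edits:
        if user in users:
            table[title][user] += 1
    return dict(sorted(table.items()))


# Invented sample data: one (article_title, user) pair per edit.
sampled = ["Science"]
edits = (
    [("Science", "A")] * 12
    + [("Science", "B")] * 10
    + [("Science", "C")] * 3
    + [("Intelligent design", "A")] * 4
    + [("Human evolution", "B")] * 2
    + [("Human evolution", "C")] * 5
)

# Step 1: collect the top contributors across all sampled articles.
top = set()
for article in sampled:
    top |= set(top_contributors(edits, article))

# Step 2: tally every edit by those contributors, grouped by article title.
result = edits_by_article(edits, top)
for title, per_user in result.items():
    print(title, dict(sorted(per_user.items())))
```

The threshold is a parameter (`min_edits`) so that other definitions of "top"
raised in the thread, such as a fixed top 3, could be swapped in without
changing the second step.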



More information about the Toolserver-l mailing list