On 8 January 2015 at 02:31, Gergo Tisza gtisza@wikimedia.org wrote:
On Wed, Jan 7, 2015 at 6:26 PM, Oliver Keyes okeyes@wikimedia.org wrote:
places to get edits? Well... the revision table? I'm sort of confused as to what you're looking for, I guess, that the db wouldn't have.
There are a thousand or so wikis; it would be nice if there was a single table with all the edits. I guess I can generate a query with a thousand unions...
We've talked about having a unified db; the problem is some tables exist on some wikis and not on others (and keeping it synced up would be a pain). The most we have is a machine with all of the dbs on it, and a table containing all the dbnames; it should be fairly trivial to write something in Python or similar to iterate through them.
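A minimal sketch of that per-wiki loop, for illustration only. The helper name count_file_edits and the run_query callable are hypothetical; run_query stands in for whatever database client is actually used, and is assumed to take a dbname and an SQL string and return a single integer. The query assumes the standard MediaWiki revision and page tables, with File pages in namespace 6.

```python
from typing import Callable, Dict, Iterable

# Assumed query: count revisions made to pages in the File namespace (6).
FILE_EDITS_SQL = (
    "SELECT COUNT(*) FROM revision "
    "JOIN page ON rev_page = page_id "
    "WHERE page_namespace = 6"
)


def count_file_edits(
    dbnames: Iterable[str],
    run_query: Callable[[str, str], int],
) -> Dict[str, int]:
    """Run the same query against every wiki database and collect the results.

    run_query is a hypothetical callable supplied by the caller; it is
    expected to execute the SQL against the named database and return one
    integer.
    """
    counts: Dict[str, int] = {}
    for dbname in dbnames:
        try:
            counts[dbname] = run_query(dbname, FILE_EDITS_SQL)
        except Exception as exc:
            # Some tables exist on some wikis and not on others, so a
            # failure on one database should not abort the whole run.
            print(f"skipping {dbname}: {exc}")
    return counts
```

The list of dbnames would come from the table of database names mentioned above; how that table is read is left out here because its layout is not given in this thread.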
The harder problem is that it would be nice to group by editor activity levels. One of the concerns about MediaViewer was that it makes it harder for new editors to understand file pages and start editing them; so it would be a plausible hypothesis that the number of file edits by new editors would drop sharply after making MV default, but the total file edit count wouldn't be visibly affected because it would be dominated by power users who already know how to curate image metadata.
So I would like to look at something like the number of first edits per month, or the number of edits by editors who at the time had fewer than 10 edits... recovering that kind of data from the revision table seems extremely difficult.
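To make the two metrics concrete, here is a rough sketch of how they could be computed once the edits have been pulled out as (editor, timestamp) pairs. The function name newcomer_edit_stats is hypothetical, timestamps are assumed to be MediaWiki-style YYYYMMDDHHMMSS strings, and an editor's prior edit count is derived only from the rows passed in, so the input would need to contain each editor's full history for the counts to be meaningful. It says nothing about how expensive it is to extract those rows from the revision table in the first place.

```python
from collections import Counter, defaultdict
from typing import Iterable, Tuple


def newcomer_edit_stats(
    edits: Iterable[Tuple[str, str]],
    threshold: int = 10,
) -> Tuple[Counter, Counter]:
    """Return (first edits per month, edits by low-count editors per month).

    edits holds (editor, timestamp) pairs, with timestamps assumed to be
    strings in YYYYMMDDHHMMSS form. Months are keyed as YYYYMM.
    """
    first_edits: Counter = Counter()
    newcomer_edits: Counter = Counter()
    prior_count = defaultdict(int)  # edits seen so far, per editor

    # Process edits in chronological order so that prior_count reflects
    # how many edits the editor had made at the time of each edit.
    for editor, timestamp in sorted(edits, key=lambda e: e[1]):
        month = timestamp[:6]
        if prior_count[editor] == 0:
            first_edits[month] += 1
        if prior_count[editor] < threshold:
            newcomer_edits[month] += 1
        prior_count[editor] += 1

    return first_edits, newcomer_edits
```

Comparing the monthly newcomer counts before and after the month MV became the default would then be a matter of reading the two counters side by side.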
Yeah, that is difficult. Aaron has, I believe, precomputed some things; Aaron?
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics