Alright, having screwed with it I've got it up to...21k (and I replicated Ryan's work, with a couple of tweaks to explicitly exclude log actions and namespace !0 - same results. So something is hooky here :/). At this stage I'm genuinely not sure what could be going wrong; working attached if anyone wants to point out my blunder.
On 31 January 2013 03:55, Matthew Flaschen <mflaschen@wikimedia.org> wrote:
On 01/30/2013 06:43 PM, Oliver Keyes wrote:
> So: attached, data - everyone with >5 actions in the recentchanges
> table. Now, the result-set is only ~7,000 entries long, which I'm
> /preeetty/ sure is unreliable somehow, but I've applied a decade and a
> half of collected comp sci studies and around 3 decades of practical
> experience to the problem and they've all gone 'er. no idea. It should
> work'. If anyone else can spot what's going wrong, most appreciated :).
I asked Ryan Faulkner to take a look, and he did indeed get higher
numbers of users:
User ids that had at least 5 edits in the last 30 days
select count(*) from (select rc_user, count(*) as revs from
enwiki.recentchanges where rc_timestamp >= '20130101000000' and
rc_timestamp < '20130131000000' group by 1 having revs >= 5) as t
He said it was 32511 for ns0 (main namespace).
He didn't try to check the skin info in the query, so far.
Tried a LEFT OUTER JOIN and it produced all of 5k more results :/. I'll look at it with fresh eyes in the morning, unless anyone wants to get there first (sorry for taking so much of your time with what should be a pretty simple issue)
I think the issue is that there is no row in user_properties if they did
not change their skin. From
https://www.mediawiki.org/wiki/Manual:User_properties_table:
"Only non-default settings are stored, so changes to the defaults are
now reflected for everybody that hasn't saved an alternative preference,
not only new accounts."
So the query seems to miss people who just always left the default skin.
If I'm understanding this correctly, it has to be an outer join,
defaulting to vector (the default on enwiki).
Matt Flaschen
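For anyone following along, here is a minimal sketch of the outer-join-with-default idea described above, run against a toy in-memory SQLite database rather than the real enwiki replica. The table and column names (recentchanges.rc_user, user_properties.up_user / up_property / up_value) follow the MediaWiki schema, but the sample rows, the 'skin' property filter, and the final query shape are assumptions for illustration, not the query anyone in this thread actually ran. One thing it tries to show: the up_property = 'skin' condition sits in the ON clause. Putting it in WHERE would discard the NULL rows and quietly turn the outer join back into an inner join.

```python
import sqlite3

# Toy stand-in for the enwiki tables; only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recentchanges (rc_user INTEGER, rc_timestamp TEXT);
CREATE TABLE user_properties (up_user INTEGER, up_property TEXT, up_value TEXT);
""")

# Hypothetical sample data:
#   user 1: 5 edits, never touched the skin preference (no user_properties row)
#   user 2: 6 edits, explicitly saved 'monobook'
#   user 3: 2 edits, below the activity threshold
edits = (
    [(1, "20130105000000")] * 5
    + [(2, "20130110000000")] * 6
    + [(3, "20130112000000")] * 2
)
conn.executemany("INSERT INTO recentchanges VALUES (?, ?)", edits)
conn.execute("INSERT INTO user_properties VALUES (2, 'skin', 'monobook')")

QUERY = """
SELECT COALESCE(up.up_value, 'vector') AS skin, COUNT(*) AS users
FROM (
    SELECT rc_user
    FROM recentchanges
    WHERE rc_timestamp >= '20130101000000'
      AND rc_timestamp <  '20130131000000'
    GROUP BY rc_user
    HAVING COUNT(*) >= 5
) AS active
LEFT OUTER JOIN user_properties AS up
    ON up.up_user = active.rc_user
   AND up.up_property = 'skin'   -- in ON, not WHERE, so unmatched users survive
GROUP BY COALESCE(up.up_value, 'vector')
ORDER BY 1
"""

# Users with no stored skin row are counted under the assumed default, 'vector'.
skin_counts = dict(conn.execute(QUERY).fetchall())
print(skin_counts)
```

With the toy rows above, user 1 is counted under the default skin and user 3 is dropped for having too few edits. The same query shape should carry over to MySQL, though I have not run it against the replica.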
--
Oliver Keyes
Community Liaison, Product Development
Wikimedia Foundation