[Labs-l] db replica query weirdness

Huji Lee huji.huji at gmail.com
Mon Aug 3 04:33:56 UTC 2015


You are seeing those gaps because the method you use to calculate account
age is not correct. Run this query:

select user_id, user_registration,
(now() - str_to_date(user_registration, '%Y%m%d%H%i%s'))/86400 as
registered_time
from user
where user_id >= 25650026
and user_id <= 25650027;

And you will see that those two accounts which are created back to back,
one at the end of 6/30/2015 and one at the beginning of 7/1/2015 will
return an account age of ~2000 and ~1181 respectively (two numbers that are
not that close).

Instead, run this:

select user_id, user_registration,
datediff(now(), str_to_date(user_registration, '%Y%m%d%H%i%s')) as acct_age
from user
where user_id >= 25650026
and user_id <= 25650027;

And you will get the account age in days (33 and 34 days in this example).


On Sun, Aug 2, 2015 at 1:12 PM, Golden Ring <goldenring.wp at gmail.com> wrote:

> I'm hoping that someone here can explain some query weirdness to me.
>
> I'm trying to get a list of the n most-recently-registered users who have
> done page moves, and how many page moves they've done.  To do this, I'm
> using this query:
>
> select
>        user_id,
>        user_name,
>        (now() - str_to_date(user_registration, '%Y%m%d%H%i%s'))/86400 as
> registered_time,
>        count(log_id) as moves
> from
>        user,
>        logging_userindex
> where
>        user_id = log_user
>   and (log_action = 'move' or log_action = 'move_redir')
> group by user_id
> order by user_id desc
> limit {};
>
> For a limit of about 100 rows, this gives plausible-looking results.  But
> increasing the limit to 1000 gives some pretty bizarre results.  There are
> big gaps in the registered_time variable.  The range of the registered_time
> variable is about 0 - 4,000 days, but there are no results with registered
> times between approx 350 - 1160 days, 1500 - 2300 days, or 2670 to 3480
> days.
>
> July 2013 is about 700 days ago, but a query to find users with page moves
> who registered on July 15, 2013 shows that such users do exist:
>
> select count(*) from user, logging_userindex where user_id = log_user and
> log_action='move' and user_registration like '20130715%';
>
> Result: 63 move actions from users registered on 20130715.
>
> What's going on here?  I assume I've simply got the first query wrong in
> some way, but I can't see how.
>
> Note that all of the above is against the enwiki replica.
>
> Thanks for any help,
> GoldenRing
>
>
> _______________________________________________
> Labs-l mailing list
> Labs-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.wikimedia.org/pipermail/labs-l/attachments/20150803/6d1e2a27/attachment.html>


More information about the Labs-l mailing list