<div dir="ltr"><div><div><div>You are seeing those gaps because the method you use to calculate account age is not correct. Run this query:<br><br>select user_id, user_registration,<br>(now() - str_to_date(user_registration, '%Y%m%d%H%i%s'))/86400 as registered_time<br>from user<br>where user_id >= 25650026<br>and user_id <= 25650027;<br><br></div>And you will see that those two accounts which are created back to back, one at the end of 6/30/2015 and one at the beginning of 7/1/2015 will return an account age of ~2000 and ~1181 respectively (two numbers that are not that close).<br><br></div>Instead, run this:<br><br>select user_id, user_registration,<br>datediff(now(), str_to_date(user_registration, '%Y%m%d%H%i%s')) as acct_age<br>from user<br>where user_id >= 25650026<br>and user_id <= 25650027;<br><br></div>And you will get the account age in days (33 and 34 days in this example).<br><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Aug 2, 2015 at 1:12 PM, Golden Ring <span dir="ltr"><<a href="mailto:goldenring.wp@gmail.com" target="_blank">goldenring.wp@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I'm hoping that someone here can explain some query weirdness to me.<div><br></div><div>I'm trying to get a list of the n most-recently-registered users who have done page moves, and how many page moves they've done. To do this, I'm using this query:</div><div><br></div><div><div>select</div><div> user_id,</div><div> user_name,</div><div> (now() - str_to_date(user_registration, '%Y%m%d%H%i%s'))/86400 as registered_time,</div><div> count(log_id) as moves</div><div>from</div><div> user,</div><div> logging_userindex</div><div>where</div><div> user_id = log_user</div><div> and (log_action = 'move' or log_action = 'move_redir')</div><div>group by user_id</div><div>order by user_id desc</div><div>limit {}; </div></div><div>For a limit of about 100 rows, this gives plausible-looking results. But increasing the limit to 1000 gives some pretty bizarre results. There are big gaps in the registered_time variable. The range of the registered_time variable is about 0 - 4,000 days, but there are no results with registered times between approx 350 - 1160 days, 1500 - 2300 days, or 2670 to 3480 days.</div><div><br></div><div>July 2013 is about 700 days ago, but a query to find users with page moves who registered on July 15, 2013 shows that such users do exist:</div><div><br></div><div>select count(*) from user, logging_userindex where user_id = log_user and log_action='move' and user_registration like '20130715%';<br></div><div><br></div><div>Result: 63 move actions from users registered on 20130715.</div><div><br></div><div>What's going on here? I assume I've simply got the first query wrong in some way, but I can't see how.</div><div><br></div><div>Note that all of the above is against the enwiki replica.</div><div><br></div><div>Thanks for any help,</div><div>GoldenRing</div><div><br></div></div>
<br>_______________________________________________<br>
Labs-l mailing list<br>
<a href="mailto:Labs-l@lists.wikimedia.org">Labs-l@lists.wikimedia.org</a><br>
<a href="https://lists.wikimedia.org/mailman/listinfo/labs-l" rel="noreferrer" target="_blank">https://lists.wikimedia.org/mailman/listinfo/labs-l</a><br>
<br></blockquote></div><br></div>