<div dir="ltr">> How about doing this per wiki?<br><br>That is as easy as joining this with the meta tables, I challenge all of you to see who can make it faster :-)</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 26, 2015 at 6:49 PM, Yetkin Sakal <span dir="ltr"><<a href="mailto:superyetkin@yahoo.com" target="_blank">superyetkin@yahoo.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div style="color:#000;background-color:#fff;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,sans-serif;font-size:13px">How about doing this per wiki?<div><div class="h5"><br><div><br><span></span></div><br><div><br><br></div><div style="display:block"> <div style="font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,sans-serif;font-size:13px"> <div style="font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,sans-serif;font-size:16px"> <div dir="ltr"><font face="Arial" size="2"> On Thursday, November 26, 2015 10:19 AM, Jaime Crespo <<a href="mailto:jcrespo@wikimedia.org" target="_blank">jcrespo@wikimedia.org</a>> wrote:<br></font></div> <br><br> <div><div><div><div dir="ltr"><div><div><div>> So even if the replicas don't get updated the heartbeat will report
them as up to date?<br clear="none"><br clear="none"></div>Not sure exactly what you mean by that. The masters will be updated continuously every 0.5 seconds (all slaves are read-only; no writes are done there). If replication works, and slaves get updated, that will mean that they receive the heartbeat through the same replication channel as the rest of the updates. If replication doesn't work, and replicas do not get updated, they will not receive the heartbeat either, as it arrives through replication, in order. If replication stops/fails, heartbeat updates will stop (from the slave's perspective), and lag will start to increase from your perspective (difference between last timestamp written and current time).<br clear="none"><br clear="none"></div>This measures the replication lag (aka difference with the master), not the last time an edit was done by a user, which was what the first link I sent measured. In other words, if jaimewiki receives user edits only once an hour, heartbeat will still do a write to its master every half second, thus proving that it is up to date with that resolution. You can still check the last user edit by checking recentchanges.<br clear="none"><br clear="none">The only way this could fail (heartbeat updated but wiki not) is if there were a specific filter denying replication but allowing heartbeat, which is only done for specific tables and private wikis. 
Also the production master could have a problem, but that would affect the wikis themselves, not only labs.<br clear="none"><br clear="none"></div><div>To give you an idea of the accuracy of this method, we (will) use it on production to decide whether a slave is usable for returning up-to-date data.<br clear="none"></div><div><br clear="none"></div>For more information on how this works, check <<a rel="nofollow" shape="rect" href="https://www.percona.com/doc/percona-toolkit/2.1/pt-heartbeat.html#description" target="_blank">https://www.percona.com/doc/percona-toolkit/2.1/pt-heartbeat.html#description</a>><br clear="none"></div><div><br clear="none"><div>On Wed, Nov 25, 2015 at 9:51 PM, Ricordisamoa <span dir="ltr"><<a rel="nofollow" shape="rect" href="mailto:ricordisamoa@openmailbox.org" target="_blank">ricordisamoa@openmailbox.org</a>></span> wrote:<br clear="none"><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div><span>
On 25/11/2015 21:21, Jaime Crespo wrote:<br clear="none">
</span><blockquote type="cite">
<div dir="ltr">
<div>
<div>Always afraid of running queries on a lagged replica on labs?
Not anymore!<br clear="none">
<br clear="none">
</div>
While Betacommand's tool [0] was very useful, it was also very
inaccurate, as it estimated the lag from the most recently
updated rows, which can be a long time ago on the least
popular wikis.<br clear="none">
<br clear="none">
</div>
What I offer now is sub-second accurate lag measuring, by
writing the current time, in microseconds, to the production
masters every 0.5 seconds and making that available on all
hosts (using this tool [1]). So, it is more accurate than SHOW
SLAVE STATUS, because it measures the difference against the
original master, and it will work even if replication is broken.<br clear="all">
</div>
</blockquote>
<br clear="none">
So even if the replicas don't get updated the heartbeat will report
them as up to date?<br clear="none">
<br clear="none">
<blockquote type="cite"><div><div>
<div dir="ltr">
<div>
<div>
<div><br clear="none">
</div>
<div>To read it, just do SELECT * FROM
heartbeat_p.heartbeat;<br clear="none">
</div>
<div>And you will get:<br clear="none">
+-------+----------------------------+------+<br clear="none">
| shard | last_updated | lag |<br clear="none">
+-------+----------------------------+------+<br clear="none">
| s6 | 2015-11-25T20:20:32.000980 | 0 |<br clear="none">
| s2 | 2015-11-25T20:20:32.001030 | 0 |<br clear="none">
| s7 | 2015-11-25T20:20:32.001070 | 0 |<br clear="none">
| s3 | 2015-11-25T20:20:32.001000 | 0 |<br clear="none">
| s4 | 2015-11-25T20:20:32.000920 | 0 |<br clear="none">
| s1 | 2015-11-25T20:20:32.000740 | 0 |<br clear="none">
| s5 | 2015-11-25T20:20:32.000830 | 0 |<br clear="none">
+-------+----------------------------+------+<br clear="none">
</div>
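<div>A minimal sketch of how the lag follows from these rows: it is the difference between the last timestamp written and the current time. The helper name heartbeat_lag_seconds is hypothetical, and treating last_updated as UTC is an assumption, not something stated in this thread.<br clear="none"></div>

```python
from datetime import datetime, timezone


def heartbeat_lag_seconds(last_updated, now=None):
    """Return the seconds elapsed between a heartbeat timestamp and `now`."""
    # Assumption: last_updated is an ISO 8601 string expressed in UTC,
    # in the same shape as the sample output above.
    written = datetime.fromisoformat(last_updated).replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # Clamp at zero so a small clock skew never yields a negative lag.
    return max(0.0, (now - written).total_seconds())


# Two (shard, last_updated) rows copied from the sample output above.
rows = [
    ("s1", "2015-11-25T20:20:32.000740"),
    ("s6", "2015-11-25T20:20:32.000980"),
]

# A fixed "current time" keeps the example reproducible.
now = datetime(2015, 11, 25, 20, 20, 32, 500000, tzinfo=timezone.utc)
lags = {shard: heartbeat_lag_seconds(ts, now) for shard, ts in rows}
```

<div>With a 0.5 second write interval, a healthy replica should show a value that stays below roughly half a second plus the replication delay; a value that keeps growing means heartbeat rows have stopped arriving.<br clear="none"></div>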
<div><br clear="none">
Read the detailed documentation on: [2]<br clear="none">
<br clear="none">
</div>
<div>Use it, create a web page if you want to make it
public! File a ticket if the lag gets too high! File a
ticket if you need more info (a record per wiki?). But I
wanted to give you the essentials, and you can build
on top of that yourselves.<br clear="none">
<br clear="none">
</div>
<div>Only 2 known bugs:<br clear="none">
</div>
<div>- There is microsecond accuracy, but it cannot be used
until a bug in MariaDB is fixed [3]<br clear="none">
</div>
<div>- enwiki will only report s1 lag until that server is
restarted due to some existing filters. We will schedule
that at some time in the future.<br clear="none">
</div>
<div><br clear="none">
[0]<<a rel="nofollow" shape="rect" href="http://tools.wmflabs.org/betacommand-dev/cgi-bin/replag" target="_blank">http://tools.wmflabs.org/betacommand-dev/cgi-bin/replag</a>><br clear="none">
[1]<<a rel="nofollow" shape="rect" href="https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html" target="_blank">https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html</a>><br clear="none">
[2]<<a rel="nofollow" shape="rect" href="https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#Identifying_lag" target="_blank">https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#Identifying_lag</a>><br clear="none">
[3]<<a rel="nofollow" shape="rect" href="https://mariadb.atlassian.net/browse/MDEV-9175" target="_blank">https://mariadb.atlassian.net/browse/MDEV-9175</a>><br clear="none">
-- <br clear="none">
<div>
<div dir="ltr">
<div>Jaime Crespo<br clear="none">
</div>
<<a rel="nofollow" shape="rect" href="http://wikimedia.org/" target="_blank">http://wikimedia.org</a>><br clear="none">
</div>
</div>
</div>
</div>
</div>
</div>
<br clear="none">
<fieldset></fieldset>
<br clear="none">
</div></div><pre>_______________________________________________
Labs-l mailing list
<a rel="nofollow" shape="rect" href="mailto:Labs-l@lists.wikimedia.org" target="_blank">Labs-l@lists.wikimedia.org</a>
<a rel="nofollow" shape="rect" href="https://lists.wikimedia.org/mailman/listinfo/labs-l" target="_blank">https://lists.wikimedia.org/mailman/listinfo/labs-l</a>
</pre>
</blockquote>
</div></div>
<br clear="none"></blockquote></div><br clear="none"><br clear="all"><br clear="none">-- <br clear="none"><div><div dir="ltr"><div>Jaime Crespo<br clear="none"></div><<a rel="nofollow" shape="rect" href="http://wikimedia.org/" target="_blank">http://wikimedia.org</a>><br clear="none"></div></div>
</div></div></div><br><br><br></div> </div> </div> </div></div></div></div></div><br>
<br></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><div>Jaime Crespo<br></div><<a href="http://wikimedia.org" target="_blank">http://wikimedia.org</a>><br></div></div>
</div>