Hi Risker!
On Wed, Feb 16, 2022 at 5:52 PM Risker <risker.wp@gmail.com> wrote:
Thank you very much for sharing this data, Tyler (and to the team that researched and analysed it, as well). I think it shows that the train has been pretty successful in mitigating the issues it was intended to improve.
I think so, too :)
I note the data points that show there has been a significant and clear trend toward fewer comments per patch. This would be worth investigating further. Is the total number of reviews pretty consistent, or is it increasing or decreasing? Is it possible that developers have become more proficient at writing patches to standard, and thus fewer comments are required? Or could it be that, because more time is invested in writing patches (assuming that more patches = more time writing them), there is less time for review?
I'll preface my comments with the caveat: I am (definitely) not a data scientist.
I think we need to investigate more to say anything definitive. And I love that this data enables us to have a conversation about what to investigate next.
The comments per patch trend comes from the number of comments per patch averaged over a whole train. Outliers could be affecting the average (for instance, there is one patch[0] from 2015 with 354 comments).
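To make the outlier concern concrete, here is a minimal sketch comparing the mean and the median for a single train. The comment counts are made up for illustration (only the 354 echoes the real outlier patch), and the variable names are hypothetical rather than anything from the train-stats tooling.

```python
from statistics import mean, median

# Hypothetical comment counts for the patches in one train (made-up data).
# The final value mirrors the 354-comment outlier mentioned above.
comments_per_patch = [0, 1, 1, 2, 2, 3, 4, 354]

# The mean is pulled far upward by the single outlier...
train_mean = mean(comments_per_patch)

# ...while the median still reflects a "typical" patch in the train.
train_median = median(comments_per_patch)

print(f"mean comments per patch:   {train_mean}")
print(f"median comments per patch: {train_median}")
```

With these made-up numbers the mean lands near 46 while the median is 2, so a few heavily discussed patches in older trains could, in principle, account for part of the apparent downward trend. Comparing per-train medians against per-train means might be one way to check.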
Another possible explanation is: as we've added more bots over time, my simple tools to filter out bot noise are proving insufficient.
I've only begun to explore this trend[1]. I'll keep folks posted and I invite others to explore along with me!
Thanks!
-- Tyler
[0]: https://data.releng.team/train?sql=select+*+from+patch+order+by+comments+desc
[1]: https://gitlab.wikimedia.org/thcipriani/train-stats#a-look-at-comments-per-patch