(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db, the next step would be to follow Dan’s instructions to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know whether the validation section shows Valid?
Leila
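As a rough illustration of the three parts Leila describes (URL-encoded JSON, the event record, and a validation result), here is a minimal Python sketch. It uses a record shaped like the sample quoted later in this thread. The names REQUIRED_EVENT_FIELDS, decode_event and looks_valid are assumptions made up for this example; this is not the EventLogging validator and it does not consult the real ContentTranslation schema.

```python
import json
from urllib.parse import quote, unquote

# Hypothetical field list, copied from the sample record in this thread;
# the real schema on meta.wikimedia.org may differ.
REQUIRED_EVENT_FIELDS = {"action", "targetLanguage", "contentLanguage", "token", "version"}


def decode_event(encoded: str) -> dict:
    """Turn a URL-encoded JSON string back into a Python dict."""
    return json.loads(unquote(encoded))


def looks_valid(record: dict, schema: str = "ContentTranslation") -> bool:
    """Rough stand-in for schema validation: schema name and field presence only."""
    if record.get("schema") != schema:
        return False
    event = record.get("event", {})
    return REQUIRED_EVENT_FIELDS.issubset(event)


record = {
    "schema": "ContentTranslation",
    "revision": 7146627,
    "event": {
        "action": "create-translated-page",
        "targetLanguage": "ca",
        "contentLanguage": "es",
        "token": "Tester",
        "version": 1,
    },
}

encoded = quote(json.dumps(record))  # roughly what a browser might send
decoded = decode_event(encoded)      # the event record
print(encoded[:40] + "...")
print(decoded["event"]["action"])
print("Valid" if looks_valid(decoded) else "Invalid")
```

If only the middle part (the event record) shows up, as in Joel's case, a check along these lines can at least confirm that the record carries the expected schema name and fields.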
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel,
For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates only if you are not logging all events, in order to determine the rate at which you are sampling.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote: Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot. Joel
On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
@Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
Here is the story of that page as I'm aware of it:
In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics. In that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is at the top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
In a meeting between Amir and Howie, they captured what should be the most important metrics from the product perspective in the "High priorities for product management" section. If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
In short: The analytics page has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process for detecting better metrics.
Pau
On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi All,
I have been reviewing our requirements for Content translation analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only just turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
@Leila, can you verify that the db table now exists for the ContentTranslation schema? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also, you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
@Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
@Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
@Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
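For the "articles created per week by language" count that Joel asks Dan about, the aggregation itself can be sketched in a few lines of Python. This is only an illustration under assumptions: it presumes the events are available as dicts shaped like the sample record in this thread (a Unix "timestamp" plus an "event" with "action" and "targetLanguage"), and the function name weekly_counts is made up. It is not the export script Santhosh mentions and says nothing about how Limn expects its data.

```python
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(records):
    """Count create-translated-page events per (ISO year, ISO week, target language)."""
    counts = Counter()
    for rec in records:
        event = rec["event"]
        if event.get("action") != "create-translated-page":
            continue  # ignore any other actions that might be logged
        when = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        iso = when.isocalendar()
        counts[(iso[0], iso[1], event["targetLanguage"])] += 1
    return counts


BASE = 1415651367  # timestamp taken from the sample record in this thread
DAY = 86400

# Made-up records for illustration only.
records = [
    {"timestamp": BASE, "event": {"action": "create-translated-page", "targetLanguage": "ca"}},
    {"timestamp": BASE + DAY, "event": {"action": "create-translated-page", "targetLanguage": "ca"}},
    {"timestamp": BASE + 7 * DAY, "event": {"action": "create-translated-page", "targetLanguage": "es"}},
    {"timestamp": BASE, "event": {"action": "some-other-action", "targetLanguage": "ca"}},
]

counts = weekly_counts(records)
for (year, week, language), total in sorted(counts.items()):
    print(year, week, language, total)
```

Grouping by ISO week keeps each bucket a full Monday-to-Sunday span, which is one reasonable reading of "per week"; a different week definition would only change the key.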
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
-- Pau Giner Interaction Designer Wikimedia Foundation
Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif http://bits.beta.wmflabs.org/event.gif?foo=bar which, while it does not look to be Varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db, the next step would be to follow Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know whether the validation section shows Valid?
Leila
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates only if you are not logging all events, in order to determine the rate at which you are sampling.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot. Joel
On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
@Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
Here is the story of that page as I'm aware of it:
In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. In that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is at the top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
In a meeting between Amir and Howie, they captured what should be the most important metrics from the product perspective in the "High priorities for product management" section. If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
In short: The analytics page https://www.mediawiki.org/wiki/Content_translation/analytics has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process http://www.rodden.org/kerry/heart/ for detecting better metrics.
Pau
On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi All,
I have been reviewing our requirements for Content translation analytics https://www.mediawiki.org/wiki/Content_translation/analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only just turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
@Leila, can you verify that the db table now exists for the ContentTranslation schema https://meta.wikimedia.org/wiki/Schema:ContentTranslation? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also, you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
@Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card https://trello.com/c/vQm0hlkt/18-content-translation-analytics. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
@Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
@Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
-- Pau Giner Interaction Designer Wikimedia Foundation
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a Varnish instance, so the flow of EL events "should" be the same as in production.
I have now looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
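The check Nuria describes (look at the most recent event and see how old it is) can be sketched as follows. This is an illustrative Python snippet only: the function names, the one-day threshold, and the assumption that the "Aug 22" events are from 2014 are all made up for the example, and it does not reflect how deployment-eventlogging02 actually stores or exposes events.

```python
from datetime import datetime, timedelta, timezone


def newest_event_age(timestamps, now):
    """Return how long ago the most recent event arrived, or None if there are none."""
    if not timestamps:
        return None
    newest = datetime.fromtimestamp(max(timestamps), tz=timezone.utc)
    return now - newest


def is_stale(timestamps, now, limit=timedelta(days=1)):
    """True when no events exist or the newest one is older than the limit."""
    age = newest_event_age(timestamps, now)
    return age is None or age > limit


# Assumed dates: last event on Aug 22 (year assumed to be 2014), checked on Nov 12, 2014.
last_seen = datetime(2014, 8, 22, tzinfo=timezone.utc).timestamp()
now = datetime(2014, 11, 12, tzinfo=timezone.utc)

print(newest_event_age([last_seen], now))
print(is_stale([last_seen], now))
```

A gap of this size would suggest that events are not reaching the machine at all, rather than that a single schema is failing validation.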
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif http://bits.beta.wmflabs.org/event.gif?foo=bar which, while it does not look to be Varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send teh url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxstöm has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db the next step would be to follow with Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if the validation section shows Valid?
Leila
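The three pieces Leila describes (URL-encoded JSON, the decoded event record, and a validity verdict) can be illustrated with a minimal Python sketch. It assumes the event is carried as URL-encoded JSON in the query string of the event.gif URL mentioned in this thread. The `REQUIRED` table is a hypothetical stand-in and not the real ContentTranslation schema, and `encode_beacon`, `decode_beacon` and `validate` are illustrative names, not EventLogging functions.

```python
import json
from urllib.parse import quote, unquote

# Endpoint quoted in this thread; the query-string layout is an assumption.
BEACON = "http://bits.beta.wmflabs.org/event.gif"

# Hypothetical required fields, NOT the real ContentTranslation schema.
REQUIRED = {
    "action": str,
    "targetLanguage": str,
    "contentLanguage": str,
    "version": int,
}


def encode_beacon(capsule):
    """Serialize the capsule to compact JSON and URL-encode it as the query string."""
    payload = json.dumps(capsule, separators=(",", ":"))
    return BEACON + "?" + quote(payload, safe="")


def decode_beacon(url):
    """Recover the capsule from the URL-encoded query string."""
    query = url.split("?", 1)[1]
    return json.loads(unquote(query))


def validate(event):
    """Return a list of problems; an empty list means the event passes this toy check."""
    problems = []
    for name, kind in REQUIRED.items():
        if name not in event:
            problems.append("missing " + name)
        elif not isinstance(event[name], kind):
            problems.append("wrong type for " + name)
    return problems


capsule = {
    "schema": "ContentTranslation",
    "revision": 7146627,
    "event": {
        "action": "create-translated-page",
        "targetLanguage": "ca",
        "contentLanguage": "es",
        "token": "Tester",
        "version": 1,
    },
}

url = encode_beacon(capsule)
decoded = decode_beacon(url)
print(url)                                             # URL-encoded JSON
print(decoded)                                         # event record
print("Valid" if not validate(decoded["event"]) else "Invalid")
```

If a test setup prints only the middle piece, as reported earlier in the thread, a round trip like this one can help tell whether the encoding step or the validation step is the part that is missing.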
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
For questions like these going forward you can contact analytics@, as you will get a more prompt response. Both Dan and Leila are OOTO the next couple of days.
> There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates only if you are not logging all events, that is, to determine the rate at which you are sampling.
Thanks,
Nuria
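To illustrate why the sampling rate matters when not every event is logged, here is a minimal Python sketch of deterministic 1-in-N sampling. This is not how EventLogging or ContentTranslation samples; the hashing approach and the names `in_sample` and `scale_up` are assumptions made for the example.

```python
import hashlib


def in_sample(token, rate):
    """Return True for roughly 1 in `rate` tokens, deterministically per token."""
    if rate <= 1:
        return True  # a rate of 1 means every event is logged (1 to 1)
    digest = hashlib.sha1(token.encode("utf-8")).hexdigest()
    return int(digest, 16) % rate == 0


def scale_up(logged_count, rate):
    """Estimate the true number of events from the number actually logged."""
    return logged_count * rate


# Hypothetical user tokens, used only to exercise the sampler.
tokens = ["user-%d" % i for i in range(10000)]
logged = sum(in_sample(t, 10) for t in tokens)
print(logged, scale_up(logged, 10))
```

Without knowing the rate, the logged count cannot be scaled back up to an estimate of real activity, which is why it needs to be recorded whenever logging is not 1 to 1.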
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
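For readers who want to inspect a record like the one above, the following Python sketch parses the sample output and pulls out the fields that matter for counting created articles. The record is copied from this message; the helper name `summarize` and the choice of fields are illustrative assumptions.

```python
import json
from datetime import datetime, timezone

# Sample record copied from the message above.
RAW = (
    '{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", '
    '"webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, '
    '"recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, '
    '"clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", '
    '"schema": "ContentTranslation", '
    '"event": {"action": "create-translated-page", "targetLanguage": "ca", '
    '"token": "Tester", "version": 1, "contentLanguage": "es"}, '
    '"revision": 7146627}'
)


def summarize(raw):
    """Parse one event record and return the fields of interest."""
    record = json.loads(raw)
    event = record["event"]
    return {
        "schema": record["schema"],
        "revision": record["revision"],
        "action": event["action"],
        "content_language": event["contentLanguage"],
        "target_language": event["targetLanguage"],
        # The timestamp is treated as seconds since the Unix epoch.
        "when": datetime.fromtimestamp(record["timestamp"], tz=timezone.utc),
    }


summary = summarize(RAW)
print(summary)
```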
Are there additional configuration options we need to add to get EL working, aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
Joel
On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
> @Pau, @Amir There is a section called High priorities for product management (https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management) on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
Here is the story of that page as I'm aware of it:
In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics (https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing). In that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is at the top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
In a meeting between Amir and Howie, they captured what should be the most important metrics from the product perspective in the "High priorities for product management" section. If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
In short: the analytics page (https://www.mediawiki.org/wiki/Content_translation/analytics) has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, so that we can learn from the value we get from it and polish the rest of our goal-to-signal process (http://www.rodden.org/kerry/heart/) for detecting better metrics.
Pau
On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi All,
I have been reviewing our requirements for Content translation analytics (https://www.mediawiki.org/wiki/Content_translation/analytics) and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something that Amir has already set up our current Event Logging (https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js) to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only just turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
@Leila, can you verify that the db table now exists for the ContentTranslation schema (https://meta.wikimedia.org/wiki/Schema:ContentTranslation)? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also, you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
@Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card (https://trello.com/c/vQm0hlkt/18-content-translation-analytics). You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
@Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so, can you share that please, in case we need a similar script for CX, so I don’t have to write a new script from scratch?
@Pau, @Amir There is a section called High priorities for product management (https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management) on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
-- Pau Giner Interaction Designer Wikimedia Foundation
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a Varnish instance, so the flow of EL events "should" be the same as in production.
I have now looked at deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
> Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
To confirm: you have seen proper logging from your events in Vagrant, right?
Did you set up a sampling rate, or is the code logging 1 to 1?
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking last statement back, asked Yuvi and beta does have a varnish instance so the flow of EL events "should" be the same one that production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue on the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: Beta setup post events to http://bits.beta.wmflabs.org/event.gif http://bits.beta.wmflabs.org/event.gif?foo=bar that, while it does not look to be varnish, has some kind of listener that post those events to beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send teh url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxstöm has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db the next step would be to follow with Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if validation section shows Valid?
Leila
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
For questions like these going forward you can contact analytics@ as you will be getting amore prompt response. Both Dan and Leila are OOTO the next couple of days.
There are configuration options for the dev server that need to be
added. Do similar options need to be added when not using the dev server? No, there is no need.
You would need sample rates to determine at which sampling rate you are logging if you are not logging all events, that is.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu < dandreescu@wikimedia.org> wrote:
Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file. There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
> Thank you for the information, Pau. Very helpful. As you say, this > does not change our current plans or hold us up in any way. I was just > wasn’t clear about the relationship between the "high priorities" and > "other metrics” sections. Knowing these came from different people at > different times clarifies things a lot. > Joel > > On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote: > > @Pau, @Amir There is a section called High priorities for product >> management >> https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on >> the Content translation analytics page. Did these priorities come from >> outside the team or does this just represent our own internal view of the >> high priorities? > > > Here is the story of that page as I'm aware of it: > > In September 2013, I was in a meeting with the analytics team in SF > presenting an initial proposal for metrics > https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. > On that meeting, Dario recommended to create hierarchy of metrics based on > the project goals. I created such image and a description for those metrics > (the image is on top of our analytics page and the metrics are described in > what it now the "Other metrics for created articles" section. > > In a meeting between Amir and Howie, they captured which should be > the most important metrics from the product perspective in the "High > priorities for product management". If I recalled correctly, as an outcome > of later meetings between Howie and Amir, Howie was happy focusing on > articles published as a single (initial?) metric for success. Amir can > provide more details since I was not on those meetings. 
> > In short: The analytics page > https://www.mediawiki.org/wiki/Content_translation/analytics has > pieces contributed by different people during the last year, and although > there are many ideas to organise and detail, measuring the number of > published articles seems to be the solid candidate to get started with, > learn from the value we get from it and polish the rest of our goal-to-signal > process http://www.rodden.org/kerry/heart/ for detecting better > metrics. > > > Pau > > On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen <jsahleen@wikimedia.org > > wrote: > >> Hi All, >> >> I have been reviewing our requirements for Content translation >> analytics >> https://www.mediawiki.org/wiki/Content_translation/analytics and >> I have a few questions/requests. I am sending them to the language team >> list and Leila and Dan in the hopes of getting some more clarity. I will >> add the same content to the Trello card. >> >> In the weekly team meeting earlier today we agreed that the first >> metric we want to collect data for is the number of articles created in >> each language over time. This is something has Amir has already set up our >> current Event Logging >> https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to >> track. Now that Kartik has enabled EL in beta, that part should be done. >> Since we are only barely turning it on, there will be very little data >> until people create more articles using CX. However, we should be set up to >> collect any new data that comes in. >> >> @Leila, can you verify that the db table now exists for the ContentTranslation >> schema https://meta.wikimedia.org/wiki/Schema:ContentTranslation? >> If it doesn’t, can you point us to right people we need to work with to >> troubleshoot the issue? Also you mentioned in our meeting that personal >> data may soon be purged after 90 days as part of a new privacy policy. 
>> Could you explain that a bit more or point us to more information? If this >> is the case, it may affect some of the metrics we would like to collect in >> the future. >> >> @Dan, what do we need to do next in order to set up a very simple >> visualization that would show the number of articles created per week by >> language. Pau has an image of what he would like on the Trello card >> https://trello.com/c/vQm0hlkt/18-content-translation-analytics. >> You mentioned something about being able to host a dashboard for us on one >> of the Limn servers you already have set up. >> >> @Santhosh, I believe you said earlier you have a script you use to >> export the data for the ULS analytics. If so can you share that please in >> case we need a similar script for CX so I don’t have to write a new script >> from scratch? >> >> @Pau, @Amir There is a section called High priorities for product >> management >> https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on >> the Content translation analytics page. Did these priorities come from >> outside the team or does this just represent our own internal view of the >> high priorities? If the latter, have these priorities been >> reviewed by anyone outside the team? I think we are safe to proceed with >> our current plan, but it would be good to have product sign off on things >> more generally. 
>> >> Thanks, >> >> Joel >> >> Joel Sahleen, Software Engineer >> Language Engineering >> Wikimedia Foundation >> jsahleen@wikimedia.org >> >> >> >> >> >> _______________________________________________ >> Localisation-team mailing list >> Localisation-team@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/localisation-team >> >> > > > -- > Pau Giner > Interaction Designer > Wikimedia Foundation > _______________________________________________ > Localisation-team mailing list > Localisation-team@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/localisation-team > > >
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
Hi Nuria,
> > Please let me know if there is any way I can help out or if there is anything you need from our end.
> When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
> To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the URL-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The HTTP request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta: http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi (you will need to create an account and sign in).
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine-translated paragraphs to the editor.
4. Click the publish button in the header to publish the translation to your user namespace (this triggers the EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging URL (it should be near the bottom).
6. Click on the entry to see all the request and response information.
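For anyone who wants to inspect the request found in the network pane, here is a minimal Python sketch that decodes it. It assumes the query string of the event.gif request is a single URL-encoded JSON object, possibly followed by a trailing `;`. The helper name `decode_beacon` and the sample URL are hypothetical and built only from field names in the sample record pasted to this thread; a real request will carry more fields.

```python
import json
from urllib.parse import unquote, urlsplit


def decode_beacon(url):
    """Return the JSON object carried in the query string of a beacon URL."""
    query = urlsplit(url).query
    # Assumption: the whole query string is one percent-encoded JSON object,
    # optionally followed by a trailing ';'.
    return json.loads(unquote(query).rstrip(";"))


# Hypothetical URL for illustration; copy the real one from the network pane.
sample = (
    "http://bits.beta.wmflabs.org/event.gif?"
    "%7B%22schema%22%3A%22ContentTranslation%22%2C%22revision%22%3A7146627%2C"
    "%22event%22%3A%7B%22action%22%3A%22create-translated-page%22%7D%7D;"
)

capsule = decode_beacon(sample)
print(capsule["schema"], capsule["event"]["action"])
```

If the decoded object shows the expected schema name and action, the browser side is sending what we intend, and any missing data is more likely a problem on the receiving end.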
You probably already know all this, but I thought I would pass it along just in case it helps.
> Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
> On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a Varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote: To keep the archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif, which, while it does not look to be Varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel,
For questions like these going forward you can contact analytics@, as you will get a more prompt response. Both Dan and Leila are OOTO the next couple of days.
There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates to determine the rate at which you are logging, but only if you are not logging all events.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote: Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
Are there additional configuration options we need to add to get EL working, aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
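As a rough illustration of how records like the sample above could feed the "articles created per week by language" metric discussed in this thread, here is a minimal Python sketch. It rests on several assumptions: that `timestamp` is Unix seconds in UTC, that `targetLanguage` is the language to count by, and that records arrive one JSON object per line. The function name `weekly_counts` is hypothetical and this is not the team's actual export script.

```python
import json
from collections import Counter
from datetime import datetime, timezone

# The sample record from the message above, as a single JSON line.
RECORD = (
    '{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", '
    '"webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, '
    '"recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, '
    '"clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", '
    '"schema": "ContentTranslation", '
    '"event": {"action": "create-translated-page", "targetLanguage": "ca", '
    '"token": "Tester", "version": 1, "contentLanguage": "es"}, '
    '"revision": 7146627}'
)


def weekly_counts(lines):
    """Count create-translated-page events per (ISO year, ISO week, language)."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        if record.get("schema") != "ContentTranslation":
            continue
        event = record["event"]
        if event.get("action") != "create-translated-page":
            continue
        # Assumption: timestamp is Unix seconds, interpreted as UTC.
        when = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        iso = when.isocalendar()
        counts[(iso[0], iso[1], event["targetLanguage"])] += 1
    return counts


for key, total in sorted(weekly_counts([RECORD]).items()):
    print(key, total)
```

Grouping by ISO week keeps the buckets aligned to calendar weeks regardless of when the first event arrives, which should make a per-week chart easier to read.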
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot. Joel
Joel:
I see. I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
[+ Ori]
Joel, Ori has now looked into this. There was a problem with EL in labs which affected logging events from beta. Ori has fixed the issue, and the fix is awaiting approval from ops. Let's touch base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see, I was hoping to set aside the beta issues but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since august, but, as I said before, I know very little about how beta is put together.
I have filed a bug to regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Please let me know if there is any way I can help out or if there is
anything you need from our end. When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta (you will need to create an account and sign in): http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging URL (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
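The event logging URL found in the network pane can also be decoded offline to inspect the event. What follows is only a minimal sketch, assuming the payload is the entire query string of the event.gif request, sent as URL-encoded JSON with an optional trailing semicolon. The name decode_beacon is a hypothetical helper and is not part of EventLogging.

```python
import json
from urllib.parse import quote, unquote, urlsplit


def decode_beacon(url: str) -> dict:
    """Decode the JSON payload carried in the query string of a beacon URL."""
    query = urlsplit(url).query
    # Assumption: the client may append a trailing ';' after the encoded JSON.
    query = query.rstrip(";")
    return json.loads(unquote(query))


if __name__ == "__main__":
    # Build an illustrative URL the same way, then decode it again.
    sample = {
        "schema": "ContentTranslation",
        "event": {"action": "create-translated-page", "targetLanguage": "ca"},
    }
    url = "http://bits.beta.wmflabs.org/event.gif?" + quote(json.dumps(sample)) + ";"
    print(decode_beacon(url))
```

In practice the URL would be copied from the network pane rather than built by hand; the round trip here only shows that decoding reverses the encoding.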
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif (for example, http://bits.beta.wmflabs.org/event.gif?foo=bar) which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db, the next step would be to follow Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if the validation section shows Valid?
Leila
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
> There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates only if you are not logging all events, to determine at which sampling rate you are logging.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard, which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
Joel
On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
> @Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
Here is the story of that page as I'm aware of it:
In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. In that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management". If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
In short: The analytics page https://www.mediawiki.org/wiki/Content_translation/analytics has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process http://www.rodden.org/kerry/heart/ for detecting better metrics.
Pau
On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi All,
I have been reviewing our requirements for Content translation analytics https://www.mediawiki.org/wiki/Content_translation/analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
@Leila, can you verify that the db table now exists for the ContentTranslation schema https://meta.wikimedia.org/wiki/Schema:ContentTranslation? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
@Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card https://trello.com/c/vQm0hlkt/18-content-translation-analytics. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
@Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so, can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
@Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
-- Pau Giner Interaction Designer Wikimedia Foundation
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
This is great news! Thank you so much for fixing things, Ori. Very much appreciated.
Leila, just ping me on IRC when you are ready to look for events. In the meantime I will review Dan’s instructions to see if there is a way I can help out with that.
Joel
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel:
I see. I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta (you will need to create an account and sign in).
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging URL (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db, the next step would be to follow Dan’s instructions to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if the validation section shows Valid?
Leila
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel,
For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates only if you are not logging all events, to determine at which sampling rate you are logging.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote: Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
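A record like the one above carries everything needed for the first metric discussed in this thread, the number of articles created per week by language. The following is a minimal sketch only, assuming one JSON record per line, that the create-translated-page action marks a published article, and that timestamps are Unix seconds interpreted as UTC. The name weekly_counts is a hypothetical helper and is not part of EventLogging or Limn.

```python
import json
from collections import Counter
from datetime import datetime, timezone

# The sample record from this thread, as a single line of JSON.
SAMPLE_RECORD = (
    '{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", '
    '"webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, '
    '"recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, '
    '"clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", '
    '"schema": "ContentTranslation", '
    '"event": {"action": "create-translated-page", "targetLanguage": "ca", '
    '"token": "Tester", "version": 1, "contentLanguage": "es"}, '
    '"revision": 7146627}'
)


def weekly_counts(lines):
    """Count published translations per (ISO year, ISO week, target language)."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        if record.get("schema") != "ContentTranslation":
            continue
        event = record["event"]
        # Assumption: this action is the one logged when a translation is published.
        if event.get("action") != "create-translated-page":
            continue
        when = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        iso = when.isocalendar()
        counts[(iso[0], iso[1], event["targetLanguage"])] += 1
    return counts


if __name__ == "__main__":
    for key, total in sorted(weekly_counts([SAMPLE_RECORD]).items()):
        print(key, total)
```

Grouping by ISO week keeps week boundaries consistent across years; a dashboard may prefer a different bucketing.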
Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard, which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
Joel
On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
> @Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
Here is the story of that page as I'm aware of it:
In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics. In that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management". If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
In short: The analytics page has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process for detecting better metrics.
Pau
On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi All,
I have been reviewing our requirements for Content translation analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
@Leila, can you verify that the db table now exists for the ContentTranslation schema? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
@Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
@Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so, can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
@Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
-- Pau Giner Interaction Designer Wikimedia Foundation
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to properly test whether the fix resolves this issue, we need to know what it is.
There is a bug logged for the beta EL situation. Can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see. I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta, http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi (you will need to create an account and sign in).
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
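Once the event logging entry has been found in the network pane, its URL can also be decoded offline. The following is only a minimal Python sketch, assuming (as described elsewhere in this thread) that the query string of the event.gif request is the URL-encoded JSON sent by the browser. The helper names `encode_beacon` and `decode_beacon` are hypothetical, the capsule below reuses only a few fields from the sample record pasted earlier, and the handling of a trailing `;` is an assumption about how the URL may be terminated.

```python
import json
from urllib.parse import quote, unquote, urlsplit


def encode_beacon(base_url, capsule):
    """Build a beacon URL whose query string is the URL-encoded JSON capsule.

    Hypothetical helper for illustration; the real client code may differ.
    """
    return base_url + "?" + quote(json.dumps(capsule))


def decode_beacon(url):
    """Recover the JSON capsule from a beacon URL copied out of the network pane."""
    query = urlsplit(url).query
    # Assumption: the query string may carry a trailing ';' terminator.
    return json.loads(unquote(query.rstrip(";")))


if __name__ == "__main__":
    # Illustrative capsule; field values are taken from the sample record in this thread.
    capsule = {
        "schema": "ContentTranslation",
        "revision": 7146627,
        "event": {
            "action": "create-translated-page",
            "contentLanguage": "es",
            "targetLanguage": "ca",
        },
    }
    url = encode_beacon("http://bits.beta.wmflabs.org/event.gif", capsule)
    print(url)
    print(decode_beacon(url))
```

Pasting a URL copied from the network pane into `decode_beacon` should show the schema name and the event fields, which makes it easier to compare against what ends up in the event logging db.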
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif (for example, http://bits.beta.wmflabs.org/event.gif?foo=bar), which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db, the next step would be to follow Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if the validation section shows Valid?
Leila
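For reference, the event record mentioned above can be inspected with a few lines of Python. This is only a sketch built on the sample record pasted earlier in this thread. It assumes that `timestamp` is a Unix time in seconds and that `contentLanguage` and `targetLanguage` are the source and target languages of the translation. The `summarize` helper is hypothetical, and reading the `clientValidated` flag is not a substitute for the validation section of the dev server output.

```python
import json
from datetime import datetime, timezone

# Sample record as pasted earlier in this thread (dev server output).
RECORD = (
    '{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", '
    '"webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, '
    '"recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, '
    '"clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", '
    '"schema": "ContentTranslation", '
    '"event": {"action": "create-translated-page", "targetLanguage": "ca", '
    '"token": "Tester", "version": 1, "contentLanguage": "es"}, '
    '"revision": 7146627}'
)


def summarize(record_json):
    """Pull out the fields most useful when checking a single logged event."""
    record = json.loads(record_json)
    event = record["event"]
    return {
        "schema": record["schema"],
        "revision": record["revision"],
        "action": event["action"],
        # Assumption: contentLanguage is the source language of the translation.
        "source": event["contentLanguage"],
        "target": event["targetLanguage"],
        # Assumption: timestamp is Unix time in seconds.
        "logged_at": datetime.fromtimestamp(record["timestamp"], tz=timezone.utc),
        "client_validated": record["clientValidated"],
    }


if __name__ == "__main__":
    for key, value in summarize(RECORD).items():
        print(f"{key}: {value}")
```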
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel,
>
> For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
>
>> There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
>
> No, there is no need.
>
> You would need sample rates to determine at which sampling rate you are logging if you are not logging all events, that is.
>
> Thanks,
>
> Nuria
>
> On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
>
>> Adding Nuria as she can probably help
>>
>> On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
>>
>>> Hi Leila,
>>>
>>> I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
>>>
>>> {"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
>>>
>>> Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
>>>
>>> Any help on this would be much appreciated.
>>>
>>> Thanks,
>>>
>>> Joel
>>>
>>> On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>
>>> No problem, Dan. Enjoy your vacation!
>>>
>>> I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
>>>
>>> Thanks,
>>>
>>> Joel
>>>
>>> Joel Sahleen, Software Engineer
>>> Language Engineering
>>> Wikimedia Foundation
>>> jsahleen@wikimedia.org
>>>
>>> On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
>>>
>>> Joel, re: visualization,
>>>
>>> I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
>>>
>>> If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
>>>
>>> Dan
>>>
>>> On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>
>>>> Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
>>>> Joel
>>>>
>>>> On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
>>>>
>>>>> @Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
>>>>
>>>> Here is the story of that page as I'm aware of it:
>>>>
>>>> In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. At that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
>>>>
>>>> In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management". If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
>>>>
>>>> In short: The analytics page https://www.mediawiki.org/wiki/Content_translation/analytics has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process http://www.rodden.org/kerry/heart/ for detecting better metrics.
>>>>
>>>> Pau
>>>>
>>>> On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I have been reviewing our requirements for Content translation analytics https://www.mediawiki.org/wiki/Content_translation/analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
>>>>>
>>>>> In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
>>>>>
>>>>> @Leila, can you verify that the db table now exists for the ContentTranslation schema https://meta.wikimedia.org/wiki/Schema:ContentTranslation? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
>>>>>
>>>>> @Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card https://trello.com/c/vQm0hlkt/18-content-translation-analytics. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
>>>>>
>>>>> @Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
>>>>>
>>>>> @Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Joel
>>>>>
>>>>> Joel Sahleen, Software Engineer
>>>>> Language Engineering
>>>>> Wikimedia Foundation
>>>>> jsahleen@wikimedia.org
>>>>>
>>>>> _______________________________________________
>>>>> Localisation-team mailing list
>>>>> Localisation-team@lists.wikimedia.org
>>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team
>>>>
>>>> --
>>>> Pau Giner
>>>> Interaction Designer
>>>> Wikimedia Foundation
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
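To make the dashboard request more concrete, the metric discussed in this thread (number of articles created per week by language) could be computed from event records roughly as follows. This is only a Python sketch under stated assumptions: records are dicts shaped like the sample record pasted earlier, `timestamp` is Unix time in seconds, weeks are ISO weeks in UTC, and the `articles_per_week` helper is hypothetical. It does not reflect how limn or the analytics team's scripts actually work.

```python
from collections import Counter
from datetime import datetime, timezone


def articles_per_week(records, action="create-translated-page"):
    """Count matching events per (ISO year, ISO week, target language)."""
    counts = Counter()
    for record in records:
        event = record["event"]
        if event["action"] != action:
            continue  # Ignore events that do not create a translated page.
        when = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        iso = when.isocalendar()
        counts[(iso[0], iso[1], event["targetLanguage"])] += 1
    return counts


if __name__ == "__main__":
    # Illustrative records only; values are modeled on the sample in this thread.
    sample = [
        {"timestamp": 1415651367,
         "event": {"action": "create-translated-page", "targetLanguage": "ca"}},
        {"timestamp": 1415651367 + 7 * 24 * 3600,
         "event": {"action": "create-translated-page", "targetLanguage": "ca"}},
    ]
    for (year, week, language), total in sorted(articles_per_week(sample).items()):
        print(year, week, language, total)
```

Running the same function separately over records from each environment would give the per-environment comparison described above, assuming the records can be exported from both.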
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL. Can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
> Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else, so there is no guaranteed availability there. It is purely a testing environment.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug https://bugzilla.wikimedia.org/show_bug.cgi?id=73388 in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see, I was hoping to set aside the beta issues but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since august, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi in beta (you will need to create an account and sign in).
2. Open chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure, I am not familiar with it and neither is anyone on our team but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue on the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif (e.g. http://bits.beta.wmflabs.org/event.gif?foo=bar), which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen <jsahleen@wikimedia.org> wrote:
> Niklas,
> Can you answer this question from Nuria?
> jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
> Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
> Thanks,
> Joel
> Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
>
> On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
> (Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
> Hi Leila,
> The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
> Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
> Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
> If there is data in the beta db the next step would be to follow Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
> Thanks again for your help!
> Joel
> Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
>
> On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
> Hi Joel,
> When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if validation section shows Valid?
> Leila
>
> On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
>> Joel,
>> For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
>> > There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
>> No, there is no need.
>> You would need sample rates to determine at which sampling rate you are logging if you are not logging all events, that is.
>> Thanks,
>> Nuria
>>
>> On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
>>> Adding Nuria as she can probably help
>>>
>>> On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>> Hi Leila,
>>>> I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
>>>> {"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
>>>> Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
>>>> Any help on this would be much appreciated.
>>>> Thanks,
>>>> Joel
>>>>
>>>> On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>> No problem, Dan. Enjoy your vacation!
>>>> I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
>>>> Thanks,
>>>> Joel
>>>> Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
>>>>
>>>> On Nov 7, 2014, at 3:10 PM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
>>>> Joel, re: visualization,
>>>> I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
>>>> If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
>>>> Dan
>>>>
>>>> On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen <jsahleen@wikimedia.org> wrote:
>>>>> Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
>>>>> Joel
>>>>>
>>>>> On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
>>>>>> @Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
>>>>> Here is the story of that page as I'm aware of it:
>>>>> In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. On that meeting, Dario recommended to create hierarchy of metrics based on the project goals. I created such image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
>>>>> In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management". If I recalled correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not on those meetings.
>>>>> In short: The analytics page https://www.mediawiki.org/wiki/Content_translation/analytics has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process http://www.rodden.org/kerry/heart/ for detecting better metrics.
>>>>> Pau
>>>>>
>>>>> On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen <jsahleen@wikimedia.org> wrote:
>>>>>> Hi All,
>>>>>> I have been reviewing our requirements for Content translation analytics https://www.mediawiki.org/wiki/Content_translation/analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
>>>>>> In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
>>>>>> @Leila, can you verify that the db table now exists for the ContentTranslation schema https://meta.wikimedia.org/wiki/Schema:ContentTranslation? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
>>>>>> @Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language. Pau has an image of what he would like on the Trello card https://trello.com/c/vQm0hlkt/18-content-translation-analytics. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
>>>>>> @Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
>>>>>> @Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
>>>>>> Thanks,
>>>>>> Joel
>>>>>> Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
>>>>>> _______________________________________________
>>>>>> Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
>>>>> --
>>>>> Pau Giner
>>>>> Interaction Designer
>>>>> Wikimedia Foundation
>>>>> _______________________________________________
>>>>> Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
> _______________________________________________
> Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
> Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else, so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote: [+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel:
I see, I was hoping to set aside the beta issues but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since august, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta (you will need to create an account and sign in).
2. Open chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
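For anyone reproducing this, here is a minimal Python sketch (not the EventLogging code itself) of how the event logging url found in the network pane could be decoded by hand. It assumes the query string of the event.gif request is a single URL-encoded JSON object, optionally followed by a ";". The helper name decode_beacon and the example url are made up for illustration; the field values are taken from the sample event posted earlier in this thread.

```python
import json
from urllib.parse import quote, unquote, urlsplit


def decode_beacon(url):
    """Return the JSON payload carried in the query string of a beacon url."""
    query = urlsplit(url).query
    # Assumption: the whole query string is one URL-encoded JSON object,
    # possibly followed by a trailing ";".
    return json.loads(unquote(query).rstrip(";"))


# Hypothetical request url, rebuilt from fields of the sample event
# posted earlier in this thread.
payload = {
    "schema": "ContentTranslation",
    "revision": 7146627,
    "event": {"action": "create-translated-page", "targetLanguage": "ca"},
}
url = "http://bits.beta.wmflabs.org/event.gif?" + quote(json.dumps(payload)) + ";"

decoded = decode_beacon(url)
print(decoded["schema"], decoded["event"]["targetLanguage"])
```

Pasting a real url copied from the network pane in place of the hypothetical one should show whether the schema name and event fields are what we expect, provided the encoding assumption holds.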
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure, I am not familiar with it and neither is anyone on our team but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue on the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif, which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db the next step would be to follow Dan’s instructions to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if validation section shows Valid?
Leila
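To make the "validates against the schema" part concrete, here is a rough Python sketch of what such a check amounts to. It is not the EventLogging validator: the required field list below is only inferred from the sample event posted in this thread, not copied from the real ContentTranslation schema, and the function name validation_errors is hypothetical.

```python
import json

# Hypothetical field list, inferred from the sample event in this thread.
# The real schema lives on meta and may differ.
REQUIRED_EVENT_FIELDS = {
    "action": str,
    "targetLanguage": str,
    "contentLanguage": str,
    "token": str,
    "version": int,
}


def validation_errors(record, schema_name="ContentTranslation"):
    """Return a list of problems found in one event record (empty if none)."""
    errors = []
    if record.get("schema") != schema_name:
        errors.append("unexpected schema: %r" % record.get("schema"))
    event = record.get("event", {})
    for field, expected_type in REQUIRED_EVENT_FIELDS.items():
        if field not in event:
            errors.append("missing field: " + field)
        elif not isinstance(event[field], expected_type):
            errors.append("wrong type for field: " + field)
    return errors


sample = json.loads(
    '{"schema": "ContentTranslation", "revision": 7146627, '
    '"event": {"action": "create-translated-page", "targetLanguage": "ca", '
    '"token": "Tester", "version": 1, "contentLanguage": "es"}}'
)
problems = validation_errors(sample)
print("Valid" if not problems else problems)
```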
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel,
For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
> There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates only if you are not logging all events, to determine the rate at which you are logging.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote: Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
Are there additional configuration options we need to add to get EL working, aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
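As a rough illustration of the metric discussed in this thread (articles created per week by language), here is a minimal Python sketch that counts events shaped like the sample output above. It is only a sketch under assumptions: it reads records already loaded as JSON rather than querying the event logging db, it treats the timestamp as seconds since the epoch in UTC, and the function name weekly_counts is made up.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(records, action="create-translated-page"):
    """Count matching events per (ISO year, ISO week, target language)."""
    counts = Counter()
    for record in records:
        event = record["event"]
        if record["schema"] != "ContentTranslation" or event["action"] != action:
            continue
        # Assumption: "timestamp" is seconds since the epoch, in UTC.
        when = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        iso = when.isocalendar()
        counts[(iso[0], iso[1], event["targetLanguage"])] += 1
    return counts


# The sample event record posted above, trimmed to the fields used here.
sample = json.loads(
    '{"timestamp": 1415651367, "schema": "ContentTranslation", '
    '"event": {"action": "create-translated-page", "targetLanguage": "ca", '
    '"contentLanguage": "es"}, "revision": 7146627}'
)
for key, n in sorted(weekly_counts([sample]).items()):
    print(key, n)
```

Grouping by ISO week keeps weeks aligned across year boundaries; a dashboard would presumably read the same counts from a datafile generated on a schedule.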
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
> No problem, Dan. Enjoy your vacation!
>
> I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
>
> Thanks,
>
> Joel
>
> Joel Sahleen, Software Engineer
> Language Engineering
> Wikimedia Foundation
> jsahleen@wikimedia.org
>
> On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
>
>> Joel, re: visualization,
>>
>> I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
>>
>> If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
>>
>> Dan
>>
>> On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>> Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
>> Joel
>>
>> On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
>>
>>> @Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
>>>
>>> Here is the story of that page as I'm aware of it:
>>>
>>> In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics. On that meeting, Dario recommended to create hierarchy of metrics based on the project goals. I created such image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
>>>
>>> In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management". If I recalled correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not on those meetings.
>>>
>>> In short: The analytics page has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process for detecting better metrics.
>>>
>>> Pau
>>>
>>> On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>> Hi All,
>>>
>>> I have been reviewing our requirements for Content translation analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
>>>
>>> In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
>>>
>>> @Leila, can you verify that the db table now exists for the ContentTranslation schema? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
>>>
>>> @Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language. Pau has an image of what he would like on the Trello card. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
>>>
>>> @Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
>>>
>>> @Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
>>>
>>> Thanks,
>>>
>>> Joel
>>>
>>> Joel Sahleen, Software Engineer
>>> Language Engineering
>>> Wikimedia Foundation
>>> jsahleen@wikimedia.org
>>>
>>> _______________________________________________
>>> Localisation-team mailing list
>>> Localisation-team@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team
>>>
>>> --
>>> Pau Giner
>>> Interaction Designer
>>> Wikimedia Foundation
>>> _______________________________________________
>>> Localisation-team mailing list
>>> Localisation-team@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe get the data in a way that lets us use some quick d3-based tool such as http://code.shutterstock.com/rickshaw/ ?).
Thanks
Pau
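As one possible answer to the question above, here is a minimal sketch of turning raw event records into "articles created per week by language" counts that a charting tool could consume. It assumes the events are available as one JSON record per line, shaped like the sample output Joel posted earlier in this thread (`schema`, `timestamp`, and `event.action` / `event.targetLanguage`). The export format and the name `weekly_counts` are assumptions for illustration, not part of EventLogging.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(lines):
    """Count published translations per (ISO week, target language).

    `lines` is any iterable of strings, each holding one JSON event
    record (hypothetical export format, one record per line).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("schema") != "ContentTranslation":
            continue  # ignore events from other schemas
        event = record.get("event", {})
        if event.get("action") != "create-translated-page":
            continue  # only count published translations
        # The sample record stores the timestamp as Unix seconds.
        when = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        year, week, _ = when.isocalendar()
        counts[(f"{year}-W{week:02d}", event.get("targetLanguage"))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        '{"schema": "ContentTranslation", "timestamp": 1415651367, '
        '"event": {"action": "create-translated-page", '
        '"targetLanguage": "ca", "contentLanguage": "es"}}'
    )
    for (week, language), n in sorted(weekly_counts([sample]).items()):
        print(week, language, n)
```

The resulting (week, language, count) rows could be written out as CSV or JSON for whatever visualization tool is chosen. Any data taken from beta would still carry the caveat raised in this thread that the beta database may be wiped at any time.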
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug (https://bugzilla.wikimedia.org/show_bug.cgi?id=73388) in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions (https://wikitech.wikimedia.org/wiki/Analytics/Dashboards). Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see, I was hoping to set aside the beta issues but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta (you will need to create an account and sign in): http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
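Once the event logging request has been found in the network pane, its URL can be decoded outside the browser to inspect the payload. The sketch below assumes what Leila described earlier in the thread, namely that the browser sends URL-encoded JSON, and further assumes that the JSON is the whole query string of the event.gif request, possibly followed by a trailing semicolon. The helper name `decode_beacon` and the example URL are made up for illustration.

```python
import json
from urllib.parse import unquote, urlsplit


def decode_beacon(url):
    """Decode the JSON payload carried in an event.gif request URL.

    Assumes the query string is URL-encoded JSON, optionally followed
    by a trailing ';' (an assumption about the beacon format).
    """
    query = urlsplit(url).query
    payload = unquote(query).rstrip(";")
    return json.loads(payload)


if __name__ == "__main__":
    # Hypothetical URL, hand-encoded for this example.
    url = (
        "http://bits.beta.wmflabs.org/event.gif"
        "?%7B%22schema%22%3A%22ContentTranslation%22%2C"
        "%22revision%22%3A7146627%7D;"
    )
    print(json.dumps(decode_beacon(url), indent=2))
```

Comparing the decoded fields against the schema page is a quick way to spot a missing or misnamed field before asking the analytics team to check the database.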
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> To keep archives happy: The beta setup posts events to http://bits.beta.wmflabs.org/event.gif (http://bits.beta.wmflabs.org/event.gif?foo=bar) which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
>
> On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>
>> Niklas,
>>
>> Can you answer this question from Nuria?
>>
>> jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
>>
>> Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
>>
>> Thanks,
>>
>> Joel
>>
>> Joel Sahleen, Software Engineer
>> Language Engineering
>> Wikimedia Foundation
>> jsahleen@wikimedia.org
>>
>> On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>
>> (Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
>>
>> Hi Leila,
>>
>> The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
>>
>> Niklas Laxstöm has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
>>
>> Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
>>
>> If there is data in the beta db the next step would be to follow Dan’s instructions (https://wikitech.wikimedia.org/wiki/Analytics/Dashboards) to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
>>
>> Thanks again for your help!
>>
>> Joel
>>
>> Joel Sahleen, Software Engineer
>> Language Engineering
>> Wikimedia Foundation
>> jsahleen@wikimedia.org
>>
>> On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
>>
>> Hi Joel,
>>
>> When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if the validation section shows Valid?
>>
>> Leila
>>
>> On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
>>
>>> Joel,
>>>
>>> For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
>>>
>>> >There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
>>> No, there is no need.
>>>
>>> You would need sample rates to determine at which sampling rate you are logging if you are not logging all events, that is.
>>>
>>> Thanks,
>>>
>>> Nuria
>>>
>>> On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
>>>
>>>> Adding Nuria as she can probably help
>>>>
>>>> On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>>
>>>>> Hi Leila,
>>>>>
>>>>> I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
>>>>>
>>>>> {"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c",
>>>>> "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true,
>>>>> "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp":
>>>>> "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation",
>>>>> "event": {"action": "create-translated-page", "targetLanguage": "ca",
>>>>> "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision":
>>>>> 7146627}
>>>>>
>>>>> Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
>>>>>
>>>>> Any help on this would be much appreciated.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Joel
>>>>>
>>>>> On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>>>
>>>>> No problem, Dan. Enjoy your vacation!
>>>>>
>>>>> I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Joel
>>>>>
>>>>> Joel Sahleen, Software Engineer
>>>>> Language Engineering
>>>>> Wikimedia Foundation
>>>>> jsahleen@wikimedia.org
>>>>>
>>>>> On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
>>>>>
>>>>> Joel, re: visualization,
>>>>>
>>>>> I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
>>>>>
>>>>> If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
>>>>>
>>>>> Dan
>>>>>
>>>>> On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>>>
>>>>>> Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
>>>>>> Joel
>>>>>>
>>>>>> On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
>>>>>>
>>>>>>> @Pau, @Amir There is a section called High priorities for product management (https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management) on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
>>>>>>
>>>>>> Here is the story of that page as I'm aware of it:
>>>>>>
>>>>>> In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics (https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing). In that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
>>>>>>
>>>>>> In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management". If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
>>>>>>
>>>>>> In short: The analytics page (https://www.mediawiki.org/wiki/Content_translation/analytics) has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process (http://www.rodden.org/kerry/heart/) for detecting better metrics.
>>>>>>
>>>>>> Pau
>>>>>>
>>>>>> On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I have been reviewing our requirements for Content translation analytics (https://www.mediawiki.org/wiki/Content_translation/analytics) and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
>>>>>>>
>>>>>>> In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging (https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js) to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
>>>>>>>
>>>>>>> @Leila, can you verify that the db table now exists for the ContentTranslation schema (https://meta.wikimedia.org/wiki/Schema:ContentTranslation)? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
>>>>>>>
>>>>>>> @Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card (https://trello.com/c/vQm0hlkt/18-content-translation-analytics). You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
>>>>>>>
>>>>>>> @Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
>>>>>>>
>>>>>>> @Pau, @Amir There is a section called High priorities for product management (https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management) on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Joel
>>>>>>>
>>>>>>> Joel Sahleen, Software Engineer
>>>>>>> Language Engineering
>>>>>>> Wikimedia Foundation
>>>>>>> jsahleen@wikimedia.org
>>>>>>
>>>>>> --
>>>>>> Pau Giner
>>>>>> Interaction Designer
>>>>>> Wikimedia Foundation
+2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe get the data in a way we could use some quick d3-based tool?).
Thanks
Pau
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote: On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote: [+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel:
I see, I was hoping to set aside the beta issues but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta (you will need to create an account and sign in).
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
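For anyone who wants to read the request found in the last step without squinting at percent-escapes, here is a minimal Python sketch. It assumes the event is carried as URL-encoded JSON in the query string of the event.gif request, optionally ending in a ";". The example URL is made up for illustration from the field names in the sample record posted earlier in this thread; it is not a captured request.

```python
import json
from urllib.parse import urlsplit, unquote


def decode_beacon(url: str) -> dict:
    """Return the JSON object carried in an event.gif request URL."""
    query = urlsplit(url).query
    # Assumption: the whole query string is one URL-encoded JSON object,
    # optionally terminated by a ';'.
    payload = unquote(query).rstrip(";")
    return json.loads(payload)


# Hypothetical request URL (illustration only, not a real capture).
example_url = (
    "http://bits.beta.wmflabs.org/event.gif?"
    "%7B%22schema%22%3A%22ContentTranslation%22%2C%22revision%22%3A7146627%2C"
    "%22event%22%3A%7B%22action%22%3A%22create-translated-page%22%2C"
    "%22targetLanguage%22%3A%22ca%22%7D%7D;"
)

capsule = decode_beacon(example_url)
print(capsule["schema"], capsule["event"]["targetLanguage"])
```

Pasting the URL copied from the network pane in place of example_url should show whether the schema name and event fields are what we expect.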
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
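To make the distinction concrete, here is a rough Python sketch of what 1-in-N sampling could look like compared with logging 1 to 1. The function name and the choice of hashing a token are assumptions made for illustration only; this is not how EventLogging or Content Translation implements sampling.

```python
import hashlib


def in_sample(token: str, rate: int) -> bool:
    """Return True if events for this token should be logged at 1:rate."""
    if rate < 1:
        raise ValueError("rate must be >= 1")
    # Hash the token so the same user is consistently in or out of the sample.
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % rate
    return bucket == 0


# rate=1 is the "1 to 1" case: every event is logged.
print(in_sample("Tester", 1))
```

With a rate above 1, any counts taken from the data would need to be multiplied by the rate to estimate real totals, which is why the rate has to be known.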
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it, and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
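A check like the one described above could be sketched in Python as follows. It assumes the events are available as one JSON record per line with a Unix "timestamp" field, as in the sample record posted earlier in this thread. Where such records live on deployment-eventlogging02 is not stated here, so the input is just a list of lines, and the record in the example is made up.

```python
import json
from datetime import datetime, timezone


def latest_event_time(lines):
    """Return the newest event time (UTC) among JSON records, or None."""
    newest = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ts = json.loads(line)["timestamp"]
        if newest is None or ts > newest:
            newest = ts
    if newest is None:
        return None
    return datetime.fromtimestamp(newest, tz=timezone.utc)


def days_stale(lines, now):
    """Return whole days between the newest event and 'now', or None."""
    latest = latest_event_time(lines)
    return None if latest is None else (now - latest).days


# Hypothetical record dated 22 Aug 2014 (illustration only).
sample_lines = ['{"schema": "ContentTranslation", "timestamp": 1408665600}']
now = datetime(2014, 11, 13, tzinfo=timezone.utc)
print(latest_event_time(sample_lines), days_stale(sample_lines, now))
```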
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? Where are you posting your events in beta? Can you send the URL?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxstöm has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db the next step would be to follow with Dan’s instructions to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
> If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
Sorry it is confusing, but in that regard you can talk to QA. Beta-labs is a testing environment to verify and QA releases.
> From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
To clarify, you can be a beta-feature and get metrics; when we say production we mean "REAL USERS USAGE".
I think there is confusion between the not-so-well-named beta environment (the testing environment in labs, which is what our thread refers to) and being a beta-feature. *If you are a beta-feature you are IN production and you can get data.*
On Mon, Nov 17, 2014 at 10:41 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
+2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate, since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
> Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that being able to visualise metrics and check how they were affected by changes in the product was really useful, and we only wish we had had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe we could get the data in a form usable with some quick d3-based tool such as http://code.shutterstock.com/rickshaw/ ?).
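As a starting point for that idea, here is a small Python sketch that turns event records into weekly counts of published translations per target language, in a shape a charting tool could consume. It assumes records shaped like the sample posted earlier in this thread (a Unix "timestamp" plus an "event" object with "action" and "targetLanguage"). How the records are exported from the event logging db is left open, and the output field names are made up for illustration.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(records):
    """Count published translations per (ISO year, ISO week, target language)."""
    counts = Counter()
    for rec in records:
        event = rec["event"]
        # Only count the publish action seen in the sample record.
        if event.get("action") != "create-translated-page":
            continue
        when = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        year, week, _ = when.isocalendar()
        counts[(year, week, event["targetLanguage"])] += 1
    return counts


def to_series(counts):
    """Flatten the counts into a JSON-serializable list, sorted by week."""
    return [
        {"year": y, "week": w, "language": lang, "articles": n}
        for (y, w, lang), n in sorted(counts.items())
    ]


# One record taken from the sample in this thread, trimmed to the fields used.
records = [
    {"timestamp": 1415651367,
     "event": {"action": "create-translated-page", "targetLanguage": "ca"}},
]
print(json.dumps(to_series(weekly_counts(records))))
```

Grouping by ISO week is only one possible choice; a dashboard might prefer calendar dates for the x axis.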
Thanks
Pau
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else, so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug https://bugzilla.wikimedia.org/show_bug.cgi?id=73388 in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see. I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it, EL has not worked in beta since August but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
> Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi page in beta (you will need to create an account and sign in).
2. Open chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it, and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
> Hi Nuria, > > Thank you so much for your help on this. Please let me know if there > is any way I can help out or if there is anything you need from our end. > > Joel > > Joel Sahleen, Software Engineer > Language Engineering > Wikimedia Foundation > jsahleen@wikimedia.org > > > > > On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote: > > Hello, > > Taking last statement back, asked Yuvi and beta does have a varnish > instance so the flow of EL events "should" be the same one that production. > > Now I looked on deployment-eventlogging02, which is the EL machine > for labs and the last events I see there are from Aug 22. > > So no events have come in as of late, which could point to an issue > on the setup. I will look into it some more. > > Thanks, > > Nuria > > On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org > wrote: > >> To keep archives happy: Beta setup post events to >> http://bits.beta.wmflabs.org/event.gif >> http://bits.beta.wmflabs.org/event.gif?foo=bar that, while it >> does not look to be varnish, has some kind of listener that post those >> events to beta event logging database. >> >> On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen < >> jsahleen@wikimedia.org> wrote: >> >>> Niklas, >>> >>> Can you answer this question from Nuria? >>> >>> jsahleen: does beta have its own varnish instance? where are you >>> posting your events in beta? can you send teh url? >>> >>> Also would it be possible to document the steps you used when >>> testing EL on beta so that others can reproduce them? >>> >>> Thanks, >>> >>> Joel >>> >>> Joel Sahleen, Software Engineer >>> Language Engineering >>> Wikimedia Foundation >>> jsahleen@wikimedia.org >>> >>> >>> >>> >>> On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org >>> wrote: >>> >>> (Moving this discussion to analytics@ and localization-team@ >>> based on Nuria’s suggestion below.) >>> >>> Hi Leila, >>> >>> The output I posted in the message is the only output I am seeing. 
>>> I do not see the URL-encoded section or the validation section. I think >>> there may be something wrong with my testing setup. >>> >>> Niklas Laxstöm has checked what is happening with our event >>> logging in beta and he confirmed that we are sending events and the events >>> are valid. The issue seems to be that we are logging events to the beta >>> event logging db while what we checked earlier was the production event >>> logging db. >>> >>> Can you (or anyone who is available) check the event logging db in >>> beta to see if the table has been created and has data? The schema name >>> again is ContentTranslation. If you don’t find anything, let us know and we >>> will do some more investigation. >>> >>> If there is data in the beta db the next step would be to follow >>> with Dan’s instructions >>> https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get >>> a dashboard set up on limn1. I believe that most of Dan’s instructions need >>> to be handled by someone on the analytics team, but let me know if there is >>> anything I can help with. >>> >>> Thanks again for your help! >>> >>> Joel >>> >>> Joel Sahleen, Software Engineer >>> Language Engineering >>> Wikimedia Foundation >>> jsahleen@wikimedia.org >>> >>> >>> >>> >>> On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org >>> wrote: >>> >>> Hi Joel, >>> >>> When you log events, the output will be the URL-encoded JSON >>> sent by the browser, the event record (similar to what you pasted in your >>> email), and whether the event validates against the schema. For the sample >>> output you pasted earlier, or another sample output, can you let us know if >>> validation section shows Valid? >>> >>> Leila >>> >>> On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org >>> wrote: >>> >>>> Joel, >>>> >>>> For questions like these going forward you can contact analytics@ >>>> as you will be getting amore prompt response. Both Dan and Leila are OOTO >>>> the next couple of days. 
>>>> >>>> >There are configuration options for the dev server that need to >>>> be added. Do similar options need to be added when not using the dev server? >>>> No, there is no need. >>>> >>>> You would need sample rates to determine at which sampling rate >>>> you are logging if you are not logging all events, that is. >>>> >>>> Thanks, >>>> >>>> Nuria >>>> >>>> On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu < >>>> dandreescu@wikimedia.org> wrote: >>>> >>>>> Adding Nuria as she can probably help >>>>> >>>>> On Monday, November 10, 2014, Joel Sahleen < >>>>> jsahleen@wikimedia.org> wrote: >>>>> >>>>>> Hi Leila, >>>>>> >>>>>> I have tested our EventLogging code and it seems to be working >>>>>> fine with the event logging dev server. I can see the events coming through >>>>>> and they are valid. Here is some sample output: >>>>>> >>>>>> {"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", >>>>>> "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, >>>>>> "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": >>>>>> "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", >>>>>> "event": {"action": "create-translated-page", "targetLanguage": "ca", >>>>>> "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": >>>>>> 7146627} >>>>>> >>>>>> Are there additional configuration options we need to add to >>>>>> get EL working aside from just requiring the main extension file. There are >>>>>> configuration options for the dev server that need to be added. Do similar >>>>>> options need to be added when not using the dev server? >>>>>> >>>>>> Any help on this would be much appreciated. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Joel >>>>>> >>>>>> On Nov 7, 2014, at 3:52 PM, Joel Sahleen < >>>>>> jsahleen@wikimedia.org> wrote: >>>>>> >>>>>> No problem, Dan. Enjoy your vacation! >>>>>> >>>>>> I will read through the document at the link you sent. 
I still >>>>>> need to fix our event logging code so it may be a couple days before we are >>>>>> ready anyway. If I have any questions I will contact Leila or Nuria. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Joel >>>>>> >>>>>> Joel Sahleen, Software Engineer >>>>>> Language Engineering >>>>>> Wikimedia Foundation >>>>>> jsahleen@wikimedia.org >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Nov 7, 2014, at 3:10 PM, Dan Andreescu < >>>>>> dandreescu@wikimedia.org> wrote: >>>>>> >>>>>> Joel, re: visualization, >>>>>> >>>>>> I'm going on vacation tomorrow and will be back on November >>>>>> 19th. If that's not too late, I can set up a limn instance then. If it's >>>>>> too late, that's ok, I wrote up the steps needed. Someone with access to >>>>>> the limn1.eqiad.wmflabs instance can perform them: >>>>>> https://wikitech.wikimedia.org/wiki/Analytics/Dashboards >>>>>> >>>>>> If you have the data or are generating the data in some other >>>>>> way, then you don't need half of that setup, you just need the part that >>>>>> sets up the limn dashboard which is only an hour or so of work. Sorry I'm >>>>>> running out the door and can't take care of that for you. >>>>>> >>>>>> Dan >>>>>> >>>>>> On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen < >>>>>> jsahleen@wikimedia.org> wrote: >>>>>> >>>>>>> Thank you for the information, Pau. Very helpful. As you say, >>>>>>> this does not change our current plans or hold us up in any way. I was just >>>>>>> wasn’t clear about the relationship between the "high priorities" and >>>>>>> "other metrics” sections. Knowing these came from different people at >>>>>>> different times clarifies things a lot. 
>>>>>>> Joel >>>>>>> >>>>>>> On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org >>>>>>> wrote: >>>>>>> >>>>>>> @Pau, @Amir There is a section called High priorities for >>>>>>>> product management >>>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on >>>>>>>> the Content translation analytics page. Did these priorities come from >>>>>>>> outside the team or does this just represent our own internal view of the >>>>>>>> high priorities? >>>>>>> >>>>>>> >>>>>>> Here is the story of that page as I'm aware of it: >>>>>>> >>>>>>> In September 2013, I was in a meeting with the analytics team >>>>>>> in SF presentingan initial proposal for metrics >>>>>>> https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. >>>>>>> On that meeting, Dario recommended to create hierarchy of metrics based on >>>>>>> the project goals. I created such image and a description for those metrics >>>>>>> (the image is on top of our analytics page and the metrics are described in >>>>>>> what it now the "Other metrics for created articles" section. >>>>>>> >>>>>>> In a meeting between Amir and Howie, they captured which >>>>>>> should be the most important metrics from the product perspective in the >>>>>>> "High priorities for product management". If I recalled correctly, as an >>>>>>> outcome of later meetings between Howie and Amir, Howie was happy focusing >>>>>>> on articles published as a single (initial?) metric for success. Amir can >>>>>>> provide more details since I was not on those meetings. 
>>>>>>> >>>>>>> In short: The analytics page >>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics >>>>>>> has pieces contributed by different people during the last >>>>>>> year, and although there are many ideas to organise and detail, measuring >>>>>>> the number of published articles seems to be the solid candidate to get >>>>>>> started with, learn from the value we get from it and polish the rest of ourgoal-to-signal >>>>>>> process http://www.rodden.org/kerry/heart/ for detecting >>>>>>> better metrics. >>>>>>> >>>>>>> >>>>>>> Pau >>>>>>> >>>>>>> On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen < >>>>>>> jsahleen@wikimedia.org>wrote: >>>>>>> >>>>>>>> Hi All, >>>>>>>> >>>>>>>> I have been reviewing our requirements for Content >>>>>>>> translation analytics >>>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics and >>>>>>>> I have a few questions/requests. I am sending them to the language team >>>>>>>> list and Leila and Dan in the hopes of getting some more clarity. I will >>>>>>>> add the same content to the Trello card. >>>>>>>> >>>>>>>> In the weekly team meeting earlier today we agreed that the >>>>>>>> first metric we want to collect data for is the number of articles created >>>>>>>> in each language over time. This is something has Amir has already set up our >>>>>>>> current Event Logging >>>>>>>> https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to >>>>>>>> track. Now that Kartik has enabled EL in beta, that part should be done. >>>>>>>> Since we are only barely turning it on, there will be very little data >>>>>>>> until people create more articles using CX. However, we should be set up to >>>>>>>> collect any new data that comes in. >>>>>>>> >>>>>>>> @Leila, can you verify that the db table now exists for the ContentTranslation >>>>>>>> schema >>>>>>>> https://meta.wikimedia.org/wiki/Schema:ContentTranslation? 
>>>>>>>> If it doesn’t, can you point us to right people we need to work with to >>>>>>>> troubleshoot the issue? Also you mentioned in our meeting that personal >>>>>>>> data may soon be purged after 90 days as part of a new privacy policy. >>>>>>>> Could you explain that a bit more or point us to more information? If this >>>>>>>> is the case, it may affect some of the metrics we would like to collect in >>>>>>>> the future. >>>>>>>> >>>>>>>> @Dan, what do we need to do next in order to set up a very >>>>>>>> simple visualization that would show the number of articles created per >>>>>>>> week by language. Pau has an image of what he would like on the Trello >>>>>>>> card >>>>>>>> https://trello.com/c/vQm0hlkt/18-content-translation-analytics. >>>>>>>> You mentioned something about being able to host a dashboard for us on one >>>>>>>> of the Limn servers you already have set up. >>>>>>>> >>>>>>>> @Santhosh, I believe you said earlier you have a script you >>>>>>>> use to export the data for the ULS analytics. If so can you share that >>>>>>>> please in case we need a similar script for CX so I don’t have to write a >>>>>>>> new script from scratch? >>>>>>>> >>>>>>>> @Pau, @Amir There is a section called High priorities for >>>>>>>> product management >>>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on >>>>>>>> the Content translation analytics page. Did these priorities come from >>>>>>>> outside the team or does this just represent our own internal view of the >>>>>>>> high priorities? If the latter, have these priorities been >>>>>>>> reviewed by anyone outside the team? I think we are safe to proceed with >>>>>>>> our current plan, but it would be good to have product sign off on things >>>>>>>> more generally. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Joel >>>>>>>> >>>>>>>> Joel Sahleen, Software Engineer >>>>>>>> Language Engineering >>>>>>>> Wikimedia Foundation >>>>>>>> jsahleen@wikimedia.org >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Localisation-team mailing list >>>>>>>> Localisation-team@lists.wikimedia.org >>>>>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Pau Giner >>>>>>> Interaction Designer >>>>>>> Wikimedia Foundation >>>>>>> _______________________________________________ >>>>>>> Localisation-team mailing list >>>>>>> Localisation-team@lists.wikimedia.org >>>>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>> >>> >>> >>> >>> _______________________________________________ >>> Analytics mailing list >>> Analytics@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/analytics >>> >>> >> > _______________________________________________ > Localisation-team mailing list > Localisation-team@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/localisation-team > > > > _______________________________________________ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > _______________________________________________ Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
Nuria Ruiz, 17/11/2014 19:53:
I think there is a confusion between the not so well named beta environment (testing environment in labs, which is what our thread refers to) and being a beta-feature.
AKA https://bugzilla.wikimedia.org/show_bug.cgi?id=56537
Nemo
I think there is a confusion between the not so well named beta environment (testing environment in labs, which is what our thread refers to) and being a beta-feature. If you are a beta-feature you are IN production and you can get data.
Thanks for the clarification, Nuria.
The issue, as I understand it, is that Product is asking for metrics on Content Translation usage "in beta” so they can make a "data-driven" decision about deployment to production. If what Product means by “in beta” is “as a beta feature” then we really have no problem. We’ll just have to wait until after we deploy as a beta feature in January to start collecting data and doing visualizations.
My understanding, however, is that what Product is asking for is metrics on Content Translation usage in “the beta environment” where a group of beta-testers has been using the extension for several months now. If the event logging data in "the beta environment" is not stable and this environment is really a software testing environment instead of a beta testing environment, then we can’t really fulfill Product's request; at least not by using event logging.
It looks like we need clarification from Product regarding what they mean by “beta,” and if that turns out to be “the beta environment” then we will have to work something out.
Thanks,
Joel
On Mon, Nov 17, 2014 at 10:41 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
+2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe get the data in a way we could use some quick d3-based tool?).
Thanks
Pau
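One way to get at per-week, per-language counts without a dashboard would be to aggregate the raw event records directly. The sketch below is only an illustration, not an existing tool: it assumes one JSON record per line, shaped like the sample dev-server output posted in this thread (a Unix `timestamp`, a `schema`, and an `event` with `action` and `targetLanguage`), and the function name `weekly_counts` is hypothetical.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(lines, action="create-translated-page"):
    """Count events per (ISO year, ISO week, target language).

    Assumption: each line is one JSON record shaped like the sample
    dev-server output posted in this thread.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        event = record.get("event", {})
        # Keep only ContentTranslation events with the requested action.
        if record.get("schema") != "ContentTranslation":
            continue
        if event.get("action") != action:
            continue
        day = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        iso = day.isocalendar()
        counts[(iso[0], iso[1], event.get("targetLanguage"))] += 1
    return counts


if __name__ == "__main__":
    # Record trimmed from the sample output in this thread.
    sample = ('{"schema": "ContentTranslation", "timestamp": 1415651367, '
              '"event": {"action": "create-translated-page", '
              '"targetLanguage": "ca"}}')
    # Emit CSV-like rows: year-week, language, count.
    for (year, week, language), n in sorted(weekly_counts([sample]).items()):
        print(f"{year}-W{week:02d},{language},{n}")
```

Rows in this shape could be fed to whatever charting tool is at hand. Since the beta data may be wiped at any time, any counts taken from beta would be indicative at best.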
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else, so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote: [+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel:
I see. I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta (you will need to create an account and sign in).
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
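For anyone reproducing this, the event logging request found in the network pane carries the event as URL-encoded JSON in its query string, so it can be decoded and inspected outside the browser. The following is a minimal Python sketch under stated assumptions: that the whole query string is one percent-encoded JSON object, optionally ending in a `;`. The function name `decode_event_url` is hypothetical, and the sample URL is constructed from fields in the sample event posted in this thread rather than captured from beta.

```python
import json
from urllib.parse import quote, unquote, urlsplit


def decode_event_url(url):
    """Return the JSON object carried in an event logging request URL.

    Assumption: the whole query string is one percent-encoded JSON
    object, optionally terminated by a ';'.
    """
    query = urlsplit(url).query
    if query.endswith(";"):
        query = query[:-1]
    return json.loads(unquote(query))


if __name__ == "__main__":
    # Hypothetical request built from fields of the sample event in this thread.
    sample = {
        "schema": "ContentTranslation",
        "revision": 7146627,
        "event": {"action": "create-translated-page", "targetLanguage": "ca"},
    }
    url = ("http://bits.beta.wmflabs.org/event.gif?"
           + quote(json.dumps(sample)) + ";")
    decoded = decode_event_url(url)
    print(decoded["schema"], decoded["event"]["targetLanguage"])
```

Pasting the request URL copied from dev tools into `decode_event_url` would show which fields the client actually sent, which can then be compared by eye against the ContentTranslation schema.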
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Niklas,
Can you answer this question from Nuria?
jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
> (Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.) > > Hi Leila, > > The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup. > > Niklas Laxstöm has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db. > > Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation. > > If there is data in the beta db the next step would be to follow with Dan’s instructions to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with. > > Thanks again for your help! > > Joel > > Joel Sahleen, Software Engineer > Language Engineering > Wikimedia Foundation > jsahleen@wikimedia.org > > > > > On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote: > >> Hi Joel, >> >> When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if validation section shows Valid? >> >> Leila >> >> On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote: >> Joel, >> >> For questions like these going forward you can contact analytics@ as you will be getting amore prompt response. Both Dan and Leila are OOTO the next couple of days. 
>> >> >There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server? >> No, there is no need. >> >> You would need sample rates to determine at which sampling rate you are logging if you are not logging all events, that is. >> >> Thanks, >> >> Nuria >> >> On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote: >> Adding Nuria as she can probably help >> >> On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote: >> Hi Leila, >> >> I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output: >> >> {"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627} >> >> Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file. There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server? >> >> Any help on this would be much appreciated. >> >> Thanks, >> >> Joel >> >> On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote: >> >>> No problem, Dan. Enjoy your vacation! >>> >>> I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria. 
>>> >>> Thanks, >>> >>> Joel >>> >>> Joel Sahleen, Software Engineer >>> Language Engineering >>> Wikimedia Foundation >>> jsahleen@wikimedia.org >>> >>> >>> >>> >>> On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote: >>> >>>> Joel, re: visualization, >>>> >>>> I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards >>>> >>>> If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you. >>>> >>>> Dan >>>> >>>> On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote: >>>> Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I was just wasn’t clear about the relationship between the "high priorities" and "other metrics” sections. Knowing these came from different people at different times clarifies things a lot. >>>> Joel >>>> >>>> On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote: >>>> >>>>> @Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? >>>>> >>>>> Here is the story of that page as I'm aware of it: >>>>> >>>>> In September 2013, I was in a meeting with the analytics team in SF presentingan initial proposal for metrics. On that meeting, Dario recommended to create hierarchy of metrics based on the project goals. 
I created such image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what it now the "Other metrics for created articles" section. >>>>> >>>>> In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management". If I recalled correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not on those meetings. >>>>> >>>>> In short: The analytics page has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of ourgoal-to-signal process for detecting better metrics. >>>>> >>>>> >>>>> Pau >>>>> >>>>> On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.orgwrote: >>>>> Hi All, >>>>> >>>>> I have been reviewing our requirements for Content translation analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card. >>>>> >>>>> In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something has Amir has already set up our current Event Logging to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in. >>>>> >>>>> @Leila, can you verify that the db table now exists for the ContentTranslation schema? 
If it doesn’t, can you point us to right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future. >>>>> >>>>> @Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language. Pau has an image of what he would like on the Trello card. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up. >>>>> >>>>> @Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch? >>>>> >>>>> @Pau, @Amir There is a section called High priorities for product management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally. 
>>>>> >>>>> Thanks, >>>>> >>>>> Joel >>>>> >>>>> Joel Sahleen, Software Engineer >>>>> Language Engineering >>>>> Wikimedia Foundation >>>>> jsahleen@wikimedia.org >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Localisation-team mailing list >>>>> Localisation-team@lists.wikimedia.org >>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Pau Giner >>>>> Interaction Designer >>>>> Wikimedia Foundation >>>>> _______________________________________________ >>>>> Localisation-team mailing list >>>>> Localisation-team@lists.wikimedia.org >>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>> >>>> >>> >> >> >> >
Joel,
Please look at the wiki page for beta features: "The primary purpose of Beta Features is to allow for Wikimedia designers and engineers (from the Wikimedia Foundation and community alike) to roll out technical improvements in an environment where large numbers of users can test, give feedback, and use these features in real-world settings."
http://www.mediawiki.org/wiki/Beta_Features
This is normally what product refers to as "beta". You can, of course, confirm.
The Beta cluster's purpose is software testing (not quite the same thing): http://www.mediawiki.org/wiki/Beta_cluster
Thanks,
Nuria
On Mon, Nov 17, 2014 at 12:03 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
I think there is a confusion between the not so well named beta environment (testing environment in labs, which is what our thread refers to) and being a beta-feature. *If you are a beta-feature you are IN production and you can get data.*
Thanks for the clarification, Nuria.
The issue, as I understand it, is that Product is asking for metrics on Content Translation usage "in beta” so they can make a "data-driven" decision about deployment to production. If what Product means by “in beta” is “as a beta feature” then we really have no problem. We’ll just have to wait until after we deploy as a beta feature in January to start collecting data and doing visualizations.
My understanding, however, is that what Product is asking for is metrics on Content Translation usage in “the beta environment” where a group of beta-testers has been using the extension for several months now. If the event logging data in "the beta environment" is not stable and this environment is really a software testing environment instead of a beta testing environment, then we can’t really fulfill Product's request; at least not by using event logging.
It looks like we need clarification from Product regarding what they mean by “beta,” and if that turns out to be “the beta environment” then we will have to work something out.
Thanks,
Joel
On Mon, Nov 17, 2014 at 10:41 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
+2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe get the data in a way we could use some quick d3-based tool http://code.shutterstock.com/rickshaw/ ?).
Thanks
Pau
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else, so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug https://bugzilla.wikimedia.org/show_bug.cgi?id=73388 in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, one thing is the setup of the varnish environment and another is the setup of the eventlogging machine, which has not received new code for quite a while, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see, I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
>> Please let me know if there is any way I can help out or if there is anything you need from our end.
> When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
> To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi page in beta (you will need to create an account and sign in).
2. Open chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
> Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
> On our end we will work to troubleshoot the beta EL infrastructure, I am not familiar with it and neither is anyone on our team but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif http://bits.beta.wmflabs.org/event.gif?foo=bar which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
> jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db the next step would be to follow Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if the validation section shows Valid?
Leila
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
> There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
You would need sample rates to determine at which sampling rate you are logging if you are not logging all events, that is.
Thanks,
Nuria
On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Adding Nuria as she can probably help
On Monday, November 10, 2014, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Leila,
I have tested our EventLogging code and it seems to be working fine with the event logging dev server. I can see the events coming through and they are valid. Here is some sample output:
{"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", "event": {"action": "create-translated-page", "targetLanguage": "ca", "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": 7146627}
Are there additional configuration options we need to add to get EL working aside from just requiring the main extension file? There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
Any help on this would be much appreciated.
Thanks,
Joel
On Nov 7, 2014, at 3:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
No problem, Dan. Enjoy your vacation!
I will read through the document at the link you sent. I still need to fix our event logging code so it may be a couple days before we are ready anyway. If I have any questions I will contact Leila or Nuria.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 7, 2014, at 3:10 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Joel, re: visualization,
I'm going on vacation tomorrow and will be back on November 19th. If that's not too late, I can set up a limn instance then. If it's too late, that's ok, I wrote up the steps needed. Someone with access to the limn1.eqiad.wmflabs instance can perform them: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
If you have the data or are generating the data in some other way, then you don't need half of that setup, you just need the part that sets up the limn dashboard which is only an hour or so of work. Sorry I'm running out the door and can't take care of that for you.
Dan
On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Thank you for the information, Pau. Very helpful. As you say, this does not change our current plans or hold us up in any way. I just wasn’t clear about the relationship between the "high priorities" and "other metrics" sections. Knowing these came from different people at different times clarifies things a lot.
Joel
On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org wrote:
> @Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities?
Here is the story of that page as I'm aware of it:
In September 2013, I was in a meeting with the analytics team in SF presenting an initial proposal for metrics https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. In that meeting, Dario recommended creating a hierarchy of metrics based on the project goals. I created such an image and a description for those metrics (the image is on top of our analytics page and the metrics are described in what is now the "Other metrics for created articles" section).
In a meeting between Amir and Howie, they captured which should be the most important metrics from the product perspective in the "High priorities for product management" section. If I recall correctly, as an outcome of later meetings between Howie and Amir, Howie was happy focusing on articles published as a single (initial?) metric for success. Amir can provide more details since I was not in those meetings.
In short: The analytics page https://www.mediawiki.org/wiki/Content_translation/analytics has pieces contributed by different people during the last year, and although there are many ideas to organise and detail, measuring the number of published articles seems to be the solid candidate to get started with, learn from the value we get from it and polish the rest of our goal-to-signal process http://www.rodden.org/kerry/heart/ for detecting better metrics.
Pau
-- Pau Giner Interaction Designer Wikimedia Foundation
On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi All,
I have been reviewing our requirements for Content translation analytics https://www.mediawiki.org/wiki/Content_translation/analytics and I have a few questions/requests. I am sending them to the language team list and Leila and Dan in the hopes of getting some more clarity. I will add the same content to the Trello card.
In the weekly team meeting earlier today we agreed that the first metric we want to collect data for is the number of articles created in each language over time. This is something Amir has already set up our current Event Logging https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to track. Now that Kartik has enabled EL in beta, that part should be done. Since we are only barely turning it on, there will be very little data until people create more articles using CX. However, we should be set up to collect any new data that comes in.
@Leila, can you verify that the db table now exists for the ContentTranslation schema https://meta.wikimedia.org/wiki/Schema:ContentTranslation? If it doesn’t, can you point us to the right people we need to work with to troubleshoot the issue? Also you mentioned in our meeting that personal data may soon be purged after 90 days as part of a new privacy policy. Could you explain that a bit more or point us to more information? If this is the case, it may affect some of the metrics we would like to collect in the future.
@Dan, what do we need to do next in order to set up a very simple visualization that would show the number of articles created per week by language? Pau has an image of what he would like on the Trello card https://trello.com/c/vQm0hlkt/18-content-translation-analytics. You mentioned something about being able to host a dashboard for us on one of the Limn servers you already have set up.
@Santhosh, I believe you said earlier you have a script you use to export the data for the ULS analytics. If so can you share that please in case we need a similar script for CX so I don’t have to write a new script from scratch?
@Pau, @Amir There is a section called High priorities for product management https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on the Content translation analytics page. Did these priorities come from outside the team or does this just represent our own internal view of the high priorities? If the latter, have these priorities been reviewed by anyone outside the team? I think we are safe to proceed with our current plan, but it would be good to have product sign off on things more generally.
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
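The debugging steps and Leila's description of the logged output (URL-encoded JSON, event record, validation result) can be sketched in a few lines. The example below is a minimal sketch, not the EventLogging implementation: it assumes the beacon request carries the percent-encoded JSON as the whole query string of the event.gif URL mentioned in the thread, optionally ending in a `;` terminator, and it only checks that a few top-level keys seen in the sample record are present. `decode_beacon`, `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and real schema validation would compare the event against the ContentTranslation schema on meta.

```python
import json
from urllib.parse import quote, unquote, urlsplit

# Illustrative only: top-level keys taken from the sample record in this thread.
REQUIRED_KEYS = ("schema", "revision", "event")


def decode_beacon(url):
    """Return the JSON object carried in the query string of a beacon URL."""
    query = urlsplit(url).query
    # Assumption: the payload may end with a ';' terminator, which is dropped here.
    return json.loads(unquote(query.rstrip(";")))


def missing_keys(capsule):
    """List the required top-level keys that the decoded capsule lacks."""
    return [key for key in REQUIRED_KEYS if key not in capsule]


# Build a beacon URL from a trimmed copy of the sample record, then decode it again.
capsule = {
    "schema": "ContentTranslation",
    "revision": 7146627,
    "event": {
        "action": "create-translated-page",
        "targetLanguage": "ca",
        "contentLanguage": "es",
        "version": 1,
    },
}
url = "http://bits.beta.wmflabs.org/event.gif?" + quote(json.dumps(capsule))
decoded = decode_beacon(url)

print(decoded["event"]["action"], missing_keys(decoded))
```

To use it while following the steps above, copy the request URL of the event logging entry from the network pane and pass it to `decode_beacon`.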
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
-- Pau Giner Interaction Designer Wikimedia Foundation _______________________________________________ Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
This is normally what product refers to as "beta". You can, of course, confirm.
I think that the Product team is already aware that Content Translation is not a "beta feature". It is deployed on beta-labs, but it is used by users to create articles that later they copy and publish to a real Wikipedia (which may qualify as "REAL USERS USAGE"). You can check this page https://www.mediawiki.org/wiki/Content_translation/Published_pages for a list of articles created on beta-labs and published into real Wikipedias.
The beta-features framework isolates users from UI changes but not from the content produced/modified by the features. If we deploy Content Translation as a beta feature, the articles published by users will be visible to everyone because this is content created on a real Wikipedia. Thus, the community can ask us "how do we know this is not going to flood Wikipedia with robot-like horrible translations?", or tell us directly that we are doing so because they just found a couple of badly translated articles. We can answer that the experience in our test environments was positive in a number of ways (user testing, info from manually collecting numbers), but we don't have more detailed numbers on article production and we are not ready to measure the impact right after we deploy, because it was not possible to get things ready in advance. I find this approach could be problematic, but I'm happy to follow the Analytics advice on this.
In any case, as said before, this is worth checking with product.
Pau
On Mon, Nov 17, 2014 at 12:17 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
Please look at the wiki page for beta features: "The primary purpose of Beta Features is to allow for Wikimedia designers and engineers (from the Wikimedia Foundation and community alike) to roll out technical improvements in an environment where large numbers of users can test, give feedback, and use these features in real-world settings."
http://www.mediawiki.org/wiki/Beta_Features
This is normally what product refers to as "beta". You can, of course, confirm.
The Beta cluster's purpose is software testing (not quite the same thing): http://www.mediawiki.org/wiki/Beta_cluster
Thanks,
Nuria
On Mon, Nov 17, 2014 at 12:03 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
I think there is a confusion between the not so well named beta environment (testing environment in labs, which is what our thread refers to) and being a beta-feature. *If you are a beta-feature you are IN production and you can get data.*
Thanks for the clarification, Nuria.
The issue, as I understand it, is that Product is asking for metrics on Content Translation usage "in beta" so they can make a "data-driven" decision about deployment to production. If what Product means by "in beta" is "as a beta feature" then we really have no problem. We’ll just have to wait until after we deploy as a beta feature in January to start collecting data and doing visualizations.
My understanding, however, is that what Product is asking for is metrics on Content Translation usage in “the beta environment” where a group of beta-testers has been using the extension for several months now. If the event logging data in "the beta environment" is not stable and this environment is really a software testing environment instead of a beta testing environment, then we can’t really fulfill Product's request; at least not by using event logging.
It looks like we need clarification from Product regarding what they mean by “beta,” and if that turns out to be “the beta environment” then we will have to work something out.
Thanks,
Joel
On Mon, Nov 17, 2014 at 10:41 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
+2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe get the data in a way that we could use with some quick d3-based tool http://code.shutterstock.com/rickshaw/?).
Thanks
Pau
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else, so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug https://bugzilla.wikimedia.org/show_bug.cgi?id=73388 in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, one thing is the setup of the varnish environment and another is the setup of the eventlogging machine, which has not received new code for quite a while, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see, I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
>> Please let me know if there is any way I can help out or if there is anything you need from our end.
> When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
> To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi page in beta (you will need to create an account and sign in).
2. Open chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
> Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
> On our end we will work to troubleshoot the beta EL infrastructure, I am not familiar with it and neither is anyone on our team but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif http://bits.beta.wmflabs.org/event.gif?foo=bar which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
> jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
Thanks,
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 12, 2014, at 4:28 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
(Moving this discussion to analytics@ and localization-team@ based on Nuria’s suggestion below.)
Hi Leila,
The output I posted in the message is the only output I am seeing. I do not see the URL-encoded section or the validation section. I think there may be something wrong with my testing setup.
Niklas Laxström has checked what is happening with our event logging in beta and he confirmed that we are sending events and the events are valid. The issue seems to be that we are logging events to the beta event logging db while what we checked earlier was the production event logging db.
Can you (or anyone who is available) check the event logging db in beta to see if the table has been created and has data? The schema name again is ContentTranslation. If you don’t find anything, let us know and we will do some more investigation.
If there is data in the beta db the next step would be to follow Dan’s instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards to get a dashboard set up on limn1. I believe that most of Dan’s instructions need to be handled by someone on the analytics team, but let me know if there is anything I can help with.
Thanks again for your help!
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 11, 2014, at 11:47 PM, Leila Zia leila@wikimedia.org wrote:
Hi Joel,
When you log events, the output will be the URL-encoded JSON sent by the browser, the event record (similar to what you pasted in your email), and whether the event validates against the schema. For the sample output you pasted earlier, or another sample output, can you let us know if the validation section shows Valid?
Leila
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
For questions like these going forward you can contact analytics@ as you will be getting a more prompt response. Both Dan and Leila are OOTO the next couple of days.
> There are configuration options for the dev server that need to be added. Do similar options need to be added when not using the dev server?
No, there is no need.
>>>>>> >>>>>> You would need sample rates to determine at which sampling rate >>>>>> you are logging if you are not logging all events, that is. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Nuria >>>>>> >>>>>> On Mon, Nov 10, 2014 at 2:39 PM, Dan Andreescu < >>>>>> dandreescu@wikimedia.org> wrote: >>>>>> >>>>>>> Adding Nuria as she can probably help >>>>>>> >>>>>>> On Monday, November 10, 2014, Joel Sahleen < >>>>>>> jsahleen@wikimedia.org> wrote: >>>>>>> >>>>>>>> Hi Leila, >>>>>>>> >>>>>>>> I have tested our EventLogging code and it seems to be >>>>>>>> working fine with the event logging dev server. I can see the events coming >>>>>>>> through and they are valid. Here is some sample output: >>>>>>>> >>>>>>>> {"wiki": "wiki", "uuid": "e9dde14cf18552269ae81a7897f45d0c", >>>>>>>> "webHost": "localhost", "timestamp": 1415651367, "clientValidated": true, >>>>>>>> "recvFrom": "1.0.0.127.in-addr.arpa", "seqId": 2, "clientIp": >>>>>>>> "80f7683f3565e3d365740a1c8d1771ba95caaaaa", "schema": "ContentTranslation", >>>>>>>> "event": {"action": "create-translated-page", "targetLanguage": "ca", >>>>>>>> "token": "Tester", "version": 1, "contentLanguage": "es"}, "revision": >>>>>>>> 7146627} >>>>>>>> >>>>>>>> Are there additional configuration options we need to add to >>>>>>>> get EL working aside from just requiring the main extension file. There are >>>>>>>> configuration options for the dev server that need to be added. Do similar >>>>>>>> options need to be added when not using the dev server? >>>>>>>> >>>>>>>> Any help on this would be much appreciated. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Joel >>>>>>>> >>>>>>>> On Nov 7, 2014, at 3:52 PM, Joel Sahleen < >>>>>>>> jsahleen@wikimedia.org> wrote: >>>>>>>> >>>>>>>> No problem, Dan. Enjoy your vacation! >>>>>>>> >>>>>>>> I will read through the document at the link you sent. I >>>>>>>> still need to fix our event logging code so it may be a couple days before >>>>>>>> we are ready anyway. 
If I have any questions I will contact Leila or Nuria. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Joel >>>>>>>> >>>>>>>> Joel Sahleen, Software Engineer >>>>>>>> Language Engineering >>>>>>>> Wikimedia Foundation >>>>>>>> jsahleen@wikimedia.org >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Nov 7, 2014, at 3:10 PM, Dan Andreescu < >>>>>>>> dandreescu@wikimedia.org> wrote: >>>>>>>> >>>>>>>> Joel, re: visualization, >>>>>>>> >>>>>>>> I'm going on vacation tomorrow and will be back on November >>>>>>>> 19th. If that's not too late, I can set up a limn instance then. If it's >>>>>>>> too late, that's ok, I wrote up the steps needed. Someone with access to >>>>>>>> the limn1.eqiad.wmflabs instance can perform them: >>>>>>>> https://wikitech.wikimedia.org/wiki/Analytics/Dashboards >>>>>>>> >>>>>>>> If you have the data or are generating the data in some other >>>>>>>> way, then you don't need half of that setup, you just need the part that >>>>>>>> sets up the limn dashboard which is only an hour or so of work. Sorry I'm >>>>>>>> running out the door and can't take care of that for you. >>>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> On Fri, Nov 7, 2014 at 7:37 AM, Joel Sahleen < >>>>>>>> jsahleen@wikimedia.org> wrote: >>>>>>>> >>>>>>>>> Thank you for the information, Pau. Very helpful. As you >>>>>>>>> say, this does not change our current plans or hold us up in any way. I was >>>>>>>>> just wasn’t clear about the relationship between the "high priorities" and >>>>>>>>> "other metrics” sections. Knowing these came from different people at >>>>>>>>> different times clarifies things a lot. >>>>>>>>> Joel >>>>>>>>> >>>>>>>>> On Nov 7, 2014, at 3:44 AM, Pau Giner pginer@wikimedia.org >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> @Pau, @Amir There is a section called High priorities for >>>>>>>>>> product management >>>>>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on >>>>>>>>>> the Content translation analytics page. 
Did these priorities come from >>>>>>>>>> outside the team or does this just represent our own internal view of the >>>>>>>>>> high priorities? >>>>>>>>> >>>>>>>>> >>>>>>>>> Here is the story of that page as I'm aware of it: >>>>>>>>> >>>>>>>>> In September 2013, I was in a meeting with the analytics >>>>>>>>> team in SF presentingan initial proposal for metrics >>>>>>>>> https://docs.google.com/a/wikimedia.org/presentation/d/1V1XLV7jUcAtco5ZC49SNTt3VecH7hARZ6vqbSFGnOYc/edit?usp=sharing. >>>>>>>>> On that meeting, Dario recommended to create hierarchy of metrics based on >>>>>>>>> the project goals. I created such image and a description for those metrics >>>>>>>>> (the image is on top of our analytics page and the metrics are described in >>>>>>>>> what it now the "Other metrics for created articles" section. >>>>>>>>> >>>>>>>>> In a meeting between Amir and Howie, they captured which >>>>>>>>> should be the most important metrics from the product perspective in the >>>>>>>>> "High priorities for product management". If I recalled correctly, as an >>>>>>>>> outcome of later meetings between Howie and Amir, Howie was happy focusing >>>>>>>>> on articles published as a single (initial?) metric for success. Amir can >>>>>>>>> provide more details since I was not on those meetings. >>>>>>>>> >>>>>>>>> In short: The analytics page >>>>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics >>>>>>>>> has pieces contributed by different people during the last >>>>>>>>> year, and although there are many ideas to organise and detail, measuring >>>>>>>>> the number of published articles seems to be the solid candidate to get >>>>>>>>> started with, learn from the value we get from it and polish the rest of ourgoal-to-signal >>>>>>>>> process http://www.rodden.org/kerry/heart/ for detecting >>>>>>>>> better metrics. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> Pau >>>>>>>>> >>>>>>>>> On Fri, Nov 7, 2014 at 1:57 AM, Joel Sahleen < >>>>>>>>> jsahleen@wikimedia.org>wrote: >>>>>>>>> >>>>>>>>>> Hi All, >>>>>>>>>> >>>>>>>>>> I have been reviewing our requirements for Content >>>>>>>>>> translation analytics >>>>>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics and >>>>>>>>>> I have a few questions/requests. I am sending them to the language team >>>>>>>>>> list and Leila and Dan in the hopes of getting some more clarity. I will >>>>>>>>>> add the same content to the Trello card. >>>>>>>>>> >>>>>>>>>> In the weekly team meeting earlier today we agreed that the >>>>>>>>>> first metric we want to collect data for is the number of articles created >>>>>>>>>> in each language over time. This is something has Amir has already set up our >>>>>>>>>> current Event Logging >>>>>>>>>> https://git.wikimedia.org/blob/mediawiki/extensions/ContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules/eventlogging/ext.cx.eventlogging.js to >>>>>>>>>> track. Now that Kartik has enabled EL in beta, that part should be done. >>>>>>>>>> Since we are only barely turning it on, there will be very little data >>>>>>>>>> until people create more articles using CX. However, we should be set up to >>>>>>>>>> collect any new data that comes in. >>>>>>>>>> >>>>>>>>>> @Leila, can you verify that the db table now exists for the ContentTranslation >>>>>>>>>> schema >>>>>>>>>> https://meta.wikimedia.org/wiki/Schema:ContentTranslation? >>>>>>>>>> If it doesn’t, can you point us to right people we need to work with to >>>>>>>>>> troubleshoot the issue? Also you mentioned in our meeting that personal >>>>>>>>>> data may soon be purged after 90 days as part of a new privacy policy. >>>>>>>>>> Could you explain that a bit more or point us to more information? If this >>>>>>>>>> is the case, it may affect some of the metrics we would like to collect in >>>>>>>>>> the future. 
>>>>>>>>>> >>>>>>>>>> @Dan, what do we need to do next in order to set up a very >>>>>>>>>> simple visualization that would show the number of articles created per >>>>>>>>>> week by language. Pau has an image of what he would like on the Trello >>>>>>>>>> card >>>>>>>>>> https://trello.com/c/vQm0hlkt/18-content-translation-analytics. >>>>>>>>>> You mentioned something about being able to host a dashboard for us on one >>>>>>>>>> of the Limn servers you already have set up. >>>>>>>>>> >>>>>>>>>> @Santhosh, I believe you said earlier you have a script you >>>>>>>>>> use to export the data for the ULS analytics. If so can you share that >>>>>>>>>> please in case we need a similar script for CX so I don’t have to write a >>>>>>>>>> new script from scratch? >>>>>>>>>> >>>>>>>>>> @Pau, @Amir There is a section called High priorities for >>>>>>>>>> product management >>>>>>>>>> https://www.mediawiki.org/wiki/Content_translation/analytics#High_priorities_for_product_management on >>>>>>>>>> the Content translation analytics page. Did these priorities come from >>>>>>>>>> outside the team or does this just represent our own internal view of the >>>>>>>>>> high priorities? If the latter, have these priorities been >>>>>>>>>> reviewed by anyone outside the team? I think we are safe to proceed with >>>>>>>>>> our current plan, but it would be good to have product sign off on things >>>>>>>>>> more generally. 
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Joel >>>>>>>>>> >>>>>>>>>> Joel Sahleen, Software Engineer >>>>>>>>>> Language Engineering >>>>>>>>>> Wikimedia Foundation >>>>>>>>>> jsahleen@wikimedia.org >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Localisation-team mailing list >>>>>>>>>> Localisation-team@lists.wikimedia.org >>>>>>>>>> >>>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Pau Giner >>>>>>>>> Interaction Designer >>>>>>>>> Wikimedia Foundation >>>>>>>>> _______________________________________________ >>>>>>>>> Localisation-team mailing list >>>>>>>>> Localisation-team@lists.wikimedia.org >>>>>>>>> >>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Analytics mailing list >>>>> Analytics@lists.wikimedia.org >>>>> https://lists.wikimedia.org/mailman/listinfo/analytics >>>>> >>>>> >>>> >>> _______________________________________________ >>> Localisation-team mailing list >>> Localisation-team@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>> >>> >>> >>> _______________________________________________ >>> Analytics mailing list >>> Analytics@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/analytics >>> >>> >> _______________________________________________ >> Localisation-team mailing list >> Localisation-team@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/localisation-team >> >> >> >> _______________________________________________ >> Analytics mailing list >> Analytics@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >> > > _______________________________________________ > Analytics mailing list > 
Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > >
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
I find that the idea that "beta" is a stable environment where users can try stuff is not in sync with the fact that beta-labs is a release and QA testing environment for every single team.
For example: we are actually about to test some changes to EL on our end. Those will affect data gathering on our side, and that should be no issue: we are dealing with a testing environment and we are changing our database interaction without affecting prod. However, our changes might break the data collection in beta-labs. That should not block you; it's a testing playground for everyone.
A testing environment for testing releases is an OK place to test software, but I am not so sure it is best suited to testing "features". You require the rest of the system to be stable and that will not always be the case. Now, it could very well be that we have no better place to test content translation software at this time. If this is the best we have, let's use it, but with the right set of expectations.
On Mon, Nov 17, 2014 at 1:33 PM, Pau Giner pginer@wikimedia.org wrote:
This is normally what product refers to as "beta". You can, of course, confirm.
I think that the Product team is already aware that Content Translation is not a "beta feature". It is deployed on beta-labs, but it is used by users to create articles that later they copy and publish to a real Wikipedia (which may qualify as "REAL USERS USAGE"). You can check this page https://www.mediawiki.org/wiki/Content_translation/Published_pages for a list of articles created on beta-labs and published into real Wikipedias.
The beta-features framework isolates users from UI changes but not from the content produced/modified by the features. If we deploy Content Translation as a beta feature, the articles published by users will be visible to everyone because this is content created on a real Wikipedia. Thus, the community may ask us, "How do we know this is not going to flood Wikipedia with robot-like horrible translations?", or tell us directly that we are doing so because they just found a couple of badly translated articles. We could answer that the experience in our test environments was positive in a number of ways (user testing, info from manually collecting numbers), but that we don't have more detailed numbers on article production and are not ready to measure the impact right after deployment because it was not possible to get things ready in advance. I find this approach could be problematic, but I'm happy to follow the Analytics advice on this.
In any case, as said before, this is worth checking with product.
Pau
On Mon, Nov 17, 2014 at 12:17 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
Please look at the wiki page for beta features: "The primary purpose of Beta Features is to allow for Wikimedia designers and engineers (from the Wikimedia Foundation and community alike) to roll out technical improvements in an environment where large numbers of users can test, give feedback, and use these features in real-world settings. "
http://www.mediawiki.org/wiki/Beta_Features
This is normally what product refers to as "beta". You can, of course, confirm.
Beta cluster purpose is software testing (not quite the same thing): http://www.mediawiki.org/wiki/Beta_cluster
Thanks,
Nuria
On Mon, Nov 17, 2014 at 12:03 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
I think there is a confusion between the not so well named beta environment (testing environment in labs, which is what our thread refers to) and being a beta-feature. *If you are a beta-feature you are IN production and you can get data.*
Thanks for the clarification, Nuria.
The issue, as I understand it, is that Product is asking for metrics on Content Translation usage “in beta” so they can make a “data-driven” decision about deployment to production. If what Product means by “in beta” is “as a beta feature” then we really have no problem. We’ll just have to wait until after we deploy as a beta feature in January to start collecting data and doing visualizations.
My understanding, however, is that what Product is asking for is metrics on Content Translation usage in “the beta environment” where a group of beta-testers has been using the extension for several months now. If the event logging data in "the beta environment" is not stable and this environment is really a software testing environment instead of a beta testing environment, then we can’t really fulfill Product's request; at least not by using event logging.
It looks like we need clarification from Product regarding what they mean by “beta,” and if that turns out to be “the beta environment” then we will have to work something out.
Thanks,
Joel
On Mon, Nov 17, 2014 at 10:41 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
+2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe get the data in a way that lets us use some quick d3-based tool http://code.shutterstock.com/rickshaw/?).
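As a rough illustration of what "making sense" of this information could look like, here is a minimal Python sketch that counts published translations per week and target language. It assumes the events are available as one JSON record per line, shaped like the sample record Joel pasted in this thread ("schema", "timestamp", and an "event" object with "action" and "targetLanguage"). The function name weekly_counts and the line-per-record input are hypothetical, chosen only for this example; this is not how EventLogging or limn actually export data.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(lines):
    """Count create-translated-page events per (ISO week, target language).

    `lines` is any iterable of JSON strings, one event record per string.
    The input format is an assumption made for this sketch.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Skip records that belong to other schemas.
        if record.get("schema") != "ContentTranslation":
            continue
        event = record.get("event", {})
        # Only published translations count towards "articles created".
        if event.get("action") != "create-translated-page":
            continue
        # The sample record carries a Unix timestamp in seconds.
        when = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        year, week, _ = when.isocalendar()
        counts[(f"{year}-W{week:02d}", event.get("targetLanguage"))] += 1
    return counts
```

The resulting counter maps (week, language) pairs to article counts, which could be written out as CSV and fed to whatever charting tool we end up using.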
Thanks
Pau
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in beta environment database to test something else so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
- We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug https://bugzilla.wikimedia.org/show_bug.cgi?id=73388 in bugzilla to track any related work.
- Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions https://wikitech.wikimedia.org/wiki/Analytics/Dashboards. Timing depends on the first item. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
>Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL. Can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
> [+ Ori] > > Joel, Ori looked into this now. There was a problem with EL in labs > which affected logging events from Beta. Ori has fixed the issue, and the > fix is waiting approval from ops. Let's touch-base tomorrow to see if we > see events. > > Leila > > > > On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org > wrote: > >> Joel: >> >> I see, I was hoping to set aside the beta issues but if you are not >> deploying to prod any time soon I guess we will need to troubleshoot there. >> By the looks of it EL has not worked in beta since august, but, as I said >> before, I know very little about how beta is put together. >> >> I have filed a bug to regarding the beta issue: >> https://bugzilla.wikimedia.org/show_bug.cgi?id=73388 >> >> >> >> >> >> >> On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen < >> jsahleen@wikimedia.org> wrote: >> >>> Hi Nuria, >>> >>> >Please let me know if there is any way I can help out or if >>> there is anything you need from our end. >>> When you have deployed your newest code to production, let's check >>> whether events appear on the production stream. Let us know when deployment >>> is done and you think your code should be logging. >>> >>> >>> Our code is not scheduled to be released to production until >>> January. Getting the metrics is partly to help us ensure and promote that >>> release. We will keep you informed as our plans progress, but hopefully we >>> can figure out what the issue is in beta soon. >>> >>> To confirm: You have seen proper logging from your events in >>> vagrant, right? >>> >>> >>> The output I am seeing with vagrant is what I pasted to this >>> thread earlier. It does not contain the url-encoded section or the user >>> agent information as we discussed before. I think that is an issue with my >>> dev environment, however, and not a problem with the code. The same code >>> appears to be sending valid events in beta. The http request I sent to your >>> email earlier is what we are seeing there. 
It seems to include all the >>> information you said it should include. >>> >>> If you want to debug what is happening in beta yourself, an easy >>> way I found to do that is: >>> >>> >>> 1. Go to our Content Translation translation view >>> http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi page >>> in beta (you will need to create an account and sign in) >>> 2. Open chrome dev tools, >>> 3. Click the add translation links that appear in the middle >>> column to add a few machine translated paragraphs to the editor >>> 4. Click on the publish button in the header to publish the >>> translation to your user namespace (triggers EL event) >>> 5. Look at the network pane in chrome dev tools and find the >>> entry with the event logging url (it should be near the bottom). >>> 6. Click on the entry to see all the request and response >>> information. >>> >>> >>> You probably already know all this, but I thought I would pass it >>> along just in case it helps. >>> >>> Di you setup a sampling rate or code is logging 1 to 1? >>> >>> >>> No sample rate. Just logging 1 to 1. >>> >>> On our end we will work to troubleshoot the beta EL >>> infrastructure, I am not familiar with it and neither is anyone on our team >>> but we will ask around. >>> >>> >>> Yeah, Dan said you all kind of inherited EL so that’s totally >>> understandable. We appreciate you looking into this for us. Let us know how >>> else we can help. >>> >>> Joel >>> >>> >>> >>> >>> >>> On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen < >>> jsahleen@wikimedia.org> wrote: >>> >>>> Hi Nuria, >>>> >>>> Thank you so much for your help on this. Please let me know if >>>> there is any way I can help out or if there is anything you need from our >>>> end. 
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Localisation-team mailing list Localisation-team@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/localisation-team
Nuria wrote:
I find the idea that "beta" is a stable environment where users can try stuff is not in sync with the fact that beta-labs is a release and QA testing environment for every single team.
Agreed. Whatever we would like “beta” to be, it is clearly not a stable environment and therefore not appropriate for true beta testing. However, it is also all we have at the moment, as you point out. Ideally, there would be dev, testing, beta and production environments along with a set process for moving code from one environment to another, but since that is not the case all we can do is get by with what we have. The important thing to me is to make sure the Product team is aware that event logging cannot be used in “beta,” so they can reset any expectations they may have regarding pre-release metrics collected in that way.
Pau wrote:
In any case, as said before, this is worth checking with product.
Who should do this? Me? Amir? Seems like something to bring up at the meeting with Howie and Erik later this week, but it might be worth giving Howie a heads up beforehand.
Pau
On Mon, Nov 17, 2014 at 12:17 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel,
Please look at the wiki page for beta features: "The primary purpose of Beta Features is to allow for Wikimedia designers and engineers (from the Wikimedia Foundation and community alike) to roll out technical improvements in an environment where large numbers of users can test, give feedback, and use these features in real-world settings."
http://www.mediawiki.org/wiki/Beta_Features
This is normally what product refers to as "beta". You can, of course, confirm.
The Beta cluster's purpose is software testing (not quite the same thing): http://www.mediawiki.org/wiki/Beta_cluster
Thanks,
Nuria
On Mon, Nov 17, 2014 at 12:03 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
I think there is a confusion between the not so well named beta environment (testing environment in labs, which is what our thread refers to) and being a beta-feature. If you are a beta-feature you are IN production and you can get data.
Thanks for the clarification, Nuria.
The issue, as I understand it, is that Product is asking for metrics on Content Translation usage “in beta” so they can make a "data-driven" decision about deployment to production. If what Product means by “in beta” is “as a beta feature” then we really have no problem. We’ll just have to wait until after we deploy as a beta feature in January to start collecting data and doing visualizations.
My understanding, however, is that what Product is asking for is metrics on Content Translation usage in “the beta environment” where a group of beta-testers has been using the extension for several months now. If the event logging data in "the beta environment" is not stable and this environment is really a software testing environment instead of a beta testing environment, then we can’t really fulfill Product's request; at least not by using event logging.
It looks like we need clarification from Product regarding what they mean by “beta,” and if that turns out to be “the beta environment” then we will have to work something out.
Thanks,
Joel
On Mon, Nov 17, 2014 at 10:41 AM, Joel Sahleen jsahleen@wikimedia.org wrote: +2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe get the data in a way we could use some quick d3-based tool?).
Thanks
Pau
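As a rough illustration of the kind of lightweight processing Pau asks about, the sketch below counts published translations per week and target language, producing rows that a small d3-based chart could consume. It assumes event records are available as dictionaries using the field names of the sample ContentTranslation record posted in this thread (schema, timestamp, event.action, event.targetLanguage). The function name weekly_counts and the sample list are hypothetical, and how records would be exported from the event logging database is not covered.

```python
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(records):
    """Count published translations per (ISO year, ISO week, target language)."""
    counts = Counter()
    for rec in records:
        event = rec.get("event", {})
        # Only count publish events from the ContentTranslation schema.
        if rec.get("schema") != "ContentTranslation":
            continue
        if event.get("action") != "create-translated-page":
            continue
        # Assumption: "timestamp" is Unix seconds, as in the sample record.
        when = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        iso = when.isocalendar()
        counts[(iso[0], iso[1], event.get("targetLanguage"))] += 1
    return counts


# Hypothetical records modelled on the sample event from this thread.
sample = [
    {"schema": "ContentTranslation", "timestamp": 1415651367,
     "event": {"action": "create-translated-page",
               "targetLanguage": "ca", "contentLanguage": "es"}},
    {"schema": "ContentTranslation", "timestamp": 1415651367 + 86400,
     "event": {"action": "create-translated-page",
               "targetLanguage": "ca", "contentLanguage": "es"}},
]

for (year, week, lang), n in sorted(weekly_counts(sample).items()):
    print(year, week, lang, n)
```

Grouping by ISO week keeps the output independent of any particular charting tool; the rows could be written out as CSV or JSON for whatever visualization is chosen.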
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote: On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in the beta environment database to test something else, so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions. Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
3. Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL. Can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote: [+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote: Joel:
I see. I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
> Please let me know if there is any way I can help out or if there is anything you need from our end.
When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta (you will need to create an account and sign in).
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine-translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (this triggers the EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging URL (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
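For anyone who wants to inspect the request found in the network pane without reading the percent-encoding by hand, here is a minimal sketch that decodes the URL-encoded JSON carried in the query string of an event.gif request. The function name decode_beacon and the example record are hypothetical, and the trailing ";" on the query string is an assumption about how the client terminates the payload; the code simply strips it if present.

```python
import json
from urllib.parse import quote, unquote, urlsplit


def decode_beacon(url):
    """Return the JSON event record carried in an event.gif query string."""
    query = urlsplit(url).query
    # Assumption: the client may append a trailing ';' after the payload.
    payload = unquote(query).rstrip(";")
    return json.loads(payload)


# Hypothetical record, using field names from the sample posted in this thread.
record = {
    "schema": "ContentTranslation",
    "revision": 7146627,
    "event": {"action": "create-translated-page", "targetLanguage": "ca"},
}
example_url = (
    "http://bits.beta.wmflabs.org/event.gif?" + quote(json.dumps(record)) + ";"
)

decoded = decode_beacon(example_url)
print(decoded["schema"], decoded["event"]["action"])
```

Copying the request URL from the network pane and passing it to decode_beacon should show whether the schema name, revision and event fields are what the code intended to send.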
Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it, and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote: Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Hello,
>
> Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
>
> Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
>
> So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
>
> Thanks,
>
> Nuria
>
> On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif, which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
>
> On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
> Niklas,
>
> Can you answer this question from Nuria?
>
> jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
>
> Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
>
> Thanks,
>
> Joel
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Joel >>>>>> >>>>>> Joel Sahleen, Software Engineer >>>>>> Language Engineering >>>>>> Wikimedia Foundation >>>>>> jsahleen@wikimedia.org >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Localisation-team mailing list >>>>>> Localisation-team@lists.wikimedia.org >>>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Pau Giner >>>>>> Interaction Designer >>>>>> Wikimedia Foundation >>>>>> _______________________________________________ >>>>>> Localisation-team mailing list >>>>>> Localisation-team@lists.wikimedia.org >>>>>> https://lists.wikimedia.org/mailman/listinfo/localisation-team >>>>> >>>>> >>>> >>> >>> >>> >> > > > _______________________________________________ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > > > _______________________________________________ > Localisation-team mailing list > Localisation-team@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/localisation-team
Who should do this? Me? Amir?
Well, someone from your team, I guess.
On Mon, Nov 17, 2014 at 2:17 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Nuria wrote:
I find that the idea of "beta" as a stable environment where users can try stuff is not in sync with the fact that beta-labs is a release and QA testing environment for every single team.
Agreed. Whatever we would like “beta” to be, it is clearly not a stable environment and therefore not appropriate for true beta testing. However, it is also all we have at the moment, as you point out. Ideally, there would be dev, testing, beta and production environments along with a set process for moving code from one environment to another, but since that is not the case all we can do is get by with what we have. The important thing to me is to make sure the Product team is aware that event logging cannot be used in “beta,” so they can reset any expectations they may have regarding pre-release metrics collected in that way.
Pau wrote:
In any case, as said before, this is worth checking with product.
Who should do this? Me? Amir? Seems like something to bring up at the meeting with Howie and Erik later this week, but it might be worth giving Howie a heads up beforehand.
Pau
On Mon, Nov 17, 2014 at 12:17 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel,
Please look at the wiki page for beta features: "The primary purpose of Beta Features is to allow for Wikimedia designers and engineers (from the Wikimedia Foundation and community alike) to roll out technical improvements in an environment where large numbers of users can test, give feedback, and use these features in real-world settings."
http://www.mediawiki.org/wiki/Beta_Features
This is normally what product refers to as "beta". You can, of course, confirm.
The Beta cluster's purpose is software testing (not quite the same thing): http://www.mediawiki.org/wiki/Beta_cluster
Thanks,
Nuria
On Mon, Nov 17, 2014 at 12:03 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
I think there is a confusion between the not so well named beta environment (testing environment in labs, which is what our thread refers to) and being a beta-feature. *If you are a beta-feature you are IN production and you can get data.*
Thanks for the clarification, Nuria.
The issue, as I understand it, is that Product is asking for metrics on Content Translation usage “in beta” so they can make a “data-driven” decision about deployment to production. If what Product means by “in beta” is “as a beta feature” then we really have no problem. We’ll just have to wait until after we deploy as a beta feature in January to start collecting data and doing visualizations.
My understanding, however, is that what Product is asking for is metrics on Content Translation usage in “the beta environment” where a group of beta-testers has been using the extension for several months now. If the event logging data in "the beta environment" is not stable and this environment is really a software testing environment instead of a beta testing environment, then we can’t really fulfill Product's request; at least not by using event logging.
It looks like we need clarification from Product regarding what they mean by “beta,” and if that turns out to be “the beta environment” then we will have to work something out.
Thanks,
Joel
On Mon, Nov 17, 2014 at 10:41 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
+2
If the beta environment isn’t supposed to be used for beta testing, it shouldn’t be called beta.
I’m all for grabbing the data and doing our own visualizations, but there is no guarantee that any data we grab will be accurate since the data in the beta db may be blown up at any time.
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 17, 2014, at 11:26 AM, Pau Giner pginer@wikimedia.org wrote:
Dashboards can only be created on production data,
From product we are constantly encouraged to be data-driven ("measure twice, implement once"). When I read that we need to be in production to get metrics, it feels like a circular dependency: Product wants us to have numbers that justify the move to production, but Analytics tells us that we need to be in production to get those numbers. I think it is worth opening a conversation in the Product list to clarify their expectations.
My experience with the Multimedia team was that having the ability to visualise metrics and check how those were affected by changes in the product has been really useful, and we only wish we could have had such metrics available from day one. In addition, Content Translation is being used in beta for real work by some users, and we are already missing information on how they are doing so. So any idea on how we can get and make sense of some of this information (apart from manual collection) would be appreciated (maybe we could get the data in a form that a quick d3-based tool such as http://code.shutterstock.com/rickshaw/ could use?).
Thanks
Pau
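As an illustration of the kind of data preparation Pau is asking about, the sketch below groups event records into per-week, per-language counts that a charting tool could plot. It is only a sketch under stated assumptions: the record layout (`schema`, `timestamp`, `event.action`, `event.targetLanguage`) is copied from the sample event Joel pasted earlier in the thread, while the function names and the output shape are hypothetical and not part of EventLogging or Limn.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(records):
    """Count published translations per (ISO week, target language)."""
    counts = Counter()
    for record in records:
        # Skip records from other schemas or other actions.
        if record.get("schema") != "ContentTranslation":
            continue
        event = record.get("event", {})
        if event.get("action") != "create-translated-page":
            continue
        # Timestamps in the sample output are Unix seconds.
        day = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
        iso_year, iso_week, _ = day.isocalendar()
        counts[(f"{iso_year}-W{iso_week:02d}", event["targetLanguage"])] += 1
    return counts


def to_series(counts):
    """Reshape the counts into one list of {week, count} points per language.

    The output shape is a hypothetical choice, not a format required by any tool.
    """
    series = {}
    for (week, language), count in sorted(counts.items()):
        series.setdefault(language, []).append({"week": week, "count": count})
    return series


# Made-up records that follow the shape of the sample event in this thread.
sample_records = [
    {"schema": "ContentTranslation", "timestamp": 1415651367,
     "event": {"action": "create-translated-page", "targetLanguage": "ca"}},
    {"schema": "ContentTranslation", "timestamp": 1415651367 + 86400,
     "event": {"action": "create-translated-page", "targetLanguage": "ca"}},
    {"schema": "ContentTranslation", "timestamp": 1415651367 + 7 * 86400,
     "event": {"action": "create-translated-page", "targetLanguage": "es"}},
    {"schema": "ContentTranslation", "timestamp": 1415651367,
     "event": {"action": "some-other-action", "targetLanguage": "ca"}},
]

sample_counts = weekly_counts(sample_records)
print(json.dumps(to_series(sample_counts), indent=2))
```

Counting by ISO week keeps the buckets aligned to Monday-to-Sunday weeks regardless of when the first event arrives, which matches the "articles created per week by language" metric discussed in the thread.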
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production.
Dashboards can only be created on production data, Joel. We might blow up data in beta environment database to test something else so there is no guaranteed availability there. It is purely a testing environment.
Makes sense I suppose, but if the data in beta is unstable there doesn’t seem much point in doing any of this there, beyond confirming that we are sending valid events, which has already been done. I guess we will just have to wait until we go to production to set things up. It would be nice if we had a real beta environment we could use for beta testing, but that’s a larger issue.
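For readers who want to repeat the "are we sending valid events" check by hand, the sketch below decodes the URL-encoded JSON that the thread says the browser sends, and lists any expected fields that are absent. Several things here are assumptions: the set of expected fields is taken from the sample event pasted in this thread rather than from the ContentTranslation schema page, the payload layout and trailing `;` in the sample URL are guesses about the request format, and `decode_beacon` and `missing_fields` are hypothetical helper names, not EventLogging APIs.

```python
import json
from urllib.parse import quote, unquote, urlsplit

# Assumption: field names copied from the sample event in this thread.
# The authoritative list is the ContentTranslation schema itself.
EXPECTED_EVENT_FIELDS = {
    "action", "targetLanguage", "contentLanguage", "token", "version",
}


def decode_beacon(url):
    """Return the JSON object carried in the query string of an event URL."""
    query = urlsplit(url).query
    # Assumption: the payload may carry a trailing ';'; drop it if present.
    return json.loads(unquote(query).rstrip(";"))


def missing_fields(payload):
    """List expected event fields that the payload does not contain."""
    return sorted(EXPECTED_EVENT_FIELDS - set(payload.get("event", {})))


# A made-up payload shaped like the sample event, encoded the way a request
# copied from the browser's network pane might look.
sample_payload = {
    "schema": "ContentTranslation",
    "revision": 7146627,
    "event": {
        "action": "create-translated-page",
        "targetLanguage": "ca",
        "contentLanguage": "es",
        "token": "Tester",
        "version": 1,
    },
}
sample_url = (
    "http://bits.beta.wmflabs.org/event.gif?"
    + quote(json.dumps(sample_payload), safe="")
    + ";"
)

decoded = decode_beacon(sample_url)
print(decoded["event"]["action"], missing_fields(decoded))
```

This only checks that the fields are present. It does not replace validation against the schema, which is what the EventLogging server reports.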
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta and Nuria has created a bug (https://bugzilla.wikimedia.org/show_bug.cgi?id=73388) in bugzilla to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions (https://wikitech.wikimedia.org/wiki/Analytics/Dashboards). Timing is dependent on 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to be able to properly test whether the fix fixes this issue we need to know what it is.
There is a bug logged for the situation of beta and EL, can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see, I was hoping to set aside the beta issues but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
>> Please let me know if there is any way I can help out or if there is anything you need from our end.
> When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
> To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the url-encoded section or the user agent information as we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The http request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page (http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi) in beta (you will need to create an account and sign in).
2. Open chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (triggers EL event).
5. Look at the network pane in chrome dev tools and find the entry with the event logging url (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
> Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
> On our end we will work to troubleshoot the beta EL infrastructure, I am not familiar with it and neither is anyone on our team but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif (http://bits.beta.wmflabs.org/event.gif?foo=bar) which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Niklas,
Can you answer this question from Nuria?
> jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
Also would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
- We still need to verify that events sent from Content Translation are
being collected in beta.
I have added instructions on how to do this to the EL beta labs bug. As far as we know, the environment should be working now.
https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Mon, Nov 17, 2014 at 8:09 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi all,
I wanted to check in on this and confirm where things are at. As far as I understand, the outstanding issues for beta are:
1. We still need to verify that events sent from Content Translation are being collected in beta. The analytics team is looking into the issues in beta, and Nuria has created a bug in Bugzilla (https://bugzilla.wikimedia.org/show_bug.cgi?id=73388) to track any related work.
2. Sometime after Dan gets back from vacation, he and Joel will need to work together to set up a basic dashboard based on Dan's instructions (https://wikitech.wikimedia.org/wiki/Analytics/Dashboards). Timing depends on item 1. @Dan, let me know what works best for you and how I can best help.
Since event logging in beta and production appear to be separate, I was wondering if it would be possible to set up separate dashboards for beta and production. That would be very useful for us because it would allow us to track the usage of languages we release to beta and then use that data to prioritize the languages we release to production.
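As a rough illustration of the per-language tracking described above, the following Python sketch counts article-creation events per ISO week and target language. It assumes event records shaped like the ContentTranslation sample event shared on this thread (`schema`, `timestamp` in Unix seconds, and an `event` object with `action` and `targetLanguage`). The function name `weekly_counts` and the `records` list are hypothetical, and reading the records out of the beta or production EL database is not covered here.

```python
from collections import Counter
from datetime import datetime, timezone


def weekly_counts(records):
    """Count create-translated-page events per (ISO year, ISO week, target language)."""
    counts = Counter()
    for rec in records:
        # Skip records that belong to other schemas.
        if rec.get("schema") != "ContentTranslation":
            continue
        event = rec.get("event", {})
        # Only article creations count toward this metric.
        if event.get("action") != "create-translated-page":
            continue
        # Assumption: "timestamp" is seconds since the Unix epoch, in UTC.
        day = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        year, week, _weekday = day.isocalendar()
        counts[(year, week, event["targetLanguage"])] += 1
    return counts


# Hypothetical records modeled on the sample event from this thread.
base = {
    "schema": "ContentTranslation",
    "timestamp": 1415651367,
    "event": {"action": "create-translated-page",
              "contentLanguage": "es", "targetLanguage": "ca"},
}
records = [
    base,
    dict(base, timestamp=base["timestamp"] + 86400),      # one day later
    dict(base, timestamp=base["timestamp"] + 7 * 86400),  # one week later
]

for key, n in sorted(weekly_counts(records).items()):
    print(key, n)
```

Running the same function once over beta records and once over production records would give the two separate series that separate dashboards would plot.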
Thanks,
Joel
On Nov 14, 2014, at 11:05 AM, Nuria Ruiz nuria@wikimedia.org wrote:
> Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
In order to properly test whether the fix resolves this issue, we need to know what the issue is.
There is a bug logged for the beta EL situation; can we please link any commits to this bug? https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
Also, the setup of the varnish environment is one thing, and the setup of the eventlogging machine, which has not received new code for quite a while, is another, so I think we have more than one problem here.
On Thu, Nov 13, 2014 at 4:48 PM, Leila Zia leila@wikimedia.org wrote:
[+ Ori]
Joel, Ori looked into this now. There was a problem with EL in labs which affected logging events from Beta. Ori has fixed the issue, and the fix is waiting approval from ops. Let's touch-base tomorrow to see if we see events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Joel:
I see. I was hoping to set aside the beta issues, but if you are not deploying to prod any time soon I guess we will need to troubleshoot there. By the looks of it, EL has not worked in beta since August, but, as I said before, I know very little about how beta is put together.
I have filed a bug regarding the beta issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=73388
On Thu, Nov 13, 2014 at 12:52 PM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
>> Please let me know if there is any way I can help out or if there is anything you need from our end.
> When you have deployed your newest code to production, let's check whether events appear on the production stream. Let us know when deployment is done and you think your code should be logging.
Our code is not scheduled to be released to production until January. Getting the metrics is partly to help us ensure and promote that release. We will keep you informed as our plans progress, but hopefully we can figure out what the issue is in beta soon.
> To confirm: You have seen proper logging from your events in vagrant, right?
The output I am seeing with vagrant is what I pasted to this thread earlier. It does not contain the URL-encoded section or the user agent information that we discussed before. I think that is an issue with my dev environment, however, and not a problem with the code. The same code appears to be sending valid events in beta. The HTTP request I sent to your email earlier is what we are seeing there. It seems to include all the information you said it should include.
If you want to debug what is happening in beta yourself, an easy way I found to do that is:
1. Go to our Content Translation translation view page in beta: http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Han+Feizi&from=es&to=ca&targettitle=Han+Feizi (you will need to create an account and sign in).
2. Open Chrome dev tools.
3. Click the add translation links that appear in the middle column to add a few machine-translated paragraphs to the editor.
4. Click on the publish button in the header to publish the translation to your user namespace (this triggers the EL event).
5. Look at the network pane in Chrome dev tools and find the entry with the event logging URL (it should be near the bottom).
6. Click on the entry to see all the request and response information.
You probably already know all this, but I thought I would pass it along just in case it helps.
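For anyone who wants to read the event out of the request URL found in the network pane, here is a minimal Python sketch. It assumes the query string of the `event.gif` request is the URL-encoded JSON event, optionally followed by a trailing `;`; that layout is an assumption, not something confirmed on this thread. The function name `decode_beacon` is hypothetical, and the example event simply reuses field values from the sample ContentTranslation event.

```python
import json
from urllib.parse import quote, unquote, urlsplit


def decode_beacon(url):
    """Return the JSON event carried in the query string of an event.gif URL."""
    query = urlsplit(url).query
    # Assumption: the query string is percent-encoded JSON, possibly
    # terminated by a literal ';'.
    payload = unquote(query.rstrip(";"))
    return json.loads(payload)


# Build an example URL from a hypothetical event so the sketch is self-contained.
event = {
    "schema": "ContentTranslation",
    "revision": 7146627,
    "event": {
        "action": "create-translated-page",
        "contentLanguage": "es",
        "targetLanguage": "ca",
        "token": "Tester",
        "version": 1,
    },
}
url = "http://bits.beta.wmflabs.org/event.gif?" + quote(json.dumps(event)) + ";"

capsule = decode_beacon(url)
print(capsule["schema"], capsule["event"]["action"], capsule["event"]["targetLanguage"])
```

Pasting the URL copied from the network pane in place of `url` should show whether the schema name, revision, and event fields are what we expect before we look at the server side.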
> Did you set up a sampling rate, or is the code logging 1 to 1?
No sample rate. Just logging 1 to 1.
> On our end we will work to troubleshoot the beta EL infrastructure. I am not familiar with it and neither is anyone on our team, but we will ask around.
Yeah, Dan said you all kind of inherited EL so that’s totally understandable. We appreciate you looking into this for us. Let us know how else we can help.
Joel
On Thu, Nov 13, 2014 at 8:45 AM, Joel Sahleen jsahleen@wikimedia.org wrote:
Hi Nuria,
Thank you so much for your help on this. Please let me know if there is any way I can help out or if there is anything you need from our end.
Joel
Joel Sahleen, Software Engineer Language Engineering Wikimedia Foundation jsahleen@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nuria@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi, and beta does have a varnish instance, so the flow of EL events "should" be the same as in production.
Now I looked on deployment-eventlogging02, which is the EL machine for labs, and the last events I see there are from Aug 22.
So no events have come in as of late, which could point to an issue with the setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nuria@wikimedia.org wrote:
To keep the archives happy: the beta setup posts events to http://bits.beta.wmflabs.org/event.gif (http://bits.beta.wmflabs.org/event.gif?foo=bar), which, while it does not look to be varnish, has some kind of listener that posts those events to the beta event logging database.
On Wed, Nov 12, 2014 at 9:37 AM, Joel Sahleen <jsahleen@wikimedia.org > wrote:
> Niklas,
>
> Can you answer this question from Nuria?
>
> jsahleen: does beta have its own varnish instance? where are you posting your events in beta? can you send the url?
>
> Also, would it be possible to document the steps you used when testing EL on beta so that others can reproduce them?
>
> Thanks,
>
> Joel