Re: [Multimedia] [Analytics] Using EventLogging for funnel analysis

15 May 2014

      ...
...
[gergo] send the event log as a synchronous request from an unload event handler.
[gilles] That works really well, I've done it before for autosaving features. Obviously this only works if sampling users is enough (as opposed to measuring every single one), since it doesn't work on all browsers.
Please avoid logging synchronously: it will make the UI slower for
all users who are part of the logging sample. And not just a tad
slower; potentially it could be much slower. For some of our users a
network roundtrip is over 500 ms at the 50th percentile, so you could
block the UI for a long time.
We had a similar discussion with the Growth team regarding synchronous
logging. You can see (a lot of) details here:
https://bugzilla.wikimedia.org/show_bug.cgi?id=52287
We decided to switch to a localStorage-based solution. In your case I
think UserTimings and sessionStorage could get you the data
you need. Support for storage is broad:
http://caniuse.com/#feat=namevalue-storage
Support for user timings is less so, but you get Chrome and IE, and
that is a big percentage of the user base:
http://caniuse.com/#feat=user-timing
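A minimal sketch of the storage-based approach, in case it helps. The event shape (`FunnelEvent`), the storage key and the `send` callback are all hypothetical; none of them are EventLogging APIs. Events are appended to a queue in a store that has the Web Storage `getItem`/`setItem`/`removeItem` methods (in a browser, `window.sessionStorage`) and are handed to an asynchronous sender later, so nothing blocks the UI:

```typescript
// Minimal subset of the Web Storage API; window.sessionStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical event shape, not an actual EventLogging schema.
interface FunnelEvent {
  step: number;
  action: string;
  timestamp: number;
}

const QUEUE_KEY = "uw-funnel-queue"; // hypothetical storage key

// Reads the queued events; a missing or corrupted entry yields an empty queue.
function readQueue(store: KeyValueStore): FunnelEvent[] {
  const raw = store.getItem(QUEUE_KEY);
  if (raw === null) {
    return [];
  }
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch (err) {
    return [];
  }
}

// Appends one event to the queue without any network traffic.
function enqueue(store: KeyValueStore, event: FunnelEvent): void {
  const queue = readQueue(store);
  queue.push(event);
  store.setItem(QUEUE_KEY, JSON.stringify(queue));
}

// Hands every queued event to `send`, clears the queue, returns the count.
function flush(store: KeyValueStore, send: (event: FunnelEvent) => void): number {
  const queue = readQueue(store);
  store.removeItem(QUEUE_KEY);
  queue.forEach((event) => send(event));
  return queue.length;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.items.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
  removeItem(key: string): void {
    this.items.delete(key);
  }
}
```

In a page, `enqueue(window.sessionStorage, ...)` would be called on each transition and `flush` from a timer or on the next page view.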
...
[gergo] store the event in cookies/localStorage, log it on the next page load. This works in all browsers but it is less reliable
I do not think so. To clear up some concerns:
...
Probably runs into all sorts of complications with multiple tabs.
This should not be a concern, as the Page Visibility API tells you
whether the tab is actually visible.
You can restrict user timings logging and event logging reporting
according to visibility so they only happen when the user is interacting
with the page.
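A sketch of gating reports on visibility. `document.visibilityState` is part of the Page Visibility API; the `VisibilitySource` interface and the `reportIfVisible` helper are illustrative names, introduced only so the logic can be exercised without a browser:

```typescript
// The part of the Page Visibility API this sketch relies on.
// In a browser, the global `document` object satisfies this interface.
interface VisibilitySource {
  visibilityState: string; // "visible" or "hidden"
}

// Runs `report` only while the page is visible; returns whether it ran.
function reportIfVisible(source: VisibilitySource, report: () => void): boolean {
  if (source.visibilityState !== "visible") {
    return false; // background tab: skip this round
  }
  report();
  return true;
}
```

In a page this would be called as `reportIfVisible(document, ...)`, for example from a timer, so that a tab sitting in the background does not report anything.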
...
what if the user comes back after a month?
If you use sessionStorage, the events disappear when the user closes
the tab or the browser.
...
store the event in cookies/localStorage, log it on the next page load
Actually you can store the 'transition' in sessionStorage and use
regular polling to report it. You do not necessarily need to report
the transition from the next page. That being said, you are right that
the "last" step might be under-reported, as the user might leave the page
before it is reported. Now, we can analyze the data keeping this in mind.
We can even 'estimate' how much we are under-reporting the last step the
user did.
On Wed, May 14, 2014 at 9:02 AM, Gilles Dubuc gilles@wikimedia.org wrote:
...
...

send the event log as a synchronous request from an unload event
handler.
That works really well, I've done it before for autosaving features.
Obviously this only works if sampling users is enough (as opposed to
measuring every single one), since it doesn't work on all browsers.
...
set a random identifier (which only lives until the page is unloaded), and
add it to every event
That sounds perfectly fine. Ops can add indexes to the EventLogging tables
for us, and SQL queries grouping by that column should pose no challenge. That
sounds like the simplest and most universal option.
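To illustrate what grouping by that column buys us, here is an in-memory sketch of the same idea in TypeScript rather than SQL. The row shape (`LoggedEvent`), the field names and both helpers are hypothetical and not an existing EventLogging schema; the events are assumed to arrive in chronological order, so the last authorship choice seen for a token wins:

```typescript
// Hypothetical row shape, not an actual EventLogging schema.
interface LoggedEvent {
  token: string; // random per-page-load identifier
  step: number; // wizard step the event was logged on
  authorship?: "own" | "other"; // present when the user picks an author option
}

interface SessionSummary {
  furthestStep: number;
  authorship?: "own" | "other"; // last choice seen for this token
}

// In-memory equivalent of grouping the rows by the token column.
function summarize(events: LoggedEvent[]): Map<string, SessionSummary> {
  const sessions = new Map<string, SessionSummary>();
  events.forEach((event) => {
    const current: SessionSummary = sessions.get(event.token) ?? { furthestStep: 0 };
    current.furthestStep = Math.max(current.furthestStep, event.step);
    if (event.authorship !== undefined) {
      current.authorship = event.authorship;
    }
    sessions.set(event.token, current);
  });
  return sessions;
}

// Share of the sessions that ended on `choice` which got past `step`.
function passRate(
  sessions: Map<string, SessionSummary>,
  choice: "own" | "other",
  step: number
): number {
  let chose = 0;
  let passed = 0;
  sessions.forEach((session) => {
    if (session.authorship !== choice) {
      return;
    }
    chose += 1;
    if (session.furthestStep > step) {
      passed += 1;
    }
  });
  return chose === 0 ? 0 : passed / chose;
}
```

Because the summary keeps only the final choice per token, a user toggling the radio button several times is still counted once.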
On Wed, May 14, 2014 at 1:54 AM, Gergo Tisza gtisza@wikimedia.org wrote:
...
Hi all,
the Multimedia team is preparing to collect data to better understand
usability problems with UploadWizard. UW has a "checkout" structure (step 1:
put files in basket, step 2: choose license, step 3: add description, step
4: you are done), so a funnel analysis to identify which step causes the
most users to abort the upload process and why seems like a good approach.
I'm trying to understand how well the existing EventLogging infrastructure
supports this.
The problem is how to get information about the actions of users who fell
out of the funnel. I'll try to illustrate with an example: in one of the
steps, the user can choose between "I am uploading my own work" and "I am
uploading someone else's work" and the resulting interaction will be quite
different. We would like to know whether that choice has a big effect on the
likelihood of the user making it to the next step.
Using EventLogging, I can count the number of users who make it to that
step. I can count the number of users making it to the next step. I can
count the number of users choosing this or that author option. These numbers
do not tell us much on their own, though; the interesting information would
be how they are correlated.
Another thing I could do is create a schema which includes both the
choice of author option and the step number; when the user chooses "own
work", we log an ownwork event, when they click "next step", we log a
step(step=3, work=own) event. We can then calculate the number of users who
did choose "own work" but did not make it to the next step as the difference
of the two. But this won't work: "own work" is a radio button, and the user
can select and deselect it any number of times before proceeding to the next
step (or leaving the page).
So what we are trying to log are not really events but application states
that describe users who are successful vs. unsuccessful in the given step.
I thought of two ways of dealing with this; any feedback on the
plausibility of these or possible alternatives would be highly appreciated.
One would be to have a "step X succeeded" and a "step X failed" event (the
schema for which could include all sorts of state, such as which authorship
option was selected). This would require the ability to log an event when
the user leaves the page. I see two ways to do that:

* send the event log as a synchronous request from an unload event
  handler. This is not supported on ancient browsers; also, there is probably
  some mechanism in most browsers to kill an unload event handler if it takes
  too long.

* store the event in cookies/localStorage, log it on the next page load.
  This works in all browsers but it is less reliable (what if the user never
  comes back?) and logs the event for a different page load from where it
  actually occurred (what if the user comes back after a month?), and probably
  runs into all sorts of complications with multiple tabs.
The other way could be to log event chains: set a random identifier (which
only lives until the page is unloaded), and add it to every event. Event
groups can then be merged into meta-events by SQL magic, although that looks
like it will be extremely painful to do. On the other hand, this is much
more generic than the previous method, and could be used to answer more
complex questions.
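A rough sketch of the client side of this, assuming a caller-supplied `send` function; `randomToken` and `createChainLogger` are hypothetical helper names, not part of EventLogging. The token is kept in a closure, so it only lives as long as the page does:

```typescript
// Generates a random hex token. Math.random is not cryptographically strong;
// it is assumed to be good enough for grouping the events of one page load.
function randomToken(length: number = 16): string {
  let token = "";
  for (let i = 0; i < length; i++) {
    token += Math.floor(Math.random() * 16).toString(16);
  }
  return token;
}

// Returns a logging function that stamps every event with the same token.
// `send` stands in for whatever actually transmits the event.
function createChainLogger<T extends object>(
  send: (event: T & { token: string }) => void
): (event: T) => void {
  const token = randomToken();
  return (event) => send({ ...event, token });
}
```

Creating the logger once per page load and routing every event through it gives all events of that page load the same token, which is the column the later grouping would use.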
What do you think? Which would be the method I am not shooting myself in
the foot with? Currently I am leaning towards using unload handlers.

Multimedia mailing list
Multimedia@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/multimedia

Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
