Hi Gergo,

The number of users who drop off at each stage will be really useful. Would it be possible to collect the information in such a way that we could also check how long each step takes? That way we could get an idea of how much time a user spends on average on each step and in total, even when they complete the process successfully.
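
A rough sketch of the kind of calculation I have in mind, assuming every logged event carried a session token, a step number and a timestamp in seconds. These field names and the sample data are made up for illustration, not an existing schema:

```python
from collections import defaultdict

# Hypothetical events: (session token, step reached, timestamp in seconds).
events = [
    ("a", 1, 0), ("a", 2, 30), ("a", 3, 90), ("a", 4, 100),
    ("b", 1, 0), ("b", 2, 50),
]

def average_step_durations(events):
    """Average seconds spent on each step, over sessions that left that step."""
    by_session = defaultdict(dict)
    for token, step, ts in events:
        steps = by_session[token]
        # Keep the earliest time at which the session reached each step.
        steps[step] = min(ts, steps.get(step, ts))
    durations = defaultdict(list)
    for steps in by_session.values():
        for step, start in steps.items():
            # A step only has a duration if the following step was reached.
            if step + 1 in steps:
                durations[step].append(steps[step + 1] - start)
    return {step: sum(d) / len(d) for step, d in sorted(durations.items())}

averages = average_step_durations(events)
```

This only measures steps that were completed. Timing the step a user abandoned would additionally need an event logged when they leave the page.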

Pau


On Wed, May 14, 2014 at 9:02 AM, Gilles Dubuc <gilles@wikimedia.org> wrote:
- send the event log as a synchronous request from an unload event handler.

That works really well, I've done it before for autosaving features. Obviously this only works if sampling users is enough (as opposed to measuring every single one), since it doesn't work on all browsers.


set a random identifier (which only lives until the page is unloaded), and add it to every event

That sounds perfectly fine. Ops can add indexes to the EventLogging tables for us, so SQL queries grouping by that column should pose no challenge. That sounds like the simplest and most universal option.
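
To illustrate the kind of grouping I mean, here is a minimal sketch using an in-memory SQLite database. The table name, column names and data are hypothetical; the real EventLogging tables are laid out differently:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per logged step event, tagged with the
# random per-page-load identifier.
conn.execute("CREATE TABLE uw_events (token TEXT, step INTEGER)")
conn.execute("CREATE INDEX uw_events_token ON uw_events (token)")
conn.executemany(
    "INSERT INTO uw_events VALUES (?, ?)",
    [("a", 1), ("a", 2), ("a", 3), ("a", 4),
     ("b", 1), ("b", 2),
     ("c", 1)],
)

# Inner query: furthest step reached per page load.
# Outer query: number of page loads that stopped at each step.
rows = conn.execute(
    """
    SELECT last_step, COUNT(*) AS sessions
    FROM (SELECT token, MAX(step) AS last_step
          FROM uw_events
          GROUP BY token) AS per_session
    GROUP BY last_step
    ORDER BY last_step
    """
).fetchall()
```

With this sample data, one page load stops at step 1, one at step 2, and one reaches step 4.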


On Wed, May 14, 2014 at 1:54 AM, Gergo Tisza <gtisza@wikimedia.org> wrote:
Hi all,

the Multimedia team is preparing to collect data to better understand usability problems with UploadWizard. UW has a "checkout" structure (step 1: put files in basket, step 2: choose license, step 3: add description, step 4: you are done), so a funnel analysis to identify which step causes the most users to abort the upload process and why seems like a good approach. I'm trying to understand how well the existing EventLogging infrastructure supports this.

The problem is how to get information about the actions of users who fell out of the funnel. I'll try to illustrate with an example: in one of the steps, the user can choose between "I am uploading my own work" and "I am uploading someone else's work", and the resulting interaction will be quite different. We would like to know whether that choice has a big effect on the likelihood of the user making it to the next step.

Using EventLogging, I can count the number of users who make it until that step. I can count the number of users making it to the next step. I can count the number of users choosing this or that author option. These numbers do not tell us much on their own, though; the interesting information would be how they are correlated.

Another thing I could do is create a schema which includes both the choice of author option and the step number; when the user chooses "own work", we log an ownwork event, and when they click "next step", we log a step(step=3, work=own) event. We could then calculate the number of users who chose "own work" but did not make it to the next step as the difference of the two. But this won't work: "own work" is a radio button, so the user can select and deselect it any number of times before proceeding to the next step (or leaving the page).

So what we are trying to log are not really events but application states that describe users who are successful vs. unsuccessful in the given step.

I thought of two ways of dealing with this; any feedback on the plausibility of these or possible alternatives would be highly appreciated.

One would be to have a "step X succeeded" and a "step X failed" event (the schema for which could include all sorts of state, such as which authorship option was selected). This would require the ability to log an event when the user leaves the page. I see two ways to do that:
- send the event log as a synchronous request from an unload event handler. This is not supported on ancient browsers; also, there is probably some mechanism in most browsers to kill an unload event handler if it takes long.
- store the event in cookies/localStorage, log it on the next page load. This works in all browsers but it is less reliable (what if the user never comes back?) and logs the event for a different page load from where it actually occurred (what if the user comes back after a month?), and probably runs into all sorts of complications with multiple tabs.

The other way could be to log event chains: set a random identifier (which only lives until the page is unloaded), and add it to every event. Event groups can then be merged into meta-events by SQL magic, although that looks like it will be extremely painful to do. On the other hand, this is much more generic than the previous method, and could be used to answer more complex questions.
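
To make the merging idea concrete, here is a minimal sketch done in application code rather than SQL. The event names, fields and sample data are all hypothetical; the point is only that folding a chain keeps the final state of the radio button, which is what the plain event counts above cannot give us:

```python
from collections import Counter

# Hypothetical event stream in logging order: (token, event name, value).
events = [
    ("a", "author", "own"), ("a", "author", "other"), ("a", "next", 3),
    ("b", "author", "own"),
    ("c", "author", "other"), ("c", "author", "own"), ("c", "next", 3),
]

def merge_chains(events):
    """Fold each token's events into one meta-event describing the final state."""
    meta = {}
    for token, name, value in events:
        state = meta.setdefault(token, {"author": None, "advanced": False})
        if name == "author":
            # Later selections overwrite earlier ones.
            state["author"] = value
        elif name == "next":
            state["advanced"] = True
    return meta

meta_events = merge_chains(events)
# Count page loads by (final author choice, made it to the next step).
outcomes = Counter((m["author"], m["advanced"]) for m in meta_events.values())
```

Here token "a" ends up counted as "other" even though "own" was selected first, and token "b" shows up as an "own work" user who never advanced.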

What do you think? Which would be the method I am not shooting myself in the foot with? Currently I am leaning towards using unload handlers.

_______________________________________________
Multimedia mailing list
Multimedia@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/multimedia




--
Pau Giner
Interaction Designer
Wikimedia Foundation