On Wed, May 14, 2014 at 5:58 AM, Aaron Halfaker <ahalfaker@wikimedia.org> wrote:
Hey guys,
Here's how I'd do it.
*Assumption:* Only logged-in users can start the UW funnel
*Schemas:*
UploadWizardStep
Stored when the user loads a new step of the Upload Wizard
- user_id : int -- The user's identifier
- flow_initialized : str -- The timestamp at which the current flow
through the funnel began (will need to be stored in a cookie and reset at loads of step 1)
- step : int -- 1 - 4 of the UW process
UploadWizardRightsSelection
Stored when the user selects a "rights" option.
- user_id : int -- The user's identifier
- flow_initialized : str -- The timestamp at which the current flow
through the funnel began (will need to be stored in a cookie and reset at loads of step 1)
- rights_selected : enum("own", "other") -- The rights that a user
selected (note that multiple selection actions can take place for a single flow)
I'd make a pass over the DB to identify the last RightsSelection for each flow_initialized value (if any) and figure out what an uploading user settled on during a particular flow. I'd also look at how many selections a user makes per flow for evidence of confusion and indecisiveness, or maybe just exploration of the UI.
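For illustration, the aggregation pass described above could look roughly like the following. This is only a sketch: it assumes the UploadWizardRightsSelection rows have already been pulled out of the DB as dicts, and the `logged_at` field (the time each event was recorded) and the function name `summarize_flows` are hypothetical names, not part of the proposed schema.

```python
from collections import defaultdict


def summarize_flows(events):
    """Map each flow to (final rights selection, number of selections).

    A flow is identified by (user_id, flow_initialized), as in the
    proposed schemas. `logged_at` is an assumed per-event timestamp.
    """
    by_flow = defaultdict(list)
    for event in events:
        key = (event["user_id"], event["flow_initialized"])
        by_flow[key].append(event)

    summary = {}
    for key, flow_events in by_flow.items():
        # Order the selections within the flow so the last one is the
        # option the user settled on.
        flow_events.sort(key=lambda e: e["logged_at"])
        summary[key] = (flow_events[-1]["rights_selected"], len(flow_events))
    return summary


# Made-up example rows: user 1 changes their mind once, user 2 does not.
events = [
    {"user_id": 1, "flow_initialized": "20140514055800",
     "rights_selected": "own", "logged_at": 10},
    {"user_id": 2, "flow_initialized": "20140514060000",
     "rights_selected": "own", "logged_at": 12},
    {"user_id": 1, "flow_initialized": "20140514055800",
     "rights_selected": "other", "logged_at": 25},
]
summary = summarize_flows(events)
```

A selection count above 1 for a flow is the signal of confusion, indecisiveness, or exploration mentioned above; flows with no RightsSelection events simply do not appear in the summary.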
Thanks Aaron, I will try something along these lines. This avoids the latency concerns mentioned by Nuria, and it is very flexible - we'll see how painful it is to aggregate the data on the backend.
(will need to be stored in a cookie and reset at loads of step 1)
We don't even need this part since UploadWizard is a single-page application with no page load between the steps, so we can just store the token in memory. I don't want to log user IDs unless we really need them, so I'll just go with the initial timestamp plus a random number. I don't think connecting separate upload attempts by the same user is particularly useful at this point.
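A rough sketch of the token idea, written in Python purely for illustration (the real code would live in UploadWizard's client-side JavaScript). The names `new_flow_token`, `UploadFlow`, and `flow_token` are hypothetical, and the exact token format is an assumption:

```python
import secrets
import time


def new_flow_token(now=None):
    """Build a flow token from the start timestamp plus a random number."""
    started = int(time.time() if now is None else now)
    return f"{started}-{secrets.randbelow(2**32)}"


class UploadFlow:
    """Keeps the token in memory for the lifetime of one pass through the funnel."""

    def __init__(self, now=None):
        self.token = new_flow_token(now)

    def restart(self, now=None):
        # Called when the user goes back to step 1 and begins a new flow.
        self.token = new_flow_token(now)

    def step_event(self, step):
        # Event payload without any user identifier.
        return {"flow_token": self.token, "step": step}


flow = UploadFlow(now=1400047080)
first = flow.step_event(1)
second = flow.step_event(2)
```

Both events carry the same `flow_token`, so they can be grouped on the backend without logging who the user is.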