Mobile-l June 2015

mobile-l@lists.wikimedia.org

39 participants
58 discussions

Fwd: [Analytics] Pageview API Status update

by Adam Baso

Cross posting. See follow up discussion on analytics list web archive.

---------- Forwarded message ----------
From: *Dan Andreescu* <dandreescu(a)wikimedia.org>
Date: Friday, June 5, 2015
Subject: [Analytics] Pageview API Status update
To: Analytics List <analytics(a)lists.wikimedia.org>

I just posted a comment on the famous task: https://phabricator.wikimedia.org/T44259#1341010 :)

Here it is for those who would rather discuss on this list:

We have finished analyzing the intermediate hourly aggregate with all the columns that we think are interesting. The data is too large to query and anonymize in real time. We'd rather get an API out faster than deal with that problem, so we decided to produce smaller "cubes" [1] of data for specific purposes. We have two cubes in mind and I'll explain those here. For each cube, we're aiming to have:

* Direct access to a postgresql database in labs with the data
* API access through RESTBase
* Mondrian / Saiku access in labs for dimensional analysis
* Data will be pre-aggregated so that any single data point has k-anonymity (we have not determined a good k yet)
* Higher level aggregations will be pre-computed so they use all data

And, the cubes are:

**stats.grok.se Cube: basic pageview data**

Hourly resolution. Will serve the same purpose as stats.grok.se has served for so many years. The dimensions available will be:

* project - 'Project name from requests host name'
* dialect - 'Dialect from requests path (not set if present in project name)'
* page_title - 'Page Title from requests path and query'
* access_method - 'Method used to access the pages, can be desktop, mobile web, or mobile app'
* is_zero - 'accessed through a zero provider'
* agent_type - 'Agent accessing the pages, can be spider or user'
* referer_class - 'Can be internal, external or unknown'

**Geo Cube: geo-coded pageview data**

Daily resolution. Will allow researchers to track the flu, breaking news, etc. Dimensions will be:

* project - 'Project name from requests hostname'
* page_title - 'Page Title from requests path and query'
* country_code - 'Country ISO code of the accessing agents (computed using MaxMind GeoIP database)'
* province - 'State / Province of the accessing agents (computed using MaxMind GeoIP database)'
* city - 'Metro area of the accessing agents (computed using MaxMind GeoIP database)'

So, if anyone wants another cube, **now** is the time to speak up. We'll probably add cubes later, but it may be a while.

[1] OLAP cubes: https://en.wikipedia.org/wiki/OLAP_cube
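The two pre-aggregation goals in the email (k-anonymity for single data points, and higher-level aggregations that use all data) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the threshold `K`, the row layout, and the function names `build_cube` and `country_totals` are hypothetical, and simple small-cell suppression is only one possible way to approach k-anonymity. The email does not say how the cubes will actually be built, and it states that no k has been chosen yet.

```python
from collections import Counter

K = 3  # hypothetical threshold; the email says a good k is not yet determined


def build_cube(rows, k=K):
    """Count requests per (project, page_title, country_code) cell and
    keep only the cells that were seen at least k times."""
    counts = Counter(rows)
    return {cell: n for cell, n in counts.items() if n >= k}


def country_totals(rows):
    """Higher-level aggregate computed from all rows, before suppression."""
    return Counter(country for _project, _title, country in rows)


# Made-up sample rows, not real pageview data.
rows = [
    ("en.wikipedia", "Influenza", "US"),
    ("en.wikipedia", "Influenza", "US"),
    ("en.wikipedia", "Influenza", "US"),
    ("en.wikipedia", "Influenza", "IS"),
]

cube = build_cube(rows)        # the single view from "IS" is suppressed
totals = country_totals(rows)  # the rollup still counts every row
print(cube)
print(dict(totals))
```

Publishing exact higher-level totals next to suppressed cells can reveal small counts by subtraction; this sketch does not attempt to address that.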

8 years, 10 months

Gather update

by Jon Katz

Hi People Interested in Gather,

Given the reorg and the traffic being driven to beta, we need to revisit Gather's Q4 goals:

TLDR: Continue working on Gather to finish MVP features (end of the month, max), not pushing to stable unless we see 2x the number of logged-in edits on beta (or >10x current state).

Before the serious stuff: Top 5 best collections since Monday:
https://en.m.wikipedia.org/wiki/Special:Gather/by/Johnrenfro
https://en.m.wikipedia.org/wiki/Special:Gather/id/3454/Philadelphia_watch
https://en.m.wikipedia.org/wiki/Special:Gather/id/3401/engineering
https://en.m.wikipedia.org/wiki/Special:Gather/id/3532/Mental_Health
https://en.m.wikipedia.org/wiki/Special:Gather/id/3132/Libertarianismo

Okay, now the goals. Please feel free to comment in email or in this Google doc <https://docs.google.com/a/wikimedia.org/document/d/1DK1pa3PEpIbiON0Bj6BrhGA…>

Thanks,
Jon

Given the reorg and an improvement made to beta, we need to revisit Gather's goals

Pushing to Stable

Given that we can now test Gather numbers in beta, the original goal of launching on stable to test adoption is no longer valid. It is expensive in terms of future maintenance and commitments to launch features to stable, so we only want to do so after we have proven success, if possible.

Target numbers

Originally, the benchmarks for success on stable that were agreed on were low (10k creators a month on stable, and 1k shares). Given the current beta numbers, it looks like we will blow the first number out of the water. Share has been deprioritized given current usage patterns.

However, given that we now have 4 engineers in charge of the entire web, the standards for what we work on have to be more rigorous. We cannot allocate multiple engineers to an experimental product unless it shows promise of impacting greater numbers.

1. Goal
   1. New Goal:
      1. Round out Gather hypothesis (criteria below)
      2. By end of June know whether or not we want to push Gather to stable
   1. Next Eng+PM Steps (to round out hypothesis)
      1. Improve onboarding (a few tasks)
      2. Surface collections publicly (this is a big missing feature)
      3. Qualitative and quantitative research
   2. Criteria for passing to stable: We don’t have a great way to measure success of Gather based on usage by a proportion of users or logged in users. However, we can compare to something similar like edits. In terms of pure value to WP, we can consider a collection to be like a low-value edit.
      - There are 2x as many ‘good’ collections made as there are total logged-in edits/month
        - In May there were 2,180 edits by logged in users (2,694 logged out).
        - At current rates this suggests ~4,500 collections per month is our target.
        - ‘good’ here means >1 collection--it’s not a perfect definition, but it’s a strong proxy.
        - Our current rate of ‘good’ collections is roughly 250 a month, so we will need to increase the number by almost 20x.
        - It might be worth exploring % of those 2,180 edits that are reverted and adjusting down accordingly
      OR
      - Views of collections or where collection is the referrer > .5% of total pageviews
        - Currently, the views of collections are minimal.
        - If very few people create collections, but they drive a significant boost in page views, say .5% of total PVs (not an end goal, but also not bad for an MVP introducing a new use case), then the feature is a success.
   1. What if we don’t pass to stable: Let's burn that bridge when we get to it :) . Seriously though, until we get some qualitative data back from our readers, we will not be able to make important calls on the feature as is. There are a few great alternatives I can think of right off the bat:
      - Keep code in beta and work on Gather opportunistically or as qualitative data dictates
        - One example might be to make collections private by default and launch as bookmarker for readers (primary current use-case)
      - Promote as beta feature on desktop
      - Use codebase as start of multiple watchlists (some good work started here by JRobson)
        - Codebase is fairly generic list table that has some basic and interesting features built in that could be used for a number of other ends
   1. Validation and why we chose these criteria:

Qualitative questions to answer:
- Why aren't more people using Gather (correctly)?
- Why aren't people returning to use Gather more?

Success metric questions to answer (that we can’t already):
- What % of logged in users use Gather?
- What % of users who visit > 1 page use Gather?
- What is our denominator?

Status:
- Measure logins and signup funnel directed from Gather, what is % success rate? This is not instrumented
- Measure % of sessions with more than 1 pageview: Working on this
- Measure % of sessions with logged in users: Cannot get this

What is our baseline? Logged in edits is probably the best thing to measure against.
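The arithmetic behind the first stable-launch criterion can be checked with a short sketch. The two input figures come from the email; the variable names are made up for illustration, and the email rounds the results ("~4,500" and "almost 20x").

```python
# Figures quoted in the email.
logged_in_edits_may = 2180       # edits by logged-in users in May
good_collections_per_month = 250 # "roughly 250 a month"

# Criterion: 2x as many 'good' collections as logged-in edits per month.
target_collections = 2 * logged_in_edits_may

# How much the current rate would have to grow to reach that target.
required_multiple = target_collections / good_collections_per_month

print(target_collections)           # 4360, which the email rounds to ~4,500
print(round(required_multiple, 2))  # 17.44, which the email calls "almost 20x"
```

With the rounded target of 4,500 the multiple would be 18x, so either way the required growth is between 17x and 18x the current rate.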

8 years, 10 months

mobile wikipedia with service worker

by Derk-Jan Hartman

Jake Archibald did some playing around using Wikipedia mobile as an example:

https://www.youtube.com/watch?v=d5_6yHixpsQ
https://github.com/jakearchibald/offline-wikipedia

DJ

8 years, 10 months

Fwd: [Wikitech-l] API BREAKING CHANGE: Default continuation mode for action=query will change at the end of this month

by Adam Baso

Cross posting for visibility. This kind of stuff gets posted to the API lists, so be sure to subscribe to those if you aren't already subscribed to them.

---------- Forwarded message ----------
From: Brad Jorsch (Anomie) <bjorsch(a)wikimedia.org>
Date: Tue, Jun 2, 2015 at 1:42 PM
Subject: [Wikitech-l] API BREAKING CHANGE: Default continuation mode for action=query will change at the end of this month
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>, mediawiki-api-announce(a)lists.wikimedia.org

As has been announced several times (most recently at https://lists.wikimedia.org/pipermail/wikitech-l/2015-April/081559.html), the default continuation mode for action=query requests to api.php will be changing to be easier for new coders to use correctly.

*The date is now set:* we intend to merge the change to ride the deployment train at the end of June. That should be 1.26wmf12, to be deployed to test wikis on June 30, non-Wikipedias on July 1, and Wikipedias on July 2.

If your bot or script is receiving the warning about this upcoming change (as seen here <https://www.mediawiki.org/w/api.php?action=query&list=allpages>, for example), it's time to fix your code!

- The simple solution is to include the "rawcontinue" parameter with your request to continue receiving the raw continuation data (example <https://www.mediawiki.org/w/api.php?action=query&list=allpages&rawcontinue=1>). No other code changes should be necessary.
- Or you could update your code to use the simplified continuation documented at https://www.mediawiki.org/wiki/API:Query#Continuing_queries (example <https://www.mediawiki.org/w/api.php?action=query&list=allpages&continue=>), which is much easier for clients to implement correctly.

Either of the above solutions may be tested immediately; you'll know it works because you stop seeing the warning.

I've compiled a list of bots that have hit the deprecation warning more than 10000 times over the course of the week May 23–29. If you are responsible for any of these bots, please fix them. If you know who is, please make sure they've seen this notification. Thanks.

AAlertBot AboHeidiBot AbshirBot Acebot Ameenbot ArnauBot Beau.bot Begemot-Bot BeneBot* BeriBot BOT-Superzerocool CalakBot CamelBot CandalBot CategorizationBot CatWatchBot ClueBot_III ClueBot_NG CobainBot CorenSearchBot Cyberbot_I Cyberbot_II DanmicholoBot DeltaQuadBot Dexbot Dibot EdinBot ElphiBot ErfgoedBot Faebot Fatemibot FawikiPatroller HAL HasteurBot HerculeBot Hexabot HRoestBot IluvatarBot Invadibot Irclogbot Irfan-bot Jimmy-abot JYBot Krdbot Legobot Lowercase_sigmabot_III MahdiBot MalarzBOT MastiBot Merge_bot NaggoBot NasirkhanBot NirvanaBot Obaid-bot PatruBOT PBot Phe-bot Rezabot RMCD_bot Shuaib-bot SineBot SteinsplitterBot SvickBOT TaxonBot Theo's_Little_Bot W2Bot WLE-SpainBot Xqbot YaCBot ZedlikBot ZkBot

--
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
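The second option in the announcement, the simplified continuation, amounts to a small client loop: send an empty `continue` parameter on the first request, then copy every key of the `continue` object from each response into the next request until no `continue` object comes back. The sketch below shows that loop under stated assumptions: `query_all` is a hypothetical helper name, and `fake_fetch` is a made-up stand-in that returns canned responses so the example runs offline. A real client would replace it with an HTTP GET against api.php.

```python
def query_all(fetch, params):
    """Yield the 'query' part of every response batch, following the
    simplified continuation until the API stops returning 'continue'."""
    request = dict(params, action="query", format="json")
    last_continue = {"continue": ""}  # empty value opts in to the new mode
    while True:
        # Merge the continuation values from the previous response.
        result = fetch({**request, **last_continue})
        if "error" in result:
            raise RuntimeError(result["error"])
        if "query" in result:
            yield result["query"]
        if "continue" not in result:
            break
        last_continue = result["continue"]


def fake_fetch(params):
    """Stand-in for an HTTP call to api.php; serves two canned batches."""
    batches = {"": (["A", "B"], "C"), "C": (["C", "D"], None)}
    titles, next_start = batches[params.get("apcontinue", "")]
    result = {"query": {"allpages": [{"title": t} for t in titles]}}
    if next_start is not None:
        result["continue"] = {"apcontinue": next_start, "continue": "-||"}
    return result


titles = [
    page["title"]
    for batch in query_all(fake_fetch, {"list": "allpages"})
    for page in batch["allpages"]
]
print(titles)
```

The client never inspects the individual continuation keys; it only passes them back, which is why this mode is harder to get wrong than the raw one.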

8 years, 10 months

iOS 4.1.4 is live

by Corey Floyd

This version has a lot of visual updates, performance improvements, and bug fixes - we even tweaked the icon. Overall this is a pretty nice release that applies some much-needed polish in a lot of places. Please check it out when you get a chance and let us know what you think.

https://itunes.apple.com/us/app/wikipedia-mobile/id324715238?mt=8

--
Corey Floyd
Software Engineer
Mobile Apps / iOS
Wikimedia Foundation

8 years, 10 months

hacks I particularly liked at Lyon

by Brian Gerstle

Hey everyone,

Just wanted to send my notes on the hack demos from this weekend, and potentially start a discussion about shipping some of them!

*Apple Watch:* Corey & Jason did an incredible job on this, and I'm essentially sold that a watch companion app could augment our *read later* and *search* flows for what seems like a reasonable development cost.

Not sure who did it, but server-side image/upload validation could further facilitate cross-platform mobile uploads?

*Image surface content gap*: IOW, which pages are in the most need of pictures. Seems like a "micro edit" workflow that could work well for apps when combined with location/geo-fencing.

*Haikus from recent changes:* More of a technical inspiration: I'd love to discuss building a service that sends push notifications in response to RC stream events.

*Dmitry's "Wikipedia Lite" hack*: I think we should do some prototypes on this. In particular, I'd like to play around with parsoid to create a "reader view" for Wikipedia pages. Aside from streamlining content for mobile and improving performance, this could also make it easier to bring back a lot of reader-centric features. Dynamic font sizes & color/contrast configurations are two things I see pop up from time to time in OTRS & iTunes.

*Bernd's map view for Nearby:* This seems like a no-brainer. I sent a separate email to mobile-l, because it seems like some progress has been made in this area since I last looked into it.

What were your favorite hacks?

--
EN Wikipedia user page: https://en.wikipedia.org/wiki/User:Brian.gerstle
IRC: bgerstle

8 years, 10 months

Re: [WikimediaMobile] [FYI] Popular mobile phone in Ghana

by Adam Baso

Sharing to mobile-l

On Thursday, May 28, 2015, Rachel diCerbo <rdicerb(a)wikimedia.org> wrote:
> Not certain if you guys are aware of this:
>
> https://medium.com/product-notes/the-mystery-of-the-power-bank-phone-taking…

8 years, 10 months

Re: [WikimediaMobile] [Analytics] Share a Fact Initial Analysis

by Dario Taraborelli

Thanks for sharing this, Adam.

Aside from engagement/funnel data, the critical question for this feature is: does it bring back eyeballs to the site from social media? It looks like it doesn’t yet, at least not in a substantial way, even with the caveat that App traffic is a very small fraction of total mobile traffic.

I have looked into referrals for this feature before, and after comparing them to Twitter’s own engagement analytics (and finding some big discrepancies), I think you should consider removing spiders/crawlers from the data (see [1]) to avoid inflating pageviews with non-human activity.

I’m a big fan of this feature and look forward to seeing how you guys intend to scale it.

Dario

[1] https://github.com/ewulczyn/wmf/blob/b9f726ee3468852c3fed2780af1d8ac0004eda…

> On May 21, 2015, at 12:37 PM, Toby Negrin <tnegrin(a)wikimedia.org> wrote:
>
> Hi all - some interesting analysis on the share-a-fact feature from the mobile team.
>
> -Toby
>
> Begin forwarded message:
>
>> From: Adam Baso <abaso(a)wikimedia.org>
>> Date: May 21, 2015 at 12:05:29 PDT
>> To: mobile-l <mobile-l(a)lists.wikimedia.org>
>> Subject: [WikimediaMobile] Share a Fact Initial Analysis
>>
>> Hello all,
>>
>> We’ve been looking at some initial results from the Share a Fact feature introduced on the Wikipedia apps for Android and iOS in its basic "minimal viable product" implementation. Here’s some analysis, using data from one day (20150512) with respect to the latest stable versions of the apps (2.0-r-2015-04-23 on Android and 4.1.2 on iOS) for that day.
>>
>> * On iOS, when a user initiates the first step of the default sharing workflow - tapping the up-arrow box share button (6,194 non-highlighting instances for the day under question) - about 11.7% of the time it yielded successful sharing.
>>
>> * On Android, it’s not possible to easily tell when the sharing workflow was carried through to successful share, but we anticipate the Android success rate is currently much higher, as general engagement percentage up to the point of picking an app for sharing is higher on Android than on iOS.
>>
>> * On Android, when presented with the share card preview, 28.0% of the time the ‘Share as image’ button was tapped and 55.5% of the time the 'Share as text' button was tapped, whereas on iOS it was 8.4% ‘Share as image’ and 16.8% ‘Share as text’.
>>
>> * The forthcoming 4.1.4 version of the iOS app will relax its default sharing snippet generation rules and be more like the Android version in that respect. We anticipate this will result in higher engagement with both the ‘Share as image’ and ‘Share as text’ buttons on iOS, and we should be able to verify this once the 4.1.4 iOS version is released and generally adopted (usually takes 4-5 days after release; the 4.1.4 release isn’t released yet).
>>
>> * On the Android app the ‘Share’ option is located on the overflow menu, not as part of the main set of UI buttons. This potentially increases the likelihood of Android users being primed to step through the workflow. On the iOS app, the share button (up-arrow box) is plainly visible from the main UI and not an overflow menu, and this probably creates a different priming dynamic for the iOS demographic.
>>
>> * When users on iOS tapped on the ‘Share as image’ or ‘Share as text’ buttons, there is a pretty sharp drop off at the next stage - the system sharesheet. Once the sharesheet was presented to iOS users, 41.6% of the time it resulted in active abandonment. We believe this probably has something to do with the relatively small set of default apps listed on the sharesheet and the extra work involved with exposing additional social apps for sharing in that context. As with the Android app, the labels of ‘Share as image’ and ‘Share as text’ may also pose something of a hurdle at least for first time users of the feature. To this end, there is an onboarding tutorial planned at least on Android.
>>
>> * For a one hour period (2015051201) there were about 100 pageviews in some sense attributable to Share a Fact using a provenance parameter available on the latest stable versions of the apps at that time; this may slightly overstate the number of pageviews attributable to the two specific apps reviewed in this analysis, but probably not too much (n.b., previously a different source parameter was used than the new wprov provenance parameter). Pageviews are not the sole motivation for the feature, but following the trendline over the long run should be interesting. Impact on social media and the destinations of shares is a little harder to capture directly, but https://twitter.com/search?f=realtime&q=%40wikipedia%20-%40itzwikipedia%20f… gives one a sense about image shares, at least.
>>
>> * A couple potential options for increasing sharing include:
>>
>> ** Trying to add support for sharing to the Photos app on iOS. People may be interested in using images from the Photos apps for various workflows, as Dan Garry has noted.
>>
>> ** Offering a more concise app picklist, in particular explicitly adding the native OS app components (namely, Twitter and Facebook, and as mentioned, Photos if possible), with an option to expose the sharesheet for additional options if necessary. This is probably also somewhat confined to iOS, although conceivably a similar approach could be possible on Android. On Android the full list of applications in its equivalent of the sharesheet is by default readily available to the user, though.
>>
>> ** On Android, exposing the diagonal arrow share button on the main interface akin to how the iOS version of the app shows the up-arrow share button. This may introduce more opportunities for sharing (and thus numbers of abandons would go up in tandem with numbers of shares), but would also partially clutter the interface and probably increase abandon. A controlled experiment may be useful for observing the impact of such an approach.
>>
>> * As a point of reference, for the app versions in scope for this analysis over a single day, there appeared to be approximately 3.78 million Wikipedia for Android pageviews and 1.19 million Wikipedia Mobile for iOS app pageviews. There were about 6.73 million app pageviews on the “modern” versions of these apps total for this particular day, meaning there were about 1.75 million pageviews on other modern versions of the app.
>>
>> * Examination of the categories of successful shares on iOS showed the following distributions:
>>
>> Images:
>> 48.5% messaging
>> 25.5% sharesheet copy
>> 22.9% social
>> 1.8% productivity
>> 0.9% reading
>>
>> Text:
>> 53.6% messaging
>> 31.9% sharesheet copy
>> 7.1% social
>> 5.4% reading
>> 2.0% productivity
>>
>> Here were some queries used in the analysis:
>>
>> == SHARE A FACT ATTRIBUTABLE PAGEVIEWS FOR ONE HOUR ==
>>
>> select wprov, uri_host, count(*) from (select x_analytics_map['wprov'] as wprov, uri_host
>> from webrequest where year = 2015 and month = 5 and day = 12 and hour = 1 and is_pageview = true and uri_host like '%.wikipedia.org' and x_analytics_map['wprov'] is not null) t
>> group by wprov, uri_host;
>>
>> == PAGE VIEWS FOR THE DAY FOR THE “MODERN” VERSIONS OF THE APPS ==
>>
>> SELECT
>> user_agent, count(*)
>> FROM
>> wmf.webrequest
>> tablesample(BUCKET 1 OUT OF 100 ON rand())
>> WHERE
>> YEAR = 2015
>> AND MONTH = 5
>> AND DAY = 12
>> AND is_pageview = TRUE
>> AND lower(uri_host) like '%.wikipedia.org'
>> AND user_agent like 'WikipediaApp%'
>> GROUP BY user_agent;
>>
>> == HIGHLIGHTING SESSION CASE FOR SPECIFIC VERSIONS OF THE APPS ==
>>
>> select CASE
>> WHEN t2.userAgent LIKE 'WikipediaApp/2.0-r-2015-04-23%' THEN 'Android'
>> WHEN t2.userAgent LIKE 'WikipediaApp/4.1.2%' THEN 'iOS'
>> END AS 'ua', t1.event_action, t1.event_sharemode, t1.event_target,
>> count(*) from MobileWikiAppShareAFact_11331974 t1
>> inner join MobileWikiAppShareAFact_11331974 t2 on t1.event_shareSessionToken = t2.event_shareSessionToken
>> where t1.timestamp > '20150512' and t1.timestamp < '20150513' and t2.timestamp > '20150512' and t2.timestamp < '20150513'
>> and t1.event_action != 'highlight' and t2.event_action = 'highlight'
>> and (t2.userAgent like 'WikipediaApp/2.0-r-2015-04-23%' or t2.userAgent like 'WikipediaApp/4.1.2%')
>> group by ua, t1.event_action, t1.event_sharemode, t1.event_target;
>>
>> == NON-HIGHLIGHTING SESSION CASE FOR SPECIFIC VERSIONS OF THE APPS ==
>> n.b., subtract the highlighting cases from the non-highlighting cases to arrive at the default sharing behavior. Technically, inner joins can be used to do more comprehensive session analysis, but the queries take a long time.
>>
>> select CASE
>> WHEN userAgent LIKE 'WikipediaApp/2.0-r-2015-04-23%' THEN 'Android'
>> WHEN userAgent LIKE 'WikipediaApp/4.1.2%' THEN 'iOS'
>> END AS 'ua', event_action, event_sharemode, event_target,
>> count(*) from MobileWikiAppShareAFact_11331974
>> where timestamp > '20150512' and timestamp < '20150513'
>> and (userAgent like 'WikipediaApp/2.0-r-2015-04-23%' or userAgent like 'WikipediaApp/4.1.2%')
>> group by ua, event_action, event_sharemode, event_target;
>>
>> -Adam
>> _______________________________________________
>> Mobile-l mailing list
>> Mobile-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
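A few of the figures in the Share a Fact analysis above can be cross-checked with simple arithmetic. This is a minimal sketch, not part of the original analysis: the variable names are made up, the iOS pageview figure is read as 1.19 million, and the estimated share count is only what the quoted percentage implies.

```python
# iOS default sharing funnel, figures from the analysis.
ios_share_taps = 6194       # non-highlighting share button taps for the day
ios_success_rate = 0.117    # "about 11.7% of the time"
est_ios_successful_shares = round(ios_share_taps * ios_success_rate)

# Share of card previews that led to either button being tapped (percent).
android_button_taps_pct = round(28.0 + 55.5, 1)  # image + text on Android
ios_button_taps_pct = round(8.4 + 16.8, 1)       # image + text on iOS

# Daily app pageviews, in thousands, to keep the arithmetic exact.
total_modern_k = 6730
android_k = 3780
ios_k = 1190                # assumes the email's "1.19" means 1.19 million
other_modern_k = total_modern_k - android_k - ios_k

print(est_ios_successful_shares)  # implied successful iOS shares for the day
print(android_button_taps_pct, ios_button_taps_pct)
print(other_modern_k)             # thousands of pageviews on other versions
```

The remainder comes out at 1,760 thousand, in line with the "about 1.75 million" quoted for other modern versions. The second query samples 1 bucket out of 100, so its raw counts would presumably need to be multiplied by 100 to arrive at daily totals of this size.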

8 years, 10 months
