Hi everyone,
*tl;dr: We'll be stripping all content contained inside brackets from the
first sentence of articles in the Wikipedia app.*
The Mobile Apps Team is focused on making the app a beautiful and engaging
reader experience, and on supporting use cases like wanting to look
something up quickly to find out what it is. Unfortunately, several aspects
of Wikipedia at present are actively detrimental to that goal. One example
is the lead sentences.
As mentioned in the other thread on this matter
<https://lists.wikimedia.org/pipermail/mobile-l/2015-March/008715.html>,
lead sentences are poorly formatted and contain information that is
detrimental to quickly looking up a topic. The team did a quick audit
<https://docs.google.com/a/wikimedia.org/spreadsheets/d/1BJ7uDgzO8IJT0M3UM2q…>
of
the information available inside brackets in the first sentences;
typically it is pronunciation information, which is probably better placed
in the infobox than breaking up the first sentence. The other problem is
that this information was typically inserted and previewed on a platform
where space is not at a premium, and that calculation is different on
mobile devices.
In order to better serve the quick lookup use case, the team has reached
the decision to strip anything inside brackets in the first sentence of
articles in the Wikipedia app.
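To make the change concrete, the stripping amounts to removing balanced
parenthetical spans from the first sentence and collapsing the leftover
whitespace. A minimal sketch of the idea (not the app's actual code; the
example sentence is illustrative):

```python
def strip_first_sentence_brackets(sentence: str) -> str:
    """Drop (possibly nested) parenthesised spans, then collapse whitespace."""
    out, depth = [], 0
    for ch in sentence:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return " ".join("".join(out).split())

lead = "Caffeine (/kae'fi:n/; from French cafe) is a central nervous system stimulant."
print(strip_first_sentence_brackets(lead))
# -> Caffeine is a central nervous system stimulant.
```

A real implementation would also need to handle the other bracket styles
used across languages.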
Stripping content is not a decision to be made lightly. People took the
time to write it, and that should be respected. We realise this is
controversial. That said, it's the opinion of the team that the problem is
pretty clear: this content is not optimised for users quickly looking
things up on mobile devices at all, and will take a long time to solve
through alternative means. A quicker solution is required.
The screenshots below are mockups of the before and after of the change.
These are not final, I just put them together quickly to illustrate what
I'm talking about.
- Before: http://i.imgur.com/VwKerbv.jpg
- After: http://i.imgur.com/2A5PLmy.jpg
If you have any questions, let me know.
Thanks,
Dan
--
Dan Garry
Associate Product Manager, Mobile Apps
Wikimedia Foundation
Both mobile apps and web are using CirrusSearch's morelike: feature, which
is showing some performance issues on our end. We would like to make a
performance optimization to it, but first we would prefer to run an A/B
test to see if the results are still "about as good" as they are currently.
The optimization is basically this: currently, "more like this" takes the
entire article into account; we would like to change it to take only the
opening text of an article into account. This should reduce the amount of
work we have to do on the backend, saving both server load and the latency
the user sees when running the query.
This can be triggered by adding these two query parameters to the search
api request that is being performed:
cirrusMltUseFields=yes&cirrusMltFields=opening_text
The API will warn that these parameters do not exist, but the warning is
safe to ignore. Would any of you be willing to run this test? We would
basically want to look at user-perceived latency along with click-through
rates for the current default setup and for the restricted setup using
only opening_text.
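For clarity, here is a sketch of what a request with these parameters
could look like, built with Python's standard library (the endpoint and
article title are just examples; cirrusMltUseFields and cirrusMltFields
are the parameters quoted above):

```python
import urllib.parse

# Illustrative morelike search request, with the experimental parameters
# restricting the "more like this" comparison to the opening_text field.
params = {
    "action": "query",
    "list": "search",
    "srsearch": "morelike:Albert Einstein",  # example article
    "format": "json",
    "cirrusMltUseFields": "yes",
    "cirrusMltFields": "opening_text",
}
url = "https://en.wikipedia.org/w/api.php?" + urllib.parse.urlencode(params)
print(url)
```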
Erik B.
User:Casliber has started an editing contest on the English Wikipedia
focused on improving or adding lead (intro) sections:
https://en.wikipedia.org/wiki/Wikipedia:Take_the_lead! (open until
this Sunday, January 31)
Notably, the invitation cites mobile readers as a motivation:
"Many articles on the English Wikipedia have deficient or poorly
written leads that do not summarise the article or present the
information in an engaging manner. With increased mobile phone usage,
this is becoming more of an issue, because mobile interfaces often
show the lead alone, with other sections collapsed."
I added the subsequent sentence, to support this point with data from the
MobileWebSectionUsage analysis that JonR and I did some weeks ago. This
also led to some interesting community discussion in several places, and
someone added that chart to
https://en.wikipedia.org/wiki/Wikipedia:How_to_create_and_manage_a_good_lea…
. It's a great example of how we can support editors' work with readership
data beyond mere pageviews (I also submitted a Wikimania talk about this
topic earlier this month).
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
Hi,
We've added MediaWiki parser content analysis to the content analysis
report that the Reading Web team performed last quarter.
We also added the option to see the gzip (level 6) version of the report,
to get a look at more realistic numbers, since traffic is gzipped in
production (see the select box at the top).
http://chimeces.com/loot-content-analysis/
No surprises: the results are pretty similar to the RESTBase analysis, in
that navboxes are around 14% of the content and references are around 50%.
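The gzip comparison can be approximated locally: compress a component and
the whole page at level 6 (matching production) and compare sizes. A rough
sketch with stand-in HTML:

```python
import gzip

def gzipped_size(payload: bytes, level: int = 6) -> int:
    """Size of the payload after gzip level-6 compression, as served in prod."""
    return len(gzip.compress(payload, compresslevel=level))

# Stand-in for a chunk of parser output, e.g. a navbox; not real article HTML.
navbox = b"<table class='navbox'><tr><td><a href='/wiki/X'>X</a></td></tr></table>" * 50
page = b"<p>Lead section text.</p>" * 200 + navbox

share = gzipped_size(navbox) / gzipped_size(page)
print(f"navbox share of gzipped page: {share:.0%}")
```

Note that highly repetitive markup compresses very well, so gzipped
percentages can differ noticeably from raw byte counts.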
*Request*: If you know about useless HTML markup emitted by the MediaWiki
parser and would like to see what % of the content it accounts for, please
answer here or in the task with examples and we'll add it to the report
(like we did with RESTBase and the *extraneous markup*).
*Related phab task: https://phabricator.wikimedia.org/T123325
<https://phabricator.wikimedia.org/T123325>*
Thanks,
Joaquin
reply-all is hard...
---------- Forwarded message ----------
From: Erik Bernhardson <ebernhardson(a)wikimedia.org>
Date: Wed, Jan 20, 2016 at 12:14 PM
Subject: Re: [WikimediaMobile] Similar articles feature performance in
CirrusSearch for apps and mobile web
To: Joaquin Oltra Hernandez <jhernandez(a)wikimedia.org>
On Wed, Jan 20, 2016 at 7:45 AM, Joaquin Oltra Hernandez <
jhernandez(a)wikimedia.org> wrote:
> I'd be up to it if we manage to cram it up in a following sprint and it is
> worth it.
>
> We could run a controlled test against production with a long batch of
> articles and check median/percentiles response time with repeated runs and
> highlight the different results for human inspection regarding quality.
>
> I can work this up i think.
>
David and I have done some basic checks of a few dozen articles, and on
average latency was around half using only opening text. I'll work up
something a bit more complete across a few thousand articles. One
difficulty is that morelike performance changes depending on cluster load:
during the busy part of our day, morelike takes 50% longer than at the low
points[1].
> It's been noted previously that the results are far from ideal (which
> they are because it is just *morelike*), and I think it would be a great
> idea to change the endpoint to a specific one that is smarter and has
> some cache (we could do much more to get relevant results besides text
> similarity, take into account links, or *see also* links if there are,
> etc...).
>
We've talked about a dedicated endpoint internally but haven't gotten
anywhere on it. I can fairly easily put together a Cirrus-specific api
endpoint; it's been hung up on deciding whether we should instead be
putting the api into core and building up some sort of abstraction around
it. Putting it into core would probably make more sense if we are doing
more than the basic morelike query.
>
> As a note, in mobile web the related articles extension allows editors to
> specify articles to show in the section, which would avoid queries to
> cirrussearch if it was more used (once rolled into stable I guess).
>
> I remember that the performance related task was closed as resolved (
> https://phabricator.wikimedia.org/T121254#1907192), should we reopen it
> or create a new one?
>
I'll create a new one; some performance concerns were addressed there, and
we did see a reduction in server work (average fetch latency cut in half).
Morelike still accounts for around 20% of server load, even though it is
only in the 700 qps range (vs 4k for fulltext and 8k for prefix; note
these are after fanning out to shards, not the number sent to mediawiki).
> I'm not sure if we ended up adding the smaxage parameter (I think we
> didn't
> <https://github.com/wikimedia/mediawiki-extensions-RelatedArticles/search?ut…>),
> should we? To me it seems a no-brainer that we should be caching these
> results in varnish since they don't need to be completely up to date for
> this use case.
>
>
I've been unsure about using smaxage on the search api, due to
fragmentation between how different clients use the api. After further
investigation I've perhaps been worried for no reason.
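For reference, per-request Varnish caching is requested through the
standard smaxage/maxage API parameters; a sketch of building such a
request URL (the article title and cache lifetime are illustrative):

```python
import urllib.parse

# Hypothetical morelike request with cache-control parameters; smaxage and
# maxage are the standard MediaWiki API cache-control knobs.
params = {
    "action": "query",
    "list": "search",
    "srsearch": "morelike:Ada Lovelace",  # example article
    "format": "json",
    "smaxage": 86400,  # allow Varnish to cache the response for a day
    "maxage": 86400,
}
url = "https://en.wikipedia.org/w/api.php?" + urllib.parse.urlencode(params)
print(url)
```

Identical URLs are a precondition for cache hits, which is why the URI
variance measured below matters.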
A relatively naive query in Hive suggests that in the span of 24h we could
cut morelike queries to the backend from 7.3M to 1.7M:

select sum(total), sum(deduplicated) from (
    select count(1) as total,
           count(distinct requests[0].query) as deduplicated
    from wmf_raw.cirrussearchrequestset
    where year=2016 and month=1 and day=10
      and requests[0].querytype = 'more_like'
    group by wikiid
) x;

_c0        _c1
7331659    1726091
The next query tries to get a rough estimate of how that compares to the
variance in the way URIs are sent. I'm not sure how good an approximation
this is, but the totals are similar enough that it might be a good guess:

select sum(total), sum(deduplicated) from (
    select count(1) as total,
           count(distinct uri_query) as deduplicated
    from wmf.webrequest
    where year=2016 and month=1 and day=10
      and uri_query LIKE '%search=morelike%'
    group by uri_host
) x;

_c0        _c1
7383599    2214332
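As a quick sanity check, the cache-hit reduction implied by those totals
can be computed directly:

```python
# Totals from the two Hive queries above: (total, deduplicated).
backend = (7331659, 1726091)   # wmf_raw.cirrussearchrequestset
frontend = (7383599, 2214332)  # wmf.webrequest, by uri_query

for name, (total, dedup) in (("backend", backend), ("frontend", frontend)):
    saved = 1 - dedup / total
    print(f"{name}: {saved:.0%} of morelike requests could be served from cache")
```

Both views land in the same ballpark (roughly 70-76% of requests are
repeats), which is what makes caching attractive here.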
In summary: to resolve the current load issues we are seeing, I will
figure out how to get these results cached. I've created
https://phabricator.wikimedia.org/T124216 for Discovery to figure that
out.
Changing to opening_text would still provide a large benefit for latency
on non-cached pages. It may also help with relevancy, but that is hard to
guesstimate. I still think measuring click-through rates could inform the
relevancy decision without too much programming overhead (depending on how
much work it is to add an A/B test; I've been told it's fairly painless in
the apps?)
[1]
https://grafana.wikimedia.org/dashboard/db/elasticsearch?panelId=28&fullscr…
FYI. Note I've captured this as something to think about in
https://phabricator.wikimedia.org/T123349. We're fully booked in Q3, but
this sort of thing seems worth examining for collaboration opportunities
on our future-facing roadmap.
-Adam
---------- Forwarded message ----------
From: Lucie Kaffee <lucie.kaffee(a)wikimedia.de>
Date: Wed, Jan 20, 2016 at 5:57 AM
Subject: [Wikitech-ambassadors] Looking for small Wikipedias to test a new
feature for them
To: wikitech-ambassadors(a)lists.wikimedia.org
As part of my Bachelor’s thesis, I have spent the last few months working
on an extension called “ArticlePlaceholder”:
https://www.mediawiki.org/wiki/Extension:ArticlePlaceholder
One of the biggest barriers to accessing the knowledge Wikipedia provides
is language.
There are many topics that are covered only in a few big Wikipedias.
People who don’t speak any of these languages don’t have access to
information that is potentially vital to them.
The ArticlePlaceholder extension aims to support smaller Wikipedias in
increasing access to the data available on Wikidata. Article placeholders
are automatically generated content pages in Wikipedia or other MediaWiki
projects that display data from Wikidata. They are clearly not actual
articles, but an overview of the data on a topic that does not have an
article yet. The design of the page and its content is under the control
of the local community via Lua and templates, but we will provide defaults
so smaller Wikipedias can work with them without having to worry about the
technical side of it.
I have a test setup on Labs with an example for Ada Lovelace
http://articleplaceholder.wmflabs.org/mediawiki/index.php/Special:AboutTopi…
The reader can find these pages by searching for a topic, and gets results
if there is an item on Wikidata with the respective label and/or alias.
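Roughly, the lookup matches the search query against item labels and
aliases. A toy sketch of that behaviour (hypothetical data shapes; the
real extension queries Wikidata's term store, not a Python list):

```python
# Hypothetical in-memory representation of a few Wikidata items.
items = [
    {"id": "Q7259", "label": "Ada Lovelace", "aliases": ["Ada King"]},
    {"id": "Q42",   "label": "Douglas Adams", "aliases": []},
]

def placeholder_matches(query, items):
    """Return items whose label or alias equals the query, case-insensitively."""
    q = query.casefold()
    return [
        item for item in items
        if q == item["label"].casefold()
        or any(q == a.casefold() for a in item["aliases"])
    ]

print(placeholder_matches("ada king", items))  # matches Q7259 via its alias
```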
The reader would benefit a lot: even if there is no article on a topic
yet, they will still have basic information provided in their language. It
also might increase the number of editors, due to the increased usefulness
of that Wikipedia.
We are now looking for the first Wikipedias to support the extension by
deploying it and giving their input. I am still developing the extension
and the first Wikipedias to try it will naturally have a larger say in how
it evolves.
If your Wikipedia would like to give it a try please let me know. We would
start it as a beta feature.
Thank you,
Lucie (Frimelle)
--
Lucie-Aimée Kaffee
Working Student Software Development
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
http://wikimedia.de
Imagine a world in which every single human being can freely share in
the sum of all knowledge.
That‘s our commitment.
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B.
Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin,
Steuernummer 27/029/42207.
_______________________________________________
Wikitech-ambassadors mailing list
Wikitech-ambassadors(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-ambassadors
Hi all,
We've just released an updated version of the Wikipedia Android app[1][2],
rolling out as we speak to the Google Play store! This is mostly a
maintenance update, focusing on improved memory usage for lower-memory
devices. Additional enhancements include:
* Introduced a toolbar under the lead image for quick access to share an
article or save it for offline reading.
* Moved "similar pages" and "page issues" to the overflow menu (when
applicable).
* Improved support for the Norwegian language in the app.
* HTML tags are now stripped from edit summaries.
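The edit-summary change boils down to tag stripping; a simplistic sketch
of the idea (the app's actual implementation is in Java and may differ):

```python
import re

def strip_html_tags(summary: str) -> str:
    """Remove HTML tags from an edit summary, keeping only the text content."""
    return re.sub(r"<[^>]+>", "", summary)

print(strip_html_tags('Linked <a href="/wiki/Foo">Foo</a> in the <i>lead</i>'))
# -> Linked Foo in the lead
```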
Cheers,
--
Dmitry Brant
Mobile Apps Team (Android)
Wikimedia Foundation
https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
[1] https://play.google.com/store/apps/details?id=org.wikipedia&hl=en
[2]
https://releases.wikimedia.org/mobile/android/wikipedia/stable/wikipedia-2.…
Hi there,
I posted some basic funnel analysis and referer analysis for the Related
Articles / Read More functionality:
https://www.mediawiki.org/wiki/Reading/Web/Projects/Related_pages#Metrics_a…
Short version: it seems to have relatively higher engagement in the mobile
web beta channel, but relatively lower engagement in the desktop web beta
feature.
Analysis and discussion of the feature are ongoing, but I wanted to share
this data anyway.
-Adam