TL;DR: Getting started with RESTBase is now easier thanks to a new SQLite
back-end
Hello,
The WMF Services team [0] is proud to announce the initial release of the
SQLite storage back-end [1] for RESTBase [2], the service that powers the
Wikimedia projects' public REST API [3]. Until now, RESTBase was limited
to Cassandra for persistence, which, while scalable, is a heavyweight and
difficult dependency for smaller installations and test environments. The
new module allows RESTBase administrators to replace Cassandra with
lightweight, file-based persistence backed by SQLite.
This is an exciting milestone in RESTBase's development. We expect the
option to run RESTBase atop an SQLite database to significantly widen its
audience, allowing users with small installations and limited resources to
take advantage of this valuable service. We also hope to attract more
contributors by offering a simpler, less resource-intensive alternative to
Cassandra; SQLite is now the storage module of choice in
MediaWiki-Vagrant [6] installations with the restbase role enabled.
Contributing is just a "vagrant up" away!
The abstract storage interface [7] we developed for RESTBase allows users
to switch seamlessly between storage back-ends (Cassandra or SQLite).
Because of the data volumes we face at the WMF, we are not using the SQLite
back-end in production. However, we test every RESTBase changeset against
both Cassandra and SQLite to ensure that both behave as expected.
Additionally, 93% code coverage gives us the confidence to recommend the
module for production use.
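To illustrate the idea behind that interface (a conceptual sketch in
TypeScript with hypothetical names, not the actual restbase-mod-table-spec
API), every back-end implements the same small contract, so the code above
it never needs to know which engine is underneath:

    // Conceptual sketch only -- hypothetical names, not the real
    // restbase-mod-table-spec API.
    interface TableStorage {
      put(table: string, key: string, value: string): Promise<void>;
      get(table: string, key: string): Promise<string | undefined>;
    }

    // Stand-in for a back-end that would wrap the sqlite3 driver; a
    // CassandraBackend would implement the exact same interface.
    class SqliteBackend implements TableStorage {
      private rows = new Map<string, string>(); // placeholder for a .db file
      async put(table: string, key: string, value: string): Promise<void> {
        this.rows.set(`${table}/${key}`, value);
      }
      async get(table: string, key: string): Promise<string | undefined> {
        return this.rows.get(`${table}/${key}`);
      }
    }

    // Callers depend only on the interface, so switching engines becomes
    // a configuration change rather than a code change.
    async function demo(storage: TableStorage): Promise<void> {
      await storage.put("revisions", "Main_Page", "<html>...</html>");
      console.log(await storage.get("revisions", "Main_Page"));
    }

    demo(new SqliteBackend());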
Bringing the SQLite storage module to a usable state took a significant
amount of team effort, but we would nevertheless like to extend a special
thank you to Petr Pchelko for his tireless efforts in driving this project
home. Thanks, Petr!
This is only the first step in providing better third-party and developer
support. One of the next challenges will be basing Parsoid [8] on
service-runner [9,10], a general-purpose library for running and managing
Node.js services. Amongst other things, it enables bundling and running
multiple services together. Packaging and distributing Parsoid and RESTBase
together will simplify their installation, configuration and administration
in small-setup and development environments.
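As a rough illustration of such bundling (a TypeScript sketch whose key
names are drawn from service-runner's example configuration; treat the
exact schema and module identifiers as assumptions, not a tested setup):

    // Hypothetical sketch of one service-runner process tree hosting
    // both services; key names are assumptions, not a verified schema.
    const bundledConfig = {
      num_workers: 2, // worker processes shared by the bundled services
      services: [
        { name: "parsoid", module: "parsoid" },   // hypothetical module id
        { name: "restbase", module: "restbase" }, // hypothetical module id
      ],
    };
    console.log(`One process tree would manage ${bundledConfig.services.length} services.`);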
Best,
Marko Obrovac, PhD
Senior Services Engineer
Wikimedia Foundation
[0] https://www.mediawiki.org/wiki/Wikimedia_Services
[1] https://github.com/wikimedia/restbase-mod-table-sqlite
[2] https://www.mediawiki.org/wiki/RESTBase
[3] https://en.wikipedia.org/api/rest_v1/?doc (for a complete list of
supported domains go to https://rest.wikimedia.org/)
[4] https://phabricator.wikimedia.org/tag/restbase/
[5] https://phabricator.wikimedia.org/tag/restbase-api/
[6] https://www.mediawiki.org/wiki/MediaWiki-Vagrant
[7] https://github.com/wikimedia/restbase-mod-table-spec
[8] https://www.mediawiki.org/wiki/Parsoid
[9] https://github.com/wikimedia/service-runner
[10] https://phabricator.wikimedia.org/T90668
== Apologies for Crossposting ==
Below you will find useful information and links about the program, our
satellite events and registration opportunities for
SEMANTiCS 2015 -- 11th International Conference on Semantic Systems
15-17 September 2015, Vienna / Austria
http://www.semantics.cc/ // #semantics2015 // #semanticsconf
*--- Conference Scope ---*
The annual SEMANTiCS conference is the meeting place for professionals
who make semantic computing work, understand its benefits, and know its
limitations. Every year, SEMANTiCS attracts information managers, IT
architects, software engineers, and researchers from organisations
ranging from NPOs, universities, and public administrations to the largest
companies in the world.
*--- Conference Program ---*
The 2015 edition offers a rich program consisting of 5 keynotes, 24
scientific presentations, 30 industry talks, 38 posters, and various
workshops and social events. For details, please visit our program page:
http://www.semantics.cc/programme
*--- Keynote Speakers ---*
· Jeanne Holm -- Chief Knowledge Architect at NASA
· Peter Mika -- Director of the Semantic Lab at Yahoo
· Oscar Corcho -- Associate Professor of Artificial Intelligence,
Universidad Politécnica de Madrid
· Klaus Tochtermann -- Leibniz Information Centre for Economics
· Sam Rehman -- CTO at EPAM Systems
*--- Workshops & Satellite Events ---*
*MeetUp: SMART DATA SOLUTIONS*
An outlook into the world of data-centric business, technologies and
innovations: data is everywhere these days, and efficient data management
has become THE key success factor in nearly all industries. McKinsey lists
data as a key factor of production, alongside labor and capital, in one of
its recent reports. Furthermore, data is produced in huge amounts by
sensors, social networks and mobile devices, and the amount of data
available worldwide is growing exponentially…
Place: Haus der Ingenieure, Eschenbachgasse 9, 1010 Wien
Date: 15.09.2015; doors open 18:30 CEST, event 19:30 - 22:30 CEST
*2nd International Workshop on Geospatial Linked Data*
In recent years, Semantic Web technologies have strengthened their
position in the areas of data and knowledge management. Standards for
organizing and querying semantic information, such as RDF(S) and SPARQL,
have been adopted by large academic communities, while corporate vendors
adopt semantic technologies to organize, expose, exchange and retrieve
their datasets as Linked Data.
Chairs: Alejandra Garcia-Rojas M. (Ontos AG), Robert Isele, Rene
Pietzsch, Jens Lehmann (AKSW, University of Leipzig)
Date: 15th of September 2015, 09.00 to 13.00 CEST
*The SEMANTIC EXPERIENCE Coffee & Cocktail LOUNGE*
Sponsored Event: Enjoy semantic technology inspiration in a relaxed
atmosphere. Refresh yourself with Viennese coffee or a cocktail and get
in touch with semantic industry experts.
Date: 15th of September 2015, 15.45 to 18.00 CEST
Room: LC Club Room
*Linked Data in Industry 4.0*
The overall goal of the workshop is to identify challenges and
limitations faced by the manufacturing engineering industry in the scope
of Industry 4.0's design principles, and to bring them together with
experts and solution approaches from the Linked Data community.
Chairs: Thomas Moser (FH St. Pölten), Stefan Hupe (IoT Austria)
Date: 15th of September 2015, 14.00 to 17.00 CEST
*European Data Economy Workshop - Focus on Data Value Chain & Big and Open
Data*
This workshop gives an overview of the state of the art of Big and Open
Data initiatives in Europe, their impact on the European economy, and
their benefits for European society. Representatives from the Big Data
Value Association, the annual European Data Forum and data-related
projects will participate in the first session of the workshop. The
workshop also presents the Austrian Big Data Study carried out in 2014 by
AIT and IDC.
Chairs: Nelia Lasierra (STI Innsbruck), Martin Kaltenböck (Semantic Web
Company)
Date: 15th of September 2015, 09.00 to 13.00 CEST
*1st Workshop on Data Science: Methods, Technology and Applications
(DSci15)*
This workshop is meant as an opportunity to bring together researchers
and practitioners interested in data science to present their ideas and
discuss the most important scientific, technical and socio-economic
challenges of this emerging field.
Chairs: Bernhard Haslhofer (AIT - Austrian Institute of Technology),
Elena Simperl (Univ. Southampton), Rainer Stütz (AIT - Austrian
Institute of Technology), Ingo Feinerer (FH Wiener Neustadt)
Date: 15th of September 2015, 09.00 to 17.00 CEST
*Workshop on Linked Data Strategies - Commercialisation of Interlinked Data*
In this workshop, we will give several demos and concrete examples of
how Linked Data can be used by enterprises in various industries. The
workshop aims to put valuable methods and best practices into the hands
of users and providers of Linked Data, helping them make well-founded
decisions in their Linked Data projects.
Chairs: Christian Dirschl (Wolters Kluwer), Andreas Blumauer (Semantic
Web Company), Tassilo Pellegrini (FH St. Pölten)
Date: 15th of September 2015, 14.00 to 15.30 CEST
*Hackathon on "The power of Linked Data in Agriculture and Food Safety"*
“Data + Need = Hack” -- that is the idea of a hackathon that brings
together like-minded people to develop, in a short time frame, novel
solutions to problems around the theme “Agriculture and Food Safety”.
Chairs: Christian Blaschke (Semantic Web Company, Vienna), Stasinos
Konstantopoulos (Institute of Informatics & Telecommunications of the
NCSR Demokritos, Athens)
Date: 18th of September 2015, 10.00 to 16.00 CEST
*--- Registration ---*
To register, please go to:
http://www.semantics.cc/registration
We look forward to meeting you at SEMANTiCS 2015!
In the version history of an image (or any attached file in MediaWiki), the
page displays "Date/Time" with a link to that version. The timestamp
displayed is the upload timestamp of that version. If you look closely, you
can see that the real filename includes a different timestamp. This turns
out to be the timestamp of when that file was superseded by a subsequent
version.
I have looked in the database tables and can see that in the oldimage
table, each row has an "oi_archive_name" with the timestamp of when that
version was superseded and an "oi_timestamp" of when that version was
actually uploaded.
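To make the mismatch concrete, here is a small TypeScript illustration;
the row values are hypothetical, following the "<timestamp>!<name>"
archive naming described above:

    // Hypothetical oldimage row values illustrating the question.
    const oiArchiveName = "20150820164500!Example.png"; // supersession time
    const oiTimestamp = "20150601120000";               // upload time

    // The archive filename's prefix is the supersession timestamp, not
    // the upload timestamp stored in oi_timestamp.
    const [archivedAt, baseName] = oiArchiveName.split("!");
    console.log(`${baseName}: uploaded ${oiTimestamp}, superseded ${archivedAt}`);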
Is there a reason to name the old versions of the files with the
superseding timestamp instead of the upload timestamp? It seems to me that
the timestamp of when that version was uploaded is more relevant.
Daren
--
__________________
http://enterprisemediawiki.org
http://mixcloud.com/darenwelsh
http://www.beatportfolio.com
FYI
---------- Forwarded message ----------
From: Yuvi Panda <yuvipanda(a)gmail.com>
Date: Thu, Aug 20, 2015 at 4:45 PM
Subject: Evaluation of opt-in alternatives to Grid Engine on Tool Labs
('clustering solution')
To: Wikimedia Labs <labs-l(a)lists.wikimedia.org>
Hello!
One of the Labs team's experimental goals for this quarter is to make
available a new, more modern alternative to GridEngine, just for
webservices, on Tool Labs. We are starting to evaluate which systems we
should use; this is tracked at
https://phabricator.wikimedia.org/T106475. The (still incomplete)
evaluation spreadsheet is at
https://docs.google.com/spreadsheets/d/1YkVsd8Y5wBn9fvwVQmp9Sf8K9DZCqmyJ-ew…
We are evaluating Kubernetes and Mesos/Marathon as alternatives.
GridEngine is also being scored alongside them, so if we find that it
wins, we'll abandon the experiment and continue using GridEngine only.
Do provide comments on the phab ticket and follow along :)
== WHY? ==
Because our current webservices setup is a pile of hacks on top of
GridEngine, causing... interesting problems due to the complexity
involved.
GridEngine doesn't support a lot of features that people using more
modern systems take for granted - like containerization + isolation, a
nice API, continuous deployment, autoscaling... Having an alternative to
play with allows us to build newer, better-featured and more robust
systems.
== OMG, WILL I HAVE TO CHANGE THE WAY MY CODE WORKS NOW?! ==
Nope. For now this is just an alternative - when completed, you will
be able to run your webservice on this cluster with something like:
webservice --provider=<something> start
And nothing else will change - everything else should still be
compatible. We'll eventually provide more features on the new setup,
but there will be no forced migration of any sort. If the alternative
becomes the default at any point, we'll ensure that things that worked
before continue working without any extra effort on the Tool
Author's part.
--
Yuvi Panda T
http://yuvi.in/blog
I'm elevating this task of mine to RFC status:
https://phabricator.wikimedia.org/T89331
Running the output of the MediaWiki parser through HTML Tidy always
seemed like a nasty hack. The effects on wikitext syntax are arbitrary
and change from version to version. When we upgrade our Linux
distribution, we sometimes see changes in the HTML generated by given
wikitext, which is not ideal.
Parsoid took a different approach. After token-level transformations,
tokens are fed into the HTML5 parsing algorithm, a complex but
well-specified procedure that generates a DOM tree from quirky input
text.
http://www.w3.org/TR/html5/syntax.html
We can get nearly the same effect in MediaWiki by replacing the Tidy
transformation stage with an HTML5 parse followed by serialization of
the DOM back to HTML. This would stabilize wikitext syntax and resolve
several important syntax differences compared to Parsoid.
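As a minimal sketch of that pipeline (in TypeScript, using parse5, one of
the JavaScript implementations of the HTML5 parsing algorithm; the library
choice and the input string are purely illustrative):

    // Minimal sketch: run tag-soup markup through an HTML5 parser and
    // serialize the DOM back to HTML, as the Tidy replacement would.
    import { parseFragment, serialize } from "parse5";

    // Mis-nested markup of the kind Tidy currently repairs.
    const dirty = "<b>bold <i>bold italic</b> italic?</i>";

    // The HTML5 tree-construction rules perform well-specified error
    // recovery (here, the "adoption agency algorithm" for mis-nesting).
    const fragment = parseFragment(dirty);
    console.log(serialize(fragment));
    // -> <b>bold <i>bold italic</i></b><i> italic?</i>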
However:
* I have not been able to find any PHP implementation of this
algorithm. Masterminds and Ressio do not even attempt it. Electrolinux
attempts it but does not implement the error recovery parts that are
of interest to us.
* Writing our own would be difficult.
* Even if we did write it, it would probably be too slow.
So the question is: what language should we use? Since this is the
standard programmer troll question, please bring popcorn.
The best implementation of this algorithm is in Java: the validator.nu
parser is maintained by Mozilla, and has source translation to C++,
which is used by Mozilla and could potentially be used for an HHVM
extension.
There is also a Rust port (also written by Mozilla), and notable
implementations in JavaScript and Python.
For WMF, a Java service would be quite easy to do, and I have
prototyped it already. An HHVM extension might also be possible. A
non-service fallback for small installations might be Node.js or a
compiled binary from Rust or C++.
-- Tim Starling
In the next RFC meeting, we will discuss the following RFC:
* Multi-Content Revisions
<https://phabricator.wikimedia.org/T107595>
The meeting will be on the IRC channel #wikimedia-office on
chat.freenode.net at the following time:
* UTC: Wednesday 21:00
* US PDT: Wednesday 14:00
* Europe CEST: Wednesday 23:00
* Australia AEST: Thursday 07:00
-- Tim Starling