Hey all!
wmfdata-python https://github.com/wikimedia/wmfdata-python (a package that streamlines access to private analytics data) has been updated to version 1.1. Here's what's new:
- The new presto module supports querying the Data Lake using Presto https://wikitech.wikimedia.org/wiki/Analytics/Systems/Presto.
- The spark module has been refactored to support local and custom sessions.
- A new utils.get_dblist function provides easy access to wiki database lists, which is particularly useful with mariadb.run.
- The hive.run_cli function now creates its temp files in a standard location, to avoid creating distracting new entries in the current working directory.
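
For anyone curious about the temp-file change, here is a minimal sketch of the general technique using only Python's standard library. It is not the actual hive.run_cli code; the helper name write_query_to_temp_file and the .hql suffix are hypothetical, chosen for illustration only.

```python
import os
import tempfile


def write_query_to_temp_file(query: str) -> str:
    """Write a query to a new file in the system temp directory.

    Hypothetical helper for illustration; not part of wmfdata.
    Returns the path of the file, which the caller should delete.
    """
    # mkstemp places the file in tempfile.gettempdir() by default,
    # so nothing new appears in the current working directory.
    fd, path = tempfile.mkstemp(suffix=".hql", text=True)
    with os.fdopen(fd, "w") as f:
        f.write(query)
    return path


path = write_query_to_temp_file("SELECT 1;")
try:
    print("Query file directory:", os.path.dirname(path))
    print("System temp directory:", tempfile.gettempdir())
finally:
    # Clean up so the temp directory does not accumulate files.
    os.remove(path)
```

The standard temp directory is chosen by the operating system (or by environment variables such as TMPDIR), so notebooks and scripts leave the working directory untouched.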
Many thanks to:
- Andrew Otto and Adam Roses Wight for writing significant new code
- Mikhail Popov, Andrew Otto, and Luca Toscano for careful code review
As always, if you have questions or feedback about wmfdata-python, please email Product Analytics at product-analytics@wikimedia.org.
analytics-announce@lists.wikimedia.org