Hi! I'm currently attending Wikimania (I have a session on Friday at
4:30pm).
Tilman Bayer suggested that I share this tool and these techniques here, so
I am following his advice :).
I've been using Google BigQuery for a while to analyze Wikipedia's publicly
available data. Its main advantages:
- It's unbelievably fast (try it - operations that you might expect to run
in minutes or hours run in seconds).
- It's secure, but you can also instantly share data (no need to download
and set up locally before being able to analyze - BigQuery is always on).
- Anyone can use BigQuery with a free quota of 1TB of analysis per month.
Interesting links:
- Quick getting started:
https://www.reddit.com/r/bigquery/comments/3dg9le/analyzing_50_billion_wiki…
- Analyzing the gender gap in Wikipedia (Freebase, and joining it with
pageviews): https://www.youtube.com/watch?v=lV5vk3higvA
- Massive Geo-Ip geolocation from the changelog:
https://www.reddit.com/r/bigquery/comments/1zh7ty/massive_geoip_geolocation…
- Just for fun, the most popular numbers:
https://www.reddit.com/r/bigquery/comments/2p0vz4/query_of_the_day_the_most…
- Top Wikipedia Entries Which Are Most-Edited by Members of the U.S.
Congress: http://minimaxir.com/2014/07/caucus-needed/
- Music recommendations:
http://apassant.net/2014/07/11/music-recommendations-300m-data-points-sql/
I have a couple of other interesting examples I haven't written about, but
the invitation here is for you to try your own :).
My main challenge today: how to get more publicly available data into
BigQuery. Let's work together :). I'm sitting with the big data analytics
team today at the Wikimedia hackathon - and as mentioned above, I'll do a
session on this topic on Friday at 4:30pm.
Thanks!
Final Call for Participation
Conference on Intelligent Computer Mathematics
CICM 2015
13-17 July 2015
Washington DC, USA
Registration Deadline July 6th, 2015
The programme for this year's CICM in Washington can be found at
http://www.cicm-conference.org/2015/cicm.php?event=&menu=detailed-programme
The accepted papers are listed at
http://www.cicm-conference.org/2015/cicm.php?event=&menu=talks
In addition, we solicit posters. These will not be peer reviewed; we will
only screen them for relevance to the conference. A poster presentation
will consist of a 5-minute teaser talk and the presentation of the poster
on Tuesday morning (together with the other presentations in the
Systems/Data/Projects track).
You can submit a brief abstract for a poster by 22 June 2015 via EasyChair:
https://www.easychair.org/conferences/?conf=cicm2015
You will be informed about acceptance shortly after your submission.
Registration to the conference will open shortly.
For details on the conference, registration, accommodation, etc. see
http://www.cicm-conference.org/2015/cicm.php
**********************************************************************
Invited Speakers:
**********************************************************************
* Leonardo de Moura, https://leodemoura.github.io/
"Formalizing mathematics using the Lean Theorem Prover"
(http://leanprover.github.io/)
* Tobias Nipkow, http://www21.in.tum.de/~nipkow/
"Analyzing the Archive of Formal Proofs"
* Jim Pitman, http://www.stat.berkeley.edu/~pitman/
"Towards a Global Digital Mathematics Library"
* Richard Zanibbi, http://www.cs.rit.edu/~rlaz/
"Math Search for the Masses: Multimodal Search
Interfaces and Appearance-Based Retrieval"
**********************************************************************
The principal tracks of the conference will be:
**********************************************************************
* Calculemus (Symbolic Computation and Mechanised Reasoning)
Chair: Jacques Carette
* DML (Digital Mathematical Libraries)
Chair: Volker Sorge
* MKM (Mathematical Knowledge Management)
Chair: Cezary Kaliszyk
* Systems and Data
Chair: Florian Rabe
* Doctoral Programme
Chair: Umair Siddique
The publicity chair is Serge Autexier. The local arrangements are
coordinated by the Local Arrangements Chairs, Bruce R. Miller
(National Institute of Standards and Technology, USA) and Abdou
Youssef (The George Washington University, Washington, D.C.), and the
overall programme is organized by the General Programme Chair,
Manfred Kerber (U. Birmingham, UK).
As in previous years, we have co-located workshops:
* Formal Mathematics for Mathematicians
* Theorem proving components for Educational software (ThEdu'15)
* MathUI
Furthermore, we have a doctoral programme to mentor doctoral students
giving presentations, as well as a tutorial on the generic proof
assistant Isabelle.
--------------------------------------------------------------------------------
---------- Forwarded message ----------
From: *Dirk Riehle* <dirk(a)riehle.org>
Date: Saturday, July 4, 2015
Subject: [opensource] Call for participation in OpenSym 2015, Aug 19-20,
San Francisco!
To: opensource(a)lists.stanford.edu
Call for participation in OpenSym 2015!
Aug 19-20, 2015, San Francisco, http://opensym.org
----
FOUR FANTASTIC KEYNOTES
Richard Gabriel (IBM) on Using Machines to Manage Public Sentiment on
Social Media
Peter Norvig (GOOGLE) on Applying Machine Learning to Programs
Robert Glushko (UC BERKELEY) on Collaborative Authoring, Evolution, and
Personalization
Anthony Wasserman (CMU SV) on Barriers and Pathways to Successful
Collaboration
More at
http://www.opensym.org/category/conference-contributions/keynotes-invited-t…
----
GREAT RESEARCH PROGRAM
All core open collaboration tracks, including
- free/libre/open source
- open data
- Wikipedia
- wikis and open collaboration, and
- open innovation
More at
http://www.opensym.org/2015/06/25/preliminary-opensym-2015-program-announce…
----
INCLUDING OPEN SPACE
The facilities provide room and space for your own working groups.
----
AT A WONDERFUL LOCATION
OpenSym 2015 takes place from Aug 19-20 at the Golden Gate Club of San
Francisco, smack in the middle of the Presidio, with a wonderful view of
the Golden Gate Bridge.
More at http://www.opensym.org/os2015/location/
----
REGISTRATION
Is simple, subsidized, and all-encompassing.
Find it here: http://www.opensym.org/os2015/registration/
Prices will go up after July 12th, so be sure to register early!
----
We would like to thank our sponsors Wikimedia Foundation, Google, TJEF, and
the ACM.
_______________________________________________
opensource mailing list
opensource(a)lists.stanford.edu
https://mailman.stanford.edu/mailman/listinfo/opensource