Just a reminder this will be taking place in one hour!
On Tue, Feb 14, 2017 at 2:49 PM, Sarah R <srodlund(a)wikimedia.org> wrote:
> Hi Everyone,
>
> The next Research Showcase will be live-streamed on February 15, 2017, at
> 11:30 AM PST (18:30 UTC).
>
> YouTube stream: https://www.youtube.com/watch?v=m6smzMppb-I
>
> As usual, you can join the conversation on IRC at #wikimedia-research.
> And, you can watch our past research showcases here
> <https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#February_2017>.
>
> This month's presentations:
>
> Wikipedia and the Urban-Rural Divide
> By *Isaac Johnson*
> Wikipedia articles
> about places, OpenStreetMap features, and other forms of peer-produced
> content have become critical sources of geographic knowledge for humans and
> intelligent technologies. We explore the effectiveness of the peer
> production model across the rural/urban divide, a divide that has been
> shown to be an important factor in many online social systems. We find that
> in Wikipedia (as well as OpenStreetMap), peer-produced content about rural
> areas is of systematically lower quality, less likely to have been produced
> by contributors who focus on the local area, and more likely to have been
> generated by automated software agents (i.e. “bots”). We continue to
> explore and codify the systemic challenges inherent to characterizing rural
> phenomena through peer production as well as discuss potential solutions.
>
>
> Wikipedia Navigation Vectors
> By *Ellery Wulczyn
> <https://www.mediawiki.org/wiki/User:Ewulczyn_(WMF)>*
> In this project, we
> learned embeddings for Wikipedia articles and Wikidata
> <https://www.wikidata.org/wiki/Wikidata:Main_Page> items by applying
> Word2vec <https://en.wikipedia.org/wiki/Word2vec> models to a corpus of
> reading sessions. Although Word2vec models were developed to learn word
> embeddings from a corpus of sentences, they can be applied to any kind of
> sequential data. The learned embeddings have the property that items with
> similar neighbors in the training corpus have similar representations (as
> measured by the cosine similarity
> <https://en.wikipedia.org/wiki/Cosine_similarity>, for example).
> Consequently, applying Word2vec to reading sessions results in article
> embeddings, where articles that tend to be read in close succession have
> similar representations. Since people usually generate sequences of
> semantically related articles while reading, these embeddings also capture
> semantic similarity between articles.
>
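The cosine similarity mentioned above is simple to compute directly. A minimal sketch in plain Python; the vectors here are made-up toy values, not actual article embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy example: parallel vectors score 1.0, orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Articles whose embeddings score close to 1.0 under this measure are the ones that tend to be read in close succession.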
> --
> Sarah R. Rodlund
> Senior Project Coordinator-Product & Technology, Wikimedia Foundation
> srodlund(a)wikimedia.org
>
--
Sarah R. Rodlund
Senior Project Coordinator-Product & Technology, Wikimedia Foundation
srodlund(a)wikimedia.org
Please take this as a final opportunity to review and suggest changes
to the Amendments section of the draft Code of Conduct.
This text has been up for a while, but I recently put in a small
proposed change to make it harder for the Committee to veto amendments.
This is the last section. After it's approved, the Code of Conduct will
become policy, and the Amendments section will specify how future
changes to the policy work.
* Current text:
https://www.mediawiki.org/w/index.php?title=Code_of_Conduct/Draft&oldid=238…
(under "Page: Code of Conduct/Amendments")
* Discussion:
https://www.mediawiki.org/wiki/Talk:Code_of_Conduct/Draft#New_proposal_for_…
The approval discussion hasn't started yet. It will be next and I will
send out a separate email.
Thanks,
Matt Flaschen
P.S. You can still participate in deciding whether to approve "Creation
and renewal of the Committee" at
https://www.mediawiki.org/wiki/Talk:Code_of_Conduct/Draft#Finalize_.22Creat…
.
Hi all,
Thanks for launching this to the public! For a while now, I have been running
an unofficial SSE stream for open consumption, based on the IRC stream, at
http://wikipedia-edits.herokuapp.com/ (see the link at the top).
Looking forward to migrating over to the new official SSE stream, but I
wanted to check first whether anyone relies on mine. I would probably turn it
off soonish (time permitting), but could also keep it running indefinitely
if people rely heavily on it. Just let me know.
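For anyone pointing a client at either stream: Server-Sent Events is a plain-text protocol, so a consumer only needs to split the stream on blank lines and read the `event:`/`data:` fields. A minimal parser sketch; the sample event below is illustrative, not real output from either stream:

```python
import json

def parse_sse_events(raw):
    """Parse raw SSE text into (event_name, data_string) pairs.

    Events are separated by blank lines; each line is 'field: value'.
    """
    events = []
    for chunk in raw.split("\n\n"):
        name, data_lines = "message", []
        for line in chunk.splitlines():
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            events.append((name, "\n".join(data_lines)))
    return events

# Illustrative edit event, shaped loosely like a recent-changes message.
sample = 'event: edit\ndata: {"wiki": "enwiki", "title": "Example"}\n\n'
for name, data in parse_sse_events(sample):
    print(name, json.loads(data)["title"])
```

A real client would read the HTTP response incrementally rather than a whole string, but the field handling is the same.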
Cheers,
Tom
--
Dr. Thomas Steiner, Employee (https://blog.tomayac.com,
https://twitter.com/tomayac)
Google Germany GmbH, ABC-Str. 19, 20354 Hamburg, Germany
Managing Directors: Matthew Scott Sucherman, Paul Terence Manicle
Registration office and registration number: Hamburg, HRB 86891
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.1.17 (GNU/Linux)
iFy0uwAntT0bE3xtRa5AfeCheCkthAtTh3reSabiGbl0ck0fjumBl3DCharaCTersAttH3b0ttom.hTtP5://xKcd.c0m/1181/
-----END PGP SIGNATURE-----
Forwarding.
Pine
---------- Forwarded message ----------
From: Léa Lacroix <lea.lacroix(a)wikimedia.de>
Date: Tue, Feb 7, 2017 at 2:45 AM
Subject: [Wikidata] WMDE looking for a data analyst
To: "Discussion list for the Wikidata project." <
wikidata(a)lists.wikimedia.org>
Hello all,
Our development team is looking for a freelance data analyst, working
remotely, mostly on Wikidata.
The person will:
- Work closely with product managers and UX researchers to maintain and
improve detailed on-going analysis of the department’s products, their
usage patterns and performance.
- Write database queries and supporting code to analyze usage volume,
user behaviour and performance data to identify opportunities and areas for
improvement.
- Collaborate with other analysts in the department to maintain our
department’s dashboards, ensuring they are up-to-date, accurate, fair and
focussed on representations of our product efficiency.
- Support product managers through rapidly surfacing positive and
adverse data trends, and complete ad hoc analysis support as needed.
   - Communicate your findings clearly and responsively to a range of
   departmental, organisational, volunteer and public stakeholders in order to
   inform and educate them.
If you want to know more and apply:
https://software.wikimedia.de/jobs/data-analyst
See also our other job offers: https://software.wikimedia.de/jobs
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
One of our longest-running sets of stats on Wikipedia is the time between
ten million edits; we now have stats for this over a fifteen-year period.
After emailing User:Katalaveno and getting their agreement I have moved
https://en.wikipedia.org/w/index.php?title=User:Katalaveno/TBE&redirect=no
to https://en.wikipedia.org/wiki/Wikipedia:Time_Between_Edits and am making
a few changes.
One area that perhaps someone on this list can explain is the difference
between the number of edits as measured by revision ID and as measured by
NUMBEROFEDITS - the difference is over a hundred million. That is too big a
number for it to be a measure of logged admin actions, unless deleting a
page increments the number of edits once for each revision deleted. It
might be in the right ballpark to be a measure of edit conflicts; if so,
it would be very good to have a measure of something we had thought
unmeasurable.
So I'm wondering if anyone on this list knows the difference between
{{NUMBEROFEDITS}} and revision ID.
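Both counters are queryable, which makes the gap easy to track over time: {{NUMBEROFEDITS}} mirrors the site statistics table (available via the API as action=query&meta=siteinfo&siprop=statistics), while the latest revision ID can be read off any fresh edit. A sketch of the comparison; the figures below are illustrative placeholders, not real English Wikipedia numbers:

```python
def edit_count_gap(siteinfo_stats, latest_revid):
    """Return how far the revision-ID counter runs ahead of the edit count.

    siteinfo_stats is the parsed 'statistics' object from
    action=query&meta=siteinfo&siprop=statistics; latest_revid is the
    revision ID of a very recent edit.
    """
    return latest_revid - siteinfo_stats["edits"]

# Illustrative numbers only (not real English Wikipedia figures):
sample_stats = {"edits": 870_000_000}
print(edit_count_gap(sample_stats, 980_000_000))
```

Sampling this gap periodically would at least show whether it grows at a steady rate, which might help distinguish the candidate explanations above.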
WSC
Hello volunteer developers & technical contributors!
The Wikimedia Foundation is asking for your feedback in a survey. We want
to know how well we are supporting your contributions on and off wiki, and
how we can change or improve things in the future.[1] The opinions you
share will
directly affect the current and future work of the Wikimedia Foundation.
*** The survey will close on February 15, 2017. ***
To say thank you for your time, we are giving away 10 Wikimedia T-shirts to
randomly selected people who take the survey.[2] The survey is available in
various languages and will take between 20 and 40 minutes.
Use this link to take the survey now:
https://wikimedia.qualtrics.com/SE/?SID=SV_6mTVlPf6O06r3mt&Aud=DEV&Src=DEV
You can find more information about this project here[3]. This survey is
hosted by a third-party service and governed by this privacy statement[4].
Please visit our frequently asked questions page to find more information
about this survey[5]. If you need additional help or have questions about
this survey, send an email to surveys(a)wikimedia.org.
Thank you!
Edward Galvez
Community Engagement
Wikimedia Foundation
[1] This survey is primarily meant to get feedback on the Wikimedia
Foundation's current work, not long-term strategy.
[2] Legal information we have to share: No purchase necessary. Must be the
age of majority to participate. Sponsored by the Wikimedia Foundation
located at 149 New Montgomery, San Francisco, CA, USA, 94105. Ends February
16, 2017. Void where prohibited. Follow this link for the contest rules:
https://meta.wikimedia.org/wiki/Community_Engagement_Insights/2017_second_c…
[3] About this survey:
https://meta.wikimedia.org/wiki/Community_Engagement_Insights/About_CE_Insights
[4] Privacy statement:
https://wikimediafoundation.org/wiki/Community_Engagement_Insights_2016_Survey_Privacy_Statement
[5] FAQ:
https://meta.wikimedia.org/wiki/Community_Engagement_Insights/Frequently_asked_questions
--
Edward Galvez
Evaluation Strategist (Survey Specialist), and
Affiliations Committee Liaison
Learning & Evaluation
Community Engagement
Wikimedia Foundation