Hi everyone,
I've been hacking on a new tool and I thought I'd share what (little) I
have so far to get some comments and learn of related approaches from the
community.
The basic idea is a browser extension that, on whitelisted domains such as
news websites, tells the user whether the page they're viewing looks like a
good reference for a Wikipedia article. This would hopefully prompt
casual/opportunistic edits, especially for articles that may normally be
overlooked.
As a proof of concept for a backend, I built a simple bag-of-words model of
the TextExtracts of enwiki's
Category:All_articles_needing_additional_references. I then set up a tool
[1] to receive HTML input and retrieve the 5 most similar articles to that
input. You can try it out in your browser [2], or on the command line [3].
The results could definitely be better, but having tried it on a few
different articles over the past few days, I think there's some potential
there.
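To make the approach concrete, here is a minimal sketch of bag-of-words retrieval in plain Python. It is not the code from the repository linked in [1]; the tokenizer, the use of cosine similarity as the ranking measure, the function names, and the sample extracts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_text, articles, k=5):
    """Return up to k (title, score) pairs, best match first.

    `articles` maps an article title to its plain-text extract.
    """
    query = bag_of_words(query_text)
    scored = [
        (title, cosine_similarity(query, bag_of_words(extract)))
        for title, extract in articles.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up extracts, for illustration only.
articles = {
    "Drug policy of Portugal": "portugal decriminalised drug possession in 2001",
    "Lisbon": "lisbon is the capital and largest city of portugal",
    "Bicycle": "a bicycle is a human powered pedal driven vehicle",
}
results = most_similar("Portugal's drug decriminalisation policy", articles, k=2)
```

A real backend would precompute the article vectors once and would probably weight terms (for example with TF-IDF) rather than use raw counts.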
I'd be interested in hearing your thoughts on this. Specifically:
* If such a backend/API were available, would you be interested in using it
for other tools? If so, what functionality would you expect from it?
* I'm thinking of just throwing away the above proof of concept and using
ElasticSearch, though I don't know a lot about it. Is anyone aware of a
similar dataset that already exists there, by any chance? Or any reasons
not to go that way?
* Any other comments on the overall idea or implementation?
Thanks!
1- https://github.com/eggpi/similarity
2- https://tools.wmflabs.org/similarity/
3- Example: curl
https://www.nytimes.com/2017/09/22/opinion/sunday/portugal-drug-decriminali…
| curl -X POST http://tools.wmflabs.org/similarity/search --form "text=<-"
--
Guilherme P. Gonçalves
The Cloud Services team has two positions open that might be
interesting to some members of our technical community. We are
especially hoping to find someone from our existing community of
technical contributors for the Technical Support position.
== Cloud Services Technical Support (Contract) ==
Type: Contract
Hours: 20 hours per week
Duration: 3 months (with possibility for extension)
The Wikimedia Cloud Services team is looking for a part-time Technical
Support specialist to help the Wikimedia movement's volunteer
developers to rapidly clarify and resolve their technical questions
and issues.
Responsibilities:
* Actively monitor various IRC channels, wikipages, and mailing lists,
for support requests
* Help volunteer developers to diagnose and resolve technical issues
with our IaaS and PaaS infrastructure and their code
* Communicate clearly and patiently with individuals from our
international communities of volunteer developers
* Write, verify, and improve documentation used by volunteer
developers and technical staff
* Ensure positive and constructive discussions with the community and
the Wikimedia Foundation
* Work and communicate effectively within a small team distributed
across multiple time zones
Full job posting: <http://grnh.se/h5gjvl1>
== Operations Engineer (Cloud Services) ==
Responsibilities:
* Perform day-to-day operational tasks on Wikimedia’s Cloud Services
infrastructure (deployment, maintenance, configuration,
troubleshooting)
* Support volunteer and staff developers using Infrastructure as a
Service (IaaS) and Platform as a Service (PaaS) products
* Assist in the architectural design of new services and making them
operate at scale
* Assist in or lead incident response, diagnosis and follow-up on
system outages or alerts across our stack
* On-call support and off-hours coverage in a 24x7 environment
Full job posting: <http://grnh.se/1xb8om1>
If you are interested in either job, please read the full job postings
and apply online. Feel free to share the links with people you may
know as well via your favorite social media platforms, email, or just
old-fashioned voice communications. :) You can find these and other
jobs with the Wikimedia Foundation at
<https://wikimediafoundation.org/wiki/Work_with_us>.
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services Boise, ID USA
irc: bd808 v:415.839.6885 x6855
labsdb1001.eqiad.wmnet (aka c1.labsdb) will be rebooted at 2017-10-30
14:30 UTC for kernel updates
(<https://phabricator.wikimedia.org/T168584>).
Normal usage of the *.labsdb databases should experience only limited
interruption as DNS is changed to point to labsdb1003.eqiad.wmnet
(aka c3.labsdb). The c1.labsdb service name will *not* be updated,
however, so tools hardcoded to that service name will be interrupted
until the reboot is complete.
There is a possibility of catastrophic hardware failure in this
reboot. There will be no way to recover the server or the data it
currently hosts if that happens. Tools that are hosting self-created
data on c1.labsdb *will* lose that data if there is hardware failure.
If you are unsure if your tool is hosting data on c1.labsdb, you can
check at <https://tools.wmflabs.org/tool-db-usage/>.
This reboot is an intermediate step before the complete shutdown of
the server on Wednesday 2017-12-13. See
<https://wikitech.wikimedia.org/wiki/Wiki_Replica_c1_and_c3_shutdown>
for more information.
Bryan (on behalf of the Wikimedia Cloud Services and DBA teams)
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services Boise, ID USA
irc: bd808 v:415.839.6885 x6855
_______________________________________________
Wikimedia Cloud Services announce mailing list
Cloud-announce(a)lists.wikimedia.org (formerly labs-announce(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce
Hi,
In case you're not aware, I maintain a small Python library that has
some common utility functions for my Toolforge tools.
Features:
* toolforge.connect('xxwiki') to easily get a database connection
* toolforge.set_user_agent('mytool') to set a requests user-agent that
complies with the Wikimedia User-Agent policy.
* toolforge.redirect_to_https - Enforce HTTPS-only for Flask tools.
If you're already using the library, you can update to version 4.1.0 to
automatically switch to the new database servers. Thanks to Bryan Davis
for the patch.
Full documentation for the library is on Wikitech[1]. If you have
suggestions for new features, let me know :)
[1] https://wikitech.wikimedia.org/wiki/User:Legoktm/toolforge_library
-- Legoktm
The labsdb1001.eqiad.wmnet (aka c1.labsdb) and labsdb1003.eqiad.wmnet
(aka c3.labsdb) servers are being shut down and permanently removed
from service on Wednesday 2017-12-13.
TL;DR
* Change your tools and scripts to use:
- "*.web.db.svc.eqiad.wmflabs" (real-time response needed)
- "*.analytics.db.svc.eqiad.wmflabs" (batch jobs; long queries)
* Replace "*" with either a shard name (e.g. s1) or a wikidb name
(e.g. enwiki).
* The new servers do not support user-created databases/tables because
replication can't be guaranteed. See T156869 and below for more
information.
* Migrate your user-created tables to tools.db.svc.eqiad.wmflabs
(also known as tools.labsdb) and perform JOINs in application code
rather than inside the database.
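As a rough illustration of the last point, the sketch below joins rows from two separate queries in Python instead of in SQL. The row shapes, column names, and the helper `join_in_application` are hypothetical; in a real tool the two lists would come from one query against a wiki replica and one against tools.db.svc.eqiad.wmflabs.

```python
def join_in_application(replica_rows, tool_rows, key):
    """Inner-join two lists of dict rows on `key`, in Python."""
    # Index the tool-owned rows by the join key for fast lookup.
    tool_by_key = {}
    for row in tool_rows:
        tool_by_key.setdefault(row[key], []).append(row)

    joined = []
    for row in replica_rows:
        for match in tool_by_key.get(row[key], []):
            merged = dict(match)
            merged.update(row)
            joined.append(merged)
    return joined


# Stand-ins for the results of two separate queries (hypothetical data).
page_rows = [
    {"page_id": 1, "page_title": "Foo"},
    {"page_id": 2, "page_title": "Bar"},
]
note_rows = [
    {"page_id": 2, "note": "needs references"},
    {"page_id": 3, "note": "orphaned"},
]
joined = join_in_application(page_rows, note_rows, "page_id")
```

For large tables it is usually better to fetch the keys from the smaller side first and restrict the second query to those keys, rather than pulling both tables in full.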
What is changing?
* Week of 2017-10-30 to 2017-11-03 (exact date to be determined)
** Reboot labsdb1001.eqiad.wmnet (aka c1.labsdb) for kernel updates
** There is a possibility of catastrophic hardware failure in this
reboot. There will be no way to recover the server or the data it
currently hosts if that happens.
* Week of 2017-11-06 to 2017-11-10 (exact date to be determined)
** Reboot labsdb1003.eqiad.wmnet (aka c3.labsdb) for kernel updates
** There is a possibility of catastrophic hardware failure in this
reboot. There will be no way to recover the server or the data it
currently hosts if that happens.
* Wednesday 2017-12-13
** "*.labsdb" service names switched to point at
"*.web.db.svc.eqiad.wmflabs" equivalents.
** User-created tables will not be allowed on the new servers that
"c1.labsdb" and "c3.labsdb" point to.
** labsdb1001.eqiad.wmnet removed from service.
** labsdb1003.eqiad.wmnet removed from service.
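If you have many hardcoded host names to update, a small helper along these lines may help. It only encodes the naming scheme given in the TL;DR (a shard or wikidb name followed by the web or analytics suffix, and tools.labsdb becoming tools.db.svc.eqiad.wmflabs). The function names are made up for this sketch, and the "c1"/"c3" service names are not handled specially.

```python
# Suffixes taken from the announcement; the dictionary keys are our own labels.
NEW_SUFFIXES = {
    "web": "web.db.svc.eqiad.wmflabs",              # real-time response needed
    "analytics": "analytics.db.svc.eqiad.wmflabs",  # batch jobs; long queries
}
TOOLS_DB_HOST = "tools.db.svc.eqiad.wmflabs"


def replica_host(name, cluster="web"):
    """Build a new-style host name for a shard (s1) or wikidb (enwiki)."""
    if cluster not in NEW_SUFFIXES:
        raise ValueError(f"unknown cluster: {cluster!r}")
    return f"{name}.{NEW_SUFFIXES[cluster]}"


def migrate_legacy_host(host, cluster="web"):
    """Translate an old "<name>.labsdb" host name to its new equivalent.

    Host names that do not end in ".labsdb" are returned unchanged.
    """
    suffix = ".labsdb"
    if not host.endswith(suffix):
        return host
    name = host[: -len(suffix)]
    if name == "tools":
        # User-created tables live on the tools database server.
        return TOOLS_DB_HOST
    return replica_host(name, cluster)
```

For example, `migrate_legacy_host("enwiki.labsdb", cluster="analytics")` gives the analytics host name for enwiki.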
Why are we doing this?
See <https://wikitech.wikimedia.org/wiki/Wiki_Replica_c1_and_c3_shutdown>
and <https://phabricator.wikimedia.org/T142807> for a more complete
description of the reasons for these changes.
Bryan (on behalf of the Wikimedia Cloud Services and DBA teams)
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services Boise, ID USA
irc: bd808 v:415.839.6885 x6855
Sorry for cross-posting!
Reminder: Technical Advice IRC meeting again **tomorrow 3-4 pm UTC** on
#wikimedia-tech.
The Technical Advice IRC meeting is open for all volunteer developers,
topics and questions. This can be anything from "how to get started" and
"who would be the best contact for X" to specific questions on your project.
If you already know what you would like to discuss or ask, please add your
topic to the next meeting:
https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
This meeting is an offer by WMDE’s tech team. Hosts of tomorrow's meeting
are: @addshore & @CFisch_WMDE.
Hope to see you there!
Michi (for WMDE’s tech team)
--
Michael F. Schönitzer
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
http://wikimedia.de
Imagine a world in which every person can freely share in the sum of all
knowledge. Help us make it happen!
http://spenden.wikimedia.de/
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hi,
I recently spun up a VM and am looking to learn more about how to maintain
backups of the data that is on it, particularly its database. How do others
go about accomplishing this?
Cheers,
Morten
Sorry for cross-posting!
Reminder: Technical Advice IRC meeting again **tomorrow 3-4 pm UTC** on
#wikimedia-tech.
The Technical Advice IRC meeting is open for all volunteer developers,
topics and questions. This can be anything from "how to get started" and
"who would be the best contact for X" to specific questions on your project.
If you already know what you would like to discuss or ask, please add your
topic to the next meeting:
https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
This meeting is an offer by WMDE’s tech team. Hosts of today's meeting are:
@addshore & @CFisch_WMDE.
Hope to see you there!
Michi (for WMDE’s tech team)
Hate to necro an old thread. This issue went away for the most part for
about 6-8 months but is now back with a vengeance. It's happening almost
daily now.
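For anyone hitting the same thing, here is a rough sketch of how the affected files could be found and fixed in bulk. It assumes that "0000" in the quoted thread refers to the files' permission bits, that the files sit directly in one log directory, and that 0644 is the mode you want to restore. The function names are made up for this example, and it does nothing about the underlying cause.

```python
import os
import stat


def find_zero_mode_files(directory):
    """Return regular files in `directory` whose permission bits are 0000."""
    zero = []
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            mode = stat.S_IMODE(entry.stat(follow_symlinks=False).st_mode)
            if mode == 0:
                zero.append(entry.path)
    return sorted(zero)


def restore_permissions(paths, mode=0o644):
    """Reset each file to `mode` (0644 here is an assumed default)."""
    for path in paths:
        os.chmod(path, mode)
```

Running `restore_permissions(find_zero_mode_files(log_dir))` from a cron job would be a workaround only, with `log_dir` standing in for wherever the rotated logs are written.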
On Sun, Dec 4, 2016 at 2:08 PM, John <phoenixoverride(a)gmail.com> wrote:
> Via the Python log rotation process. I've been using the same code for
> years though.
>
> On Sun, Dec 4, 2016 at 1:14 PM Tim Landscheidt <tim(a)tim-landscheidt.de>
> wrote:
>
>> (anonymous) wrote:
>>
>> > Correct, when I notice files with 0000 I fix them. However it's
>> happening
>> > repeatedly and just started recently.
>>
>> > […]
>>
>> How are those files created?
>>
>> Tim
>>
>>
>> _______________________________________________
>> Labs-l mailing list
>> Labs-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>
>