Hello all,
The Wikimania Hackathon is August 12-14, just over 2 weeks away! Details
are listed below.
Registration
Registration for the Wikimania Hackathon has now opened [1]! To register,
submit your information to the Wikimania organizers, which will give you
access to the Wikimania platform [2]. You can also optionally add your name
to the participants page on the Wikimania wiki [3].
Platform
The Hackathon will take place virtually on Pheedloop, the Wikimania
platform [4]. This platform complies with WCAG 2.1 AA, and will support
screen readers, font adjustments, and many other accessibility features.
Video sessions will be held in Jitsi through this platform.
Format of the event
The Hackathon consists of events spread over three days [5]:
- On the first day, there will be a pre-Hacking showcase to share project
  ideas and find collaborators. Anyone can present a project, and anyone can
  come as an observer.
- Throughout the next two days, there will be open hacking, social events,
  and technical sessions. Anyone can offer a session; just claim a slot on
  the schedule!
- Finally, there will be a final showcase to share the projects worked on
  during the Hackathon.
Preparing for the Hackathon
There are many ways to take part in the event. Think about what project you
might want to work on (see examples from past Hackathons [6]), add your
idea to Phabricator [7], and consider presenting at the pre-Hacking
showcase. Host a session by adding information to the schedule [5]. Check
out information for newcomers [8]. And don’t forget to register [2]!
Best wishes,
Haley and the Developer Advocacy Team
[1] https://wikimania.wikimedia.org/wiki/Hackathon
[2] https://pheedloop.com/register/wikimania2022/attendee/
[3] https://wikimania.wikimedia.org/wiki/Hackathon/Participants
[4] https://diff.wikimedia.org/2022/07/20/the-platform-powering-wikimania-2022/
[5] https://wikimania.wikimedia.org/wiki/Hackathon/Schedule
[6] https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2022/Showcase
[7] https://phabricator.wikimedia.org/project/board/6030/
[8] https://wikimania.wikimedia.org/wiki/Hackathon/Newcomers
(If you don’t work with links tables such as templatelinks, pagelinks and
so on, feel free to ignore this message)
TLDR: The schema of links tables (starting with templatelinks) will change
to use a numeric id pointing to the linktarget table instead of repeating
the namespace and title.
Hello,
Most links tables currently store three values per row: the page id (the
source), the namespace id of the target link and the title of the target.
For example, if a page with id 1 uses Template:Foo, the row in the database
would be 1, 10, and Foo (the Template namespace has id 10).
Repeating the target’s title is not sustainable. For example, more than half
of the Wikimedia Commons database is just three links tables. The sheer size
of these tables makes a considerable portion of all queries slower, and makes
backups and dumps take longer and use much more space than needed because of
unnecessary duplication. In Wikimedia Commons, a title is duplicated on
average around 100 times in templatelinks and around 20 times in pagelinks.
The numbers for other wikis depend on their usage patterns.
Moving forward, these tables will be normalized, meaning a typical row will
hold a mapping of page id to linktarget id instead. Linktarget is a new table
deployed in production that contains immutable records of namespace id and
title string. The major differences between the page and linktarget tables
are: 1- linktarget values won’t change (unlike page records, which change
when a page is moved) 2- linktarget values can point to non-existent pages
(= red links).
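
To make the difference concrete, here is a minimal sketch in Python that uses an in-memory SQLite database as a stand-in for the production tables. The column names (tl_from, tl_target_id, lt_id, lt_namespace, lt_title) are assumptions based on MediaWiki naming conventions, the sample rows are made up, and the normalize helper is hypothetical. Please check the actual schema before relying on any of it.

```python
import sqlite3

# Old-style rows: (source page id, target namespace id, target title).
# Made-up sample data; 10 is the Template namespace id in MediaWiki.
OLD_ROWS = [
    (1, 10, "Foo"),
    (2, 10, "Foo"),
    (3, 10, "Foo"),
    (3, 10, "Bar"),
]


def normalize(rows):
    """Split old-style rows into a linktarget table and an id-based links table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE linktarget (
            lt_id INTEGER PRIMARY KEY,
            lt_namespace INTEGER NOT NULL,
            lt_title TEXT NOT NULL,
            UNIQUE (lt_namespace, lt_title)
        );
        CREATE TABLE templatelinks (
            tl_from INTEGER NOT NULL,
            tl_target_id INTEGER NOT NULL REFERENCES linktarget (lt_id),
            PRIMARY KEY (tl_from, tl_target_id)
        );
        """
    )
    for page_id, namespace, title in rows:
        # Each (namespace, title) pair is stored once, however often it is linked.
        db.execute(
            "INSERT OR IGNORE INTO linktarget (lt_namespace, lt_title) VALUES (?, ?)",
            (namespace, title),
        )
        (target_id,) = db.execute(
            "SELECT lt_id FROM linktarget WHERE lt_namespace = ? AND lt_title = ?",
            (namespace, title),
        ).fetchone()
        # The links row now carries only two integers.
        db.execute(
            "INSERT INTO templatelinks (tl_from, tl_target_id) VALUES (?, ?)",
            (page_id, target_id),
        )
    return db


db = normalize(OLD_ROWS)
title_copies = db.execute("SELECT COUNT(*) FROM linktarget").fetchone()[0]
link_rows = db.execute("SELECT COUNT(*) FROM templatelinks").fetchone()[0]
print(title_copies, link_rows)
```

In this toy data the four link rows share two stored titles, instead of carrying four copies of the title strings.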
The first table to be migrated is templatelinks; pagelinks, imagelinks and
categorylinks will follow. During the migration phase both sets of values
will be accessible, but we will turn off writing to the old columns once the
new values are backfilled and reads are switched to the new schema. We will
announce any major changes beforehand; this message is to let you know that
these changes are coming.
While the normalization of all links tables will take several years to
finish, templatelinks will finish in the next few months and is the most
pressing one.
So if you:
- … rely on the schema of these tables in cloud replicas, you will need to
  change your tools.
- … rely on dumps of these tables, you will need to change your scripts.
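
As a rough idea of what such a change could look like in a tool, the sketch below contrasts a lookup of "which pages use Template:Foo" before and after normalization. It runs against a tiny in-memory SQLite stand-in, not the real replicas. The column names are assumptions based on MediaWiki naming conventions, and pages_using is a hypothetical helper.

```python
import sqlite3

# Lookup against the old columns (shown for comparison only, not executed here).
OLD_QUERY = """
    SELECT tl_from FROM templatelinks
    WHERE tl_namespace = ? AND tl_title = ?
"""

# Same lookup against the normalized schema: resolve the title via linktarget.
NEW_QUERY = """
    SELECT tl_from FROM templatelinks
    JOIN linktarget ON lt_id = tl_target_id
    WHERE lt_namespace = ? AND lt_title = ?
    ORDER BY tl_from
"""


def pages_using(db, namespace, title):
    """Return ids of pages that link to the given target (hypothetical helper)."""
    return [row[0] for row in db.execute(NEW_QUERY, (namespace, title))]


# Tiny stand-in database with made-up rows; 10 is the Template namespace id.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE linktarget (lt_id INTEGER PRIMARY KEY, lt_namespace INTEGER, lt_title TEXT);
    CREATE TABLE templatelinks (tl_from INTEGER, tl_target_id INTEGER);
    INSERT INTO linktarget VALUES (1, 10, 'Foo'), (2, 10, 'Bar');
    INSERT INTO templatelinks VALUES (1, 1), (2, 1), (3, 2);
    """
)
print(pages_using(db, 10, "Foo"))
```

The only structural change for the caller is the extra join. The filter moves from the tl_ columns to the lt_ columns.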
Currently, templatelinks writes to both schemas for new rows in most
wikis. This week we will start backfilling existing rows with the new schema,
but it will take months to finish on large wikis.
You can keep track of the general long-term work in
https://phabricator.wikimedia.org/T300222 and the specific work for
templatelinks in https://phabricator.wikimedia.org/T299417. You can also
read more on the reasoning in https://phabricator.wikimedia.org/T222224.
Thanks
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
Hi everyone!
TL;DR:
There is currently a service degradation for VMs and anything running on them (e.g. Toolforge, Quarry, PAWS,
...). You might be able to use the services, or they might be too slow. We are working on it and will update when it
is fixed.
Long story:
We were adding a new ceph node to the ceph cluster. This time the node was in a different subnet, but ceph is supposed
to work transparently across multiple subnets. For some reason the new node was added to the cluster, but it is
not replying to any heartbeats sent from the other nodes in the cluster. That causes the cluster to keep
rebalancing data, which creates continuous IO slowness for all clients (like VMs).
We are trying to minimize the impact by limiting the amount of data that gets reshuffled. That slows down the
intervention a bit, but should improve the client experience.
We are actively working on this, and will update with any changes.
Cheers!
--
David Caro
SRE - Cloud Services
Wikimedia Foundation <https://wikimediafoundation.org/>
PGP Signature: 7180 83A2 AC8B 314F B4CE 1171 4071 C7E1 D262 69C3
"Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment."
_______________________________________________
Cloud-announce mailing list -- cloud-announce(a)lists.wikimedia.org
List information: https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.…
Hello,
The externallinks table
<https://www.mediawiki.org/wiki/Manual:Externallinks_table> in MediaWiki is
among the largest in Wikimedia production. It's the second largest database
table in Wikimedia Commons (and will soon claim the first place, after
templatelinks normalization <https://phabricator.wikimedia.org/T299417>
completes).
There is a proposal to redesign this table. You can read more about it in
T312666 <https://phabricator.wikimedia.org/T312666>. If you use this table
or have some feedback about this proposal, please comment on the ticket.
Also, this means its schema and data will change soon. Be prepared to
update your tools and reports if you depend on the externallinks table.
Providing technical support to Wikimedia Commons is one of the official
goals of the Wikimedia Foundation for this fiscal year [1], and this redesign
will help address long-standing storage and database capacity issues of
this important project.
[1]: From the annual plan
<https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2022-2023>:
"Deepen our commitment to Knowledge as a Service by strengthening how we
prioritize and allocate product and tech support to 740+ Wikimedia
projects, starting with Wikimedia Commons and Wikidata."
Best
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
Hello!
This is a friendly reminder for you to spare 5-10 minutes to leave
feedback[0] on the Toolhub taxonomy[1].
The period for providing feedback ends on August 21, 2022.
Toolhub[2] is a catalog of 1500+ tools used by a wide range of Wikimedia
contributors: editors, developers, patrollers, researchers, admins and more.
We want to make finding and categorizing these tools as easy as possible.
The taxonomy is at the heart of how tool search works, and your feedback
would help improve it.
Whether you are a current user of Toolhub or hearing about it for the first
time doesn't matter – your input is valuable and much appreciated either
way!
=== How To Provide Feedback ===
Use the discussion page[3] of the feedback page to provide your responses
to the questions.
You will find more details on the feedback page.
=== Implementation ===
At the end of the feedback round, the team will evaluate and work on the
necessary improvements.
This is expected to be completed by the end of September 2022.
[0]: https://meta.wikimedia.org/wiki/Toolhub/Data_model/Feedback
[1]: https://meta.wikimedia.org/wiki/Toolhub/Data_model#Taxonomy_v2
[2]: https://toolhub.wikimedia.org/
[3]: https://meta.wikimedia.org/wiki/Talk:Toolhub/Data_model/Feedback
--
Seyram Komla Sapaty
Developer Advocate
Wikimedia Cloud Services
Hello!
The Toolhub project team wants your feedback[0] on the Toolhub taxonomy[1].
Toolhub[2] is a catalog of 1500+ tools used by a wide range of Wikimedia
contributors: editors, developers, patrollers, researchers, admins and more.
We want to make finding and categorizing these tools as easy as possible.
The taxonomy is at the heart of how tool search works, and your feedback
would help improve it.
Whether you are a current user of Toolhub or hearing about it for the first
time doesn't matter – your input is valuable and much appreciated either
way!
Please take 5-10 minutes to leave feedback.
=== How To Provide Feedback ===
Use the discussion page[3] of the feedback page to provide your responses
to the questions.
You will find more details on the feedback page.
The period for providing feedback ends on August 21, 2022.
=== Implementation ===
At the end of the feedback round, the team will evaluate and work on the
necessary improvements.
This is expected to be completed and announced by the end of September 2022.
[0]: https://meta.wikimedia.org/wiki/Toolhub/Data_model/Feedback
[1]: https://meta.wikimedia.org/wiki/Toolhub/Data_model#Taxonomy_v2
[2]: https://toolhub.wikimedia.org/
[3]: https://meta.wikimedia.org/wiki/Talk:Toolhub/Data_model/Feedback
Thanks
--
Seyram Komla Sapaty
Developer Advocate
Wikimedia Cloud Services