(If you don’t work with the pagelinks table, feel free to ignore this message.)
Hello,
Here is an update and reminder on the previous announcement
<https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/…>
regarding normalization of links tables that was sent around a year ago.
As part of that work, the pl_namespace and pl_title columns of the
pagelinks table will soon be dropped, and you will need to use pl_target_id,
joined with the linktarget table, instead. This is basically identical to
the templatelinks normalization that happened a year ago.
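For illustration, here is a minimal sketch of what the change could look like for a "what links here" style query. It assumes the linktarget columns lt_id, lt_namespace and lt_title, and it uses an in-memory SQLite database with made-up rows purely so the example is self-contained; the helper names are hypothetical, and against the real replicas you would use your usual MariaDB client and its placeholder style.

```python
import sqlite3

# Old-style query: depends on pl_namespace / pl_title, which are being dropped.
# Shown for comparison only; the demo schema below does not have these columns.
OLD_QUERY = """
SELECT pl_from
FROM pagelinks
WHERE pl_namespace = ? AND pl_title = ?
"""

# New-style query: resolve the link target through linktarget via pl_target_id.
NEW_QUERY = """
SELECT pl_from
FROM pagelinks
JOIN linktarget ON pl_target_id = lt_id
WHERE lt_namespace = ? AND lt_title = ?
"""


def build_demo_db():
    """Create a tiny stand-in for the normalized tables (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE linktarget (
            lt_id INTEGER PRIMARY KEY,
            lt_namespace INTEGER,
            lt_title TEXT
        );
        CREATE TABLE pagelinks (pl_from INTEGER, pl_target_id INTEGER);
        INSERT INTO linktarget VALUES (1, 0, 'Main_Page'), (2, 0, 'Sandbox');
        INSERT INTO pagelinks VALUES (10, 1), (11, 1), (12, 2);
        """
    )
    return conn


def pages_linking_to(conn, namespace, title):
    """Return the sorted page IDs that link to (namespace, title)."""
    rows = conn.execute(NEW_QUERY, (namespace, title)).fetchall()
    return sorted(row[0] for row in rows)
```

Calling pages_linking_to(build_demo_db(), 0, "Main_Page") returns the IDs of the two demo pages that point at that target.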
Currently, MediaWiki writes to both data schemas of pagelinks for new rows
in all wikis except English Wikipedia and Wikimedia Commons (we will start
writing to these two wikis next week). We have started to backfill the data
with the new schema, but it will take weeks to finish on large wikis.
So if you query this table directly, or your tools do, you will need to
update them accordingly. I will write a reminder before dropping the old
columns once the data has been fully backfilled.
You can keep track of the general long-term work in T300222
<https://phabricator.wikimedia.org/T300222> and the specific work for
pagelinks in T299947 <https://phabricator.wikimedia.org/T299947>. You can
also read more on the reasoning in T222224
<https://phabricator.wikimedia.org/T222224> or the previous announcement
<https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/…>
.
Thank you,
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
I have a VPS instance that I think can be shut down. What I want to do is shut it down in a way that lets me restart it again without losing anything, but not yet commit to deleting it. The idea is to leave it in that state for a while just to make sure there are no dependencies on it that I've missed. Do I want to pause it, suspend it, or shelve it? What's the difference between those three?
Thanks for the clear explanation, this gives more context for the urgency.
> On Jan 17, 2024, at 3:04 PM, Amir Sarabadani <asarabadani(a)wikimedia.org> wrote:
> What about only dropping it from Commons to reduce the risk of outage and leave the rest until all are finished (or all except Wikidata)? You'd have to write something for the new schema regardless.
If we insist on dropping now, I do think treating Commons as special and waiting until all are finished before dropping the rest would be better. It’s a small difference but potentially makes the logic simpler to implement.
Ben / Earwig
Hi Amir & others,
I’m glad we are making changes to improve DB storage/query efficiency. I wanted to express my agreement with Tacsipacsi that dropping the data before the migration has completed is a really bad outcome. Now tool maintainers need to deal with multiple migrations depending on the wikis they query or add more code complexity. And there is little time to make the changes for those of us who had planned to wait until the new data was available.
> Commons has grown to 1.8TB already
That’s a big number yes, but it doesn’t really answer the question — is the database actually about to fill up? How much time do you have until that happens and how much time until s1/s8 finish their migration? Is there a reason you can share why this work wasn’t started earlier if the timing is so close?
> so you need to use the old way for the list of thirty-ish wikis (s1, s2, s6, s7, s8) and for any wiki not a member of that, you can just switch to the new system
IMO, a likely outcome is that some tools/bots will simply be broken on a subset of wikis until the migration is completed across all DBs.
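To make the added complexity concrete, here is a rough sketch of the per-wiki branching that the quoted suggestion implies. The wiki set below is a hypothetical placeholder, not the real list of thirty-ish wikis in s1, s2, s6, s7 and s8, and the function name is made up for illustration; the %s placeholders assume a MariaDB client library that uses that style.

```python
# Hypothetical placeholder: the real list of wikis still on the old schema
# would have to come from the section assignments, not from this example.
OLD_SCHEMA_WIKIS = frozenset({"enwiki", "wikidatawiki"})

# Query for wikis where only the old columns are fully populated.
OLD_QUERY = (
    "SELECT pl_from FROM pagelinks "
    "WHERE pl_namespace = %s AND pl_title = %s"
)

# Query for wikis where pl_target_id has been backfilled.
NEW_QUERY = (
    "SELECT pl_from FROM pagelinks "
    "JOIN linktarget ON pl_target_id = lt_id "
    "WHERE lt_namespace = %s AND lt_title = %s"
)


def pagelinks_query(dbname, old_schema_wikis=OLD_SCHEMA_WIKIS):
    """Pick the query text for a wiki based on which schema it still needs."""
    return OLD_QUERY if dbname in old_schema_wikis else NEW_QUERY
```

Every tool that queries pagelinks would need some equivalent of this switch, plus a way to keep the wiki list current, until the backfill finishes everywhere.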
Thanks for all your work on this task so far.
Ben / Earwig
VideoWiki is a project of WikiProjectMed aimed at disseminating Wikipedia
content visually, rather than textually, by generating a video from the
text and images of a wiki page. It is currently deployed at
https://videowiki.wmcloud.org. (It is not fully operational yet.)
We asked for a static IP address to access it from videowiki.org, but were
denied. So I have two questions:
1) Is there community support for granting this project a static IP to
facilitate this access?
2) If not, is there a recommended way to use dynamic DNS, such as noip, to
cause videowiki.org to resolve as videowiki.wmcloud.org? I think most
dyndns clients assume the IP of the host is the target rather than a proxy,
though I suppose the proxy could be pinged and that address supplied
periodically to the dyndns service.
Any help would be appreciated,
Tim