I'm a student currently pursuing an MSc in Data Science, and I've been
thinking of applying to GSoC with Wikimedia this year. For over a year now
I've been a system admin of a medium-sized wiki, and I've written a couple of
extensions (you can find them here:
) and some patches to core.
As the sysadmin of a wiki I watch its performance closely, and over time
I've discovered that the single thing slowing the wiki down the most was
InstantCommons. It turns out the ForeignApiRepo code is fine for a few
pages with a handful of images, but once your wiki starts using Commons
imagery a lot, things get ugly, quickly. Like
parsing-a-page-takes-2-minutes ugly. Or the whole wiki can collapse if
Commons isn't responding for some reason.
I think improving this would fit well with Wikimedia's mission of
hosting the most accessible free media repository in the world :) I really
wish more people could use Commons extensively, and that would certainly
I did some research into the topic and came up with a few possible
solutions, but I am by no means an expert in MW architecture, so I would be
grateful for some help from people familiar with Parsoid and the action API.
You can find a more detailed explanation here:
I am also looking for mentors for this project :)