We had a discussion today at the MediaWiki Stakeholders' Group membership meeting
about the current status of development on the new WikiApiary as well as future
directions. Below is a brief summary as well as a request for feedback.
We have created a new database schema to support WikiApiary, scripts to populate the data,
as well as a MediaWiki extension to expose the data to a wiki. We will be making that
extension available on Gerrit very soon. It temporarily resides at
https://github.com/cicalese/NewWikiApiary.
We exported the wiki URLs found in the old WikiApiary and imported them into a new wiki,
which resides at https://wikiapiary.wmcloud.org and is hosted by Wikimedia Cloud Services. There
were approximately 96,000 wiki URLs retrieved from the old WikiApiary. Of those, about 90%
are Fandom wikis. There may have been URLs that were missed in the export, despite the
fact that we found far more URLs than the approximately 46,000 that the old WikiApiary
reported as active. We will be making available a mechanism to contribute wiki URLs both
through the wiki user interface and through an API.
We created pages for the imported wikis in the new wiki with page titles that consist of
the wiki sitename followed by its language code in parentheses. In the approximately 500
cases where this resulted in naming collisions, alternative page titles were used. Because
of this naming scheme, page titles differ between the old WikiApiary and the new
WikiApiary.
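To make the naming scheme concrete, here is a minimal sketch in Python. The function name make_page_title and the numeric-suffix rule for collisions are purely illustrative assumptions; the actual alternative titles used for the roughly 500 collisions were chosen differently and are not described here.

```python
def make_page_title(sitename, lang, taken):
    """Build a page title of the form 'Sitename (lang)'.

    `taken` is a set of titles already in use; it is updated in place.
    """
    name = sitename.strip()
    title = f"{name} ({lang})"
    if title not in taken:
        taken.add(title)
        return title

    # Assumption: a numeric suffix is used only for illustration.
    # The real collision handling is not specified in this post.
    n = 2
    while f"{name} {n} ({lang})" in taken:
        n += 1
    title = f"{name} {n} ({lang})"
    taken.add(title)
    return title
```

With an empty set of taken titles, a sitename of "Example Wiki" and a language code of "en" yields the title "Example Wiki (en)"; a second wiki with the same sitename and language would need an alternative title.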
We have done an initial scrape of all of those wikis to get statistics and configuration
information through the MediaWiki action API, as was done in the old WikiApiary, storing
the data in the database rather than in subpages in the wiki. It takes roughly 24 hours to
scrape approximately 96,000 wikis. We have begun work on the front end of the new wiki to
render the data. We are taking advantage of the extension information available at
mediawiki.org in Module:ExtensionJson as well.
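For anyone unfamiliar with the action API, the kind of request involved looks roughly like the sketch below. It builds a meta=siteinfo query and pulls a few fields out of the JSON response. The helper names and the exact set of siprop values are assumptions made for illustration, not necessarily what our scraping scripts use, and the sample response is made-up data.

```python
from urllib.parse import urlencode


def siteinfo_url(api_url):
    """Return a siteinfo query URL for a wiki's api.php endpoint."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        # Assumption: these four properties are an illustrative selection.
        "siprop": "general|statistics|extensions|skins",
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


def summarize(response):
    """Extract a few fields from a decoded siteinfo JSON response."""
    query = response.get("query", {})
    general = query.get("general", {})
    stats = query.get("statistics", {})
    return {
        "sitename": general.get("sitename"),
        "lang": general.get("lang"),
        "generator": general.get("generator"),
        "articles": stats.get("articles"),
        "edits": stats.get("edits"),
        "activeusers": stats.get("activeusers"),
        "extension_count": len(query.get("extensions", [])),
    }


# Made-up sample response, shaped like a real siteinfo reply.
sample = {
    "query": {
        "general": {"sitename": "Example Wiki", "lang": "en",
                    "generator": "MediaWiki 1.39.0"},
        "statistics": {"articles": 120, "edits": 4500, "activeusers": 7},
        "extensions": [{"name": "Cite"}, {"name": "ParserFunctions"}],
    }
}
summary = summarize(sample)
```

At roughly 24 hours for about 96,000 wikis, the scrape averages a little over one wiki per second.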
The old WikiApiary is still accessible in a read-only mode at
https://wikiapiary.com. It
is hosted on an Azure instance but will be moving as-is to a Wikimedia Cloud Services
instance. It will be maintained for the near future to allow the community to identify
features that should be made available in the new WikiApiary. At some point, when it has
enough capabilities, the new WikiApiary will become available at
https://wikiapiary.com,
while the old WikiApiary will continue to be available at another URL. We anticipate that
the old WikiApiary will be retired at some point, but not before we have had plenty of
time to migrate useful functionality.
This is very much a work in progress, but we have gotten far enough along in the
development to be reassured that this new data model will be much more performant than
that in the old WikiApiary. At this point it would be helpful to have feedback from the
WikiApiary community, especially on a few key upcoming decisions. In addition, we welcome
volunteers to contribute to the effort. Some items we will need to decide:
1. We currently have account creation on the new wiki disabled to avoid being besieged by
spam account creations. We will need to decide what approach to take to account creation
and spam prevention. We would very much like to enable account creation so others can
contribute.
2. At this point, the main effort has been to gather statistics on wikis and their
extensions/skins through MediaWiki action API queries. There is a wealth of information in
the old WikiApiary that is not yet incorporated, such as Internet Archive information. We
welcome volunteers to contribute to any migration that is determined useful.
3. We do not yet have an extension enabled on the wiki for semantic tagging. We are
hesitant to enable Semantic MediaWiki at this point, as there appears to be a bug that is
preventing properties from being reliably saved. That needs more investigation. We may
also consider Cargo or other mechanisms, such as investigating approaches to indexing the
data in Elasticsearch. We may also investigate the relationship to Wikidata.
4. We currently have fewer than 10 extensions/skins enabled on the new wiki. We will need
some decisions on what other extensions would be useful.
5. The user interface of the new WikiApiary isn't very spiffy. We welcome volunteers
who have a passion for creating appealing, responsive user interfaces to give it a more
modern look.
OK, maybe this wasn't such a brief status update. But hopefully it gives you a sense
of where we're at and what opportunities there are to contribute. Several of us will
be at the Wikimedia Hackathon in May working on this project, and we would welcome others
to collaborate with us there or remotely. In the meantime, we will continue to volunteer
what time we can to move this project forward.
A hearty thank you to Mark Hershberger and Charly Cobben who have worked with me on this
project to get it this far.