Hi,
On 2/4/22 08:58, Adam Baso wrote:
> Thanks for sharing this. I was wondering, might it be possible to help bring this sort of functionality into Code Search (https://codesearch.wmflabs.org)? I noticed the presentation of the search UI looked similar, but I see how the symbol resolution might be something useful for Code Search and upstream Hound.
Indeed, we've been discussing and exploring symbol-based search for a while now: https://phabricator.wikimedia.org/T183795. There are some pretty neat upstream projects that do this, like https://searchfox.org/ and zoekt, which is designed for Gerrit integration. I would also note that things are likely to change whenever we migrate to GitLab, which has its own search functionality built in (https://phabricator.wikimedia.org/T268196). My assumption is that GitLab will eventually add symbol-based search to compete with GitHub; hopefully that ends up in the CE version someday...
While I very much disagree with the opening proposition that "Working with Wikimedia code is time-consuming and risky", I think symbol search of the MediaWiki codebase would be incredibly powerful and unlock a new level of tooling, just like Codesearch did when it was first introduced, so I'm glad to see people looking into it! For example, we could do stuff like https://phabricator.wikimedia.org/T186771 with it.
There were two main principles in building MediaWiki Codesearch: first, that everything be licensed as free software[1] (which Bryan covered as well), and second, that we use upstream as much as possible. Our modifications are injected by a proxy rather than by patching the upstream code, which has turned out to be incredibly stable over the years.
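To give a rough idea of what "injected by a proxy" means, here is a minimal sketch. It is not the actual Codesearch code: the function name, the sample page, and the injected snippet are all made up for illustration. It only assumes the proxy can see the upstream HTML response before passing it on.

```python
import re

# Hypothetical helper: not taken from the real Codesearch proxy.
def inject_before_body_close(html: str, snippet: str) -> str:
    """Insert snippet just before the last </body> tag; append it if there is none."""
    matches = list(re.finditer(r"</body\s*>", html, re.IGNORECASE))
    if not matches:
        return html + snippet
    idx = matches[-1].start()
    return html[:idx] + snippet + html[idx:]

# Stand-in for an unmodified upstream response (assumed markup).
upstream_page = "<html><body><h1>Hound</h1></body></html>"
print(inject_before_body_close(upstream_page, '<script src="/custom.js"></script>'))
```

The point is that upstream stays untouched, so upgrading it does not mean rebasing a pile of local patches.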
If you want to collaborate, all the code and setup is in Git and I can add you to the project, but I see little to no value in building proprietary tools or reinventing what other projects have done pretty well rather than building on top of them.
[1] https://mako.cc/writing/hill-free_tools.html
-- Legoktm