Hello Rob,
Just to be really clear, I'm not looking for a "right" answer on any of those questions. It's not necessary for you to be even interested in getting deeply involved in the Wikipedia user community to have a really successful project. The purpose of this line of questions is to figure out if we should continue helping you refine your current idea, or suggest some other direction that's a bigger payoff and/or easier sell.
I understand, and that'd be very helpful. To be honest, I'm not passionately committed to any project at all. I've been writing projects for university and for a computer lab I work at, but it's mostly small, one-off sysadmin things and usually the emphasis is more on "xyz server has to be back up before we open tomorrow" than writing good, clean code. So, yes, I'd welcome other suggestions.
As I'm sure you've already gathered from the other responses, this is exactly the right place. I'm a little skeptical myself that porting that particular piece of code from OCaml to Python is going to be a really big win for us (because it's still a "foreign" language as far as PHP-based MediaWiki is concerned, so integration is still a little clunky and performance may take a hit due to yet another interpreter needing to load), but I'll let others weigh in on whether I'm making too big a deal about that.
There are ways to make this run faster if performance is a concern. For example, we could use mod_python or mod_wsgi, or explicitly pull the Python out into a standalone daemon that listens for requests from the webserver.
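To make the standalone-daemon idea a bit more concrete, here's a rough sketch of what I have in mind. It's only an illustration under my own assumptions: render_math is a hypothetical placeholder for the real <math> renderer, and the one-line-in, one-line-out protocol and port 9999 are made up for the example. The point is just that the interpreter gets loaded once and stays resident, so each request only pays for a local socket round trip.

```python
import html
import socketserver


def render_math(tex: str) -> str:
    """Hypothetical stand-in for the real <math> renderer (an assumption)."""
    return '<span class="tex">' + html.escape(tex.strip()) + "</span>"


class MathHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Assumed protocol: one line of TeX in, one line of markup out.
        line = self.rfile.readline().decode("utf-8")
        self.wfile.write((render_math(line) + "\n").encode("utf-8"))


def make_server(host="127.0.0.1", port=0):
    # Port 0 asks the OS for any free port; a real deployment would pin one.
    return socketserver.TCPServer((host, port), MathHandler)


if __name__ == "__main__":
    # Port 9999 is arbitrary, chosen only for this sketch.
    with make_server(port=9999) as server:
        server.serve_forever()
```

The PHP side would then open a socket to the daemon, write the TeX source followed by a newline, and read back one line of output.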
Another possibility would be writing it in C to avoid all interpreter overhead, and using a foreign function interface. Unfortunately, I'm not familiar with PHP's FFI. Google takes me to http://wiki.php.net/rfc/php_native_interface which seems to think that as of a year ago there weren't any good ones, but this doesn't look too painful: http://theserverpages.com/php/manual/en/zend.creating.php
Stepping back from the specifics of your proposal (which I think the others on this list have responded to pretty well), I'd like to find out more about what general sorts of projects interest you the most, which may help us figure out if we should keep going in this direction. Some questions:
- Are you most interested in having a Python-based project, or would you be *equally* happy and productive programming something in PHP?
I'm most familiar with Python and C, for whatever that's worth coming from an undergrad who didn't know Python existed five years ago. I learned PHP to maintain the web interfaces of an in-house print system at work, but I haven't used it for anything as involved as what we're discussing here. So, in terms of productivity, yes, if I have to work in PHP my mentor will probably get asked a few more newbie questions.
In terms of happiness, though, it'd be a great opportunity to dig into PHP and finally learn to use it as more than really smart CSS with a database connection. Although I prefer Python or even C because I think I'd be more useful, I wouldn't be very upset at all if it turned out you guys were willing to let me learn PHP on your time.
- Are you zeroing in on <math> parsing and parsing in general because that's an area that you're already developing expertise in and/or are deeply interested in getting into, or is that just something that looked kinda interesting to learn about relative to other opportunities you considered?
I like the <math> parsing project because it seems well-suited for a third-year undergrad who knows LaTeX, reads a few functional languages, and has studied lex/yacc in his coursework. The goals are clear, and I know how to break them down into smaller problems and how to tackle each one. It's a little isolated from the rest of MediaWiki, so I don't need to grok the entire code base.
Basically, this looks like a way to make a concrete contribution despite being a newcomer to the project. That doesn't mean I'm not happy to entertain alternatives, just that they have a pretty high bar to clear.
- Are you coming at this as someone who is already deep into Wikipedia/MediaWiki usage who is looking to resolve particular things (like <math> parsing) that are painful as an end user, or are you more casually involved and more interested in applying in this project because it looks like we've got a lot of interesting programming problems to solve?
The second. I just want to tackle a problem that's near but not quite beyond my limits, and if I can help out a site I use daily, so much the better.
Yours, Damon Wang