Timezone-appropriate greeting, wikitech!
I've been working on a new extension, BetaFeatures[0]. A lot of you have heard about it through the grapevine, and for the rest of you, consider this an announcement for the developers. :)
The basic idea of the extension is to let features be enabled experimentally on a wiki, on an opt-in basis, instead of just launching them immediately, sometimes hidden behind a checkbox that has no special meaning in the interface. It also has a lot of cool design work on top of it, courtesy of Jared and May of the WMF design team, so thanks very much to them. There are still a few[1] things[2] we have to build out, but overall the extension is looking pretty nice so far.
I am of course always soliciting advice about the extension in general, but in particular, we have a request for a feature for the fields that has been giving me a bit of trouble. We want to put a count of users that have each preference enabled on the page, but we don't want to, say, crash the site with long SQL queries. Our theories thus far have been:
* Count all rows (grouped) in user_properties that correspond to properties registered through the BetaFeatures hook. Potentially a lot of rows, but we have at least decided to use an "IN" query, as opposed to "LIKE", which would have been an outright disaster. Obviously: Caching. Caching more would lead to more of the below issues, though.
* Fire off a job, every once in a while, to update the counts in a table that the extension registers. Downsides: Less granular, sort of fakey (since one of the subfeatures will be incrementing the count, live, when a user enables a preference). Upside: Faster.
* Update counts with simple increment/decrement queries. Upside: Blazingly faster. Potential downside: Might get out of sync. Maybe fire off jobs even less frequently, to ensure it's not always out of date in weird ways?
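To make the three options above concrete, here is a rough sketch in Python with an in-memory SQLite database. It is not BetaFeatures code: the preference names, the `feature_counts` table, and every function name are hypothetical, and `user_properties` is reduced to a simplified three-column stand-in.

```python
import sqlite3

# Hypothetical preference names; the real list would come from the BetaFeatures hook.
BETA_PREFS = ["beta-feature-a", "beta-feature-b"]


def make_db():
    db = sqlite3.connect(":memory:")
    # Simplified stand-in for the user_properties table.
    db.execute("CREATE TABLE user_properties "
               "(up_user INTEGER, up_property TEXT, up_value TEXT)")
    # Hypothetical counts table registered by the extension.
    db.execute("CREATE TABLE feature_counts "
               "(feature TEXT PRIMARY KEY, num_users INTEGER NOT NULL)")
    return db


def count_enabled(db, prefs):
    """Option 1: a single grouped COUNT restricted by an IN list."""
    placeholders = ",".join("?" for _ in prefs)
    rows = db.execute(
        "SELECT up_property, COUNT(*) FROM user_properties "
        f"WHERE up_property IN ({placeholders}) AND up_value = '1' "
        "GROUP BY up_property",
        prefs,
    ).fetchall()
    counts = dict.fromkeys(prefs, 0)  # features nobody enabled still show 0
    counts.update(rows)
    return counts


def resync(db, prefs):
    """Option 2: body of the periodic job; overwrite stored counts with a recount."""
    with db:
        for feature, n in count_enabled(db, prefs).items():
            db.execute("INSERT OR REPLACE INTO feature_counts VALUES (?, ?)",
                       (feature, n))


def set_pref(db, user, pref, enabled):
    """Option 3: save the preference and bump the counter in one transaction."""
    with db:
        row = db.execute(
            "SELECT up_value FROM user_properties "
            "WHERE up_user = ? AND up_property = ?", (user, pref)).fetchone()
        was_enabled = row is not None and row[0] == "1"
        if was_enabled == enabled:
            return  # no state change, so the counter must not move
        db.execute("DELETE FROM user_properties "
                   "WHERE up_user = ? AND up_property = ?", (user, pref))
        db.execute("INSERT INTO user_properties VALUES (?, ?, ?)",
                   (user, pref, "1" if enabled else "0"))
        db.execute("INSERT OR IGNORE INTO feature_counts VALUES (?, 0)", (pref,))
        db.execute("UPDATE feature_counts SET num_users = num_users + ? "
                   "WHERE feature = ?", (1 if enabled else -1, pref))


def stored_counts(db):
    return dict(db.execute("SELECT feature, num_users FROM feature_counts"))
```

In this sketch the page would read `stored_counts()`, `set_pref()` keeps it current, and `resync()` repairs any drift, so the expensive grouped query only runs inside the job.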
So my question is, which of these is best, and are there even better ways out there? I love doing things right the first time, hence my asking.
[0] https://www.mediawiki.org/wiki/Extension:BetaFeatures
[1] https://mingle.corp.wikimedia.org/projects/multimedia/cards/2
[2] https://mingle.corp.wikimedia.org/projects/multimedia/cards/21
P.S. One of the first features that we'll launch with this framework is the "MultimediaViewer" extension which is also under[3] development[4] as we speak. Exciting times for the Multimedia team!
[3] https://mingle.corp.wikimedia.org/projects/multimedia/cards/8
[4] https://mingle.corp.wikimedia.org/projects/multimedia/cards/12
I worked on an accounting system with similar requirements and we had an even more complicated system, but one you might want to consider:
1. When something happens, record the event and how much it changed the value, along with a timestamp. In our case we'd just have enable and disable events.
2. We ran a job that summarized those events into hourly changes.
3. Every day we kept a log of the actual value (at midnight or whatever).
This let us quickly make all kinds of crazy graphs with super deep granularity over short periods of time and less granularity over long periods of time. Essentially it was an accountant's version of RRDtool. It didn't have problems with getting out of sync because we never had more than one process update more than one field.
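For illustration only, the event-log scheme described above might look roughly like this in Python. The event tuple layout and both function names are assumptions made for the sketch, not details of the accounting system itself.

```python
from collections import defaultdict
from datetime import datetime


def hourly_rollup(events):
    """Sum (timestamp, feature, delta) events into net change per hour."""
    buckets = defaultdict(int)
    for ts, feature, delta in events:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(hour, feature)] += delta
    return dict(buckets)


def daily_snapshots(hourly, start_values=None):
    """Walk the hourly changes in order and keep the running total per day.

    Only days with at least one event for a feature get an entry here;
    a real job would also write a row for quiet days.
    """
    totals = defaultdict(int, start_values or {})
    snapshots = {}
    for (hour, feature), delta in sorted(hourly.items()):
        totals[feature] += delta
        # Later hours overwrite earlier ones, leaving the end-of-day value.
        snapshots[(hour.date(), feature)] = totals[feature]
    return snapshots
```

Here enable events carry a delta of +1 and disable events a delta of -1. Because the raw events are only ever appended and each summary is derived from the level below it, no stored total is updated by more than one writer.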
It is probably overkill but might serve as a dramatic foil to the simpler ideas.
Nik
On Tue, Sep 3, 2013 at 5:58 PM, Mark Holmquist mtraceur@member.fsf.org wrote:
-- Mark Holmquist Software Engineer, Multimedia Wikimedia Foundation mtraceur@member.fsf.org https://wikimediafoundation.org/wiki/User:MHolmquist
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Wed, Sep 04, 2013 at 09:34:49AM -0400, Nikolas Everett wrote:
I worked on an accounting system with similar requirements and we had an even more complicated system but one you might want to consider:
1. When something happens, record the event and how much it changed the value, along with a timestamp. In our case we'd just have enable and disable events.
2. We ran a job that summarized those events into hourly changes.
3. Every day we kept a log of the actual value (at midnight or whatever).
This let us quickly make all kinds of crazy graphs with super deep granularity over short periods of time and less granularity over long periods of time. Essentially it was an accountant's version of RRDtool. It didn't have problems with getting out of sync because we never had more than one process update more than one field.
It is probably overkill but might serve as a dramatic foil to the simpler ideas.
Thanks a lot, Nik - I'm sure it is overkill, but it sounds like one of the more seamless solutions without being too heavy on performance. Probably our solution will look like "fire events to update the counts, then sync them periodically when some number of events have fired *or* when some amount of time has passed without a sync."
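A minimal sketch of that "sync after N events or after T seconds" rule, in Python rather than against the MediaWiki job queue. The class name, the thresholds, and the `recount` callable (which stands in for whatever expensive query returns the authoritative count) are all hypothetical.

```python
import time


class BatchedCounter:
    """Cheap live counter that is periodically replaced by a full recount."""

    def __init__(self, recount, max_events=100, max_age=3600.0,
                 clock=time.monotonic):
        self._recount = recount        # callable returning the true count
        self.max_events = max_events   # sync after this many events...
        self.max_age = max_age         # ...or this many seconds without a sync
        self._clock = clock
        self.value = recount()
        self._events_since_sync = 0
        self._last_sync = clock()

    def record(self, delta):
        """Apply an enable (+1) or disable (-1) event, then sync if due."""
        self.value += delta
        self._events_since_sync += 1
        self.maybe_sync()

    def maybe_sync(self):
        """Sync if either threshold is reached; return whether a sync ran."""
        too_many = self._events_since_sync >= self.max_events
        too_old = self._clock() - self._last_sync >= self.max_age
        if too_many or too_old:
            self.sync()
            return True
        return False

    def sync(self):
        """Replace the running value with the authoritative recount."""
        self.value = self._recount()
        self._events_since_sync = 0
        self._last_sync = self._clock()
```

In a real deployment `sync()` would presumably be queued as a job instead of running inline, and `maybe_sync()` would also need to be called from somewhere periodic so that the time limit applies even when no events arrive.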
I'll set about working on that now; should be a fun day, since I've never worked with our job queue before :)
In other news, the BetaFeatures framework, as well as the start of our MultimediaViewer, is up at http://multimedia-alpha.wmflabs.org - go ahead and experiment!