A few things to note:
* APC is not LRU; it just detects expired items on get() and clears everything when full (https://groups.drupal.org/node/397938)
* APC has a low max-keys config on production, so using a key per item would require that to change
* Implementing LRU groups for BagOStuff would require heavy CAS use and would definitely be bad over the wire (and not great locally either)
Just how high is the label traffic/queries? Do we profile this?
If it is super high, I'd suggest the following as a possibility (a rough sketch of c) and d) follows the list):

a) Install a tiny redis instance on each app server.

b) Have a sorted set in redis containing (label key => score) pairs, plus individual redis keys for the label strings (keyed by label key). Label keys would be like P33-en. The sorted set and the string values would use a common key prefix in redis. The sorted-set key would mention the max size.

c) The cache get() method would use the normal redis GET command. Once every 10 times it could send a Lua command to bump the label key's score in the sorted set (via ZADD) to the highest score + 1 (found via ZRANGE key -1 -1 WITHSCORES).

d) The cache set() method would be a no-op except once every 10 times. When it does anything, it would send a Lua command to remove the lowest-scored key if there is no room (ZREMRANGEBYRANK key 0 0) and in any case add the label key with a score equal to the highest score + 1. It would also store the value in the separate key for that label, with a TTL (likewise deleting that key on eviction). The sorted-set TTL would be set to max(current TTL, new value TTL).

e) Cache misses would fetch from the DB rather than the text store.
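To make c) and d) concrete, here's a minimal sketch in Python with redis-py, with the atomicity coming from the two embedded Lua scripts. The key names (wb:label:*), the max size of 10000, and the 1-in-10 sampling constant are all assumptions for illustration, not anything fixed above:

```python
import random

import redis

# All names/values below are assumptions for illustration -- the proposal
# above doesn't pin down a key scheme, max size, or sampling rate.
PREFIX = 'wb:label:'             # common prefix for per-label string keys
LRU_KEY = 'wb:label:lru:10000'   # sorted-set key, mentioning the max size
MAX_SIZE = 10000
SAMPLE = 10                      # touch the sorted set only 1 time in 10

r = redis.Redis()

# c) bump the label key's score to (highest score + 1), atomically
BUMP = r.register_script("""
local top = redis.call('ZRANGE', KEYS[1], -1, -1, 'WITHSCORES')
local score = 1
if top[2] then score = tonumber(top[2]) + 1 end
redis.call('ZADD', KEYS[1], score, ARGV[1])
""")

# d) evict the lowest-scored label if full, then insert the new label at
# (highest score + 1) and store its value under its own key with a TTL
STORE = r.register_script("""
if redis.call('ZCARD', KEYS[1]) >= tonumber(ARGV[4]) then
    local victim = redis.call('ZRANGE', KEYS[1], 0, 0)
    if victim[1] then
        redis.call('ZREMRANGEBYRANK', KEYS[1], 0, 0)
        -- building the key in Lua is fine on a single local instance
        redis.call('DEL', ARGV[5] .. victim[1])
    end
end
local top = redis.call('ZRANGE', KEYS[1], -1, -1, 'WITHSCORES')
local score = 1
if top[2] then score = tonumber(top[2]) + 1 end
redis.call('ZADD', KEYS[1], score, ARGV[1])
redis.call('SET', KEYS[2], ARGV[2], 'EX', ARGV[3])
-- sorted-set TTL = max(current TTL, new value TTL)
if redis.call('TTL', KEYS[1]) < tonumber(ARGV[3]) then
    redis.call('EXPIRE', KEYS[1], ARGV[3])
end
""")

def cache_get(label_key):              # label_key is e.g. 'P33-en'
    value = r.get(PREFIX + label_key)  # plain GET on the hot path
    if value is not None and random.randrange(SAMPLE) == 0:
        BUMP(keys=[LRU_KEY], args=[label_key])
    return value                       # None means: fetch from the DB (e)

def cache_set(label_key, value, ttl):
    if random.randrange(SAMPLE) != 0:
        return                         # set() is a no-op 9 times out of 10
    STORE(keys=[LRU_KEY, PREFIX + label_key],
          args=[label_key, value, ttl, MAX_SIZE, PREFIX])
```

Since set() usually no-ops, some extra misses falling through to the DB are expected; that's the trade for keeping the sorted-set traffic at roughly 1/10 of the label traffic.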
If high traffic causes flooding, the "10" number can be tweaked (or eliminated), or the "highest score + 1" logic could be tweaked to insert new labels with a score that's better than only 3/8 of the existing entries rather than all of them (borrowing from MySQL's midpoint insertion strategy); a sketch of that variant is below. The above method only uses O(log N) redis operations.
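For that second tweak, a hypothetical replacement for the "highest score + 1" block inside the STORE script above: new entries outrank only the bottom 3/8 of the set instead of everything, so a burst of one-off lookups can't flush the hot labels. The get()-side BUMP would keep using highest score + 1, so labels that are genuinely re-read still climb to the top:

```python
# Hypothetical variant of STORE's score computation (cf. MySQL's
# midpoint insertion): place the new member just above the entry
# sitting at rank floor(3N/8) from the bottom.
MIDPOINT = """
local n = redis.call('ZCARD', KEYS[1])
local rank = math.floor(n * 3 / 8)
local at = redis.call('ZRANGE', KEYS[1], rank, rank, 'WITHSCORES')
local score = 1
if at[2] then score = tonumber(at[2]) + 1 end
redis.call('ZADD', KEYS[1], score, ARGV[1])
"""
```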
I'd bet such a thing could be useful for at least a few other use cases as well.