Makes sense. We will indeed be running a batch process once a week to build the completion indices, which ideally will run through all the wikis in a day. We are going to analyze how up to date our page view data really needs to be for scoring purposes, though. If we can get good scoring results while only updating page view info when a page is edited, we might be able to spread the load out over time that way and just hit the page view API once for each edit. Otherwise, I'm sure we can do as suggested earlier: pull the data from Hive directly and stuff it into a temporary structure we can query while building the completion indices.
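
To make the Hive fallback a bit more concrete, here is a rough sketch of what the "temporary structure" could look like. It assumes the Hive data has been exported as tab-separated `wiki`, `title`, `views` rows; that export format, the `page_views` table, the function names, and the sample rows are all made up for illustration and are not taken from any existing code.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for a Hive export (wiki, title, views).
SAMPLE_DUMP = "enwiki\tMain_Page\t120\nenwiki\tPython\t45\ndewiki\tBerlin\t30\n"


def load_page_views(tsv_file):
    """Load (wiki, title, views) rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_views ("
        "wiki TEXT, title TEXT, views INTEGER, "
        "PRIMARY KEY (wiki, title))"
    )
    reader = csv.reader(tsv_file, delimiter="\t")
    conn.executemany(
        "INSERT OR REPLACE INTO page_views VALUES (?, ?, ?)",
        ((wiki, title, int(views)) for wiki, title, views in reader),
    )
    conn.commit()
    return conn


def views_for(conn, wiki, title):
    """Return the stored view count for a page, or 0 if it is unknown."""
    row = conn.execute(
        "SELECT views FROM page_views WHERE wiki = ? AND title = ?",
        (wiki, title),
    ).fetchone()
    return row[0] if row else 0


if __name__ == "__main__":
    conn = load_page_views(io.StringIO(SAMPLE_DUMP))
    print(views_for(conn, "enwiki", "Main_Page"))
```

The index builder would then call something like `views_for` for each page as it goes, so the weekly run never touches the page view API. A plain dict keyed on `(wiki, title)` would work just as well if the data fits comfortably in memory; SQLite is only used here because it can be pointed at a file on disk instead.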