On Fri, 29 Feb 2008 16:39:47 -0500, Simetrical wrote:
On Fri, Feb 29, 2008 at 2:50 PM, Steve Sanbeg ssanbeg@ask.com wrote:
How important are intersections for non-existent categories? Without them, we could have something like (page_id int, cat_intersect bigint) or (page_id int, cat1 int, cat2 int) to get two-category intersections without collisions, and maybe even scale up by defining n-intersections recursively, again without collisions.
Maybe, except we don't have category IDs. If we did, there would logically be no such thing as a nonexistent category: there would be categories with no associated article pages, but they would still have category IDs. Unless you're proposing we use article IDs, but currently categories do not need any article associated with them, and I'm not sure it's valuable to change that.
That was my thinking: categories without a page ID are probably typos, and in any case less useful for intersection; if not, the articles could be added. So using the IDs and some recursion could be simpler and more scalable than using a hash.
This is the first I've heard of category IDs. I can see how it would be more consistent to build this on those once they're available.
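
To make the (page_id, cat1, cat2) idea concrete, here is a rough sketch in Python rather than SQL. It is only one reading of "defining n-intersections recursively", not a proposed implementation: the function names, the in-memory set standing in for the table, and the sample IDs are all made up for illustration. It assumes every category has some integer ID, and it stores each pair with the smaller ID first so a pair has exactly one row per page.

```python
from itertools import combinations


def build_pair_rows(page_cats):
    """Build (page_id, cat1, cat2) rows with cat1 < cat2 for every page.

    page_cats maps a page ID to the set of category IDs it belongs to.
    The returned set stands in for the hypothetical intersection table.
    """
    rows = set()
    for page_id, cats in page_cats.items():
        for cat1, cat2 in combinations(sorted(cats), 2):
            rows.add((page_id, cat1, cat2))
    return rows


def pages_in_pair(rows, a, b):
    """Pages in both categories a and b: an exact match, so no collisions."""
    lo, hi = sorted((a, b))
    return {page for (page, c1, c2) in rows if (c1, c2) == (lo, hi)}


def pages_in_all(rows, cats):
    """n-way intersection defined recursively on top of the pair rows."""
    cats = sorted(set(cats))
    if len(cats) < 2:
        raise ValueError("need at least two distinct category IDs")
    if len(cats) == 2:
        return pages_in_pair(rows, cats[0], cats[1])
    # A page is in all of cats if it is in the first pair
    # and also in every category from the second one onward.
    return pages_in_pair(rows, cats[0], cats[1]) & pages_in_all(rows, cats[1:])


# Made-up sample data: page ID -> category IDs.
sample = {1: {10, 20, 30}, 2: {10, 20}, 3: {20, 30}}
rows = build_pair_rows(sample)
two_way = pages_in_pair(rows, 10, 20)
three_way = pages_in_all(rows, [10, 20, 30])
```

The obvious cost is that a page in k categories produces k*(k-1)/2 rows, so whether this actually scales better than a hash would depend on how many categories pages tend to have.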