Nicolas Dumazet wrote:
From this test, dict.__setitem__ seems to be 25% faster than set.add on Python 2.5:
[...]
# 312 seconds for set.add
# 224 seconds for dict.__setitem__
# 359 seconds for set.__contains__
# 362 seconds for dict.__contains__
As you can see, __contains__ performs the same for set and dict.
I tried on 3 different machines: different absolute values, same relative differences.
I can confirm these values on my Linux system. On Cygwin the __contains__ part looks slower:
400 seconds for set.add
291 seconds for dict.__setitem__
505 seconds for set.__contains__
550 seconds for dict.__contains__
Asking around about that "slow set" rumor, I was told that set might have performed quite badly in Python 2.4. Well, I tried, and 2.4 is not significantly slower than 2.5 on those operations (did the set improvements get backported to 2.5?)
It was probably 2.3 or an earlier version instead.
Anyway, bottom line:
- "bite your tongue a thousand times..."
- set is mutable, and has never been immutable in Python's history
- dict.__setitem__ looks 25% faster than set.add
- key length (strings, from 5 to 100 characters long) does not appear to
affect run time
- Python 2.4 & 2.5 perform equally well wrt set & dict operations
Ok, I think we can replace 'set' with 'dict', but there are some other nearby functions that are extremely slow and should be optimized first, so that the performance change is more noticeable. I am going to do that.
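
For anyone who wants to re-run the comparison on their own machine, here is a minimal sketch built on the standard timeit module. It is not the test script quoted above (that one is elided); the helper names (make_keys, fill_set, fill_dict, count_hits, run), the key format and the default sizes are all assumptions made for illustration, and it is written for a current Python rather than 2.4/2.5, so absolute numbers will not be comparable with the figures in this thread.

```python
import timeit


def make_keys(n, length=20):
    # Hypothetical key generator: n distinct zero-padded strings.
    return [str(i).zfill(length) for i in range(n)]


def fill_set(keys):
    # Insert every key with set.add.
    s = set()
    for k in keys:
        s.add(k)
    return s


def fill_dict(keys):
    # Insert every key with dict.__setitem__ (via the d[k] = ... syntax).
    d = {}
    for k in keys:
        d[k] = None
    return d


def count_hits(container, keys):
    # Membership test (__contains__) for every key; works for set and dict.
    hits = 0
    for k in keys:
        if k in container:
            hits += 1
    return hits


def run(n=100000, repeat=3):
    # Return the best wall-clock time in seconds for each operation.
    keys = make_keys(n)
    s = fill_set(keys)
    d = fill_dict(keys)
    cases = {
        "set.add": lambda: fill_set(keys),
        "dict.__setitem__": lambda: fill_dict(keys),
        "set.__contains__": lambda: count_hits(s, keys),
        "dict.__contains__": lambda: count_hits(d, keys),
    }
    return dict(
        (name, min(timeit.repeat(func, number=1, repeat=repeat)))
        for name, func in cases.items()
    )


if __name__ == "__main__":
    for name, seconds in sorted(run().items()):
        print("%-20s %.4f s" % (name, seconds))
```

Taking the minimum over a few repeats reduces noise from other processes. Note that s.add(k) is a method call while d[k] = None is a subscript assignment, so part of any measured gap may come from call overhead rather than from the containers themselves.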