|
From: Robi-Wan-Kenobi on 4 Apr 2008 04:24 Hi folks, for our application i have to cache data. For that purpose i am currently using the vector as a container to store the pointers to objects holding the data. Currently that's ok since the keys for the acces are rather sequential with no bigger gaps in between. But for the future we attempt to use some other key which will not be sequential. Then i will have to use the map container for storage of the pointers with that new key. In some tests i just made it turned out that it will get slower with the use of map. I generated a bucket of these cahce-objects, stored them in the container, retrieved them from there about 1000 times and erased them again. For that scenario the map is about four times slower than the vector. The use of hash_map was only a little bit faster. Has anyone experiences in optimization of the map? Is there some way to make it faster? Or woud it be faster to use some special map implementation (or CMap / QMap)? Thanks a lot. Robert -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Alberto Ganesh Barbati on 5 Apr 2008 04:29 Robi-Wan-Kenobi ha scritto: > Hi folks, > > for our application i have to cache data. For that purpose i am > currently using the vector as a container to store the pointers to > objects holding the data. > Currently that's ok since the keys for the acces are rather sequential > with no bigger gaps in between. > But for the future we attempt to use some other key which will not be > sequential. Then i will have to use the map container for storage of > the pointers with that new key. > In some tests i just made it turned out that it will get slower with > the use of map. > I generated a bucket of these cahce-objects, stored them in the > container, retrieved them from there about 1000 times and erased them > again. > For that scenario the map is about four times slower than the vector. > The use of hash_map was only a little bit faster. > Has anyone experiences in optimization of the map? Is there some way > to make it faster? > Faster? You are saying that map is four time slower than vector, well... I might say that it is *only* four time slower and that looks like an excellent achievement to me, given that that map access is O(log(n)) while vector access is O(1)! How much faster would you want it? If you want an associative container you have to pay for it! Anyway, a few advices: 1) profiling benchmarks as you did may be useful, but nothing is better than profiling in "real" runtime conditions. Sometimes the result are strikingly different! 2) it seems strange to hear about an hashed container performing a "little bit" faster than an ordered container, given that an hashed container has average constant complexity as opposed to logarithmic complexity. This may be an hint that your benchmark is not measuring the actual performances of the container, but is dominated by some other cost. Maybe you should try using a different hash function. 3) It's true that std::map is an all-purpose container and ad hoc solution might be more performant. But, believe me, unless you are using some crappy standard library, std::map (or even the hash map you are using) is probably very optimized. Usually the bottleneck is in the way *you* use the map rather than in the map itself. > Or woud it be faster to use some special map > implementation (or CMap / QMap)? I'm sorry but I don't know what CMap and QMap are, but really, I don't think they might make a big difference. HTH, Ganesh -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Lance Diduck on 6 Apr 2008 08:59 On Apr 4, 3:24 pm, Robi-Wan-Kenobi <robert.ka...(a)gmx.de> wrote: > Hi folks, > > for our application i have to cache data. For that purpose i am > currently using the vector as a container to store the pointers to > objects holding the data. > Currently that's ok since the keys for the acces are rather sequential > with no bigger gaps in between. > But for the future we attempt to use some other key which will not be > sequential. Then i will have to use the map container for storage of > the pointers with that new key. > In some tests i just made it turned out that it will get slower with > the use of map. > I generated a bucket of these cahce-objects, stored them in the > container, retrieved them from there about 1000 times and erased them > again. > For that scenario the map is about four times slower than the vector. > The use of hash_map was only a little bit faster. > Has anyone experiences in optimization of the map? Is there some way > to make it faster? Or woud it be faster to use some special map > implementation (or CMap / QMap)? > > Thanks a lot. > > Robert Without seeing a profile of the ratio of modification to reads it is hard to tell just what the best solution is. For example, vector element access is only O(1) if you happen to know just what index you want to access. If you are doing a lookup by key, then this turns into O(n). If the vector were sorted, then look up is O(ln n). However, inserts into the vector is O(1) in the O(n) find case, but O(n ln n) in the sorted case. That said, O(1) for one container is not the same as O(1) for another. THe container, vendor, and actual number of elements plays a big role. The fact that map and hash performance were nearly identical is not surprising. It is a urban legend that hash has O(1) lookups. Actually that number is O(n/buckets). So it is only a finely tuned hash that has O(1) lookup. Another advantage of map is that it is easy to write a free list allocator that works with it, whereas with hash this is more difficult. For example, in one of my apps, I have several threads that populate their own isolated maps. So what I do is make an allocator per thread, and pass it to that therads map. Why? If you ever profiled a high performance MT app, you will note that there is considerable contention in the default allocators. This trick completely isolated the map and the allocator. Doing this with a hash is much more difficult, because writing an allocator for it is harder (you cant just have a free list of equal size blocks) Here is a study I recently did-- The data (in ms) for two compilers (A and B) to load (10,100,1000,10000) pair<int,double> 100 times per thread, then sort by various STL methods: (A-1 thrd) (A-2 thrds) (B-1 thrd) (B-2 thrds) native vector with pre-sorted data (0,15,46,359) (16,25,250,2625) (0,0,2,17) (1,1,2,18) native map (0,47,329,3157) (16,220,2100,19K) (1,12,118,1222) (11,90,840,7800) native map w/ lockfree allocator (0,16,304,2094) (16,165,2000,16K) (0,7,75,781) (1,9,100,1050) native vector (16,125,1469,17469) (188,2500,25K,302K) (0,29,53,238) (0,1,16,200) open source vector (using native std::sort) (0,0,78,1093) (0,0,78,1100) (0,0,15,203) (0,0,15,203) native list (31,250,3093,37500) (469,5K,63K,685K) (1,12,130,1360) (2,80,890,7600) native list /alloc (31,250,3094,37400) (464,5K,63K,686K) (0,9,85,917) (0,9,100,1147) open source list (0,63,781,9328) (110,1391,15K,168K) (0,22,145,1561) (13,112,968,7307) open source list /alloc (15,62,734,8672) (85,1350,16K,168K) (0,9,90,1162) (1,10,125,1410) [Intel Core2 @2.13Ghz processor (just a regular PC, running XP sp2)] There is obvious high observational error, but you get the idea. (and yes, reserve was called on the vectors). You can see that I get radically different results. Compiler B did fine, as long as I used my own allocator concoction. Lance -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Mathias Gaunard on 6 Apr 2008 08:54 On Apr 4, 9:24 pm, Robi-Wan-Kenobi <robert.ka...(a)gmx.de> wrote: > But for the future we attempt to use some other key which will not be > sequential. Then i will have to use the map container for storage of > the pointers with that new key. > In some tests i just made it turned out that it will get slower with > the use of map. > I generated a bucket of these cahce-objects, stored them in the > container, retrieved them from there about 1000 times and erased them > again. > For that scenario the map is about four times slower than the vector. > The use of hash_map was only a little bit faster. > Has anyone experiences in optimization of the map? Did you try std::tr1::unordered_set? What implementation are you using of the tools you tried? What's the type of your key? int? What's your value type? (size, copying cost) What's your usage pattern? -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
|
Pages: 1 Prev: Implicitly defined functions properties Next: Lame question about template ids |