Is lookup required when adding elements to std::map?

161 Views Asked by At

(a natural question from a beginner's perspective, but I couldn't find one. It may be due to my lack of understanding of data structure.)

Let's assume a situation where there is no need to check if the key already exists when inserting an element into the std::map. E.g. Inserting a pair from another std::map(which doesn't have duplicate keys).

But, as far as I know, operator[], insert, and emplace lookup the key among the existing keys.

Is lookup required, because std::map is a tree-type data structure? (although it's up to the implementation). Since the map must be sorted by key, it seems impossible not to do a lookup when inserting.

Is my understanding correct, or is there a way to add a key (and its mapped value) into std::map without checking for duplicates?

2

There are 2 best solutions below

4
JaMiT On BEST ANSWER

Is lookup required, because std::map is a tree-type data structure?

No, lookup is (practically) required because lookup is required to be faster than O(n). (It is not required that a std::map be implemented as a tree-type data structure.)

If lookup was allowed to be O(n), then it would not matter where a new element goes; a search could simply look at everything. Conversely, looking at everything takes O(n) time. In order to beat this time complexity, a search must be able to eliminate some entries from consideration without looking at them.

When adding an element to a std::map, the new element must not be placed where the lookup will eliminate it without looking at it. That is, it must be placed so that looking it up will find it. How long does it take to find a place for the new element? Well, if it took less time than finding the element later, replace your "find" algorithm with whatever you use to find a place for the new element. If it takes more time, replace whatever you use to find a place for the new element with your "find" algorithm. The end result is that whatever you use to find a place for the new element is equivalent, in terms of algorithmic complexity, to finding the element later.

So an insertion requires a process that is as complex as lookup. For practical considerations, we might as well say insertion does a lookup. (I'll add the caveat that the lookup could be done in advance of the insertion – with other work done in between – as long as the insertion has the result of the lookup and as long as the "other work" does not invalidate the lookup. The standard containers do not expose such an interface, though.)


Note that the above applies to all standard sorted containers, not just std::map. I would in particular call out std::multimap, where the existence of a duplicate key does not prevent an insertion. (To me, the question reads as if the thought was that the point of performing the lookup was to prevent duplicates.) Even when duplicate keys are allowed, insertion requires a lookup (or equivalent) whenever lookup is required to be faster than O(n).

0
Sam Varshavchik On

A lookup is required (from a conceptual level) for a very simple reason: std::map can only hold a single value for a specified key.

Therefore: the rules of logic of our shared universe impose a requirement, in order to determine whether the key already exists, or not, in the map. The requirement is to look it up. There is no alternative.

Now, note that this is referencing what std::map must do, when it is given the task of inserting something into itself. There is no requirement for you to manually check if the key exists, or not, in the map, before attempting to insert() something into it. You have no obligation to do that. Except that if you don't do that, and the key exists, nothing gets inserted in the map.

Forging ahead on this topic: insert() returns a value. Your C++ textbook should have a description of what, specifically, insert() returns, and by inspecting the returned value you will discover if the insert really happened, or did not happen because the key already existed.

So, from a different point of view: no, you are not required, in your own code, to check if the key exists before inserting. If you don't, insert() will do it for you. In fact, insert() will always do it (except that the C++ standard allows your suffering C++ compiler to optimize it away if it is able to logically prove that insert() will never find a duplicate key because your code checked for that first, and there are no observable side effects from this optimization, but you can't rely on that of course).