12

This is a common pattern I use to index tokens as they come in: check if the token is in a map, and if not, add it to the map, assigning the map's size.

When doing this in C++, it unexpectedly increments the map's size before the assignment is made:

#include <cstdio>
#include <map>
using namespace std;

int main() {
    map<char, int> m;
    printf("Size before adding: %d\n", m.size());
    m['A'] = m.size();
    printf("Size after adding: %d\n", m.size());
    printf("What was added: %d\n", m['A']);

    return 0;
}

This prints out:

Size before adding: 0
Size after adding: 1
What was added: 1

The way I understand it, it should evaluate the right hand side, which is zero, then pass it to a function that puts 'A' and the zero into the map. But it seems to be evaluating it after it started assigning, which doesn't make sense.

Shouldn't the right-hand side be evaluated before the assignment operation?

Evgeni Sergeev
  • 22,495
  • 17
  • 107
  • 124
  • Nope, I'm pretty sure this is UB. – Borgleader Aug 04 '13 at 05:39
  • 1
    @Borgleader: It is Unspecified behavior. – Nawaz Aug 04 '13 at 05:41
  • @Nawaz oh right, unspecified – Borgleader Aug 04 '13 at 05:43
  • Just a thought. Instead of `map m;` can you use `struct X { char c; int id; }; set s`? Generate the `id` using a counter. – iammilind Aug 04 '13 at 05:50
  • @Nawaz: It seems to me it's undefined, for the same reason that an expession like `arr[i++] = i;` (where `arr` is an array of integers, and `i` is an integer) is undefined. It's modifying a variable (the `std::map` internal variable that stores the size) and reading its value. – Benjamin Lindley Aug 04 '13 at 05:50
  • @BenjaminLindley: `arr[i++] = i;` is **not** UB if the type of `i` is user defined class type and so is `arr`. Hope that tells you the *crucial* difference. – Nawaz Aug 04 '13 at 05:51
  • @Nawaz: But in this case, `i` is not a user defined type, it is `std::map::size_type`, which is probably `size_t`, which is probably a typedef for a non-user-defined type. – Benjamin Lindley Aug 04 '13 at 05:54
  • 1
    @BenjaminLindley: In this case, this is no `i` to begin with. In this case, this is `m` which is user-defined type. It appears on both sides. The creation and initialization of `size_type` happens *inside* the function `operator[]` which is followed by sequence point, in fact many sequence points and then *assignment* happens in the user code (outside the function). Between them there are lots of sequence points. I don't see UB. – Nawaz Aug 04 '13 at 05:55

5 Answers5

7

The behavior is unspecified, pedantically speaking.

But what happens in your case is this:

m['A'] = m.size();

m is a std::map which creates a new entry if the key doesn't exist.

In your case, the key 'A' doesn't exist, so it creates the entry, and return the reference to the value (which is default created) which then is assigned to m.size() in your case.

As said above, the behaviour is unspecified, as the order of evaluation of operands is unspecified which means m.size() might be evaluated before m['A']. If it does, then m['A'] will be 0, not 1.

Nawaz
  • 353,942
  • 115
  • 666
  • 851
  • 2
    Right, so the order of evaluation of operands is unspecified *in general* in C++? – Evgeni Sergeev Aug 04 '13 at 05:45
  • 1
    @EvgeniSergeev: Yes, it is true for all operators, **except** comma operator `,` in which case the order is defined to be left-to-right. – Nawaz Aug 04 '13 at 05:47
  • 4
    ... and also for the `or` (`||`), `and` (`&&`) operators too the order is defined. – iammilind Aug 04 '13 at 05:51
  • I guess my confusion was primarily about the evaluation of the left hand side being independent (and returning a reference — to something that's probably been initialised to a default value in the meantime). I had implicitly assumed that the statement would be translated into a function call like `assign(key, value)`, so the value would have to be evaluated first. But that's not the way it works in C++. – Evgeni Sergeev Aug 04 '13 at 05:52
  • 4
    @EvgeniSergeev: C++ doesn't define the order of evaluation of function parameters either so with something like `assign(key, value);`, the `key` and `value` are still evaluated in an unspecified order. This is part of why you should usually avoid overloading `operator,`, `operator||` and `operator&&` -- you lose the sequencing guaranteed for the normal operators. – Jerry Coffin Aug 04 '13 at 05:57
  • @JerryCoffin Yes, but `value` is always evaluated before `assign(..)`. If not, it would be lazy evaluation which is not something to be expected from C++. – Evgeni Sergeev Aug 04 '13 at 06:32
  • 1
    Yes, the arguments are always evaluated before the function itself -- but that's irrelevant to your situation here (which would require specified order in which the function arguments are evaluated). – Jerry Coffin Aug 04 '13 at 06:33
  • 4
    In C++17, this is now defined behavior as the left hand side is guaranteed to be sequenced before the right hand side. Would you mind updating the answer, or I can if you are willing? Thanks. – NathanOliver May 01 '20 at 16:54
5

No.

(§5.17/1): "In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression."

Note, however, that while the assignment happens after the right and left operands are evaluated, no sequencing is specified between the evaluation of the left and right operands themselves. Therefore, the left could be evaluated first, then the right, or vice versa.

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
4

But it seems to be evaluating it after it started assigning...

The order of evaluation in assignment is actually unspecified (whether the left hand or right-hand expression is evaluated first is unspecified), as indicated by the answers of this question.

If m.size() is evaluated first, then your code will work as intended, but you are not guaranteed of this behavior, and another implementation might evaluate m['A'] first, the same case with yours. These ambiguous cases must be avoided.

Better do something like this instead

auto size = m.size();
m['A'] = size;

which you are guaranteed that the size query is evaluated first before the element assignment.

LIVE CODE with the improvement..

Community
  • 1
  • 1
Mark Garcia
  • 17,424
  • 4
  • 58
  • 94
1

This is approximate implementation of [] operator, edited from stl header file

mapped_type& operator[](const key_type& key){
auto itr = lower_bound(key);
// itr->first is greater than or equivalent to key.
if (itr == end() || comp_func(key, (*itr).first))
      itr = insert(itr, value_type(key, mapped_type()));
return (*itr).second;
}

So as you can see for new element it inserts first, thereby increases map size by 1

Ref std::map::operator[]

If key does not match the key of any element in the container, the function inserts a new element with that key and returns a reference to its mapped value. Notice that this always increases the container size by one, even if no mapped value is assigned to the element (the element is constructed using its default constructor).

Edit :

As pointed out by others, m['A'] = m.size(); leads to an unspecified behavior,never use such statements, instead you can calculate the size first and then assign it to new key.

P0W
  • 46,614
  • 9
  • 72
  • 119
0

According to cppref

  1. In every simple assignment expression E1 = E2 and every compound assignment expression E1 @= E2, every value computation and side effect of E2 is sequenced before every value computation and side effect of E1

As mentioned above,m.size() will be evaluated before m['A'], so the answer is certain.
see it online

Kargath
  • 450
  • 5
  • 15