26

This SO question sparked a discussion about std::generate and the guarantees made by the standard. In particular, can you use function objects with internal state and rely on generate(it1, it2, gen) to call gen(), store the result in *it, call gen() again, store in *(it + 1) etc., or can it start at the back, for example?

The standard (n3337, §25.3.7/1) says this:

Effects: The first algorithm invokes the function object gen and assigns the return value of gen through all the iterators in the range [first,last). The second algorithm invokes the function object gen and assigns the return value of gen through all the iterators in the range [first,first + n) if n is positive, otherwise it does nothing.

It seems like no ordering is guaranteed, especially since other paragraphs have stronger wording, for example std::for_each (Effects: Applies f to the result of dereferencing every iterator in the range [first,last), starting from first and proceeding to last - 1. If we're taking this literally, it only guarantees to start at first and end at last though - no guarantees on the ordering in between).

But: Both Microsoft's and Apache's C++ standard library both give examples on their documentation pages that require the evaluation to be sequential. And both libc++ (in algorithm) and libstdc++ (in bits/stl_algo.h) implement it that way. Moreover, you lose a lot of potential applications for generate without this guarantee.

Does the current wording imply sequentiality? If not, was this an oversight by the members of the committee or intentional?

(I am well aware that there aren't many people who can provide insightful answers to this question without merely speculating or discussing, but in my humble opinion, this does not make this question 'not constructive' as per SO guidelines.)


Thanks to @juanchopanza for pointing out this issue and referring me to the paragraph about for_each.

Community
  • 1
  • 1
us2012
  • 16,083
  • 3
  • 46
  • 62
  • 1
    I don't think `generate()` is very useful if it's not sequential. – Lily Ballard Feb 12 '13 at 00:59
  • 8
    I believe the vagueness is considerably lifted when taken in the context of the required iterator *class* the template function mandates it be provided; **`template`**. Since both first and last are forward-required only, save for stacking them all into an array or vector construct, then non-sequentially leaping about the sequence, you have little choice but to start at the beginning, and arrive at the end. – WhozCraig Feb 12 '13 at 01:00
  • @WhozCraig Heh, I totally missed that. In my opinion, this is an insightful answer and you should post it as such. Nevertheless, I would still be interested to hear from people close to the committee or library implementors about their thoughts on this. – us2012 Feb 12 '13 at 01:02
  • @WhozCraig: That looks more like an answer than a comment to me, particularly as it applies to many more algorithms. – rici Feb 12 '13 at 01:03
  • 1
    @us2012 still a strong question, `generate_n()` requires a different iterator type. make special note of that because technically it can go *backwards*. It's required template params are **`template`** – WhozCraig Feb 12 '13 at 01:03
  • 1
    Nothing stops an implementation from supplying alternative implementations for iterator types that are better than forward though. An implementation might, for example, implement the algorithm in parallel if it recieves random access iterators. – Benjamin Lindley Feb 12 '13 at 01:06
  • @BenjaminLindley true, but the standard dictates that you can invoke successfully only supplying a forward iterator pair. Anything beyond that is beyond the standard, and would at-best be an implementation specialization. – WhozCraig Feb 12 '13 at 01:07
  • 2
    @WhozCraig: Right. But my point is that it invalidates the idea that you can assume sequentiality just because the minimum required iterator category for the algorithm is forward. How often do we pass iterators that only meet forward iterator requirements and no better anyway? The only container in the standard library that fits that description is `forward_list`, and I've never used that. – Benjamin Lindley Feb 12 '13 at 01:10
  • @BenjaminLindley And my point was, if the implementation chooses to provide behavior for iterators of a different class level than the standard requires, it does so outside the standard. Also, the reason I didn't bother posting this as an answer is because frankly, I presented in the initial comment a mechanism where, even exactly by the standard minimum of providing only a `FowardIterator pair`, you could still fill non-sequentially. you'd have to step way out of the realm of reason to do it, but it is none-the-less not *forbidden* according to the letter of the standard as I read it. – WhozCraig Feb 12 '13 at 01:14
  • @BenjaminLindley thus, I don't think you can, with 100% reliability, *guarantee* a sequential fill, even in a `forward_list`, and frankly I find that troubling =P. It does seem to *imply* it as at least reliable to the non-word-mincing reader, but by the letter, I see nothing that says it is gospel. – WhozCraig Feb 12 '13 at 01:15

2 Answers2

8

In the discussion of LWG475, std::for_each is compared with std::transform. It's noted that "transform does not guarantee the order in which its function object is called". So, yes, the committee is aware of the lack of sequential guarantees in the standard.

There is no opposite requirement for non-sequential behavior either, so Microsoft and Apache are free to use sequential evaluation.

MSalters
  • 173,980
  • 10
  • 155
  • 350
  • Very interesting (if somewhat puzzling, because for `generate`/`generate_n` in particular I would think that the benefits of mandating sequential behaviour by far outweigh the potential drawbacks). Thank you! – us2012 Feb 12 '13 at 10:59
5

Anywhere the standard doesn't specify an ordering on an algorithm, you should assume that an implementation can exploit that for parallelism. The paper n3408 discusses options for parallelisation, and points to the Thrust library, which is both a usable parallel-enabled reimplementation of the standard algorithms and a proof-of-concept for future standardisation of parallelism in the algorithms.

Looking at Thrust's implementation of generate, it calls gen in a parallel loop whenever the iterator category is random access. As you've observed, this is consistent with the standard, so you should not assume that generate will always be sequential. (For example, a thread-safe std::rand can be efficiently used with generate and does not require sequential invocation.)

The only algorithms that guarantee sequential invocation are those in numeric; if your code depends on sequential invocation, you should use iota in place of generate. Adapting an existing generator:

template<typename F> struct iota_adapter {
   F f;
   operator typename std::result_of<F()>::type() { return f(); }
   void operator++() {}
};
template<typename F> iota_adapter<F> iota_adapt(F &&f) { return {f}; }

Use as:

#include <numeric>
#include <iostream>

int main() {
   int v[5], i = 0;
   std::iota(std::begin(v), std::end(v), iota_adapt([&i]() { return ++i; }));
   for (auto i: v) std::cout << i << '\n';
}
ecatmur
  • 152,476
  • 27
  • 293
  • 366
  • 1
    Interesting! Funnily enough, the thrust authors seem to assume sequentiality for `std::generate`: Looking at the code examples at http://thrust.github.com/ , they use `std::generate` and the comment above says "generate 32M random numbers serially". `iota` is an unsatisfying substitute though, it requires that the return type can have an `operator++` that does what you want. I think implementing a `seq_generate` yourself function might be the best solution, given that it's not exactly difficult to do. – us2012 Feb 12 '13 at 13:02
  • I'd read that as "serially" as opposed to in parallel (since `std::rand` might not be parallel-aware or -safe) rather than implying sequentiality. It's not too difficult to adapt `iota`; I'll provide an example. – ecatmur Feb 12 '13 at 14:55
  • Two small comments: 1) 5 algorithms form ``: `for_each`, `copy`, `copy_backward`, `move` and `move_backward` are also sequential. 2) the wording for the sequentiality guarantee for `iota` is the odd one out, since it doesn't use the phrase "in order" but rather "increments value as if by ++value". – TemplateRex Oct 31 '14 at 19:09