11

I have a function :

void get_good_items(const std::vector<T>& data,std::vector<XXX>& good_items);

This function should check all data and find items that satisfies a condition and return where they are in good_items.

what is best instead of std::vector<XXX>?

  1. std::vector<size_t> that contains all good indices.
  2. std::vector<T*> that contain a pointers to the items.
  3. std::vector<std::vector<T>::iterator> that contains iterators to the items.
  4. other ??

EDIT:

What will I do with the good_items? Many things... one of them is to delete them from the vector and save them in other place. maybe something else later

EDIT 2:

One of the most important for me is how will accessing the items in data will be fast depending on the struct of good_items?

EDIT 3:

I have just relized that my thought was wrong. Is not better to keep raw pointers(or smart) as items of the vector so I can keep the real values of the vector (which are pointers) and I do not afraid of heavy copy because they are just pointers?

Community
  • 1
  • 1
Humam Helfawi
  • 19,566
  • 15
  • 85
  • 160
  • Will you use the result only in the calling function, or do you attempt to store it so you can use it again (after the vector may have already changed)? Will any other code (possibly in a different thread) be modifying the vector between `get_good_items` and your using the result? – CompuChip Nov 12 '15 at 16:07
  • For now let us not worry about thread-safty – Humam Helfawi Nov 12 '15 at 16:08
  • If the data vector is modified (erasing elements from it, moving it from one memory range to another etc.) the references will break. In this case you can copy the good items from data to good_items. If the data vector will not be tinkered with, you can easily store pointers (hence 2 would be the way to go since omho it's easier to deal with and is more readable) to the items. – rbaleksandar Nov 12 '15 at 16:09
  • @rbaleksandar if no change happend till I want to dismiss good_items? – Humam Helfawi Nov 12 '15 at 16:10
  • If `data` remains at its location in memory using the references stored inside `good_items` do delete items from `data` would not be a problem.However if you move `data` somewhere else in the memory or for example delete an item by directly accessing `data` and not through the references stored in `good_items` you will get one or multiple broken references since the pointers stored in `good_items` will point at a location in memory that no longer contains the appropriate content(since you have deleted that item already).This applies in general when using pointers and not only to your situation – rbaleksandar Nov 12 '15 at 16:16
  • You should provide more information on where `data` is located (stack or heap) and what you intend to do with both it and `good_items`. – rbaleksandar Nov 12 '15 at 16:18
  • Do you really need random access on that collection? – decltype_auto Nov 12 '15 at 18:54
  • @decltype_auto you mean data or good_items ? – Humam Helfawi Nov 13 '15 at 06:18
  • data; but it wouldn't hurt if you told me about both. – decltype_auto Nov 13 '15 at 06:21
  • Yes data is a vector and it would be always a vector. it is something out of my scope. However, I have the choice of what good _items would be – Humam Helfawi Nov 13 '15 at 06:22

4 Answers4

4

If you remove items from the original vector, every one of the methods you listed will be a problem.

If you add items to the original vector, the second and third will be problematic. The first one won't be a problem if you use push_back to add items.

All of them will be fine if you don't modify the original vector.

Given that, I would recommend using std::vector<size_t>.

R Sahu
  • 204,454
  • 14
  • 159
  • 270
2

If you intend to remove the elements that statisfy the predicate, then erase-remove idiom is the simplest solution.

If you intend to copy such elements, then std::copy_if is the simplest solution.

If you intend to end up with two partitions of the container i.e. one container has the good ones and another the bad ones, then std::partition_copy is a good choice.

For generally allowing the iteration of such elements, an efficient solution is returning a range of such iterators that will check the predicate while iterating. I don't think there are such iterators in the standard library, so you'll need to implement them yourself. Luckily boost already has done that for you: http://www.boost.org/doc/libs/release/libs/iterator/doc/filter_iterator.html

eerorika
  • 232,697
  • 12
  • 197
  • 326
2

I would go with std::vector<size_t> or std::vector<T*> because they are easier to type. Otherwise, those three vectors are pretty much equivalent, they all identify positions of elements.

std::vector<size_t> can be made to use a smaller type for indexes if you know the limits.

If you expect that there are going to be many elements in this vector, you may like to consider using boost::dynamic_bitset instead to save memory and increase CPU cache utilization. A bit per element, bit position being the index into the original vector.

Maxim Egorushkin
  • 131,725
  • 17
  • 180
  • 271
0

The problem you are solving, from my understanding, is the intersection of two sets, and I would go for the solution from standard library: std::set_intersection

Piotr Falkowski
  • 1,957
  • 2
  • 16
  • 24