Returning class object by value or pass by reference, which will be faster here

Question

Suppose I have a class object matrix. To add two matrices with a large number of elements, I can either overload operator + or define a function Add, like this:

matrix operator + (const matrix &A, const matrix &B) {
    matrix C;
    /* all required things */
    for(int i .........){
      C(i)=A(i)+B(i);
    }
    return C;
}

and I call it like this,

matrix D = A+B;

Now if I define the Add function,

void Add(const matrix &A, const matrix &B, matrix &C) {
    C.resize(); // according to dimensions of A, B
    // for C.resize , copy constructor will be called.
    /* all required things */
    for(int i .........){
      C(i)=A(i)+B(i);
    }
}

And I have to call this function like,

matrix D;
Add(A,B,D); //D=A+B

Which of the above methods is faster and more efficient? Which should we use?

Always do the natural,good and nice thing before even starting to think about "faster", "efficiency" and optimizations. And never do anything without measuring and profiling to find the *real* bottle-necks. And don't forget optimization (especially things like [copy elision](https://en.cppreference.com/w/cpp/language/copy_elision)). — Some programmer dude, Jan 17 '19 at 07:23

score 0 · Answer 1 · answered Jan 17 '19 at 08:34

Without using any tools, such as
1. a profiler (e.g. gprof) to see how much time is spent where, or
2. "valgrind + cachegrind" to see how many operations are performed in either of the two functions,
and also ignoring all compiler optimizations (i.e. compiling with -O0),
and assuming that whatever else is in the two functions (what you represent as /* all required things */) is trivial,

then all one can say, just by looking at both functions, is that both have a complexity of O(n), since both spend most of their time in their for loops. If the matrices are really large, everything else in the code is pretty much insignificant when it comes down to speed.

So what your question boils down to, in my opinion, is a comparison between

how much time it takes
1. to call the constructor of C,
2. plus returning this C,
versus how much time it takes
1. to call the resize function for C,
2. plus calling the copy constructor of C.
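
One way to check how many copies each variant really makes is to count them. The following is only a minimal sketch: `Matrix`, `copy_count` and the `std::vector`-backed storage are hypothetical stand-ins, since the question does not show the real matrix class. Under these assumptions, returning C by value is either elided or turned into a move, and resize() only resizes the storage in place, so neither variant calls the copy constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical counter: number of Matrix copy constructions so far.
int copy_count = 0;

// Hypothetical stand-in for the question's matrix class (flat storage only).
struct Matrix {
    std::vector<double> data;

    Matrix() = default;
    explicit Matrix(std::size_t n, double v = 0.0) : data(n, v) {}
    Matrix(const Matrix& other) : data(other.data) { ++copy_count; }
    Matrix(Matrix&&) = default;
    Matrix& operator=(const Matrix&) = default;
    Matrix& operator=(Matrix&&) = default;

    void resize(std::size_t n) { data.resize(n); }
    std::size_t size() const { return data.size(); }
    double& operator()(std::size_t i) { return data[i]; }
    double operator()(std::size_t i) const { return data[i]; }
};

// Variant 1: return the result by value.
Matrix operator+(const Matrix& a, const Matrix& b) {
    assert(a.size() == b.size());
    Matrix c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c(i) = a(i) + b(i);
    }
    return c;  // elided (NRVO) or moved, so the copy constructor is not used
}

// Variant 2: write the result into a caller-supplied object.
void Add(const Matrix& a, const Matrix& b, Matrix& c) {
    assert(a.size() == b.size());
    c.resize(a.size());  // resizes the vector in place; no Matrix copy here
    for (std::size_t i = 0; i < a.size(); ++i) {
        c(i) = a(i) + b(i);
    }
}
```

After `Matrix d = a + b;` or `Matrix e; Add(a, b, e);`, `copy_count` stays at 0 in this sketch. A real matrix class with a different resize() or without a move constructor may behave differently, which is why measuring is still worthwhile.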

You can measure this 'crudely but relatively quickly' using std::clock() or std::chrono, as shown here in multiple answers.

#include <chrono>

auto t_start = std::chrono::high_resolution_clock::now();
matrix D = A+B; // To compare replace on 2nd run with this --->   matrix D; Add(A,B,D);
auto t_end = std::chrono::high_resolution_clock::now();
double elapsedTimeMs = std::chrono::duration<double, std::milli>(t_end-t_start).count();
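
A single run like this can be noisy, so it may help to repeat the measurement and take the mean. Below is a rough sketch of such a helper. The name `mean_time_ms` and the default of 10 repetitions are my own assumptions, and it uses std::chrono::steady_clock instead of high_resolution_clock because steady_clock is guaranteed to be monotonic.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` `reps` times and returns the mean
// wall-clock time per run in milliseconds.
template <typename F>
double mean_time_ms(F&& work, int reps = 10) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        work();  // the code under test
    }
    const auto stop = std::chrono::steady_clock::now();
    const double total_ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    return total_ms / reps;
}
```

With the names from the question it could be called as `mean_time_ms([&] { matrix D = A + B; })` and `mean_time_ms([&] { matrix D; Add(A, B, D); })`. With optimizations enabled the compiler may remove work whose result is never used, so it is safer to read something from D after the call.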

Although, once again, in my honest opinion, if your matrices are big, most of the time will be spent in the for loop.

p.s. Premature optimization is the root of all evil.
