Why can't this be optimized?

Question

I have a function that I use to add vectors, like this:

public static Vector AddVector(Vector v1, Vector v2)
{
    return new Vector(
      v1.X + v2.X,
      v1.Y + v2.Y,
      v1.Z + v2.Z);
}

Not very interesting. However, I overload the '+' operator for vectors and in the overload I call the AddVector function to avoid code duplication. I was curious whether this would result in two method calls or if it would be optimized at compile or JIT time. I found out that it did result in two method calls because I managed to gain 10% in total performance by duplicating the code of the AddVector as well as the dot product method in the '+' and '*' operator overload methods. Of course, this is a niche case because they get called tens of thousands of times per second, but I didn't expect this. I guess I expected the method to be inlined in the other, or something. And I suppose it's not just the overhead of the method call, but also the copying of the method arguments into the other method (they're structs).
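For readers following along, here is a minimal sketch of the arrangement described above. The question does not show the Vector type, so the double fields and the constructor are assumptions; only AddVector and the idea of an operator that forwards to it come from the post.

```csharp
using System;

// Hypothetical reconstruction: field types and constructor are assumed,
// since the question only shows AddVector.
public struct Vector
{
    public readonly double X, Y, Z;

    public Vector(double x, double y, double z)
    {
        X = x;
        Y = y;
        Z = z;
    }

    public static Vector AddVector(Vector v1, Vector v2)
    {
        return new Vector(
            v1.X + v2.X,
            v1.Y + v2.Y,
            v1.Z + v2.Z);
    }

    // The operator forwards to AddVector to avoid duplicating the arithmetic.
    // Unless the JIT inlines the inner call, each '+' costs two calls and
    // passes both structs by value twice.
    public static Vector operator +(Vector v1, Vector v2)
    {
        return AddVector(v1, v2);
    }
}

public static class Program
{
    public static void Main()
    {
        Vector sum = new Vector(1, 2, 3) + new Vector(4, 5, 6);
        Console.WriteLine("{0} {1} {2}", sum.X, sum.Y, sum.Z);
    }
}
```

The hand-optimized variant mentioned in the question would simply repeat the body of AddVector inside the operator.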

It's no big deal, I can just duplicate the code (or perhaps just remove the AddVector method since I never call it directly) but it will nag me a lot in the future when I decide to create a method for something, like splitting up a large method into several smaller ones.

99% of the time since even a 10% performance improvement is negligible in this case (inline vs. 2 method calls). What was your absolute performance gain? I'm guessing a few ms at best. See http://www.codinghorror.com/blog/archives/001218.html for an example. Can't answer the question though, sorry. – James Feb 02 '09 at 19:00
OK, OK, micro-optimizations are not good, but what's the answer to the question? Can you try out different behavior with compilation flags or method attributes, etc.? – Michael Haren Feb 02 '09 at 19:02
@bnkdev: I agreed with you actually. In fact our comments have a few seconds time difference and I hadn't seen yours :) I was actually discouraging micro-optimization. – Mehrdad Afshari Feb 02 '09 at 19:08
http://www.hanselman.com/blog/ReleaseISNOTDebug64bitOptimizationsAndCMethodInliningInReleaseBuildCallStacks.aspx – Mehrdad Afshari Feb 02 '09 at 19:09
I didn't see you guys' comments either... it took me 4 minutes to try to fit all of that in a comment... :) Didn't want to reply with an answer since I don't know the answer. – James Feb 02 '09 at 19:19
Is this a release or debug build? Also, have you tried running it through ngen? There's a limit to what the JIT compiler is able to do (since it must be fast, as it runs at load time), so if you want max performance, ngen might be your best bet. – jalf Feb 02 '09 at 19:36
Yes, this is micro-optimization. However, I think knowing how something behaves in the language is important knowledge, and this behaved very differently from what I expected (I expected the same performance or a negligible difference). – JulianR Feb 02 '09 at 20:36
Quite frankly, this anti-"micro optimization" fad is taking on religious forms. This is a ray tracer and as long as it's not running at 60 FPS, nothing is unnecessary optimization. I could get more performance with better algorithms (I don't even have multithreading), but that doesn't mean this doesn't help. – JulianR Feb 02 '09 at 20:40
I ran the test; there is about a 20x difference between inlined unmanaged and virtual (managed) function calls. For a core math primitive in a realtime raytracer, that's huge! – Crashworks Feb 02 '09 at 23:31

score 5 · Answer 1 · answered Feb 02 '09 at 19:07

If you compile in Debug mode, or start the process with a debugger attached (attaching one later is fine), then a large class of JIT optimisations, including inlining, won't happen.

Try re-running your tests: compile in Release mode, run without a debugger attached (Ctrl+F5 in VS), and see whether you get the optimisations you expected.
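One way to guard a timing run against both conditions is sketched below. Debugger.IsAttached and Stopwatch are standard System.Diagnostics members; the Work method and the iteration count are hypothetical stand-ins, and the DEBUG symbol is assumed to be defined only by the Debug configuration (the Visual Studio default).

```csharp
using System;
using System.Diagnostics;

public static class Program
{
    // Hypothetical stand-in for the code under test.
    static double Work(int i)
    {
        return i * 0.5;
    }

    // Runs the workload and returns the accumulated result so the loop
    // cannot be discarded as dead code.
    public static double Measure(int iterations)
    {
        double acc = 0;
        for (int i = 0; i < iterations; i++)
        {
            acc += Work(i);
        }
        return acc;
    }

    public static void Main()
    {
        // True when a debugger is attached to this process right now.
        if (Debugger.IsAttached)
        {
            Console.WriteLine("Warning: debugger attached; timings are not representative.");
        }
#if DEBUG
        Console.WriteLine("Warning: Debug build; timings are not representative.");
#endif
        Stopwatch sw = Stopwatch.StartNew();
        double acc = Measure(10000000);
        sw.Stop();
        Console.WriteLine("Elapsed: {0} ms (acc = {1})", sw.ElapsedMilliseconds, acc);
    }
}
```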

score 3 · Answer 2 · edited Feb 02 '09 at 19:05


"And I suppose it's not just the overhead of the method call, but also the copying of the method arguments into the other method (they're structs)."

Why don't you test this out? Write a version of AddVector that takes the two vector structs by reference (ref parameters) instead of by value.
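A rough sketch of that experiment might look like the following. The Vector layout and the VectorOps class name are assumptions, not taken from the question. Note that C# operator overloads cannot declare ref parameters, so this only applies to the named method.

```csharp
using System;

// Hypothetical Vector layout; the question does not show the real one.
public struct Vector
{
    public double X, Y, Z;

    public Vector(double x, double y, double z)
    {
        X = x;
        Y = y;
        Z = z;
    }
}

public static class VectorOps
{
    // Both inputs are passed by reference, so the structs are not copied
    // into the callee. The result is still returned by value.
    public static Vector AddVector(ref Vector v1, ref Vector v2)
    {
        return new Vector(
            v1.X + v2.X,
            v1.Y + v2.Y,
            v1.Z + v2.Z);
    }
}

public static class Program
{
    public static void Main()
    {
        // ref arguments must be assignable variables, not temporaries.
        Vector a = new Vector(1, 2, 3);
        Vector b = new Vector(4, 5, 6);
        Vector sum = VectorOps.AddVector(ref a, ref b);
        Console.WriteLine("{0} {1} {2}", sum.X, sum.Y, sum.Z);
    }
}
```

Whether this is faster than passing by value depends on the struct size and the JIT, so it needs measuring in a Release build.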

answered Feb 02 '09 at 19:02 by Szymon Rozga · edited Feb 02 '09 at 19:05 by Joel Coehoorn

That would be slower than passing by value (depending on the struct size, but I am talking about the recommended size). – Joan Venge Feb 02 '09 at 19:27

score 1 · Accepted Answer · edited May 23 '17 at 12:04


Don't assume that struct is the right choice for performance. The copying cost can be significant in some scenarios. Until you measure you don't know. Furthermore, structs have spooky behaviors, especially if they're mutable, but even if they're not.
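As an illustration of the kind of surprise mutable structs can cause, consider this small sketch. Counter and Holder are hypothetical types invented for the example; they are not from the question.

```csharp
using System;

// Hypothetical mutable struct used only for illustration.
public struct Counter
{
    public int Value;

    public void Increment()
    {
        Value++;
    }
}

public class Holder
{
    public readonly Counter Fixed = new Counter();
    public Counter Plain = new Counter();
}

public static class Program
{
    public static void Main()
    {
        Counter a = new Counter();
        Counter b = a;   // assignment copies the whole struct
        b.Increment();   // changes b only
        Console.WriteLine("a = {0}, b = {1}", a.Value, b.Value);

        Holder h = new Holder();
        h.Fixed.Increment(); // runs on a temporary copy of the readonly field
        h.Plain.Increment(); // runs on the field itself
        Console.WriteLine("Fixed = {0}, Plain = {1}", h.Fixed.Value, h.Plain.Value);
    }
}
```

The call on the readonly field compiles without complaint but leaves the field unchanged, because the method is invoked on a copy.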

In addition, what others have said is correct:

Running under a debugger will disable JIT optimizations, making your performance measurements invalid.
Compiling in Debug mode also makes performance measurements invalid.

answered Feb 02 '09 at 19:21 by Jay Bazuzi · edited May 23 '17 at 12:04 by Community

Definitely true. Changing a custom vector type to a class in a ray tracer from codeproject.com (can't find the url) resulted in a significant performance boost. – Christian Klauser Feb 02 '09 at 20:31
Thanks for the helpful insights :) I just changed my structs to classes, and behold, significant performance gains! I really didn't expect that either, but I'll create a new question for the questions I have with that. – JulianR Feb 02 '09 at 21:25
That's strange, I got 10x better performance when I switched to structs from classes. – Joan Venge May 14 '09 at 17:58

score 1 · Answer 4 · answered Feb 02 '09 at 20:33

I had VS in Release mode and I ran without debugging so that can't be to blame. Running the .exe in the Release folder yields the same result. I have .NET 3.5 SP1 installed.

And whether or not I use a struct depends on how many instances of the type I create and on how expensive it is to copy compared with passing a reference.

score 0 · Answer 5 · answered Feb 02 '09 at 19:10


You say Vector is a struct. According to a blog post from 2004, value types are a reason for not inlining a method. I don't know whether the rules have changed about that in the meantime.

answered Feb 02 '09 at 19:10 by Rob Kennedy

Value types can be inlined on x86 from .NET 3.5 SP1 (see: http://blogs.msdn.com/vancem/archive/2008/05/12/what-s-coming-in-net-runtime-performance-in-version-v3-5-sp1.aspx); x64 supported inlining in an earlier version than this but I'm not sure which one exactly... – Greg Beech Feb 02 '09 at 19:33

score 0 · Answer 6 · answered Feb 02 '09 at 21:03


There's only one optimization I can think of: add a vOut output parameter so that you avoid the call to new() and hence reduce garbage collection. Of course, this depends entirely on what you are doing with the returned vector, whether you need to persist it, and whether you're actually running into garbage collection problems.

answered Feb 02 '09 at 21:03 by JSmyth

Since it is a struct, GC isn't an issue. The only thing that changes is some minor stack usage points - but since it is the return value even this is negligible. – Marc Gravell Feb 02 '09 at 21:07
