This weekend I spent some time looking for performance boosts, and one of the things I tested was our vec2/3/4 classes. First off, we have our own; we are not using the ones provided by SFML. We only use SFML to open a window and manage input (atm). But since the classes look alike, I was left wondering why Vec3 and Vec2 are not declared inline, and, even further down, why some huge Unicode functions are inline?
Below are some of the different ways I tested …
Initial Code: Overload outside of the class
template <typename T>
class Vec2
{
public:
...
};
template <typename T>
Vec2<T> operator* (const Vec2<T>& v1, const T& v2);
...
template <typename T>
inline Vec2<T> operator* (const Vec2<T>& v1, const T& v2)
{
// free function: go through v1 (assumes the member v is accessible here)
return Vec2<T>(v1.v[0]*v2, v1.v[1]*v2);
}
Improvement 1: Overload inside the class
template <typename T>
class Vec2
{
public:
...
Vec2<T> operator* (const T& v2);
...
};
...
template <typename T>
Vec2<T> Vec2<T>::operator* (const T& v2)
{
return Vec2<T>(v[0]*v2, v[1]*v2);
}
Improvement 2: Inline declarations
template <typename T>
class Vec2
{
public:
...
Vec2<T> operator* (const T& v2);
...
};
...
template <typename T>
inline Vec2<T> Vec2<T>::operator* (const T& v2)
{
return Vec2<T>(v[0]*v2, v[1]*v2);
}
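For anyone who wants to try improvement 2 on their own, here is a minimal sketch that should compile as-is. Only the member name `v` and the operator come from the snippets above; the constructor, the `const` qualifier on the operator and the `scaled_example` helper are assumptions added for illustration. Keep in mind that templates can already live in headers, so the `inline` keyword here mostly acts as a hint to the optimizer, and its effect may differ between compilers and optimization flags.

```cpp
#include <cassert>

// Minimal stand-in for the Vec2 class discussed above.
// Assumption: the components live in a public array named v.
template <typename T>
class Vec2
{
public:
    Vec2(T x, T y) { v[0] = x; v[1] = y; }

    // Member overload, declared in the class (improvement 1).
    // const-qualified (an addition) so it also works on const vectors.
    Vec2<T> operator* (const T& v2) const;

    T v[2];
};

// Definition marked inline (improvement 2).
template <typename T>
inline Vec2<T> Vec2<T>::operator* (const T& v2) const
{
    return Vec2<T>(v[0]*v2, v[1]*v2);
}

// Hypothetical helper that exercises the operator.
inline Vec2<float> scaled_example()
{
    const Vec2<float> a(1.5f, -2.0f);
    return a * 2.0f;
}
```

Calling `scaled_example()` multiplies both components of (1.5, -2.0) by 2.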
- The actual test involved drawing 125k cubes and animating them, which led to roughly 8 million vec2/3 operations per frame. The test was repeated 100 (frames) * 2 (Linux, Windows) * 3 different PCs …
- I tried a lot more than just the 2 above, but those stood out and they did not break anything (in our engine)
- Only tested with gcc, on both Linux and Windows …
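The real test ran inside the engine, but a rough stand-alone timing loop gives an idea of how one variant could be measured in isolation. This is only a sketch: `time_multiplications`, the `sink` parameter and the simplified `Vec2` below are hypothetical names, not the engine's code, and a micro-benchmark like this will not reproduce the frame-based numbers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Simplified stand-in for the engine's Vec2 (assumption: public array v).
template <typename T>
struct Vec2
{
    Vec2(T x, T y) { v[0] = x; v[1] = y; }
    Vec2<T> operator* (const T& s) const { return Vec2<T>(v[0]*s, v[1]*s); }
    T v[2];
};

// Hypothetical helper: performs `count` scalar multiplications and returns
// the elapsed wall-clock time in milliseconds. The result is written to
// `sink` so the compiler cannot discard the loop as unused work.
inline double time_multiplications(std::size_t count, float& sink)
{
    Vec2<float> acc(1.0f, 1.0f);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i)
        acc = acc * 1.0000001f;
    const auto stop = std::chrono::steady_clock::now();
    sink = acc.v[0] + acc.v[1];
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Passing 8000000 as `count` mimics the per-frame operation count mentioned above; swapping in another operator variant and comparing the returned times with the same compiler flags is the idea.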
So here are the results I got:
Initial Code = 100%
Improvement 1 = 126.23% (26% faster)
Improvement 2 = 277.60% (177% faster)
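To make the percentages easier to read, here is how such numbers can be derived. The post does not say what was measured, so treating them as a ratio of frame times is an assumption, and both function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: relative performance in percent, where the
// baseline (initial code) is 100. Assumes both arguments are frame
// times in the same unit; a shorter candidate time gives a value > 100.
inline double relative_performance(double baseline_ms, double candidate_ms)
{
    return baseline_ms / candidate_ms * 100.0;
}

// "X% faster" is simply the relative performance minus the 100% baseline.
inline double percent_faster(double relative_percent)
{
    return relative_percent - 100.0;
}
```

With this reading, a relative performance of 277.60% corresponds to the "177% faster" quoted for improvement 2.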
The classes in SFML kinda look like our initial code, so maybe it would be a boost for SFML as well? Is there a specific reason SFML is overloading outside of the class? Something I might have missed?