I already have a simple tweening class, but I'm still not 100% happy with it, plus I want more complex tweening (like VertexArrays or Colors) before releasing or suggesting anything.
It can essentially be used for creating Bézier curves as well, so it's not that far off-topic, but it's only time-dependent so far.
Usage would be something like this:
sfx::Tweener<float> x(0, 10, sf::seconds(5), sfx::TweenRepeat | sfx::TweenSine);
x can then be used just like any float variable, but it will change its value "on its own" when retrieved (from 0 to 10, following a sine curve). The same can be done with pretty much any data type, like sf::Vector2f and others.
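To illustrate the idea, here is a minimal sketch of such a time-based tweener. The class name and flags above belong to the unreleased sfx code; everything below, including the use of std::chrono instead of sf::Time, is my own assumption to keep the sketch self-contained:

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>

// Minimal sketch of the idea: the value is computed lazily from the
// elapsed time whenever it is read.
template <typename T>
class Tweener
{
public:
    Tweener(T from, T to, std::chrono::duration<float> duration)
    : m_from(from), m_to(to), m_duration(duration),
      m_start(std::chrono::steady_clock::now())
    {
    }

    // Implicit conversion lets the tweener be used like a plain T.
    operator T() const
    {
        float elapsed = std::chrono::duration<float>(
            std::chrono::steady_clock::now() - m_start).count();
        float progress = std::min(elapsed / m_duration.count(), 1.f);

        // Sine easing, comparable to the sfx::TweenSine flag above.
        float eased = std::sin(progress * 3.14159265f / 2.f);
        return static_cast<T>(m_from + (m_to - m_from) * eased);
    }

private:
    T m_from, m_to;
    std::chrono::duration<float> m_duration;
    std::chrono::steady_clock::time_point m_start;
};
```

Reading the value, e.g. float f = x;, goes through the conversion operator, which is what makes the tweener behave like a plain float.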
Thor's animations (http://www.bromeon.ch/libraries/thor/v2.0/doc/group___animation.html) are not time-based; only the front-end thor::Animator is. If you read a bit further (http://en.sfml-dev.org/forums/index.php?topic=4646.msg34963#msg34963), you'll see I proposed a std::function<float(float)>, which is very generic. Maybe one could build something on top of that.
In Thor, the concept of an animation is as follows:
void animation(Animated& object, float progress);
This is quite generic, as it allows you to animate an object using an arbitrary function that takes a float in [0,1]. You can use functions, functors, std::bind, or lambda expressions. Depending on what you need, you can implement an intermediate layer that transforms one progress range into another. For example, playing an animation in reverse can be done in a single line:
auto anim = ...; // animation function satisfying the concept
auto reverse = [anim] (sf::Sprite& s, float pr) { return anim(s, 1.f - pr); };
You might also want to read my design considerations (http://en.sfml-dev.org/forums/index.php?topic=7329.msg53237#msg53237). Maybe this can inspire you for an API...
I'd find that surprising. I tend to assume people who write compilers are smarter than me.
I don't know if it can be optimized. The second argument is a double, so the function has to apply a generic algorithm which is much more complex than p * p * p. So unless pow maps directly to a built-in processor instruction, I don't know how your compiler would be able to make it as optimized as the explicit version.
But that's interesting, generate the ASM listing and compare ;)
double foo( double p )
{
return std::pow( p, 10 );
}
double bar( double p )
{
return p * p * p * p * p * p * p * p * p * p;
}
foo(double):
movapd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm1, %xmm0
mulsd %xmm1, %xmm0
mulsd %xmm0, %xmm0
ret
bar(double):
movapd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm0, %xmm1
movapd %xmm1, %xmm0
ret
This was generated by GCC with -O2 -march=native. It seems the compiler is more intelligent than you might expect ;). The explicit multiplication generates as many machine operations as there are multiplications in the source. The std::pow() variant tells the compiler that the base stays the same, so it can reuse the results of previous multiplications and reduce the number of machine operations required. Explicit multiplication is O(k), whereas std::pow() is O(log k) for p ^ k.
It groups
p * p * p * p * p * p * p * p * p * p to
( ( ( p * p ) * p ) * ( p * p ) ) * ( ( ( p * p ) * p ) * ( p * p ) ) which is
( ( ( p ^ 2 ) * p ) * p ^ 2 ) ^ 2
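The grouping above is an instance of exponentiation by squaring. A plain C++ sketch of the general scheme (the idea, not GCC's actual code) looks like this:

```cpp
// Exponentiation by squaring: the same O(log k) scheme the compiler
// applies to std::pow(p, 10) above.
double ipow(double base, unsigned exp)
{
    double result = 1.0;
    while (exp > 0)
    {
        if (exp & 1)          // odd exponent: multiply the result in
            result *= base;
        base *= base;         // square the base
        exp >>= 1;            // halve the exponent
    }
    return result;
}
```

For k = 10 this generic loop needs 5 multiplications instead of 9; the compiler's hand-picked grouping above even gets away with 4.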
Maybe "intelligent" isn't quite the right word; "less conservative" is more accurate, as explained here (http://stackoverflow.com/questions/6430448/why-doesnt-gcc-optimize-aaaaaa-to-aaaaaa). With -ffast-math, both functions produce identical asm output (the shorter variant).
In GCC it maps to __builtin_pow, which the compiler knows it should optimize.
Uhm, no. GCC doesn't optimize __builtin_pow nearly as much as it optimizes __builtin_powi. In particular,
double foo(double p)
{
return __builtin_pow(p, 10);
}
double bar(double p)
{
return __builtin_powi(p, 10);
}
compiles down on my machine to the following assembly:
.text
.p2align 4,,15
.globl _Z3food
.def _Z3food; .scl 2; .type 32; .endef
.seh_proc _Z3food
_Z3food:
.LFB0:
.seh_endprologue
movsd .LC0(%rip), %xmm1
jmp pow
.seh_endproc
.p2align 4,,15
.globl _Z3bard
.def _Z3bard; .scl 2; .type 32; .endef
.seh_proc _Z3bard
_Z3bard:
.LFB1:
.seh_endprologue
movapd %xmm0, %xmm1
mulsd %xmm0, %xmm1
mulsd %xmm1, %xmm0
mulsd %xmm1, %xmm0
mulsd %xmm0, %xmm0
ret
.seh_endproc
What may be interesting is that this optimization takes place only when C++11 mode is disabled. With -std=c++11, or even -std=gnu++11, for
double qux(double p)
{
return std::pow(p, 10);
}
GCC 4.8.2 20130701 generates a library call:
.p2align 4,,15
.globl _Z3quxd
.def _Z3quxd; .scl 2; .type 32; .endef
.seh_proc _Z3quxd
_Z3quxd:
.LFB258:
.seh_endprologue
movsd .LC0(%rip), %xmm1
jmp pow
.seh_endproc
This is very interesting, but I still wonder how the compiler does it. It's probably not the effect of a generic optimization rule, but rather a hard-coded one (std::pow + integer argument = this specific code).
It's been a while since I used C++, but couldn't something like that be templated?
// Fixed-up syntax: recursion on a template parameter, with an explicit
// specialization for the base case.
template <int n>
float _pow(float x)
{
    return x * _pow<n - 1>(x);
}

template <>
float _pow<0>(float)
{
    return 1.f;
}

// A std::pow(float, int) overload can't simply forward to this, though:
// a template argument must be a compile-time constant, while std::pow
// takes its exponent at run time.
template <int n>
float pow_fixed(float x)
{
    return _pow<n>(x);
}
If that works (or something similar; I don't remember the exact syntax), maybe it's not a compiler optimization at all, but rather a std:: optimization.
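For what it's worth, C++11 offers constexpr as a tidier way to express the same compile-time idea than template recursion. This is a sketch of what such a std:: optimization could look like, not what libstdc++ actually does (the assembly above suggests std::pow stays a plain library call):

```cpp
// Compile-time integer power via C++11 constexpr. The name pow_constexpr
// is made up for this sketch.
constexpr float pow_constexpr(float x, int n)
{
    return n == 0 ? 1.f : x * pow_constexpr(x, n - 1);
}

// Evaluated entirely at compile time when the arguments are constants:
static_assert(pow_constexpr(2.f, 10) == 1024.f, "2^10");
```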