Author Topic: [SOLVED] Sprite Transformation Batching (Read 16128 times)

Glocke · « **on:** January 30, 2015, 03:30:04 pm »

Hi, I'm currently using simple shapes (colored, without textures) as sprites. To speed up drawing, I add all objects' vertices to a single vertex array while culling, so the actual drawing is pretty fast.

Unfortunately, I need to transform the vertices according to each sprite's position and orientation, so I apply translate() and rotate() to the corresponding sf::Transform and call transformPoint() on each of the object's vertices. Because each object has its own transformation, I need to calculate those transformations per object, so I cannot pass a single transform as part of sf::RenderStates.
I'm already using a dirty flag to determine whether a transformation needs to be recalculated, so the whole process runs smoothly in most cases. But if many transformations need to be recalculated, it takes a lot more time.

Do you have any suggestions how to speed up those transformations? .. except multi-threading, I am trying to work single-threaded as long as possible to reduce unnecessary complexity ^^

Kind regards
Glocke

/EDIT: Some more details: Each sprite holds two sets of vertices: the original, untransformed ones and the currently transformed ones. On each recalculation, the sf::Transform object is created and applied to each vertex's position. The resulting vertices are stored as the "currently transformed" vertices, which are used for drawing.
/EDIT 2: About the number of objects... I'm currently testing with around 5k objects, which runs smoothly with clang's -O2 on my netbook (in contrast: in debug mode only up to 1.3-1.4k can be handled smoothly). But using around 10k objects turns it into a slide show ^^ Of course a netbook isn't fast, but it's good for facing performance problems early.

So I'm trying to get as much performance as I can
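For readers following along, the setup described in this post could be sketched roughly as below. This is only a minimal illustration, not the actual code: `Vec2`, `Shape`, `updateShape` and `buildBatch` are hypothetical names, and a plain struct stands in for sf::Vector2f / sf::Transform so the sketch compiles without SFML. It shows the dirty flag guarding the recalculation, the sine/cosine being computed once per object rather than per vertex, and the cached vertices being appended to one batch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal stand-in for sf::Vector2f so the sketch compiles without SFML.
struct Vec2 { float x, y; };

// Hypothetical per-object data: local-space vertices plus a cached transformed copy.
struct Shape {
    std::vector<Vec2> local;        // original, untransformed vertices
    std::vector<Vec2> transformed;  // cached result, reused while the object is clean
    Vec2 position{0.f, 0.f};
    float angleDeg = 0.f;
    bool dirty = true;

    void setPosition(Vec2 p) { position = p; dirty = true; }
    void setRotation(float deg) { angleDeg = deg; dirty = true; }
};

// Recomputes the cache only when the dirty flag is set.
// Returns true if a recalculation actually happened.
bool updateShape(Shape& s) {
    if (!s.dirty) return false;
    const float rad = s.angleDeg * 3.14159265358979f / 180.f;
    const float c = std::cos(rad);   // evaluated once per object,
    const float sn = std::sin(rad);  // not once per vertex
    s.transformed.resize(s.local.size());
    for (std::size_t i = 0; i < s.local.size(); ++i) {
        const Vec2 v = s.local[i];
        // Rotate around the local origin, then translate.
        s.transformed[i] = { c * v.x - sn * v.y + s.position.x,
                             sn * v.x + c * v.y + s.position.y };
    }
    s.dirty = false;
    return true;
}

// Appends every shape's cached vertices to one batch (the "single vertex array").
// Returns how many shapes had to be recalculated this frame.
std::size_t buildBatch(std::vector<Shape>& shapes, std::vector<Vec2>& batch) {
    batch.clear();  // size goes to zero, the allocated storage is reused
    std::size_t recalculated = 0;
    for (Shape& s : shapes) {
        if (updateShape(s)) ++recalculated;
        batch.insert(batch.end(), s.transformed.begin(), s.transformed.end());
    }
    return recalculated;
}
```

Calling `buildBatch` twice without moving anything recalculates nothing on the second call, which is the cheap path; the expensive case discussed in this thread is when nearly every shape is dirty each frame.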

eXpl0it3r · « **Reply #1 on:** January 30, 2015, 03:43:41 pm »

Profile where your actual problem is and then try to optimize around that point.

Silderan · « **Reply #2 on:** January 30, 2015, 10:23:52 pm »

For sure it's a stupid question, but... do you need to transform that many objects? Can't you keep the "dirty" flag set on objects that aren't visible and apply the transform only when needed? Or can the transform itself move an object into or out of the camera view, so that you cannot avoid applying it to all of them?

Glocke · « **Reply #3 on:** January 31, 2015, 08:23:43 am »

Quote from: Silderan on January 30, 2015, 10:23:52 pm

For sure it's a stupid question, but... do you need to transform that many objects? Can't you keep the "dirty" flag set on objects that aren't visible and apply the transform only when needed? Or can the transform itself move an object into or out of the camera view, so that you cannot avoid applying it to all of them?

To make an object move into the camera view, its position needs to be updated. Hence the transformation needs to be applied.

Hapax · « **Reply #4 on:** January 31, 2015, 04:22:40 pm »

Quote from: Glocke on January 31, 2015, 08:23:43 am

To make an object move into the camera view, its position needs to be updated. Hence the transformation needs to be applied.

Shouldn't this be a part of the physics calculations rather than the drawing calculations and therefore separated?

Glocke · « **Reply #5 on:** January 31, 2015, 05:44:52 pm »

Quote from: Hapax on January 31, 2015, 04:22:40 pm

Shouldn't this be a part of the physics calculations rather than the drawing calculations and therefore separated?

Well, the position inside the logical world and the position on the screen aren't always the same. For instance, an object's screen position on an isometric map depends on e.g. the tile size. In my opinion, the physics system shouldn't know about the tile size or the actual rendering perspective. So transforming a logical world position to a screen position is up to the graphics system.

Silderan · « **Reply #6 on:** January 31, 2015, 07:55:35 pm »

As with the other post, maybe I'll say another stupid thing, even more so having seen your amazing "Rage". But objects' positions, scaling and rotation are up to logic/physics, because graphics "just" interprets them. I think that's what Hapax's post is saying, and I agree. Of course, this is a point of view and doesn't change your need to improve the transformations to increase FPS.

This is what comes to my mind. But for sure you've already considered it and/or you won't gain much speed from it:
1. Refactor the game logic to reduce how often object transformations need updating (as you said, not possible).
2. Improve the maths/code (practically useless because of modern compiler optimizations).
3. Reduce recursion as much as possible.
4. Reduce allocations (global or local/stack).
5. A memory array (data[ i ]) is much faster than a linked list. If you have some linked list (or tree) and you MUST have it, consider creating an array of the list/tree nodes for quicker iteration.
6. Multithreading, with...
6.1 A lock around all object transformations, so the game logic must wait for it and code complexity stays low. Maybe you won't gain much speed.
6.2 Some kind of "transformation queue" without locks. Faster, but some frames may show deformed objects.
6.3 Locks per object transformation. You'll never see a deformed object, but one object may already be transformed while another one isn't.

Sorry if this doesn't help you or if my English is cryptic

BTW: Is this "problem" for Rage? Are you trying to animate all game NPCs?

Silderán.

Glocke · « **Reply #7 on:** January 31, 2015, 09:24:59 pm »

Quote from: Silderan on January 31, 2015, 07:55:35 pm

But objects' positions, scaling and rotation are up to logic/physics, because graphics "just" interprets them.

Right! But position isn't just position. Considering e.g. diamond-shaped isometric maps, a world position (1.3f, 7.2f) might be totally different from the object's screen position.

But anyway... whether the transformation of the points is moved to physics or kept in the graphics system doesn't reduce the complexity of the transformation at all :-\

Quote from: Silderan on January 31, 2015, 07:55:35 pm

5. A memory array (data[ i ]) is much faster than a linked list. If you have some linked list (or tree) and you MUST have it, consider creating an array of the list/tree nodes for quicker iteration.

Quote from: Silderan on January 31, 2015, 07:55:35 pm

BTW: Is this "problem" for Rage? Are you trying to animate all game NPCs?

Yes and no. I'm reimplementing the entire code in a more DOD way (data-oriented design) to achieve e.g. faster iteration over contiguous chunks of data. So yes, I'm currently not using any linked lists or trees holding the objects.

About the use of this many transformations: I'm currently testing some kind of performance edge case, assuming the following: each of the 1.4k objects (debug mode without optimization; release mode can handle many more objects) is currently moving, so its transformation needs to be updated. This might occur rarely in common game sessions, because having >1k enemies running around is quite insane. Non-moving objects (e.g. chests) won't cause that trouble because of the dirty flag.

I already tested this. Having 2k non-moving objects is quite smooth.

So... maybe it isn't such a problem for "Rage" - but it gained my attention, anyway ^^

Kind regards

Silderan · « **Reply #8 on:** January 31, 2015, 11:01:28 pm »

But an orthogonal-to-isometric view conversion is not a "transformation" that must be applied to all objects.

Depending on the view size and position, you should be able to determine which tiles and objects must be shown on screen using the ortho-to-iso conversion. As I understand from your posts, you apply the ortho-to-iso conversion to all tiles and objects. That's the wrong approach, IMHO.
Anyway, I'm not sure this is the problem. The main problem is the iteration over the objects, not the maths involved in applying each object's transformation... if 1K objects move, it doesn't matter whether you do just two or four math operations; the problem is the batch of 1000 objects... and you cannot avoid it.

Well, yes, you can. I have some years of experience with text MUD games, where having many objects is quite normal. The solution is to keep the number of living objects to a minimum. I can give you some tips if you want.

Silderán.

Glocke · « **Reply #9 on:** February 01, 2015, 08:57:07 am »

Quote from: Silderan on January 31, 2015, 11:01:28 pm

As I understand from your posts, you apply the ortho-to-iso conversion to all tiles and objects. That's the wrong approach, IMHO.

Just to objects, but I don't know why this is wrong from your point of view. Yeah, physics should ... but physics doesn't know anything about the actual representation. The point of decoupling systems is also to decouple and isolate domain-specific tasks.

Quote from: Silderan on January 31, 2015, 11:01:28 pm

The main problem is the iteration over the objects, not the maths involved in applying each object's transformation... if 1K objects move, it doesn't matter whether you do just two or four math operations; the problem is the batch of 1000 objects... and you cannot avoid it.

Well, without the transformations everything runs smoothly, even over those 1k objects. Each one is picked, each one's dirty flag is checked and then nothing is done (if transformations are disabled in code). And because no optimization is applied, the compiler won't optimize that away.
Yes, now the typical "profile your code and you'll see that you're wrong!" posts might appear.

But indeed, profiling with -pg doesn't give me a clue about the bottleneck. So anyway, measuring each system's elapsed time cannot be that bad after all ^^
By the way: the physics system iterates over the same number of objects in the same way (contiguous array) but with less math... Guess how much time it consumes xD
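Measuring a system's elapsed time by hand, as mentioned here, might look like the following sketch. `elapsedMs` is a hypothetical helper name, not something from the thread; it simply wraps a callable between two `std::chrono::steady_clock` readings.

```cpp
#include <chrono>

// Runs the given callable once and returns how long it took, in milliseconds.
// steady_clock is used because it never jumps backwards.
template <typename F>
double elapsedMs(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Wrapping each system's per-frame update (graphics, physics, and so on) in such a call and averaging over many frames gives a rough comparison, although single measurements on a busy machine can be noisy.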

Quote from: Silderan on January 31, 2015, 11:01:28 pm

I have some years of experience with text MUD games, where having many objects is quite normal. The solution is to keep the number of living objects to a minimum. I can give you some tips if you want.

Well, of course large numbers of objects might be common. But large numbers of moving objects are not, at least in the roleplaying genre. Of course there might be lots of objects: chests, torches, enemies' corpses... but none of them are moving. They stay or lie where they are and don't rotate or anything else. Having lots of non-moving objects inside my system doesn't slow it down the way lots of moving objects do.
So I think the best way might be reducing the number of moving objects, as well as looking for optimization possibilities.

/EDIT: But profiling gave me at least one clue: pushing back to a vector that is already large enough to contain those additional objects seems slow (at least from the profiling output's point of view). Changing this helped a bit, but not much. ... Hey, but I've tried

Iteration with matrix math is quite tough, anyway ^^
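The thread doesn't show what exactly was changed about the push_back calls. One possible variant, sketched here purely as an assumption, is to size the storage once and fill it through a write cursor, so that no per-element growth check or reallocation can happen inside the hot loop. `VertexBatch` and its members are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for a vertex so the sketch compiles without SFML.
struct Vec2 { float x, y; };

// Frame-local batch that is sized once and then filled through a write cursor.
class VertexBatch {
public:
    explicit VertexBatch(std::size_t maxVertices)
        : data_(maxVertices), used_(0) {}

    // Resets the cursor; the storage is kept for the next frame.
    void clear() { used_ = 0; }

    // Copies count vertices into the batch.
    // Returns false (and copies nothing) if they do not fit.
    bool append(const Vec2* src, std::size_t count) {
        if (used_ + count > data_.size()) return false;
        for (std::size_t i = 0; i < count; ++i) data_[used_ + i] = src[i];
        used_ += count;
        return true;
    }

    const Vec2* data() const { return data_.data(); }
    std::size_t size() const { return used_; }

private:
    std::vector<Vec2> data_;  // fixed-size storage, allocated once
    std::size_t used_;        // number of vertices written this frame
};
```

The trade-off is a fixed upper bound: the caller has to pick `maxVertices` up front and handle the case where `append` reports that the batch is full.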

grok · « **Reply #10 on:** February 01, 2015, 10:28:13 am »

Quote from: Glocke on January 31, 2015, 08:23:43 am

To make an object move into the camera view, its position needs to be updated. Hence the transformation needs to be applied.

yes, but is it really necessary to move all your objects? say, there's a big dungeon with hundreds of enemies moving; is it crucial for you to always move all of them? if you chunk the dungeon into smaller parts and update only the enemies in the current chunk (where the player is located, for example), will it hurt your gameplay? of course, if you want all the enemies to track the player (i.e. each second they approach the player's location), then this approach might not work.
It is all about considerations, after all.

Glocke · « **Reply #11 on:** February 01, 2015, 11:16:10 am »

Quote from: grok on February 01, 2015, 10:28:13 am

yes, but is it really necessary to move all your objects? say, there's a big dungeon with hundreds of enemies moving; is it crucial for you to always move all of them? if you chunk the dungeon into smaller parts and update only the enemies in the current chunk (where the player is located, for example), will it hurt your gameplay?

Well, I'm using a grid as a spatial data structure anyway, where each object is associated with its tile position. So updating transformations only for those objects might work; same for animation handling.

Quote from: grok on February 01, 2015, 10:28:13 am

of course, if you want all the enemies to track the player (i.e. each second they approach the player's location), then this approach might not work.

Well, that isn't a problem. The actual AI is decoupled into an AI system which handles all AI actions per frame. If the AI wants to move towards the player, the physics system is notified and performs the action. In particular, it moves the object to another grid cell if necessary. This grid is just queried by the graphics system. So this should work

Thanks, good idea!
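The idea agreed on here, querying the grid for the cells around the view and updating transformations only for the objects found there, could be sketched like this. `Grid`, `insert` and `query` are hypothetical names; the real grid in the project is not shown in the thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical uniform grid: each cell stores the ids of the objects on that tile.
class Grid {
public:
    Grid(int width, int height)
        : width_(width), height_(height),
          cells_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height)) {}

    // Registers an object on a tile; (x, y) must lie inside the grid.
    void insert(int objectId, int x, int y) {
        cells_[index(x, y)].push_back(objectId);
    }

    // Collects all object ids inside the inclusive cell rectangle
    // [x0..x1] x [y0..y1], clamped to the grid bounds.
    std::vector<int> query(int x0, int y0, int x1, int y1) const {
        std::vector<int> result;
        x0 = std::max(x0, 0);
        y0 = std::max(y0, 0);
        x1 = std::min(x1, width_ - 1);
        y1 = std::min(y1, height_ - 1);
        for (int y = y0; y <= y1; ++y) {
            for (int x = x0; x <= x1; ++x) {
                const std::vector<int>& cell = cells_[index(x, y)];
                result.insert(result.end(), cell.begin(), cell.end());
            }
        }
        return result;
    }

private:
    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width_)
             + static_cast<std::size_t>(x);
    }

    int width_;
    int height_;
    std::vector<std::vector<int>> cells_;
};
```

The graphics system would then call `query` with the tile rectangle covered by the camera (plus a small margin) and recalculate only the dirty objects among the returned ids; everything outside keeps its dirty flag until it comes into view.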

Silderan · « **Reply #12 on:** February 01, 2015, 11:33:53 am »

Quote from: Glocke on February 01, 2015, 08:57:07 am

Well, without the transformations everything runs smoothly, even over those 1k objects. Each one is picked, each one's dirty flag is checked and then nothing is done (if transformations are disabled in code). And because no optimization is applied, the compiler won't optimize that away.

I see... then it's indeed hard to understand where the bottleneck is. It seems that the problem is in the transformation.
Just an idea, if it makes any sense to you, of course... you talk about rotation, position and scale. Did you try using sprite->setPosition and changing animation textures instead of rotating objects? Using setPosition instead of a transformation probably won't make any difference, but rotation...

Glocke · « **Reply #13 on:** February 01, 2015, 11:41:48 am »

Quote from: Silderan on February 01, 2015, 11:33:53 am

Did you try using sprite->setPosition and changing animation textures instead of rotating objects? Using setPosition instead of a transformation probably won't make any difference, but rotation...

No, I'm pushing all objects' vertices (as mentioned: only colored, not textured) to a huge vertex array while drawing. So I need to apply the transformations per vertex before drawing.

Of course this won't work with animation textures ^^ When extending the code for animation, I'll use sprites again - as well as their setPosition() etc.

grok · « **Reply #14 on:** February 02, 2015, 07:55:24 am »

you store all objects' vertices in a big vertex array on each iteration; wouldn't that be expensive?
i.e. I mean there's no need to draw all objects if only a few of them are in the current view region/visible part of the game.
or am I missing something?
how many objects do you draw on average: 10, 100, 1000? if you draw them as a single vertex array, they must all be the same, i.e. share the same texture, is that so? otherwise there will be many "renderStates.texture = ..." operations, which have a big impact on performance (swapping textures is not "free").

anyway, I believe that

Quote

pushing all objects' vertices (as mentioned: only colored, not textured) to a huge vertex array while drawing.

is not a good idea. it is unrelated to the purpose of drawing IMHO. now you get a lot of problems from performing operations on a vector (when the vector grows, it has a dramatic impact on performance, i.e. it needs to allocate a new chunk of memory big enough to hold the new data, then it must move all the existing items over, append the newly created ones at the end, etc).
hmm, do you recreate the vector on each draw iteration or reuse the existing one?

have you tried to do "vector.reserve(...)"?
you might benchmark/measure/output the vector's memory-related characteristics during the game and see what is going on with it internally. you might be interested in checking its ".size()" and ".capacity()".
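a small sketch of the kind of check suggested above: it counts how often a std::vector reallocates while elements are pushed back, with and without a prior reserve(). the function name `countReallocations` is made up for this illustration; only `reserve()`, `push_back()` and `capacity()` are real std::vector members.

```cpp
#include <cstddef>
#include <vector>

// Counts how often a vector reallocates while n elements are pushed back.
// A change in capacity() after a push_back means new storage was allocated
// and the existing elements were moved over.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst) v.reserve(n);  // one allocation up front
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

with reserve() the count stays at zero; without it the exact number depends on the standard library's growth factor. a vector that is reused across frames and only clear()ed keeps its storage, so the growth cost is paid during the first frames only.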
