Author Topic: [Solved] Design Question: Memory usage vs Run-time speed. (Read 6657 times)

masskiller · « **on:** November 28, 2012, 01:18:00 am »

So in my game I have a class that holds a huge amount of functionality (I would not be exaggerating if I said that my whole game is trash without it). Even though it's a heavy class, I use it often because I need to, and I think twice before increasing its size in memory.

My class has x bullets, a vector of polar vectors (of size x) and a whole lot of control variables. I need to keep track of the original position of my bullets. I can either add a new vector that contains the starting positions to each instance of the class (which would make it heavier than it already is), or constantly convert my vector of polar vectors into normal 2-dimensional vectors (but I am not sure how bad that would be at run-time).

I've done a lot of refactoring of crappy code lately and I don't want to do it again at all if possible so I ask here before I program anything. I also have to take into account that the class is something that can grow much greater than it already is (function-wise), so I'd like to refine my API as much as I can so that I don't go through a bad time when refactoring (if needed) like I just did. And the API may change according to the choice I go with.

I can post some code and explain my hierarchy further if needed, just ask.

ichineko · « **Reply #1 on:** November 28, 2012, 02:23:31 am »

Without more information it's all just stabbing in the dark. What exactly are your concerns? Are you experiencing performance problems now?

I mean, there are general things that can be said:

- Object allocations tend to be a lot more expensive than most developers realize.
- Many times it isn't the amount of memory that matters, it's your access patterns to said memory.

But that's all mental masturbation without something concrete to talk about.

masskiller · « **Reply #2 on:** November 28, 2012, 03:45:09 am »

Quote

Are you experiencing performance problems now?

Not yet, it's precisely what I want to avoid.

My class BulletPlacer has these fields:

class BulletPlacer : public sf::Drawable
{
private:
        unsigned Amount, Min;

        enum Patterns {Circular = 1, Flower, Spiral, Wave }; ///More to come. Each shape has many ways to be shot.

        std::vector<Bullet> ShotVec;
        std::vector<PolarVector> PolVecV;

        TextureHolder& T; ///A reference to my texture container.
        BulletTypes B; ///An enum field
        Color C; ///Another enum field.
        Patterns P; ///By now you can surely guess.  ;)

        ///Control stuff.
        Chronometer CT;
        sf::Time Tick;

        bool first, timecheck, paint;
        static bool control;
};
 

As can be seen, the object is as big as the number of bullets it holds. My uses range from something as simple as 5 bullets to as many as 500 (could be more, that's just an estimated maximum); it pretty much depends on the bullet patterns I require for my game. This is a sort of "primitive" class that is used by my Spell class:

class Spell
{
protected:
        Difficulty DifLevel; ///Enum field.

        sf::Vector2f XPos;
        std::vector<BulletPlacer> BP;
};
 

Which acts as a base class for FinalSpell and CommonSpell, FinalSpell being complex patterns and CommonSpell working as a smaller type of wrapper for small and repeatable stuff.

I use Polar Vectors for position calculation of spirals, flowers and almost any nice polar function that can be used for my bullet hell shooter, I even use them for linear movement. I calculate everything in polar and then convert to positioning coordinates.

I have the choice of adding another std::vector with common 2D vectors (starting positions of bullets) to my BulletPlacer class.

Under normal circumstances I will have only one FinalSpell instance that gets reused often and a bunch of CommonSpells for easy stuff.

My other choice is to just use my PolarVector convert functions to calculate the 2D start vectors whenever I need them (I don't always need the starting positions) instead of having them there whether I need them or not.
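For reference, a minimal sketch of what converting on demand could look like. The thread never shows PolarVector or its convert functions, so `PolarVec`, `Vec2` and `toCartesian` below are hypothetical stand-ins, and the angle is assumed to be in radians:

```cpp
#include <cmath>

// Hypothetical stand-in for a plain 2D vector (not the thread's real type).
struct Vec2 { float x; float y; };

// Hypothetical stand-in for the thread's PolarVector.
struct PolarVec {
    float r;      // distance from the origin
    float theta;  // angle in radians (an assumption)
};

// Convert a polar offset into a position relative to `origin`.
// Cost per call: one cosine, one sine, two multiplies, two adds.
inline Vec2 toCartesian(const PolarVec& p, const Vec2& origin) {
    return Vec2{ origin.x + p.r * std::cos(p.theta),
                 origin.y + p.r * std::sin(p.theta) };
}
```

Calling this only when a start position is actually needed trades a sine and a cosine per bullet for not storing an extra 2D vector per bullet.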

I am fond of the latter. The problem is that it might cause timing issues with bigger BulletPlacers that get calculated at run-time (of course I can and will use threads, but I still want to avoid any possible problems). So in the end, picking one means sacrificing the other. If I go for the memory-saving solution I might get timing issues, and if I go for the time-saving solution I will sometimes have unnecessarily big copies.

In summary, it's a question of run-time performance vs. memory efficiency.

I haven't tested either approach because either would make me go through some serious refactoring in some functions. I want to finish refactoring it once and for all and worry not about having to remake one of the cores of my engine.

eigenbom · « **Reply #3 on:** November 28, 2012, 05:48:02 am »

Do a quick calculation of the memory required and see if that fits your target machine. Fwiw I allocate around 200 MB of RAM on startup of my game.
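A rough sketch of that kind of back-of-the-envelope estimate. The `BulletData` layout is an assumption made up for illustration (the real Bullet class is not shown in the thread), and `bytesFor` is a hypothetical helper:

```cpp
#include <cstddef>

// Hypothetical per-bullet payload: polar state plus a cached start position.
struct BulletData {
    float r, theta;        // polar vector
    float startX, startY;  // cached start position
};

// Bytes needed to hold `count` elements of type T back to back.
template <typename T>
constexpr std::size_t bytesFor(std::size_t count) {
    return count * sizeof(T);
}
```

On a typical platform where `float` is 4 bytes and the struct has no padding, `sizeof(BulletData)` is 16, so `bytesFor<BulletData>(500)` for a 500-bullet placer comes to about 8 KB.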

I think caching where possible is good, but never important until you actually notice that your performance is suffering. Over the past year I've had to refactor large parts of moonman over and over because I hit a wall and had to rethink how I was doing things.

Personally if I was making that bullet shooter, I'd just use a super large static array to store all the possible bullets. Even say 64,000 is still not that much, if each bullet only has a position and colour, or whatever. Maybe then use a circular buffer acting on that array, then cap the max life of a bullet so you can always move the start pointer to chase the end pointer. Everything being so close in memory means that you can iterate over them really fast, so it doesn't matter if there are holes.
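A minimal sketch of that idea, under assumptions: the names `PooledBullet` and `BulletRing` are made up, a bullet only carries a position and an age, and bullets that die early (the "holes") are not modelled here.

```cpp
#include <array>
#include <cstddef>

// Hypothetical minimal bullet: position plus time alive.
struct PooledBullet {
    float x;
    float y;
    float age;  // seconds since spawn
};

// Fixed-capacity ring: new bullets go in at the tail, the oldest sit at the head.
template <std::size_t Capacity>
class BulletRing {
public:
    // Spawn at the tail. If the ring is full, the oldest bullet is overwritten.
    void spawn(float x, float y) {
        if (count_ == Capacity) {
            head_ = (head_ + 1) % Capacity;
            --count_;
        }
        const std::size_t tail = (head_ + count_) % Capacity;
        slots_[tail] = PooledBullet{x, y, 0.f};
        ++count_;
    }

    // Age every live bullet, then let the head chase the tail past expired ones.
    void update(float dt, float maxLife) {
        for (std::size_t i = 0; i < count_; ++i)
            slots_[(head_ + i) % Capacity].age += dt;
        while (count_ > 0 && slots_[head_].age >= maxLife) {
            head_ = (head_ + 1) % Capacity;
            --count_;
        }
    }

    std::size_t size() const { return count_; }

    // i-th live bullet, oldest first.
    const PooledBullet& at(std::size_t i) const {
        return slots_[(head_ + i) % Capacity];
    }

private:
    std::array<PooledBullet, Capacity> slots_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

Because bullets are stored in spawn order and all share the same maximum life, the oldest bullet is always at the head, which is why retiring only from the head is enough in this sketch.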

masskiller · « **Reply #4 on:** November 28, 2012, 06:38:40 am »

Quote

Personally if I was making that bullet shooter, I'd just use a super large static array to store all the possible bullets. Even say 64,000 is still not that much, if each bullet only has a position and colour, or whatever. Maybe then use a circular buffer acting on that array, then cap the max life of a bullet so you can always move the start pointer to chase the end pointer. Everything being so close in memory means that you can iterate over them really fast, so it doesn't matter if there are holes.

I began with a similar idea, and then hit a wall and had to change it (luckily I was just starting by then). When calculating good-looking shapes and patterns having control of indexes becomes a pain and in the end it was far more controllable to have many small ones than a big one that holds all functionality.

Also, these kinds of games make you think there are more bullets than there really are on screen. Even on the hardest difficulties you can have around 4000 on screen tops, so you can reuse a lot if you wish to. My current API goes for that. The API also became very complex when I used just one container for all bullets.

Quote

Do a quick calculation of the memory required and see if that fits your target machine.

After a quick brush up I realized the memory taken by the vectors isn't much. My target machine is a low-end one like mine. If it runs on this old jar then it can run well on most computers. I thought of something (or rather remembered I had implemented it months ago but never used it).

I just need to keep the start position within each bullet so that I don't need a new container for them. That way I can perfectly reuse bullets that went out of the visible frame by setting them back to their original position, as well as calculate displacement from start point so that I can do crazy things with patterns later on.
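A minimal sketch of that layout, assuming a made-up `BulletState` type (the real Bullet class is not shown in the thread):

```cpp
// Hypothetical plain 2D point.
struct Point2 { float x; float y; };

// Hypothetical bullet that remembers where it was spawned.
struct BulletState {
    Point2 start;     // spawn position, stored once
    Point2 position;  // current position

    // Put the bullet back at its spawn point so it can be reused.
    void reset() { position = start; }

    // Offset from the spawn point to the current position.
    Point2 displacement() const {
        return Point2{ position.x - start.x, position.y - start.y };
    }
};
```

The extra cost is one 2D vector per bullet, and no second container has to be kept in sync with the bullet container.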

What's more, this way I barely need to change my code. I wish I had thought of it earlier. True, I will use more memory this way, but after some thought it is still far more worth it than lag at run-time.

eigenbom · « **Reply #5 on:** November 28, 2012, 07:20:25 am »

can't wait to see those bullets fly!

Nexus · « **Reply #6 on:** November 28, 2012, 01:22:42 pm »

How many bullet placers will there be? Because "even 500 bullets" is not much, computation shouldn't be a problem. Don't make your life unnecessarily complicated with threads, see if optimizations are necessary at all.

Quote from: masskiller on November 28, 2012, 01:18:00 am

My class has an x amount of bullets, a vector of polar vectors (with size x) and a whole lot of control variables.

Why aren't these vectors stored in the bullets?

Quote from: eigenbom on November 28, 2012, 05:48:02 am

Personally if I was making that bullet shooter, I'd just use a super large static array to store all the possible bullets. Even say 64,000 is still not that much, if each bullet only has a position and colour, or whatever. Maybe then use a circular buffer acting on that array, then cap the max life of a bullet so you can always move the start pointer to chase the end pointer. Everything being so close in memory means that you can iterate over them really fast, so it doesn't matter if there are holes.

Why not simply a std::vector? Reallocations happen rarely, and you can even prevent them with reserve(). Elements can be removed with the swap()-and-pop_back() idiom in constant time. This makes your whole logic much easier: You don't have invalid bullets (holes) which you have to case-differentiate, you don't waste memory, you can easily count the current number of bullets, and so on. And probably, in the end it won't even be slow.
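A minimal sketch of the swap()-and-pop_back() idiom mentioned above. `SimpleBullet` and `swapAndPop` are hypothetical names, and the idiom only fits when the order of elements does not matter:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal bullet, just enough to tell elements apart.
struct SimpleBullet { int id; };

// Remove the element at `index` in constant time by swapping it with the
// last element and popping the back. Element order is not preserved.
template <typename T>
void swapAndPop(std::vector<T>& items, std::size_t index) {
    assert(index < items.size());
    std::swap(items[index], items.back());
    items.pop_back();
}
```

Used together with `reserve()`, for example `bullets.reserve(500)` up front, pushes and removals stay below the reserved capacity and trigger no reallocation.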

I have the impression that you are both doing premature optimization here. Keep things simple and generic, so you can change the system with little effort. If you come up with a very complex, specific system that you think is very fast, it may happen that 1. there will never be an issue at all, so your effort was for nothing, 2. there may be other issues to which you cannot easily adapt your system because it is specifically optimized, and 3. debugging and maintenance become a real pain.

masskiller · « **Reply #7 on:** November 28, 2012, 03:23:57 pm »

Quote

Why aren't these vectors stored in the bullets?

This is precisely what I thought about yesterday. It's far more efficient and controllable that way.

Quote

How many bullet placers will there be? Because "even 500 bullets" is not much, computation shouldn't be a problem. Don't make your life unnecessarily complicated with threads, see if optimizations are necessary at all.

It depends: their number rises with the difficulty and the pattern in question, so some bullet patterns may be hard even with a small number of bullets while others will use a large number of bullets for a specific visual effect. It's still not much in the end, but when using different bullet types I have to be aware that the drawing will be slower due to drawing different textures at the same time.

In one of my old test cases I had 1600 bullets on screen with four different textures, that got me a lag of 10-20 FPS. It doesn't happen when using only one texture, so I know that when trying to get things colorful there's an invisible limit to the amount of bullets I can use. That's why I planned to use threads for calculations, sound, and anything else, but that will come later, right now I need to finish my refactoring and start with enemy animations.

Quote

can't wait to see those bullets fly!

They already fly, just that you can't see an enemy shooting them yet. I guess I'll post a showcase video in the project thread once I have it done.

gyscos · « **Reply #8 on:** November 28, 2012, 05:28:18 pm »

Quote from: masskiller on November 28, 2012, 03:23:57 pm

It doesn't happen when using only one texture, so I know that when trying to get things colorful there's an invisible limit to the amount of bullets I can use.

Use a tileset.

Put all bullets in one big texture, so you'll be able to have both color variety and large bullet counts.

Nexus · « **Reply #9 on:** November 28, 2012, 06:30:52 pm »

Quote from: masskiller on November 28, 2012, 03:23:57 pm

In one of my old test cases I had 1600 bullets on screen with four different textures, that got me a lag of 10-20 FPS. It doesn't happen when using only one texture, so I know that when trying to get things colorful there's an invisible limit to the amount of bullets I can use.

Okay, but then clearly the rendering is the bottleneck, and not the bullet data structure or the vector calculations. The conclusion is therefore to optimize the rendering part, and not everything. I doubt that threads will bring a significant change if everything else is fast. But they will for sure complicate your code.

Do not use sf::Sprite, but sf::VertexArray to draw the bullets. Make sure you draw all bullets with the same texture at once. Like this, you can minimize the number of draw calls and texture switches (expensive). And gyscos's idea is also a good one: If you can combine the bullets in a single file, that will be faster, too.
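A library-free sketch of what building such a batch involves, so it can be read without SFML at hand. All names here (`QuadVertex`, `AtlasRect`, `BatchBullet`, `buildQuads`) are hypothetical, and rotation and scaling are left out. In SFML 2.x each entry would presumably become an sf::Vertex (position plus texCoords) in an sf::VertexArray of type sf::Quads, drawn once with the shared texture:

```cpp
#include <cstddef>
#include <vector>

// Plain stand-in for a textured vertex: screen position + texture coordinates.
struct QuadVertex { float x, y, u, v; };

// Sub-rectangle of the shared bullet texture, in pixels.
struct AtlasRect { float left, top, width, height; };

// Hypothetical bullet as seen by the renderer.
struct BatchBullet {
    float x, y;      // centre position on screen
    AtlasRect rect;  // which image of the shared texture to show
};

// Rebuild the vertex list: four corners per bullet, centred on its position.
inline void buildQuads(const std::vector<BatchBullet>& bullets,
                       std::vector<QuadVertex>& out) {
    out.clear();
    out.reserve(bullets.size() * 4);
    for (const BatchBullet& b : bullets) {
        const float hw = b.rect.width * 0.5f;
        const float hh = b.rect.height * 0.5f;
        const float left = b.rect.left;
        const float top = b.rect.top;
        const float right = left + b.rect.width;
        const float bottom = top + b.rect.height;
        out.push_back({b.x - hw, b.y - hh, left, top});      // top-left
        out.push_back({b.x + hw, b.y - hh, right, top});     // top-right
        out.push_back({b.x + hw, b.y + hh, right, bottom});  // bottom-right
        out.push_back({b.x - hw, b.y + hh, left, bottom});   // bottom-left
    }
}
```

The per-vertex work is a handful of additions, and the whole list can then go to the GPU with a single texture bind and a single draw call.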

masskiller · « **Reply #10 on:** November 28, 2012, 07:02:59 pm »

Quote

Do not use sf::Sprite, but sf::VertexArray to draw the bullets. Make sure you draw all bullets with the same texture at once. Like this, you can minimize the amount of draw calls and texture switches (expensive).

That was suggested to me once. I didn't even understand vertices back then, so I didn't use it. Now I do, but as far as I know VertexArrays are good for setting tilemaps and things that don't get transformed. My positions are calculated individually (the same will happen with collisions once I get there), so I don't know if in the end the solution will end up being more trouble than it's worth.

I was thinking, however, of drawing everything into a RenderTexture and then drawing that to the window. I just thought of it so I haven't tested it out yet, and a possible problem with that might be applying shaders to only some of the bullets.

I think I can apply the shader to the bullets before drawing to the texture, but I am unsure of whether it will work as expected or not, since it depends on the shader that gets used. That way I get to draw only one texture and I still have easy control over the transformation of each bullet.

I thought of a hack for the VertexArray approach, but it turns into a very complicated solution. Say there are 2000 bullets: I'd have to handle the position of all 4 vertices in the quad of each bullet according to that bullet's global position and the size of the texture it's currently using.

That would mean at least 8000 calculations per frame in that case, as opposed to nearly zero in my current case. I calculate almost everything in advance, before it has to be used, meaning almost nothing gets calculated during the play state. This becomes an unbelievably useless optimization when I use a smaller number of bullets (which is the case most of the time), since the rendering time is fairly acceptable. I had already fixed most of my FPS problems a while ago.

Quote

And gyscos's idea is also a good one: If you can combine the bullets in a single file, that will be faster, too.

I have them all in the same file already, which is something that I recently improved from my old code. The 1600 test case is an old one and I haven't made a new one yet. So I can't say that I still have the same performance I had before.

Nexus · « **Reply #11 on:** November 28, 2012, 07:37:28 pm »

Quote from: masskiller on November 28, 2012, 07:02:59 pm

but as far as I know VertexArrays are good for setting tilemaps and things that don't get transformed

That is not their only application field, just a good example. I use vertex arrays in Thor to draw particles.

Quote from: masskiller on November 28, 2012, 07:02:59 pm

so I don't know if in the end the solution will end up being more trouble than it's worth.

And with threads, this problem doesn't exist? Of course you don't need to use vertex arrays if performance is okay. But if your graphics are at the limit and you have already done everything you can with the high-level API (sprites with the same texture are drawn together, maybe even using multiple texture rects in a single texture), then vertex arrays are the next step to look at.

Quote from: masskiller on November 28, 2012, 07:02:59 pm

I was thinking, however, of drawing everything into a RenderTexture and then drawing that to the window.

So you draw everything twice instead of once. I'm sure this will improve performance

Quote from: masskiller on November 28, 2012, 07:02:59 pm

That way I get to draw only one texture

Wrong, you draw all the bullets + 1 render texture. What do you expect sf::RenderTexture::draw() to do?

Quote from: masskiller on November 28, 2012, 07:02:59 pm

That would mean at least 8000 calculations per frame in that case

Which are completely irrelevant if the bottleneck is the number of draw calls and texture switches. And how do you think sf::Sprite computes its vertices? With magic?

Quote from: masskiller on November 28, 2012, 07:02:59 pm

I had already fixed most of my FPS problems a while ago. [...] So I can't say that I still have the same performance I had before.

You just spoke about 10-20 FPS. Can you please explain exactly what the status quo is and which problems you actually face?

masskiller · « **Reply #12 on:** November 28, 2012, 08:08:02 pm »

Quote

So you draw everything twice instead of once. I'm sure this will improve performance

I kind of expected this one, I just posted the idea to see if I was right or not.

Quote

You just spoke about 10-20 FPS. Can you please explain exactly what the status quo is and which problems you actually face?

My bad, those were the results from the last big test. I just posted them without thinking of my new context. Granted, I do still get an FPS drop, just not as big as before, and only in cases that can easily be avoided with other visual tricks.

Quote

Which are completely irrelevant if the bottleneck is the number of draw calls and texture switches. And how do you think sf::Sprite computes its vertices? With magic?

I would probably go for it if not for the fact that I want to get results faster. I will leave on an LDS mission soon and my project will stay untouched for two years. I want to at least finish a level before I go.

Nexus · « **Reply #13 on:** November 28, 2012, 08:26:22 pm »

Yes, as stated you don't need sf::VertexArray if sf::Sprite renders fast enough.

If you leave for so long, you shouldn't bother with things like threads or custom data structures for bullets; this will only delay development. Take the easy approaches and design interfaces between the parts of your application, so that you could easily exchange or improve a component without refactoring the rest of your code.
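One possible shape for such an interface, as a sketch only. `BulletStore`, `VectorBulletStore` and `spawnRing` are hypothetical names that do not come from the thread's code:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal bullet.
struct StoredBullet { float x, y; };

// The rest of the game only talks to this interface.
class BulletStore {
public:
    virtual ~BulletStore() = default;
    virtual void add(const StoredBullet& b) = 0;
    virtual std::size_t count() const = 0;
};

// Simple first implementation backed by std::vector. It could later be
// replaced by a pooled or ring-based store without touching the callers.
class VectorBulletStore : public BulletStore {
public:
    void add(const StoredBullet& b) override { bullets_.push_back(b); }
    std::size_t count() const override { return bullets_.size(); }
private:
    std::vector<StoredBullet> bullets_;
};

// Game code depends on the interface, not on the container behind it.
inline void spawnRing(BulletStore& store, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        store.add(StoredBullet{0.f, 0.f});
}
```

A virtual call per bullet has a cost of its own, so in practice the interface would probably sit at a coarser level (per pattern or per frame) than in this toy example.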

eigenbom · « **Reply #14 on:** November 28, 2012, 10:24:14 pm »

Quote from: Nexus on November 28, 2012, 01:22:42 pm

Quote from: eigenbom on November 28, 2012, 05:48:02 am
Personally if I was making that bullet shooter, I'd just use a super large static array to store all the possible bullets.
...
Why not simply a std::vector? Reallocations happen rarely, and you can even prevent them with reserve(). Elements can be removed with the swap()-and-pop_back() idiom in constant time. This makes your whole logic much easier: You don't have invalid bullets (holes) which you have to case-differentiate, you don't waste memory, you can easily count the current number of bullets, and so on. And probably, in the end it won't even be slow.

I have the impression that you are both doing premature optimization here. Keep things simple and generic, so you can change the system with little effort.

They are both simple approaches, but just use different data structures. The circular buffer has the benefit of keeping a bullet in the same place in memory over its lifetime (good if you want to externally control it), and it naturally orders the bullets by their lifetime (roughly), which should give good performance. The cons are that it is fixed size and uses the maximum amount of memory you think is necessary. But this is common in games where you have special allocators that do a major allocation up-front.

I only do optimisations like this where I think they will be necessary. I've got all manner of standard vectors, sets, unordered_sets, and maps scattered through my code, and they only get replaced if the profiling says so. But in some specific areas I use the right tool for the job.
