SFML community forums

Help => Graphics => Topic started by: netrick on July 23, 2013, 07:27:10 pm

Title: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 23, 2013, 07:27:10 pm
I use vertex array tile map code from the tutorial. I use exactly the same draw function as in tutorial and the loading is the same algorithm (I only load from a file but the in-app representation is the same as in tutorial).

When my map is 100 x 100 32px tiles I get 1350 FPS and memory usage of my app is 39mb. Great.
When my map is 300 x 300 32px tiles I get 200 FPS and memory usage 59mb. Well acceptable.

However when I draw a map which is 1000 x 1000 32px tiles, memory usage is 300mb and FPS is 19. That's very bad, I need maps of that size in my game.

Debug and release builds change nothing here. Is it a bug in SFML that memory usage and fps are that bad with increasing number of vertexes? It seems like vertexes are stored in some ineffective way (why they take so much of memory?)

Also I think that vertex array should be stored on GPU side rather than on RAM, correct?

Or it has to be that way and vertex array is overrated? In that case I will just use quad tree of vertex arrays but I'd like to avoid it for the sake of simplicity.

It looks like one vertex takes 65 bytes of RAM which is quite a lot. It should be a lightweight object stored on GPU side I think.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: eXpl0it3r on July 23, 2013, 07:34:24 pm
Here's the memory usage calculation:
1000x1000x32x8 = 256000000
256000000 / 1024 / 1024 = 244.140625 MiB

Though I'm not sure what should get saved where etc.

The question now is, do you need to draw those 1000x1000x32 pixels at one time? If so you might indeed have a problem, if not you should start using an octree or similar structures, so you'll only draw what's needed.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 23, 2013, 07:38:42 pm
I used a method where I draw 30x20 visible tiles sprite-by-sprite. That way I had constant 200 fps no matter the map size (also 1 000 x 1 000 tiles only took 4mb in memory, as the array of int).

I wanted to switch to vertex array so I could use sf::View to zoom and scroll easier. My previous method didn't allow zooming, only scrolling.

So it looks like I have to use something like a quad tree to hold a bunch of vertex arrays and render 1-4 of them at a time.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Nexus on July 23, 2013, 07:39:01 pm
Quote
It seems like vertexes are stored in some ineffective way (why they take so much of memory?)
It seems like you store them in an ineffective way. Do you duplicate the tile textures?

I don't know why you mention 32 pixels, the size of the vertex array does not depend on the tile size. It is roughly sizeof(sf::Vertex) * numberOfVertices. The vertex array is stored in the RAM and loaded to the graphics card when rendered.

Quote
Or it has to be that way and vertex array is overrated?
They are overrated insofar as people think they magically solve any performance problem. Their main advantage in comparison to sf::Sprite is (besides flexibility) the reduction of draw calls.

What you still have to consider yourself, is culling geometries you don't render (namely the part of the map which is not on the screen). Don't put the whole map into a single vertex array.

Quote
Here's the memory usage calculation:
This is for the texture, not the vertex array. And only in the case where tiles are needlessly duplicated.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 23, 2013, 07:44:02 pm
Quote
It seems like you store them in an ineffective way. Do you duplicate the tile textures?

I use the exact code from Laurent's tutorial. There is only one single texture with all tiles, which is stored in just one place in the app. Do you think that he stores them ineffectively there? I doubt it could be improved really.

sf::Vertex size is 48 bytes. So it should be 190mb and not 240mb anyway.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Nexus on July 23, 2013, 07:50:43 pm
Quote
I doubt how it could be improved really.
You missed the most important part of my post. Do you store the whole map, including all the invisible parts, in a vertex array? If so, you have your answer.

Laurent's example is meant to show a simple usage of vertex arrays, not to be a 1:1 code to apply in every possible case.

Quote
sf::Vertex size is 48 bytes. So it should be 190mb and not 240mb anyway.
Since sf::VertexArray uses a std::vector internally, it may allocate more than necessary to allow faster growing. If this is an issue, call shrink_to_fit(), but keep in mind that this results in a copy of all vertices.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 23, 2013, 07:55:11 pm
Quote
You missed the most important part of my post. Do you store the whole map, including all the invisible parts, in a vertex array? If so, you have your answer.

Okay, I get it. I will just divide world in a few vertex arrays and draw only visible one(s).

Thank you for help.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Nexus on July 23, 2013, 07:58:12 pm
Quote
Okay, I get it. I will just divide world in a few vertex arrays and draw only visible one(s).
That's a good idea for static tile sets.
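
A minimal sketch of the chunk idea, assuming the map is cut into square chunks of chunkSize tiles and each chunk owns one vertex array. The function name, the row-major chunk numbering and the 64-tile chunk size in the usage note are illustrative assumptions, not something from the tutorial or from SFML. It only decides which chunks overlap the visible tiles; drawing them is left to the caller.

```cpp
#include <cassert>
#include <vector>

// Returns the row-major indices of all chunks touched by the visible tiles.
// The tile range is inclusive and must already be clamped to the map.
std::vector<int> visibleChunks(int firstTileX, int firstTileY,
                               int lastTileX, int lastTileY,
                               int chunkSize, int mapW, int mapH) {
    assert(chunkSize > 0 && mapW > 0 && mapH > 0);
    assert(0 <= firstTileX && firstTileX <= lastTileX && lastTileX < mapW);
    assert(0 <= firstTileY && firstTileY <= lastTileY && lastTileY < mapH);

    // Number of chunks per row, rounding up for a partial last chunk.
    const int chunksPerRow = (mapW + chunkSize - 1) / chunkSize;

    std::vector<int> result;
    for (int cy = firstTileY / chunkSize; cy <= lastTileY / chunkSize; ++cy)
        for (int cx = firstTileX / chunkSize; cx <= lastTileX / chunkSize; ++cx)
            result.push_back(cy * chunksPerRow + cx);
    return result;
}
```

With 64-tile chunks on a 1000 x 1000 map, a 30x20 tile window never touches more than four chunks, which matches the "render 1-4 at one time" estimate earlier in the thread.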

Alternatively, you can rebuild the vertex array directly from the visible tiles. Especially if tiles are animated or change over time, you will need a more dynamic approach.
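
A rough sketch of the rebuild approach, written without SFML so it compiles on its own. QuadVertex is a hypothetical stand-in for sf::Vertex (color left out), and the function names, the row-major tile layout and the tilesetColumns parameter are assumptions made for illustration. The view rectangle is given in world pixels, the way an sf::View's center and size would describe it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct QuadVertex { float x, y, u, v; };   // stand-in for sf::Vertex, no color
struct TileRange { int x0, y0, x1, y1; };  // half-open: [x0, x1) x [y0, y1)

// Tiles overlapped by the view rectangle, clamped to the map bounds.
TileRange visibleTiles(float left, float top, float width, float height,
                       int tileSize, int mapW, int mapH) {
    assert(tileSize > 0 && mapW > 0 && mapH > 0);
    const float ts = static_cast<float>(tileSize);
    TileRange r;
    r.x0 = std::max(0, static_cast<int>(std::floor(left / ts)));
    r.y0 = std::max(0, static_cast<int>(std::floor(top / ts)));
    r.x1 = std::min(mapW, static_cast<int>(std::ceil((left + width) / ts)));
    r.y1 = std::min(mapH, static_cast<int>(std::ceil((top + height) / ts)));
    r.x1 = std::max(r.x0, r.x1);  // empty range if the view is off the map
    r.y1 = std::max(r.y0, r.y1);
    return r;
}

// Builds four vertices per visible tile: top-left, top-right,
// bottom-right, bottom-left.
std::vector<QuadVertex> buildVisibleQuads(const std::vector<int>& tiles, int mapW,
                                          const TileRange& r, int tileSize,
                                          int tilesetColumns) {
    assert(tilesetColumns > 0);
    const float ts = static_cast<float>(tileSize);
    std::vector<QuadVertex> out;
    out.reserve(static_cast<std::size_t>(r.x1 - r.x0) *
                static_cast<std::size_t>(r.y1 - r.y0) * 4);
    for (int y = r.y0; y < r.y1; ++y) {
        for (int x = r.x0; x < r.x1; ++x) {
            const int id = tiles[static_cast<std::size_t>(y) * mapW + x];
            const float px = static_cast<float>(x) * ts;
            const float py = static_cast<float>(y) * ts;
            const float tu = static_cast<float>(id % tilesetColumns) * ts;
            const float tv = static_cast<float>(id / tilesetColumns) * ts;
            out.push_back({px,      py,      tu,      tv});
            out.push_back({px + ts, py,      tu + ts, tv});
            out.push_back({px + ts, py + ts, tu + ts, tv + ts});
            out.push_back({px,      py + ts, tu,      tv + ts});
        }
    }
    return out;
}
```

For a 960x640 view over 32px tiles this produces 30 x 20 x 4 = 2400 vertices per frame, regardless of how large the map is. In an SFML program the same loop would presumably fill an sf::VertexArray instead of a std::vector.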
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 23, 2013, 08:02:13 pm
Quote
That's a good idea for static tile sets.

Alternatively, you can rebuild the vertex array directly from the visible tiles. Especially if tiles are animated or change over time, you will need a more dynamic approach.

That would be a much easier and more flexible solution than some kind of a tree. I will try it, I hope that rebuilding a 30x25 tile vertex array every frame won't have a big impact on performance.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: binary1248 on July 23, 2013, 08:06:19 pm
Quote
However when I draw a map which is 1000 x 1000 32px tiles, memory usage is 300mb and FPS is 19. That's very bad, I need maps of that size in my game.
Well, yes you might need maps of that size in your game, but nobody said you need to render it all, all the time. The only way you would be able to realistically see all the detail would be on a 32000x32000 (1 Gigapixel) screen. Don't give the industry these kinds of ideas yet please ::).

Quote
Debug and release builds change nothing here. Is it a bug in SFML that memory usage and fps are that bad with increasing number of vertexes? It seems like vertexes are stored in some ineffective way (why they take so much of memory?)

Also I think that vertex array should be stored on GPU side rather than on RAM, correct?

It looks like one vertex takes 65 bytes of RAM which is quite a lot. It should be a lightweight object stored on GPU side I think.
sf::VertexArray stores its data in system RAM for you to manipulate as much as you want. As soon as you draw it, it takes the long journey across your PCIe bus to your graphics RAM and waits there until the GPU renders it to the screen. Keep in mind, this happens EVERY frame because SFML doesn't use VBOs. Given that a single sf::Vertex contains an sf::Vector2f for position data, an sf::Vector2f for texture coordinate data and an sf::Color for color data, that sums up to 2*4+2*4+4*1=20 bytes per vertex. Because sf::Vertex is not polymorphic, the size of its members should constitute its own size. No idea where you got the 65 bytes (or 48 bytes) from... How the data is stored in graphics RAM is another question. It is an internal detail and something we don't need to know about, but I have a feeling it isn't more than the space requirement in system RAM.

If an sf::Vertex is 20 bytes large and you have 1001*1001 of them (1000*1000 tiles) you should end up with an sf::VertexArray of at least 20MB. That leaves the majority of memory consumption to the rest of your application including the baseline that SFML needs to run.
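
As a rough check of that arithmetic, here is a small sketch. TileVertex is a hypothetical stand-in that mirrors the three members listed above; it is not SFML's header, and the 20-byte size assumes 4-byte floats and no padding, which holds on common desktop compilers. It counts four vertices per tile, since quads built the way the tutorial does it do not share corners (each tile needs its own texture coordinates), so the result is larger than the shared-corner count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for sf::Vertex: position, color, texture coordinates.
struct TileVertex {
    float x, y;               // position, 2 * 4 bytes
    std::uint8_t r, g, b, a;  // color, 4 * 1 byte
    float u, v;               // texture coordinates, 2 * 4 bytes
};

static_assert(sizeof(TileVertex) == 20, "expected 20 bytes on common desktop ABIs");

// Bytes needed for the vertices of a width x height tile map drawn as quads.
std::size_t quadMapBytes(std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0);
    return width * height * 4 * sizeof(TileVertex);
}
```

For 1000 x 1000 tiles this gives 80,000,000 bytes, which is still well short of the 300mb reported at the top of the thread.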

If you are hellbent on displaying that much graphical content on the screen, SFML won't really help you further, you need to code something in OpenGL yourself (you might want to look at VBOs for this). My GPU has no problem rendering a VBO of 2 million triangles every frame.

As Nexus and eXpl0it3r already said, for the panning issue, you need to reduce the amount of stuff you are drawing if you know most of it is offscreen. When zooming out, you can either compute custom mipmaps of regions much like OpenGL does, or you can think of some fancy LOD algorithm that takes care of seamless zooming while keeping the primitive count down. All very advanced topics, but who am I to stop someone from tinkering ;).
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Nexus on July 23, 2013, 08:17:59 pm
Quote
I hope that rebuilding a 30x25 tile vertex array every frame won't have a big impact on performance.
Certainly not. For my platformer Zloxx II (http://www.bromeon.ch/games/zloxx), I created sf::Sprites every frame anew and drew them individually, using a similar amount of tiles (vertex arrays in SFML didn't exist at that time). That was no problem, even on netbooks or older desktops...
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: FRex on July 23, 2013, 08:18:49 pm
Quote
sf::Vertex size is 48 bytes.
Is it really? What system and compiler is that?

You can also use dbug's shader (my improved version) to draw tiles using shaders, keeping 1 tile per pixel in a map texture. But then you're limited by the shader version (honestly I'm not sure that #version 130 is needed at the top, but dbug had it), texture size and so on, and you might need an option to fall back to vertex arrays if the shader doesn't run.
But on the plus side you get OK performance and free rotating/scaling/culling/etc. The map is a single sf::Sprite, and even quite ancient cards support at least 1024x1024 textures (or you could load your map in a few passes into a few textures and sprites, or fall back to vertex arrays). You get a few bits per tile (each tile has 2^32 values) to optionally put your own stuff in, and you can add animation, alpha, etc. to the shader as you desire. Maps are also automatically compressed in PNG or whatever other format, and you can easily copy, paste and edit them using GIMP, Paint or sf::Image in code.

Quote
If an sf::Vertex is 20 bytes large and you have 1001*1001 of them (1000*1000 tiles) you should end up with an sf::VertexArray of at least 20MB.
Binary, a tile is 4 vertices... ;D
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: binary1248 on July 23, 2013, 08:23:25 pm
Quote
If an sf::Vertex is 20 bytes large and you have 1001*1001 of them (1000*1000 tiles) you should end up with an sf::VertexArray of at least 20MB.
Binary, a tile is 4 vertices... ;D
Hmm yeah forgot that sf::VertexArray needlessly duplicates vertices position data... oh well don't use them myself anyways ::). Using indexed VBOs you could slash so much off that requirement...
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 23, 2013, 08:27:03 pm
Quote
sf::Vertex size is 48 bytes.
Is it really? What system and compiler is that?

Uh sorry, it is 20 bytes. I was thinking of something different when typing this lol.

1000 x 1000 x 4 x 20 = 80mb for 1000 x 1000 tiles.
I don't know what is using so much of memory. std::vector itself and other internal things can't take 160mb of memory...

BTW I will look into shader map.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Nexus on July 23, 2013, 08:32:46 pm
Quote
std::vector itself and other internal things can't take 160mb of memory...
Why not? Let's assume the vector has a growth factor of 2 (typically it's not 2 in STL implementations) and the following worst case scenario occurs:

The old vector has a size of 80MB which is exhausted, and only a few bytes more are required. The newly allocated vector thus has a size of 160MB, wasting almost 50% of its memory. For exception safety, the old memory won't be released before the new one has been allocated, that is, 240MB are used at the same time. After std::vector::push_back(), 80MB can be released, but the application usually doesn't return unused memory directly to the operating system.

If you are interested, change SFML's source code to check for the eventual std::vector::capacity(), and compare it with size(). By specifying the number of vertices beforehand, rather than appending them continuously, you can avoid unnecessary reallocations.
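
The effect of sizing up front can be seen with a plain std::vector, without touching SFML's source. This is only a sketch: countReallocations is a made-up helper that appends n ints and counts how often capacity() changes, which is how a reallocation shows up from the outside.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends n ints one by one and counts how often the capacity changed,
// i.e. how often the vector had to reallocate and copy its contents.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst)
        v.reserve(n);  // one allocation up front

    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    assert(v.size() == n && v.capacity() >= v.size());
    return reallocations;
}
```

With reserveFirst set, the count is zero, because the standard guarantees no reallocation while size() stays within the reserved capacity. Without it, the count depends on the implementation's growth factor.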
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: FRex on July 23, 2013, 08:45:30 pm
Quote
Hmm yeah forgot that sf::VertexArray needlessly duplicates vertices position data... oh well don't use them myself anyways ::). Using indexed VBOs you could slash so much off that requirement...
But.. OpenGL :'( programming... it's hard like OpenGL :'(
I've never seen anyone color their maps so the glColorPointer call and the 4 bytes from your own specialized sf::Vertex could be scrapped for 20% instant size cut... I think. I don't know OpenGL :'( at all. Most people here(ie. me) probably don't know assembly, OpenGL :'( and c++ at all or as good as you do.
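
The 20% figure can be checked with two stand-in structs. Neither is SFML's real sf::Vertex; they only mirror the member sizes discussed in this thread, assuming 4-byte floats and no padding.

```cpp
#include <cstddef>

// Hypothetical layouts, mirroring the member sizes discussed above.
struct VertexWithColor { float x, y; unsigned char r, g, b, a; float u, v; };
struct VertexNoColor   { float x, y; float u, v; };

// Percentage of memory saved per vertex by dropping the color.
constexpr std::size_t savedPercent() {
    return 100 * (sizeof(VertexWithColor) - sizeof(VertexNoColor))
               / sizeof(VertexWithColor);
}

static_assert(sizeof(VertexNoColor) == 16, "expected 16 bytes without color");
```

Going from 20 to 16 bytes is a 20% cut, so the estimate holds under these assumptions.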

Quote
(typically it's not 2 in STL implementations).
GCC and its STL like to troll apparently, not only operator[] does not assert at all(!!), the vector new length is calculated like: size + max(size,1) so it goes from 0 to 1 and then keeps doubling... and netrick is on Linux (http://en.sfml-dev.org/forums/index.php?topic=12347.msg85970#msg85970) so.. yeah, it's actually exactly 80 mbs allocated for 4 million pushed vertices. It probably could sum up and the virtual memory, minor page faults from stl structures that grow in advance, dlls and so on might make it seem worse: http://www.frogatto.com/?p=3
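
The "exactly 80" claim can be reproduced by simulating the growth rule quoted above. This sketch models only the rule size + max(size, 1) applied whenever the vector is full; it does not call into libstdc++, and capacityAfterPushes is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Capacity after n single push_backs if every growth step allocates
// size + max(size, 1) elements, i.e. 1, 2, 4, 8, ...
std::size_t capacityAfterPushes(std::size_t n) {
    std::size_t capacity = 0;
    for (std::size_t size = 0; size < n; ++size) {
        if (size == capacity)  // full: grow before inserting
            capacity = size + std::max<std::size_t>(size, 1);
    }
    assert(capacity >= n);
    return capacity;
}
```

For 4,000,000 vertices the capacity ends at 4,194,304 (2^22), and at 20 bytes each that is 83,886,080 bytes, which is exactly 80 MiB.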
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Nexus on July 23, 2013, 09:57:40 pm
To explain why the factor 2 is a particularly bad choice: It represents exactly the threshold that inhibits the usage of previously allocated memory. Growing by a factor of 2 is the worst possible allocation strategy.

Consider the scenario where a vector grows multiple times in a row (no other allocations occur in between).  Typically, there may be a large block of memory, at the beginning of which the vector is placed initially. # denotes an allocated and - an unusable byte.

When the vector grows by a factor of 2, the sum of the previous blocks is always a single byte too small for the new vector. As a consequence, the memory of all of its previous allocations cannot be used again, and the vector wanders continuously forward in memory.
Round     Allocated     Memory Usage
  1.          1         #
  2.          2         -##
  3.          4         ---####
  4.          8         -------########
  5.         16         ---------------################

As a result, a vector of capacity N uses 2N-1 bytes while growing. In the worst case, only slightly more than 50% of the elements are actually used, resulting in an effective usage of 25% of the memory, while 75% are wasted. Even though 50% can be used for future allocations, the growing process itself requires an overly big amount of memory.

The Dinkumware STL implementation of MSVC uses a factor 1.5 and therefore avoids this issue. After 4 allocations, old blocks can be used again.
Round     Allocated     Memory Usage
  1.          1         #
  2.          2         -##
  3.          3         ---###
  4.          4         ------####
  5.          6         ######----
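
Both tables can be reproduced with a small simulation. It is a simplified model, not a real allocator: it assumes every new block is placed directly behind the previous one, that freed blocks at the front merge into one gap, and that the current block stays alive until the new one exists. The function name and the rounding rule (grow by at least one element) are assumptions chosen to match the sequence 1, 2, 3, 4, 6 shown above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns the first round in which the new block fits into the memory freed
// by earlier blocks, or -1 if that never happens within maxRounds.
// The growth factor is given as the fraction num / den.
int firstReuseRound(std::uint64_t num, std::uint64_t den, int maxRounds) {
    assert(den > 0 && num > den);
    std::uint64_t current = 1;      // block held in round 1
    std::uint64_t freedBefore = 0;  // merged free space in front of it
    for (int round = 2; round <= maxRounds; ++round) {
        const std::uint64_t next = std::max(current * num / den, current + 1);
        if (freedBefore >= next)
            return round;           // the old space is large enough
        freedBefore += current;     // old block is freed after the copy
        current = next;
    }
    return -1;
}
```

With a factor of 1.5 the function returns 5, matching the second table. With a factor of 2 the freed space is always one element short of the next block, so it returns -1 no matter how many rounds are simulated.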
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: FRex on July 23, 2013, 10:12:17 pm
I know ;), I just forgot gcc uses 2, I knew Visual's was 1.5.
That's also what folly tried to leverage in its favor: https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md
And then you wonder why people reimplement STL while gcc uses 2x and doesn't check []. ;D
Who knows what other semi-witchcraft code sits in some STL implementation of some compiler.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 24, 2013, 09:22:37 am
Is there any good STL implementation out there that I can use with gcc? In some aspects gcc generates really bad code compared to other compilers...
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Silvah on July 24, 2013, 03:23:12 pm
Quote
GCC and its STL like to troll apparently, not only operator[] does not assert at all(!!)
Have you tried enabling the debug mode? If you don't enable it, the debug checks are disabled, obviously.

Quote
To explain why the factor 2 is a particularly bad choice: It represents exactly the threshold that inhibits the usage of previously allocated memory. Growing by a factor of 2 is the worst possible allocation strategy.
It's a tradeoff. Factor of 1.5 copies more often, factor of 2 wastes more memory.

Of course, if you think that a factor of 2 is egregiously bad, you should file a bug (http://gcc.gnu.org/bugzilla/) in libstdc++. Still, the developers are aware of potential drawbacks (http://gcc.gnu.org/ml/libstdc++/2013-03/msg00059.html), but they have some evidence that for real world code factor of 2 is faster.

At any rate, if you think that a smaller factor will benefit your case, you can implement it. if(vector.size() + itemsToAdd > vector.capacity()) vector.reserve(getNewCapacity(vector.capacity(), itemsToAdd)); is not exactly rocket science. And while reserve guarantees merely that capacity >= N, not capacity == N, on most if not all real implementations it'll grow to N.
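
Spelled out, that one-liner could look like the sketch below. getNewCapacity is the hypothetical helper named in the post; the 1.5x policy inside it and the appendAll wrapper are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical policy: grow by roughly 1.5x, but never by less than needed.
std::size_t getNewCapacity(std::size_t capacity, std::size_t itemsToAdd) {
    return std::max(capacity + capacity / 2, capacity + itemsToAdd);
}

// Appends items to v, choosing the new capacity ourselves instead of
// leaving it to the standard library's growth factor.
template <typename T>
void appendAll(std::vector<T>& v, const std::vector<T>& items) {
    if (v.size() + items.size() > v.capacity())
        v.reserve(getNewCapacity(v.capacity(), items.size()));
    v.insert(v.end(), items.begin(), items.end());
    assert(v.capacity() >= v.size());
}
```

Because reserve() only promises a capacity of at least the requested value, the exact capacity afterwards is still up to the implementation.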
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: FRex on July 24, 2013, 03:59:05 pm
The debug mode is 'enabled' as in NDEBUG is not defined so asserts happen.
It's 'disabled' as in _GLIBCXX_DEBUG is not defined and without it libstdc++ doesn't even have asserts in vector.
And _GLIBCXX_DEBUG is "Undefined by default" (because it might mess up badly if a stl class is passed between units that are not both compiled with same debug switch because the sizes of stl classes change in debug) so there's a nasty surprise waiting to happen.
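
If checked access is wanted without relying on _GLIBCXX_DEBUG, two portable options are sketched below. The helper names are made up. The first uses std::vector::at(), which checks the index in every build mode; the second uses a plain assert, which is controlled by NDEBUG rather than by the library's debug switch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true and writes the element if index is valid, false otherwise.
bool tryGet(const std::vector<int>& v, std::size_t index, int& out) {
    try {
        out = v.at(index);  // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Unchecked operator[] guarded by our own assert (active unless NDEBUG is set).
int checkedAt(const std::vector<int>& v, std::size_t index) {
    assert(index < v.size() && "index out of range");
    return v[index];
}
```

Neither helper changes the size of the standard containers, so the mixed-translation-unit problem described above does not apply to them.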
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: netrick on July 24, 2013, 05:15:49 pm
Going back to the original topic, now I make a new vertex array from visible tiles every frame. I get constant 800fps and very low memory usage even when map is 1000 x 1000 tiles (40mb in that case).
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Ancurio on July 25, 2013, 07:51:31 pm
Quote
Hmm yeah forgot that sf::VertexArray needlessly duplicates vertices position data... oh well don't use them myself anyways ::). Using indexed VBOs you could slash so much off that requirement...

Can you explain to me where these redundant vertices could be slashed off? Having recently implemented a tmx loader in OpenGL using indexed VBOs, I came to the conclusion that it's not really very viable, because the only cullable vertices would be those where two adjacent tiles on the screen were also adjacent in the tileset, in which case you'd save 2 vertices per occurrence, which IMO doesn't happen that often (as compared to eg. the same ground tile repeated over huge parts of the map).

Also, on the topic of the wasted std::vector memory: If the amount of vertices per tile and the amount of tiles is known beforehand, why not just call vector.reserve(count) at the beginning (avoiding all reallocations) and avoid the wasted memory?
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: binary1248 on July 25, 2013, 09:13:52 pm
Quote
Can you explain to me where these redundant vertices could be slashed off? Having recently implemented a tmx loader in OpenGL using indexed VBOs, I came to the conclusion that it's not really very viable, because the only cullable vertices would be those where two adjacent tiles on the screen were also adjacent in the tileset, in which case you'd save 2 vertices per occurrence, which IMO doesn't happen that often (as compared to eg. the same ground tile repeated over huge parts of the map).
The idea is that you use indexing, yes... but you don't advance all attributes at the same pace, i.e. glVertexAttribDivisor. As you probably noticed, there is no way of going around newer OpenGL if you want all the performance you can get ;). If you use your own VBO for tile rendering you can already drop the redundant color data (-20%). If you are really crazy, you could write your own shader to render the tiles with minimal data. Think about it, you are just iterating through a grid and changing the texture that is applied to each quad along the way. You don't have to provide much besides basic information about the grid structure and the type data at each tile. Haven't done this myself, however it seems theoretically possible (maybe something I might try when I get really bored ::)).

Quote
Also, on the topic of the wasted std::vector memory: If the amount of vertices per tile and the amount of tiles is known beforehand, why not just call vector.reserve(count) at the beginning (avoiding all reallocations) and avoid the wasted memory?
You can do this already by using sf::VertexArray::resize() and just indexing the data instead of constantly appending, or even better just specifying at construction how large the sf::VertexArray should be, however many beginners haven't had the privilege of being part of this discussion yet ;).
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: Ancurio on July 26, 2013, 11:08:21 am
Quote
The idea is that you use indexing, yes... but you don't advance all attributes at the same pace, i.e. glVertexAttribDivisor. As you probably noticed, there is no way of going around newer OpenGL if you want all the performance you can get ;). If you use your own VBO for tile rendering you can already drop the redundant color data (-20%). If you are really crazy, you could write your own shader to render the tiles with minimal data. Think about it, you are just iterating through a grid and changing the texture that is applied to each quad along the way. You don't have to provide much besides basic information about the grid structure and the type data at each tile. Haven't done this myself, however it seems theoretically possible (maybe something I might try when I get really bored ::)).

Ah... Yes, yes, I see. You would provide the raw tile index data, set the attrib divisor to 4 so packs of 4 vertices describing one tile/quad get the same tile index (or is it based on indices, ie. 6 because 6 indices per tile?), and the rest of all the vertex pos and texpos data could be generated on the fly based on the provided vertex id in the shader. For a map with eg. 100x100 tiles, you would only specify a 100x100 big ushort buffer and that's all.

That would be an interesting space/computation trade off (you'd be calculating the same exact data 60 times per second), but for huge maps it might indeed help with memory concerns. Interesting.
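
To make the idea concrete, here is a CPU-side emulation in C++ of what such a vertex shader might compute. It is not GLSL and not dbug's or anyone else's actual shader; the corner order, the function name and the tilesetColumns parameter are assumptions. A real shader would presumably take the vertex id from a built-in such as gl_VertexID and read the tile id from a buffer or texture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Corner { float x, y, u, v; };  // position and texture coordinates

// Derives one quad corner from the vertex id alone.
// Corner order per tile: 0 top-left, 1 top-right, 2 bottom-right, 3 bottom-left.
Corner cornerFromVertexId(std::size_t vertexId,
                          const std::vector<std::uint16_t>& tileIds,
                          std::size_t mapWidth, std::size_t tilesetColumns,
                          float tileSize) {
    assert(mapWidth > 0 && tilesetColumns > 0);
    const std::size_t tile = vertexId / 4;    // which tile this vertex belongs to
    const std::size_t corner = vertexId % 4;  // which corner of that tile
    assert(tile < tileIds.size());

    const std::size_t dx = (corner == 1 || corner == 2) ? 1 : 0;
    const std::size_t dy = (corner >= 2) ? 1 : 0;
    const std::size_t id = tileIds[tile];

    Corner c;
    c.x = static_cast<float>(tile % mapWidth + dx) * tileSize;
    c.y = static_cast<float>(tile / mapWidth + dy) * tileSize;
    c.u = static_cast<float>(id % tilesetColumns + dx) * tileSize;
    c.v = static_cast<float>(id / tilesetColumns + dy) * tileSize;
    return c;
}
```

For a 100x100 map the only stored data is the 100 x 100 x 2 = 20,000 byte tile id buffer, compared with 800,000 bytes for four 20-byte vertices per tile.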
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: FRex on July 26, 2013, 07:26:43 pm
Quote
If you are really crazy, you could write your own shader to render the tiles with minimal data. Think about it, you are just iterating through a grid and changing the texture that is applied to each quad along the way. You don't have to provide much besides basic information about the grid structure and the type data at each tile. Haven't done this myself, however it seems theoretically possible (maybe something I might try when I get really bored ::)).
Like Dbug's shader that I improved or even more minimal?
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: binary1248 on July 26, 2013, 10:12:29 pm
What... there was someone who already did it? Man... now what will I do when I get bored ::)... I had a look, it looks pretty minimal. Doubt I can beat that without a few months worth of thinking ;D.
Title: Re: Vertex array - HUGE memory usage and bad performance
Post by: FRex on July 26, 2013, 10:26:13 pm
Quote
I had a look, it looks pretty minimal. Doubt I can beat that without a few months worth of thinking ;D.
I feel flattered, I probably shouldn't because I wrote it based on dbug's comments and code but I do. ;D

Quote
Man... now what will I do when I get bored ::)...
SFGU.. :-X I dunno, hex or isometric tile shader? ;D