Although there are already several classes for drawing tile maps, I wanted to test 3 options and compare their performance:
- filling a map using Vertex (one draw call per tile)
- pregenerating an image and using it as a chunk
- using a VertexArray
I expected that drawing tile by tile with plain vertices would be slow, so I thought about pre-rendering bigger chunks and using them to save draw() calls.
I was surprised when I saw the test results (debug mode)...
Filling an 800x600 window with 32x32 tiles (and a single 800x600 chunk):
- filling a map using Vertex (475 calls): 4706 ms
- pregenerated image and used as chunk (1 call): 4442 ms
- vertexArray (1 call to draw): 57 ms
Why is there such a big difference between drawing a single 800x600 image and making a single draw call with 1900 vertices?
My common sense says that [1 draw / 800x600px / 4 vertices] should be faster than [1 draw / 800x600px / 1900 vertices]
In release mode the results are quite different...
- filling a map using Vertex (475 calls): 544 ms
- pregenerated image and used as chunk (1 call): 7 ms
- vertexArray (1 call to draw): 5399 ms
But again, I don't know why the vertexArray (1 call / 1900 vertices) is much slower than the Vertex call (475 calls / 1900 vertices)
Here is the code of the test, if you want to try it...
sf::Clock clock;
// prepare vertices[4]
sf::Texture* tilesTexture = ResourceManager::getTexture("tiles.png");
sf::Vertex vertices1[4];
// a 32x32 tile spans 0..32 (not 0..31) in both position and texture coordinates
vertices1[0].position = sf::Vector2f(0, 0);
vertices1[1].position = sf::Vector2f(32, 0);
vertices1[2].position = sf::Vector2f(32, 32);
vertices1[3].position = sf::Vector2f(0, 32);
vertices1[0].texCoords = sf::Vector2f(0, 0);
vertices1[1].texCoords = sf::Vector2f(32, 0);
vertices1[2].texCoords = sf::Vector2f(32, 32);
vertices1[3].texCoords = sf::Vector2f(0, 32);
sf::RenderStates VertexStates;
VertexStates.texture = tilesTexture;
// test vertices[4]
clock.restart();
for(int x=0; x<25; x++)
{
    for(int y=0; y<19; y++)
    {
        vertices1[0].position = sf::Vector2f(32*x, 32*y);
        vertices1[1].position = sf::Vector2f(32*(x+1), 32*y);
        vertices1[2].position = sf::Vector2f(32*(x+1), 32*(y+1));
        vertices1[3].position = sf::Vector2f(32*x, 32*(y+1));
        window.draw(vertices1, 4, sf::Quads, VertexStates);
    }
}
cout << "vertex[4]: " << clock.getElapsedTime().asMicroseconds() << endl;
// prepare 1 draw with a 800x600 image
sf::Texture* fondo = ResourceManager::getTexture("bg800x600.png");
// the full 800x600 image spans 0..800 and 0..600 (not 0..799 and 0..599)
vertices1[0].position = sf::Vector2f(0, 0);
vertices1[1].position = sf::Vector2f(800, 0);
vertices1[2].position = sf::Vector2f(800, 600);
vertices1[3].position = sf::Vector2f(0, 600);
vertices1[0].texCoords = sf::Vector2f(0, 0);
vertices1[1].texCoords = sf::Vector2f(800, 0);
vertices1[2].texCoords = sf::Vector2f(800, 600);
vertices1[3].texCoords = sf::Vector2f(0, 600);
// test the image chunk
clock.restart();
VertexStates.texture = fondo;
window.draw(vertices1, 4, sf::Quads, VertexStates);
cout << "chunk800x600: " << clock.getElapsedTime().asMicroseconds() << endl;
// prepare the VertexArray call
VertexStates.texture = tilesTexture;
sf::VertexArray vertices2(sf::Quads, 25*19*4);
for(int x=0; x<25; x++)
{
    for(int y=0; y<19; y++)
    {
        vertices2[(x + y*25)*4].position = sf::Vector2f(32*x, 32*y);
        vertices2[(x + y*25)*4+1].position = sf::Vector2f(32*(x+1), 32*y);
        vertices2[(x + y*25)*4+2].position = sf::Vector2f(32*(x+1), 32*(y+1));
        vertices2[(x + y*25)*4+3].position = sf::Vector2f(32*x, 32*(y+1));
        // tiles.png is 256x256, i.e. 8x8 tiles of 32x32, so valid tile indices are 0..7
        int i = rand()%8;
        int j = rand()%8;
        vertices2[(x + y*25)*4].texCoords = sf::Vector2f(i*32, 32*j);
        vertices2[(x + y*25)*4+1].texCoords = sf::Vector2f(32*(i+1), 32*j);
        vertices2[(x + y*25)*4+2].texCoords = sf::Vector2f(32*(i+1), 32*(j+1));
        vertices2[(x + y*25)*4+3].texCoords = sf::Vector2f(32*i, 32*(j+1));
    }
}
// test the VertexArray call
clock.restart();
window.draw(vertices2, VertexStates);
cout << "vertexArray: " << clock.getElapsedTime().asMicroseconds() << endl;
BTW bg800x600.png is an 800x600 image, and tiles.png is a 256x256 image.
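For anyone who wants to double-check the numbers in the test (25 x 19 tiles, 475 quads, 1900 vertices) without SFML, the index layout can be sketched roughly as follows. This is only an illustration: `Vertex2D`, `tilesToCover` and `buildTileQuads` are hypothetical names, `Vertex2D` is a plain stand-in for sf::Vertex, and every quad uses the first tile of the texture instead of a random one.

```cpp
#include <cstddef>
#include <vector>

// Plain stand-in for sf::Vertex (assumption), so the sketch compiles without SFML.
struct Vertex2D {
    float x, y;  // position in pixels
    float u, v;  // texture coordinates in pixels
};

// Number of tiles needed to cover `pixels`, rounding up (600 / 32 -> 19 rows).
constexpr int tilesToCover(int pixels, int tileSize) {
    return (pixels + tileSize - 1) / tileSize;
}

// Builds 4 vertices per tile in the same order as the test:
// top-left, top-right, bottom-right, bottom-left.
std::vector<Vertex2D> buildTileQuads(int cols, int rows, int tileSize) {
    std::vector<Vertex2D> quads(static_cast<std::size_t>(cols) * rows * 4);
    const float s = static_cast<float>(tileSize);
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            // Same indexing as (x + y*25)*4 in the test, with cols instead of 25.
            const std::size_t base =
                (static_cast<std::size_t>(x) + static_cast<std::size_t>(y) * cols) * 4;
            const float left = x * s;
            const float top = y * s;
            quads[base + 0] = {left,     top,     0.f, 0.f};
            quads[base + 1] = {left + s, top,     s,   0.f};
            quads[base + 2] = {left + s, top + s, s,   s};
            quads[base + 3] = {left,     top + s, 0.f, s};
        }
    }
    return quads;
}
```

Calling `buildTileQuads(tilesToCover(800, 32), tilesToCover(600, 32), 32)` gives the 1900 vertices used in the VertexArray test; the last row extends a few pixels past the 600 px window height because 600 is not a multiple of 32.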
Did you also disable the debugger? In Visual Studio, that's independent of Release/Debug mode. Pressing F5 starts with the debugger; Ctrl+F5 starts without it. And if you want a fair comparison, show the new code and apply the points mentioned by Laurent.
Anyway, it makes sense that drawing 4 vertices (a texture) is faster than a whole vertex array... There are, however, other criteria to keep in mind, like animated tiles or texture size limits.
Yes, it's without the debugger. Anyway, running with F5 or with Ctrl+F5 doesn't affect this test much.
About the new code, I just added a "wait" before each test
while(clock.getElapsedTime() < sf::seconds(5));
and wrapped each test with
for(int t=0; t<1000; t++)
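Put together, the wait plus the 1000 repetitions could be wrapped in a small helper, roughly like this. It is a sketch only: `timeRepeated` is a hypothetical name, std::chrono stands in for sf::Clock so it compiles without SFML, and the pause is a sleep instead of the busy-wait loop shown above.

```cpp
#include <chrono>
#include <thread>

// Pauses, then runs `test` `repetitions` times and returns the total
// elapsed time of the repetitions in microseconds (the pause is not counted).
template <typename Test>
long long timeRepeated(Test&& test, int repetitions,
                       std::chrono::milliseconds pauseBefore = std::chrono::milliseconds(0)) {
    std::this_thread::sleep_for(pauseBefore);
    const auto start = std::chrono::steady_clock::now();
    for (int t = 0; t < repetitions; ++t) {
        test();
    }
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}
```

In the test this would be used as something like `timeRepeated([&]{ window.draw(vertices2, VertexStates); }, 1000, std::chrono::seconds(5))`, once per option, so that all three are measured the same way.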
About animated tiles, you have a point there, but I guess if there are only a few animated tiles, they can be represented by separate AnimatedSprites drawn over the floor tiles. Easy to do and probably still faster. Or at least that's what I think :P