
Author Topic: Benchmark : SFML2 vs D3DXSprite  (Read 9217 times)


Cantos

  • Newbie
  • *
  • Posts: 2
    • View Profile
Benchmark : SFML2 vs D3DXSprite
« on: July 03, 2011, 10:45:34 am »
Hi,

I've been interested in making little 2D games for several years now, but never finished anything too elaborate. I was pretty excited about SFML when I heard of it, because it's an easy-to-use, cross-platform, hardware-accelerated 2D graphics library - just what I need to get on with making games instead of fiddling around with perfecting engine code. After using it a bit, though, I was disappointed with the performance. I am coming from a background in DirectX 9, where you have access to the ID3DXSprite interface, so I wrote two little toy programs to benchmark SFML against D3DXSprite.

The task was to build a program that opens a 1024x768 window and fills it with a tiled 32x32 image. That takes 768 Draw calls per frame (32 columns x 24 rows).

D3DXSprite version:
Code:
#include <string>
#include <boost/lexical_cast.hpp>

#include <windows.h>
#include <d3d9.h>
#include <d3dx9.h>

#pragma comment (lib,"d3d9.lib")
#ifdef _DEBUG
#pragma comment (lib,"d3dx9d.lib")
#else
#pragma comment (lib,"d3dx9.lib")
#endif

bool registerClass();
bool createWindow();
bool dxSetup();
void dxCleanup ();
void msgLoop();
void drawFrame();
LRESULT CALLBACK WindowProcedure (HWND,UINT,WPARAM,LPARAM);

//globals
const std::string winClassname = "WinClass";
unsigned int width = 1024;
unsigned int height = 768;
unsigned int texture_size = 32;
HWND hwnd = NULL;
IDirect3D9* d3d = NULL;
IDirect3DDevice9* device = NULL;
ID3DXSprite* d3d_sprite = NULL;
D3DPRESENT_PARAMETERS d3dPresentParams;
D3DXMATRIX normal;
IDirect3DTexture9* texture = NULL;
unsigned int frames = 0;
std::string FPS = "not calculated";

int WINAPI WinMain (HINSTANCE hThisInst,HINSTANCE hPrevInst,LPSTR lpszArgs,int nWinMode)
{
try
{
// Register the Window Class
if (!registerClass())
{
// Register class failed
return 1;
}

if (createWindow())
{
// Successfully created a window, start the game engine
//engine = new kb::KingdomBattle(hwnd);
ShowWindow(hwnd,SW_SHOW);
}
else
{
// Create Window failed
return 1;
}

if (!dxSetup())
{
return 1;
}
// Enter the message loop
msgLoop();
}
catch(std::exception& e)
{
MessageBox(hwnd,e.what(),"exception",0);
}

// Message loop terminated, cleanup
dxCleanup();
ShowWindow(hwnd,SW_HIDE);


MessageBox(NULL,FPS.c_str(),"Benchmark Results",0);

return 0;
}

LRESULT CALLBACK WindowProcedure (HWND hwnd,UINT message,WPARAM wParam,LPARAM lParam)
{
if (wParam == VK_ESCAPE && message == WM_KEYDOWN)
{
PostQuitMessage(0);
return 0;
}

switch(message)
{
case WM_KEYDOWN:
break;
case WM_KEYUP:
break;

case WM_MOUSEMOVE:
break;
case WM_LBUTTONDOWN:
break;
case WM_RBUTTONDOWN:
break;
case WM_LBUTTONUP:
break;
case WM_RBUTTONUP:
break;

        case WM_CLOSE:
            DestroyWindow(hwnd);
        break;
case WM_DESTROY:
PostQuitMessage(0);
break;
default:
return DefWindowProc(hwnd, message, wParam, lParam);
}
return 0;
}
bool registerClass()
{
WNDCLASSEX wcl;

//define a window class
wcl.cbSize = sizeof(WNDCLASSEX);

wcl.hInstance = GetModuleHandle(NULL);
wcl.lpszClassName = winClassname.c_str();
wcl.lpfnWndProc = WindowProcedure;
wcl.style = 0;

wcl.hIcon = NULL;
wcl.hIconSm = NULL;
wcl.hCursor = LoadCursor(NULL, IDC_ARROW);

wcl.lpszMenuName = NULL;

wcl.cbClsExtra = 0;
wcl.cbWndExtra = 0;

wcl.hbrBackground = (HBRUSH) GetStockObject(BLACK_BRUSH);

//register a window
if (!RegisterClassEx(&wcl))    return false;
else                           return  true;
}

bool createWindow()
{
hwnd = CreateWindow(winClassname.c_str(),
               "DirectX Benchmark",
WS_OVERLAPPEDWINDOW,
CW_USEDEFAULT,
CW_USEDEFAULT,
width,
height,
NULL,
NULL,
GetModuleHandle(NULL),
NULL);

if (!hwnd) return false;
else
{
return true;
}
}

bool dxSetup()
{
HRESULT hr = D3D_OK;
d3d = Direct3DCreate9(D3D_SDK_VERSION);

D3DXMatrixScaling(&normal,1.0f,1.0f,1.0f);

ZeroMemory(&d3dPresentParams, sizeof(d3dPresentParams));
    d3dPresentParams.Windowed = TRUE;
    d3dPresentParams.SwapEffect = D3DSWAPEFFECT_DISCARD;
    d3dPresentParams.hDeviceWindow = hwnd;
    d3dPresentParams.BackBufferFormat = D3DFMT_A8R8G8B8;
    d3dPresentParams.BackBufferWidth = width;
    d3dPresentParams.BackBufferHeight = height;
d3dPresentParams.PresentationInterval = D3DPRESENT_INTERVAL_IMMEDIATE;
hr = d3d->CreateDevice(D3DADAPTER_DEFAULT,
                      D3DDEVTYPE_HAL,
                      hwnd,
                      D3DCREATE_HARDWARE_VERTEXPROCESSING,
                      &d3dPresentParams,
                      &device);
if (FAILED(hr))
{
return false;
}
device->SetFVF(D3DFVF_XYZRHW | D3DFVF_DIFFUSE | D3DFVF_TEX1);

device->SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);    // turn on the color blending
    device->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_SRCALPHA);    // set source factor
    device->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_INVSRCALPHA);    // set dest factor
    device->SetRenderState(D3DRS_BLENDOP, D3DBLENDOP_ADD);    // set the operation


// create the sprite object
hr = D3DXCreateSprite(device,&d3d_sprite);
if (FAILED(hr))
{
return false;
}



hr = D3DXCreateTextureFromFileEx(device,
"img.png", // filename
D3DX_DEFAULT,              // width
D3DX_DEFAULT,              // height
D3DX_DEFAULT,              // mip mapping
NULL,                      // usage
D3DFMT_A8R8G8B8,
D3DPOOL_MANAGED,
D3DX_DEFAULT,              // filtering
D3DX_DEFAULT,              // mip filtering
NULL,    // colorkey
NULL,                      // image info struct
NULL,                      // palette
&texture);
if (FAILED(hr) || texture == NULL)
{
throw std::exception("createtexture failed");
}

return true;
}

void dxCleanup ()
{
if (d3d_sprite != NULL) { d3d_sprite->Release(); }
if (texture != NULL) { texture->Release(); }
if (device != NULL) { device->Release(); }
if (d3d != NULL) { d3d->Release(); }
}

void msgLoop()
{
    __int64 frequency = 0;
    __int64 start_time = 0;
    __int64 end_time = 0;

    // Time the whole run with the high-resolution performance counter
    QueryPerformanceFrequency((LARGE_INTEGER*)&frequency);
    QueryPerformanceCounter((LARGE_INTEGER*)&start_time);

    MSG msg;

    while (true)
    {
        // Handle at most one pending message, then render a frame
        if (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
        {
            if (msg.message == WM_QUIT)
            {
                break; // exit the message loop
            }

            // else ...
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }

        drawFrame();
    }
    QueryPerformanceCounter((LARGE_INTEGER*)&end_time);

    // Average frame rate over the whole run
    double seconds = static_cast<double>(end_time - start_time) / frequency;
    FPS = "Runtime: "+boost::lexical_cast<std::string>(seconds)+" sec\nFPS: "+boost::lexical_cast<std::string>(frames / seconds)+" frames/sec";
}

void drawFrame()
{
    device->Clear(0,NULL,D3DCLEAR_TARGET,D3DCOLOR_XRGB(255,255,255),1.0f,0);

    device->BeginScene();
    d3d_sprite->Begin(D3DXSPRITE_ALPHABLEND);

    // Cover the window with one sprite per tile
    D3DXVECTOR3 pos;
    pos.z = 0.f;
    for (unsigned int y=0; y<height/texture_size; ++y)
    {
        for (unsigned int x=0; x<width/texture_size; ++x)
        {
            pos.x = static_cast<float>(x*texture_size);
            pos.y = static_cast<float>(y*texture_size);
            HRESULT hr = d3d_sprite->Draw(texture,NULL,NULL,&pos,D3DCOLOR_ARGB(255,255,255,255));
            if (FAILED(hr))
            {
                throw std::exception("d3d9 sprite object draw call failed");
            }
        }
    }

    d3d_sprite->End();
    device->EndScene();
    device->Present(NULL,NULL,NULL,NULL);
    ++frames;
}


SFML2 version
Code:
#include <string>
#include <boost/lexical_cast.hpp>

unsigned int width = 1024;
unsigned int height = 768;
unsigned int texture_size = 32;
std::string FPS = "not calculated";
unsigned int frames = 0;

#include <SFML/Graphics.hpp>
#include <windows.h> // WinMain, QueryPerformanceCounter and MessageBox are used directly

int WINAPI WinMain (HINSTANCE hThisInst,HINSTANCE hPrevInst,LPSTR lpszArgs,int nWinMode)
{
    sf::RenderWindow window(sf::VideoMode(width, height), "SFML window");

    sf::Image image;
    if (!image.LoadFromFile("img.png"))
    {
        return 1;
    }
    sf::Sprite sprite(image);

    __int64 frequency = 0;
    __int64 start_time = 0;
    __int64 end_time = 0;

    // Time the whole run with the high-resolution performance counter
    QueryPerformanceFrequency((LARGE_INTEGER*)&frequency);
    QueryPerformanceCounter((LARGE_INTEGER*)&start_time);
    while (window.IsOpened())
    {
        sf::Event e;
        while (window.GetEvent(e))
        {
            if (e.Type == sf::Event::Closed)
            {
                window.Close();
            }
            if ((e.Type == sf::Event::KeyPressed) && (e.Key.Code == sf::Key::Escape))
            {
                window.Close();
            }
        }

        // Cover the window with one sprite draw per tile
        window.Clear();
        for (unsigned int y=0; y<height/texture_size; ++y)
        {
            for (unsigned int x=0; x<width/texture_size; ++x)
            {
                sprite.SetPosition(static_cast<float>(x*texture_size),static_cast<float>(y*texture_size));
                window.Draw(sprite);
            }
        }
        window.Display();
        ++frames;
    }

    QueryPerformanceCounter((LARGE_INTEGER*)&end_time);

    // Average frame rate over the whole run
    double seconds = static_cast<double>(end_time - start_time) / frequency;
    FPS = "Runtime: "+boost::lexical_cast<std::string>(seconds)+" sec\nFPS: "+boost::lexical_cast<std::string>(frames / seconds)+" frames/sec";
    MessageBox(NULL,FPS.c_str(),"Benchmark Results",0);

    return 0;
}


What I found with these two benchmarks running on my video card (NVidia 9800GTX) was that the Direct3DXSprite version pushes about 3500 frames per second, while the SFML2 version can only do around 500. Even more worrying, when compiled in Visual Studio 2010's Debug mode, the SFML2 version can only manage about 25 frames per second, while a debug build has no appreciable effect on the Direct3DXSprite version, which still manages around 3500.

I'm looking for an explanation of why Direct3DXSprite is so much faster in this case. Both of these programs are calling down to the same hardware, and it's not like there are speed differences between OpenGL and DirectX to pin this on. I am assuming that the sprite batching in SFML is really naive, and that it's actually making several API draw calls where the D3DXSprite object is smart enough to batch these down to one call.

I know it's a bit of a jerk move to register on your forum just to have a go at your hard work. I think SFML is really good. The Direct3DXSprite version above is 300 lines of code compared to 60 for the SFML version. Look at how much work SFML saved me! And the SFML version is fundamentally portable, while the D3DXSprite version is not. I want to use SFML, but I just can't justify the performance hit.

Any comments on this issue?

WitchD0ctor

  • Full Member
  • ***
  • Posts: 100
    • View Profile
    • http://www.teleforce-blogspot.com
Benchmark : SFML2 vs D3DXSprite
« Reply #1 on: July 03, 2011, 11:28:48 am »
That's weird, something is wrong here, because I can fill a screen with scripted enemies, all under a physics simulation and pathfinding around each other (over a thousand enemies), and have well over a couple frames per second

SFML should not be THAT much slower than DirectX
John Carmack can Divide by zer0.

Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32498
    • View Profile
    • SFML's website
    • Email
Benchmark : SFML2 vs D3DXSprite
« Reply #2 on: July 03, 2011, 03:04:08 pm »
Hi

First, 500 FPS vs 3500 FPS is not that bad. FPS is not a good unit for comparing performance, since it is 1/n (the inverse of the frame time), which is not linear. Not linear means that with higher numbers, differences tend to look huge.
500 vs 3500 means that SFML takes 1.7 ms longer to render one frame, which would slow down your app to 46 FPS if it was originally running at 50 FPS with D3D. These numbers look closer to each other than 500 and 3500, don't they? ;)

Secondly, D3DXSprite implements batching and is therefore less flexible than SFML. I guess that all sprites of a batch must share the same states, and after you call Begin() you can't change anything at all. That's not true for SFML: each sprite is independent and can have different states. And you'd get almost the same performance in this case (*), whereas you would see a huge drop with the D3D code.

(*) as long as all sprites share the same image

To get a meaningful comparison you should:
- test many different scenarios
- draw more sprites so that FPS are not that high

And, to finish, the drawing API of SFML is changing a lot, the new one will be lower level and allow batching and complex geometry that you can draw in one call.
Laurent Gomila - SFML developer

Haikarainen

  • Guest
Benchmark : SFML2 vs D3DXSprite
« Reply #3 on: July 03, 2011, 03:07:55 pm »
Quote from: "Laurent"
And, to finish, the drawing API of SFML is changing a lot, the new one will be lower level and allow batching and complex geometry that you can draw in one call.


I cant wait :D

OniLinkPlus

  • Hero Member
  • *****
  • Posts: 500
    • View Profile
Benchmark : SFML2 vs D3DXSprite
« Reply #4 on: July 03, 2011, 10:36:52 pm »
Quote from: "Laurent"
And, to finish, the drawing API of SFML is changing a lot, the new one will be lower level and allow batching and complex geometry that you can draw in one call.
Any guesstimates on how long until then?
I use the latest build of SFML2

Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32498
    • View Profile
    • SFML's website
    • Email
Benchmark : SFML2 vs D3DXSprite
« Reply #5 on: July 03, 2011, 10:56:54 pm »
Quote
Any guesstimates on how long until then?

Nope, sorry.
Laurent Gomila - SFML developer

Cantos

  • Newbie
  • *
  • Posts: 2
    • View Profile
Benchmark : SFML2 vs D3DXSprite
« Reply #6 on: July 04, 2011, 12:19:19 pm »
Quote from: "WitchD0ctor"
That's weird, something is wrong here, because I can fill a screen with scripted enemies, all under a physics simulation and pathfinding around each other (over a thousand enemies), and have well over a couple frames per second

SFML should not be THAT much slower than DirectX


Well, I'd like to know why our experiences are different. I'm developing with VS 2010 Express on Win7 64-bit (but compiling 32-bit executables). And as I said before, I'm running my builds on a 9800GTX. What does your development environment look like? Do you have a Windows machine available to compile and test my code with to compare results?

Quote from: "Laurent"
Hi

First, 500 FPS vs 3500 FPS is not that bad. FPS is not a good unit for comparing performance, since it is 1/n (the inverse of the frame time), which is not linear. Not linear means that with higher numbers, differences tend to look huge.
500 vs 3500 means that SFML takes 1.7 ms longer to render one frame, which would slow down your app to 46 FPS if it was originally running at 50 FPS with D3D. These numbers look closer to each other than 500 and 3500, don't they? ;)

Secondly, D3DXSprite implements batching and is therefore less flexible than SFML. I guess that all sprites of a batch must share the same states, and after you call Begin() you can't change anything at all. That's not true for SFML: each sprite is independent and can have different states. And you'd get almost the same performance in this case (*), whereas you would see a huge drop with the D3D code.

(*) as long as all sprites share the same image

To get a meaningful comparison you should:
- test many different scenarios
- draw more sprites so that FPS are not that high

And, to finish, the drawing API of SFML is changing a lot, the new one will be lower level and allow batching and complex geometry that you can draw in one call.


Thanks for your response. I don't really follow your explanation with the non-linear metrics. I think you must be getting 0.2857 ms (about 0.3 ms) per frame with DirectX, and 2 ms per frame for SFML? Doesn't this still indicate that SFML takes nearly 7 times longer to push a frame than DirectX in my benchmark?

I'm not sure what other scenarios I could test, but this seems to me a pretty common one for a 2D tile-based game: rendering the background. Should I maybe test spinning, color-shifting sprites as well? I did expand on this, though, with a version that just draws one tile per frame, and versions that fill the screen with 16x16 tiles (3072 draw calls) and 8x8 tiles (12,288 draw calls). Here are my results:

Single Draw:
Debug DX = 6700 fps
Debug SFML = 4200 fps
Release DX = 6700 fps
Release SFML = 4100 fps (??)

16x16 Mode:
Debug DX = 1200 fps
Debug SFML = 5 fps
Release DX = 1300 fps
Release SFML = 140 fps

8x8 mode:
Debug DX = 330 fps
Debug SFML = 1.389 fps
Release DX = 350 fps
Release SFML = 34 fps

I'm not going to convert all of these to frame duration metrics, but for example, my calculations indicate SFML takes about 26.5 ms longer to draw each frame in the 8x8 Release test. This is much worse than the first benchmark. Contrary to what you seem to be saying with the non-linear metrics talk, SFML looks better the less there is to draw, taking only about 0.09 ms longer than DX on the single draw test.

Again, this is in Release mode; Debug mode is so much worse for SFML. Is this a common problem, and is there a way to get SFML to draw speedily in a Debug compile? It's not just those VS checked iterators, is it?

Nexus

  • SFML Team
  • Hero Member
  • *****
  • Posts: 6287
  • Thor Developer
    • View Profile
    • Bromeon
Benchmark : SFML2 vs D3DXSprite
« Reply #7 on: July 04, 2011, 12:37:44 pm »
Quote from: "Cantos"
Again, this is in Release mode; Debug mode is so much worse for SFML. Is this a common problem, and is there a way to get SFML to draw speedily in a Debug compile? It's not just those VS checked iterators, is it?
Probably. SFML comes with more debug information and runtime checks than DirectX. You can try setting _SECURE_SCL to 0, but don't forget to do this both in your own project and when recompiling the SFML libraries, since the setting has to match between them.
Zloxx II: action platformer
Thor Library: particle systems, animations, dot products, ...
SFML Game Development:

Groogy

  • Hero Member
  • *****
  • Posts: 1469
    • MSN Messenger - groogy@groogy.se
    • View Profile
    • http://www.groogy.se
    • Email
Benchmark : SFML2 vs D3DXSprite
« Reply #8 on: July 04, 2011, 12:41:21 pm »
Did you miss Laurent's explanation of why it was slower than DX?
Developer and Maker of rbSFML and Programmer at Paradox Development Studio

Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32498
    • View Profile
    • SFML's website
    • Email
Benchmark : SFML2 vs D3DXSprite
« Reply #9 on: July 04, 2011, 12:57:06 pm »
Debug is slower in SFML because what slows down rendering is the number of OpenGL calls, and those are affected by debug options since they happen on the CPU (not on the GPU).

Your test is a typical use case for the drawing API, and these results are the reason why I'm designing a new one ;)

So let's wait and run a new benchmark after the new API is there.
Laurent Gomila - SFML developer

JAssange

  • Full Member
  • ***
  • Posts: 104
    • View Profile
Benchmark : SFML2 vs D3DXSprite
« Reply #10 on: July 04, 2011, 03:24:06 pm »
An additional thing I'd like to mention is that DirectX is Windows only and therefore more optimized for it.

Laurent

  • Administrator
  • Hero Member
  • *****
  • Posts: 32498
    • View Profile
    • SFML's website
    • Email
Benchmark : SFML2 vs D3DXSprite
« Reply #11 on: July 04, 2011, 03:31:05 pm »
Quote
An additional thing I'd like to mention is that DirectX is Windows only and therefore more optimized for it.

I'm not sure about that. There's nothing to optimize regarding the interaction with the OS, the 3D API works mainly with the graphics card through its driver.
Laurent Gomila - SFML developer

JAssange

  • Full Member
  • ***
  • Posts: 104
    • View Profile
Benchmark : SFML2 vs D3DXSprite
« Reply #12 on: July 04, 2011, 04:21:59 pm »
Quote from: "Laurent"
Quote
An additional thing I'd like to mention is that DirectX is Windows only and therefore more optimized for it.

I'm not sure about that. There's nothing to optimize regarding the interaction with the OS, the 3D API works mainly with the graphics card through its driver.

DirectX takes advantage of some of Vista/7's new compositing features, particularly in windowed mode. I've never seen an OpenGL driver do this.

Tronic

  • Newbie
  • *
  • Posts: 16
    • View Profile
    • http://performous.org/
Benchmark : SFML2 vs D3DXSprite
« Reply #13 on: July 06, 2011, 11:49:12 pm »
It is meaningless to benchmark anything above 200 FPS or so, as display drivers and other software are not optimized for that. Consider e.g. SFML doing something per frame that takes 1 ms each time. This would limit your FPS to 1000 no matter what you render, while another system might get you 10000 FPS. But if you are developing an actual game (targeting 60 FPS), that extra 1 ms delay makes pretty much no difference whatsoever.

So do as someone suggested, draw enough sprites to slow it down to 30 FPS, then redo your benchmark and you'll get useful results.

Byron

  • Newbie
  • *
  • Posts: 1
    • View Profile
Benchmark : SFML2 vs D3DXSprite
« Reply #14 on: July 14, 2011, 04:49:15 pm »
You also need to make sure you have hardware and drivers that are written for OpenGL. By default, Windows comes with a software OpenGL driver, which is very slow. So check your hardware and drivers for OpenGL support.

 
