Hey all,
I'm working on a project that involves comparing the pixels of an image rendered using SFML's primitives to an image loaded from a file. Basically, I'm trying to figure out how similar a set of triangles is to another image.
The way I'm doing this right now is drawing the triangle set in a RenderTexture, then copying this to an image, then using the getPixelsPtr() function to look at the pixel array. See here:
    void Evolver::evalFitness(z::Triangles *triToEval) {
        // Render the candidate triangle set off-screen.
        composite.clear(sf::Color::Transparent); // composite is a RenderTexture
        composite.draw(triToEval->triangleVector);
        composite.display();

        // Read the rendered frame back from the GPU into system memory.
        sf::Image compImg = composite.getTexture().copyToImage();
        const sf::Uint8 *pixArray = compImg.getPixelsPtr();

        // Sum the per-pixel Euclidean RGBA distance to the base image.
        // 510 = sqrt(4 * 255^2) is the largest possible distance, so each term is in [0, 1].
        double distSum = 0;
        for (unsigned long i = 0; i < 4*imgWidth*imgHeight; i += 4) {
            distSum += (1.0/510.0)*
                sqrt(pow(baseImage[i+0]-int(pixArray[i+0]), 2) +
                     pow(baseImage[i+1]-int(pixArray[i+1]), 2) +
                     pow(baseImage[i+2]-int(pixArray[i+2]), 2) +
                     pow(baseImage[i+3]-int(pixArray[i+3]), 2));
        }
        triToEval->fitness = distSum;
    }
AFAIK, this is very inefficient because the RenderTexture's contents have to be copied from GPU memory to system RAM on every evaluation before they can be read. Right now I can evaluate about 400 images per second, but I want to be able to do it 10x faster.
What I would like to do is access this RenderTexture using CUDA's Thrust library to perform the per-pixel distance calculation on the GPU. I think this would be way faster because the RenderTexture wouldn't need to be copied anywhere.
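To make the target concrete, the per-pixel sum can be written as a single two-input reduction, which is the shape a Thrust version would need. Here is a minimal CPU-only sketch using std::inner_product. The names Rgba, PixelDistance, and imageDistance are hypothetical and not part of SFML or Thrust. Thrust provides thrust::inner_product with the same argument order, so the same formulation should carry over, assuming the functor is marked __host__ __device__ and both pixel ranges live in device memory. That part is an assumption and is not tested here.

```cpp
#include <cmath>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical 4-byte RGBA pixel, matching the R,G,B,A byte order of SFML's pixel array.
struct Rgba {
    std::uint8_t r, g, b, a;
};

// Per-pixel term: Euclidean RGBA distance scaled into [0, 1].
// 510 = sqrt(4 * 255^2) is the largest possible distance.
struct PixelDistance {
    double operator()(const Rgba &x, const Rgba &y) const {
        const int dr = int(x.r) - int(y.r);
        const int dg = int(x.g) - int(y.g);
        const int db = int(x.b) - int(y.b);
        const int da = int(x.a) - int(y.a);
        return std::sqrt(double(dr*dr + dg*dg + db*db + da*da)) / 510.0;
    }
};

// Sum of per-pixel distances as one reduction over both images.
// Assumes both vectors hold the same number of pixels.
double imageDistance(const std::vector<Rgba> &base,
                     const std::vector<Rgba> &rendered) {
    return std::inner_product(base.begin(), base.end(), rendered.begin(),
                              0.0, std::plus<double>(), PixelDistance());
}
```

Calling imageDistance(base, rendered) on two equally sized pixel vectors returns the same kind of sum as distSum in evalFitness above: 0 for identical images, and 1 per pixel for a fully opposite pixel.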
I found an example of OpenGL/Thrust interoperation on SO (linked here). I think this is the relevant snippet:
    int main(int argc, char *argv[]) {
        cudaGLSetGLDevice(0);

        // Initialize the GLUT library and negotiate a session with the window system
        glutInit(&argc, argv);
        glutInitDisplayMode( GLUT_DOUBLE | GLUT_RGBA | GLUT_ALPHA );
        glutInitWindowSize( DIM, DIM );
        glutCreateWindow( "sort test" );
        glewInit();

        glGenBuffers( 1, &bufferObj );
        glBindBuffer( GL_PIXEL_UNPACK_BUFFER_ARB, bufferObj );
        glBufferData( GL_PIXEL_UNPACK_BUFFER_ARB, DIM * DIM * 4, NULL, GL_DYNAMIC_DRAW_ARB );
        cudaGraphicsGLRegisterBuffer( &resource, bufferObj, cudaGraphicsMapFlagsNone );

        glEnable(GL_TEXTURE_2D);
        glGenTextures(1, &textureID);
        glBindTexture(GL_TEXTURE_2D, textureID);
        glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA8, DIM, DIM, 0, GL_BGRA, GL_UNSIGNED_BYTE, NULL);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

        cudaGraphicsMapResources( 1, &resource, NULL );
        uchar4* devPtr;
        size_t size;
        ...
I think I need to use the cudaGraphicsGLRegisterBuffer and cudaGraphicsMapResources functions with SFML's RenderTexture in order to do what I want, but I'm not sure how to go about this.
Any help, advice, or suggestions for other ways to approach this problem would be greatly appreciated.