I'm not measuring frames per second, I'm measuring total frames generated over a time period
So you're not measuring frames per second, but a number of frames over a number of seconds? That is the same quantity: divide one by the other and you have your average FPS.
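To make the arithmetic concrete, here is a minimal sketch of a "total frames over a time period" measurement in plain C++. It is not the benchmark from this thread, since that code was never posted. `FrameStats`, `countFrames` and the `renderFrame` callback are hypothetical names; the callback stands in for whatever the loop body does (clear, draw, display).

```cpp
#include <chrono>
#include <cstdint>

// Result of one run: how many frames were produced and how long it took.
struct FrameStats {
    std::uint64_t frames;
    double seconds;

    // Frames divided by seconds is the average frame rate.
    double averageFps() const {
        return seconds > 0.0 ? frames / seconds : 0.0;
    }
};

// Calls renderFrame (a hypothetical stand-in for the real loop body)
// repeatedly until at least `duration` has elapsed, counting the calls.
template <typename RenderFrame>
FrameStats countFrames(std::chrono::duration<double> duration,
                       RenderFrame renderFrame) {
    using Clock = std::chrono::steady_clock;  // monotonic, safe for intervals
    const auto start = Clock::now();
    std::uint64_t frames = 0;
    std::chrono::duration<double> elapsed{0.0};
    do {
        renderFrame();
        ++frames;
        elapsed = Clock::now() - start;
    } while (elapsed < duration);
    return {frames, elapsed.count()};
}
```

Calling `countFrames(std::chrono::seconds(10), drawScene)` with your own `drawScene` function returns the frame total, and `averageFps()` on the result gives the number you said you weren't measuring.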
where most work done in the tight loop is in the SFML library (mostly draws)
So you're comparing SFML to itself? What's the point? If it's to measure the overhead of the C# -> CSFML -> SFML bridge, I can tell you for sure, without any benchmark, that it is negligible compared to the overhead of the draw call (mostly spent in the OpenGL driver).
Would the above still mean the test is not effective as a measure of each environment's raw performance? If so, what is a more effective way to test performance of the library in each environment?
You don't need to do that. Since you're testing a binding, and not a port, you'll always be comparing SFML to itself.
And next time you want our feedback on a benchmark, the first thing to do is to post the code. Just saying that you did something doesn't mean that it is done right.