Well, I can say that it's impossible for a vertex shader to do this for you, since the depth test isn't resolved in the vertex shader. You have no control over the depth test; you can only modify a couple of render states. It's all done in the GPU hardware. And no, the depth buffer is not that expensive, but it's still a lot more expensive than having none. Remember that the depth test is performed for every pixel, including the pixels that fail it, and then there are the write operations on top of that. Compared to having no depth buffer at all, it does cost a lot. It is cheaper to do a per-entity compare instead.
Couldn't the vertex shader just set the z value, which would then be used by the depth test to determine which fragments to draw? The reason I was thinking of using the vertex shader is exactly because it's done on the GPU, so no CPU cycles would be needed to maintain a sorted list of sprites. But I guess it would actually take some profiling to see what the speed differences are, if there are any.
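Conceptually, the shader-side idea might look something like this legacy-GLSL sketch (untested, and the `spriteDepth` uniform is an assumption; the application would have to set it per sprite):

```glsl
// Vertex shader sketch: take the sprite's depth from a per-draw uniform
// and write it into clip space so the hardware depth test can sort fragments.
uniform float spriteDepth;  // assumed to be set per sprite by the application

void main()
{
    vec4 pos = gl_ModelViewProjectionMatrix * gl_Vertex;
    pos.z = spriteDepth;    // override z; x, y and w stay as computed
    gl_Position = pos;
    gl_TexCoord[0] = gl_MultiTexCoord0;
}
```

Note that with a typical 2D orthographic projection w is 1, so any z written here would still need to end up inside the clip volume to survive clipping.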
The funky bug that can appear with the "I put floating things at z-depth 0.1f" approach is that, if you have a tile system and calculate depth based on an object's height, tiles or objects at the edges can easily cut into those floating things. And if you then say "I'll put a limit at 0.5f", you have more or less created an indirect layer anyway. So why not create a proper system to manage layers and draw entities in separate passes? There are only advantages: you get the full 32-bit floating-point accuracy instead of only half of it, and it becomes easier to sort things by depth, since you have essentially applied a divide-and-conquer algorithm at a higher level.
Yeah, I guess it'd be like a layer system without actually programming any layer management (the isFlying flag could be turned into a layer variable instead), which I think could turn out to be an elegant alternative (though like I said, I've only just thought about this, so there could be issues I haven't considered yet, so it may not be that elegant... but it could be, I think). And you wouldn't have to restrict z values to the [-1, 1] range, as they get homogenized after the vertex shader, so you could still perform decent depth testing. You could also disable it for certain things, like the background/tilemap and UI.
I'll have to give the vertex shader idea some more thought, but I think it has potential.
Either way, I'd still like to hear Laurent's opinion on this (I'm expecting him to say no, but I'm interested in why as well).
[edit]
I just thought about alpha blending for sprites, as that would probably be needed... I don't think the depth buffer would be much help there (since alpha blending requires rendering back-to-front with the painter's algorithm, right?), so I think you're right that a depth buffer wouldn't actually help much with sprite depth. Never mind, I guess.