I don't know the normal way of handling the buffering and culling of chunks, so here is what I do at the moment:
First I get the position of the chunk that the center screen pixel is in. Each chunk has its own global coordinate (they also store a screen coordinate, because I move the world around the player rather than the player around the world; not sure if that's normal, but it has worked well so far). I use that position to make an sfIntRect based on the chunk load range I'm currently using (the radius of chunks to be pre-loaded around the current center chunk). Then I loop over the x and y positions within that sfIntRect, and if the entry for those coordinates in my hashmap of chunks is null or not already loaded, I add the coordinates to a list of chunks to be loaded. One chunk is created from this list's coordinates per frame, because loading all the required chunks at once, as soon as they were found to be needed, caused a slight stutter. After that I loop over each chunk currently in the hashmap and pass its coordinate/key to the sfIntRect contains function; if that returns false, I add the chunk to a list to be unloaded, which works pretty much the same way as the loading list.
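As a rough sketch of the scheme above (in Python rather than SFML code, so it runs without any graphics library): the names `ChunkStreamer`, `make_chunk` and `load_radius` are hypothetical, a plain inclusive square stands in for the sfIntRect contains check, and the step that drops stale queue entries is an added assumption rather than something described in the post.

```python
from collections import deque


def make_chunk(coord):
    # Hypothetical placeholder: a real game would build tiles/vertices here.
    return {"coord": coord}


class ChunkStreamer:
    def __init__(self, load_radius):
        self.load_radius = load_radius
        self.chunks = {}          # (cx, cy) -> loaded chunk
        self.to_load = deque()    # coordinates waiting to be created
        self.to_unload = deque()  # coordinates waiting to be removed

    def in_range(self, center, coord):
        # Stand-in for the rect "contains" test: an inclusive square
        # of side 2 * load_radius + 1 around the center chunk.
        r = self.load_radius
        return (abs(coord[0] - center[0]) <= r
                and abs(coord[1] - center[1]) <= r)

    def update(self, center):
        # Assumption (not in the post): forget queued work that no longer
        # applies because the center moved since it was queued.
        self.to_load = deque(
            c for c in self.to_load if self.in_range(center, c))
        self.to_unload = deque(
            c for c in self.to_unload if not self.in_range(center, c))

        # Queue every missing chunk inside the load range.
        queued = set(self.to_load)
        r = self.load_radius
        for x in range(center[0] - r, center[0] + r + 1):
            for y in range(center[1] - r, center[1] + r + 1):
                if (x, y) not in self.chunks and (x, y) not in queued:
                    self.to_load.append((x, y))

        # Queue every loaded chunk that fell outside the load range.
        pending = set(self.to_unload)
        for coord in self.chunks:
            if not self.in_range(center, coord) and coord not in pending:
                self.to_unload.append(coord)

        # Do at most one load and one unload per frame to avoid stutter.
        if self.to_load:
            coord = self.to_load.popleft()
            self.chunks[coord] = make_chunk(coord)
        if self.to_unload:
            self.chunks.pop(self.to_unload.popleft(), None)
```

Calling `update(center_chunk)` once per frame spreads the work out: with a load radius of 1, the nine chunks around the center appear over nine frames instead of all at once.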
Using this has been pretty good and fast for me so far, but I don't know why you would need 4 million tiles if they are 32x32. I suppose zooming out far enough could warrant that, in which case you would just have to change the load range according to the zoom level... interesting thought.
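One way the zoom idea might look, as a minimal Python sketch under stated assumptions: `zoom` is taken to be a factor that multiplies the visible area (so 2.0 means twice as much world on screen), chunks are assumed to be 16x16 tiles of 32 pixels, and `margin` is an extra ring of pre-loaded chunks. All of these names and defaults are hypothetical.

```python
import math


def load_radius_for_zoom(screen_w, screen_h, zoom,
                         tile_size=32, chunk_tiles=16, margin=1):
    # Size of one chunk in world pixels (assumed square chunks).
    chunk_px = tile_size * chunk_tiles
    # Half of the longer visible side, in world pixels, at this zoom.
    half_extent = max(screen_w, screen_h) * zoom / 2
    # Chunks needed to cover that distance from the center, plus a margin.
    return math.ceil(half_extent / chunk_px) + margin
```

The result can be fed straight into whatever builds the load rectangle, so the range grows as the view zooms out and shrinks again when it zooms back in.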
I would also like more details about changing the vertex array accordingly. As far as I can see, doing the vertex array in chunks is a solid idea, and I don't see why it couldn't be used for pseudo-infinite worlds.