That was an insightful read. I think I should store whole previous sequences of graphical states in a buffer, then. If I'm on sequence 20 and I receive a client packet in which the client acknowledges the sequence 15, I will compare sequence 15 and sequence 20 in my buffer and send the delta to this client.
Of course, the buffer should have a certain limit size, probably determined dynamically. If the client acknowledges a sequence that is no longer in the buffer I can send them the whole current graphical state - instead of sending it whole every X period of time. If the client's acked sequence is, say, 7 sequences behind the latest server sequence and my buffer is storing only the last 5 states, I can probably expand the buffer's limit to 7 for some time.
But I still don't have certainty on how the server would process input. Should I have the client keep sending packets to the server until it receives an ack response from the server? Let's say that, for example, a client whose last acked sequence was 15 will keep sending packets with their current input to the server. The client ends sending three packets before receiving the server's ack, all with different input data. The first input packet to arrive to the server was the second one sent. Upon acknowledge, should the server ignore all further packets it receives from the client that were marked with sequence 15? That is, should it ignore the first and third packets sent by the client while the client waited for a response?