Yesterday I thought I had sorted out the problem of assembling complete responses from packets, but that was quickly disproved by weird errors that showed up only in certain situations, though in those situations every single time.
Waiting for the FIN flag on a packet and checking if every packet (sorted by their IP header ids, which I observed to be consecutive for packets that belong together) is present seemed to be a solid solution and one that worked all right even for the longest of responses.
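To make that first approach concrete, here is a minimal sketch of the check described above. The `Packet` class and the `try_assemble` function are hypothetical names made up for illustration, not my actual capture code, and the sketch ignores the wrap-around of the 16-bit IP id.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical stand-in for a captured packet (illustration only)."""
    ip_id: int      # identification field from the IP header
    fin: bool       # TCP FIN flag
    push: bool      # TCP PUSH flag
    payload: bytes


def ids_consecutive(packets):
    """True if the sorted IP ids have no gaps (16-bit wrap-around is ignored)."""
    ids = sorted(p.ip_id for p in packets)
    return all(b - a == 1 for a, b in zip(ids, ids[1:]))


def try_assemble(packets):
    """Return the joined payload once FIN was seen and no IP id is missing, else None."""
    if not any(p.fin for p in packets):
        return None  # still waiting for the FIN packet
    if not ids_consecutive(packets):
        return None  # at least one packet in between has not arrived yet
    ordered = sorted(packets, key=lambda p: p.ip_id)
    return b"".join(p.payload for p in ordered)
```

Calling `try_assemble` after every captured packet returns `None` until the response is complete, which is exactly why it never fires for a response that has no FIN at all.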
But there was one kind of response that it couldn’t handle: one that was split into multiple fragments, but the last packet didn’t have FIN, only PUSH set. On the other hand, very long responses seem to have one FIN at the end, and a PUSH every 30-40 packets, which makes using PUSH as the delimiter a bad idea again.
I considered waiting 0.3 seconds for a response to complete (thinking that much time should be enough), but it was neither enough, nor did it work as expected: it messed up the order of the responses.
After much trial and error, I came up with a solution; too bad it's an unsatisfying fallback. I still cache every packet as before, but I limit the cache to just one response. I identify responses by the sequence number in the TCP header, which I observed to be unique to each response. If the current packet doesn't belong to the response in the cache, I pass the cache contents to the output function and reset the cache.
The output function checks the continuity of the response and whether the last packet in it has the PUSH flag set (as far as I could see, the last packet always has it). If the response doesn't meet both of these conditions, it's incomplete, so it's discarded. If it passes, it's put into the queue for the higher-level script.
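The cache and the output function can be sketched roughly like this. It is a simplified illustration and not the real code: `Packet`, `ResponseCache`, `response_key`, and `flush` are hypothetical names, `response_key` stands for the TCP header value I use to tell responses apart, and I am assuming that "continuity" means consecutive IP ids, as in the earlier approach.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical stand-in for a captured packet (illustration only)."""
    ip_id: int          # identification field from the IP header
    response_key: int   # TCP header value that identifies the response
    push: bool          # TCP PUSH flag
    payload: bytes


class ResponseCache:
    """Holds the packets of exactly one response at a time."""

    def __init__(self):
        self.key = None
        self.packets = []
        self.queue = deque()  # complete responses for the higher-level script

    def add(self, packet):
        # A packet of a different response means the cached one is finished.
        if self.packets and packet.response_key != self.key:
            self.flush()
        self.key = packet.response_key
        self.packets.append(packet)

    def flush(self):
        """Output function: queue the cached response if complete, else drop it."""
        ordered = sorted(self.packets, key=lambda p: p.ip_id)
        self.packets = []
        if not ordered:
            return
        ids = [p.ip_id for p in ordered]
        continuous = all(b - a == 1 for a, b in zip(ids, ids[1:]))
        if continuous and ordered[-1].push:
            self.queue.append(b"".join(p.payload for p in ordered))
        # Otherwise the response is incomplete and silently discarded.
```

Note that `flush` only runs from inside `add`, so a finished response stays in the cache until a packet of the next response arrives.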
This feels like a fallback because a response is only processed once the next response comes in, so if there isn't one for a long time (and that happens), the response just sits there in the cache. It also discards incomplete responses. On the one hand that's good, since it makes the output more stable and safer to handle on higher levels, but on the other hand it discards everything if there's any asynchronicity (which sadly seems to be the case when the game loads for the first time).
Too bad I can’t come up with anything better for now.