Missing AMF requests/responses

Every month we get calls from new Flex clients with exactly the same question: can we help fix strange networking problems? The application they fully tested exhibits very strange problems for some users once it is released into the wild. Here is a brief list of the problems they have seen:
1. Missing packets going from the client to the server (the majority of problem cases)
2. Missing responses (coming from the server to the client)
3. Out of sequence execution of the server calls
4. Duplicate requests to the server
Interestingly enough, the popular belief is that the TCP/IP protocol takes care of all these problems and that dealing with them is not the developers' responsibility.

Unfortunately, living with the problems above is the typical WAN way of life, and the TCP/IP protocol provides reliability on LANs only. To make matters worse, applications are tested on LANs and on extremely reliable “local WANs”. As a result, they are not tested at all for this type of issue. And with thousands of small AMF requests per session, even a fraction of a percent of lost packets affects the reliability of the application.
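To put a rough number on that last point, here is a back-of-the-envelope sketch in Java. It assumes each request is lost independently with the same probability, which real networks only approximate. The loss rate and request count are made-up illustration values, not measurements.

```java
class SessionLossEstimate {
    /**
     * Probability that at least one of `requests` calls is lost,
     * assuming independent losses with the same per-request rate.
     */
    static double probabilityOfAtLeastOneLoss(double lossRate, int requests) {
        return 1.0 - Math.pow(1.0 - lossRate, requests);
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 0.1% loss per request, 2000 requests per session.
        double p = probabilityOfAtLeastOneLoss(0.001, 2000);
        System.out.printf("Chance of at least one lost request: %.1f%%%n", p * 100);
    }
}
```

Under these assumptions, a loss rate of one tenth of a percent means roughly 86% of sessions hit at least one lost request.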

Once an application is in this state, the typical response from development teams is to add application-level error handling and do more testing. Unfortunately, this testing is also done on LANs and reliable WANs. In addition, because lost packets are non-deterministic, conventional testing does not help much.

The advice I usually give our clients is to fix the problem at the communication protocol level (in a layer between the Flex framework and the application) rather than in the application code. Usually this involves creating a “ReliableAMFChannel” ActionScript class on the client side and a customized Java endpoint on the server side. These classes are primarily responsible for the following:
1. Keeping copies of the original requests (client) and responses (server) until they are confirmed or answered by the other side
2. Managing the resending of unconfirmed requests/responses after a certain timeout
3. Filtering out duplicate or already executed requests/responses
4. Fixing the order of requests/responses to prevent “out of order” execution
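The four responsibilities above can be sketched in plain Java. This is a minimal illustration, not the reference implementation mentioned in this post. The class names ReliableSender and ReliableReceiver, the per-message sequence numbers, and the String payloads are all assumptions made for the example, and nothing here touches the actual Flex or BlazeDS channel and endpoint APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sender side: keeps every message until the peer confirms it (items 1 and 2). */
class ReliableSender {
    private static final class Pending {
        final String payload;
        long lastSentAt;
        Pending(String payload, long sentAt) { this.payload = payload; this.lastSentAt = sentAt; }
    }

    private final Map<Long, Pending> unacked = new LinkedHashMap<>();
    private final long resendTimeoutMillis;
    private long nextSeq = 1;

    ReliableSender(long resendTimeoutMillis) { this.resendTimeoutMillis = resendTimeoutMillis; }

    /** Assigns a sequence number and remembers the message until it is acknowledged. */
    long send(String payload, long now) {
        long seq = nextSeq++;
        unacked.put(seq, new Pending(payload, now));
        return seq; // a real channel would transmit (seq, payload) here
    }

    /** Called when the peer confirms a sequence number; the stored copy is dropped. */
    void acknowledge(long seq) { unacked.remove(seq); }

    /** Returns sequence numbers whose timeout has expired and restarts their timers. */
    List<Long> dueForResend(long now) {
        List<Long> due = new ArrayList<>();
        for (Map.Entry<Long, Pending> e : unacked.entrySet()) {
            if (now - e.getValue().lastSentAt >= resendTimeoutMillis) {
                e.getValue().lastSentAt = now;
                due.add(e.getKey());
            }
        }
        return due;
    }
}

/** Hypothetical receiver side: drops duplicates and releases messages in sequence (items 3 and 4). */
class ReliableReceiver {
    private final TreeMap<Long, String> buffered = new TreeMap<>();
    private long nextExpected = 1;

    /** Accepts one message and returns the payloads that are now safe to execute, in order. */
    List<String> receive(long seq, String payload) {
        List<String> ready = new ArrayList<>();
        if (seq < nextExpected || buffered.containsKey(seq)) {
            return ready; // duplicate: already executed or already waiting in the buffer
        }
        buffered.put(seq, payload);
        // Release the longest run of consecutive messages starting at nextExpected.
        while (!buffered.isEmpty() && buffered.firstKey() == nextExpected) {
            ready.add(buffered.pollFirstEntry().getValue());
            nextExpected++;
        }
        return ready;
    }
}
```

In this sketch the same pair of classes would exist on both sides: the client sends requests and receives responses, and the server does the reverse. A real implementation would also re-acknowledge duplicates, since a duplicate usually means the earlier confirmation was lost, and it would cap the number of resends before reporting a failure to the application.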

Usually it is implemented as a 5-10 day project to integrate the reference implementation code with the client codebase, provide resynchronization logic (login/reconnect), and optimize the timeouts. I usually recommend moving long-running requests to messaging as part of the same effort, as those have a long list of related issues of their own, which I’ll cover in the next post.


Anatole Tartakovsky