We run a web service on three servers behind a load balancer. By continuously monitoring the LB connections, I found that once the number of incoming connections to a server exceeds 40, the service goes from responding in under 20 ms to simply timing out.
What confuses me is that the response time does not increase linearly with the load; the service seems to just cut out at 40 connections or more. Any ideas?
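
In case it helps, this is roughly how the correlation can be checked from logged samples. It is only a minimal sketch, not our actual monitoring: the function names `summarize` and `find_cliff` are made up, the sample values are hypothetical placeholders rather than real measurements, and each sample is assumed to be a (connections, response time in ms) pair with `None` standing for a timeout.

```python
from statistics import median


def summarize(samples):
    """Group samples by connection count.

    Returns {connections: (median_ok_ms or None, timeout_rate)}.
    """
    by_conn = {}
    for conns, ms in samples:
        by_conn.setdefault(conns, []).append(ms)

    summary = {}
    for conns in sorted(by_conn):
        values = by_conn[conns]
        ok = [ms for ms in values if ms is not None]
        timeout_rate = (len(values) - len(ok)) / len(values)
        summary[conns] = (median(ok) if ok else None, timeout_rate)
    return summary


def find_cliff(samples):
    """Return the lowest observed connection count at or above which
    every sample timed out, or None if no such count exists."""
    summary = summarize(samples)
    cliff = None
    # Walk down from the highest connection count while everything timed out.
    for conns in sorted(summary, reverse=True):
        _, timeout_rate = summary[conns]
        if timeout_rate < 1.0:
            break
        cliff = conns
    return cliff


# Hypothetical samples for illustration only (not real measurements).
samples = [
    (10, 12.0), (25, 14.0), (38, 15.0), (39, 18.0),
    (41, None), (42, None), (45, None),
]
print(find_cliff(samples))  # 41
```

A gradual slowdown would show the median response time climbing with the connection count before any timeouts appear. What I see instead looks like the output above: fast responses right up to the threshold, then nothing but timeouts.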