May 19, 2012, 5:48 am GMT  

Postings

Keeping the lights on

Having spent years running 24×7 internet-facing production systems, I find that the monitoring element of an application delivery environment is often the last item to be addressed, and it tends to be built outside of the application delivery architecture. As we continue to build our application delivery infrastructure in the cloud, a good monitoring strategy will arm us with the information we need to make intelligent decisions.

So exactly what should be monitored?

Availability

The first element in a monitoring strategy is to determine whether the application is accessible. The simplest way to test availability is ping. However, as most applications sit behind a load balancer, a ping response doesn’t necessarily mean that the application is responding to requests. Use a monitoring system that can speak application-layer protocols to ensure that the application is indeed healthy and responding to user requests. It’s best to leverage a third-party solution that can assess availability from multiple networks and provide an unbiased view of the application’s availability.
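To make the difference between ping and an application-layer check concrete, here is a minimal sketch of an HTTP probe in Python, using only the standard library. The function name `check_http` and its parameters (`expected_status`, `must_contain`) are my own illustrative choices, not part of any particular monitoring product, and a real deployment would run a check like this from several networks.

```python
import urllib.error
import urllib.request


def check_http(url, timeout=5.0, expected_status=200, must_contain=None):
    """Run one application-layer probe and return (healthy, detail).

    Hypothetical helper: it requests `url` and verifies the status code and,
    optionally, that the body contains a known marker string.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return False, f"unreachable: {exc}"

    if status != expected_status:
        return False, f"unexpected status {status}"
    if must_contain is not None and must_contain not in body:
        return False, "response body missing expected text"
    return True, "ok"
```

Calling `check_http("http://app.example.com/health", must_contain="OK")` (a placeholder URL) returns a pair such as `(True, "ok")` or `(False, "HTTP 503")`. Checking the body as well as the status code helps catch the case where the load balancer answers but the application behind it does not.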

Resource Utilization / Load

The next element in a good monitoring strategy is to determine how healthy a system is. Tracking the load of various system components will enable us to uncover bottlenecks within the application delivery environment. Leverage SNMP to capture and record utilization statistics on CPU, memory, disk IO, network IO, threads, and so on. Graph these stats to establish a baseline and find correlations between the monitored elements. (more…)
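As a rough illustration of what "establish a baseline and find correlations" can mean in practice, the sketch below works on utilization samples that have already been collected (for example via SNMP polling, which is not shown here). The function names and the three-standard-deviation threshold are assumptions for the example, not a recommendation from any specific tool.

```python
from math import sqrt
from statistics import mean, pstdev


def baseline(samples):
    """Summarise a metric's normal range as (mean, population std deviation)."""
    return mean(samples), pstdev(samples)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / sqrt(var_x * var_y)


def is_anomalous(value, samples, k=3.0):
    """Flag a reading more than k standard deviations from the baseline mean."""
    mu, sigma = baseline(samples)
    return abs(value - mu) > k * sigma
```

A coefficient near 1 between, say, CPU utilization and disk IO suggests the two rise together, which is a hint about where a bottleneck might be. It is only a hint: correlation between two graphs does not by itself show which component is the cause.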

Filed under: cloud & virtualization, web X.0 — appgirl @ 9:15 am

application availability monitoring

The availability of any enterprise application is only as good as the monitoring solution behind it. A well-thought-out, well-deployed monitoring solution not only detects when an application has failed to respond to requests, it can also take steps to begin remedying the problem without human intervention. The last thing any admin wants is to be woken up at 3 a.m. only to hit the restart button for a downed service. (more…)
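The idea of a monitor that begins remediation on its own can be sketched as a small watchdog loop. Everything here is hypothetical: `check`, `restart`, and `escalate` are callables you would supply (a health probe, a service restart command, a page to the on-call admin), and the thresholds are arbitrary defaults chosen for the example.

```python
import time


def watchdog(check, restart, escalate, max_failures=3, max_restarts=2,
             interval=30.0, cycles=None, sleep=time.sleep):
    """Poll `check`; restart after repeated failures, escalate if that fails.

    check    -- callable returning True when the service is healthy
    restart  -- callable that attempts an automatic restart
    escalate -- callable that notifies a human
    cycles   -- number of polls to run (None means run forever)
    """
    failures = 0   # consecutive failed checks since the last action
    restarts = 0   # restarts attempted since the last healthy check
    polls = 0
    while cycles is None or polls < cycles:
        polls += 1
        if check():
            failures = 0
            restarts = 0
        else:
            failures += 1
            if failures >= max_failures:
                if restarts < max_restarts:
                    restart()
                    restarts += 1
                else:
                    # Restarts have not helped, so involve a person.
                    escalate()
                failures = 0
        sleep(interval)
```

Requiring several consecutive failures before acting avoids restarting a service over a single dropped request, and capping the number of automatic restarts means the admin still gets the 3 a.m. call when the restart button alone is not enough.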

Filed under: Uncategorized — appgirl @ 8:14 pm
