Infrastructure monitoring

Real-time monitoring and alerting for distributed infrastructures

Real-time monitoring of IT infrastructures is a critical topic.

Whatever the context, detecting system failures in near real time is key to SLA compliance, and a deep knowledge of the most critical failure points is one of the best ways to provide a solid and reliable infrastructure.

For this reason, I always set up redundant monitoring and alerting systems for the infrastructures I manage.

My tool of choice for node and resource monitoring is Amon.
