Jupyter notebooks. So we really are borrowing a lot of ideas from literate programming here.
AWS divides metrics into two categories: those that measure workload progress towards KPIs and those that measure workload health.
Basically, a metric by itself is seldom useful. What's useful is understanding how a given metric evolves over time, or how it behaves in conjunction with other metrics.
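To make the "evolves over time" point concrete, here is a minimal sketch that judges each data point against a trailing baseline rather than in isolation. The function name `flag_outliers`, the window size, the threshold, and the sample latencies are all made up for illustration; nothing here comes from AWS.

```python
from statistics import mean, stdev


def flag_outliers(values, window=5, threshold=3.0):
    """Return the indices of points that sit far from their trailing baseline.

    A point is flagged when it differs from the mean of the previous
    `window` points by more than `threshold` sample standard deviations.
    The window and threshold defaults are illustrative assumptions.
    """
    flagged = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]  # the `window` points just before i
        mu = mean(baseline)
        sigma = stdev(baseline)
        # With a perfectly flat baseline sigma is 0, so any change is flagged.
        if abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical latency samples in milliseconds, with one obvious spike.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 250, 101, 100]
print(flag_outliers(latency_ms))  # → [7]
```

The value 250 means nothing on its own; it only stands out because the preceding points hover around 100. Note also that the spike inflates the baseline for the points right after it, which is one reason "normal" is harder to pin down than it looks.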
What is “normal”, anyway?
What is “normal”, anyway? (Time series edition.)
Interesting automation tool: Amazon CloudWatch Synthetics. Basically an AWS service that automatically interacts with a service/application you've deployed in order to generate metric data points. The idea is to catch emerging problems without relying solely on users running into them. It's probably also a useful tool for continuously probing areas of a service/application that are seldom used.
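The underlying idea of a synthetic probe can be sketched in a few lines of plain Python. To be clear, this is not the CloudWatch Synthetics API: `run_probe`, the metric names, and `fake_export_report` are hypothetical names invented for illustration, and a real setup would run on a schedule and publish the data points to a metrics backend.

```python
import time
from datetime import datetime, timezone


def run_probe(name, check):
    """Run one synthetic check and return success and latency data points.

    `check` is any zero-argument callable that exercises the application
    and returns a truthy value on success. An exception counts as failure.
    """
    started = time.perf_counter()
    try:
        ok = bool(check())
    except Exception:
        ok = False
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    timestamp = datetime.now(timezone.utc).isoformat()
    return [
        {"metric": f"{name}.success", "value": 1 if ok else 0, "timestamp": timestamp},
        {"metric": f"{name}.latency_ms", "value": elapsed_ms, "timestamp": timestamp},
    ]


def fake_export_report():
    # Hypothetical stand-in for a rarely used feature of the application.
    return True


for point in run_probe("export_report", fake_export_report):
    print(point)
```

Run on a schedule, a probe like this produces a steady stream of data points even for features that real users rarely touch, so a regression shows up in the metrics before anyone reports it.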