Brendan Gregg, all-round performance guru and Netflix performance engineer, has a slide deck from AWS re:Invent on profiling and performance tuning EC2 instances.
He also references and links to the Utilization, Saturation, and Errors (USE) Method:
The USE Method provides a strategy for performing a complete check of system health, identifying common bottlenecks and errors. For each system resource, metrics for utilization, saturation and errors are identified and checked. Any issues discovered are then investigated using further strategies.
Which can be distilled to:
For every resource, check utilization, saturation, and errors.
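To make that one-liner concrete, here is a minimal Python sketch of the loop it implies. Everything in it is illustrative rather than taken from Gregg's material: the `ResourceMetrics` type, the `use_check` function, and the 80% utilization threshold are assumptions, and the sample numbers are made up. In practice the metrics would come from whatever tools your platform provides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ResourceMetrics:
    """Hypothetical container for one resource's USE metrics."""
    name: str
    utilization: float  # fraction of time the resource was busy, 0.0 to 1.0
    saturation: float   # amount of queued or waiting work (0 means none)
    errors: int         # count of error events


def use_check(resources: List[ResourceMetrics],
              util_threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Return (resource, metric) pairs that deserve a closer look.

    The threshold is an assumed default, not a value from the USE Method.
    """
    findings = []
    for r in resources:
        if r.utilization >= util_threshold:
            findings.append((r.name, "utilization"))
        if r.saturation > 0:
            findings.append((r.name, "saturation"))
        if r.errors > 0:
            findings.append((r.name, "errors"))
    return findings


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    sample = [
        ResourceMetrics("cpu", utilization=0.95, saturation=3.0, errors=0),
        ResourceMetrics("disk", utilization=0.10, saturation=0.0, errors=2),
        ResourceMetrics("network", utilization=0.20, saturation=0.0, errors=0),
    ]
    for name, metric in use_check(sample):
        print(f"{name}: investigate {metric}")
```

The point of the structure is that every resource gets all three checks, so a quiet-looking disk with a nonzero error count is flagged just as readily as a busy CPU.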