Amazon CloudWatch — Monitoring & Observability

By Pritesh Yadav 17 min read

Amazon CloudWatch is the AWS service that watches how your resources are performing — it collects metrics (numbers like CPU usage), stores logs, raises alarms, and reacts to events. The single most-tested confusion is CloudWatch versus CloudTrail: CloudWatch answers "is my system healthy and busy?" while CloudTrail answers "who called which API and when?". You also need to know what CloudWatch measures by default, what needs the CloudWatch agent, and how alarms can trigger SNS or Auto Scaling.

Most confused here: CloudWatch = performance/monitoring metrics & logs; CloudTrail = API call audit history. Also: default EC2 metrics (CPU, network, disk I/O) are free, but memory and disk-space-used need the CloudWatch agent; CloudWatch Logs = storing/searching logs, Logs Insights = querying them; alarms can notify (SNS) or act (Auto Scaling).

Q1 A security auditor asks, "Which IAM user deleted that S3 bucket last Tuesday, and from what IP address?" Which AWS service holds the answer?

  A. AWS CloudTrail, because it records the history of API calls made in the account
  B. Amazon CloudWatch Metrics, because it tracks resource performance over time
  C. Amazon CloudWatch Logs, because it stores application output
  D. Amazon EventBridge, because it routes events between services
Answer: A
Why A is correct: "Who did what, when, and from where" is an audit question. AWS CloudTrail records account API activity (management events such as DeleteBucket are logged by default) along with the identity, time, and source IP. That is exactly what an auditor needs.
Why the other options are wrong:
  • B — CloudWatch Metrics only stores performance numbers (CPU, network), not who called an API.
  • C — CloudWatch Logs stores application/system log text, not a tamper-aware record of account API activity.
  • D — EventBridge reacts to events in near real time; it is not the historical audit store you search after the fact.
Common trap: Any "who/when/which API" question points to CloudTrail, not CloudWatch. CloudWatch is about performance and health, never identity auditing.
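As a rough illustration of the audit answer, the sketch below pulls "who, what, when, from where" out of a single CloudTrail-style record. The record and its values (user name, IP, bucket) are invented for this example; only the field names (`userIdentity`, `eventName`, `eventTime`, `sourceIPAddress`) follow the CloudTrail record format. In a real account the records might come from boto3's `lookup_events` filtered on the event name, which is not called here.

```python
import json

# Hypothetical, trimmed CloudTrail record; all values are invented for illustration.
SAMPLE_RECORD = json.dumps({
    "eventTime": "2024-05-14T09:31:07Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "DeleteBucket",
    "sourceIPAddress": "203.0.113.25",
    "userIdentity": {"type": "IAMUser", "userName": "example-user"},
    "requestParameters": {"bucketName": "example-bucket"},
})

def summarize(record_json: str) -> dict:
    """Extract the audit answer (who, what, when, from where) from one record."""
    record = json.loads(record_json)
    identity = record.get("userIdentity", {})
    return {
        # IAM users carry userName; other identity types may only have an ARN.
        "who": identity.get("userName") or identity.get("arn", "unknown"),
        "what": record["eventName"],
        "when": record["eventTime"],
        "source_ip": record["sourceIPAddress"],
    }

print(summarize(SAMPLE_RECORD))
```

Nothing comparable exists in a CloudWatch metric datapoint, which carries only a timestamp, a value, and a unit.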

Q2 An operations team wants a single screen that shows EC2 CPU, RDS connections, and Lambda errors together, with auto-refreshing graphs for the on-call engineer. Which CloudWatch feature should they use?

  A. CloudWatch Alarms
  B. CloudWatch Dashboards
  C. CloudWatch Logs Insights
  D. AWS CloudTrail Insights
Answer: B
Why B is correct: CloudWatch Dashboards are customizable, shareable pages that display multiple metrics and graphs from many services on one screen — exactly the "single pane of glass" the team wants.
Why the other options are wrong:
  • A: A metric alarm watches a single metric (or metric math expression) and fires an action when a threshold is crossed; it is not a visual multi-metric screen.
  • C — Logs Insights queries log text, not a live dashboard of metric graphs.
  • D — CloudTrail Insights flags unusual API activity; it is not a performance dashboard.
Common trap: Confusing Alarms (notify/act on a threshold) with Dashboards (visualize many metrics at once). "Show me everything on one screen" = Dashboards.
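To make the "one screen, many services" idea concrete, here is a minimal sketch of a dashboard definition with three graph widgets. The instance ID, database identifier, function name, and region are placeholders; the namespaces and metric names (`AWS/EC2` `CPUUtilization`, `AWS/RDS` `DatabaseConnections`, `AWS/Lambda` `Errors`) are the standard ones. The resulting JSON string is the kind of value that would be passed as `DashboardBody` to the CloudWatch `put_dashboard` API, which this sketch does not call.

```python
import json

def metric_widget(title, metric, x, stat="Average"):
    """Build one graph widget; `metric` is [namespace, name, dim_name, dim_value]."""
    return {
        "type": "metric",
        "x": x, "y": 0, "width": 8, "height": 6,  # dashboard grid is 24 units wide
        "properties": {
            "title": title,
            "metrics": [metric],
            "stat": stat,
            "period": 300,
            "region": "us-east-1",  # assumed region
        },
    }

# Resource identifiers below are placeholders.
dashboard_body = {
    "widgets": [
        metric_widget("EC2 CPU",
                      ["AWS/EC2", "CPUUtilization", "InstanceId", "i-0123456789abcdef0"], x=0),
        metric_widget("RDS connections",
                      ["AWS/RDS", "DatabaseConnections", "DBInstanceIdentifier", "example-db"], x=8),
        metric_widget("Lambda errors",
                      ["AWS/Lambda", "Errors", "FunctionName", "example-fn"], x=16, stat="Sum"),
    ]
}

body_json = json.dumps(dashboard_body)
print(len(dashboard_body["widgets"]), "widgets")
```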

Q3 An engineer needs the percentage of memory used inside an EC2 instance to appear in CloudWatch. After launching the instance, that metric is nowhere to be found. Why, and what is the fix?

  A. Memory is always collected; the engineer just needs to enable detailed monitoring
  B. Memory metrics only appear for Auto Scaling groups, so create one first
  C. Memory is a guest-OS metric not visible to the hypervisor, so the CloudWatch agent must be installed
  D. Memory metrics require enabling AWS CloudTrail data events
Answer: C
Why C is correct: AWS can see metrics from outside the instance (CPU, network, disk I/O) but cannot see inside the operating system. Memory usage and disk-space-used live in the guest OS, so you must install the CloudWatch agent to collect and publish them as custom metrics.
Why the other options are wrong:
  • A — Detailed monitoring only changes how often the standard metrics report (1 minute instead of 5); it never adds memory.
  • B — Auto Scaling groups do not add memory metrics; the agent is still required.
  • D — CloudTrail records API calls and has nothing to do with memory performance metrics.
Common trap: Believing CPU and memory work the same way. CPU is a default free metric; memory and disk-space-used always require the CloudWatch agent.
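For reference, a minimal sketch of what a CloudWatch agent configuration for these two guest-OS metrics might look like on Linux. It is built as a Python dict and printed as JSON; the choice of the `/` mount point is an assumption, and a real configuration usually contains more sections (agent settings, dimensions, logs).

```python
import json

# Minimal metrics section for the CloudWatch agent (Linux), as an assumption-laden sketch.
agent_config = {
    "metrics": {
        "metrics_collected": {
            # Memory used, as a percentage, read from inside the guest OS.
            "mem": {"measurement": ["mem_used_percent"]},
            # Disk space used, as a percentage, for the root mount point only.
            "disk": {"measurement": ["used_percent"], "resources": ["/"]},
        }
    }
}

print(json.dumps(agent_config, indent=2))
```

Neither measurement has an equivalent among the default EC2 metrics, which is why the agent is needed.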

Q4 A startup wants to be emailed automatically the moment their estimated AWS charges for the month cross $200. What is the correct way to set this up?

  A. Enable AWS CloudTrail and filter for billing API calls
  B. Create an AWS Config rule on the billing account
  C. Open a support case asking AWS to cap spending at $200
  D. Create a CloudWatch billing alarm on estimated charges that notifies an SNS topic subscribed by email
Answer: D
Why D is correct: CloudWatch can monitor the "EstimatedCharges" billing metric, which is published only in the US East (N. Virginia) Region (us-east-1) and only after billing alerts are enabled in the account's billing preferences. You create an alarm at the $200 threshold and point it at an Amazon SNS topic; SNS then emails the subscribers when the alarm fires. This is the standard cost-alert pattern in CLF-C02.
Why the other options are wrong:
  • A — CloudTrail logs API calls; it does not track or alert on dollar amounts.
  • B — AWS Config checks resource configuration compliance, not spending thresholds.
  • C — AWS does not hard-cap spending on request; alerting (CloudWatch billing alarms or AWS Budgets) is the supported approach.
Common trap: Thinking a billing alarm itself sends the email. The alarm only changes state — it needs SNS (or Auto Scaling) as the action to actually notify or react.
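A sketch of the alarm definition behind this pattern, written as the keyword arguments one might pass to boto3's `put_metric_alarm` on a CloudWatch client created for us-east-1. The alarm name and topic ARN are hypothetical, the 6-hour period is a common choice rather than a requirement, and no AWS call is made here.

```python
def billing_alarm_params(threshold_usd: float, sns_topic_arn: str) -> dict:
    """Build put_metric_alarm arguments for an estimated-charges alarm."""
    return {
        "AlarmName": f"estimated-charges-over-{int(threshold_usd)}-usd",  # hypothetical name
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,              # 6 hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold_usd,
        "ComparisonOperator": "GreaterThanThreshold",
        # Without this action the alarm would change state and notify nobody.
        "AlarmActions": [sns_topic_arn],
    }

params = billing_alarm_params(
    200, "arn:aws:sns:us-east-1:123456789012:billing-alerts"  # placeholder ARN
)
print(params["AlarmName"])
```

The email address itself lives on the SNS topic as a subscription, not on the alarm.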

Q5 During a traffic spike, a company wants their fleet to automatically add EC2 instances when average CPU stays above 70%. Which combination makes this happen?

  A. CloudTrail detects the load and launches instances
  B. A CloudWatch alarm on CPU triggers an Auto Scaling action to add instances
  C. CloudWatch Logs Insights queries the spike and scales the fleet
  D. EventBridge stores the metric and Config adds capacity
Answer: B
Why B is correct: CloudWatch watches the CPU metric; when average CPU stays above the 70% threshold for the configured evaluation periods, the alarm enters the ALARM state and triggers an EC2 Auto Scaling policy, which launches more instances. CloudWatch detects, Auto Scaling acts.
Why the other options are wrong:
  • A — CloudTrail audits API calls; it has no role in detecting load or scaling.
  • C — Logs Insights analyzes log text; it does not monitor CPU metrics or scale fleets.
  • D — AWS Config tracks configuration compliance and cannot add EC2 capacity.
Common trap: Crediting Auto Scaling with the monitoring. Scaling policies act on CloudWatch metrics and alarms (target tracking policies create the alarms for you), so CloudWatch still does the detecting.
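The detect/act split can be sketched with a toy model. This is a deliberately simplified simulation, not how CloudWatch or Auto Scaling is implemented: the alarm goes to ALARM only when the last few datapoints all breach the threshold, and a stand-in "policy" adds one instance in response. The datapoints, evaluation period count, and capacity limits are made up.

```python
def alarm_state(datapoints, threshold=70.0, evaluation_periods=3):
    """Simplified model: ALARM only when the most recent N datapoints all breach."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"

def desired_capacity(current, state, step=1, max_size=10):
    """Stand-in for a scaling policy: add `step` instances while in ALARM."""
    return min(current + step, max_size) if state == "ALARM" else current

cpu = [55.0, 72.0, 81.0, 90.0]   # made-up average CPU per period
state = alarm_state(cpu)
print(state, desired_capacity(2, state))  # ALARM 3
```

A single spike, such as `[90.0, 60.0, 90.0]`, leaves this model in OK, which mirrors the "stays above 70%" wording in the question.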

Q6 By default (without any agent installed and without detailed monitoring), which EC2 metric does CloudWatch provide for free?

  A. CPU utilization, reported every 5 minutes
  B. Memory utilization, reported every 1 minute
  C. Disk-space-used percentage on each mounted volume
  D. Number of logged-in OS users
Answer: A
Why A is correct: CPU utilization is one of the standard, no-cost EC2 metrics AWS collects from outside the instance. With basic (default) monitoring it reports at 5-minute intervals; detailed monitoring drops that to 1 minute.
Why the other options are wrong:
  • B — Memory utilization is a guest-OS metric and is never collected by default; it needs the CloudWatch agent.
  • C — Disk-space-used (how full a volume is) also lives inside the OS and requires the agent. Default disk metrics are I/O activity, not space used.
  • D — Logged-in user counts are not a CloudWatch metric at all.
Common trap: Assuming "default" means everything. Default EC2 metrics are the externally visible ones (CPU, network, disk read/write activity, status checks) — not memory or disk fullness.

Q7 A team has gigabytes of application logs already centralized in CloudWatch Logs and wants to interactively run ad-hoc queries like "count errors per hour grouped by API path." Which feature is purpose-built for this?

  A. CloudWatch Dashboards
  B. CloudWatch Alarms
  C. Amazon Athena on the EC2 boot volume
  D. CloudWatch Logs Insights
Answer: D
Why D is correct: CloudWatch Logs Insights is the interactive query tool that runs over log data already in CloudWatch Logs. It lets you filter, aggregate, and group log events on the fly — exactly the "count errors per hour by path" use case.
Why the other options are wrong:
  • A — Dashboards visualize metrics; they do not run text queries against log content.
  • B — Alarms watch a threshold and fire actions; they are not a query engine.
  • C — Athena queries data in Amazon S3, not logs sitting in CloudWatch Logs, and not an EC2 boot volume.
Common trap: Mixing up CloudWatch Logs (the storage of log streams) with Logs Insights (the interactive query layer on top of that storage).
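A sketch of what the "count errors per hour grouped by API path" query might look like in the Logs Insights query language, wrapped in the arguments one could pass to boto3's `start_query`. It assumes the application writes JSON log lines with `level` and `path` fields; those field names and the log group name are hypothetical, and no AWS call is made.

```python
from datetime import datetime, timedelta, timezone

# Assumes JSON log events with "level" and "path" fields (hypothetical names).
QUERY = """
fields @timestamp, @message
| filter level = "ERROR"
| stats count(*) as errors by bin(1h), path
| sort errors desc
"""

def start_query_params(log_group, hours_back=24, now=None):
    """Build start_query arguments covering the last `hours_back` hours."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours_back)
    return {
        "logGroupName": log_group,
        "startTime": int(start.timestamp()),   # epoch seconds
        "endTime": int(now.timestamp()),
        "queryString": QUERY.strip(),
    }

params = start_query_params("/example/app")   # hypothetical log group
print(params["queryString"])
```

The query runs against data already stored in CloudWatch Logs; nothing has to be exported first.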

Q8 A company wants to automatically run a Lambda function whenever an EC2 instance changes state to "stopped," reacting in near real time. Which service is designed to route that event to the function?

  A. CloudWatch Dashboards
  B. AWS CloudTrail
  C. Amazon EventBridge (CloudWatch Events)
  D. CloudWatch Logs Insights
Answer: C
Why C is correct: Amazon EventBridge (formerly CloudWatch Events) matches events such as an EC2 state change against rules and routes them to targets like Lambda, SNS, or Step Functions in near real time. This event-driven automation is its core job.
Why the other options are wrong:
  • A — Dashboards only display metrics; they cannot trigger functions.
  • B — CloudTrail records the API history but does not natively route events to targets for automation.
  • D — Logs Insights queries stored logs; it does not react to live state-change events.
Common trap: Picking CloudTrail because it "knows" about the state change. CloudTrail logs it for audit, but EventBridge is the one that reacts and triggers action.
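A sketch of the rule side of this setup: the event pattern below is the usual shape for matching EC2 state-change events, and the small `matches` function is a simplified stand-in for EventBridge's matching (exact values and nested objects only, none of the richer operators). The sample event and instance ID are invented.

```python
# Pattern for "an EC2 instance entered the stopped state".
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}

def matches(pattern, event):
    """Simplified matcher: every pattern key must match one of its listed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True

# Invented sample event for illustration.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

print(matches(EVENT_PATTERN, sample_event))  # True
```

In a real rule, the Lambda function would be attached as a target, and the matched event would be delivered to it as the invocation payload.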

Q9 A manager says: "I just want a number that tells me how busy our database server is right now, and a graph over the last hour." Which CloudWatch concept matches that description?

  A. A log stream
  B. A metric
  C. An alarm
  D. An event rule
Answer: B
Why B is correct: A metric is a time-ordered set of numeric data points (like CPU or connection count) that CloudWatch can graph over time. "A number showing how busy it is, plotted over an hour" is the definition of a metric.
Why the other options are wrong:
  • A — A log stream holds text log entries, not the single performance number being described.
  • C — An alarm watches a metric and changes state at a threshold; it is built on a metric but is not the number itself.
  • D — An event rule routes events to targets; it is not a measured performance value.
Common trap: Blurring metric and alarm. The metric is the measurement; the alarm is the watcher that reacts when that measurement crosses a line.

Q10 Which statement best captures the core difference between Amazon CloudWatch and AWS CloudTrail?

  A. CloudWatch monitors performance and operational health; CloudTrail records API activity for auditing and governance
  B. CloudWatch records API activity; CloudTrail monitors CPU and memory
  C. Both store the same data, but CloudTrail is only for billing
  D. CloudWatch is only for S3, and CloudTrail is only for EC2
Answer: A
Why A is correct: This is the central exam distinction. CloudWatch is about operational performance — metrics, logs, alarms, dashboards (is it healthy and busy?). CloudTrail is about governance and audit — a record of who made which API calls and when.
Why the other options are wrong:
  • B — This reverses the two services completely.
  • C — They store different data, and CloudTrail is for API auditing, not billing.
  • D — Both services work across nearly all AWS services, not single ones.
Common trap: Swapping the two roles under exam pressure. Anchor it: Watch = performance you watch; Trail = the trail of who did what.

Q11 An alarm has been created on a metric with a threshold, but when the threshold is crossed nothing happens — no email, no scaling. What is the most likely cause?

  A. Alarms cannot send notifications; only dashboards can
  B. The metric must be a custom metric for alarms to work
  C. CloudTrail was not enabled, so the alarm has no permission
  D. The alarm has no action configured, such as an SNS topic or Auto Scaling policy
Answer: D
Why D is correct: An alarm changing to the ALARM state does nothing visible on its own — it must have an action attached. To notify a person you point it at an SNS topic; to scale you point it at an Auto Scaling policy. With no action, the alarm flips state silently.
Why the other options are wrong:
  • A — Alarms absolutely can notify, via SNS; dashboards only display data.
  • B — Alarms work on both default and custom metrics; custom is not required.
  • C — CloudTrail is unrelated to whether an alarm executes its action.
Common trap: Assuming an alarm self-notifies. The alarm only detects; SNS (notify) or Auto Scaling (act) does the actual response.
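One way to spot this situation is to scan alarm definitions for missing or disabled actions. The sketch below works on dicts shaped like the `MetricAlarms` entries returned by the CloudWatch `describe_alarms` API (`AlarmName`, `ActionsEnabled`, `AlarmActions`); the two sample alarms are invented and no AWS call is made.

```python
def silent_alarms(metric_alarms):
    """Return names of alarms that would change state without notifying or acting."""
    silent = []
    for alarm in metric_alarms:
        has_action = bool(alarm.get("AlarmActions"))
        enabled = alarm.get("ActionsEnabled", True)
        if not has_action or not enabled:
            silent.append(alarm["AlarmName"])
    return silent

# Invented sample data in the shape of describe_alarms()["MetricAlarms"].
alarms = [
    {"AlarmName": "cpu-high", "ActionsEnabled": True, "AlarmActions": []},
    {"AlarmName": "latency-high", "ActionsEnabled": True,
     "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-alerts"]},
]

print(silent_alarms(alarms))  # ['cpu-high']
```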

Q12 A team enables "detailed monitoring" on their EC2 instances. What does this actually change?

  A. It adds memory and disk-space metrics automatically
  B. It turns on CloudTrail API logging for those instances
  C. It increases the reporting frequency of standard metrics from 5 minutes to 1 minute, for an extra cost
  D. It makes CloudWatch Logs searchable with Logs Insights
Answer: C
Why C is correct: Detailed monitoring simply makes the existing standard EC2 metrics report at a finer 1-minute interval instead of the default 5 minutes, and it carries an additional charge. It improves granularity, not the list of metrics.
Why the other options are wrong:
  • A — Memory and disk-space-used still require the CloudWatch agent regardless of detailed monitoring.
  • B — Detailed monitoring has nothing to do with CloudTrail API logging.
  • D — Logs Insights availability is unrelated to EC2 monitoring frequency.
Common trap: Thinking detailed monitoring unlocks new metric types like memory. It only changes how often the same metrics are sampled.

Q13 A developer wants their application's custom log files (for example, request traces written by their code) to be centrally stored and viewable in AWS. Which service should they send those logs to?

  A. AWS CloudTrail
  B. Amazon CloudWatch Logs
  C. Amazon EventBridge
  D. CloudWatch Metrics
Answer: B
Why B is correct: CloudWatch Logs is the service for collecting, centralizing, and storing application and system log text. The developer can ship their app logs there (often via the CloudWatch agent) and view or query them centrally.
Why the other options are wrong:
  • A — CloudTrail records AWS API calls, not arbitrary application log output the developer writes.
  • C — EventBridge routes discrete events; it is not a log storage service.
  • D — Metrics store numeric data points, not free-form log text.
Common trap: Confusing CloudTrail (AWS API audit logs) with CloudWatch Logs (your application's own log text). Custom app logs go to CloudWatch Logs.
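For a sense of what "shipping app logs" means at the API level, here is a small sketch that turns application log lines into the event shape accepted by the CloudWatch Logs `put_log_events` API (a millisecond `timestamp` and a `message`). The log lines are invented; in practice the CloudWatch agent typically does this work, and the log group and stream names would also be needed for a real call.

```python
import time

def to_log_events(lines, now_ms=None):
    """Convert non-empty text lines into put_log_events-style event dicts."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    return [{"timestamp": now_ms, "message": line} for line in lines if line.strip()]

# Invented request-trace lines from a hypothetical application.
events = to_log_events([
    "GET /cart 200 35ms",
    "",
    "POST /checkout 500 1204ms",
])
print(len(events), "events")  # 2 events
```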

Q14 An e-commerce company wants to receive an SMS and email alert whenever order-processing latency exceeds 2 seconds. Which pairing of services achieves this?

  A. A CloudWatch alarm on the latency metric, with Amazon SNS as the notification action
  B. AWS CloudTrail with an email subscription
  C. CloudWatch Logs Insights with a scheduled query
  D. AWS Config with a remediation rule
Answer: A
Why A is correct: CloudWatch monitors the latency metric and the alarm fires when it passes 2 seconds. The alarm's action is an Amazon SNS topic, which can fan out to both email and SMS subscribers at once. This is the canonical alarm-plus-SNS notification pattern.
Why the other options are wrong:
  • B — CloudTrail audits API calls and does not watch latency or send threshold alerts.
  • C — Logs Insights runs queries but is not a real-time alerting mechanism for metric thresholds.
  • D — AWS Config enforces resource configuration compliance, not performance alerting.
Common trap: Forgetting that the alarm needs SNS to actually deliver email/SMS. The alarm detects; SNS delivers to multiple channels.

Q15 A compliance officer needs to prove which API calls were made in the account over the last 90 days, while the operations team needs to know if servers are overloaded right now. Which mapping is correct?

  A. Both needs are served by CloudWatch alone
  B. Both needs are served by CloudTrail alone
  C. CloudTrail for the API-call history; CloudWatch for current server load
  D. EventBridge for the API history; Config for current load
Answer: C
Why C is correct: The two needs map to the two services cleanly. CloudTrail provides the audit history of API calls for compliance (Event history retains the last 90 days), and CloudWatch provides real-time performance metrics so operations can see if servers are overloaded.
Why the other options are wrong:
  • A — CloudWatch does not record the API-call audit trail.
  • B — CloudTrail does not provide live CPU/load performance metrics.
  • D — EventBridge routes events (not a 90-day audit store) and Config checks configuration, not live load.
Common trap: Trying to force one service to do both jobs. Compliance/audit = CloudTrail; performance/health = CloudWatch. They are complementary, not interchangeable.

Q16 A team wants a job to run every day at 6 AM UTC to clean up temporary files using a Lambda function, with no servers to manage. Which CloudWatch-family capability provides this scheduled trigger?

  A. A CloudWatch alarm with a time-based threshold
  B. CloudWatch Logs subscription filters
  C. A CloudWatch Dashboard widget
  D. An Amazon EventBridge (CloudWatch Events) scheduled rule
Answer: D
Why D is correct: EventBridge (formerly CloudWatch Events) supports scheduled rules using cron or rate expressions. A rule set for 6 AM UTC daily can target a Lambda function, giving a serverless cron-style trigger.
Why the other options are wrong:
  • A — Alarms react to metric thresholds, not clock schedules.
  • B — Subscription filters forward matching log events; they are not a daily scheduler.
  • C — A dashboard widget only displays data and cannot trigger anything.
Common trap: Reaching for an "alarm on time." Scheduled/cron-style triggering belongs to EventBridge rules, not CloudWatch alarms.
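The schedule itself is a short expression. The sketch below builds the six-field EventBridge cron expression for a daily UTC time and places it in the arguments one might pass to the EventBridge `put_rule` API; the rule name is hypothetical and no AWS call is made.

```python
def daily_cron(hour_utc: int, minute: int = 0) -> str:
    """EventBridge cron fields: minute hour day-of-month month day-of-week year."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # "?" in day-of-week is required because day-of-month is already "*".
    return f"cron({minute} {hour_utc} * * ? *)"

rule_params = {
    "Name": "daily-temp-cleanup",          # hypothetical rule name
    "ScheduleExpression": daily_cron(6),   # 06:00 UTC every day
    "State": "ENABLED",
}

print(rule_params["ScheduleExpression"])  # cron(0 6 * * ? *)
```

The Lambda function would then be attached to the rule as a target, the same way as for an event-pattern rule.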

Q17 A new engineer claims, "CloudWatch can't help me — my app writes everything to log files, not metrics." What is the most accurate response within CLF-C02 scope?

  A. Correct: CloudWatch only handles numeric metrics
  B. Incorrect: CloudWatch also includes CloudWatch Logs, which can centrally store and search application log files
  C. Correct: log files must go to CloudTrail instead
  D. Incorrect: log files can only be stored in Amazon RDS
Answer: B
Why B is correct: CloudWatch is more than metrics — CloudWatch Logs centralizes application and system log files, and Logs Insights lets you search them. So the engineer's app logs fit naturally into CloudWatch.
Why the other options are wrong:
  • A — CloudWatch handles both metrics and logs, so this is false.
  • C — CloudTrail is for AWS API audit records, not the app's own log files.
  • D — RDS is a relational database, not the standard destination for application log files.
Common trap: Thinking CloudWatch equals only metrics. It is a broader observability suite: Metrics, Logs, Alarms, Dashboards, and Events (now Amazon EventBridge).

Q18 A company publishes its own business value — "active shopping carts" — into CloudWatch so they can alarm on it. What is this type of data point called, and how does it get there?

  A. A custom metric, published to CloudWatch by the application or the CloudWatch agent
  B. A default metric, collected automatically by AWS
  C. A CloudTrail event, generated by an API call
  D. A dashboard widget that AWS calculates for you
Answer: A
Why A is correct: A business value like "active shopping carts" is not something AWS measures, so the application (or the CloudWatch agent) must push it to CloudWatch as a custom metric. Once there, it can be graphed and alarmed just like any built-in metric.
Why the other options are wrong:
  • B — Default metrics are AWS-collected infrastructure values (CPU, network); an app-specific business count is not one.
  • C — CloudTrail events record API calls, not arbitrary business measurements.
  • D — A dashboard widget only displays data; AWS does not invent a business metric for you.
Common trap: Assuming any number in CloudWatch is "default." Application/business values are custom metrics you must publish yourself; only standard infrastructure metrics come free.
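A sketch of what publishing such a value might look like: the dict below has the shape of the arguments to the CloudWatch `put_metric_data` API. The namespace, metric name, dimension, and count are all made up for illustration, and no AWS call is made.

```python
def active_carts_datum(count: int, store: str) -> dict:
    """Build one custom-metric datapoint for the current number of active carts."""
    return {
        "MetricName": "ActiveShoppingCarts",                 # hypothetical metric name
        "Dimensions": [{"Name": "Store", "Value": store}],   # hypothetical dimension
        "Value": float(count),
        "Unit": "Count",
    }

put_metric_data_params = {
    # Custom namespaces are chosen by the publisher; the "AWS/" prefix is reserved.
    "Namespace": "Example/Shop",
    "MetricData": [active_carts_datum(42, "eu-web")],
}

print(put_metric_data_params["MetricData"][0]["Value"])  # 42.0
```

Once datapoints like this arrive, an alarm can reference the same namespace and metric name exactly as it would for a built-in metric.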
