Amazon Macie

A fully managed data security service that uses machine learning and pattern matching to discover and classify sensitive data in Amazon S3. Macie answers "where is my sensitive data, and is it exposed," not "is something attacking me" (GuardDuty) or "is there a known vulnerability" (Inspector).

Detective: data discovery | Data Protection: primary role | Compliance: GDPR/HIPAA support
1 GB: free sensitive-data scanning per account, per region, per month
30 days: free trial
100+: managed data identifiers
10,000: buckets covered by posture monitoring

How Macie Actually Works

The Core Mechanism — Two Distinct Jobs, Often Confused
⚠️ The Recurring Exam Theme

Nearly every Macie question tests one of three things: (1) whether you know that Macie's sensitive-data scanning scope is S3 only (it does not natively scan RDS, DynamoDB, EBS, or EFS); (2) whether you can distinguish automated discovery (continuous, sampled, low-cost) from a targeted job (on-demand, can be full-scan, more expensive); and (3) whether you can separate Macie's job (classify sensitive data) from GuardDuty S3 Protection's job (detect anomalous access to that data).

Exam Domain Mapping

Domain | Where Macie Shows Up
Data Protection | The centerpiece: sensitive data classification, managed/custom identifiers, encryption-aware scanning, GDPR/HIPAA use cases
Security Logging & Monitoring | Findings to Security Hub/EventBridge, automated remediation of exposed buckets
Management & Security Governance | Organizations-wide enablement, per-account automated discovery control, exclusion lists
Infrastructure Security | Bucket-level posture findings (public access, encryption, external sharing)

Decision Tree — Mental Model

Threat

Unknown locations of sensitive data (PII, financial data, credentials) in S3; publicly exposed or unencrypted buckets containing that data; regulatory non-compliance (GDPR/HIPAA)

Security Goal

Discover where sensitive data lives across S3, continuously and cost-efficiently, and evaluate the security posture of the buckets holding it

AWS Service

Amazon Macie

  • Bucket posture evaluation (always-on, free)
  • Automated sensitive data discovery (sampled, continuous)
  • Targeted discovery jobs (on-demand, full-scan capable)
  • Sample retrieval (validate findings)
Implementation

Enable via Organizations delegated administrator. Configure automated discovery scope per account; exclude non-sensitive buckets. Define custom identifiers and allow lists.

Monitoring

Findings → EventBridge (routing) and Security Hub (aggregation). Sensitivity scores guide which buckets need a deeper targeted job.

Remediation

EventBridge-triggered Lambda blocks public access, enables default encryption, or revokes external sharing on a flagged bucket.
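
The remediation step can be sketched as a small Lambda handler. This is a minimal sketch, not a production remediation: the choice of finding types, the handler name, and the injectable `s3_client` parameter are assumptions made for illustration, and the event is assumed to carry the bucket name under `detail.resourcesAffected.s3Bucket.name`. It covers only the "block public access" action.

```python
"""Sketch: EventBridge-triggered Lambda that blocks public access on a flagged bucket."""

# Finding types treated as "bucket is exposed" in this sketch (an assumed selection).
PUBLIC_FINDING_TYPES = {
    "Policy:IAMUser/S3BucketPublic",
    "Policy:IAMUser/S3BlockPublicAccessDisabled",
}


def extract_bucket(event):
    """Return the bucket name from a Macie finding event, or None if it does not qualify."""
    detail = event.get("detail", {})
    if detail.get("type") not in PUBLIC_FINDING_TYPES:
        return None
    return detail.get("resourcesAffected", {}).get("s3Bucket", {}).get("name")


def handler(event, context=None, s3_client=None):
    """Lambda entry point. s3_client is injectable so the logic can be exercised offline."""
    bucket = extract_bucket(event)
    if bucket is None:
        return {"action": "ignored"}
    if s3_client is None:
        import boto3  # imported lazily; only needed when no client is injected
        s3_client = boto3.client("s3")
    # Turn on all four S3 Block Public Access settings for the bucket.
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return {"action": "blocked_public_access", "bucket": bucket}
```

Passing the client in as a parameter keeps the decision logic testable without AWS credentials; in a deployed function the Lambda execution role would also need permission to change the bucket's public access block.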

Final Summary

Must Memorize
  • Macie's sensitive-data scanning scope is S3 only
  • Bucket posture evaluation is always-on and free; sensitive-data scanning is billable
  • Automated discovery (sampled, continuous) vs targeted jobs (on-demand, can be full-scan)
  • Macie = data classification, not anomalous access detection (that's GuardDuty S3 Protection)
Must Understand
  • Why resource clustering/sampling makes automated discovery cost-efficient at scale
  • Managed vs custom data identifiers, and when allow lists matter
  • The EventBridge-driven remediation pattern for exposed buckets
  • The distinction triangle: Macie (what data exists) vs GuardDuty S3 Protection (who's accessing it) vs Inspector (vulnerabilities)
Can De-prioritize
  • Exact list of all 100+ managed data identifiers
  • Console UI navigation specifics
  • Precise regional pricing figures

Exam appearance probability: HIGH

Discovery Mechanics & Capabilities

Macie's two layers — posture evaluation and content discovery — operate independently. Understanding which layer a finding came from is half the exam battle.

2.1 Automatic Bucket-Level Posture Evaluation (frequently misunderstood)
Trigger: Automatic the moment Macie is enabled; no job configuration needed
Checks: Public accessibility, encryption status, sharing/replication outside the AWS Organization
Cost: Free; part of the always-on inventory, distinct from billable content scanning
2.2 Automated Sensitive Data Discovery (high exam relevance)
What: Continuous, intelligently sampled scanning that builds an interactive data map and sensitivity score per bucket
How: Resource clustering by bucket name, file type, and prefix minimizes redundant scanning across similar objects
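
To make the cost argument concrete, the toy sketch below groups objects by bucket, top-level prefix, and file extension, then picks one representative per group. It only illustrates the idea of clustering plus sampling; it is not Macie's actual selection logic, and the function names, the "first object in sorted order" rule, and the sample data are all assumptions.

```python
from collections import defaultdict
import os


def cluster_key(bucket, key):
    """Group objects by bucket, top-level prefix, and file extension."""
    prefix = key.split("/", 1)[0] if "/" in key else ""
    extension = os.path.splitext(key)[1].lower()
    return (bucket, prefix, extension)


def pick_samples(objects, per_cluster=1):
    """Return up to per_cluster representative objects from each cluster."""
    clusters = defaultdict(list)
    for bucket, key in objects:
        clusters[cluster_key(bucket, key)].append((bucket, key))
    samples = []
    for members in clusters.values():
        # Deterministic choice for the sketch: take the first objects in sorted order.
        samples.extend(sorted(members)[:per_cluster])
    return sorted(samples)


objects = [
    ("hr-data", "payroll/2023-01.csv"),
    ("hr-data", "payroll/2023-02.csv"),
    ("hr-data", "payroll/2023-03.csv"),
    ("hr-data", "photos/badge-001.png"),
    ("hr-data", "photos/badge-002.png"),
]
samples = pick_samples(objects)
print(f"{len(samples)} of {len(objects)} objects selected")  # 2 of 5 objects selected
```

Five objects collapse into two clusters, so only two are inspected. The same property explains why sampled results cannot stand in for a documented full scan: three of the five objects were never opened.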
2.3 Targeted (Classic) Sensitive Data Discovery Jobs (high exam relevance)
What: Explicitly configured, on-demand or scheduled job scanning specific buckets; can scan fully, not just sampled
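
A one-time, full-scan job of this kind can be sketched as a request for the `macie2` `create_classification_job` API in boto3. The helper name, account ID, bucket name, and job name below are placeholders, and the sketch only builds the request; the actual call is shown in a comment because it needs credentials and an account with Macie enabled.

```python
import uuid


def build_full_scan_job(account_id, bucket_names, job_name):
    """Build keyword arguments for macie2 create_classification_job (one-time, 100% sampling)."""
    return {
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "jobType": "ONE_TIME",             # "SCHEDULED" would also need scheduleFrequency
        "name": job_name,
        "samplingPercentage": 100,         # analyze every eligible object, no sampling
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(bucket_names)}
            ]
        },
    }


# Placeholder account ID and bucket name, for illustration only.
params = build_full_scan_job("111122223333", ["example-phi-bucket"], "audit-full-scan")

# With credentials and Macie enabled, the request could be submitted as:
#   import boto3
#   boto3.client("macie2").create_classification_job(**params)
```

Setting `samplingPercentage` to 100 is what separates this from sampled discovery and makes the job suitable for the full-scan audit case.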
2.4 Managed & Custom Data Identifiers
Managed identifiers: 100+ built-in patterns covering PII, financial data (credit cards), credentials (AWS keys, Stripe keys, Google Cloud keys), and regional government IDs
Custom identifiers: Regex-based patterns you define for proprietary/organization-specific sensitive data formats
2.5 Allow Lists
Purpose: Suppress known false positives, e.g. test/synthetic data that matches a sensitive-data pattern but isn't actually sensitive
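
The relationship between a custom identifier and an allow list can be illustrated locally with plain regex matching. This is a sketch of the concept only, not of how Macie evaluates identifiers internally; the `EMP-` format, the allow-list values, and the function name are hypothetical.

```python
import re

# Hypothetical proprietary format: "EMP-" followed by six digits.
EMPLOYEE_ID = re.compile(r"\bEMP-\d{6}\b")

# Known synthetic values that should never raise a finding.
ALLOW_LIST = {"EMP-000000", "EMP-123456"}


def find_sensitive(text, pattern=EMPLOYEE_ID, allow_list=ALLOW_LIST):
    """Return pattern matches in text, minus anything on the allow list."""
    return [match for match in pattern.findall(text) if match not in allow_list]


sample = "Test row EMP-123456, real rows EMP-482913 and EMP-770021."
print(find_sensitive(sample))  # ['EMP-482913', 'EMP-770021']
```

The custom identifier widens what is detected, while the allow list narrows what is reported. That is why a custom identifier is the wrong answer when the goal is to suppress known test data.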
2.6 Sensitive Data Sample Retrieval
What: One-click, temporary retrieval of up to 10 examples of the sensitive data found in an object
Security: Encrypted with customer-managed KMS keys, viewable only temporarily within the console
2.7 Encryption-Aware Scanning
What: Supports analyzing objects encrypted with dual-layer server-side encryption using KMS keys (DSSE-KMS)

AWS Exam Thinking

Requirement → Keywords → Expected Answer → why every distractor fails.

Find where PII/financial data lives across S3
Keywords: PII, "where is sensitive data", GDPR / HIPAA
Expected Answer

Amazon Macie

Distractor | Why it's wrong
GuardDuty S3 Protection | Detects anomalous access patterns; doesn't classify the data's content/sensitivity
Inspector | Finds software vulnerabilities, not data content
AWS Config | Evaluates configuration compliance, not object content

Detect a publicly exposed S3 bucket without running a data scan
Keywords: public bucket, no scan required, free
Expected Answer

Macie's automatic bucket-level posture evaluation (always-on, free)

Distractor | Why it's wrong
Run a targeted sensitive data discovery job | Unnecessary cost/complexity; posture evaluation already covers public/encryption/sharing status automatically
AWS Config + S3 public access rule | Valid alternative, but Macie's posture evaluation is purpose-built and free the moment Macie is enabled

Continuously, cost-efficiently discover sensitive data org-wide
Keywords: continuous, cost-efficient, organization-wide
Expected Answer

Macie automated sensitive data discovery

Distractor | Why it's wrong
Run a full targeted discovery job on every bucket every day | Massively more expensive; automated discovery's sampling/clustering is specifically built to avoid this cost

Satisfy an audit requirement for a documented full scan of a specific bucket
Keywords: audit, full scan, specific bucket
Expected Answer

Macie targeted (classic) sensitive data discovery job

Distractor | Why it's wrong
Rely on automated discovery's sampled results | Sampling, by design, doesn't guarantee every object was inspected, so it is insufficient for a documented full-scan audit requirement

Suppress repeated false-positive findings on known test data
Keywords: false positive, test data
Expected Answer

Macie allow list

Distractor | Why it's wrong
Custom data identifier | Used to find MORE specific sensitive patterns, not to suppress known-safe ones

Discover sensitive data stored in RDS / DynamoDB / EBS
Keywords: RDS, DynamoDB, non-S3
Expected Answer

Not Macie: Macie's content scanning is S3-only

Distractor | Why it's wrong
Macie | This is the classic trap. Macie does NOT natively scan RDS, DynamoDB, EBS, or EFS for sensitive data. Exporting/extracting data to S3 first would be required, or a different approach entirely

Security Controls Mapping & Integrations

4 — Controls Mapping

Detective

Sensitive data discovery findings and bucket posture findings — e.g. "S3 object contains PII" or "bucket is publicly accessible"

Data Protection

Macie's core identity — classification of sensitive data and assessment of the controls (encryption, access) protecting it

Responsive (via integration, not native)

EventBridge → Lambda triggered by a finding — e.g. automatically apply S3 Block Public Access or enable default bucket encryption when an exposed-sensitive-bucket finding fires

Compliance

Supports GDPR/HIPAA-driven data mapping and audit requirements through targeted discovery jobs and continuous data maps

⚠️ Macie is NOT preventive and does NOT detect anomalous access

It classifies what data exists and evaluates static bucket posture. It does not block access, and it does not flag unusual GetObject patterns (that's GuardDuty S3 Protection's job).

5 — Integrations

EventBridge
What: All Macie findings are sent to EventBridge
Pattern: Finding → rule → Lambda (e.g. automatically block public access to a flagged bucket)
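
The "rule" step in this pattern is an EventBridge event pattern. The sketch below shows one such pattern for public-bucket findings, together with a deliberately simplified matcher that helps show which events the rule would select. The matcher handles only exact-value lists and nested objects, a small subset of real EventBridge matching, and the sample event is trimmed to the fields used here.

```python
import json

# Pattern for an EventBridge rule that selects Macie policy findings about public buckets.
MACIE_PUBLIC_BUCKET_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
    "detail": {"type": ["Policy:IAMUser/S3BucketPublic"]},
}


def matches(pattern, event):
    """Simplified matcher: a list means "any of these exact values", a dict recurses."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Trimmed example event; real findings carry many more fields.
event = {
    "source": "aws.macie",
    "detail-type": "Macie Finding",
    "detail": {"type": "Policy:IAMUser/S3BucketPublic", "severity": {"description": "High"}},
}
print(json.dumps(MACIE_PUBLIC_BUCKET_PATTERN))       # JSON form usable as a rule's event pattern
print(matches(MACIE_PUBLIC_BUCKET_PATTERN, event))   # True
```

Filtering on `detail.type` in the rule keeps sensitive-data findings away from a Lambda whose only job is to fix bucket exposure.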
Security Hub
What: Findings can be published into Security Hub for aggregation
Why: Cross-service correlation; exposure findings can combine Macie's "sensitive data present" signal with GuardDuty/Inspector signals on the same resource
Organizations
What: Delegated administrator model with multi-account support
Capability: Per-account enable/disable of automated discovery, bucket exclusions, member account read access to their own stats
KMS
What: Encrypts retrieved sensitive data samples; supports analyzing DSSE-KMS encrypted objects
Athena & QuickSight
What: Macie discovery results can be queried/visualized via Athena and QuickSight for custom reporting

Costs, Limits & Quotas

Pricing Model

Bucket posture evaluation: Always free, regardless of other Macie usage
Sensitive data discovery: First 1 GB per account per region per month free; billed thereafter based on data evaluated
Trial: 30-day free trial including automated discovery and S3 bucket-level evaluation
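
The free-tier arithmetic for sensitive data discovery can be sketched as follows. The per-GB rate is a placeholder rather than an official price, and the sketch assumes a single flat rate; actual rates vary by region and volume, so treat the output as an illustration of how the 1 GB allowance applies and not as a cost forecast.

```python
FREE_GB_PER_MONTH = 1.0      # free allowance per account, per region, per month
ASSUMED_RATE_PER_GB = 1.00   # placeholder USD rate, not an official price


def discovery_cost(gb_evaluated, rate_per_gb=ASSUMED_RATE_PER_GB, free_gb=FREE_GB_PER_MONTH):
    """Estimate one account's monthly charge in one region for data evaluated."""
    billable_gb = max(0.0, gb_evaluated - free_gb)
    return round(billable_gb * rate_per_gb, 2)


print(discovery_cost(0.5))   # 0.0   (entirely within the free allowance)
print(discovery_cost(250))   # 249.0 (249 billable GB at the placeholder rate)
```

Because the allowance applies per account and per region, the estimate has to be repeated for each account and region pair, not applied once to an organization-wide total.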

Common Cost Mistakes

Cost Optimization

Limits & Quotas

Scope: Regional service; enable per region
Data source: S3 only for content/sensitive-data scanning; no native RDS/DynamoDB/EBS/EFS support
Posture monitoring scale: Up to 10,000 general purpose S3 buckets per account
Custom identifiers: Regex-based, account-defined, count-limited per account
⚠️ Exam trap

The S3-only scope is the single most tested limitation. A scenario asking to discover PII inside an RDS database or DynamoDB table is explicitly testing whether you know Macie cannot do this natively — data would need to be exported/extracted to S3 first, or a different tool used entirely.

Best Practices & Common Exam Traps

8 — Best Practices

Must Know
  • Macie's content scanning scope is S3 only
  • Bucket posture evaluation is always-on and free; content scanning is billable
  • Automated discovery (sampled, continuous, cheap) vs targeted jobs (on-demand, full-scan capable, audit-grade)
  • Macie classifies data; it does not detect anomalous access (GuardDuty's job) or block anything
Good Practice
  • Enable org-wide via delegated administrator
  • Exclude clearly non-sensitive buckets from automated discovery to control cost
  • Forward findings to EventBridge for automated remediation of exposed buckets
  • Use allow lists to cut down on repeated known-false-positive review time
Advanced Practice
  • Build custom data identifiers for organization-specific proprietary data formats
  • Use Athena/QuickSight to build custom compliance dashboards from discovery results
  • Schedule targeted jobs specifically aligned to compliance audit cycles (e.g. quarterly HIPAA scans of PHI buckets)

9 — Common Exam Traps

Misconception | Reality
"Macie scans RDS, DynamoDB, or EBS for sensitive data" | Macie's content discovery is S3-only; there is no native support for other data stores
"All Macie functionality costs money" | Bucket-level posture evaluation (public/encrypted/shared status) is always free and automatic
"Automated discovery scans every object in every bucket" | It intelligently samples using resource clustering; it does not guarantee every object was inspected, unlike a full targeted job
"Macie detects who is accessing sensitive data and when" | That's GuardDuty S3 Protection's job; Macie only classifies what sensitive data exists, not access patterns
"Macie blocks public access automatically" | Macie only generates a finding; actually blocking access requires a separate remediation action (e.g. S3 Block Public Access via EventBridge/Lambda)

Macie vs. The Lookalikes

Service | What it actually answers
vs GuardDuty S3 Protection | Macie = what sensitive data exists in this bucket (content classification). GuardDuty S3 Protection = is access to this bucket behaving anomalously (behavioral). Complementary: Macie tells you what's at risk, GuardDuty tells you if it's being attacked
vs Inspector | Macie = sensitive data content. Inspector = software vulnerabilities. Different problem spaces entirely, occasionally paired as distractors
vs AWS Config | Config evaluates resource CONFIGURATION against rules generally. Macie's posture evaluation is purpose-built specifically for S3 (public/encrypted/shared) plus the unique content-classification layer Config cannot do at all
vs Security Hub | Macie generates the underlying sensitive-data finding. Security Hub aggregates it alongside GuardDuty/Inspector/CSPM findings to build correlated exposure findings; Security Hub doesn't classify data itself
