Introduction
As more organizations around the world adopt the cloud, and AWS in particular, for their business infrastructure because of the benefits it offers over on-premises systems, one common question keeps coming up: “How is the data I store in the cloud or on AWS kept safe? What measures can my organization take to protect sensitive data and follow security best practices?” If you are hosting your architecture on AWS, you’re in luck, because I have just the right service for you! Introducing Amazon Macie, a fully managed AWS service that uses machine learning (ML) and pattern matching to safeguard your most critical data, especially customer information stored in your S3 buckets. So, let’s see how it works and how you can integrate it into your current architecture to improve your security compliance.
How exactly does Macie work?
Let’s walk through how Macie works in simple terms. Suppose you are a B2C business that works closely with its customers. You hold some of their most sensitive data: personally identifiable information (PII), financial records, card information, or other private data. To store and manage this data, you create an S3 bucket, put all of this information in it, and set permissions so that only authorized personnel can read or write its contents. Now you open Amazon Macie and enable the service. First, you have the option to specify which bucket you want to associate it with, as shown below:

Next, you can specify the data identifiers that Macie should scan for and classify as sensitive data.

These are some of the data identifiers you can set. For the full list, refer to the following link: https://docs.aws.amazon.com/macie/latest/user/mdis-reference-quick.html
You can also add your own custom data identifiers. With that, Macie is set up for the scan.
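The same two choices, which bucket to scan and which identifiers to look for, can also be expressed as a request to the Macie API. The following is a minimal sketch rather than a definitive setup: it only builds the keyword arguments for the boto3 `macie2` client's `create_classification_job` call. The account ID, bucket name, and job name are placeholders I made up, and the two managed data identifier IDs are just examples from the list linked above.

```python
import uuid


def build_classification_job_request(account_id, bucket_names, identifier_ids, job_name):
    """Build keyword arguments for a one-time Macie classification job."""
    return {
        "clientToken": str(uuid.uuid4()),  # idempotency token for the request
        "jobType": "ONE_TIME",  # run once instead of on a schedule
        "name": job_name,
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(bucket_names)}
            ]
        },
        # Only scan for the managed data identifiers listed below.
        "managedDataIdentifierSelector": "INCLUDE",
        "managedDataIdentifierIds": list(identifier_ids),
    }


request = build_classification_job_request(
    account_id="123456789012",  # placeholder account ID
    bucket_names=["example-customer-data"],  # placeholder bucket name
    identifier_ids=["CREDIT_CARD_NUMBER", "USA_SOCIAL_SECURITY_NUMBER"],
    job_name="customer-data-scan",  # placeholder job name
)

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   macie = boto3.client("macie2")
#   macie.enable_macie()  # once per account and Region
#   macie.create_classification_job(**request)
```

Keeping the request as a plain dictionary makes it easy to review or version-control before anything is actually sent to AWS.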
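A custom data identifier is essentially a regular expression, optionally combined with keywords that must appear near a match. Here is a small sketch of what one could look like, assuming a hypothetical internal customer ID format such as `CUST-00012345`. The identifier name, regex, and keywords are all invented for illustration. The local check uses Python's `re` module, which is not the same regex engine Macie uses, so treat it as a quick sanity test only.

```python
import re

# Hypothetical custom data identifier for internal customer IDs.
custom_identifier = {
    "name": "internal-customer-id",
    "description": "Matches hypothetical internal customer IDs",
    "regex": r"CUST-\d{8}",
    "keywords": ["customer id", "cust id"],
    "maximumMatchDistance": 50,  # max characters between a keyword and a match
}


def local_matches(text, identifier):
    """Return the substrings of text that match the identifier's regex."""
    return re.findall(identifier["regex"], text)


sample = "Customer ID: CUST-00012345, order 7781, ref CUST-12"
matches = local_matches(sample, custom_identifier)
print(matches)

# With boto3 installed and AWS credentials configured, the identifier could be
# registered with:
#   import boto3
#   boto3.client("macie2").create_custom_data_identifier(**custom_identifier)
```

Only the full eight-digit ID matches here; the shorter `CUST-12` reference is ignored, which is the kind of behavior worth confirming before you register a pattern.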
Macie then scans the specified S3 buckets, using ML and pattern matching to find sensitive data and potential security issues. It classifies each resulting finding as High, Medium, or Low severity. Here is how the results appear after each scan:
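Through the API, each result is a finding record that carries a severity and the affected bucket. The sketch below summarizes a few hand-written sample findings by severity. The records are made up and trimmed down to the fields used here, on the assumption that they follow the shape returned by the boto3 `macie2` client's `get_findings` call.

```python
from collections import Counter

# Hand-written sample records; real findings contain many more fields.
sample_findings = [
    {
        "type": "SensitiveData:S3Object/Financial",
        "severity": {"description": "High"},
        "resourcesAffected": {"s3Bucket": {"name": "example-customer-data"}},
    },
    {
        "type": "SensitiveData:S3Object/Personal",
        "severity": {"description": "Low"},
        "resourcesAffected": {"s3Bucket": {"name": "example-logs"}},
    },
    {
        "type": "SensitiveData:S3Object/Personal",
        "severity": {"description": "High"},
        "resourcesAffected": {"s3Bucket": {"name": "example-customer-data"}},
    },
]


def count_by_severity(findings):
    """Count findings per severity label (High, Medium, Low)."""
    return Counter(f["severity"]["description"] for f in findings)


def high_severity_buckets(findings):
    """Return the sorted, de-duplicated bucket names that have High findings."""
    return sorted(
        {
            f["resourcesAffected"]["s3Bucket"]["name"]
            for f in findings
            if f["severity"]["description"] == "High"
        }
    )


severity_counts = count_by_severity(sample_findings)
risky_buckets = high_severity_buckets(sample_findings)

# Real findings could be fetched with boto3 along these lines:
#   macie = boto3.client("macie2")
#   ids = macie.list_findings(
#       findingCriteria={"criterion": {"severity.description": {"eq": ["High"]}}}
#   )["findingIds"]
#   findings = macie.get_findings(findingIds=ids)["findings"]
```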

You can configure Macie to publish its findings automatically to AWS Security Hub and Amazon EventBridge so that you can automate remediation of the security risks. The overall process involves data discovery, data classification, automated alerting and monitoring, and ultimately risk management. That’s all; it is that simple to set up and integrate the service. Additionally, you can configure a Lambda function to send notifications about any unusual activity that risks compromising your data.
To summarize, let’s look at this diagram from AWS and walk through the flow:
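To make the notification idea a bit more concrete, here is a rough sketch of an EventBridge event pattern that selects high-severity Macie findings, together with a Lambda handler that turns such an event into a short message. The handler and the sample event are my own illustration and include only the fields the handler reads. Actually delivering the message (for example through SNS) is left as a comment because it depends on your setup.

```python
import json

# Event pattern for an EventBridge rule: only high-severity Macie findings.
MACIE_FINDING_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
    "detail": {"severity": {"description": ["High"]}},
}


def lambda_handler(event, context):
    """Build a short notification message from a Macie finding event."""
    detail = event.get("detail", {})
    severity = detail.get("severity", {}).get("description", "Unknown")
    bucket = (
        detail.get("resourcesAffected", {})
        .get("s3Bucket", {})
        .get("name", "unknown-bucket")
    )
    finding_type = detail.get("type", "unknown-type")
    message = f"[{severity}] Macie finding {finding_type} in bucket {bucket}"
    # In a real function you might publish the message to an SNS topic here;
    # the topic and its ARN would be yours to define.
    print(message)
    return {"severity": severity, "bucket": bucket, "message": message}


# Hand-written sample event with only the fields the handler reads.
sample_event = {
    "source": "aws.macie",
    "detail-type": "Macie Finding",
    "detail": {
        "type": "SensitiveData:S3Object/Financial",
        "severity": {"description": "High"},
        "resourcesAffected": {"s3Bucket": {"name": "example-customer-data"}},
    },
}

result = lambda_handler(sample_event, None)
pattern_json = json.dumps(MACIE_FINDING_PATTERN)  # string form for the rule definition
```

Filtering on severity in the event pattern, rather than inside the function, means the Lambda function is only invoked for the findings you care about.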

As you can see, Macie continually evaluates your S3 buckets, discovers and records sensitive data automatically, and sends the resulting findings to other security tools such as AWS Security Hub so that users can automate remediation. That’s all! You will get more comfortable with the service as you start using it. Remember that I framed the example above around a B2C business only to make it easier to understand. This tool is relevant to any workload or business that you host on AWS. It is also particularly helpful if you need to address compliance requirements such as GDPR, HIPAA, or CCPA. Give it a try, it is really worth it!