AWS has improved its big data processing service, Elastic MapReduce (EMR), by adding support for launching clusters in Amazon VPC private subnets. With EMR release 4.2.0 and higher, users can create secure, pre-configured clusters running Hadoop ecosystem applications, Spark, and Presto in their own private subnets.
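As a rough illustration of launching a cluster into a private subnet, the sketch below uses the boto3 `run_job_flow` call and passes the subnet through `Ec2SubnetId`. The cluster name, instance type, instance count, region, and subnet ID are placeholders chosen for this example, and the default EMR roles are assumed to already exist in the account. It is a minimal sketch under those assumptions, not a production setup.

```python
def build_cluster_request(subnet_id, release_label="emr-4.2.0"):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All concrete values (name, instance type, count) are illustrative
    placeholders, not recommendations.
    """
    return {
        "Name": "private-subnet-cluster",  # hypothetical cluster name
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "MasterInstanceType": "m3.xlarge",  # placeholder instance type
            "SlaveInstanceType": "m3.xlarge",   # placeholder instance type
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
            # The private subnet the cluster nodes are launched into.
            "Ec2SubnetId": subnet_id,
        },
        # Assumes the default EMR roles have been created in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(subnet_id, region="us-east-1"):
    """Submit the request to EMR and return the new cluster ID."""
    # Imported here so the request builder can be used without AWS access.
    import boto3

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**build_cluster_request(subnet_id))
    return response["JobFlowId"]
```

Calling `launch_cluster("subnet-0123456789abcdef0")` with a real private subnet ID would submit the request. Keeping the request builder separate makes the parameters easy to inspect before anything is launched.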
EMR also provides increased security for user data through several means.
- Encryption at Rest: for Amazon S3 through the EMR File System, HDFS, and the local filesystem on each node.
- Encryption in Transit: for Hadoop MapReduce Shuffle, HDFS Rebalancing, and Spark Shuffle.
- IAM users and roles.
- Auditing of API calls using AWS CloudTrail.
- EC2 Security Groups and optional SSH access.
- Hadoop and Spark authentication and authorization.
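To make the encryption items above more concrete, here is a hedged sketch of an EMR security configuration that enables both at-rest and in-transit encryption, registered through the boto3 `create_security_configuration` call. Security configurations are a feature of EMR releases later than 4.2.0, so treat this as an assumption about a newer setup. The KMS key ARN, the certificate bundle location, the configuration name, and the region are hypothetical values.

```python
import json


def build_security_configuration(kms_key_arn, tls_certs_s3_uri):
    """Build an EMR security configuration document as a Python dict.

    kms_key_arn and tls_certs_s3_uri are supplied by the caller; the
    values used in any example call are hypothetical.
    """
    return {
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "EnableInTransitEncryption": True,
            "AtRestEncryptionConfiguration": {
                # Server-side encryption for data written to S3 via EMRFS.
                "S3EncryptionConfiguration": {"EncryptionMode": "SSE-S3"},
                # Encryption of local disks on each node with a KMS key.
                "LocalDiskEncryptionConfiguration": {
                    "EncryptionKeyProviderType": "AwsKms",
                    "AwsKmsKey": kms_key_arn,
                },
            },
            "InTransitEncryptionConfiguration": {
                # PEM certificates, zipped and stored in S3.
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": tls_certs_s3_uri,
                }
            },
        }
    }


def register_security_configuration(name, config, region="us-east-1"):
    """Register the configuration with EMR under the given name."""
    # Imported here so the builder can be used without AWS access.
    import boto3

    emr = boto3.client("emr", region_name=region)
    return emr.create_security_configuration(
        Name=name,
        SecurityConfiguration=json.dumps(config),
    )
```

The registered name can then be supplied when a cluster is created, so the same encryption settings are reused across clusters.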
It is worth noting that in 2015, EMR was included in the AWS Business Associate Agreement (BAA) for workloads that process protected health information (PHI). EMR has also been certified under several standards, including PCI DSS Level 1, ISO 9001, ISO 27001, and ISO 27018.
For more information on EMR, see the EMR documentation. To learn how AWS EMR can help you process big data, please contact our expert cloud consultants here at PolarSeven.
Video: https://www.youtube.com/watch?v=4HseALaLllc