AWS Elastic MapReduce (EMR) simplifies big data processing by providing a managed Hadoop framework that makes it easy, fast, and cost-effective to distribute and process vast amounts of data across dynamically scalable Amazon EC2 instances.

Many of our Turbot customers use EMR for High Performance Computing use cases that range from genomic sequencing to advanced analytics of sales and marketing data. EMR makes it easy to run distributed frameworks, execute jobs across the different Hadoop ecosystem tools, and integrate with other AWS services such as S3 and DynamoDB.

Turbot EMR Guardrails

AWS EMR and Turbot

In addition to the great features that AWS EMR offers out of the box, Turbot provides many AWS EMR Guardrails that automate, manage, and secure our customers' EMR workloads and help them accelerate their adoption of AWS EMR:

  • Ability to enable/disable AWS EMR in one or many AWS Accounts through Turbot’s IAM guardrails
  • Various networking controls to run AWS EMR safely:
    • Run AWS EMR securely within internal subnets
    • Isolate AWS EMR in a disconnected private VPC
    • Programmatically manage AWS EMR security groups with least privilege
    • Ability to enable/disable connectivity to AWS Services
  • Detect and repair unsupported resources running across AWS EMR
  • Automatically apply CloudWatch alarms for:
    • AWS EMR EC2 Instances Status Failed
    • AWS EMR EC2 Instances CPU Utilization thresholds
    • AWS EMR Cluster Idle time thresholds
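To give a sense of what the cluster idle alarm involves if you set it up by hand, here is a minimal Python sketch. It is not Turbot's implementation. It assumes the standard `IsIdle` metric that EMR publishes to CloudWatch in the `AWS/ElasticMapReduce` namespace at five-minute intervals, keyed by the `JobFlowId` dimension. The function names, the alarm naming scheme, and the 60-minute default are illustrative assumptions.

```python
import math

EMR_NAMESPACE = "AWS/ElasticMapReduce"
EMR_PERIOD_SECONDS = 300  # EMR reports cluster metrics every five minutes


def idle_alarm_params(cluster_id, idle_minutes=60, sns_topic_arn=None):
    """Build keyword arguments for CloudWatch's put_metric_alarm call.

    The alarm fires once the cluster has reported IsIdle = 1 for roughly
    `idle_minutes` in a row. All names here are hypothetical.
    """
    if idle_minutes <= 0:
        raise ValueError("idle_minutes must be positive")

    # Round the idle window up to a whole number of five-minute periods.
    periods = math.ceil(idle_minutes * 60 / EMR_PERIOD_SECONDS)

    params = {
        "AlarmName": f"emr-{cluster_id}-idle",  # assumed naming scheme
        "Namespace": EMR_NAMESPACE,
        "MetricName": "IsIdle",
        "Dimensions": [{"Name": "JobFlowId", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": EMR_PERIOD_SECONDS,
        "EvaluationPeriods": periods,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
    }
    if sns_topic_arn:
        # Optional notification target, for example an SNS topic.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_idle_alarm(cluster_id, **kwargs):
    """Create the alarm in the current AWS account (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without it

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**idle_alarm_params(cluster_id, **kwargs))
```

Calling `create_idle_alarm("j-EXAMPLE", idle_minutes=120)` would register an alarm for a cluster with that hypothetical ID. Keeping the parameter builder separate from the AWS call makes the thresholds easy to review and test before anything is created in an account.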