action AZ:EKS

fail_az

Simulates the loss of an Availability Zone (AZ) in an AWS Region for EKS clusters with managed node groups.

Activity as code

Below are the details and signature of the activity Python module.

Type: action
Module: azchaosaws.eks.actions
Name: fail_az
Return: mapping

This function simulates the loss of an AZ in an AWS Region for EKS clusters with managed node groups. Every node group in the tagged clusters is affected, and so are the Auto Scaling groups (ASGs) that belong to those managed node groups, so make sure your target clusters are tagged. With the network failure type, it applies a blackhole network ACL that denies all traffic. With the instance failure type, it force-stops normal instances, stops persistent Spot Instances, and cancels Spot requests and terminates one-time Spot Instances.

Usage

JSON

{
  "name": "fail_az",
  "type": "action",
  "provider": {
    "type": "python",
    "module": "azchaosaws.eks.actions",
    "func": "fail_az",
    "arguments": {
      "az": "",
      "dry_run": true
    }
  }
}

YAML

name: fail_az
provider:
  arguments:
    az: ""
    dry_run: true
  func: fail_az
  module: azchaosaws.eks.actions
  type: python
type: action
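When the AZ comes from a variable or a pipeline, it can be handy to generate the activity instead of writing it by hand. The following is a minimal sketch using only the standard library; `build_fail_az_activity` is a hypothetical helper (not part of azchaosaws) that reproduces the JSON shown above, and `us-east-1a` is just an example AZ name.

```python
import json
from typing import Any, Dict


def build_fail_az_activity(az: str, dry_run: bool = True) -> Dict[str, Any]:
    """Return the fail_az activity as a plain dict (hypothetical helper)."""
    return {
        "name": "fail_az",
        "type": "action",
        "provider": {
            "type": "python",
            "module": "azchaosaws.eks.actions",
            "func": "fail_az",
            # Only the two arguments from the usage example are set here.
            "arguments": {"az": az, "dry_run": dry_run},
        },
    }


if __name__ == "__main__":
    # "us-east-1a" is an example value, not a recommendation.
    print(json.dumps(build_fail_az_activity("us-east-1a"), indent=2))
```

The resulting dict can be placed in the method list of an experiment file. Keeping `dry_run` defaulted to `True` in the helper mirrors the usage example.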

Arguments

| Name | Type | Default | Required | Title | Description |
| --- | --- | --- | --- | --- | --- |
| az | string | | Yes | Availability Zone | AZ to target |
| tags | mapping | [{"Key": "AZ_FAILURE", "Value": "True"}] | No | Tags | Match only resources with these tags |
| failure_type | string | network | No | Failure Type | Type of failure to apply: network, instance |
| dry_run | boolean | false | No | Dry Run | Only perform a dry run for it |
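The arguments listed above can be checked before an experiment is run. The sketch below is an illustration under stated assumptions: `check_fail_az_arguments` is a hypothetical helper that is not part of azchaosaws, and treating an empty `az` string as invalid is an assumption drawn from `az` being marked as required.

```python
from typing import Any, Dict, List

# Failure types named in the arguments list above.
ALLOWED_FAILURE_TYPES = ("network", "instance")


def check_fail_az_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return the problems found; an empty list means nothing was flagged."""
    problems: List[str] = []

    # az is the only required argument (assumed here to be non-empty).
    az = arguments.get("az")
    if not isinstance(az, str) or not az:
        problems.append("az is required and must be a non-empty string")

    # failure_type defaults to "network" when it is omitted.
    failure_type = arguments.get("failure_type", "network")
    if failure_type not in ALLOWED_FAILURE_TYPES:
        problems.append("failure_type must be 'network' or 'instance'")

    # dry_run defaults to false when it is omitted.
    if not isinstance(arguments.get("dry_run", False), bool):
        problems.append("dry_run must be a boolean")

    return problems
```

For example, `check_fail_az_arguments({"az": "us-east-1a", "failure_type": "instance"})` returns an empty list, while an unknown failure type produces one message.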

Return structure

{
  "AvailabilityZone": str,
  "DryRun": bool,
  "Clusters": [
    {
      "ClusterName": str,
      "NodeGroups": [
        {
          "NodeGroupName": str,
          "AutoScalingGroups": [
            {
              "AutoScalingGroupName": str,
              "Before": {
                "SubnetIds": List[str],
                "AZRebalance": bool,
                "MinSize": int,
                "MaxSize": int,
                "DesiredCapacity": int
              },
              "After": {
                "SubnetIds": List[str],
                "AZRebalance": bool,
                "MinSize": int,
                "MaxSize": int,
                "DesiredCapacity": int
              }
            }
          ],
          "Subnets": [
            {
              "SubnetId": str,
              "VpcId": str,
              "Before": {
                "NetworkAclId": str,
                "NetworkAclAssociationId": str
              },
              "After": {
                "NetworkAclId": str,
                "NetworkAclAssociationId": str
              }
            },
            ...
          ],
          "Instances": [
            {
              "InstanceId": str,
              "Before": {
                "State": 'pending'|'running'
              },
              "After": {
                "State": 'stopping'|'stopped'
              }
            },
            ...
          ]
        },
        ...
      ]
    }
  ]
}
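The returned mapping is deeply nested, so a small reader can make it easier to review what changed. The following sketch walks the structure shown above and prints one line per Auto Scaling group, subnet, and instance. `summarize_fail_az_result` is a hypothetical helper, not part of azchaosaws, and it assumes every key appears exactly as in the structure above.

```python
from typing import Any, Dict, List


def summarize_fail_az_result(result: Dict[str, Any]) -> List[str]:
    """Flatten a fail_az return mapping into readable lines (hypothetical helper)."""
    lines = [f"AZ {result['AvailabilityZone']} (dry run: {result['DryRun']})"]
    for cluster in result.get("Clusters", []):
        for group in cluster.get("NodeGroups", []):
            prefix = f"{cluster['ClusterName']}/{group['NodeGroupName']}"
            for asg in group.get("AutoScalingGroups", []):
                # Subnets present before the action but absent afterwards.
                removed = sorted(
                    set(asg["Before"]["SubnetIds"]) - set(asg["After"]["SubnetIds"])
                )
                lines.append(
                    f"{prefix}: ASG {asg['AutoScalingGroupName']} subnets removed: {removed}"
                )
            for subnet in group.get("Subnets", []):
                lines.append(
                    f"{prefix}: subnet {subnet['SubnetId']} NACL "
                    f"{subnet['Before']['NetworkAclId']} -> {subnet['After']['NetworkAclId']}"
                )
            for instance in group.get("Instances", []):
                lines.append(
                    f"{prefix}: instance {instance['InstanceId']} "
                    f"{instance['Before']['State']} -> {instance['After']['State']}"
                )
    return lines
```

Comparing the `Before` and `After` entries this way gives a quick record of the state that existed before the failure was injected.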

Signature

def fail_az(
    az: str = None,
    dry_run: bool = None,
    failure_type: str = "network",
    tags: List[Dict[str, str]] = [{"AZ_FAILURE": "True"}],
    state_path: str = "fail_az.{}.json".format(__package__.split(".", 1)[1]),
    configuration: Configuration = None,
) -> Dict[str, Any]:
    pass