AWS AZ
recover_az
Rolls back the subnet(s), EKS instance(s), and ASG(s) that were affected by the fail_az action
Simulates the loss of an AZ in an AWS Region for EKS clusters with managed nodegroups
Below are the details and signature of the activity Python module.
Type | action |
Module | azchaosaws.eks.actions |
Name | fail_az |
Return | mapping |
This function simulates the loss of an AZ in an AWS Region for EKS clusters with managed nodegroups. All nodegroups within the tagged clusters are affected, including the ASG(s) that belong to those managed nodegroups. For the network failure type, it applies a blackhole network ACL that denies all traffic. For the instance failure type, it force-stops normal instances, stops persistent spot instances, and cancels the spot requests for and terminates one-time spot instances. Ensure your target clusters carry the tags given in the tags argument.
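The instance failure rules can be read as a small decision table. The sketch below is illustrative only: `plan_instance_action` and its string labels are hypothetical names, not part of azchaosaws, and it assumes that "normal" means non-spot, that each instance is described by its lifecycle (`None` for normal, `"spot"` for spot), and that spot instances carry a request type of `"persistent"` or `"one-time"`.

```python
from typing import Optional


def plan_instance_action(lifecycle: Optional[str], spot_type: Optional[str] = None) -> str:
    """Return the action the description implies for a single instance.

    Hypothetical helper for illustration; not the library's implementation.
    """
    if lifecycle != "spot":
        # Normal (assumed: non-spot) instances are stopped with force.
        return "force-stop"
    if spot_type == "persistent":
        # Persistent spot instances are stopped.
        return "stop"
    if spot_type == "one-time":
        # One-time spot instances: cancel the spot request, then terminate.
        return "cancel-request-and-terminate"
    raise ValueError(f"unknown spot request type: {spot_type!r}")


for args in [(None,), ("spot", "persistent"), ("spot", "one-time")]:
    print(args, "->", plan_instance_action(*args))
```

Keeping the rules in one pure function makes it easy to check how a mixed fleet would be treated before running the action for real.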
Usage
JSON
{
  "name": "fail_az",
  "type": "action",
  "provider": {
    "type": "python",
    "module": "azchaosaws.eks.actions",
    "func": "fail_az",
    "arguments": {
      "az": "",
      "dry_run": true
    }
  }
}
YAML
name: fail_az
provider:
  arguments:
    az: ""
    dry_run: true
  func: fail_az
  module: azchaosaws.eks.actions
  type: python
type: action
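As a rough illustration of where this activity sits in an experiment, the sketch below builds an experiment document in Python and prints it as JSON. The title, description, and the AZ value `ap-southeast-1a` are placeholders. The rollback entry assumes that `recover_az` lives in the same `azchaosaws.eks.actions` module and needs no arguments, which this page does not confirm.

```python
import json

# The fail_az activity as shown in the usage above, with a placeholder AZ.
fail_az_action = {
    "name": "fail_az",
    "type": "action",
    "provider": {
        "type": "python",
        "module": "azchaosaws.eks.actions",
        "func": "fail_az",
        "arguments": {"az": "ap-southeast-1a", "dry_run": True},
    },
}

# Assumption: recover_az is exposed by the same module and takes no arguments.
recover_az_action = {
    "name": "recover_az",
    "type": "action",
    "provider": {
        "type": "python",
        "module": "azchaosaws.eks.actions",
        "func": "recover_az",
    },
}

experiment = {
    "title": "EKS workloads survive the loss of one AZ",  # placeholder
    "description": "Dry run of an AZ failure against tagged EKS clusters.",
    "method": [fail_az_action],
    "rollbacks": [recover_az_action],
}

print(json.dumps(experiment, indent=2))
```

Starting with `dry_run` set to true lets you inspect what would be targeted before anything is changed.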
Arguments
Name | Type | Default | Required | Title | Description |
---|---|---|---|---|---|
az | string | | Yes | Availability Zone | AZ to target |
tags | mapping | [{"AZ_FAILURE": "True"}] | No | Tags | Match only resources with these tags |
failure_type | string | network | No | Failure Type | Type of failure to apply: network, instance |
dry_run | boolean | false | No | Dry Run | Perform a dry run only |
Required: az
Optional: tags (defaults to [{"AZ_FAILURE": "True"}]), failure_type, dry_run
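A quick pre-flight check of the arguments can catch mistakes before the experiment runs. The sketch below is a hypothetical helper (`check_fail_az_arguments` is not part of azchaosaws) that applies the defaults and allowed values listed in the arguments table. It treats an empty `az` as an error, on the assumption that the empty string in the usage example is only a placeholder.

```python
from typing import Any, Dict

# Allowed values taken from the failure_type row of the arguments table.
ALLOWED_FAILURE_TYPES = {"network", "instance"}


def check_fail_az_arguments(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Validate fail_az arguments and fill in the documented defaults.

    Hypothetical helper for illustration only.
    """
    az = arguments.get("az")
    if not isinstance(az, str) or not az:
        raise ValueError("'az' is required and must be a non-empty string")

    failure_type = arguments.get("failure_type", "network")
    if failure_type not in ALLOWED_FAILURE_TYPES:
        raise ValueError(f"'failure_type' must be one of {sorted(ALLOWED_FAILURE_TYPES)}")

    dry_run = arguments.get("dry_run", False)
    if not isinstance(dry_run, bool):
        raise ValueError("'dry_run' must be a boolean")

    # Return a copy so the caller's mapping is left untouched.
    return {**arguments, "failure_type": failure_type, "dry_run": dry_run}


checked = check_fail_az_arguments({"az": "ap-southeast-1a", "dry_run": True})
print(checked)
```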
Return structure
{
  "AvailabilityZone": str,
  "DryRun": bool,
  "Clusters": [
    {
      "ClusterName": str,
      "NodeGroups": [
        {
          "NodeGroupName": str,
          "AutoScalingGroups": [
            {
              "AutoScalingGroupName": str,
              "Before": {
                "SubnetIds": List[str],
                "AZRebalance": bool,
                "MinSize": int,
                "MaxSize": int,
                "DesiredCapacity": int
              },
              "After": {
                "SubnetIds": List[str],
                "AZRebalance": bool,
                "MinSize": int,
                "MaxSize": int,
                "DesiredCapacity": int
              }
            }
          ],
          "Subnets": [
            {
              "SubnetId": str,
              "VpcId": str,
              "Before": {
                "NetworkAclId": str,
                "NetworkAclAssociationId": str
              },
              "After": {
                "NetworkAclId": str,
                "NetworkAclAssociationId": str
              }
            },
            ...
          ],
          "Instances": [
            {
              "InstanceId": str,
              "Before": {
                "State": 'pending'|'running'
              },
              "After": {
                "State": 'stopping'|'stopped'
              }
            },
            ...
          ]
        },
        ...
      ]
    }
  ]
}
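To show how the returned mapping might be consumed, the sketch below walks the nested structure and collects the names and IDs of affected resources. `summarize_fail_az` and the sample data are hypothetical. Only the key names come from the return structure above, and missing lists are treated as empty, which is an assumption about how partial results look.

```python
from typing import Any, Dict, List


def summarize_fail_az(result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Collect ASG names, subnet IDs and instance IDs from a fail_az result."""
    summary: Dict[str, List[str]] = {"asgs": [], "subnets": [], "instances": []}
    for cluster in result.get("Clusters", []):
        for nodegroup in cluster.get("NodeGroups", []):
            for asg in nodegroup.get("AutoScalingGroups", []):
                summary["asgs"].append(asg["AutoScalingGroupName"])
            for subnet in nodegroup.get("Subnets", []):
                summary["subnets"].append(subnet["SubnetId"])
            for instance in nodegroup.get("Instances", []):
                summary["instances"].append(instance["InstanceId"])
    return summary


# Made-up sample shaped like the return structure (Before/After omitted).
sample = {
    "AvailabilityZone": "ap-southeast-1a",
    "DryRun": True,
    "Clusters": [
        {
            "ClusterName": "demo-cluster",
            "NodeGroups": [
                {
                    "NodeGroupName": "demo-nodegroup",
                    "AutoScalingGroups": [{"AutoScalingGroupName": "demo-asg"}],
                    "Subnets": [{"SubnetId": "subnet-0001", "VpcId": "vpc-0001"}],
                    "Instances": [{"InstanceId": "i-0001"}, {"InstanceId": "i-0002"}],
                }
            ],
        }
    ],
}

summary = summarize_fail_az(sample)
print(summary)
```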
Signature
def fail_az(
    az: str = None,
    dry_run: bool = None,
    failure_type: str = "network",
    tags: List[Dict[str, str]] = [{"AZ_FAILURE": "True"}],
    state_path: str = "fail_az.{}.json".format(__package__.split(".", 1)[1]),
    configuration: Configuration = None,
) -> Dict[str, Any]:
    pass
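The default for `state_path` is computed from the module's package name. The snippet below evaluates the same expression outside the module, assuming `__package__` is `azchaosaws.eks` (the package that contains `azchaosaws.eks.actions`). The variable names are illustrative.

```python
# Assumed value of __package__ inside azchaosaws/eks/actions.py.
package = "azchaosaws.eks"

# Same expression as in the signature: keep everything after the first dot.
default_state_path = "fail_az.{}.json".format(package.split(".", 1)[1])

print(default_state_path)  # fail_az.eks.json
```

Under that assumption, the state is written to `fail_az.eks.json` unless a different `state_path` is passed.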