Activity as code
Below are the details and signature of the activity Python module.
Type | action |
Module | azchaosaws.elasticache.actions |
Name | fail_az |
Return | mapping |
This function forces a failover for ElastiCache. In cluster mode, it forces a failover for every primary node specified (up to 5 in any 24-hour period). In non-cluster mode, it forces a failover only if the primary node is in the target AZ.
Note: You must provide the replication group IDs (replication_group_id) for clusters where cluster mode is enabled. Otherwise, they will not be affected. If multiple shards in the same Redis cluster (cluster mode enabled) need to fail over because their primary nodes are in the same AZ, the first node replacement must complete before a subsequent test_failover call can be made. The function therefore uses describe_events to wait for the first primary node's failover to complete before moving on.
The cache_cluster_ids provided should all be primary nodes. Otherwise, the program cannot tell when the node replacement has completed, because the primary cache cluster ID is the key in the event generated by ElastiCache. Incorrect cache_cluster_ids will cause the program to time out after the failover for that particular node group.
Usage
JSON
{
  "name": "fail_az",
  "type": "action",
  "provider": {
    "type": "python",
    "module": "azchaosaws.elasticache.actions",
    "func": "fail_az",
    "arguments": {
      "az": "",
      "dry_run": true
    }
  }
}
YAML
name: fail_az
provider:
  arguments:
    az: ""
    dry_run: true
  func: fail_az
  module: azchaosaws.elasticache.actions
  type: python
type: action
Arguments
Name | Type | Default | Required | Title | Description |
---|---|---|---|---|---|
az | string | | Yes | Availability Zone | AZ to target |
tags | List[Dict[str, str]] | [{"Key": "AZ_FAILURE", "Value": "True"}] | No | Tags | Match only resources with these tags |
replication_groups | List[Dict[str, Any]] | null | No | Replication Groups | List of replication groups to be tested |
dry_run | bool | false | No | Dry Run | Run read-only operations only, without changing resources |
Required:
- az (str): An availability zone
- dry_run (bool): Whether to simulate a dry run. Setting it to True runs only read-only operations and makes no changes to resources. (Accepted values: True | False)
Optional:
- replication_groups (List[Dict[str, Any]]): List of replication groups to be tested.
  - replication_group_id (str): The ID of the replication group
  - cache_cluster_ids (List[str]): List of cache cluster IDs in the replication group to initiate failover
- tags (List[Dict[str, str]]): A list of key-value pairs to filter the ElastiCache cluster(s) by. (Default: [{"Key": "AZ_FAILURE", "Value": "True"}])
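To make the nested shape of replication_groups concrete, here is a minimal sketch that builds the activity as a plain Python dict and prints it as JSON. The helper `build_fail_az_activity` is hypothetical and not part of the module, and the AZ name, replication group ID, and cache cluster IDs are made-up placeholders. The check that every entry carries both keys is an assumption of this sketch, not a documented requirement.

```python
import json


def build_fail_az_activity(az, replication_groups=None, tags=None, dry_run=True):
    """Return a fail_az action as a plain dict, ready to dump to JSON."""
    arguments = {"az": az, "dry_run": dry_run}
    if tags is not None:
        arguments["tags"] = tags
    if replication_groups is not None:
        for group in replication_groups:
            # Assumption: each entry carries both keys from the argument list.
            missing = {"replication_group_id", "cache_cluster_ids"} - set(group)
            if missing:
                raise ValueError(
                    f"replication group entry is missing {sorted(missing)}"
                )
        arguments["replication_groups"] = replication_groups
    return {
        "name": "fail_az",
        "type": "action",
        "provider": {
            "type": "python",
            "module": "azchaosaws.elasticache.actions",
            "func": "fail_az",
            "arguments": arguments,
        },
    }


activity = build_fail_az_activity(
    az="ap-southeast-1a",  # placeholder AZ
    replication_groups=[
        {
            "replication_group_id": "my-redis",  # placeholder ID
            # Placeholder IDs; these should be the primary node of each shard.
            "cache_cluster_ids": ["my-redis-0001-001", "my-redis-0002-001"],
        }
    ],
)
print(json.dumps(activity, indent=2))
```

Leaving `tags` unset keeps it out of the arguments, so the default filter shown in the argument list applies.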
Return structure
{
    "AvailabilityZone": str,
    "DryRun": bool,
    "Shards": {
        "Success": [
            {
                "CacheClusterId": str,
                "ReplicationGroupId": str,
                "NodeGroupId": str,
                "ClusterEnabled": bool
            },
            ...
        ],
        "Failed": [
            {
                "CacheClusterId": str,
                "ReplicationGroupId": str,
                "NodeGroupId": str,
                "ClusterEnabled": bool
            },
            ...
        ]
    }
}
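As an illustration of consuming this mapping, the sketch below reduces a result to a short summary. The helper `summarize_fail_az` is hypothetical, and the sample values are invented to match the documented shape rather than taken from a real run.

```python
def summarize_fail_az(result):
    """Reduce a fail_az result mapping to counts and failed replication groups."""
    shards = result.get("Shards", {})
    succeeded = shards.get("Success", [])
    failed = shards.get("Failed", [])
    return {
        "az": result.get("AvailabilityZone"),
        "dry_run": result.get("DryRun"),
        "succeeded": len(succeeded),
        "failed": len(failed),
        "failed_replication_groups": sorted(
            {shard["ReplicationGroupId"] for shard in failed}
        ),
    }


# Invented sample that follows the return structure above.
sample = {
    "AvailabilityZone": "ap-southeast-1a",
    "DryRun": False,
    "Shards": {
        "Success": [
            {
                "CacheClusterId": "my-redis-0001-001",
                "ReplicationGroupId": "my-redis",
                "NodeGroupId": "0001",
                "ClusterEnabled": True,
            }
        ],
        "Failed": [
            {
                "CacheClusterId": "other-redis-001",
                "ReplicationGroupId": "other-redis",
                "NodeGroupId": "0001",
                "ClusterEnabled": False,
            }
        ],
    },
}

summary = summarize_fail_az(sample)
print(summary)
```

A summary like this could feed a check that the experiment left no shard in the "Failed" list.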
Signature
def fail_az(
    az: str = None,
    dry_run: bool = None,
    replication_groups: List[Dict[str, Any]] = None,
    tags: List[Dict[str, str]] = [{"Key": "AZ_FAILURE", "Value": "True"}],
    configuration: Configuration = None,
) -> Dict[str, Any]:
    pass