1. Introduction to ChaosBlade

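I. What Is Chaos Engineering

1.1 Definition

Embrace failure and become antifragile: move from being driven by failures passively to driving them proactively.
Chaos engineering is a relatively new technical discipline. Put simply, it is a way to test how resilient an architecture is, mainly by injecting faults so that problems surface before they strike on their own.
Fault injection is only one commonly used technique within chaos engineering, and the two are not the same thing. Fault injection usually breaks things in a predictable way; if you never explore the stranger scenarios that could occur, unpredictable failures can still happen.

1.2 Kinds of Faults

Chaos engineering faults span the IaaS, PaaS, and SaaS layers. The Fenghuolun (风火轮) team will organize its later study notes and posts around these three layers.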

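1.3 Why Chaos Engineering Is Needed in Cloud-Native Environments

More and more companies build their system architecture on cloud-native technology, and turning an ordinary service into a truly cloud-native application takes continuous evolution.
The work goes beyond splitting services into microservices and packaging them with Docker. It also involves reworking middleware. (Dubbo 2.x overlaps with what the k8s infrastructure provides, and capabilities such as service discovery and load balancing are best pushed down to the infrastructure.)
At the same time, your containers, k8s, and nodes all differ from the traditional setup in a number of ways. (For example, a node needs a noticeably higher handle limit than a traditional node.)
You will hit many pitfalls along the way, and they are usually fixed one at a time as they appear. If you want to surface pitfalls faster, converge on problems sooner, and quickly improve the stability of the services in your cluster, the tools that chaos engineering provides can help.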

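II. ChaosBlade

2.1 Background

ChaosBlade is a chaos engineering tool that follows the principles of chaos engineering experiments. It builds on nearly ten years of fault testing and drill practice at Alibaba, incorporates the best ideas and practices from the group's business units, and provides a rich set of fault scenarios to help distributed systems improve their fault tolerance and recoverability.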

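2.2 Rich Scenarios

[Service layer]
Supports application services in multiple languages; you can inject a delay, throw an exception, and so on at a specific method.
[Kubernetes]
Killing pods, pod I/O errors, container CPU load, and so on.
[Infrastructure]
CPU, memory, network, disk, process, kernel, files, and so on.

2.3 Ease of Use

[Non-intrusive]
Injecting a fault does not require any change to your existing code.
[Unified experiment model]
Whether you are testing a method of a microservice or a metric of the infrastructure, the command syntax is much the same.
[Easy to extend]
New chaos experiment scenarios can be added quickly on top of the unified model.

2.4 Demo

The project provides an official Dubbo service demo that you can try out quickly in a container.

Pull the demo image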

docker pull registry.cn-hangzhou.aliyuncs.com/chaosblade/chaosblade-demo:latest

Start a container from the image

docker run -it registry.cn-hangzhou.aliyuncs.com/chaosblade/chaosblade-demo:latest

Using CATALINA_BASE:   /usr/local/tomcat
Using CATALINA_HOME:   /usr/local/tomcat
Using CATALINA_TMPDIR: /usr/local/tomcat/temp
Using JRE_HOME:        /usr/lib/jvm/java-1.8-openjdk/jre
Using CLASSPATH:       /usr/local/tomcat/bin/bootstrap.jar:/usr/local/tomcat/bin/tomcat-juli.jar
Tomcat started.

use[ curl http://localhost:8080/dubbo/hello?name=dubbo ]command to request demo

You can use blade command to execute a chaos experiment.

Please read README.txt first!

Once inside the container, read README.txt and follow it to run chaos experiments. Enjoy it.

# Command example

# The application is a simple Dubbo demo, so you can run chaos experiments against a Java application. Use the
# command below to call the service and check the effect of an experiment.
curl 'http://localhost:8080/dubbo/hello?name=dubbo'

# Prepare the Java application for experiments
blade prepare jvm --process business
# or
blade p jvm --process business

# Create an experiment that adds a 3 s delay when com.example.service.DemoService#sayHello is invoked
blade create dubbo delay --time 3000 --service com.example.service.DemoService --methodname sayHello --consumer
# or
blade c dubbo delay --time 3000 --service com.example.service.DemoService --methodname sayHello --consumer

# Run the curl command above again to check the service status.
# Destroy the experiment; <UID> is the result returned by the create command.
blade destroy <UID>
# or
blade d <UID>

# Run the curl command again to check the service status.
# Use the status command to query experiment status
blade status --type create

blade status <UID>
# or
blade s <UID>

# Create an experiment that throws an exception when the hello controller is requested (the request mapping
# method is also named hello)
blade create jvm throwCustomException --exception java.lang.Exception \
    --classname com.example.controller.DubboController --methodname hello

# Destroy the experiment
blade destroy <UID>

# Burn CPU: run the following command and use top to check CPU stats. Run the destroy command to stop the
# experiment.
blade create cpu fullload

# You can also add the --timeout flag to set the experiment duration, in seconds
blade create cpu fullload --timeout 30

# Use the help command to discover other experiments.
blade help

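The README covers three experiments:
Delaying a method of a Dubbo service so that calls time out
Making a service method throw an exception
Driving up CPU load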
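
The experiments in the README share one lifecycle: create, observe, destroy. The sketch below wraps that lifecycle in a small shell function so that the UID does not have to be copied by hand. It is not part of the README. The names `extract_uid`, `run_experiment`, and `DEMO_URL` are made up for illustration, and the script assumes that `blade create` prints one JSON line whose `result` field holds the UID, for example `{"code":200,"success":true,"result":"<UID>"}`. Check the output of your blade version before relying on it.

```shell
#!/bin/sh
# Sketch only: wraps the create -> observe -> destroy cycle from the README.
# Assumption: `blade create` prints one JSON line such as
#   {"code":200,"success":true,"result":"<UID>"}

DEMO_URL='http://localhost:8080/dubbo/hello?name=dubbo'

# Print the value of the "result" field (the experiment UID), or nothing
# if the field is absent.
extract_uid() {
    printf '%s\n' "$1" | sed -n 's/.*"result":"\([^"]*\)".*/\1/p'
}

# Usage: run_experiment <target> <action> [flags...]
# Creates the experiment, calls the demo service once, then destroys it.
run_experiment() {
    output=$(blade create "$@")
    uid=$(extract_uid "$output")
    if [ -z "$uid" ]; then
        echo "no UID in blade output: $output" >&2
        return 1
    fi
    echo "experiment $uid created"
    curl -s "$DEMO_URL"
    echo
    blade destroy "$uid"
}
```

After `blade prepare jvm --process business`, calling `run_experiment dubbo delay --time 3000 --service com.example.service.DemoService --methodname sayHello --consumer` would run the delay experiment from the README. The same function accepts `cpu fullload`, because every experiment follows the `blade create <target> <action>` pattern, although CPU load is better observed with top than with curl.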

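III. Practical Guidance for Chaos Engineering

Having a tool is not enough; running a good chaos experiment also requires sound, principled practice.

Build a hypothesis around steady-state behavior
The metrics must be objective and comprehensive. This is like writing a unit test: the assertions in a test case exist to check your method, not to prove that it is correct (you do not know whether it is correct when you write the test; you find out only when you run it). Knowing that a method returns name="xiaomin" is no reason to verify only that value; you should check all of the outputs and metrics.
Vary real-world events
Any event that can disrupt the steady state is a potential variable in a chaos experiment, and a fault injection tool is not the only way to trigger one. (Pulling the power plug counts too.)
Run experiments in production
As with performance testing, to keep the system realistic and rule out confounding variables, make the environment as close to production as possible, or run the verification directly in production.
Automate experiments to run continuously
Running experiments by hand is labor-intensive and ultimately unsustainable, so experiments should be automated and run continuously. Chaos engineering needs automated orchestration and analysis built into the system. [Potential requirement for Fenghuolun]
Minimize the blast radius
Experimenting in production can cause avoidable customer complaints. Some short-term negative impact has to be tolerated, but it is the chaos engineer's responsibility and obligation to make sure that the fallout is minimized and accounted for.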
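
The first principle asks for a steady-state hypothesis that checks more than one value. A minimal sketch of such a check for the demo service follows. The function names, the 1-second limit used in the usage note, and the choice of HTTP status plus total response time as the steady-state metrics are assumptions for illustration; a real system would check its own business and system metrics.

```shell
#!/bin/sh
# Sketch only: a two-metric steady-state check for an HTTP endpoint.
# The metrics and the threshold are illustrative assumptions.

# Succeeds when the decimal number $1 is less than or equal to $2.
within_limit() {
    awk -v value="$1" -v limit="$2" 'BEGIN { exit !(value + 0 <= limit + 0) }'
}

# Usage: is_steady <http_code> <seconds> <max_seconds>
# Succeeds only when both parts of the hypothesis hold:
# HTTP 200 and a response no slower than the limit.
is_steady() {
    [ "$1" = "200" ] && within_limit "$2" "$3"
}

# Usage: check_steady_state <url> <max_seconds>
# Measures one request with curl and evaluates the hypothesis.
check_steady_state() {
    sample=$(curl -s -o /dev/null -w '%{http_code} %{time_total}' "$1")
    code=${sample%% *}
    seconds=${sample#* }
    is_steady "$code" "$seconds" "$2"
}
```

Running `check_steady_state 'http://localhost:8080/dubbo/hello?name=dubbo' 1` before creating an experiment confirms the baseline. Running it again while the 3-second Dubbo delay from the README is active should fail, which is the signal the experiment is meant to produce. The README shows a `--timeout` flag for `cpu fullload`; giving an experiment a duration in this way keeps the fault from outliving the check, which helps limit the blast radius.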

Category: Chaos Engineering
Published: 2021-10-25 16:34:45
Link: http://kubeclub.cn/hundonggongcheng/157.html
