When building a logging system centered on the ELK stack, Logstash is the core log-collection component, responsible for collecting, cleaning, and filtering the log data from every service. Its drawbacks, however, are just as obvious (see the comparison below). When designing an architecture we prefer something that uses fewer host resources, stays lightweight and fast, and still meets our log-collection needs. Is there such an open-source service? The answer is go-stash.
Feature / Service | logstash | go-stash |
---|---|---|
Definition | Part of the Elastic Stack, used for log collection and processing. | An efficient tool for processing and shipping data, implemented in Go. |
Performance | Very flexible, but performance can drop when handling large data volumes. | High throughput, roughly 5x the performance of logstash. |
Resource usage | Consumes a fair amount of server resources. | Saves about 2/3 of the server resources. |
Ease of use | Complex configuration with a steep learning curve. | Easy to use, with simpler and more intuitive configuration. |
Filter system | Powerful filter system supporting many kinds of data processing. | Ships with a rich set of built-in filters and allows custom processing logic. |
Architecture Diagram
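In brief, the data path (sketched here in plain text, matching the flow described in the rest of this post) is:

```
docker container logs -> Filebeat -> Kafka -> go-stash -> Elasticsearch -> Kibana
```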
Function of Each Service

- Filebeat is a lightweight shipper for forwarding and centralizing log data. It watches the log files or locations you specify, collects log events, and forwards them to Kafka.
- Kafka is a distributed message queue based on the publish/subscribe model, mainly used for real-time processing of big data.
- go-stash is an efficient tool that fetches data from Kafka, processes it according to configured rules, and then sends it to the ElasticSearch cluster.
- Elasticsearch is a search engine built on Lucene. It provides fast full-text search, handles large datasets well, and is highly scalable.
- Kibana is the front-end interface for retrieving and visualizing the data in Elasticsearch.

Process Overview
Log data flows through the system from where it is produced to where it is consumed:

1. Services run as docker containers, and each service's logs are written in json format to /var/lib/docker/containers/**/*-json.log (a sample line is shown after this list).
2. The filebeat service collects the container logs and forwards them to kafka.
3. go-stash consumes the data from Kafka, processes it according to the configured rules, and sends it to the ElasticSearch cluster for indexing and storage.
4. Kibana acts as the front end, querying data from Elasticsearch and providing visualization.
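For reference, a line in one of those *-json.log files looks like the following. The log/stream/time envelope is what Docker's json-file logging driver writes; the container id in the path and the payload inside log are only made-up placeholders for illustration.

```bash
# Peek at the newest entry of one container's log file
tail -n 1 /var/lib/docker/containers/<container-id>/<container-id>-json.log
# {"log":"{\"level\":\"info\",\"msg\":\"request handled\"}\n","stream":"stdout","time":"2024-05-01T08:00:00.000000000Z"}
```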
File Layout

```
log-collect
|-- deploy
|   |-- filebeat
|   |   `-- conf
|   |       `-- filebeat.yml
|   `-- go-stash
|       `-- etc
|           `-- config.yaml
|-- docker-compose.yaml
```
Configuration Files
filebeat.yml
```yaml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/lib/docker/containers/*/*-json.log

filebeat.config:
  modules:
    path: ${path.config}/modules.d/*.yml
    reload.enabled: false

processors:
  - add_cloud_metadata: ~
  - add_docker_metadata: ~

output.kafka:
  enabled: true
  hosts: ["kafka:9092"]
  # The topic has to be created in advance
  topic: "openui-log"
  partition.hash:
    reachable_only: true
  compression: gzip
  max_message_bytes: 1000000
  required_acks: 1
```
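Because the compose file below sets KAFKA_AUTO_CREATE_TOPICS_ENABLE=false, the openui-log topic has to exist before Filebeat can publish to it. A minimal sketch of creating it inside the kafka container: it assumes the Kafka scripts are on the container's PATH (as in the wurstmeister image) and that the broker version accepts --bootstrap-server (Kafka 2.2+); the partition and replication numbers are just placeholders.

```bash
# Create the topic Filebeat publishes to (run once, after the kafka container is up)
docker exec -it kafka kafka-topics.sh --create \
  --bootstrap-server kafka:9092 \
  --topic openui-log \
  --partitions 3 \
  --replication-factor 1

# Confirm it exists
docker exec -it kafka kafka-topics.sh --list --bootstrap-server kafka:9092
```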
config.yaml (go-stash)
```yaml
Clusters:
  - Input:
      Kafka:
        Name: gostash
        Brokers:
          - "kafka:9092"
        Topics:
          - openui-log
        Group: pro
        Consumers: 16
    Filters:
      - Action: drop
        Conditions:
          - Key: k8s_container_name
            Value: "-rpc"
            Type: contains
          - Key: level
            Value: info
            Type: match
            Op: and
      - Action: remove_field
        Fields:
          # - message
          - _source
          - _type
          - _score
          - _id
          - "@version"
          - topic
          - index
          - beat
          - docker_container
          - offset
          - prospector
          - source
          - stream
          - "@metadata"
      - Action: transfer
        Field: message
        Target: data
    Output:
      ElasticSearch:
        Hosts:
          - "http://elasticsearch:9200"
        Index: "openui-{{yyyy-MM-dd}}"
        Username: "elastic"
        Password: "tester"
```
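Once log data starts flowing, the output above should create one index per day named openui-YYYY-MM-dd. A quick way to check from the host (a sketch; it assumes Elasticsearch is reachable on localhost:9200, the port published in the compose file, with the elastic/tester credentials configured above):

```bash
# List indices and confirm the daily openui-* index shows up
curl -s -u elastic:tester "http://localhost:9200/_cat/indices/openui-*?v"
```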
Deployment
docker-compose.yaml
```yaml
version: '3'
services:
  elasticsearch:
    image: elasticsearch:7.13.4
    container_name: elasticsearch
    user: root
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - TZ=Asia/Shanghai
    volumes:
      - ./data/elasticsearch/data:/usr/share/elasticsearch/data
      - ./data/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    restart: always
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      - openui_net

  # Kibana, to view Elasticsearch data
  kibana:
    image: kibana:7.13.4
    container_name: kibana
    environment:
      - elasticsearch.hosts=http://elasticsearch:9200
      - elasticsearch.username="elastic"
      - elasticsearch.password="tester"
      - TZ=Asia/Shanghai
    restart: always
    networks:
      - openui_net
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

  # Elasticsearch UI
  elastichd:
    container_name: elastichd
    image: containerize/elastichd
    restart: always
    networks:
      - openui_net
    ports:
      - "9800:9800"
    depends_on:
      - elasticsearch

  kafka-ui:
    container_name: kafka-ui
    image: provectuslabs/kafka-ui:latest
    ports:
      - 9090:8080
    environment:
      DYNAMIC_CONFIG_ENABLED: 'true'
      KAFKA_CLUSTERS_0_NAME: kafka-work
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092
    depends_on:
      - kafka
    networks:
      - openui_net

  # Consume the data Filebeat collected into Kafka and output it to ES
  go-stash:
    image: eilinge/go-stash  # golang:v1.22
    container_name: go-stash
    environment:
      # Time zone Shanghai (change if needed)
      TZ: Asia/Shanghai
    user: root
    restart: always
    volumes:
      - ./deploy/go-stash/etc:/app/etc
    networks:
      - openui_net
    depends_on:
      - elasticsearch
      - kafka

  # Collect business data
  filebeat:
    image: elastic/filebeat:7.13.4
    container_name: filebeat
    environment:
      # Time zone Shanghai (change if needed)
      TZ: Asia/Shanghai
    user: root
    restart: always
    entrypoint: "filebeat -e -strict.perms=false"  # Works around config file permission problems
    volumes:
      - ./deploy/filebeat/conf/filebeat.yml:/usr/share/filebeat/filebeat.yml
      - /var/lib/docker/containers:/var/lib/docker/containers
    networks:
      - openui_net
    depends_on:
      - kafka

  # Zookeeper is a dependency of Kafka
  zookeeper:
    image: wurstmeister/zookeeper
    container_name: zookeeper
    environment:
      # Time zone Shanghai (change if needed)
      TZ: Asia/Shanghai
    restart: always
    ports:
      - 2181:2181
    networks:
      - openui_net

  # Message queue
  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    ports:
      - 9092:9092
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=kafka
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_AUTO_CREATE_TOPICS_ENABLE=false
      - TZ=Asia/Shanghai
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - openui_net
    depends_on:
      - zookeeper

networks:
  openui_net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.16.0.0/16
```
Bringing Up the Environment
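Starting everything is a single compose command; a minimal sketch, run from the log-collect directory:

```bash
# Start the whole stack in the background, then check container status
docker-compose up -d
docker-compose ps
```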
```
[root@master log-collect]# docker-compose ps
NAME            COMMAND                  SERVICE         STATUS    PORTS
elastichd       "ElasticHD"              elastichd       running   0.0.0.0:9800->9800/tcp, :::9800->9800/tcp
elasticsearch   "/bin/tini -- /usr/l…"   elasticsearch   running   0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp, :::9200->9200/tcp, :::9300->9300/tcp
filebeat        "filebeat -e -strict…"   filebeat        running
go-stash        "./stash -f etc/conf…"   go-stash        running
kafka           "start-kafka.sh"         kafka           running   0.0.0.0:9092->9092/tcp, :::9092->9092/tcp
kafka-ui        "/bin/sh -c 'java --…"   kafka-ui        running   0.0.0.0:9090->8080/tcp, :::9090->8080/tcp
kibana          "/bin/tini -- /usr/l…"   kibana          running   0.0.0.0:5601->5601/tcp, :::5601->5601/tcp
zookeeper       "/bin/sh -c '/usr/sb…"   zookeeper       running   0.0.0.0:2181->2181/tcp, :::2181->2181/tcp
```
Check that each service is running properly:

- elastic-cluster
- kafka-cluster
- kafka-topic
- kibana
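The UIs in the compose file cover most of these checks: kafka-ui on port 9090, ElasticHD on 9800, and Kibana on 5601. The Elasticsearch cluster can also be probed from the command line (a sketch, using the port and the elastic/tester credentials from the compose file above):

```bash
# Cluster health; a single-node setup typically reports "yellow" or "green"
curl -s -u elastic:tester "http://localhost:9200/_cluster/health?pretty"
```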
Weaknesses and Pitfalls

The deployment files above also make some weaknesses fairly obvious, and the deployment itself was not all smooth sailing: some of the errors could easily discourage anyone who has had little exposure to these services. Fortunately, solutions to the common problems can all be found online; what matters most is keeping a love of learning. The issues I ran into:
- Kibana's Stack Monitoring shows the node as offline
- Setting a username and password for ElasticSearch + Kibana
- How elastichd connects to ES once password authentication is enabled
- go-stash depends on an old version of json-iterator; use my go-stash image, freshly rebuilt on golang:v1.22
- Elasticsearch fails to start because of insufficient file permissions
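For the last item, the compose file above already runs the elasticsearch container as user: root, which sidesteps the problem. A commonly used alternative (my own suggestion, not part of the original setup) is to keep the image's default user and instead hand the bind-mounted data directory to uid/gid 1000, which the official Elasticsearch image runs as:

```bash
# Give the elasticsearch container's default user (uid/gid 1000) ownership of the host data dir
mkdir -p ./data/elasticsearch/data
sudo chown -R 1000:1000 ./data/elasticsearch
```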
Log-Collect