Building a Personal Big Data Cluster from Scratch: Environment Preparation
Building a Personal Big Data Cluster from Scratch (1): zookeeper
Building a Personal Big Data Cluster from Scratch (2): HDFS
Building a Personal Big Data Cluster from Scratch (3): YARN
Building a Personal Big Data Cluster from Scratch (4): HIVE
Building a Personal Big Data Cluster from Scratch (5): HBASE
Building a Personal Big Data Cluster from Scratch (6): SPARK

Pre-installation preparation

1. Have ZooKeeper installed
2. Have kafka_2.12-2.6.2.tgz downloaded

Extract the package

cd /opt/packages
tar -zxf kafka_2.12-2.6.2.tgz -C ../apps
cd ../apps
ln -s kafka_2.12-2.6.2 kafka
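The extract-and-link steps can be sketched end to end. In this sketch a scratch directory stands in for /opt and a stub tarball stands in for the real kafka_2.12-2.6.2.tgz, so it is runnable anywhere; the point is the `-C` flag (extract into `apps/`) and creating the symlink next to the extracted tree:

```shell
# Scratch dir stands in for /opt; stub tarball stands in for the real one.
base=$(mktemp -d)
mkdir -p "$base/packages" "$base/apps"
mkdir -p "$base/packages/kafka_2.12-2.6.2/bin"
tar -czf "$base/packages/kafka_2.12-2.6.2.tgz" -C "$base/packages" kafka_2.12-2.6.2
rm -r "$base/packages/kafka_2.12-2.6.2"

cd "$base/packages"
tar -zxf kafka_2.12-2.6.2.tgz -C ../apps    # -C extracts into apps/, not here
cd ../apps                                  # create the link beside the tree
ln -s kafka_2.12-2.6.2 kafka                # version-free path for configs
```

The version-free `kafka` symlink lets configs and aliases survive upgrades: install a new version beside the old one and repoint the link.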

Configure Kafka

server.properties

Define the port

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092
port=9092
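Note that `port=` is a legacy broker setting: recent brokers only honor it when `listeners` is unset. The modern form, which should be equivalent here (bind all interfaces on 9092), would be:

```properties
# Equivalent to port=9092; remove port= if you set this instead.
listeners=PLAINTEXT://:9092
```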

Configure the log storage directory

############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/data/kafka-logs
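As the comment above says, `log.dirs` accepts a comma-separated list; with several directories Kafka spreads partitions across them, so one directory per physical disk is a common layout. A hypothetical multi-disk example (paths are illustrative, not from this cluster):

```properties
# Hypothetical example: one directory per physical disk
log.dirs=/data1/kafka-logs,/data2/kafka-logs
```

Whichever layout you pick, make sure each directory exists and is writable by the user that starts Kafka before the first startup.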

Set the ZooKeeper address and timeout

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=hd1:2181,hd2:2181,hd3:2181/kafka

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=60000

producer.properties

Configure the producer bootstrap servers

############################# Producer Basics #############################

# list of brokers used for bootstrapping knowledge about the rest of the cluster
# format: host1:port1,host2:port2 ...
bootstrap.servers=hd1:9092,hd2:9092,hd3:9092

# specify the compression codec for all data generated: none, gzip, snappy, lz4, zstd
compression.type=none

consumer.properties

# list of brokers used for bootstrapping knowledge about the rest of the cluster
# format: host1:port1,host2:port2 ...
bootstrap.servers=hd1:9092,hd2:9092,hd3:9092

# consumer group id
group.id=kc-consumer-group

Distribute the configuration

Kafka depends on ZooKeeper, and here I install Kafka on the same three nodes that run ZooKeeper, so the fully configured Kafka directory is shipped to all three ZooKeeper nodes.

cd /opt/apps
# distribute the package (rszk is my distribution helper from the earlier posts in this series)
rszk kafka_2.12-2.6.2
# distribute the symlink
rszk kafka
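One caveat when shipping an identical server.properties to every node: `broker.id` must be unique per broker, otherwise the brokers collide when registering in ZooKeeper (with the template default of `broker.id=0` everywhere, only one of the three comes up cleanly). A small sketch of the per-node fix; `set_broker_id` is a hypothetical helper of my own, demonstrated against a scratch copy of the file:

```shell
# broker.id must differ on every node after the copy.
# Hypothetical helper: run on each node with that node's index
# (e.g. hd1 -> 1, hd2 -> 2, hd3 -> 3). Uses GNU sed's in-place edit.
set_broker_id() {   # usage: set_broker_id <id> <server.properties path>
  sed -i "s/^broker\.id=.*/broker.id=$1/" "$2"
}

# Demonstration against a scratch copy of the file:
f=$(mktemp)
echo "broker.id=0" > "$f"
set_broker_id 2 "$f"
```

On the real cluster this would be, e.g. on hd2: `set_broker_id 2 /opt/apps/kafka/config/server.properties`.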

Start the services

# Start ZooKeeper first
# zkman is an alias of mine; if you have read the earlier posts in this series you will know it.
# Otherwise just run zkServer.sh start on each node as usual.
zkman start
# Start Kafka
# run on each of the three nodes
nohup kafka-server-start.sh /opt/apps/kafka/config/server.properties > /tmp/kafka_logs 2>&1 &
# For convenience, I also defined an alias for the Kafka start command
alias zkcli='zkCli.sh -server hd1:2181,hd2:2181,hd3:2181'
alias zkman='~/zk/ssh_all.sh zkServer.sh'
alias kafkastart='~/zk/ssh_all.sh "nohup kafka-server-start.sh /opt/apps/kafka/config/server.properties > /tmp/kafka_logs 2>&1 &"'

Test producing and consuming

# First create a topic
kafka-topics.sh --create --replication-factor 1 --partitions 1 --topic kctest --zookeeper hd1:2181,hd2:2181,hd3:2181/kafka
# (since Kafka 2.2 you can pass --bootstrap-server hd1:9092 here instead; --zookeeper is deprecated)
# Start a console producer
kafka-console-producer.sh --broker-list hd1:9092,hd2:9092,hd3:9092 --topic kctest
# Start a console consumer in another terminal; lines typed into the producer should appear here
kafka-console-consumer.sh --bootstrap-server hd1:9092,hd2:9092,hd3:9092 --topic kctest


Q.E.D.