Building a personal big data cluster from scratch: environment preparation
Building a personal big data cluster from scratch (1): ZooKeeper
Building a personal big data cluster from scratch (2): HDFS
Building a personal big data cluster from scratch (3): YARN
Building a personal big data cluster from scratch (4): HIVE
Building a personal big data cluster from scratch (5): HBASE
Building a personal big data cluster from scratch (6): SPARK
Preparation before installation
1. Have ZooKeeper installed
2. Have kafka_2.12-2.6.2.tgz ready
Extract the package
cd /opt/packages
tar -zxf kafka_2.12-2.6.2.tgz -C ../apps
cd ../apps
ln -s kafka_2.12-2.6.2 kafka
Configure Kafka
server.properties
Define the listener port
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
# Note: the standalone `port` property is deprecated in this Kafka version;
# the listen port is configured through `listeners` instead.
listeners=PLAINTEXT://:9092
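One thing to watch out for here: in a multi-broker cluster every node needs a unique broker.id, and the shipped default file sets broker.id=0 on every copy. A minimal sketch of stamping a per-node id after distributing the package — the helper name and the id-per-host mapping are my assumptions, and it is demonstrated on a throwaway copy of the file:

```shell
# Hypothetical helper: stamp a unique broker.id into a server.properties file.
# On the real cluster you would point it at /opt/apps/kafka/config/server.properties
# and use ids 0, 1, 2 on hd1, hd2, hd3 respectively.
set_broker_id() {
  local file="$1" id="$2"
  sed -i "s/^broker\.id=.*/broker.id=${id}/" "$file"
}

# demo on a local copy of the relevant lines
printf 'broker.id=0\nlisteners=PLAINTEXT://:9092\n' > /tmp/server.properties.demo
set_broker_id /tmp/server.properties.demo 2
grep '^broker.id=' /tmp/server.properties.demo
```

If two brokers register with the same id under the same ZooKeeper chroot, the second one fails to start, so this is worth doing right after distribution.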
Configure the log storage directory (note that in Kafka "log" means the stored message data, not application logs)
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/data/kafka-logs
Set the ZooKeeper address and timeout
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=hd1:2181,hd2:2181,hd3:2181/kafka
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=60000
producer.properties
Configure the producer bootstrap servers
############################# Producer Basics #############################
# list of brokers used for bootstrapping knowledge about the rest of the cluster
# format: host1:port1,host2:port2 ...
bootstrap.servers=hd1:9092,hd2:9092,hd3:9092
# specify the compression codec for all data generated: none, gzip, snappy, lz4, zstd
compression.type=none
consumer.properties
# list of brokers used for bootstrapping knowledge about the rest of the cluster
# format: host1:port1,host2:port2 ...
bootstrap.servers=hd1:9092,hd2:9092,hd3:9092
# consumer group id
group.id=kc-consumer-group
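These property files are plain key=value text, so shell scripts can reuse their values instead of hard-coding the broker list a second time. A quick sketch, demonstrated on a throwaway copy (on the cluster the path would be /opt/apps/kafka/config/consumer.properties):

```shell
# Pull a single value out of a Kafka-style key=value properties file.
printf 'bootstrap.servers=hd1:9092,hd2:9092,hd3:9092\ngroup.id=kc-consumer-group\n' \
  > /tmp/consumer.properties.demo

# grep the key, then take everything after the first '='
grep '^bootstrap.servers=' /tmp/consumer.properties.demo | cut -d= -f2
```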
Distribute the installation
Kafka depends on ZooKeeper, and here I install Kafka on the same three nodes that run ZooKeeper, so the fully configured Kafka directory is sent to all three ZooKeeper nodes.
cd /opt/apps
# distribute the package
rszk kafka_2.12-2.6.2
# distribute the symlink
rszk kafka
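rszk is a helper from my environment, not a standard command. If you do not have an equivalent, a minimal stand-in that pushes a path under /opt/apps to the other two nodes with scp could look like the sketch below (host names and paths follow this guide; adjust to yours). It only builds the commands so you can inspect them before running:

```shell
# Build (but do not execute) the scp commands that would distribute a path
# under /opt/apps to the other nodes -- a hypothetical stand-in for rszk.
build_dist_cmds() {
  local src="$1"; shift
  local host
  for host in "$@"; do
    echo "scp -r /opt/apps/${src} ${host}:/opt/apps/"
  done
}

build_dist_cmds kafka_2.12-2.6.2 hd2 hd3
```

Pipe the output to `sh` once you are happy with it. Note that scp -r follows symlinks, so for the kafka link itself it is simpler to just run `ln -s kafka_2.12-2.6.2 kafka` on each node.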
Start
# start ZooKeeper first
# zkman is an alias I set up; if you have read the earlier posts in this series you will recognize it. If not, just run zkServer.sh start on each node instead.
zkman start
# start Kafka
# run this on each of the three nodes
nohup kafka-server-start.sh /opt/apps/kafka/config/server.properties > /tmp/kafka_logs 2>&1 &
# for convenience, I also set up an alias for the Kafka start command
alias zkcli='zkCli.sh -server hd1:2181,hd2:2181,hd3:2181'
alias zkman='~/zk/ssh_all.sh zkServer.sh'
alias kafkastart='~/zk/ssh_all.sh "nohup kafka-server-start.sh /opt/apps/kafka/config/server.properties > /tmp/kafka_logs 2>&1 &"'
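The nohup-plus-alias approach works, but if your nodes run systemd, a unit file keeps the broker supervised and restarts it on failure. A hypothetical sketch — the file name and options are my assumptions, paths follow this guide's layout:

```ini
# /etc/systemd/system/kafka.service (hypothetical sketch)
[Unit]
Description=Apache Kafka broker
After=network.target

[Service]
Type=simple
ExecStart=/opt/apps/kafka/bin/kafka-server-start.sh /opt/apps/kafka/config/server.properties
ExecStop=/opt/apps/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After placing the file, `sudo systemctl daemon-reload && sudo systemctl enable --now kafka` starts the broker and makes it come up on boot.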
Test producing and consuming
# first, create a topic
# (--zookeeper still works in Kafka 2.6 but is deprecated; kafka-topics.sh also accepts --bootstrap-server hd1:9092)
kafka-topics.sh --create --replication-factor 1 --partitions 1 --topic kctest --zookeeper hd1:2181,hd2:2181,hd3:2181/kafka
# start a producer
kafka-console-producer.sh --broker-list hd1:9092,hd2:9092,hd3:9092 --topic kctest
# start a consumer
kafka-console-consumer.sh --bootstrap-server hd1:9092,hd2:9092,hd3:9092 --topic kctest