
Running multiple ClickHouse containers under Docker: fixing the "DNS_ERROR" log message and failed data synchronization when the ReplicatedMergeTree engine syncs data through Zookeeper

Published: August 6, 2024
A solution for the case where, after running two ClickHouse containers under Docker, the ReplicatedMergeTree engine fails to synchronize data through Zookeeper and the error reported is "DNS_ERROR".

Preparation

Goal

Run two ClickHouse containers on a physical host using Docker and configure them to connect to a Zookeeper server; create replicated tables in both containers following the official documentation for the ReplicatedMergeTree engine; then use ClickHouse's Zookeeper integration to synchronize data, so that the two ClickHouse instances become replicas of each other (data replication).

Setup

  1. Physical host environment

    IP address: 192.168.182.10

  2. Docker container and IP information

    Service | Network | IP address | Hostname
    ck node 1 (first ClickHouse container) | bridge | 172.17.0.3 | 7409ace09488
    ck node 2 (second ClickHouse container) | bridge | 172.17.0.4 | b22679b9d346
    zookeeper | bridge | 172.17.0.5 | (not important)

    Commands used to start each container:

    • ck node 1: docker run -d --name clickhouse-server-c1 --ulimit nofile=262144:262144 --volume=/software/docker/clickhouse/ck1/config:/etc/clickhouse-server/ -v /software/docker/clickhouse/ck1/data:/var/lib/clickhouse -p 8123:8123 -p 9000:9000 -p 9009:9009 yandex/clickhouse-server
    • ck node 2: docker run -d --name clickhouse-server-c2 --ulimit nofile=262144:262144 --volume=/software/docker/clickhouse/ck2/config:/etc/clickhouse-server/ -v /software/docker/clickhouse/ck2/data:/var/lib/clickhouse -p 8124:8123 -p 9001:9000 -p 9010:9009 yandex/clickhouse-server
    • zookeeper: docker run --name zookeeper -d -p 2181:2181 zookeeper
  3. ClickHouse service configuration

    3.1 In the config/config.d directory of both ck node 1 and ck node 2, create a new configuration file named metrika-share.xml with the following content

    • metrika-share.xml contents
    <?xml version="1.0"?>
    <clickhouse>
        <zookeeper>
          <node>
             <!-- Since Zookeeper's port 2181 is already exposed on the host, the host machine's address is used here, which guards against changes to the Zookeeper container's IP -->
             <host>192.168.182.10</host>
             <port>2181</port>
          </node>
        </zookeeper>
    </clickhouse>
    

    3.2 In the config.xml of both ck node 1 and ck node 2, locate the <include_from> tag and set it to the following

    <!-- The Docker volume mount maps this directory into the container, so the file can be referenced directly -->
    <include_from>/etc/clickhouse-server/config.d/metrika-share.xml</include_from>
    

    Note: the ClickHouse service must be restarted after the configuration change.
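
    For the containers above, the restart can be done with the Docker CLI; a minimal sketch using the container names from the table in step 2:

    # restart both ClickHouse containers so the new include takes effect
    docker restart clickhouse-server-c1 clickhouse-server-c2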

Reproducing the problem

1. Querying Zookeeper from the ClickHouse client responds normally; the output is shown in Figure 1.
Figure 1: verifying that Zookeeper is connected correctly
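
One way to run such a check (a sketch; the exact statement shown in Figure 1 is not preserved in the text) is to query the system.zookeeper table, which fails with an error if ClickHouse cannot reach Zookeeper:

    SELECT name, path FROM system.zookeeper WHERE path = '/';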

2. The CREATE TABLE statements using the ReplicatedMergeTree engine execute successfully in both containers' ClickHouse clients; the statements are shown in Figure 2.
Figure 2: the CREATE TABLE statements on the two nodes.
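
Figure 2 survives only as a screenshot, so the following is a sketch reconstructed from the Zookeeper path /clickhouse/tables/01/t_replicated_merge and the replica names r_1/r_2 that appear later in this article; the column definitions are assumptions:

    -- on ck node 1 (on ck node 2 use 'r_2' as the replica name)
    CREATE TABLE default.t_replicated_merge
    (
        id UInt64,        -- hypothetical columns; the real ones are in Figure 2
        value String
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/t_replicated_merge', 'r_1')
    ORDER BY id;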

3. Executing two INSERT statements on ck node 1 results in data on ck node 1 but none on ck node 2, so synchronization fails; likewise, executing two INSERT statements on ck node 2 leaves data on ck node 2 but none on ck node 1. Inspecting the automatically created path /clickhouse/tables/01/t_replicated_merge with IDEA's Zookeeper plugin shows the tree in Figure 3.
    Figure 3: the znodes Zookeeper maintains for ReplicatedMergeTree

Key znodes explained
1) The replicas node holds the replica metadata, including each replica node's connection IP, port, liveness, and similar information.
2) The log node drives data synchronization: replicas exchange data through the log-xxxxx entries under it, where a log-xxxxx entry can be treated as equivalent to an "insert into xxx values(v1,v2)", plus additional attributes such as the client identifier and size.
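
Both znodes can also be inspected from a ClickHouse client; a sketch that lists the metadata recorded for replica r_1 through the system.zookeeper table:

    SELECT name, value
    FROM system.zookeeper
    WHERE path = '/clickhouse/tables/01/t_replicated_merge/replicas/r_1';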

Locating the problem

Open ck node 1's server-side error log, clickhouse-server.err.log (on Linux the error log usually lives in /var/log/clickhouse-server). It reports the error [DNSResolver: Cannot resolve host (b22679b9d346) ... Code: 198. DB::Exception: Not found address of host: b22679b9d346. (DNS_ERROR)]; opening ck node 2's server log shows a similar error. A fuller excerpt of the error log follows:

2022.08.26 11:03:33.409494 [ 90 ] {} <Error> DNSResolver: Cannot resolve host (b22679b9d346), error 0: b22679b9d346.
2022.08.26 11:03:33.409738 [ 90 ] {} <Error> default.t_replicated_merge (35f8baf4-d1e4-43d9-88fc-c45e1ff8adb0): auto DB::StorageReplicatedMergeTree::processQueueEntry(ReplicatedMergeTreeQueue::SelectedEntryPtr)::(anonymous class)::operator()(DB::StorageReplicatedMergeTree::LogEntryPtr &) const: Code: 198. DB::Exception: Not found address of host: b22679b9d346. (DNS_ERROR), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xa82d07a in /usr/bin/clickhouse
1. ? @ 0xa8e7951 in /usr/bin/clickhouse
2. ? @ 0xa8e8122 in /usr/bin/clickhouse
3. DB::DNSResolver::resolveHost(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xa8e71fe in /usr/bin/clickhouse
4. ? @ 0xa9d9493 in /usr/bin/clickhouse
5. ? @ 0xa9dd0f9 in /usr/bin/clickhouse
6. PoolBase<Poco::Net::HTTPClientSession>::get(long) @ 0xa9dc837 in /usr/bin/clickhouse
7. DB::makePooledHTTPSession(Poco::URI const&, Poco::URI const&, DB::ConnectionTimeouts const&, unsigned long, bool) @ 0xa9dacb8 in /usr/bin/clickhouse
8. DB::makePooledHTTPSession(Poco::URI const&, DB::ConnectionTimeouts const&, unsigned long, bool) @ 0xa9d9819 in /usr/bin/clickhouse
9. DB::UpdatablePooledSession::UpdatablePooledSession(Poco::URI, DB::ConnectionTimeouts const&, unsigned long, unsigned long) @ 0x141c76e9 in /usr/bin/clickhouse
10. DB::PooledReadWriteBufferFromHTTP::PooledReadWriteBufferFromHTTP(Poco::URI, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::function<void (std::__1::basic_ostream<char, std::__1::char_traits<char> >&)>, DB::ConnectionTimeouts const&, Poco::Net::HTTPBasicCredentials const&, unsigned long, unsigned long, unsigned long) @ 0x141be51e in /usr/bin/clickhouse
11. DB::DataPartsExchange::Fetcher::fetchPart(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::Context const>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, DB::ConnectionTimeouts const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Throttler>, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::optional<DB::CurrentlySubmergingEmergingTagger>*, bool, std::__1::shared_ptr<DB::IDisk>) @ 0x141bb8f2 in /usr/bin/clickhouse
12. ? @ 0x1405188d in /usr/bin/clickhouse
13. DB::StorageReplicatedMergeTree::fetchPart(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, std::__1::shared_ptr<zkutil::ZooKeeper>) @ 0x13fae906 in /usr/bin/clickhouse
14. DB::StorageReplicatedMergeTree::executeFetch(DB::ReplicatedMergeTreeLogEntry&) @ 0x13fa12b4 in /usr/bin/clickhouse
15. DB::StorageReplicatedMergeTree::executeLogEntry(DB::ReplicatedMergeTreeLogEntry&) @ 0x13f9074d in /usr/bin/clickhouse
16. ? @ 0x1404f41f in /usr/bin/clickhouse
17. DB::ReplicatedMergeTreeQueue::processEntry(std::__1::function<std::__1::shared_ptr<zkutil::ZooKeeper> ()>, std::__1::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&, std::__1::function<bool (std::__1::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&)>) @ 0x14468165 in /usr/bin/clickhouse
18. DB::StorageReplicatedMergeTree::processQueueEntry(std::__1::shared_ptr<DB::ReplicatedMergeTreeQueue::SelectedEntry>) @ 0x13fd1073 in /usr/bin/clickhouse
19. DB::ExecutableLambdaAdapter::executeStep() @ 0x1404fd11 in /usr/bin/clickhouse
20. DB::MergeTreeBackgroundExecutor<DB::OrdinaryRuntimeQueue>::routine(std::__1::shared_ptr<DB::TaskRuntimeData>) @ 0xa806ffa in /usr/bin/clickhouse
21. DB::MergeTreeBackgroundExecutor<DB::OrdinaryRuntimeQueue>::threadFunction() @ 0xa806eb5 in /usr/bin/clickhouse
22. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) @ 0xa8720aa in /usr/bin/clickhouse
23. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()&&...)::'lambda'()::operator()() @ 0xa873ec4 in /usr/bin/clickhouse
24. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xa86f4b7 in /usr/bin/clickhouse
25. ? @ 0xa872ebd in /usr/bin/clickhouse
26. ? @ 0x7fb368403609 in ?
27. __clone @ 0x7fb36832a293 in ?
 (version 22.1.3.7 (official build))

Since the error is an unresolvable hostname, the next step is to find where that name comes from. The replicas node described under Figure 3 records each replica's host and IP information; opening it shows that the host stored under r_1 (ck node 1) and r_2 (ck node 2) is exactly the host from the error log: each replica registered its container hostname in Zookeeper, and the other container cannot resolve that name. Part of r_1's data is shown in Figure 4 and part of r_2's in Figure 5.
Figure 4: znode data recorded for ck node 1
Figure 5: znode data recorded for ck node 2
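
The failed resolution can be confirmed from inside a container; a sketch, assuming getent is available in the yandex/clickhouse-server image:

    # ask ck node 1's container to resolve ck node 2's hostname;
    # on the default bridge network this prints nothing and exits non-zero
    docker exec clickhouse-server-c1 getent hosts b22679b9d346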

Solving the problem

Option 1
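
The concrete steps of Option 1 survive only as screenshots in the original post. Judging from Option 2 below, whose step 1 comments the <interserver_http_host> tag back out, Option 1 sets <interserver_http_host> so that each replica registers a resolvable address in Zookeeper instead of its container hostname. A hedged sketch, assuming the physical host's IP and the port mappings from the table in step 2 (since both replicas then share one address, ck node 2 presumably must also advertise a distinct interserver port, e.g. by setting <interserver_http_port> to 9010 and adjusting its port mapping to -p 9010:9010):

    <!-- assumed Option 1 change in ck node 1's config.xml -->
    <interserver_http_host>192.168.182.10</interserver_http_host>
    <interserver_http_port>9009</interserver_http_port>

    <!-- assumed Option 1 change in ck node 2's config.xml -->
    <interserver_http_host>192.168.182.10</interserver_http_host>
    <interserver_http_port>9010</interserver_http_port>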

After the steps above, querying the data on either node again shows that everything that previously failed to replicate has now synchronized correctly. Problem solved.

Drawbacks of Option 1

  1. When a Docker container is stopped and started again, its internal IP address can change, so the configuration must be corrected every time; (major)
  2. Replica identity should be resolved dynamically through macros, which requires extra macro tags in metrika-share.xml, as sketched after this list (minor)
  3. The same configuration files must be duplicated on every node, which is laborious (minor)
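
For point 2, a hedged sketch of such a macro configuration (the tag values are assumptions; with macros, one identical CREATE TABLE statement works on every node):

    <!-- added to metrika-share.xml on ck node 1; use r_2 on ck node 2 -->
    <macros>
        <shard>01</shard>
        <replica>r_1</replica>
    </macros>

The engine clause can then be written once, on both nodes, as ReplicatedMergeTree('/clickhouse/tables/{shard}/t_replicated_merge', '{replica}').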

Option 2

Note: this option was added on April 17, 2023. Its advantage over Option 1 is that it eliminates the major drawback (for simple scripts that handle the remaining ones, contact the author), it does not require changing the <interserver_http_host> tag, and it can even be used in test/production environments with limited physical resources.


Idea

Docker's default network has an inherent limitation: container IPs on that network are not bound to container hostnames, so containers cannot resolve one another by name. Docker does, however, support user-defined networks, which automatically associate each container's IP with its hostname. Placing all the ClickHouse containers on the same user-defined network therefore lets them communicate by hostname.

Steps

1. (If Option 1 was used) comment out the <interserver_http_host> tag in the ClickHouse containers' configs.
2. Run the Docker command: docker network create clickhouse-network
3. Modify the original docker run commands of ck node 1 and ck node 2 to add the --network clickhouse-network option (full commands are sketched below).
4. Restart the services, and that's it.
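
Putting the steps together, a sketch of the recreated containers; the names, volumes, and port mappings come from the table in step 2, and the host-mounted volumes preserve configuration and data across recreation:

    docker network create clickhouse-network

    # remove the old containers (data survives in the host-mounted volumes)
    docker rm -f clickhouse-server-c1 clickhouse-server-c2

    docker run -d --name clickhouse-server-c1 --network clickhouse-network \
      --ulimit nofile=262144:262144 \
      -v /software/docker/clickhouse/ck1/config:/etc/clickhouse-server/ \
      -v /software/docker/clickhouse/ck1/data:/var/lib/clickhouse \
      -p 8123:8123 -p 9000:9000 -p 9009:9009 yandex/clickhouse-server

    docker run -d --name clickhouse-server-c2 --network clickhouse-network \
      --ulimit nofile=262144:262144 \
      -v /software/docker/clickhouse/ck2/config:/etc/clickhouse-server/ \
      -v /software/docker/clickhouse/ck2/data:/var/lib/clickhouse \
      -p 8124:8123 -p 9001:9000 -p 9010:9009 yandex/clickhouse-server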
