Deploying the latest Apache Druid 32.0.0, a big-data time-series database, on Win10 with Docker Desktop
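Preface
A common scenario in big-data analytics is time-series data. In short, once a data point is produced it is never modified, and as time passes every point in time gets a new state value. The volume of such data is often staggering, for example the raw monitoring data from sensors: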
https://lizhiyong.blog.csdn.net/article/details/114898620
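A single, simple accelerometer produces about 3.1 billion records a year! If data from manufacturing sensors is not pre-processed by lower-level controllers such as PLCs and is pushed straight to the edge-computing gateway, the load is enormous even over MQTT!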
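As a rough sanity check of that figure, the sketch below counts the samples one sensor produces in a year. The 100 Hz sampling rate is an assumption chosen for illustration; the post does not state the real rate.

```python
# Back-of-the-envelope count of samples produced by one sensor in a year.
# ASSUMPTION: the sensor samples at 100 Hz; the post does not state the real rate.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # seconds in a non-leap year


def samples_per_year(sampling_hz: int) -> int:
    """Return the number of samples emitted in one non-leap year."""
    return sampling_hz * SECONDS_PER_YEAR


yearly = samples_per_year(100)
print(f"{yearly:,} samples per year")
```

At that assumed rate the result lands in the low billions, the same order of magnitude as the number quoted above.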
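Raw server monitoring data is similar, for example from Prometheus and Zabbix. With many clusters there are many monitored items, and once you count the monitoring deployed inside virtualized containers and VMs, the volume becomes very considerable. Tens of billions of records per hour is commonplace!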
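However, storing all of the raw monitoring data is frighteningly expensive, and its information density is very low. More often we only care about the complete recent hot data, for example for online model training, and manually inspecting several thousand records per second is not realistic either. In practice, simple per-second or per-minute statistics cover most analysis scenarios, and cold data more than a day old has little timeliness left.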
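To make the idea of per-minute statistics concrete, here is a minimal sketch of pre-aggregation in plain Python. It only illustrates the concept; it is not how Druid implements rollup, and the sample readings are made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def rollup_by_minute(events):
    """Aggregate (timestamp, value) pairs into per-minute count/sum/min/max."""
    buckets = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for ts, value in events:
        minute = ts.replace(second=0, microsecond=0)  # truncate to the minute
        bucket = buckets[minute]
        bucket["count"] += 1
        bucket["sum"] += value
        bucket["min"] = min(bucket["min"], value)
        bucket["max"] = max(bucket["max"], value)
    return dict(buckets)


# Hypothetical raw readings: three in one minute, one in the next.
raw = [
    (datetime(2025, 2, 13, 23, 0, 1, tzinfo=timezone.utc), 1.0),
    (datetime(2025, 2, 13, 23, 0, 30, tzinfo=timezone.utc), 3.0),
    (datetime(2025, 2, 13, 23, 0, 59, tzinfo=timezone.utc), 2.0),
    (datetime(2025, 2, 13, 23, 1, 5, tzinfo=timezone.utc), 5.0),
]
result = rollup_by_minute(raw)
for minute, stats in sorted(result.items()):
    print(minute.isoformat(), stats)
```

Four raw rows collapse into two stored rows here. With thousands of readings per second the reduction is far larger, which is why pre-aggregation cuts storage cost so much.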
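This kind of scenario calls for a database with high throughput and pre-aggregation. After load testing, out of Apache Druid, ClickHouse, and Kylin, I chose the first one, Apache Druid. Leave specialized work to specialized components!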
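Business developers who do not work on the kernel or on secondary development of these components should mostly focus on the API, features, and usage, and should not spend much effort on things like deployment! I had already installed Docker Desktop earlier: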
https://lizhiyong.blog.csdn.net/article/details/145580868
Today, let's set up the latest Apache Druid on Win10 and play with it.
Choosing a version
Official site:
https://druid.apache.org/
Note that this is not Druid, the Alibaba database connection pool!
As of 2025-02-13, the latest Apache Druid release is 32.0.0.
Preparing the resources
See the official docs:
https://druid.apache.org/docs/latest/tutorials/docker
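The official tutorial orchestrates the containers with docker-compose.yml. For a real-time component, plenty of memory is a must! The tutorial mentions 8 containers (ZooKeeper + PostgreSQL + 6 Druid), each using up to 7 GB of memory, which is no big deal! Note, though, that the compose file below defines only 7 services (ZooKeeper, PostgreSQL, and 5 Druid services), which matches the 7 containers started during deployment.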
https://raw.githubusercontent.com/apache/druid/32.0.0/distribution/docker/docker-compose.yml
That URL gives us this resource file:
version: "2.2"

volumes:
  metadata_data: {}
  middle_var: {}
  historical_var: {}
  broker_var: {}
  coordinator_var: {}
  router_var: {}
  druid_shared: {}

services:
  postgres:
    container_name: postgres
    image: postgres:latest
    ports:
      - "5432:5432"
    volumes:
      - metadata_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=FoolishPassword
      - POSTGRES_USER=druid
      - POSTGRES_DB=druid

  # Need 3.5 or later for container nodes
  zookeeper:
    container_name: zookeeper
    image: zookeeper:3.5.10
    ports:
      - "2181:2181"
    environment:
      - ZOO_MY_ID=1

  coordinator:
    image: apache/druid:32.0.0
    container_name: coordinator
    volumes:
      - druid_shared:/opt/shared
      - coordinator_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
    ports:
      - "8081:8081"
    command:
      - coordinator
    env_file:
      - environment

  broker:
    image: apache/druid:32.0.0
    container_name: broker
    volumes:
      - broker_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8082:8082"
    command:
      - broker
    env_file:
      - environment

  historical:
    image: apache/druid:32.0.0
    container_name: historical
    volumes:
      - druid_shared:/opt/shared
      - historical_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8083:8083"
    command:
      - historical
    env_file:
      - environment

  middlemanager:
    image: apache/druid:32.0.0
    container_name: middlemanager
    volumes:
      - druid_shared:/opt/shared
      - middle_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8091:8091"
      - "8100-8105:8100-8105"
    command:
      - middlemanager
    env_file:
      - environment

  router:
    image: apache/druid:32.0.0
    container_name: router
    volumes:
      - router_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "3012:8888"  # changed to 3012 by the author to avoid hogging a useful port
    command:
      - router
    env_file:
      - environment
See also another page on the official site:
https://druid.apache.org/docs/latest/configuration/
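If you are just playing around, these runtime settings can be left unchanged for now. Since everything runs in containers, redeploying later is very easy too!
We also need: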
https://raw.githubusercontent.com/apache/druid/32.0.0/distribution/docker/environment
as a second configuration file:
# Java tuning
#DRUID_XMX=1g
#DRUID_XMS=1g
#DRUID_MAXNEWSIZE=250m
#DRUID_NEWSIZE=250m
#DRUID_MAXDIRECTMEMORYSIZE=6172m
DRUID_SINGLE_NODE_CONF=micro-quickstart

druid_emitter_logging_logLevel=debug

druid_extensions_loadList=["druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "postgresql-metadata-storage", "druid-multi-stage-query"]

druid_zk_service_host=zookeeper

druid_metadata_storage_host=
druid_metadata_storage_type=postgresql
druid_metadata_storage_connector_connectURI=jdbc:postgresql://postgres:5432/druid
druid_metadata_storage_connector_user=druid
druid_metadata_storage_connector_password=FoolishPassword

druid_indexer_runner_javaOptsArray=["-server", "-Xmx1g", "-Xms1g", "-XX:MaxDirectMemorySize=3g", "-Duser.timezone=UTC", "-Dfile.encoding=UTF-8", "-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager"]
druid_indexer_fork_property_druid_processing_buffer_sizeBytes=256MiB

druid_storage_type=local
druid_storage_storageDirectory=/opt/shared/segments
druid_indexer_logs_type=file
druid_indexer_logs_directory=/opt/shared/indexing-logs

druid_processing_numThreads=2
druid_processing_numMergeBuffers=2

DRUID_LOG4J=<?xml version="1.0" encoding="UTF-8" ?><Configuration status="WARN"><Appenders><Console name="Console" target="SYSTEM_OUT"><PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/></Console></Appenders><Loggers><Root level="info"><AppenderRef ref="Console"/></Root><Logger name="org.apache.druid.jetty.RequestLog" additivity="false" level="DEBUG"><AppenderRef ref="Console"/></Logger></Loggers></Configuration>
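As far as I understand the official image, variables with a lowercase druid_ prefix are passed to Druid as runtime properties with the underscores replaced by dots, while the uppercase DRUID_ variables are settings for the launcher itself. The sketch below illustrates that naming convention only. It is a simplification written for this post, not the image's actual startup script, and it ignores any edge cases the real script handles.

```python
def env_to_property(name: str) -> str:
    """Map druid_zk_service_host to druid.zk.service.host (simplified)."""
    if not name.startswith("druid_"):
        raise ValueError(f"not a druid_ variable: {name}")
    return name.replace("_", ".")


def parse_environment(text: str) -> dict:
    """Collect the druid_ entries of an environment file as Druid properties."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = line.split("=", 1)
        if key.startswith("druid_"):  # uppercase DRUID_ settings are skipped
            properties[env_to_property(key)] = value
    return properties


# A few lines taken from the environment file above.
SAMPLE = """\
# Java tuning
DRUID_SINGLE_NODE_CONF=micro-quickstart
druid_zk_service_host=zookeeper
druid_metadata_storage_type=postgresql
druid_storage_storageDirectory=/opt/shared/segments
"""

for prop, value in parse_environment(SAMPLE).items():
    print(f"{prop}={value}")
```

Reading the file this way makes it easier to look the settings up on the configuration page linked above, which lists them in the dotted form.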
Small as they are, the deployment files have everything they need!
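For readers who would rather script the download than copy the two files by hand, here is a minimal sketch using only the Python standard library. The URLs are the ones given above; the helper names and the target directory are illustrative choices.

```python
import urllib.request
from pathlib import Path

# Base URL of the two files referenced in this post.
BASE_URL = "https://raw.githubusercontent.com/apache/druid/32.0.0/distribution/docker"
FILES = ("docker-compose.yml", "environment")


def file_url(name: str) -> str:
    """Return the raw GitHub URL of one deployment file."""
    return f"{BASE_URL}/{name}"


def download_all(target_dir: str) -> list:
    """Download both files into target_dir and return the written paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for name in FILES:
        destination = target / name
        urllib.request.urlretrieve(file_url(name), str(destination))
        written.append(destination)
    return written


# Example (needs network access):
# download_all(r"e:\dockerdata\volume\druid1")
```

The downloaded docker-compose.yml is the upstream version, so the router port mapping still has to be edited by hand if you want it on 3012 as in this post.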
Deployment
PS c:\users\zhiyong> cd e:\dockerdata\volume\druid1
PS e:\dockerdata\volume\druid1> ls

    Directory: e:\dockerdata\volume\druid1

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        2025-02-13     23:26           2980 docker-compose.yml
-a----        2025-02-13     23:33           1576 environment

PS e:\dockerdata\volume\druid1> docker compose up -d
time="2025-02-13T23:34:39+08:00" level=warning msg="e:\\dockerdata\\volume\\druid1\\docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion"
[+] Running 72/15
 ✔ router Pulled                             230.7s
 ✔ coordinator Pulled                        230.7s
 ✔ postgres Pulled                           181.0s
 ✔ historical Pulled                         230.7s
 ✔ broker Pulled                             230.7s
 ✔ middlemanager Pulled                      230.7s
 ✔ zookeeper Pulled                           85.7s
[+] Running 15/15
 ✔ Network druid1_default            Created   0.1s
 ✔ Volume "druid1_druid_shared"      Created   0.0s
 ✔ Volume "druid1_historical_var"    Created   0.0s
 ✔ Volume "druid1_middle_var"        Created   0.0s
 ✔ Volume "druid1_router_var"        Created   0.0s
 ✔ Volume "druid1_metadata_data"     Created   0.0s
 ✔ Volume "druid1_coordinator_var"   Created   0.0s
 ✔ Volume "druid1_broker_var"        Created   0.0s
 ✔ Container postgres                Started   2.4s
 ✔ Container zookeeper               Started   2.4s
 ✔ Container coordinator             Started   1.6s
 ✔ Container router                  Started   2.5s
 ✔ Container broker                  Started   2.3s
 ✔ Container historical              Started   2.5s
 ✔ Container middlemanager           Started   2.8s
PS e:\dockerdata\volume\druid1>
Once the images have been pulled, the containers come up quickly.
Well, well... it also exposes the ports of the other components...
So we also get a PostgreSQL and a ZooKeeper instance for free!
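To see which of those published ports actually accept connections on the host, a small TCP probe is enough. This is a sketch: the port list is copied from the compose file in this post (with the router on 3012), and the helper names are my own.

```python
import socket

# Host ports published by the compose file in this post.
PUBLISHED_PORTS = {
    "postgres": 5432,
    "zookeeper": 2181,
    "coordinator": 8081,
    "broker": 8082,
    "historical": 8083,
    "middlemanager": 8091,
    "router": 3012,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def check_ports(host: str = "127.0.0.1") -> dict:
    """Probe every published port and return {service: reachable}."""
    return {name: is_port_open(host, port) for name, port in PUBLISHED_PORTS.items()}


# Example (needs the containers to be running):
# for name, reachable in check_ports().items():
#     print(f"{name:14} {'open' if reachable else 'closed'}")
```

An open port only shows that something is listening. It does not prove that the service behind it has finished starting.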
Verification
http://localhost:3012/unified-console.html#
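The console may take a little while to respond after the containers start. Druid services expose a /status/health endpoint that returns true once the process is up, so a polling loop like the sketch below can wait for the router. The retry count and delay are arbitrary choices, and the fetch and sleep parameters exist only so the loop can be exercised without a running cluster.

```python
import json
import time
import urllib.request


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_healthy(url, attempts=30, delay=2.0, fetch=fetch_json, sleep=time.sleep):
    """Poll url until it returns JSON true; give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            if fetch(url) is True:
                return True
        except (OSError, ValueError):
            pass  # connection refused, timeout, or a non-JSON body: not ready yet
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example (needs the running cluster; 3012 is the router port used in this post):
# print(wait_until_healthy("http://localhost:3012/status/health"))
```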
Very nice, we now have the latest Apache Druid 32.0.0!
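Beyond opening the console, a quick way to confirm that queries work is to send a trivial statement to Druid's SQL HTTP API at /druid/v2/sql. The sketch below assumes the router is reachable on port 3012 as configured in this post; the function names are illustrative.

```python
import json
import urllib.request

# ASSUMPTION: the router is published on host port 3012, as in this post.
SQL_ENDPOINT = "http://localhost:3012/druid/v2/sql"


def build_sql_request(query: str, endpoint: str = SQL_ENDPOINT) -> urllib.request.Request:
    """Build a POST request carrying the SQL statement as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(query: str, endpoint: str = SQL_ENDPOINT, timeout: float = 30.0):
    """Send the query and return the decoded JSON result."""
    request = build_sql_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (needs the running cluster):
# print(run_sql("SELECT 1 AS ok"))
```

Going through the router means the same host and port serve both the web console and the API, so nothing beyond the one port mapping is needed.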
Please credit the source when reposting: https://lizhiyong.blog.csdn.net/article/details/145622903