Video: 尚硅谷大数据项目《在线教育之采集系统》_哔哩哔哩_bilibili

Contents
P036
P037
P038
P039
P041
P042
P043
P044
P045
P046

P036

Start ZooKeeper first, then Kafka, then HDFS from Hadoop. After that, start Flume on node003, start Flume on node001, and run mock.sh on node001.

P037
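Data drift

Data flow: generate data -> Flume -> Kafka -> Flume -> HDFS.

Data drift means an event lands in the wrong day's directory, for example when the timestamp attached to the event is later than the time the log was actually generated.

When writing to HDFS, the sink by default takes the time from the timestamp field in the event header, so changing that header changes the timestamp used for the output path.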
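The drift and its fix can be illustrated with a small Python sketch. It is not the Flume interceptor itself (that one is written in Java); it only mimics the idea under stated assumptions: the log body is a JSON object with a millisecond `ts` field, the header key is `timestamp`, and the output directory is resolved as `%Y-%m-%d` from that header. The function names and the sample event are hypothetical, and UTC is used only to keep the example deterministic.

```python
import json
from datetime import datetime, timezone


def partition_for(headers: dict) -> str:
    """Resolve a %Y-%m-%d directory name from the 'timestamp' header (milliseconds)."""
    millis = int(headers["timestamp"])
    day = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return day.strftime("%Y-%m-%d")


def apply_event_time(headers: dict, body: bytes) -> dict:
    """Return a copy of the headers with 'timestamp' taken from the body's 'ts' field.

    Assumption: the body is a JSON object carrying a millisecond 'ts' field.
    """
    event = json.loads(body.decode("utf-8"))
    updated = dict(headers)
    updated["timestamp"] = str(event["ts"])
    return updated


# Hypothetical event: generated at 23:59:59 UTC, processed two seconds later.
generated_ms = int(
    datetime(2023, 8, 8, 23, 59, 59, tzinfo=timezone.utc).timestamp() * 1000
)
processed_ms = generated_ms + 2_000
body = json.dumps({"ts": generated_ms}).encode("utf-8")
headers = {"timestamp": str(processed_ms)}

drifted = partition_for(headers)                         # uses processing time
fixed = partition_for(apply_event_time(headers, body))   # uses event time
print(drifted, fixed)
```

With the processing-time header the event falls into the following day's directory; after the header is overwritten with the event time it falls into the day on which the log was generated.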
P038

TimestampInterceptor: an interceptor that fixes the timestamp problem.

a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = com.atguigu.flume.interceptor.TimestampInterceptor$Builder

## 1. Define components
a1.sources = r1
a1.channels = c1
a1.sinks = k1

## 2. Configure sources
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.kafka.bootstrap.servers = node001:9092,node002:9092,node003:9092
a1.sources.r1.kafka.consumer.group.id = topic_log
a1.sources.r1.kafka.topics = topic_log
a1.sources.r1.batchSize = 1000
a1.sources.r1.batchDurationMillis = 1000
a1.sources.r1.useFlumeEventFormat = false

a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = com.atguigu.flume.interceptor.TimestampInterceptor$Builder

## 3. Configure channels
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /opt/module/flume/flume-1.9.0/checkpoint/behavior1
a1.channels.c1.useDualCheckpoints = false
a1.channels.c1.dataDirs = /opt/module/flume/flume-1.9.0/data/behavior1/
a1.channels.c1.capacity = 1000000
a1.channels.c1.maxFileSize = 2146435071
a1.channels.c1.keep-alive = 3

## 4. Configure sinks
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /origin_data/edu/log/edu_log/%Y-%m-%d
a1.sinks.k1.hdfs.filePrefix = log
a1.sinks.k1.hdfs.round = false

## Output file type: gzip-compressed stream
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = gzip

a1.sinks.k1.hdfs.rollInterval = 10
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0

## 5. Wire the components together
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

[atguigu@node002 ~]$ kafka-console-consumer.sh --bootstrap-server node001:9092 --topic topic_log

P039
/home/atguigu/bin
-----------------------------------------------------------
#!/bin/bash

case $1 in
"start") {
    echo " -------- starting consumer Flume -------"
    # Run the agent on node003 in the background and discard its output.
    ssh node003 "nohup /opt/module/flume/flume-1.9.0/bin/flume-ng agent -n a1 -c /opt/module/flume/flume-1.9.0/conf/ -f /opt/module/flume/flume-1.9.0/job/kafka_to_hdfs_log.conf >/dev/null 2>&1 &"
};;
"stop") {
    echo " -------- stopping consumer Flume -------"
    # Find the agent by its config file name and kill it.
    ssh node003 "ps -ef | grep kafka_to_hdfs_log | grep -v grep | awk '{print \$2}' | xargs -n1 kill -9"
};;
esac

P041

P042

In this project, full synchronization uses DataX and incremental synchronization uses Maxwell.

P043

https://github.com/alibaba/DataX
https://github.com/alibaba/DataX/blob/master/introduction.md

P044
[atguigu@node001 datax]$ cd /opt/module/datax/
[atguigu@node001 datax]$ python bin/datax.py -r mysqlreader -w hdfswriter

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

Please refer to the mysqlreader document:
    https://github.com/alibaba/DataX/blob/master/mysqlreader/doc/mysqlreader.md

Please refer to the hdfswriter document:
    https://github.com/alibaba/DataX/blob/master/hdfswriter/doc/hdfswriter.md

Please save the following configuration as a json file and use
    python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
to run the job.

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": [],
                        "connection": [{"jdbcUrl": [], "table": []}],
                        "password": "",
                        "username": "",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "column": [],
                        "compress": "",
                        "defaultFS": "",
                        "fieldDelimiter": "",
                        "fileName": "",
                        "fileType": "",
                        "path": "",
                        "writeMode": ""
                    }
                }
            }
        ],
        "setting": {"speed": {"channel": ""}}
    }
}
[atguigu@node001 datax]$
P045
/opt/module/datax/job/base_province.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": ["id", "name", "region_id", "area_code", "iso_code", "iso_3166_2"],
                        "where": "id>=3",
                        "connection": [
                            {
                                "jdbcUrl": ["jdbc:mysql://node001:3306/edu2077"],
                                "table": ["base_province"]
                            }
                        ],
                        "password": "000000",
                        "splitPk": "",
                        "username": "root"
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "column": [
                            {"name": "id", "type": "bigint"},
                            {"name": "name", "type": "string"},
                            {"name": "region_id", "type": "string"},
                            {"name": "area_code", "type": "string"},
                            {"name": "iso_code", "type": "string"},
                            {"name": "iso_3166_2", "type": "string"}
                        ],
                        "compress": "gzip",
                        "defaultFS": "hdfs://node001:8020",
                        "fieldDelimiter": "\t",
                        "fileName": "base_province",
                        "fileType": "text",
                        "path": "/base_province",
                        "writeMode": "append"
                    }
                }
            }
        ],
        "setting": {"speed": {"channel": 1}}
    }
}

2023-08-08 21:40:16.749 [job-0] INFO JobContainer -
任务启动时刻 : 2023-08-08 21:39:59
任务结束时刻 : 2023-08-08 21:40:16
任务总计耗时 : 17s
任务平均流量 : 66B/s
记录写入速度 : 3rec/s
读出记录总数 : 32
读写失败总数 : 0

[atguigu@node001 datax]$ hadoop fs -cat /base_province/base_province__75bb19ed_497f_45f9_bcd3_f27e2dafee72.gz | zcat
2023-08-08 21:42:28,250 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted false, remoteHostTrusted false
3 山西 1 140000 CN-14 CN-SX
4 内蒙古 1 150000 CN-15 CN-NM
5 河北 1 130000 CN-13 CN-HE
6 上海 2 310000 CN-31 CN-SH
7 江苏 2 320000 CN-32 CN-JS
8 浙江 2 330000 CN-33 CN-ZJ
9 安徽 2 340000 CN-34 CN-AH
10 福建 2 350000 CN-35 CN-FJ
11 江西 2 360000 CN-36 CN-JX
12 山东 2 370000 CN-37 CN-SD
13 重庆 6 500000 CN-50 CN-CQ
14 台湾 2 710000 CN-71 CN-TW
15 黑龙江 3 230000 CN-23 CN-HL
16 吉林 3 220000 CN-22 CN-JL
17 辽宁 3 210000 CN-21 CN-LN
18 陕西 7 610000 CN-61 CN-SN
19 甘肃 7 620000 CN-62 CN-GS
20 青海 7 630000 CN-63 CN-QH
21 宁夏 7 640000 CN-64 CN-NX
22 新疆 7 650000 CN-65 CN-XJ
23 河南 4 410000 CN-41 CN-HA
24 湖北 4 420000 CN-42 CN-HB
25 湖南 4 430000 CN-43 CN-HN
26 广东 5 440000 CN-44 CN-GD
27 广西 5 450000 CN-45 CN-GX
28 海南 5 460000 CN-46 CN-HI
29 香港 5 810000 CN-91 CN-HK
30 澳门 5 820000 CN-92 CN-MO
31 四川 6 510000 CN-51 CN-SC
32 贵州 6 520000 CN-52 CN-GZ
33 云南 6 530000 CN-53 CN-YN
34 西藏 6 540000 CN-54 CN-XZ
[atguigu@node001 datax]$

P046
/opt/module/datax/job/base_province_sql.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "connection": [
                            {
                                "jdbcUrl": ["jdbc:mysql://hadoop102:3306/edu2077"],
                                "querySql": ["select id,name,region_id,area_code,iso_code,iso_3166_2 from base_province where id>=3"]
                            }
                        ],
                        "password": "000000",
                        "username": "root"
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "column": [
                            {"name": "id", "type": "bigint"},
                            {"name": "name", "type": "string"},
                            {"name": "region_id", "type": "string"},
                            {"name": "area_code", "type": "string"},
                            {"name": "iso_code", "type": "string"},
                            {"name": "iso_3166_2", "type": "string"}
                        ],
                        "compress": "gzip",
                        "defaultFS": "hdfs://hadoop102:8020",
                        "fieldDelimiter": "\t",
                        "fileName": "base_province",
                        "fileType": "text",
                        "path": "/base_province",
                        "writeMode": "append"
                    }
                }
            }
        ],
        "setting": {"speed": {"channel": 1}}
    }
}

[atguigu@node001 datax]$ bin/datax.py job/base_province_sql.json

2023-08-08 22:00:47.596 [job-0] INFO JobContainer - PerfTrace not enable!
2023-08-08 22:00:47.597 [job-0] INFO StandAloneJobContainerCommunicator - Total 32 records, 667 bytes | Speed 66B/s, 3 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-08-08 22:00:47.600 [job-0] INFO JobContainer -
任务启动时刻 : 2023-08-08 22:00:33
任务结束时刻 : 2023-08-08 22:00:47
任务总计耗时 : 14s
任务平均流量 : 66B/s
记录写入速度 : 3rec/s
读出记录总数 : 32
读写失败总数 : 0

[atguigu@node001 datax]$
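
The two job files differ only in the reader: table mode lists `column`, `table` and `where`, while querySql mode passes one SQL statement. The Python sketch below generates both variants from the same inputs so the difference is easy to compare. It is only an illustration: the helper names are hypothetical and not part of DataX, the host names follow the node001 cluster used in these notes, the optional empty `splitPk` is omitted, and the `id>=3` condition is an assumption consistent with the 32 rows (ids 3 to 34) shown in the output.

```python
import json

# Column names and HDFS-side types, as used in the base_province jobs.
COLUMNS = [
    ("id", "bigint"),
    ("name", "string"),
    ("region_id", "string"),
    ("area_code", "string"),
    ("iso_code", "string"),
    ("iso_3166_2", "string"),
]


def build_reader(jdbc_url, table, where, use_query_sql):
    """Build a mysqlreader section in table mode or querySql mode."""
    names = [name for name, _ in COLUMNS]
    parameter = {"username": "root", "password": "000000"}
    if use_query_sql:
        # querySql mode: the whole selection is expressed as one statement.
        sql = f"select {','.join(names)} from {table} where {where}"
        parameter["connection"] = [{"jdbcUrl": [jdbc_url], "querySql": [sql]}]
    else:
        # Table mode: columns, table and filter are given separately.
        parameter["column"] = names
        parameter["where"] = where
        parameter["connection"] = [{"jdbcUrl": [jdbc_url], "table": [table]}]
    return {"name": "mysqlreader", "parameter": parameter}


def build_writer(default_fs, path, file_name):
    """Build the hdfswriter section shared by both job variants."""
    return {
        "name": "hdfswriter",
        "parameter": {
            "column": [{"name": n, "type": t} for n, t in COLUMNS],
            "compress": "gzip",
            "defaultFS": default_fs,
            "fieldDelimiter": "\t",
            "fileName": file_name,
            "fileType": "text",
            "path": path,
            "writeMode": "append",
        },
    }


def build_job(reader, writer, channel=1):
    """Wrap a reader and a writer into the top-level DataX job structure."""
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {"speed": {"channel": channel}},
        }
    }


JDBC_URL = "jdbc:mysql://node001:3306/edu2077"
writer = build_writer("hdfs://node001:8020", "/base_province", "base_province")
table_job = build_job(build_reader(JDBC_URL, "base_province", "id>=3", False), writer)
query_job = build_job(build_reader(JDBC_URL, "base_province", "id>=3", True), writer)

query_sql = query_job["job"]["content"][0]["reader"]["parameter"]["connection"][0]["querySql"][0]
print(query_sql)
print(json.dumps(table_job, indent=4))
```

Saving the printed JSON to a file under the job directory gives a configuration in the same shape as the hand-written ones; generating it from one column list avoids the reader and writer columns getting out of step.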