Syncing MySQL to Elasticsearch 5.x with Logstash

Installation

Download Logstash

Download page: https://www.elastic.co/downloads/logstash

At the time of writing I downloaded version 5.6.3:

https://artifacts.elastic.co/downloads/logstash/logstash-5.6.3.tar.gz

Extract the archive:

tar -zxvf logstash-5.6.3.tar.gz

Change into the installation directory and run:

bin/logstash -e 'input { stdin { } } output { stdout {} }'

After a few seconds the following message appears:

The stdin plugin is now waiting for input:

Then type:

hello world

You should get output similar to:

2017-10-30T02:49:59.005Z test-env hello world
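The same check can also be run without typing anything by piping a line into Logstash. This is only a sketch: smoke_test and the LOGSTASH_HOME variable are hypothetical names used for illustration, and it assumes Logstash shuts down on its own once stdin reaches end of input.

```shell
# Hypothetical helper: send one line through a stdin -> stdout pipeline.
# LOGSTASH_HOME is an assumed variable pointing at the extracted directory;
# it defaults to the current directory.
smoke_test() {
  local home="${LOGSTASH_HOME:-.}"
  echo "${1:-hello world}" | "$home/bin/logstash" -e 'input { stdin { } } output { stdout {} }'
}

# Usage (from anywhere):
#   LOGSTASH_HOME=/path/to/logstash-5.6.3 smoke_test "hello world"
```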

Install the logstash-input-jdbc plugin

Install ruby and rubygems (note: ruby 1.8.7 or later is required):

yum install -y ruby rubygems

Check the ruby version:

ruby -v
ruby 1.8.7 (2013-06-27 patchlevel 374) [x86_64-linux]

Switch the gem source to a mirror hosted in China:

gem sources --remove http://rubygems.org/
gem sources -a http://gems.ruby-china.org/

Verify the change by listing the configured sources (the mirror should appear in the list):

gem sources -l
*** CURRENT SOURCES ***
http://rubygems.org/
http://gems.ruby-china.org/

Change the gem source in the Gemfile

vim Gemfile

Set the value of source to "https://gems.ruby-china.org/".

vim Gemfile.jruby-1.9.lock

Find remote and set its value to https://gems.ruby-china.org/
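The two manual edits above can also be scripted. The following is a minimal sketch, assuming the Gemfile has a single top-level source line and that the lock file's rubygems.org remote is the one to replace. set_gem_mirror is a hypothetical helper name, not part of Logstash.

```shell
# Hypothetical helper: point Gemfile and its lock file at a gem mirror.
# Arguments: <logstash directory> <mirror URL>. Backups are kept as *.bak.
set_gem_mirror() {
  local dir="$1" mirror="$2"
  # Replace the top-level `source "..."` line in the Gemfile.
  sed -i.bak -E "s|^source .*|source \"${mirror}\"|" "$dir/Gemfile"
  # Replace only the remote that points at rubygems.org in the lock file,
  # leaving any other remote entries untouched.
  sed -i.bak -E "s|^( *remote:) .*rubygems\.org.*|\1 ${mirror}|" "$dir/Gemfile.jruby-1.9.lock"
}

# Usage (from the installation directory):
#   set_gem_mirror . "https://gems.ruby-china.org/"
```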

Run the installation:

./bin/logstash-plugin install --no-verify logstash-input-jdbc
Installing logstash-input-jdbc
Installation successful

Usage

Configuration syntax

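A minimal configuration file must contain an input section and an output section. If the data needs to be processed along the way, add a filter section as well.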

Set up the Java MySQL connector driver, mysql-connector-java-5.1.42-bin.jar:

https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.42.tar.gz
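Fetching the driver could look roughly like this. It is a sketch, assuming wget is available and that the archive unpacks into a mysql-connector-java-5.1.42 directory containing the -bin.jar file. fetch_mysql_driver is a hypothetical helper name, and the default destination is simply the directory used in the configuration in this post.

```shell
# Hypothetical helper: download Connector/J 5.1.42 and copy the jar
# into the directory that jdbc_driver_library will point at.
fetch_mysql_driver() {
  local version="5.1.42" dest="${1:-/data/arrow/logstash}"
  local name="mysql-connector-java-${version}"
  wget "https://dev.mysql.com/get/Downloads/Connector-J/${name}.tar.gz" &&
    tar -zxf "${name}.tar.gz" &&
    cp "${name}/${name}-bin.jar" "$dest/"
}

# Usage:
#   fetch_mysql_driver /data/arrow/logstash
```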

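input {
  stdin {
  }
  jdbc {
    # Connection settings for the source MySQL database
    jdbc_connection_string => "jdbc:mysql://dbs1:3306/db2_utan_cs"
    jdbc_user => "root"
    jdbc_password => "123456"
    # Path to the Connector/J jar and the driver class it provides
    jdbc_driver_library => "/data/arrow/logstash/mysql-connector-java-5.1.42-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # Fetch the result set in pages of jdbc_page_size rows
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    # :sql_last_value is replaced with the last tracked value on each run
    statement => "SELECT * FROM crawler_data WHERE id > (SELECT MAX(t1.id) FROM db2_utan_cs.crawler_data AS t1)-70000 AND updatetime > :sql_last_value"
    # Track the updatetime column instead of the time of the last run.
    # Note: tracking_column_type defaults to "numeric"; if updatetime is a
    # date/time column, set tracking_column_type => "timestamp" as well.
    use_column_value => true
    tracking_column => "updatetime"
    # Cron-style schedule: run every minute
    schedule => "* * * * *"
    type => "baby_crawler"
  }
}
filter {
  mutate {
    # Drop fields that should not be indexed
    remove_field => [ "@timestamp", "@version", "id" ]
  }
}
output {
  elasticsearch {
    hosts => "192.168.1.21:9201"
    index => "baby_crawler_b"
    # Use the uniquekey column as the document ID so that a re-synced row
    # overwrites the existing document
    document_id => "%{uniquekey}"
  }
  # Also print every event as one JSON line, which helps when debugging
  stdout {
    codec => json_lines
  }
}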
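Before starting a scheduled sync it can help to let Logstash parse the configuration and exit. The sketch below uses the --config.test_and_exit flag. check_config and LOGSTASH_HOME are hypothetical names, and mysql.conf in the usage line is an assumed file name for the configuration above.

```shell
# Hypothetical helper: validate a configuration file without running it.
# LOGSTASH_HOME is an assumed variable pointing at the installation
# directory; it defaults to the current directory.
check_config() {
  local home="${LOGSTASH_HOME:-.}"
  "$home/bin/logstash" -f "$1" --config.test_and_exit
}

# Usage (mysql.conf is an assumed file name):
#   LOGSTASH_HOME=/path/to/logstash-5.6.3 check_config mysql.conf
```

A zero exit status means the file parsed. It does not prove that the database or Elasticsearch is reachable.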

Starting Logstash

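# Start manually with an explicit configuration file (run from the installation directory)
bin/logstash -f /etc/logstash/conf.d/nginx_logstash.conf
# To keep it running in the background, append & to the command
bin/logstash -f /etc/logstash/conf.d/nginx_logstash.conf &
# If Logstash was installed from the rpm package, the bundled init script can be used instead
/etc/init.d/logstash start
# When started this way, Logstash automatically loads the configuration files under /etc/logstash/conf.d/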
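A process started with a plain & can still be stopped when the terminal closes. One common workaround is nohup with output redirected to a file. This is a sketch under that assumption, not a replacement for a proper service script. start_in_background, LOGSTASH_HOME and the default log file name are hypothetical.

```shell
# Hypothetical helper: start Logstash detached from the terminal,
# writing output to a log file and the process ID to <log>.pid.
start_in_background() {
  local home="${LOGSTASH_HOME:-.}" conf="$1" log="${2:-logstash.out}"
  nohup "$home/bin/logstash" -f "$conf" > "$log" 2>&1 &
  echo $! > "${log}.pid"
}

# Usage:
#   start_in_background /etc/logstash/conf.d/nginx_logstash.conf
#   kill "$(cat logstash.out.pid)"   # stop it again
```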

References

http://tchuairen.blog.51cto.com/3848118/1840596/