I'm trying to use Elasticsearch + Logstash (jdbc input).
My Elasticsearch seems to be OK. The problem seems to be in Logstash (elasticsearch output plugin).
My logstash.conf:
input {
  jdbc {
    jdbc_driver_library => "C:\DEV\elasticsearch-1.7.1\plugins\elasticsearch-jdbc-1.7.1.0\lib\sqljdbc4.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://localhost:1433;databaseName=dbTest"
    jdbc_user => "user"
    jdbc_password => "pass"
    schedule => "* * * * *"
    statement => "SELECT ID_RECARGA as _id FROM RECARGA where DT_RECARGA >= '2015-09-04'"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}
filter {
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "test_index"
    document_id => "%{objectId}"
  }
  stdout { codec => rubydebug }
}
When I run Logstash:
C:\DEV\logstash-1.5.4\bin>logstash -f logstash.conf
I'm getting this result:
failed action with response of 400, dropping action: ["index", {:_id=>"%{objectId}", :_index=>"parrudo", :_type=>"logs", :_routing=>nil}, #<LogStash::Event:0x5d4c2abf @metadata_accessors=#<LogStash::Util::Accessors:0x900a6e7 @store={"retry_count"=>0}, @lut={}>, @cancelled=false, @data={"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, @metadata={"retry_count"=>0}, @accessors=#<LogStash::Util::Accessors:0x4929c6a4 @store={"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, @lut={"type"=>[{"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, "type"], "objectId"=>[{"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, "objectId"]}>>] {:level=>:warn}
{
    "_id" => 908026,
    "@version" => "1",
    "@timestamp" => "2015-09-04T21:19:00.322Z"
}
{
    "_id" => 908027,
    "@version" => "1",
    "@timestamp" => "2015-09-04T21:19:00.322Z"
}
{
    "_id" => 908028,
    "@version" => "1",
    "@timestamp" => "2015-09-04T21:19:00.323Z"
}
{
    "_id" => 908029,
    "@version" => "1",
    "@timestamp" => "2015-09-04T21:19:00.323Z"
}
In Elasticsearch the index was created but it doesn't have any docs.
I'm using Windows and MS SQL Server.
elasticsearch version: 1.7.1
logstash version: 1.5.4
any idea?
Thanks!
All right... after looking for these errors in the Elasticsearch log I realized that the problem was the alias in my SQL statement:
statement => "SELECT ID_RECARGA as _id FROM RECARGA where DT_RECARGA >= '2015-09-04'"
For some reason I don't know, it wasn't being processed, so I just removed this from my query and everything seems to be right now.
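In case it helps someone else, here is a minimal sketch of one way to keep a custom document id without using the reserved _id field. The alias recarga_id is just an example name, and the sketch hasn't been tested against this exact setup:
input {
  jdbc {
    # ... same jdbc settings as above ...
    # alias the key column to an ordinary field name instead of the reserved _id
    statement => "SELECT ID_RECARGA AS recarga_id FROM RECARGA WHERE DT_RECARGA >= '2015-09-04'"
  }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "test_index"
    # reference the aliased field; %{objectId} never existed in these events
    document_id => "%{recarga_id}"
  }
  stdout { codec => rubydebug }
}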
thanks!
Related
I am using ELK stack v8.4.1 and trying to integrate data between SQL Server and Elasticsearch via Logstash. My source table includes Turkish characters (collation SQL_Latin1_General_CP1_CI_AS). When Logstash writes these characters to Elasticsearch, it converts the Turkish characters to '?'. For example, 'Şükrü' => '??kr?'. (I used ELK stack v7.* before and didn't have this problem.)
This is my config file:
input {
  jdbc {
    jdbc_connection_string => "jdbc:sqlserver://my-sql-connection-info;encrypt=false;characterEncoding=utf8"
    jdbc_user => "my_sql_user"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_driver_library => "my_path\mssql-jdbc-11.2.0.jre11.jar"
    statement => [ "Select id,name,surname FROM ELK_Test" ]
    schedule => "*/30 * * * * *"
  }
  stdin {
    codec => plain { charset => "UTF-8" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "test_index"
    document_id => "%{id}"
    user => "logstash_user"
    password => "password"
  }
  stdout { codec => rubydebug }
}
I tried with and without a filter to force the encoding to UTF-8, but it doesn't change anything.
filter {
  ruby {
    code => 'event.set("name", event.get("name").force_encoding(::Encoding::UTF_8))'
  }
}
Below is my Elasticsearch result:
{
  "_index": "test_index",
  "_id": "2",
  "_score": 1,
  "_source": {
    "name": "??kr?",
    "@version": "1",
    "id": 2,
    "surname": "?e?meci",
    "@timestamp": "2022-09-16T13:02:00.254013300Z"
  }
}
BTW, the console output results are correct:
{
    "name" => "Şükrü",
    "@version" => "1",
    "id" => 2,
    "surname" => "Çeşmeci",
    "@timestamp" => 2022-09-16T13:32:00.851877400Z
}
I tried to insert sample data from the Kibana Dev Tools and the data was inserted without a problem. Can anybody help, please? What could be wrong? What can I check?
The solution is changing the JDK version. I replaced the bundled OpenJDK with Oracle JDK 19 and the problem was solved.
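For reference, recent Logstash releases pick up the JDK from the LS_JAVA_HOME environment variable, so switching JDKs can be as simple as pointing that variable at the new installation before starting Logstash. A sketch (the path is just an example):
set LS_JAVA_HOME=C:\Java\jdk-19
bin\logstash.bat -f logstash.conf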
I'm parsing a MongoDB input into Logstash; the config file is as follows:
input {
  mongodb {
    uri => "<mongouri>"
    placeholder_db_dir => "<path>"
    collection => "modules"
    batch_size => 5000
  }
}
filter {
  mutate {
    rename => { "_id" => "mongo_id" }
    remove_field => ["host", "@version"]
  }
  json {
    source => "message"
    target => "log"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    action => "index"
    index => "mongo_log_modules"
  }
}
This outputs 2 of the 3 documents from the collection into Elasticsearch:
{
    "mongo_title" => "user",
    "log_entry" => "{\"_id\"=>BSON::ObjectId('60db49309fbbf53f5dd96619'), \"title\"=>\"user\", \"modules\"=>[{\"module\"=>\"user-dashboard\", \"description\"=>\"User Dashborad\"}, {\"module\"=>\"user-assessment\", \"description\"=>\"User assessment\"}, {\"module\"=>\"user-projects\", \"description\"=>\"User projects\"}]}",
    "mongo_id" => "60db49309fbbf53f5dd96619",
    "logdate" => "2021-06-29T16:24:16+00:00",
    "application" => "mongo-modules",
    "@timestamp" => 2021-10-02T05:08:38.091Z
}
{
    "mongo_title" => "candidate",
    "log_entry" => "{\"_id\"=>BSON::ObjectId('60db49519fbbf53f5dd96644'), \"title\"=>\"candidate\", \"modules\"=>[{\"module\"=>\"candidate-dashboard\", \"description\"=>\"User Dashborad\"}, {\"module\"=>\"candidate-assessment\", \"description\"=>\"User assessment\"}]}",
    "mongo_id" => "60db49519fbbf53f5dd96644",
    "logdate" => "2021-06-29T16:24:49+00:00",
    "application" => "mongo-modules",
    "@timestamp" => 2021-10-02T05:08:38.155Z
}
It seems like the stdout output throws un-parsable code into "log_entry".
After adding the "rename" fields, "modules" still doesn't get added as a field.
I've tried the grok mutate filter, but after the _id, %{DATA}, %{QUOTEDSTRING} and %{WORD} aren't working for me.
I've also tried updating a nested mapping on the index; that didn't seem to work either.
Is there anything else I can try to get the FULLY nested code into Elasticsearch?
The solution is to filter with mutate:
mutate { gsub => [ "log_entry", "=>", ": " ] }
mutate { gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ]}
json { source => "log_entry" remove_field => [ "log_entry" ] }
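Note that the two gsub mutates have to run before the json filter: log_entry only becomes valid JSON once the => arrows are turned into colons and the BSON::ObjectId(...) wrapper is reduced to a plain quoted string.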
This outputs to stdout:
"_id" => "60db49309fbbf53f5dd96619",
"title" => "user",
"modules" => [
[0] {
"module" => "user-dashboard",
"description" => "User Dashborad"
},
[1] {
"module" => "user-assessment",
"description" => "User assessment"
},
[2] {
"module" => "user-projects",
"description" => "User projects"
}
],
Hi all, I am trying to index documents from MSSQL Server into Elasticsearch using Logstash. I want my documents to be ingested as nested documents, but I am getting an aggregate exception error.
Here I place all my code.
Create table department(
ID Int identity(1,1) not null,
Name varchar(100)
)
Insert into department(Name)
Select 'IT Application development'
union all
Select 'HR & Marketing'
Create table Employee(
ID Int identity(1,1) not null,
emp_Name varchar(100),
dept_Id int
)
Insert into Employee(emp_Name,dept_Id)
Select 'Mohan',1
union all
Select 'parthi',1
union all
Select 'vignesh',1
Insert into Employee(emp_Name,dept_Id)
Select 'Suresh',2
union all
Select 'Jithesh',2
union all
Select 'Venkat',2
Final select statement
SELECT
De.id AS id,De.name AS deptname,Emp.id AS empid,Emp.emp_name AS empname
FROM department De LEFT JOIN employee Emp ON De.id = Emp.dept_Id
ORDER BY De.id
The result should be the department rows joined with their employees, one row per employee.
My Elasticsearch mapping:
PUT /departments
{
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "deptname": {
        "type": "text"
      },
      "employee_details": {
        "type": "nested",
        "properties": {
          "empid": {
            "type": "integer"
          },
          "empname": {
            "type": "text"
          }
        }
      }
    }
  }
}
My logstash config file
input {
  jdbc {
    jdbc_driver_library => ""
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://EC2AMAZ-J90JR4A\SQLEXPRESS:1433;databaseName=xxxx;"
    jdbc_user => "xxxx"
    jdbc_password => "xxxx"
    statement => "SELECT
      De.id AS id,De.name AS deptname,Emp.id AS empid,Emp.emp_name AS empname
      FROM department De LEFT JOIN employee Emp ON De.id = Emp.dept_Id
      ORDER BY De.id"
  }
}
filter {
  aggregate {
    task_id => "%{id}"
    code => "
      map['id'] = event['id']
      map['deptname'] = event['deptname']
      map['employee_details'] ||= []
      map['employee_details'] << {'empId' => event['empid'], 'empname' => event['empname'] }
    "
    push_previous_map_as_event => true
    timeout => 5
    timeout_tags => ['aggregated']
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "https://d9bc7cbca5ec49ea96a6ea683f70caca.eastus2.azure.elastic-cloud.com:4567"
    user => "elastic"
    password => "****"
    index => "departments"
    action => "index"
    document_type => "departments"
    document_id => "%{id}"
  }
}
While running Logstash I am getting the below error.
Elasticsearch screenshot for reference.
My Elasticsearch output should be something like this:
{
  "took" : 398,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "departments",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "deptname" : "IT Application development",
          "employee_details" : [
            {
              "empid" : 1,
              "empname" : "Mohan"
            },
            {
              "empid" : 2,
              "empname" : "Parthi"
            },
            {
              "empid" : 3,
              "empname" : "Vignesh"
            }
          ]
        }
      }
    ]
  }
}
Could anyone please help me resolve this issue? I want the empname and empid of all employees to be inserted as nested documents under the respective department. Thanks in advance.
Instead of the aggregate filter I used JDBC_STREAMING; it is working fine and might be helpful to someone looking at this post.
input {
  jdbc {
    jdbc_driver_library => "D:/Users/xxxx/Desktop/driver/mssql-jdbc-7.4.1.jre12-shaded.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://EC2AMAZ-J90JR4A\SQLEXPRESS:1433;databaseName=xxx;"
    jdbc_user => "xxx"
    jdbc_password => "xxxx"
    statement => "Select Policyholdername,Age,Policynumber,Dob,Client_Address,is_active from policy"
  }
}
filter {
  jdbc_streaming {
    jdbc_driver_library => "D:/Users/xxxx/Desktop/driver/mssql-jdbc-7.4.1.jre12-shaded.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://EC2AMAZ-J90JR4A\SQLEXPRESS:1433;databaseName=xxxx;"
    jdbc_user => "xxxx"
    jdbc_password => "xxxx"
    statement => "select claimnumber,claimtype,is_active from claim where policynumber = :policynumber"
    parameters => {"policynumber" => "policynumber"}
    target => "claim_details"
  }
}
output {
  elasticsearch {
    hosts => "https://e5a4a4a4de7940d9b12674d62eac9762.eastus2.azure.elastic-cloud.com:9243"
    user => "elastic"
    password => "xxxx"
    index => "xxxx"
    action => "index"
    document_type => "_doc"
    document_id => "%{policynumber}"
  }
  stdout { codec => rubydebug }
}
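With this config, jdbc_streaming puts the rows returned by the claim query into the claim_details field as an array of hashes, one per matching claim, so every policy document carries its claims nested under claim_details. The rubydebug output looks roughly like this (the values below are made-up placeholders, not real data):
{
    "policynumber" => "POL123",
    "policyholdername" => "xxxx",
    "is_active" => true,
    "claim_details" => [
        [0] {
            "claimnumber" => "CLM1",
            "claimtype" => "health",
            "is_active" => true
        },
        [1] {
            "claimnumber" => "CLM2",
            "claimtype" => "vehicle",
            "is_active" => false
        }
    ]
}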
You can also try to make use of the aggregate filter plugin in Logstash. Check this:
Inserting Nested Objects using Logstash
https://xyzcoder.github.io/2020/07/29/indexing-documents-using-logstash-and-python.html
I am just showing a single nested object here, but we can also have multiple arrays of items.
input {
  jdbc {
    jdbc_driver_library => "/usr/share/logstash/javalib/mssql-jdbc-8.2.2.jre11.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://host.docker.internal;database=StackOverflow2010;user=pavan;password=pavankumar#123"
    jdbc_user => "pavan"
    jdbc_password => "pavankumar#123"
    statement => "select top 500 p.Id as PostId,p.AcceptedAnswerId,p.AnswerCount,p.Body,u.Id as userid,u.DisplayName,u.Location
      from StackOverflow2010.dbo.Posts p inner join StackOverflow2010.dbo.Users u
      on p.OwnerUserId=u.Id"
  }
}
filter {
  aggregate {
    task_id => "%{postid}"
    code => "
      map['postid'] = event.get('postid')
      map['accepted_answer_id'] = event.get('acceptedanswerid')
      map['answer_count'] = event.get('answercount')
      map['body'] = event.get('body')
      map['user'] = {
        'id' => event.get('userid'),
        'displayname' => event.get('displayname'),
        'location' => event.get('location')
      }
      event.cancel()"
    push_previous_map_as_event => true
    timeout => 30
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200", "http://elasticsearch:9200"]
    index => "stackoverflow_top"
  }
  stdout {
    codec => rubydebug
  }
}
So in that example, I show multiple ways of inserting data, such as aggregate, JDBC streaming, and other scenarios.
I'm trying to use geoip from an Apache access log with Logstash 2.4, Elasticsearch 2.4, Kibana 4.6.
My Logstash config is:
input {
  file {
    path => "/var/log/httpd/access_log"
    type => "apache"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip {
    source => "clientip"
    target => "geoip"
    database => "/home/elk/logstash-2.4.0/GeoLiteCity.dat"
    #add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["192.168.56.200:9200"]
    sniffing => true
    manage_template => false
    index => "apache-geoip-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
And when Logstash parses some Apache access log entries, the output is:
{
    "message" => "xxx.xxx.xxx.xxx [24/Oct/2016:14:46:30 +0900] HTTP/1.1 8197 /images/egovframework/com/cmm/er_logo.jpg 200",
    "@version" => "1",
    "@timestamp" => "2016-10-24T05:46:34.505Z",
    "path" => "/NCIALOG/JBOSS/SMBA/default-host/access_log.2016-10-24",
    "host" => "smba",
    "type" => "jboss_access_log",
    "clientip" => "xxx.xxxx.xxx.xxx",
    "geoip" => {
        "ip" => "xxx.xxx.xxx.xxx",
        "country_code2" => "KR",
        "country_code3" => "KOR",
        "country_name" => "Korea, Republic of",
        "continent_code" => "AS",
        "region_name" => "11",
        "city_name" => "Seoul",
        "latitude" => xx.5985,
        "longitude" => xxx.97829999999999,
        "timezone" => "Asia/Seoul",
        "real_region_name" => "Seoul-t'ukpyolsi",
        "location" => [
            [0] xxx.97829999999999,
            [1] xx.5985
        ],
        "coordinates" => [
            [0] xxx.97829999999999,
            [1] xx.5985
        ]
    }
}
I am not able to see a geo_point field.
Please help me.
Thanks.
I added my error from the Tile map.
It says "logstash-* index pattern does not contain any of the following field types: geo_point"
Mmmmm.... the geoip fields are already in your response!
In the "geoip" field you can find all the needed information (ip, continent, country name, ...). The added coordinates field is present too.
So, what's the problem?
Hi, I am new to ES and I am trying to load data from 'MYSQL' to 'Elasticsearch'.
I am getting the below error when trying to load data in array format. Any help?
Here is the MySQL data; I need array data for the new & hex value columns:
cid  color     new   hex      create          modified
1    100 euro  abcd  #86c67c  5/5/2016 15:48  5/13/2016 14:15
1    100 euro  1234  #fdf8ff  5/5/2016 15:48  5/13/2016 14:15
Here is the Logstash config:
input {
  jdbc {
    jdbc_driver_library => "/etc/logstash/mysql/mysql-connector-java-5.1.39-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/test"
    jdbc_user => "root"
    jdbc_password => "*****"
    schedule => "* * * * *"
    statement => "select cid,color, new as 'cvalue.new',hexa_value as 'cvalue.hexa',created,modified from colors_hex_test order by cid"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}
output {
  elasticsearch {
    index => "colors_hexa"
    document_type => "colors"
    document_id => "%{cid}"
    hosts => "localhost:9200"
  }
}
I need array data for cvalue (new, hexa), like:
{
  "_index": "colors_hexa",
  "_type": "colors",
  "_id": "1",
  "_version": 218,
  "found": true,
  "_source": {
    "cid": 1,
    "color": "100 euro",
    "cvalue" : {
      "new": "1234",
      "hexa_value": "#fdf8ff"
    },
    "created": "2016-05-05T10:18:51.000Z",
    "modified": "2016-05-13T08:45:30.000Z",
    "@version": "1",
    "@timestamp": "2016-05-14T01:30:00.059Z"
  }
}
This is the error I'm getting while running Logstash:
"status"=>400, "error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"Field name [cvalue.hexa] cannot contain '.'"}}}, :level=>:warn}
You can't give a field name containing a '.'. But you can try to add:
filter {
  mutate {
    rename => { "new" => "[cvalue][new]" }
    rename => { "hexa" => "[cvalue][hexa]" }
  }
}
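For the rename to have anything to work on, the dotted aliases would also need to come out of the SQL statement. A sketch, assuming the original column names and aliasing hexa_value to hexa (not tested against this schema):
statement => "select cid, color, new, hexa_value as hexa, created, modified from colors_hex_test order by cid"
filter {
  mutate {
    rename => { "new" => "[cvalue][new]" }
    rename => { "hexa" => "[cvalue][hexa]" }
  }
}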