failed action with response of 400, dropping action - logstash - elasticsearch

I'm trying to use Elasticsearch + Logstash (jdbc input).
Elasticsearch itself seems to be OK. The problem seems to be in Logstash (elasticsearch output plugin).
My logstash.conf:
input {
  jdbc {
    jdbc_driver_library => "C:\DEV\elasticsearch-1.7.1\plugins\elasticsearch-jdbc-1.7.1.0\lib\sqljdbc4.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://localhost:1433;databaseName=dbTest"
    jdbc_user => "user"
    jdbc_password => "pass"
    schedule => "* * * * *"
    statement => "SELECT ID_RECARGA as _id FROM RECARGA where DT_RECARGA >= '2015-09-04'"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}
filter {
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "test_index"
    document_id => "%{objectId}"
  }
  stdout { codec => rubydebug }
}
When I run Logstash:
C:\DEV\logstash-1.5.4\bin>logstash -f logstash.conf
I get this result:
failed action with response of 400, dropping action: ["index", {:_id=>"%{objectId}", :_index=>"parrudo", :_type=>"logs", :_routing=>nil}, #<LogStash::Event:0x5d4c2abf @metadata_accessors=#<LogStash::Util::Accessors:0x900a6e7 @store={"retry_count"=>0}, @lut={}>, @cancelled=false, @data={"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, @metadata={"retry_count"=>0}, @accessors=#<LogStash::Util::Accessors:0x4929c6a4 @store={"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, @lut={"type"=>[{"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, "type"], "objectId"=>[{"_id"=>908026, "@version"=>"1", "@timestamp"=>"2015-09-04T21:19:00.322Z"}, "objectId"]}>>] {:level=>:warn}
{
  "_id" => 908026,
  "@version" => "1",
  "@timestamp" => "2015-09-04T21:19:00.322Z"
}
{
  "_id" => 908027,
  "@version" => "1",
  "@timestamp" => "2015-09-04T21:19:00.322Z"
}
{
  "_id" => 908028,
  "@version" => "1",
  "@timestamp" => "2015-09-04T21:19:00.323Z"
}
{
  "_id" => 908029,
  "@version" => "1",
  "@timestamp" => "2015-09-04T21:19:00.323Z"
}
In Elasticsearch the index was created but it doesn't have any docs.
I'm using Windows and MS SQL Server.
Elasticsearch version: 1.7.1
Logstash version: 1.5.4
Any idea?
Thanks!

All right... after looking for this error in the Elasticsearch log I realized that the problem was the alias in my SQL statement:
statement => "SELECT ID_RECARGA as _id FROM RECARGA where DT_RECARGA >= '2015-09-04'"
For some reason the _id alias wasn't being processed, so I just removed it from my query and everything seems to be right now.
Thanks!
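For reference, a minimal sketch of the same pipeline with the key column aliased to a non-reserved name instead of _id (the alias recarga_id, and using it as the document id, are assumptions for illustration, not part of the original answer):

input {
  jdbc {
    # same connection and driver settings as above; only the statement changes
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://localhost:1433;databaseName=dbTest"
    jdbc_user => "user"
    jdbc_password => "pass"
    # alias the key column to a name that is not reserved by Elasticsearch (hypothetical alias)
    statement => "SELECT ID_RECARGA AS recarga_id FROM RECARGA WHERE DT_RECARGA >= '2015-09-04'"
  }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "test_index"
    # point document_id at a field that actually exists on the event
    document_id => "%{recarga_id}"
  }
  stdout { codec => rubydebug }
}

The idea is that the reserved _id name never appears on the event, and the document_id template references a real field instead of the missing objectId.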

Related

Logstash from SQL Server to Elasticsearch character encoding problem

I am using ELK stack v8.4.1 and trying to integrate data between SQL Server and Elasticsearch via Logstash. My source table includes Turkish characters (collation SQL_Latin1_General_CP1_CI_AS). When Logstash writes these characters to Elasticsearch, it converts the Turkish characters to '?'. For example, 'Şükrü' => '??kr?'. (I used the ELK stack v7.* before and didn't have this problem.)
This is my config file:
input {
  jdbc {
    jdbc_connection_string => "jdbc:sqlserver://my-sql-connection-info;encrypt=false;characterEncoding=utf8"
    jdbc_user => "my_sql_user"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_driver_library => "my_path\mssql-jdbc-11.2.0.jre11.jar"
    statement => [ "Select id,name,surname FROM ELK_Test" ]
    schedule => "*/30 * * * * *"
  }
  stdin {
    codec => plain { charset => "UTF-8" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "test_index"
    document_id => "%{id}"
    user => "logstash_user"
    password => "password"
  }
  stdout { codec => rubydebug }
}
I tried with and without a filter to force the encoding to UTF-8, but it doesn't change anything.
filter {
  ruby {
    code => 'event.set("name", event.get("name").force_encoding(::Encoding::UTF_8))'
  }
}
Below is my Elasticsearch result:
{
  "_index": "test_index",
  "_id": "2",
  "_score": 1,
  "_source": {
    "name": "??kr?",
    "@version": "1",
    "id": 2,
    "surname": "?e?meci",
    "@timestamp": "2022-09-16T13:02:00.254013300Z"
  }
}
BTW, the console output results are correct:
{
  "name" => "Şükrü",
  "@version" => "1",
  "id" => 2,
  "surname" => "Çeşmeci",
  "@timestamp" => 2022-09-16T13:32:00.851877400Z
}
I tried to insert sample data from the Kibana Dev Tools console and the data was inserted without a problem. Can anybody help, please? What can be wrong? What can I check?
The solution was changing the JDK version. I replaced the bundled OpenJDK with Oracle JDK 19 and the problem was solved.
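For anyone wanting to try the same fix, a minimal sketch of pointing Logstash at a different JDK on Windows. This assumes a Logstash version that honors the LS_JAVA_HOME environment variable; the JDK path and the pipeline file name are examples, not taken from the post:

REM point Logstash at the Oracle JDK instead of the bundled OpenJDK (example path)
set LS_JAVA_HOME=C:\Program Files\Java\jdk-19
REM then start the pipeline as usual
bin\logstash.bat -f my_jdbc_pipeline.conf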

How can I fully parse json into ElasticSearch?

I'm parsing a MongoDB input into Logstash; the config file is as follows:
input {
  mongodb {
    uri => "<mongouri>"
    placeholder_db_dir => "<path>"
    collection => "modules"
    batch_size => 5000
  }
}
filter {
  mutate {
    rename => { "_id" => "mongo_id" }
    remove_field => ["host", "@version"]
  }
  json {
    source => "message"
    target => "log"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    action => "index"
    index => "mongo_log_modules"
  }
}
This outputs 2/3 documents from the collection into Elasticsearch:
{
"mongo_title" => "user",
"log_entry" => "{\"_id\"=>BSON::ObjectId('60db49309fbbf53f5dd96619'), \"title\"=>\"user\", \"modules\"=>[{\"module\"=>\"user-dashboard\", \"description\"=>\"User Dashborad\"}, {\"module\"=>\"user-assessment\", \"description\"=>\"User assessment\"}, {\"module\"=>\"user-projects\", \"description\"=>\"User projects\"}]}",
"mongo_id" => "60db49309fbbf53f5dd96619",
"logdate" => "2021-06-29T16:24:16+00:00",
"application" => "mongo-modules",
"#timestamp" => 2021-10-02T05:08:38.091Z
}
{
"mongo_title" => "candidate",
"log_entry" => "{\"_id\"=>BSON::ObjectId('60db49519fbbf53f5dd96644'), \"title\"=>\"candidate\", \"modules\"=>[{\"module\"=>\"candidate-dashboard\", \"description\"=>\"User Dashborad\"}, {\"module\"=>\"candidate-assessment\", \"description\"=>\"User assessment\"}]}",
"mongo_id" => "60db49519fbbf53f5dd96644",
"logdate" => "2021-06-29T16:24:49+00:00",
"application" => "mongo-modules",
"#timestamp" => 2021-10-02T05:08:38.155Z
}
It seems like the stdout output throws un-parsable code into "log_entry". After adding the "rename" fields, "modules" won't be added as a field.
I've tried the grok and mutate filters, but after the _id, %{DATA}, %{QUOTEDSTRING} and %{WORD} aren't working for me. I've also tried updating a nested mapping in the index, which didn't seem to work either.
Is there anything else I can try to get the FULLY nested structure into Elasticsearch?
The solution is to filter with mutate:
filter {
  mutate { gsub => [ "log_entry", "=>", ": " ] }
  mutate { gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ] }
  json { source => "log_entry" remove_field => [ "log_entry" ] }
}
This outputs to stdout:
"_id" => "60db49309fbbf53f5dd96619",
"title" => "user",
"modules" => [
[0] {
"module" => "user-dashboard",
"description" => "User Dashborad"
},
[1] {
"module" => "user-assessment",
"description" => "User assessment"
},
[2] {
"module" => "user-projects",
"description" => "User projects"
}
],

Nested document to elasticsearch using logstash

Hi all, I am trying to index documents from MSSQL Server into Elasticsearch using Logstash. I want my documents to be ingested as nested documents, but I am getting an aggregate exception error.
Here I place all my code.
Create table department(
ID Int identity(1,1) not null,
Name varchar(100)
)
Insert into department(Name)
Select 'IT Application development'
union all
Select 'HR & Marketing'
Create table Employee(
ID Int identity(1,1) not null,
emp_Name varchar(100),
dept_Id int
)
Insert into Employee(emp_Name,dept_Id)
Select 'Mohan',1
union all
Select 'parthi',1
union all
Select 'vignesh',1
Insert into Employee(emp_Name,dept_Id)
Select 'Suresh',2
union all
Select 'Jithesh',2
union all
Select 'Venkat',2
Final select statement
SELECT
De.id AS id,De.name AS deptname,Emp.id AS empid,Emp.emp_name AS empname
FROM department De LEFT JOIN employee Emp ON De.id = Emp.dept_Id
ORDER BY De.id
Result should be like this
My Elasticsearch mapping:
PUT /departments
{
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "deptname": {
        "type": "text"
      },
      "employee_details": {
        "type": "nested",
        "properties": {
          "empid": {
            "type": "integer"
          },
          "empname": {
            "type": "text"
          }
        }
      }
    }
  }
}
My Logstash config file:
input {
  jdbc {
    jdbc_driver_library => ""
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://EC2AMAZ-J90JR4A\SQLEXPRESS:1433;databaseName=xxxx;"
    jdbc_user => "xxxx"
    jdbc_password => "xxxx"
    statement => "SELECT
      De.id AS id,De.name AS deptname,Emp.id AS empid,Emp.emp_name AS empname
      FROM department De LEFT JOIN employee Emp ON De.id = Emp.dept_Id
      ORDER BY De.id"
  }
}
filter {
  aggregate {
    task_id => "%{id}"
    code => "
      map['id'] = event['id']
      map['deptname'] = event['deptname']
      map['employee_details'] ||= []
      map['employee_details'] << {'empId' => event['empid'], 'empname' => event['empname'] }
    "
    push_previous_map_as_event => true
    timeout => 5
    timeout_tags => ['aggregated']
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "https://d9bc7cbca5ec49ea96a6ea683f70caca.eastus2.azure.elastic-cloud.com:4567"
    user => "elastic"
    password => "****"
    index => "departments"
    action => "index"
    document_type => "departments"
    document_id => "%{id}"
  }
}
While running Logstash I am getting the error below.
Elasticsearch screenshot for reference.
My Elasticsearch output should be something like this:
{
  "took" : 398,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "departments",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "deptname" : "IT Application development",
          "employee_details" : [
            {
              "empid" : 1,
              "empname" : "Mohan"
            },
            {
              "empid" : 2,
              "empname" : "Parthi"
            },
            {
              "empid" : 3,
              "empname" : "Vignesh"
            }
          ]
        }
      }
    ]
  }
}
Could anyone please help me resolve this issue? I want the empname and empid of all employees to be inserted as nested documents under the respective department. Thanks in advance.
Instead of the aggregate filter I used JDBC_STREAMING, and it is working fine; it might be helpful to someone looking at this post.
input {
  jdbc {
    jdbc_driver_library => "D:/Users/xxxx/Desktop/driver/mssql-jdbc-7.4.1.jre12-shaded.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://EC2AMAZ-J90JR4A\SQLEXPRESS:1433;databaseName=xxx;"
    jdbc_user => "xxx"
    jdbc_password => "xxxx"
    statement => "Select Policyholdername,Age,Policynumber,Dob,Client_Address,is_active from policy"
  }
}
filter {
  jdbc_streaming {
    jdbc_driver_library => "D:/Users/xxxx/Desktop/driver/mssql-jdbc-7.4.1.jre12-shaded.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://EC2AMAZ-J90JR4A\SQLEXPRESS:1433;databaseName=xxxx;"
    jdbc_user => "xxxx"
    jdbc_password => "xxxx"
    statement => "select claimnumber,claimtype,is_active from claim where policynumber = :policynumber"
    parameters => {"policynumber" => "policynumber"}
    target => "claim_details"
  }
}
output {
  elasticsearch {
    hosts => "https://e5a4a4a4de7940d9b12674d62eac9762.eastus2.azure.elastic-cloud.com:9243"
    user => "elastic"
    password => "xxxx"
    index => "xxxx"
    action => "index"
    document_type => "_doc"
    document_id => "%{policynumber}"
  }
  stdout { codec => rubydebug }
}
You can also try to make use of the aggregate filter plugin in Logstash. Check this: Inserting Nested Objects using Logstash
https://xyzcoder.github.io/2020/07/29/indexing-documents-using-logstash-and-python.html
I am just showing a single object, but we can also have multiple arrays of items.
input {
  jdbc {
    jdbc_driver_library => "/usr/share/logstash/javalib/mssql-jdbc-8.2.2.jre11.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://host.docker.internal;database=StackOverflow2010;user=pavan;password=pavankumar#123"
    jdbc_user => "pavan"
    jdbc_password => "pavankumar#123"
    statement => "select top 500 p.Id as PostId,p.AcceptedAnswerId,p.AnswerCount,p.Body,u.Id as userid,u.DisplayName,u.Location
      from StackOverflow2010.dbo.Posts p inner join StackOverflow2010.dbo.Users u
      on p.OwnerUserId=u.Id"
  }
}
filter {
  aggregate {
    task_id => "%{postid}"
    code => "
      map['postid'] = event.get('postid')
      map['accepted_answer_id'] = event.get('acceptedanswerid')
      map['answer_count'] = event.get('answercount')
      map['body'] = event.get('body')
      map['user'] = {
        'id' => event.get('userid'),
        'displayname' => event.get('displayname'),
        'location' => event.get('location')
      }
      event.cancel()"
    push_previous_map_as_event => true
    timeout => 30
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200", "http://elasticsearch:9200"]
    index => "stackoverflow_top"
  }
  stdout {
    codec => rubydebug
  }
}
So in that example, I cover multiple ways of inserting data, like aggregate, JDBC streaming and other scenarios.

Could not able to use geo_ip in logstash 2.4

I'm trying to use geoip from an Apache access log with Logstash 2.4, Elasticsearch 2.4, Kibana 4.6.
My Logstash config is:
input {
  file {
    path => "/var/log/httpd/access_log"
    type => "apache"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip {
    source => "clientip"
    target => "geoip"
    database => "/home/elk/logstash-2.4.0/GeoLiteCity.dat"
    #add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["192.168.56.200:9200"]
    sniffing => true
    manage_template => false
    index => "apache-geoip-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
And when some Apache access log entries are parsed, the output is:
{
  "message" => "xxx.xxx.xxx.xxx [24/Oct/2016:14:46:30 +0900] HTTP/1.1 8197 /images/egovframework/com/cmm/er_logo.jpg 200",
  "@version" => "1",
  "@timestamp" => "2016-10-24T05:46:34.505Z",
  "path" => "/NCIALOG/JBOSS/SMBA/default-host/access_log.2016-10-24",
  "host" => "smba",
  "type" => "jboss_access_log",
  "clientip" => "xxx.xxxx.xxx.xxx",
  "geoip" => {
    "ip" => "xxx.xxx.xxx.xxx",
    "country_code2" => "KR",
    "country_code3" => "KOR",
    "country_name" => "Korea, Republic of",
    "continent_code" => "AS",
    "region_name" => "11",
    "city_name" => "Seoul",
    "latitude" => xx.5985,
    "longitude" => xxx.97829999999999,
    "timezone" => "Asia/Seoul",
    "real_region_name" => "Seoul-t'ukpyolsi",
    "location" => [
      [0] xxx.97829999999999,
      [1] xx.5985
    ],
    "coordinates" => [
      [0] xxx.97829999999999,
      [1] xx.5985
    ]
  }
}
I am not able to see a geo_point field.
Please help me.
Thanks.
I added my error from the tile map.
It says "logstash-* index pattern does not contain any of the following field types: geo_point".
Mmmmm.... the geoip fields are already in your response!
In the "geoip" field you can find all the needed information (ip, continent, country name, ...). The added coordinates field is present too.
So, what's the problem?
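One possible explanation, offered here as an assumption rather than something confirmed in the thread: with manage_template => false, nothing maps geoip.location as geo_point in the apache-geoip-* indices, so Kibana has no geo_point field to offer the tile map. A minimal legacy index template sketch for Elasticsearch 2.4 (the template name and index pattern are hypothetical):

PUT _template/apache_geoip
{
  "template": "apache-geoip-*",
  "mappings": {
    "_default_": {
      "properties": {
        "geoip": {
          "properties": {
            "location": { "type": "geo_point" }
          }
        }
      }
    }
  }
}

The tile map would also need to be built on an index pattern that actually matches these indices rather than logstash-*.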

load array data mysql to ElasticSearch using logstash jdbc

Hi, I am new to ES and I'm trying to load data from MySQL to Elasticsearch.
I am getting the below error when trying to load data in array format; any help?
Here is the MySQL data; I need array data for the new & hex value columns:
cid  color     new   hex      create          modified
1    100 euro  abcd  #86c67c  5/5/2016 15:48  5/13/2016 14:15
1    100 euro  1234  #fdf8ff  5/5/2016 15:48  5/13/2016 14:15
Here is the Logstash config:
input {
  jdbc {
    jdbc_driver_library => "/etc/logstash/mysql/mysql-connector-java-5.1.39-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/test"
    jdbc_user => "root"
    jdbc_password => "*****"
    schedule => "* * * * *"
    statement => "select cid,color, new as 'cvalue.new',hexa_value as 'cvalue.hexa',created,modified from colors_hex_test order by cid"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}
output {
  elasticsearch {
    index => "colors_hexa"
    document_type => "colors"
    document_id => "%{cid}"
    hosts => "localhost:9200"
  }
}
I need array data for cvalue (new, hexa) like:
{
  "_index": "colors_hexa",
  "_type": "colors",
  "_id": "1",
  "_version": 218,
  "found": true,
  "_source": {
    "cid": 1,
    "color": "100 euro",
    "cvalue" : {
      "new": "1234",
      "hexa_value": "#fdf8ff"
    },
    "created": "2016-05-05T10:18:51.000Z",
    "modified": "2016-05-13T08:45:30.000Z",
    "@version": "1",
    "@timestamp": "2016-05-14T01:30:00.059Z"
  }
}
This is the error I'm getting while running Logstash:
"status"=>400, "error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"Field name [cvalue.hexa] cannot contain '.'"}}}, :level=>:warn}
You can't give a field a name containing a '.'. But you can try to add:
filter {
  mutate {
    rename => { "new" => "[cvalue][new]" }
    rename => { "hexa" => "[cvalue][hexa]" }
  }
}
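Note that with the statement above the JDBC input emits the columns under their aliased names (cvalue.new / cvalue.hexa), so the rename keys may need to match whatever names actually arrive on the event. A minimal sketch of one way to line the two up, assuming the SQL aliases are changed to plain, dot-free names (the aliases cvalue_new / cvalue_hexa are hypothetical):

input {
  jdbc {
    # ... same connection settings as above ...
    # alias the columns to names without dots (hypothetical aliases)
    statement => "select cid, color, new as cvalue_new, hexa_value as cvalue_hexa, created, modified from colors_hex_test order by cid"
  }
}
filter {
  mutate {
    # move the flat fields under the nested cvalue object
    rename => { "cvalue_new" => "[cvalue][new]" }
    rename => { "cvalue_hexa" => "[cvalue][hexa]" }
  }
}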
