ElasticSearch not mapping JODA time format

I am indexing tweets, and would like to map the created_at field to a date. An example date looks like this:
'created_at': 'Wed Sep 21 05:19:16 +0000 2011'
which using the JODA time format, I figured out to be:
"format" : "EEE MMM dd HH:mm:ss +SSSS yyyy",
However, when trying to index a new tweet I get the following error:
{u'status': 400, u'error': u'RemoteTransportException[[Rattler][inet[/192.155.85.243:9301]][index]]; nested: MapperParsingException[Failed to parse [created_at]]; nested: MapperParsingException[failed to parse date field [2013-04-30 20:34:43], tried both date format [yyyyMMdd HH:mm:ss], and timestamp number]; nested: IllegalArgumentException[Invalid format: "2013-04-30 20:34:43" is malformed at "-04-30 20:34:43"]; '}
I've tried changing the date format to use
yyyy-MM-dd HH:mm:ss
EEE, dd MMM yyyy HH:mm:ss Z
EEE dd MMM yyyy HH:mm:ss Z
EEE MMM dd HH:mm:ss +0000 yyyy
and several other variations, with no luck. I'm using the following call to create an initial tweet document:
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "tweet" : {
        "properties" : {
            "created_at" : {"type" : "date", "format" : "EEE dd MMM yyyy HH:mm:ss Z"}
        }
    }
}'
Any help is greatly appreciated!

The Joda-Time format you specified is not quite correct.
S stands for fraction of second, not for the time zone offset you intended; the offset is written with Z. Also, the "+" sign is consumed by the offset parser, so it should not appear in the pattern as a literal.
I managed to parse the twitter date format in elasticsearch with this format specifier:
"format": "EE MMM d HH:mm:ss Z yyyy"
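As a quick way to sanity-check a pattern like this outside Elasticsearch, here is a minimal sketch. It uses java.time from the JDK rather than Joda-Time (an assumption made so the example needs no extra dependency); the pattern letters used here behave the same way for this input, but the two libraries are not identical in general. The class and method names are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper class, not part of Elasticsearch or Joda-Time.
class TwitterDateExample {
    // Z parses a numeric offset such as +0000, including its sign.
    // Locale.US makes sure "Wed" and "Sep" are read as English names.
    static final DateTimeFormatter TWITTER_FORMAT =
            DateTimeFormatter.ofPattern("EEE MMM d HH:mm:ss Z yyyy", Locale.US);

    static OffsetDateTime parseCreatedAt(String createdAt) {
        return OffsetDateTime.parse(createdAt, TWITTER_FORMAT);
    }

    public static void main(String[] args) {
        // Sample value taken from the question.
        System.out.println(parseCreatedAt("Wed Sep 21 05:19:16 +0000 2011"));
    }
}
```

If the pattern contained +SSSS instead of Z, the parser would try to read the four digits as a fraction of a second, which is why the original format could never represent the offset.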

Related

Unable to convert string into date in tMap component Talend

I have gone through some similar questions, but those solutions didn't work for me. I have a date field that is a String holding a timestamp, "1631898440". I tried converting this string into a date using tMap but got this error: java.lang.RuntimeException: java.text.ParseException: Unparseable date: "1631898440".
The function I am using -
row5.mydatecolumn!=null && !"".equalsIgnoreCase(row5.mydatecolumn)? TalendDate.parseDateLocale("EEE MMM dd HH:mm:ss zzz yyyy", row5.mydatecolumn, "EN") :null
Also tried -
TalendDate.parseDate("ddMMyyyy",row5.mydatecolumn)
In this case I get this error: timestamp out of range: "898442-07-16 00:00:00+05:30"
How can I resolve this issue? Is there anything wrong with the date format?
In your user routine, just create a function like this:
public static Date Convert_String_To_Date(String String_Timestamp) {
    SimpleDateFormat sf = new SimpleDateFormat("ddMMyyyy");
    Date date = new Date(Long.parseLong(String_Timestamp));
    System.out.println("*** Date Converted to this patter ddMMyyyy : "+sf.format(date));
    return TalendDate.parseDate("ddMMyyyy",sf.format(date)) ;
}
Don't forget the imports:
import java.text.SimpleDateFormat;
import java.util.Date;
Then, in my case, I just added a tJava component where I called my function, like below:
String str = "1631898440";
System.out.println(Format_String_Date.Convert_String_To_Date(str)) ;
So in your case, you would call this function in your tMap like this, I guess:
row5.mydatecolumn!=null && !"".equalsIgnoreCase(row5.mydatecolumn)?
Format_String_Date.Convert_String_To_Date(row5.mydatecolumn) :null
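Here is the output. Note that it shows a date in 1970: new Date(long) expects milliseconds, whereas 1631898440 is an epoch value in seconds, so the parsed value should be multiplied by 1000L to obtain the intended 2021 date.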
[statistics] connected
*** Date Converted to this patter ddMMyyyy : 19011970
Mon Jan 19 00:00:00 CET 1970
[statistics] disconnected
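The 1970 date in this output comes from treating an epoch value in seconds as if it were milliseconds. A minimal sketch of a seconds-aware conversion follows, using only JDK classes. The class and method names are hypothetical, and it deliberately leaves out TalendDate so that it runs outside Talend; inside a routine, the returned Date could be used directly.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical routine class, shown standalone (no Talend dependency).
class EpochSecondsExample {
    // Converts a string holding epoch SECONDS into a java.util.Date.
    static Date fromEpochSeconds(String epochSeconds) {
        long seconds = Long.parseLong(epochSeconds.trim());
        // Instant.ofEpochSecond avoids the manual "* 1000L" step.
        return Date.from(Instant.ofEpochSecond(seconds));
    }

    public static void main(String[] args) {
        Date date = fromEpochSeconds("1631898440");
        // Format in UTC so the printed value does not depend on the JVM's time zone.
        String formatted = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
                .withZone(ZoneOffset.UTC)
                .format(date.toInstant());
        System.out.println(formatted);
    }
}
```

With the sample value from the question, the resulting instant falls on 17 September 2021 (UTC) instead of January 1970.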

Date pattern doesn't work as expected in logstash

I am trying to use following date filter to convert string to date but it doesn't seem to be working.
Sample input data(string) - Mon Jan 20 09:20:35 GMT 2020
I am first using a mutate gsub to remove GMT which renders following string output-
Mon Jan 20 09:20:35 2020
My gsub mutate filter looks like this -
mutate { gsub => [ "TimeStamp", "GMT", "" ] }
Now, I am using a date filter to convert gsub output to date format but it doesn't seem to be working-
date {
    match => [ "TimeStamp", "EEE MMM dd HH:mm:ss yyyy" ]
    target => "TimeStamp"
    locale => "en"
}
I have also tried following with no success-
date {
    match => [ "TimeStamp", "EEE\sMMM\sdd\sHH:mm:ss\s+yyyy" ]
    target => "TimeStamp"
    timezone => "Etc/GMT"
    locale => "en"
}
The date pattern should be
MMM dd HH:mm:ss yyyy
Maybe you have to add some extra spaces before the year (it looks like you have them in your logs).
Instead of EEE (abbreviated name of the weekday) you need to use MMM (abbreviated name of the month).

Overriding @timestamp via Logstash's date filter with a grok-extracted value

I am trying to convert a string value to a date-time in Logstash. Although the format looks correct, the field shows up in Kibana/Elasticsearch as a string and not as a date.
As part of the analysis I tried to convert the date in multiple ways, but none of them work. I tried some patterns for milliseconds and half-day markers, since the date format in my log uses AM/PM.
Grok
match => { message => [
"\"%{WORD:status}\"\,\"(?<monitortime>%{MONTH:month}%{SPACE}%{MONTHDAY:day}\,%{SPACE}%{YEAR:year}%{SPACE}%{TIME:t1}%{SPACE}%{WORD:t2})\"\,\"%{WORD:monitor}\"\,%{INT:loadtime}\,%{INT:totalbytes}\,\"%{WORD:location}\"\,(?m)%{GREEDYDATA:error}"
Date Conversion
date {
    locale => "en"
    match => [ "monitortime", "MMM dd, yyyy kk:mm:ss.SSS aa ZZZ", "YYYY-MM-dd kk:mm:ss.SSS aa ZZZ" ]
    timezone => "Etc/UCT"
}
output in kibana
message "Error","Jun 14, 2019 02:47:33 pm","xxxxxxxxxx",0,0,"stage_1","HomePage: Sign in link is not visible!"
monitortime Jun 14, 2019 02:47:33 pm
monitortime string
Timestamp recorded by elasticsearch
@timestamp Sep 10, 2019 @ 20:06:48.525
The expected result is for monitortime to have the date datatype.

Parse date string in elasticsearch using custom joda format string

Trying to figure out why this joda custom format is causing an error. I'm trying to match this date string:
Wed May 23 2018 13:45:04 GMT-0700 (Pacific Daylight Time)
with this joda custom format string:
E MMM dd yyyy HH:mm:ss z (zzzz)||epoch_millis
I'm doing this in the dev console to test a mapping that uses the format. Elasticsearch doesn't like it:
PUT /twitter
{}
PUT /twitter/_mapping/_doc
{
    "properties": {
        "TxnDate": {
            "type": "date",
            "format": "E MMM dd yyyy HH:mm:ss z (zzzz)||epoch_millis"
        }
    }
}
Elasticsearch is returning:
{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Incomplete parser array"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "Incomplete parser array"
    },
    "status": 400
}
In order to get the mapping to save, the correct format to use is this one, i.e. you need to escape GMT and the parentheses.
E MMM dd yyyy HH:mm:ss 'GMT'Z '('ZZZZ')'||epoch_millis
However, this is not the end of the story, unfortunately... You'll then get a parsing error at indexing time when saving your document with a date such as Wed May 23 2018 13:45:04 GMT-0700 (Pacific Daylight Time). The problem here is that Joda time doesn't parse timezones as "explained" in their documentation:
Zone names: Time zone names ('z') cannot be parsed.
So your only option is to remove the timezone name in parentheses before indexing your document; the pattern E MMM dd yyyy HH:mm:ss 'GMT'Z||epoch_millis will then work fine. The timezone name in parentheses is redundant anyway, since the numeric offset already carries that information.
On another note, you should get into the habit of storing all your dates in the GMT timezone, but that's another story.
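The pre-processing step described above (dropping the parenthesized zone name before indexing) could be sketched as follows. This is only an illustration under stated assumptions: it uses java.time from the JDK instead of Joda-Time, the regular expression is one possible way to strip the trailing "(...)" part, and the class and method names are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical client-side cleanup, applied before the document is sent to Elasticsearch.
class StripZoneNameExample {
    // 'GMT' is a quoted literal; Z parses the numeric offset such as -0700.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("EEE MMM dd yyyy HH:mm:ss 'GMT'Z", Locale.US);

    // Removes a trailing "(Zone Name)" part and any whitespace around it.
    static String stripZoneName(String raw) {
        return raw.replaceAll("\\s*\\([^)]*\\)\\s*$", "");
    }

    static OffsetDateTime parseBrowserDate(String raw) {
        return OffsetDateTime.parse(stripZoneName(raw), FORMAT);
    }

    public static void main(String[] args) {
        String raw = "Wed May 23 2018 13:45:04 GMT-0700 (Pacific Daylight Time)";
        System.out.println(stripZoneName(raw));
        System.out.println(parseBrowserDate(raw));
    }
}
```

The cleaned string keeps the numeric offset, so no information is lost, and it matches the shorter pattern without the zone name.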

Joda Time - String to DateTime conversion

I require some help converting the following Thu, 13 Feb 2014 16:43:58 +0000 string to type DateTime. I have a stream of tweets being stored in an ElasticSearch cluster, currently the timestamp of each tweet is mapped as a string. I wish to parse these to type DateTime.
I tried EEE, dd MMM yyyy HH:mm:ss ZZZZZ but it failed. Any help would be great.
Thanks.
You only want a single Z to represent "offset without a colon".
Also note that you should ensure that your DateTimeFormatter is using English month/day names.
For example:
import java.util.*;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

class Test {
    public static void main(String[] args) throws Exception {
        DateTimeFormatter format =
            DateTimeFormat.forPattern("EEE, dd MMM yyyy HH:mm:ss Z")
                .withLocale(Locale.US);
        String text = "Thu, 13 Feb 2014 16:43:58 +0000";
        System.out.println(format.parseDateTime(text));
    }
}
