How to handle all zone offsets in one DateTimeFormatter in Java 8

I need to create a DateTimeFormatter for the following valid dates.
String date1 = "2017-06-20T17:25:28";
String date2 = "2017-06-20T17:25:28.477777";
String date3 = "2017-06-20T17:25:28.477777Z";
String date4 = "2017-06-20T17:25:28.477777UTC";
String date5 = "2017-06-20T17:25:28.477777-05";
String date6 = "2017-06-20T17:25:28.477777+05";
String date7 = "2017-06-20T17:25:28.477777+05:30";
String date8 = "2017-06-20T17:25:28.477777-05:30";
String date9 = "2017-06-20T17:25:28.477777+0530";
String date10 = "2017-06-20T17:25:28.477777-0530";
I have tried the following DateTimeFormatter, but it fails for the last two dates (date9 and date10).
private static final DateTimeFormatter DATE_TIME_FORMATTER = new DateTimeFormatterBuilder()
.appendPattern("yyyy-MM-dd'T'HH:mm:ss")
.appendFraction(ChronoField.MICRO_OF_SECOND, 0, 6, true)
.optionalStart().appendZoneId().optionalEnd()
.optionalStart().appendOffset("+HH", "+00").optionalEnd()
.optionalStart().appendOffset("+HH:mm", "+00:00").optionalEnd()
.optionalStart().appendOffset("+HHmm", "+0000").optionalEnd().toFormatter();
All dates from date1 to date8 work fine, but I get a DateTimeParseException when trying to parse the last two dates:
Exception in thread "main" java.time.format.DateTimeParseException: Text '2017-06-20T17:25:28.477777+0530' could not be parsed, unparsed text found at index 29
To parse the dates I am using the following:
LocalDateTime.parse(date1, DATE_TIME_FORMATTER);
Valid Pattern for Offset From OffsetIdPrinterParser:
static final class OffsetIdPrinterParser implements DateTimePrinterParser {
static final String[] PATTERNS = new String[] {
"+HH", "+HHmm", "+HH:mm", "+HHMM", "+HH:MM", "+HHMMss", "+HH:MM:ss", "+HHMMSS", "+HH:MM:SS",
}; // order used in pattern builder
I don't understand why my last two dates fail even though I am using valid ZoneOffset patterns.

Simply reverse the order of your optional sections:
private static final DateTimeFormatter DATE_TIME_FORMATTER = new DateTimeFormatterBuilder()
.appendPattern("yyyy-MM-dd'T'HH:mm:ss")
.appendFraction(ChronoField.MICRO_OF_SECOND, 0, 6, true)
.optionalStart().appendZoneId().optionalEnd()
.optionalStart().appendOffset("+HHmm", "+0000").optionalEnd()
.optionalStart().appendOffset("+HH:mm", "+00:00").optionalEnd()
.optionalStart().appendOffset("+HH", "+00").optionalEnd()
.toFormatter();
This parses all your 10 sample date-time strings.
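I am not quite sure why it works. I suppose that it now tries +HHmm before +HH, which makes sure it gets all four digits when there are four, instead of leaving the last two unparsed. Optional sections are tried in order and a section that has matched is not retried, so with +HH first the formatter consumes +05 and then has nothing that can parse the remaining 30.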
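The effect of the ordering can be checked directly. The following is only a sketch to illustrate the point, not part of the original answer: the class name OffsetOrderDemo and the helpers build and parses are made up for this example. It builds the formatter from the question with the optional offset sections in a given order and reports whether each order can parse date9.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

class OffsetOrderDemo {

    // Builds the formatter from the question, adding one optional offset
    // section per pattern, in the order given (hypothetical helper).
    static DateTimeFormatter build(String... offsetPatterns) {
        DateTimeFormatterBuilder builder = new DateTimeFormatterBuilder()
                .appendPattern("yyyy-MM-dd'T'HH:mm:ss")
                .appendFraction(ChronoField.MICRO_OF_SECOND, 0, 6, true)
                .optionalStart().appendZoneId().optionalEnd();
        for (String pattern : offsetPatterns) {
            // Text for a zero offset, e.g. "+00:00" for "+HH:mm"
            String noOffsetText = pattern.replace('H', '0').replace('m', '0');
            builder.optionalStart().appendOffset(pattern, noOffsetText).optionalEnd();
        }
        return builder.toFormatter();
    }

    // Returns true if the text can be parsed to a LocalDateTime.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            LocalDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String date9 = "2017-06-20T17:25:28.477777+0530";
        DateTimeFormatter shortestFirst = build("+HH", "+HH:mm", "+HHmm");
        DateTimeFormatter longestFirst = build("+HHmm", "+HH:mm", "+HH");
        System.out.println("+HH first:   " + parses(shortestFirst, date9));
        System.out.println("+HHmm first: " + parses(longestFirst, date9));
    }
}
```

With +HH first, the formatter should stop after +05 and report the leftover digits as unparsed text, which matches the exception in the question.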

Another alternative is to use optional sections, delimited by [], and the respective offset patterns (VV and x):
DATE_TIME_FORMATTER = DateTimeFormatter
// pattern with optional sections: fraction of seconds and offsets
.ofPattern("yyyy-MM-dd'T'HH:mm:ss[.SSSSSS][VV][x][xx][xxx]");
Each pair of [] is equivalent to one optionalStart and optionalEnd section. Note that I also had to include the uppercase S (fraction of second) as optional, to parse the case where this field is not present.
The other patterns (VV and x) correspond to the various offsets you need. From the javadoc:
Pattern  Count  Equivalent builder methods
-------  -----  --------------------------
VV       2      appendZoneId()
x        1      appendOffset("+HHmm","+00")
xx       2      appendOffset("+HHMM","+0000")
xxx      3      appendOffset("+HH:MM","+00:00")
This works for all your input dates.
The only difference is that [.SSSSSS] accepts exactly 6 digits in the fraction-of-seconds field (or zero digits, as it's an optional section), while appendFraction accepts any quantity from 0 to 6 digits. To get exactly this same behaviour, you must use the DateTimeFormatterBuilder:
DATE_TIME_FORMATTER = new DateTimeFormatterBuilder()
// date and time
.appendPattern("yyyy-MM-dd'T'HH:mm:ss")
// fraction of seconds, from 0 to 6 digits
.appendFraction(ChronoField.MICRO_OF_SECOND, 0, 6, true)
// optional offset patterns
.appendPattern("[VV][x][xx][xxx]")
.toFormatter();
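To make the difference in fraction handling concrete, here is a small sketch that parses a string with only three fraction digits using both formatters. The class name FractionWidthDemo and the parses helper are assumptions made for this illustration; the formatters themselves are the two shown above.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

class FractionWidthDemo {

    // Pattern-only formatter: [.SSSSSS] needs exactly six fraction digits.
    static final DateTimeFormatter PATTERN_ONLY = DateTimeFormatter
            .ofPattern("yyyy-MM-dd'T'HH:mm:ss[.SSSSSS][VV][x][xx][xxx]");

    // Builder-based formatter: appendFraction accepts 0 to 6 digits.
    static final DateTimeFormatter WITH_BUILDER = new DateTimeFormatterBuilder()
            .appendPattern("yyyy-MM-dd'T'HH:mm:ss")
            .appendFraction(ChronoField.MICRO_OF_SECOND, 0, 6, true)
            .appendPattern("[VV][x][xx][xxx]")
            .toFormatter();

    // Returns true if the text can be parsed to a LocalDateTime.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            LocalDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String threeDigits = "2017-06-20T17:25:28.477";
        System.out.println("pattern only: " + parses(PATTERN_ONLY, threeDigits));
        System.out.println("with builder: " + parses(WITH_BUILDER, threeDigits));
    }
}
```

If the input always carries either no fraction or exactly six digits, the pattern-only version is sufficient.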

Related

How can I filter() stream method using regexp and predicate to get negated list

I am trying to filter out anything that does not match the regexp.
I want to collect into a list every city name that contains characters other than a-z, 0-9 and -, so I can deal with these invalid city names afterwards.
But whatever I try, I either end up with a list of the valid cities, or I get an IllegalArgumentException even though the list contains only cities with valid characters.
String str;
List<String> invalidCharactersList = cityName.stream()
.filter(Pattern.compile("[^a-z0-9-]*$").asPredicate())
.collect(toList());
// Check for invalid names
if (!invalidCharactersList.isEmpty()) {
str = (inOut) ? "c" : "q";
throw new IllegalArgumentException("City name characters "
+ str + ": for city name " + invalidCharactersList.get(0)
+ ": fails constraint city names [a-z, 0-9, -]");
}
Here is some test data. It fails on the first list (c); I want it to fail only on the last one (q):
List<String> c = new ArrayList<>(Arrays.asList("fastcity", "bigbanana", "xyz"));
List<Integer> x = new ArrayList<>(Arrays.asList(23, 23, 23));
List<Integer> y = new ArrayList<>(Arrays.asList(1, 10, 20));
List<String> q = new ArrayList<>(Arrays.asList("fastcity*", "bigbanana", "xyz&"));
Following is output:
@Holger
filter(Pattern.compile("[^a-z0-9-]").asPredicate())
Thanks, this works fine.
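For reference, a self-contained sketch of the working filter. asPredicate() tests whether the pattern can be found anywhere in the input. The original pattern [^a-z0-9-]*$ can match the empty string at the end of any input, so every name passed the filter. The class name InvalidCityNames and the method invalidNames are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class InvalidCityNames {

    // True if the input contains at least one character outside a-z, 0-9 and '-'.
    static final Predicate<String> HAS_INVALID_CHAR =
            Pattern.compile("[^a-z0-9-]").asPredicate();

    // Collects the names that contain an invalid character.
    static List<String> invalidNames(List<String> cityNames) {
        return cityNames.stream()
                .filter(HAS_INVALID_CHAR)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> c = Arrays.asList("fastcity", "bigbanana", "xyz");
        List<String> q = Arrays.asList("fastcity*", "bigbanana", "xyz&");
        System.out.println(invalidNames(c)); // []
        System.out.println(invalidNames(q)); // [fastcity*, xyz&]
    }
}
```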

How to strip out the time portion from a datetime string

How can I strip out the time portion from a datetime string in Informatica? I can't use the GET_DATE_PART function since the input value is a string.
Ex. 2019-08-01 14:30:00
I want to strip out the hour, minute and second portions of the string and store them in separate variables.
Try creating two ports, one for the date part and one for the time part. SUBSTR takes the string, the start position and the length:
SUBSTR(in_date, 1, 10)
SUBSTR(in_date, 12, 8)
You should create 3 variables:
1. hour = SUBSTR(date_in, 12, 2)
2. minute = SUBSTR(date_in, 15, 2)
3. seconds = SUBSTR(date_in, 18, 2)
Adjust the start positions and lengths to your needs, depending on the spacing in the value.

Java 8 DateTimeFormatter two digit year 18 parsed to 0018 instead of 2018?

With Java 8, the code below parses "18" into year "0018" instead of "2018".
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("M/d/y");
return LocalDate.parse(date, formatter);
The input date is "01/05/18".
1) Why is the result "0018"? Does DateTimeFormatter not follow the 80-20 rule?
2) The question "How to control SimpleDateFormat parse to 19xx or 20xx?" mentions that SimpleDateFormat.set2DigitYearStart(Date) can be used to fix the century. Is there something similar for DateTimeFormatter?
I was hoping "M/d/y" would parse both 2 and 4 digit years.
"M/d/yy" throws Exception for 4 digit years and parses "01/05/97" to "2097-01-05". Ideally this should be parsed to "1997-01-05".
"M/d/yyyy" throws Exception for 2 digit years.
There is not a single string of y or u that will allow you to parse both two and four digit years. However, you may use optional parts in the format pattern string to specify that a two or four digit year may be present:
public static LocalDate parseDateString(CharSequence date) {
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("M/d/[uuuu][uu]");
return LocalDate.parse(date, formatter);
}
Try it:
System.out.println(parseDateString("01/05/18"));
System.out.println(parseDateString("01/06/2018"));
This printed:
2018-01-05
2018-01-06
In the format pattern string you need to put the four digit year first. With the opposite order, when trying to parse a four digit year, the formatter will parse two digits, decide it was successful this far, and then complain about unparsed text after the two digits.
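The ordering effect can be seen with a small check. This is just an illustrative sketch: the class name YearOrderDemo and the parses helper are not from the original answer.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class YearOrderDemo {

    // Returns true if the text can be parsed to a LocalDate with the pattern.
    static boolean parses(String pattern, String text) {
        try {
            LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String fourDigitYear = "01/06/2018";
        // Four digit year first: the whole year is consumed by [uuuu].
        System.out.println(parses("M/d/[uuuu][uu]", fourDigitYear));
        // Two digit year first: [uu] takes "20" and "18" is left unparsed.
        System.out.println(parses("M/d/[uu][uuuu]", fourDigitYear));
    }
}
```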
If you want more precise control over how two digit years are interpreted:
DateTimeFormatter formatter = new DateTimeFormatterBuilder().appendPattern("M/d/")
.optionalStart()
.appendPattern("uuuu")
.optionalEnd()
.optionalStart()
.appendValueReduced(ChronoField.YEAR, 2, 2, 1920)
.optionalEnd()
.toFormatter();
Using this formatter in the above method let’s try:
System.out.println(parseDateString("01/05/22"));
This prints:
1922-01-05
Giving 1920 as base (as in my example code) will cause two digit years to end up in the interval from 1920 through 2019. Adjust the value to your requirements.
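As a quick check of where the boundary falls, the sketch below runs a few two digit years (and one four digit year) through the same formatter. The class name PivotYearDemo and the parse helper are made up for this example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

class PivotYearDemo {

    // Same formatter as above: four digit year, or two digit year with base 1920.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("M/d/")
            .optionalStart()
            .appendPattern("uuuu")
            .optionalEnd()
            .optionalStart()
            .appendValueReduced(ChronoField.YEAR, 2, 2, 1920)
            .optionalEnd()
            .toFormatter();

    static LocalDate parse(String text) {
        return LocalDate.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("01/05/19"));   // 2019-01-05
        System.out.println(parse("01/05/20"));   // 1920-01-05
        System.out.println(parse("01/05/99"));   // 1999-01-05
        System.out.println(parse("01/06/2018")); // 2018-01-06
    }
}
```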
Change your Formatter string to
"M/d/yy"

DateTimeFormatter weekday seems off by one

I'm porting an existing application from Joda-Time to Java 8 java.time.
I ran into a problem where parsing a date/time string that contains a 'day of week' value triggered an exception in my unit tests.
When parsing:
2016-12-21 20:50:25 Wednesday December +0000 3
using format:
yyyy'-'MM'-'dd' 'HH':'mm':'ss' 'EEEE' 'MMMM' 'ZZ' 'e
I get:
java.time.format.DateTimeParseException:
Text '2016-12-21 20:50:25 Wednesday December +0000 3'
could not be parsed: Conflict found:
Field DayOfWeek 3 differs from DayOfWeek 2 derived from 2016-12-21
When letting the DateTimeFormatter indicate what it expects:
String logline = "2016-12-21 20:50:25 Wednesday December +0000";
String format = "yyyy'-'MM'-'dd' 'HH':'mm':'ss' 'EEEE' 'MMMM' 'ZZ";
DateTimeFormatter formatter = DateTimeFormatter.ofPattern(format).withLocale(Locale.ENGLISH);
ZonedDateTime dateTime = formatter.parse(logline, ZonedDateTime::from);
format = "yyyy'-'MM'-'dd' 'HH':'mm':'ss' 'EEEE' 'MMMM' 'ZZ' 'e";
formatter = DateTimeFormatter.ofPattern(format).withLocale(Locale.ENGLISH);
System.out.println(formatter.format(dateTime));
I now get this output:
2016-12-21 20:50:25 Wednesday December +0000 4
So effectively the root cause of the problem is that the e flag in Joda-Time considers Monday to be 1 yet the Java 8 java.time considers Monday to be 0.
Now for the patterns that java.time.DateTimeFormatter supports I find in both the Oracle documentation and in JSR-310 this:
e/c localized day-of-week number/text 2; 02; Tue; Tuesday; T
This explicit example of 2 and 'Tuesday' leads me to believe that Wednesday should also in java.time be 3 instead of 4.
What is wrong here?
Do I misunderstand?
Is this a bug in Java 8?
There's a difference in how Joda-Time and java.time interpret the pattern e.
In Joda-Time, the e pattern designates the numeric value of day-of-week:
Symbol  Meaning      Presentation  Examples
------  -----------  ------------  --------
e       day of week  number        2
So, using e is equivalent to getting the day of the week from a date object:
// using org.joda.time.DateTime and org.joda.time.format.DateTimeFormat
DateTime d = new DateTime(2016, 12, 21, 20, 50, 25, 0, DateTimeZone.UTC);
DateTimeFormatter fmt = DateTimeFormat.forPattern("e").withLocale(Locale.ENGLISH);
System.out.println(d.toString(fmt)); // 3
System.out.println(d.getDayOfWeek()); // 3
System.out.println(d.dayOfWeek().getAsText(Locale.ENGLISH)); // Wednesday
Note that both the formatter and getDayOfWeek() return 3. The getDayOfWeek() method returns a value defined in DateTimeConstants class, and Wednesday's value is 3 (the third day of the week according to ISO's definition).
In java.time API, the pattern e has a different meaning:
Pattern  Count  Equivalent builder methods
-------  -----  --------------------------
e        1      append special localized WeekFields element for numeric day-of-week
It uses the localized WeekFields element, and this can vary according to the locale. The behaviour might be different when compared to the getDayOfWeek() method:
ZonedDateTime z = ZonedDateTime.of(2016, 12, 21, 20, 50, 25, 0, ZoneOffset.UTC);
DateTimeFormatter fmt = DateTimeFormatter.ofPattern("e", Locale.ENGLISH);
System.out.println(z.format(fmt)); // 4
System.out.println(z.getDayOfWeek()); // WEDNESDAY
System.out.println(z.getDayOfWeek().getValue()); // 3
Note that the formatter uses the localized day of week for English locale, and the value is 4, while calling getDayOfWeek().getValue() returns 3.
That's because e with English locale is equivalent to using a java.time.temporal.WeekFields:
// using localized fields
WeekFields wf = WeekFields.of(Locale.ENGLISH);
System.out.println(z.get(wf.dayOfWeek())); // 4
While getDayOfWeek() is equivalent to using ISO's definition:
// same as getDayOfWeek()
System.out.println(z.get(WeekFields.ISO.dayOfWeek())); // 3
That's because ISO's definition uses Monday as the first day of the week, while WeekFields with English locale uses Sunday:
// comparing the first day of week
System.out.println(WeekFields.ISO.getFirstDayOfWeek()); // MONDAY
System.out.println(wf.getFirstDayOfWeek()); // SUNDAY
So whether the e pattern agrees with getDayOfWeek() depends on the locale set in the formatter (or the JVM default locale, if none is set). In the French locale, for example, it behaves just like ISO, while in some Arabic locales the first day of the week is Saturday:
WeekFields.of(Locale.FRENCH).getFirstDayOfWeek(); // MONDAY
WeekFields.of(new Locale("ar", "AE")).getFirstDayOfWeek(); // SATURDAY
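The same comparison can be written as a small self-contained program. This is a sketch for illustration only: the class name LocalizedDayNumberDemo and the helper localizedDayNumber are made up, and it uses Locale.US and Locale.FRANCE (locales with an explicit country) on the assumption that their week definitions are less likely to differ between Java versions than those of a language-only locale.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;
import java.util.Locale;

class LocalizedDayNumberDemo {

    // Day-of-week number under the week definition of the given locale.
    static int localizedDayNumber(LocalDate date, Locale locale) {
        return date.get(WeekFields.of(locale).dayOfWeek());
    }

    public static void main(String[] args) {
        LocalDate wednesday = LocalDate.of(2016, 12, 21);
        // ISO numbering: Monday is 1
        System.out.println(wednesday.getDayOfWeek().getValue());       // 3
        // US weeks start on Sunday, so Wednesday is the 4th day
        System.out.println(localizedDayNumber(wednesday, Locale.US));     // 4
        // French weeks start on Monday, same as ISO
        System.out.println(localizedDayNumber(wednesday, Locale.FRANCE)); // 3
    }
}
```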
According to javadoc, the only patterns that return a numeric value for the day of week seem to be the localized ones. So, to parse the input 2016-12-21 20:50:25 Wednesday December +0000 3, you can use a java.time.format.DateTimeFormatterBuilder and join the date/time pattern with a java.time.temporal.ChronoField to indicate the numeric value of the day of week (the ISO non-locale sensitive field):
String input = "2016-12-21 20:50:25 Wednesday December +0000 3";
DateTimeFormatter parser = new DateTimeFormatterBuilder()
// date/time pattern
.appendPattern("yyyy-MM-dd HH:mm:ss EEEE MMMM ZZ ")
// numeric day of week
.appendValue(ChronoField.DAY_OF_WEEK)
// create formatter with English locale
.toFormatter(Locale.ENGLISH);
ZonedDateTime date = ZonedDateTime.parse(input, parser);
Also note that you don't need to quote the -, : and space characters, so the pattern becomes clearer and more readable (IMO).
I also set the English locale, because if you don't set one, the formatter uses the JVM default locale, which is not guaranteed to be English. The default can also be changed without notice, even at runtime, so it's better to specify a locale, especially if you already know what language the input is in.
Update: probably the ccccc pattern should work, as it's equivalent to appendText(ChronoField.DAY_OF_WEEK, TextStyle.NARROW_STANDALONE) and in my tests (JDK 1.8.0_144) it returns (and also parses) 3:
DateTimeFormatter parser = DateTimeFormatter
.ofPattern("yyyy-MM-dd HH:mm:ss EEEE MMMM ZZ ccccc", Locale.ENGLISH);
ZonedDateTime date = ZonedDateTime.parse(input, parser);
In Locale.ENGLISH, Wednesday is the 4th day of the week, as the week starts on Sunday.
You can check first day of week with
WeekFields.of(Locale.ENGLISH).getFirstDayOfWeek(); //it's SUNDAY

Talend: take out zeros before the comma

I have a file with two columns: the first one holds a name and the second one a number.
The number column is 20 characters wide. The numbers are usually shorter than that, and the remaining characters are filled with leading zeros.
I need to take out all the zeros before the comma. I should use a tMap. How?
The solution:
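Using a tMap, put a Var in the middle, between the input and the output.
In the Var use the following expression. It strips only the leading zeros, so zeros inside the number are kept:
"0"+row1.numberField.split(",")[0].replaceFirst("^0+", "") + "." + row1.numberField.split(",")[1]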
Example:
000000001,58
Result:
01.58
Solution 2:
Define your own routine:
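public static String calcImp(String theNumber) {
// Parsing with a decimal point also drops the leading zeros
float theFNumber = Float.parseFloat(theNumber.replace(",", "."));
// Convert back to a string with a decimal comma
return Float.toString(theFNumber).replace(".", ",");
}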
Example:
000000001,587
Result:
1,587
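A third option, sketched here as an assumption rather than as part of the original answers, is to strip the leading zeros with a regular expression and leave the rest of the string untouched. This avoids converting the value to a float, so no digits are lost for long numbers. The class and method names are hypothetical; the method could be placed in a Talend routine like calcImp above.

```java
class LeadingZeros {

    // Removes leading zeros but keeps a single zero directly before the comma,
    // so "000000000,58" becomes "0,58" rather than ",58".
    static String stripLeadingZeros(String number) {
        return number.replaceFirst("^0+(?!,)", "");
    }

    public static void main(String[] args) {
        System.out.println(stripLeadingZeros("000000001,58"));  // 1,58
        System.out.println(stripLeadingZeros("000000100,50"));  // 100,50
        System.out.println(stripLeadingZeros("000000000,58"));  // 0,58
    }
}
```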
