Rails split string with multiple parts - ruby

I have a string that's been imported from a csv such as:
14th Aug 2009:1, 15th Aug 2009:1, 16th Sep 2015:1|Style1, 17th Sep
2015:1|Style 1
I wish to add this data to my database in a specific way. First I split it on the comma to get each date group (in this case 4 dates).
Second, I'd like a way to split each of those date groups into multiple segments: the first with the date, the second with the number after the colon, and then a variable number more, one for each of the items separated by the | character.
Is there a decent, efficient way to accomplish this in Ruby?
I'm looking for the outcome to be a hash like so:
{ '14th Aug 2009' => 1, '15th Aug 2009' => 1, '16th Aug 2009' => 1, '16th Sep 2015' => { 1 => 'Style 1' }, '17th Sep 2015' => { 1 => 'Style 1' } }
Basically if the string was like so:
15th Aug 2009:1, 16th Sep 2015:3|Style1|Style 1, 17th Sep
2015:1|Style 1
I would get
{ '15th Aug 2009' => 1, '16th Sep 2015' => { '', 'Style 1', 'Style 1' }, '17th Sep 2015' => { 1 => 'Style 1' } }
Basically, the text separated by | characters should be assigned to the number after the colon. If the number is 3 and only two pieces of text follow it, then the first entry is an empty string and the other two hold the text (e.g. "Style 1").
Sorry for sounding very confusing.

I'm assuming that you meant for the '|'-separated items to build an Array, as infused asked about. How about this?
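s = "15th Aug 2009:1, 16th Sep 2015:3|Style1|Style 1, 17th Sep 2015:1|Style 1"
result = {}
# split on the comma plus any following whitespace so dates have no leading space
s.split(/,\s*/).each do |v|
  date, rest = v.split(':')
  count, *styles = rest.split('|')
  if styles.empty?
    # no styles given: store just the number
    result[date] = count.to_i
  else
    # pad with empty strings at the front so the array has `count` entries
    padding = [count.to_i - styles.size, 0].max
    result[date] = Array.new(padding, '') + styles
  end
end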
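For reuse, the same idea can be wrapped in a method. This is only a sketch of one possible reading of the requirements: the method name `parse_date_groups` is hypothetical, it assumes the imported CSV cell may contain a line break inside a date (as in the sample string in the question), and it assumes missing styles are padded with empty strings at the front.

```ruby
# Hypothetical helper: parses "date:count|style|style, ..." into a Hash.
def parse_date_groups(str)
  # Collapse line breaks and repeated whitespace left over from the CSV import.
  cleaned = str.gsub(/\s+/, ' ').strip
  cleaned.split(/\s*,\s*/).each_with_object({}) do |group, result|
    date, rest = group.split(':', 2)
    count, *styles = rest.to_s.split('|')
    count = count.to_i
    result[date] =
      if styles.empty?
        count # plain number when no styles follow
      else
        # Assumption: pad at the front until there are `count` entries.
        padding = [count - styles.size, 0].max
        Array.new(padding, '') + styles
      end
  end
end

parsed = parse_date_groups("15th Aug 2009:1, 16th Sep 2015:3|Style1|Style 1, 17th Sep\n2015:1|Style 1")
```

Here `parsed` maps '15th Aug 2009' to the integer 1, '16th Sep 2015' to a three-element array starting with an empty string, and '17th Sep 2015' to a one-element array.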

Related

Compare one date value with other dates and perform conditional action in cypress

I'm trying to compare one date value (i.e. the base value) with all other date values on a page and, based on the difference in days between them, execute other commands.
In the UI, the base value is 11 Jul 2021 (the departure date in the first list) and the other date values are 12 Jul 2021, 20 Jul 2021, 27 Jul 2021, 3 Aug 2021 and so on (arrival dates from the 2nd list onwards).
I need to delete every list where the difference between the base value and that list's date is less than 15 days.
In this case, the lists for 12 Jul 2021 and 20 Jul 2021 have to be deleted, and all lists from 27 Jul 2021, 3 Aug 2021 and so on should be left untouched.
So far, I have captured the base value and come up with logic to compare it with another date value, but I am not sure how to save the 2nd and subsequent date values to a variable in order to compare them with the base value.
{
cy.get("[data-test='departureTime']")
.eq(0)
.then((date) => {
const depDate_FirstPort = new Date(date.text());
cy.log(depDate_FirstPort.toISOString()); //2021-07-11T19:00:00.000Z
// const arrDate_SecondPort = new Date(cy.get('[data-test="arrivalTime"]').eq(1).invoke('text'));
// Since the above approach does not work, hard coding now.
const arrDate_SecondPort = new Date("22 Jul 2021 12:01")
cy.log(arrDate_SecondPort.toISOString()); //2021-07-22T10:01:00.000Z
cy.getDifferenceBetweenDates(depDate_FirstPort,arrDate_SecondPort).then((dif)=>{
if(dif < 16) {
cy.log("delete the port entry");
//do something
}
});
});
}
Cypress Command:
Cypress.Commands.add("getDifferenceBetweenDates", (Date1, Date2) => {
var diff_times = Math.abs(Date1.getTime() - Date2.getTime());
var diff_days = Math.ceil(diff_times / (1000 * 3600 * 24));
cy.log(diff_days) //11
})
I'm also curious about a possible approach to iterate over all the lists that fall into the 'to be deleted' group (12 Jul 2021, 20 Jul 2021) based on the condition mentioned above.
The iterative approach you have is OK, but you need to repeat the code you have for the first date to get the subsequent dates.
So, this bit, but changing the index:
cy.get("[data-test='departureTime']")
.eq(0) // 1,2,3 etc
.then((date) => {
A different approach is to filter the whole set:
const dayjs = require('dayjs') // replaces Cypress.moment
// first install with
// yarn add -D dayjs
// parsing with a format string requires the customParseFormat plugin
dayjs.extend(require('dayjs/plugin/customParseFormat'))

it('finds the skipped ports', () => {
  // helper func with format specific to this website
  const toDate = (el) => dayjs(el.innerText, 'D MMM YYYY HH:mm')

  cy.get("[data-test='departureTime']")
    .then($departures => {
      const departures = [...$departures] // convert jQuery object to an array
      const first = toDate(departures[0])
      const cutoff = first.add(15, 'day')
      const nextPorts = departures.slice(1) // all but the first
      const skipPorts = nextPorts.filter(port => toDate(port).isBefore(cutoff))

      expect(skipPorts.length).to.eq(2)
      expect(skipPorts[0].innerText).to.eq('12 Jul 2021 14:02')
      expect(skipPorts[1].innerText).to.eq('21 Jul 2021 04:00')
    })
})
I'm not clear about your goal, but if you are going to actually delete the skipPorts from the page instead of just testing them, you should be wary of the DOM list changing as you do so.
Deleting from the list you have recently queried with cy.get("[data-test='departureTime']") would cause the internal subject to become invalid, and you might get "detached from DOM" errors or delete the wrong item.

I don't know how to filter my log file with grok and logstash

I have a small Java app that writes logs similar to the ones below:
Fri May 29 12:10:34 BST 2015 Trade ID: 2 status is :received
Fri May 29 14:12:36 BST 2015 Trade ID: 4 status is :received
Fri May 29 17:15:39 BST 2015 Trade ID: 3 status is :received
Fri May 29 21:19:43 BST 2015 Trade ID: 3 status is :Parsed
Sat May 30 02:24:48 BST 2015 Trade ID: 8 status is :received
Sat May 30 08:30:54 BST 2015 Trade ID: 3 status is :Data not found
Sat May 30 15:38:01 BST 2015 Trade ID: 3 status is :Book not found
Sat May 30 23:46:09 BST 2015 Trade ID: 6 status is :received
I want to use ELK stack to analyse my logs and filter them.
I would like at least 3 filters : Date and time, trade Id and status.
In the filter part of my logstash configuration file here is what I did:
filter {
  grok {
    match => { "message" => "%{DAY} %{MONTH} %{DAY} %{TIME} BST %{YEAR} Trade ID: %{NUMBER:tradeId} status is : %{WORD:status}" }
  }
}
And for the moment I can't filter my logs as I want.
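Your pattern has some extra spaces (for example after "status is :", where the log has none), and the day of the month should be matched with MONTHDAY rather than a second DAY. For the status you want to capture the rest of the message, which can be several words, so use GREEDYDATA instead of WORD.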
filter {
  grok {
    match => { "message" => "%{DAY:day} %{MONTH:month} %{MONTHDAY:monthday} %{TIME:time} BST %{YEAR:year} Trade ID: %{NUMBER:tradeId} status is :%{GREEDYDATA:status}" }
  }
}
For this log line:
Sat May 30 15:38:01 BST 2015 Trade ID: 3 status is :Book not found
You will end up with a json like:
{
"message" => "Sat May 30 15:38:01 BST 2015 Trade ID: 3 status is :Book not found",
"@version" => "1",
"@timestamp" => "2015-08-18T18:28:47.195Z",
"host" => "Gabriels-MacBook-Pro.local",
"day" => "Sat",
"month" => "May",
"monthday" => "30",
"time" => "15:38:01",
"year" => "2015",
"tradeId" => "3",
"status" => "Book not found"
}
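To sanity-check the field boundaries outside Logstash, the grok pattern above can be approximated with a plain Ruby regular expression. This is only a rough sketch: the constant name `LOG_LINE` is made up for illustration, and the character classes used here are simplified stand-ins for grok's DAY, MONTH, MONTHDAY, TIME, YEAR and NUMBER patterns, not their exact definitions.

```ruby
# Approximate Ruby equivalent of the grok pattern (simplified sub-patterns).
LOG_LINE = /\A(?<day>\w{3}) (?<month>\w{3}) (?<monthday>\d{1,2}) (?<time>\d{2}:\d{2}:\d{2}) BST (?<year>\d{4}) Trade ID: (?<tradeId>\d+) status is :(?<status>.*)\z/

line = "Sat May 30 15:38:01 BST 2015 Trade ID: 3 status is :Book not found"

match = LOG_LINE.match(line)
# named_captures returns a Hash of field name => captured text
fields = match ? match.named_captures : {}
```

With the sample line, `fields["status"]` holds the full multi-word status, which is the behaviour GREEDYDATA gives you in grok.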

Comparing elements inside array of ranges

So given that I have this array of ranges:
[
[0] Mon, 29 Dec 2014 07:30:00 PST -08:00..Mon, 29 Dec 2014 10:59:59 PST -08:00,
[1] Mon, 29 Dec 2014 12:30:01 PST -08:00..Mon, 29 Dec 2014 15:00:00 PST -08:00,
[2] Mon, 29 Dec 2014 07:30:00 PST -08:00..Mon, 29 Dec 2014 08:59:59 PST -08:00,
[3] Mon, 29 Dec 2014 10:30:01 PST -08:00..Mon, 29 Dec 2014 15:00:00 PST -08:00
]
How do I compare ranges that have the same minimum value and remove the one whose maximum value is greater than the other's?
Two ways, where a is the array of ranges:
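# 1: merge into a hash keyed by range start, keeping the range that ends first
a.each_with_object({}) { |r, h|
  h.update(r.first => r) { |_, ov, nv| [ov, nv].min_by(&:last) }
}.values
# 2: group by range start, then pick the earliest-ending range in each group
a.group_by(&:first).values.map { |r| r.min_by(&:last) }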
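As a quick check, here is the `group_by` variant run against data modelled on the question. This is a sketch under assumptions: the helper `t` is made up for brevity, and the times are built in UTC rather than PST, which does not affect the comparison.

```ruby
# Hypothetical helper to build a time on 29 Dec 2014 (UTC for simplicity).
def t(hour, min, sec)
  Time.utc(2014, 12, 29, hour, min, sec)
end

a = [
  t(7, 30, 0)..t(10, 59, 59),
  t(12, 30, 1)..t(15, 0, 0),
  t(7, 30, 0)..t(8, 59, 59),
  t(10, 30, 1)..t(15, 0, 0)
]

# Group ranges by start time, then keep the one that ends earliest in each group.
kept = a.group_by(&:first).values.map { |ranges| ranges.min_by(&:last) }
```

Only the two ranges starting at 07:30:00 share a minimum, so the longer of them (ending 10:59:59) is dropped and the other three ranges are kept.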
Admittedly, this will be slow:
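your_array.group_by do |range|
  range.min
end.each do |min_value, ranges|
  least_max = ranges.map(&:max).min
  # keep only the range(s) with the smallest maximum in each group
  ranges.delete_if { |range| range.max != least_max }
end.values.flatten # flatten the per-group arrays into one list of ranges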
The following might be faster and also will delete things from your original array:
min_maxes = {}
your_array.each do |range|
  min = range.min
  max = range.max
  if min_maxes[min].nil? || (min_maxes[min] > max)
    min_maxes[min] = max
  end
end
your_array.delete_if do |range|
  min_maxes[range.min] < range.max
end
If a multimap data structure were available, this scenario could be handled easily. A multimap is an ordered associative container, typically implemented as a balanced binary tree, whose elements are ordered by key and which allows duplicate keys. It exists in C++ (std::multimap); I'm not sure whether there is anything similar in Ruby. Since the question is tagged 'data structure', hopefully my answer sheds some light.
For your case, you can use the lower bound of each range as the key and the upper bound as the value. If two ranges collide on the lower bound, you can easily detect that, compare the values of the colliding records, and delete one if necessary.
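Ruby's standard library has no multimap class that I know of, but a Hash whose values are arrays can stand in for one. The sketch below is an illustration under assumptions, not a real multimap: it uses integer ranges for brevity, and the variable names are made up.

```ruby
# Integer ranges for brevity; the same idea applies to time ranges.
ranges = [(1..10), (5..9), (1..4), (5..12)]

# A hash of arrays as a multimap stand-in: lower bound => all upper bounds.
by_lower = Hash.new { |hash, key| hash[key] = [] }
ranges.each { |r| by_lower[r.begin] << r.end }

# A key with more than one value is a "collision"; keep the smallest upper bound.
# Sorting the keys mimics the key ordering that std::multimap provides.
kept = by_lower.keys.sort.map { |lower| lower..by_lower[lower].min }
```

After this, `kept` holds one range per distinct lower bound, each with the smallest upper bound seen for that key.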

Check if array of Dates are adjacent ordered months

I have an array of Dates. I need to check if it follows a month sequence, e.g.:
[Mar 2010, Apr 2010, May 2010, Jun 2010, ..., Jan 2012]
Since a Date object should have day, month and year, I want to ignore the day, and just worry about month and year.
I want to get true if there are no months "missing" in the sequence. In other words, after an April either the vector ends or there is a May; after a May either the vector ends or there is a June.
I want to get false if the months are not ordered correctly (from older to newer) or if there are months missing.
I can easily check if the dates are ordered by using the "<" operator. But I'm not sure how to check if there are missing months. How can I do that?
Here's one way
require 'date'
>> dates
=> ["Nov 2010", "Dec 2010", "Jan 2011"]
>> date_objs = dates.map{|d| Date.parse d }
=> [#<Date: 2010-11-01 ((2455502j,0s,0n),+0s,2299161j)...]
>> date_objs.each_cons(2).all?{|d1, d2| d1.next_month == d2 }
=> true
This handles missing months as well:
>> dates = ["Nov 2010", "Dec 2010", "Feb 2011"]
>> date_objs = dates.map{|d| Date.parse(d) }
>> date_objs.each_cons(2).all?{|d1, d2| d1.next_month == d2 }
=> false
require 'date'
ar =["Mar 2010","Apr 2010", "May 2010", "Jun 2010"]
p ar.map{|d| Date.parse(d)}.each_cons(2).all?{|(d1,d2)| (d1 >> 1) == d2} #=> true
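Both snippets above rely on every parsed date falling on the 1st of its month. If the array holds Date objects with arbitrary days, as the question describes, `next_month` comparisons can fail even for adjacent months. A minimal sketch that ignores the day entirely follows; the method name `consecutive_months?` is hypothetical.

```ruby
require 'date'

# Hypothetical helper: true when each date falls in the calendar month
# immediately after the previous one; the day of the month is ignored.
def consecutive_months?(dates)
  # Map each date to a running month count, e.g. Mar 2010 -> 2010 * 12 + 3.
  indexes = dates.map { |d| d.year * 12 + d.month }
  indexes.each_cons(2).all? { |a, b| b == a + 1 }
end

in_order = consecutive_months?([Date.new(2010, 11, 30), Date.new(2010, 12, 5), Date.new(2011, 1, 17)])
with_gap = consecutive_months?([Date.new(2010, 11, 30), Date.new(2011, 1, 17)])
```

Because the check requires each month index to be exactly one more than the previous, it rejects gaps, repeated months and out-of-order dates in a single pass.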

Get the word after a particular word in a Ruby string?

How do I get the word after a particular word in a Ruby string?
For example:
From:Ysxrb<abc#gmail.com>\nTo: <xyzn#gmail.com>Subject: xyzabc\nDate: Tue, 19 Jun 2012 03:26:56 -0700\nMessage-ID: <9D.A1.02635.ABB40EF4#ecout1>
I just want to get:
Ysxrb<abc#gmail.com
xyzabc
I think your question/requirement may need a bit of refinement.
You state: "How to get the word after a particular word in a ruby string?" and your example text is this : "From:Ysxrb\nTo: Subject: xyzabc\nDate: Tue, 19 Jun 2012 03:26:56 -0700\nMessage-ID: <9D.A1.02635.ABB40EF4#ecout1>"
and then you finally say that what you really want out of this string are the following words:
"'Ysxrb' and 'xyzabc'".
Will you always be parsing email text, which is what this appears to be? If so, then there are some more specific approaches you could take. For instance, in this example you could do something like this:
eml = "From:Ysxrb\nTo: Subject: xyzabc\nDate: Tue, 19 Jun 2012 03:26:56 -0700\nMessage-ID: <9D.A1.02635.ABB40EF4#ecout1>"
tokens = eml.split(/[\s\:]/)
which would yield this:
["From", "Ysxrb", "To", "", "Subject", "", "xyzabc", "Date", "", "Tue,", "19", "Jun", "2012", "03", "26", "56", "-0700", "Message-ID", "", "<9D.A1.02635.ABB40EF4#ecout1>"]
At this point, if the words following "From" and "Subject" are what you're after, you could simply get the first non-blank array element after each one, like this:
tokens[tokens.find_index("From") + 1] # => "Ysxrb"
tokens[tokens.find_index("Subject") + 2] # => "xyzabc" (+ 2 is needed because the colon followed by a space yields an empty token)
You can use a regular expression; try this in an irb console:
string = "From:Ysxrb<abc#gmail.com>\nTo: <xyzn#gmail.com>Subject:"
/From:(.+)\n/.match string
$1
$1 holds the backreference captured by the parentheses in the regular expression.
You could try a regexp, here's an example:
>> s = "From:Ysxrb\nTo: Subject: xyzabc\nDate: Tue, 19 Jun 2012 03:26:56 -0700\nMessage-ID: <9D.A1.02635.ABB40EF4#ecout1>"
=> "From:Ysxrb\nTo: Subject: xyzabc\nDate: Tue, 19 Jun 2012 03:26:56 -0700\nMessage-ID: <9D.A1.02635.ABB40EF4#ecout1>"
>> m, w1, w2 = s.match(/^From:(\w*)\W+.*Subject: (\w*)/).to_a
=> ["From:Ysxrb\nTo: Subject: xyzabc", "Ysxrb", "xyzabc"]
>> w1
=> "Ysxrb"
>> w2
=> "xyzabc"
To find a good regexp for your requirements, you may use Rubular, a Ruby regular expression editor.
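The regexp approach can be generalised into a small helper. This is a sketch under assumptions: the method name `word_after` is hypothetical, and it treats a "word" as a run of `\w` characters that follows the keyword, optionally after a colon and whitespace.

```ruby
# Hypothetical helper: returns the first word that follows `word`
# (optionally separated by a colon and whitespace), or nil if not found.
def word_after(text, word)
  # Regexp.escape guards against special characters in the keyword.
  text[/\b#{Regexp.escape(word)}\b:?\s*(\w+)/, 1]
end

s = "From:Ysxrb\nTo: Subject: xyzabc\nDate: Tue, 19 Jun 2012 03:26:56 -0700"

sender  = word_after(s, "From")
subject = word_after(s, "Subject")
missing = word_after(s, "Cc")
```

For real email text, a dedicated mail parser would be more robust than a regexp, since headers can be folded across lines or appear in any order.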
