How to sort given array list? - ruby

list = ["HM00", "HM01", "HM010", "HM011", "HM012", "HM013", "HM014", "HM015", "HM016", "HM017", "HM018", "HM019", "HM02", "HM020", "HM021", "HM022", "HM023", "HM024", "HM025", "HM026", "HM027", "HM028", "HM029", "HM03", "HM030", "HM031", "HM032", "HM033", "HM034", "HM035", "HM036", "HM037", "HM038", "HM039", "HM04", "HM040", "HM041", "HM042", "HM043", "HM044", "HM045", "HM046", "HM047", "HM05", "HM06", "HM07", "HM08", "HM09"]
I want to display the results as ["HM00", "HM01", "HM02", ...], but the sort method gives the results below:
["HM00", "HM01", "HM010", "HM011", "HM012", "HM013", "HM014", "HM015", "HM016", "HM017", "HM018", "HM019", "HM02"]

If every element ends with a number, sort by that number:
list.sort_by { |item| item.scan(/\d*$/).first.to_i }
This matches the digits at the end of the string, takes the first match (scan returns an array of results), and converts it to an integer.
A simpler version:
list.sort_by { |item| item[/\d*$/].to_i }
String#[] with a regex already returns the first match.
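As a quick check, here is the one-liner applied to a shortened copy of the list. The second part is a hedged variant that compares the letter prefix before the number; mixed prefixes are an assumption, since the question only shows HM.

```ruby
# Shortened sample of the list from the question.
list = ["HM00", "HM01", "HM010", "HM011", "HM02", "HM09"]

# Sort by the integer value of the trailing digits.
by_number = list.sort_by { |item| item[/\d*$/].to_i }
p by_number

# Assumed variant: compare the non-digit prefix first, then the number,
# for lists that mix several prefixes.
mixed = ["HM010", "AB10", "HM02", "AB2"]
by_prefix_then_number = mixed.sort_by { |item| [item[/\A\D*/], item[/\d+\z/].to_i] }
p by_prefix_then_number
```

Array keys are compared element by element, so the number only matters when the prefixes are equal.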

There is a more general solution that works with most strings that contain groups of numbers:
number = /([+-]{0,1}\d+)/
strings = [ '2', '-2', '10', '0010', '010', 'a', '10a', '010a', '0010a', 'b10', 'b2', 'a1b10c20', 'a1b2.2c2' ]
p strings.sort_by { |item| [item.split(number).each_slice(2).map {
|x| x.size == 1 ? [ x[0], '0' ] : [ x[0], x[1] ] }].map {|y| ret = y.inject({r:[],x:[]}) { |s, z| s[:r].push [ z[0], z[1].to_r]; s[:x].push z[1].size.to_s; s }; ret[:r] + ret[:x] }.flatten
}
You can adjust number to match the types of numbers you want to use: integers, floating point, etc.
There is some extra code to sort equal numbers by length so that '10' comes before '010'.
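For comparison, a shorter natural-sort sketch. It is not the code above: it handles unsigned integers only, with no signs or decimals, and natural_key is a hypothetical helper name. Equal numbers are ordered by length, so '10' still comes before '010'.

```ruby
# Build a comparable key: digit runs become [0, value, length],
# everything else becomes [1, text]. The leading tag keeps integers
# and strings from ever being compared with each other.
def natural_key(string)
  string.scan(/\d+|\D+/).map do |chunk|
    chunk.match?(/\A\d+\z/) ? [0, chunk.to_i, chunk.size] : [1, chunk]
  end
end

strings = %w[b10 b2 a 10 2 010]
sorted = strings.sort_by { |s| natural_key(s) }
p sorted
```

With this key, numeric chunks sort before text chunks, and 'b2' comes before 'b10'.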

Related

How to do smart sort in groovy? [duplicate]

I have a list of version numbers like,
Versions = [0.0.10, 0.0.11, 0.0.13, 0.0.14, 0.0.15, 0.0.16, 0.0.17, 0.0.18, 0.0.19, 0.0.20, 0.0.21, 0.0.22, 0.0.23, 0.0.24, 0.0.25, 0.0.26, 0.0.27, 0.0.28, 0.0.29, 0.0.3, 0.0.30, 0.0.33, 0.0.34, 0.0.35, 0.0.36, 0.0.37, 0.0.38, 0.0.39, 0.0.4, 0.0.41, 0.0.42, 0.0.43, 0.0.44, 0.0.45, 0.0.46, 0.0.47, 0.0.48, 0.0.49, 0.0.5, 0.0.5-delivery.5, 0.0.50, 0.0.51, 0.0.52, 0.0.53, 0.0.54, 0.0.55, 0.0.56, 0.0.57, 0.0.58, 0.0.59, 0.0.6, 0.0.60, 0.0.61, 0.0.62, 0.0.63, 0.0.64, 0.0.7, 0.0.8, 0.0.9]
I need to get the latest version (0.0.64), but neither Versions.sort() nor Collections.max(Versions) works for me.
So I developed the function below:
def mostRecentVersion(def versions) {
def lastversion = "0.0.0"
for (def items : versions) {
def version = items.tokenize('-')[0]
def ver = version.tokenize('.')
def lastver = lastversion.tokenize('.')
if (lastver[0].toInteger() < ver[0].toInteger() ){
lastversion = version
}else if(lastver[0].toInteger() == ver[0].toInteger()) {
if (lastver[1].toInteger() < ver[1].toInteger() ){
lastversion = version
}else if(lastver[1].toInteger() == ver[1].toInteger()){
if (lastver[2].toInteger() < ver[2].toInteger() ){
lastversion = version
}
}
}
}
return lastversion }
Is there a better way to do this?
Thank you for your help :)
The idea:
Build a map with a sortable key and the original version as the value, then sort the map by key and take only the values.
To create the sortable key for each value:
split the version into an array of digit and non-digit strings
left-pad each part with 0 to a minimum length of 3 (assuming no number is longer than 3 digits)
join the array into a string
So, for 0.11.222-dev:
1. [ '0', '.', '11', '.', '222', '-dev' ]
2. [ '000', '00.', '011', '00.', '222', '-dev' ]
3. '00000.01100.222-dev'
The code:
def mostRecentVersion(versions){
return versions.collectEntries{
[(it=~/\d+|\D+/).findAll().collect{it.padLeft(3,'0')}.join(),it]
}.sort().values()[-1]
}
//test cases:
def fullVersions = ['0.0.10', '0.0.11', '0.0.13', '0.0.14', '0.0.15', '0.0.16',
'0.0.17', '0.0.18', '0.0.19', '0.0.20', '0.0.21', '0.0.22', '0.0.23', '0.0.24',
'0.0.25', '0.0.26', '0.0.27', '0.0.28', '0.0.29', '0.0.3', '0.0.30', '0.0.33',
'0.0.34', '0.0.35', '0.0.36', '0.0.37', '0.0.38', '0.0.39', '0.0.4', '0.0.41',
'0.0.42', '0.0.43', '0.0.44', '0.0.45', '0.0.46', '0.0.47', '0.0.48', '0.0.49',
'0.0.5', '0.0.5-delivery.5', '0.0.50', '0.0.51', '0.0.52', '0.0.53', '0.0.54',
'0.0.55', '0.0.56', '0.0.57', '0.0.58', '0.0.59', '0.0.6', '0.0.60', '0.0.61',
'0.0.62', '0.0.63', '0.0.64', '0.0.7', '0.0.8', '0.0.9']
assert mostRecentVersion(fullVersions) == '0.0.64'
assert mostRecentVersion(['0.0.5-delivery.5', '0.0.3', '0.0.5']) == '0.0.5-delivery.5'
assert mostRecentVersion(['0.0.5.5', '0.0.5-delivery.5', '0.0.5']) == '0.0.5.5'
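For readers following the Ruby threads on this page, the same padded-key idea can be sketched in Ruby. This is a translation for illustration only, not part of the Groovy answer. The method names sortable_key and most_recent_version are hypothetical, and it assumes no number is longer than 3 digits.

```ruby
# Hypothetical helper: pad every digit or non-digit run to width 3
# so that plain string comparison orders the numbers correctly.
# Assumes no number is longer than 3 digits.
def sortable_key(version)
  version.scan(/\d+|\D+/).map { |part| part.rjust(3, "0") }.join
end

# Pick the version whose padded key compares highest.
def most_recent_version(versions)
  versions.max_by { |version| sortable_key(version) }
end

p sortable_key("0.11.222-dev")
p most_recent_version(["0.0.10", "0.0.9", "0.0.64", "0.0.7"])
```

Using max_by avoids building and sorting an intermediate map when only the latest version is needed.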
I believe this will work. It also keeps the original version strings around, in case 0.5.5-devel.5 is the latest. It relies on the fact that Groovy uses a LinkedHashMap for the sorted map, so the order is preserved :-)
def mostRecentVersion(def versions) {
versions.collectEntries {
[it, it.split(/\./).collect { (it =~ /([0-9]+).*/)[0][1] }*.toInteger()]
}.sort { a, b ->
[a.value, b.value].transpose().findResult { x, y -> x <=> y ?: null } ?:
a.value.size() <=> b.value.size() ?:
a.key <=> b.key
}.keySet()[-1]
}
def fullVersions = ['0.0.10', '0.0.11', '0.0.13', '0.0.14', '0.0.15', '0.0.16', '0.0.17', '0.0.18', '0.0.19', '0.0.20', '0.0.21', '0.0.22', '0.0.23', '0.0.24', '0.0.25', '0.0.26', '0.0.27', '0.0.28', '0.0.29', '0.0.3', '0.0.30', '0.0.33', '0.0.34', '0.0.35', '0.0.36', '0.0.37', '0.0.38', '0.0.39', '0.0.4', '0.0.41', '0.0.42', '0.0.43', '0.0.44', '0.0.45', '0.0.46', '0.0.47', '0.0.48', '0.0.49', '0.0.5', '0.0.5-delivery.5', '0.0.50', '0.0.51', '0.0.52', '0.0.53', '0.0.54', '0.0.55', '0.0.56', '0.0.57', '0.0.58', '0.0.59', '0.0.6', '0.0.60', '0.0.61', '0.0.62', '0.0.63', '0.0.64', '0.0.7', '0.0.8', '0.0.9']
assert mostRecentVersion(fullVersions) == '0.0.64'
assert mostRecentVersion(['0.0.5-delivery.5', '0.0.3', '0.0.5']) == '0.0.5-delivery.5'
assert mostRecentVersion(['0.0.5.5', '0.0.5-delivery.5', '0.0.5']) == '0.0.5.5'
Edit:
Made a change so that 0.5.5.5 > 0.5.5-devel.5

Convert and translate a Ruby hash into a human-readable string

I have the following Ruby hash:
{"limit"=>250, "days_ago"=>14, "days_ago_filter"=>"lt", "key"=>3}
I'd like to convert it to a human-readable string and translate some of the values as necessary:
Limit: 250 - Days Ago: 14 - Days Ago Filter: Less than - Key: D♯, E♭,
So lt, in this case, translates to Less than, and 3 for key translates to D♯, E♭.
I'm almost there with this:
variables.map {|k,v| "#{k.split('_').map(&:capitalize).join(' ')}: #{v}"}.join(' - ')
But translating those values is where I'm hitting a snag.
I'd suggest using hashes for mapping out the possible values, e.g.:
days_ago_filter_map = {
"lt" => "Less than",
# ...other cases here...
}
musical_key_map = {
3 => "D♯, E♭",
# ...other cases here...
}
Then you can switch on the key:
variables.map do |key, value|
label = "#{key.split('_').map(&:capitalize).join(' ')}"
formatted_value = case key
when "days_ago_filter" then days_ago_filter_map.fetch(value)
when "key" then musical_key_map.fetch(value)
else value
end
"#{label}: #{formatted_value}"
end.join(' - ')
Note that if anything is missing from your maps, the above code will raise a KeyError. You can pass a default to fetch, e.g.:
days_ago_filter_map.fetch(value, "Unknown filter")
musical_key_map.fetch(value, "No notes found for that key")
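Putting the pieces together, here is a minimal self-contained sketch. It swaps the case statement for a table keyed by field name, which is a design variation rather than the answer's exact code. TRANSLATIONS and describe are hypothetical names, and only the two translations given in the question are filled in.

```ruby
variables = { "limit" => 250, "days_ago" => 14, "days_ago_filter" => "lt", "key" => 3 }

# Per-key translation tables; only the entries shown in the question are filled in.
TRANSLATIONS = {
  "days_ago_filter" => { "lt" => "Less than" },
  "key"             => { 3 => "D♯, E♭" }
}.freeze

def describe(variables)
  variables.map do |key, value|
    label = key.split("_").map(&:capitalize).join(" ")
    # Fall back to the raw value when the key or the value has no translation.
    shown = TRANSLATIONS.fetch(key, {}).fetch(value, value)
    "#{label}: #{shown}"
  end.join(" - ")
end

puts describe(variables)
```

Here fetch falls back to the raw value instead of raising, so an unmapped filter such as "gt" is printed unchanged. Drop the second argument to fetch if a missing translation should fail loudly.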
You can also create YAML files for this kind of mapping:
values_for_replacement = {
"lt" => "Less than",
3 => "D♯, E♭"
}
Then you can try the following (humanize and present? come from ActiveSupport, so this assumes Rails):
variables.map {|k,v|
value_to_be_replaced = values_for_replacement[v]
"#{k.humanize}: #{(value_to_be_replaced.present? ? value_to_be_replaced : v)}"}.join(' - ')

Best way to capture multiple matches

A single text message contains a fixed part that appears once (the item ID) and several repeated lines (the references and the dimensions of each part):
..some random text here..
ID/11000082734
REF/D14-109-0
REF/D14-209-0
REF/D14-219-0
CMT/59-40-25
CMT/38-25-28
CMT/59-40-25
CMT/37-37-20
CMT/40-40-20
CMT/37-37-20
CMT/49-41-31
CMT/44-34-53
I want to parse and store the IdCode, the References, and an array of dimensions.
When I apply REGEX.match(my_text), I get only one occurrence each of REF and CMT:
REGEX = %r{
ID\/(?<IdCode> \d{10})\s
(REF\/(?<ReferenceCode> \w{3}\-\d{3}\-\d)\s)+
(CMT\/(?<Length> \d+)\-(?<Width> \d+)\-(?<Height> \d+)\s)+
}x
The result looks like this:
IdCode: "1100008273"
ReferenceCode: "D14-219-0"
Length: "37"
Width: "37"
Height: "20"
Is there a way to capture multiple occurrences without iterating ?
Suppose your string were:
str = %w| dog
ID/11000082734
REF/D14-109-0
REF/D14-209-0
CMT/49-41-31
CMT/44-34-53
cat
ID/11000082735
REF/D14-109-1
REF/D14-209-1
CMT/49-41-32
CMT/44-34-54
pig |.join("\n")
#=> "dog\nID/11000082734\nREF/D14-109-0\nREF/D14-209-0\nCMT/49-41-31\nCMT/44-34-53\ncat\nID/11000082735\nREF/D14-109-1\nREF/D14-209-1\nCMT/49-41-32\nCMT/44-34-54\npig"
Then you could write:
r = /(ID\/\d{11}) # match string in capture group 1
\n # match newline
((?:REF\/[A-Z]\d{2}-\d{3}-\d\n)+) # match consecutive REF lines in capture group 2
((?:CMT\/\d{2}-\d{2}-\d{2}\n)+) # match consecutive CMT lines in capture group 3
/x # free-spacing regex definition mode
arr = str.scan(r)
#=> [["ID/11000082734", "REF/D14-109-0\nREF/D14-209-0\n",
# "CMT/49-41-31\nCMT/44-34-53\n"],
# ["ID/11000082735", "REF/D14-109-1\nREF/D14-209-1\n",
# "CMT/49-41-32\nCMT/44-34-54\n"]]
This extracts the desired information without iterating.
At this point it may be desirable to convert arr to a more convenient data structure. For example:
arr.map do |a,b,c|
{ :id => a[/\d+/],
:ref => b.split("\n").map { |s| s[4..-1] },
:cmt => c.scan(/(\d{2})-(\d{2})-(\d{2})/).map { |e|
[:length, :width, :height].zip(e.map(&:to_i)).to_h }
}
end
#=> [{ :id=>"11000082734",
# :ref=>["D14-109-0", "D14-209-0"],
# :cmt=>[{ :length=>49, :width=>41, :height=>31 },
# { :length=>44, :width=>34, :height=>53 }
# ]
# },
# { :id=>"11000082735",
# :ref=>["D14-109-1", "D14-209-1"],
# :cmt=>[{ :length=>49, :width=>41, :height=>32 },
# { :length=>44, :width=>34, :height=>54 }
# ]
# }
# ]
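A repeated capture group keeps only its last iteration, which is why match reports a single value for each named group. For a message that holds exactly one record (an assumption), a minimal sketch is to run a separate scan per line type. The variable names and the shortened sample text are illustrative.

```ruby
# Shortened sample message with a single record.
text = <<~TEXT
  ..some random text here..
  ID/11000082734
  REF/D14-109-0
  REF/D14-209-0
  REF/D14-219-0
  CMT/59-40-25
  CMT/38-25-28
  CMT/44-34-53
TEXT

# One ID per message: take capture group 1 of the first match.
id_code = text[%r{^ID/(\d+)}, 1]

# scan returns every match, so repeated lines are all collected.
references = text.scan(%r{^REF/(\S+)}).flatten
dimensions = text.scan(%r{^CMT/(\d+)-(\d+)-(\d+)}).map do |captures|
  %i[length width height].zip(captures.map(&:to_i)).to_h
end

p id_code, references, dimensions
```

If one text can hold several records, the grouped scan shown earlier is the safer choice, because it keeps each record's REF and CMT lines together.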
Try this
(?<IdCode>\d{10,})|REF\/(?<ReferenceCode>\w{3}\-\d{3}\-\d)|CMT\/(?<Length>\d+)\-(?<Width>\d+)\-(?<Height>\d+)
Explanation:
( … ): Capturing group
?: Once or none
\: Escapes a special character
|: Alternation / OR operand
+: One or more
Input
..some random text here..
ID/11000082734
REF/D14-109-0
REF/D14-209-0
REF/D14-219-0
CMT/59-40-25
CMT/38-25-28
CMT/59-40-25
CMT/37-37-20
CMT/40-40-20
CMT/37-37-20
CMT/49-41-31
CMT/44-34-53
Output:
MATCH 1
IdCode [29-40] `11000082734`
MATCH 2
ReferenceCode [45-54] `D14-109-0`
MATCH 3
ReferenceCode [59-68] `D14-209-0`
MATCH 4
ReferenceCode [73-82] `D14-219-0`
MATCH 5
Length [87-89] `59`
Width [90-92] `40`
Height [93-95] `25`
MATCH 6
Length [100-102] `38`
Width [103-105] `25`
Height [106-108] `28`
MATCH 7
Length [113-115] `59`
Width [116-118] `40`
Height [119-121] `25`
MATCH 8
Length [126-128] `37`
Width [129-131] `37`
Height [132-134] `20`
MATCH 9
Length [139-141] `40`
Width [142-144] `40`
Height [145-147] `20`
MATCH 10
Length [152-154] `37`
Width [155-157] `37`
Height [158-160] `20`
MATCH 11
Length [165-167] `49`
Width [168-170] `41`
Height [171-173] `31`
MATCH 12
Length [178-180] `44`
Width [181-183] `34`
Height [184-186] `53`

Parsing a string field

I have these Syslog messages:
N 4000000 PROD 15307 23:58:12.13 JOB78035 00000000 $HASP395 GGIVJS27 ENDED\r
NI0000000 PROD 15307 23:58:13.41 STC81508 00000200 $A J78036 /* CA-JOBTRAC JOB RELEASE */\r
I would like to parse these messages into various fields in a Hash, e.g.:
event['recordtype'] #=> "N"
event['routingcode'] #=> "4000000"
event['systemname'] #=> "PROD"
event['datetime'] #=> "15307 23:58:12.13"
event['jobid'] #=> "JOB78035"
event['flag'] #=> "00000000"
event['messageid'] #=> "$HASP395"
event['logmessage'] #=> "$HASP395 GGIVJS27 ENDED\r"
This is the code I have currently:
message = event["message"];
if message.to_s != "" then
if message[2] == " " then
array = message.split(%Q[ ]);
event[%q[recordtype]] = array[0];
event[%q[routingcode]] = array[1];
event[%q[systemname]] = array[2];
event[%q[datetime]] = array[3] + " " +array[4];
event[%q[jobid]] = message[38,8];
event[%q[flags]] = message[47,8];
event[%q[messageid]] = message[57,8];
event[%q[logmessage]] = message[56..-1];
else
array = message.split(%Q[ ]);
event[%q[recordtype]] = array[0][0,2];
event[%q[routingcode]] = array[0][2..-1];
event[%q[systemname]] = array[1];
event[%q[datetime]] = array[2] + " "+array[3];
event[%q[jobid]] = message[38,8];
event[%q[flags]] = message[47,8];
event[%q[messageid]] = message[57,8];
event[%q[logmessage]] = message[56..-1];
end
end
I'm looking to improve the above code. I think I could use a regular expression, but I don't know how to approach it.
You can't use split(' ') or a default split to process your fields, because you are dealing with columnar data in which some fields have no whitespace between them, and that throws off the positions in your array. Instead, you have to pick apart each record by columns.
There are many ways to do that, but the simplest, and probably the fastest, is indexing into the string and grabbing n characters:
'foo'[0, 1] # => "f"
'foo'[1, 2] # => "oo"
The first means "starting at index 0 in the string, grab one character." The second means "starting at index 1 in the string, grab two characters."
Alternately, you could tell Ruby to extract by ranges:
'foo'[0 .. 0] # => "f"
'foo'[1 .. 2] # => "oo"
These are documented in the String class.
This makes it easy to write code that's readily understood:
record_type = message[ 0 .. 1 ].rstrip
routing_code = message[ 2 .. 8 ]
system_name = message[ 10 .. 17 ]
Once you have your fields captured, add them to a hash:
{
'recordtype' => record_type,
'routingcode' => routing_code,
'systemname' => system_name,
'datetime' => date_time,
'jobid' => job_id,
'flags' => flags,
'messageid' => message_id,
'logmessage' => log_message,
}
While you could use a regular expression, there's not much gained by using one; it's just another way of doing it. If you were picking data out of free-form text it'd be more useful, but with columnar data it tends to produce visual noise that makes maintenance more difficult. I'd recommend simply determining your columns, then cutting the data you need from each line based on those.
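The column approach can be sketched with a table of ranges. Only the three ranges given above are used; the other fields are left out because their offsets are not stated. COLUMNS and parse_columns are hypothetical names, and the sample lines are built with format so the fixed-width padding is explicit instead of guessed.

```ruby
# Column ranges taken from the answer above; the remaining fields are left out
# because their offsets are not given there.
COLUMNS = {
  "recordtype"  => 0..1,
  "routingcode" => 2..8,
  "systemname"  => 10..17
}.freeze

# Slice each column out of the line and trim the padding.
def parse_columns(line, columns = COLUMNS)
  columns.transform_values { |range| line[range].to_s.strip }
end

# Hypothetical fixed-width lines built with format so the padding is explicit.
line_a = format("%-2s%-7s %-8s %s", "N", "4000000", "PROD", "15307 23:58:12.13")
line_b = format("%-2s%-7s %-8s %s", "NI", "0000000", "PROD", "15307 23:58:13.41")

event_a = parse_columns(line_a)
event_b = parse_columns(line_b)
p event_a, event_b
```

Because the record type is cut by position, "N " and "NI" need no special-case branch, unlike the split-based code in the question.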

Sort and take top five values of hash

I have a hash whose keys are unique but whose values are similar. I am trying to retrieve the five pairs with the highest values. For example, from this:
{9=>1, 11=>1, 12=>2, 13=>1, 14=>1, 18=>1, 19=>1, 20=>1, 23=>1, 24=>2, 27=>1, 28=>1, 29=>1, 30=>1, 33=>1, 34=>1, 35=>1, 36=>1, 37=>1, 38=>1, 39=>1, 40=>1, 41=>1, 42=>1, 43=>1, 44=>1, 45=>1, 46=>1, 47=>1, 48=>1, 49=>1, 52=>1, 53=>1, 54=>1, 55=>1, 56=>1, 57=>1, 58=>1, 59=>1, 60=>1, 61=>1, 62=>1, 63=>1, 64=>1, 66=>1, 67=>1, 68=>1, 69=>1, 70=>2, 72=>1}
I would like these values first
=> 12=>2, 24=>2, 70=>2, ???
I am not sure how to do this because there are three instances whose value is 2. How would Ruby decide what the next values are if they are all the same?
I have this solution
#common_locations.max { |a,b| a.last() <=> b.last() }
but this only gives me one instance. How would I collect five?
This would work:
hash = {9=>1, 11=>1, 12=>2, 13=>1, 14=>1, 18=>1, 19=>1, 20=>1, 23=>1, 24=>2, 27=>1, 28=>1, 29=>1, 30=>1, 33=>1, 34=>1, 35=>1, 36=>1, 37=>1, 38=>1, 39=>1, 40=>1, 41=>1, 42=>1, 43=>1, 44=>1, 45=>1, 46=>1, 47=>1, 48=>1, 49=>1, 52=>1, 53=>1, 54=>1, 55=>1, 56=>1, 57=>1, 58=>1, 59=>1, 60=>1, 61=>1, 62=>1, 63=>1, 64=>1, 66=>1, 67=>1, 68=>1, 69=>1, 70=>2, 72=>1}
hash.sort_by { |k, v| v }.reverse.first(5).to_h
#=> {12=>2, 70=>2, 24=>2, 44=>1, 67=>1}
You can replace sort_by { |k, v| v }.reverse with sort_by { |k, v| -v } if your values are numbers.
Note that Array#to_h was introduced in Ruby 2.1; for older versions, you will have to use Hash[hash.sort_by...first(5)] instead.
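On the question of ties: sort_by is not guaranteed to be stable, so the order among equal values can vary. A minimal sketch that makes the choice explicit is to sort on a compound key. Breaking ties by the smaller key is an assumption made here, and the hash is a shortened sample.

```ruby
hash = { 9 => 1, 11 => 1, 12 => 2, 13 => 1, 24 => 2, 70 => 2, 72 => 1 }

# Highest value first; ties are broken by the smaller key.
top_five = hash.sort_by { |key, value| [-value, key] }.first(5).to_h
p top_five
```

Any other tie-breaker works the same way: put it as the second element of the key array.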
