I need to parse a field from a CSV column which is a string:
TXDAT - SnpRespData_SC or SnpRespData_SC_PD (7)
I need to extract:
type_0 = SnpRespData
resp_0 = SC
pd_0 = 0
type_1 = SnpRespData
resp_1 = SC
pd_1 = 1
from this string. I want to pass the whole string to a function and be able to return these six values.
The string could be any of the following:
a) TXDAT - SnpRespData_SC_PD
b) TXRSP - SnpResp_SC
c) TXDAT - SnpRespData_SC or SnpRespData_SC_PD (7)
d) TXRSP - SnpResp_UC or TXDAT - SnpRespData_UC_PD (7)
So I created a function which receives this string and returns the following:
def map_rxsnp_transaction(rxsnp_transaction)
  tx_dat = []
  tx_rsp = []
  case rxsnp_transaction
  when (/SnpRespData.*SnpRespData/)
    tx_dat = rxsnp_transaction.split(/or/)
    (tx_dat_0, dat_0_resp, dat_0_pd) = tx_dat[0].split(/_/)
    (tx_dat_1, dat_1_resp, dat_1_pd) = tx_dat[1].split(/_/)
    return [tx_dat_0, dat_0_resp, dat_0_pd, tx_dat_1, dat_1_resp, dat_1_pd]
  when (/SnpResp.*SnpRespData/)
    (tx_rsp, tx_dat) = rxsnp_transaction.split(/or/)
    (tx_rsp, rsp_resp, rsp_pd) = tx_rsp.split(/_/)
    (tx_dat, dat_resp, dat_pd) = tx_dat.split(/_/)
    return [tx_rsp, rsp_resp, rsp_pd, tx_dat, dat_resp, dat_pd]
  when (/SnpRespData_{1}/)
    return rxsnp_transaction.split(/_/)
  when (/SnpResp{1}/)
    return rxsnp_transaction.split(/_/)
  end
end
Function call:
(tx_rsp[0],tx_rsp[1],tx_rsp[2],tx_rsp[3],tx_rsp[4],tx_rsp[5]) = map_rxsnp_transaction table_col[5]
Just wondering if I can optimize this code better...don't like the way it
I assume that you are looking for a specific string, namely, "SnpRespData". If so, you could do this:
str = "TXDAT - SnpRespData_SC or SnpRespData_SC_PD (7)"
f, l = str.scan(/SnpRespData\w+/).sort
#=> ["SnpRespData_SC", "SnpRespData_SC_PD"]
type_0, resp_0, pd_0 = f.split('_') << 0
#=> ["SnpRespData", "SC", 0]
type_1, resp_1, pd_1 = l.split('_').first(2) << 1
#=> ["SnpRespData", "SC", 1]
type_0 #=> "SnpRespData"
resp_0 #=> "SC"
pd_0 #=> 0
type_1 #=> "SnpRespData"
resp_1 #=> "SC"
pd_1 #=> 1
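To cover all four input shapes (a) to (d) in one function, one option is a single scan with capture groups. This is a minimal sketch, not the code from the answer above: the regex assumes the response state is a run of uppercase letters and that PD only ever appears as a trailing _PD. When the string holds only one response, the last three values come back as nil.

```ruby
# Returns [type_0, resp_0, pd_0, type_1, resp_1, pd_1].
# Assumption: every response looks like SnpResp_XX, SnpRespData_XX or SnpRespData_XX_PD.
def map_rxsnp_transaction(text)
  triples = text.scan(/(SnpResp(?:Data)?)_([A-Z]+)(_PD)?/).map do |type, resp, pd|
    [type, resp, pd ? 1 : 0]
  end
  # Pad with an empty triple so the caller always receives six values.
  triples << [nil, nil, nil] while triples.size < 2
  triples.first(2).flatten
end

map_rxsnp_transaction('TXDAT - SnpRespData_SC or SnpRespData_SC_PD (7)')
#=> ["SnpRespData", "SC", 0, "SnpRespData", "SC", 1]
map_rxsnp_transaction('TXRSP - SnpResp_UC or TXDAT - SnpRespData_UC_PD (7)')
#=> ["SnpResp", "UC", 0, "SnpRespData", "UC", 1]
map_rxsnp_transaction('TXRSP - SnpResp_SC')
#=> ["SnpResp", "SC", 0, nil, nil, nil]
```

Because the regex looks only at the response names, the TXDAT/TXRSP prefixes and the trailing "(7)" are ignored rather than split on.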
I import lines from a file. Each line contains a quoted string as one of its fields:
DP,0,"021",257
DP,1,"022",257
DP,2,"023",513
DP,3,"024",513
DP,4,"025",1025
DP,5,"026",1025
DP,6,"081",257
DP,7,"082",257
DP,8,"083",513
DP,9,"084",513
DP,10,"085",1025
DP,11,"086",1025
DP,12,"087",1025
DP,13,"091",257
DP,14,"092",513
DP,15,"093",1025
IS,0,"FIX",0
IS,1,"KARIN02",0
IS,2,"KARUIT02",0
IS,3,"KARIN02HOV",0
IS,4,"KARUIT02HOV",0
IS,5,"KARIN08",0
IS,6,"KARUIT08",0
IS,7,"KARIN08HOV",0
IS,8,"KARUIT08HOV",0
IS,9,"KARIN09",0
IS,10,"KARUIT09",0
IS,11,"KARIN09HOV",0
IS,12,"KARUIT09HOV",0
IS,13,"KARIN10",0
IS,14,"KARUIT10",0
IS,15,"KARIN10HOV",0
I get the following Objects (if DP) :
index - parts1 (int)
name - parts2 (string)
ref - parts3 (int)
I tried using a regex to strip the escaped quotes from the lines, but to no effect:
@name_to_ID = {}
kruising = 2007
File.open(cfgFile).each{|line|
  parts = line.split(",")
  if parts[0]=="DP"
    index = parts[1].to_i
    hex = index.to_s(16).upcase.rjust(2, '0')
    cname = parts[2].to_s
    tname = cname.gsub('\\"','')
    p "cname= #{cname} (#{cname.length})"
    p "tname= #{tname} (#{tname.length})"
    p cname == tname
    @name_to_ID[tname] = kruising.to_s + "-" + hex.to_s
  end
}
teststring = "021"
p @name_to_ID[teststring]
> "021" (5)
> "021" (5)
> true
> nil
The problem came to light when looking up the hash with another string of length 3:
hash[key] finds nothing, because the stored key "021" (length 5, quotes included) is not equal to the string 021 (length 3).
Is there a method that actually removes the characters I need removed?
EDIT: I used
cname.each_char{|c|
p c
}
> "\""
> "0"
> "2"
> "1"
> "\""
EDIT: requested outcome update:
# Current output:
#name_to_ID["021"] = 2007-00 "021".length = 5
#name_to_ID["022"] = 2007-01 "022".length = 5
#name_to_ID["081"] = 2007-06 "081".length = 5
#name_to_ID["082"] = 2007-07 "082".length = 5
#name_to_ID["091"] = 2007-0D "091".length = 5
#name_to_ID["101"] = 2007-10 "101".length = 5
# -------------
# Expected output:
#name_to_ID["021"] = 2007-00 "021".length = 3
#name_to_ID["022"] = 2007-01 "022".length = 3
#name_to_ID["081"] = 2007-06 "081".length = 3
#name_to_ID["082"] = 2007-07 "082".length = 3
#name_to_ID["091"] = 2007-0D "091".length = 3
#name_to_ID["101"] = 2007-10 "101".length = 3
Your problem is that you don't know exactly which characters are in your string. What gets printed may not match the characters that are actually stored.
Try parts[2].to_s.bytes to check the exact byte values of the unexpected characters. For example:
> "asd".bytes
=> [205, 184, 97, 115, 100]
Alternatively, you can delete the first and the last characters, if you are sure that every part of the string has the same format:
cname = parts[2].to_s[1..-2]
Or, if you know that the string will never legitimately contain special characters, you can remove all of them:
cname = parts[2].to_s.gsub(/[^0-9A-Za-z]/, '')
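As for why the gsub in the question had no effect: in a single-quoted Ruby string, '\\"' is two characters (a backslash followed by a double quote), while the field contains only the double quote, so nothing matches. A minimal sketch of a fix using String#delete follows; the helper names unquote and build_name_to_id are made up for illustration, and the input lines are taken from the question.

```ruby
# Strip the double quotes from a raw field such as "\"021\"".
def unquote(field)
  field.delete('"')
end

# Build the name => ID map from the DP lines only.
# `kruising` is the numeric prefix used in the question.
def build_name_to_id(lines, kruising)
  lines.each_with_object({}) do |line, map|
    parts = line.chomp.split(',')
    next unless parts[0] == 'DP'

    hex = parts[1].to_i.to_s(16).upcase.rjust(2, '0')
    map[unquote(parts[2])] = "#{kruising}-#{hex}"
  end
end

map = build_name_to_id(['DP,0,"021",257', 'DP,13,"091",257', 'IS,0,"FIX",0'], 2007)
map['021']             # => "2007-00"
map['091']             # => "2007-0D"
map.keys.first.length  # => 3
```

Unlike the [1..-2] slice, delete does not depend on the quotes sitting at the first and last positions, and unlike the character-class gsub it leaves any other punctuation in the name untouched.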
I have this structure:
$ArrayX = [8349310431,8349314513,......]
$ArrayY = [667984788,667987788,......]
$ArrayZ = [148507632380,153294624079,.....]
$range_map = $ArrayX.zip([$ArrayY.map(&:to_i),
$ArrayZ.map(&:to_i)].transpose).sort
puts $range_map
# {8349310431=>[667984788, 148507632380],
#  8349314513=>[667987788, 153294624079]}
I need each key to be compared with the rest of the keys, and if the difference between two keys is lower than 100, that key should be printed.
I corrected your code to match what you need and then solved the rest:
$ArrayX = [8349310431,8349314513]
$ArrayY = [667984788,667987788]
$ArrayZ = [148507632380,153294624079]
$range_map = $ArrayX.zip([$ArrayY.map(&:to_i), $ArrayZ.map(&:to_i)].transpose).sort
$ArrayX = [8349310431,8349314513]
=> [8349310431, 8349314513]
$ArrayY = [667984788,667987788]
=> [667984788, 667987788]
$ArrayZ = [148507632380,153294624079]
=> [148507632380, 153294624079]
$range_map = Hash[$ArrayX.zip([$ArrayY.map(&:to_i), $ArrayZ.map(&:to_i)].transpose).sort]
=> {8349310431=>[667984788, 148507632380], 8349314513=>[667987788, 153294624079]}
keys = $range_map.keys
valid_keys = keys.select { |k| keys.detect { |x| (x-k).abs > 100 } }
$range_map.slice(*valid_keys)
If a particular key differs by more than 100 from at least one of the other keys, it counts as valid for filtering.
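Note that the question asks for keys whose difference is lower than 100, while the code above tests for a difference greater than 100. A minimal sketch of the "lower than 100" reading follows, assuming a key qualifies when at least one other key lies within that gap; the method name keys_near_another and the sample data are made up for illustration.

```ruby
# Keep only the entries whose key lies within max_gap of at least one other key.
def keys_near_another(hash, max_gap = 100)
  keys = hash.keys
  near = keys.select do |k|
    # Skip the key itself, otherwise every key would match with a gap of 0.
    keys.any? { |other| other != k && (other - k).abs < max_gap }
  end
  hash.select { |k, _| near.include?(k) }
end

sample = { 100 => [1, 2], 150 => [3, 4], 1000 => [5, 6] }
keys_near_another(sample)
# => {100=>[1, 2], 150=>[3, 4]}
```

With the two keys from the question (8349310431 and 8349314513, which are 4082 apart) this returns an empty hash.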
I'm using the google API ruby client and I want to implement some more complex analytics queries such as suggested in this document
https://developers.google.com/analytics/devguides/reporting/core/v3/common-queries
This document suggests that metrics can be supplied as a comma delimited string of multiple metrics but the API client only accepts an expression.
How can I query on multiple metrics in a single query? The ruby client appears only to accept an expression which generally consists of a single metric such as sessions or pageviews like this:
metric = Google::Apis::AnalyticsreportingV4::Metric.new(expression: 'ga:sessions')
If I remove "expression" and enter a list of metrics I just get an error.
Invalid value 'ga:sessions;ga:pageviews' for metric parameter.
Here is my solution, together with a generic method for reporting Google Analytics data:
This answer should be read in conjunction with https://developers.google.com/drive/v3/web/quickstart/ruby
analytics = Google::Apis::AnalyticsreportingV4::AnalyticsReportingService.new
analytics.client_options.application_name = APPLICATION_NAME
analytics.authorization = authorize
def get_analytics_data( analytics,
                        view_id,
                        start_date: (Date.today + 1 - Date.today.wday) - 6,
                        end_date: (Date.today + 1 - Date.today.wday),
                        metrics: ['ga:sessions'],
                        dimensions: [],
                        order_bys: [],
                        segments: nil, # array of hashes
                        filter: nil,
                        page_size: nil )
  get_reports_request_object = Google::Apis::AnalyticsreportingV4::GetReportsRequest.new
  report_request_object = Google::Apis::AnalyticsreportingV4::ReportRequest.new
  report_request_object.view_id = view_id
  analytics_date_range_object = Google::Apis::AnalyticsreportingV4::DateRange.new
  analytics_date_range_object.start_date = start_date
  analytics_date_range_object.end_date = end_date
  report_request_object.date_ranges = [analytics_date_range_object]
  # report_request_metrics = []
  # One Metric object per expression: this is how several metrics go into one query.
  report_request_object.metrics = []
  metrics.each { |metric|
    analytics_metric_object = Google::Apis::AnalyticsreportingV4::Metric.new
    analytics_metric_object.expression = metric
    report_request_object.metrics.push(analytics_metric_object) }
  # report_request_object.metrics = report_request_metrics
  unless dimensions.empty?
    report_request_object.dimensions = []
    dimensions.each { |dimension|
      analytics_dimension_object = Google::Apis::AnalyticsreportingV4::Dimension.new
      analytics_dimension_object.name = dimension
      report_request_object.dimensions.push(analytics_dimension_object) }
  end
  unless segments.nil?
    report_request_object.segments = []
    # Add the ga:segment dimension alongside the segments.
    # The dimensions array may still be unset if no dimensions were passed in.
    analytics_dimension_object = Google::Apis::AnalyticsreportingV4::Dimension.new
    analytics_dimension_object.name = 'ga:segment'
    report_request_object.dimensions ||= []
    report_request_object.dimensions.push(analytics_dimension_object)
    segments.each { |segment|
      # Build a fresh object chain per segment so that segments do not share state.
      analytics_segment_metric_filter_object = Google::Apis::AnalyticsreportingV4::SegmentMetricFilter.new
      analytics_segment_metric_filter_object.metric_name = segment[:metric_name]
      analytics_segment_metric_filter_object.comparison_value = segment[:comparison_value]
      analytics_segment_metric_filter_object.operator = segment[:operator]
      analytics_segment_filter_clause_object = Google::Apis::AnalyticsreportingV4::SegmentFilterClause.new
      analytics_segment_filter_clause_object.metric_filter = analytics_segment_metric_filter_object
      analytics_or_filters_for_segment_object = Google::Apis::AnalyticsreportingV4::OrFiltersForSegment.new
      analytics_or_filters_for_segment_object.segment_filter_clauses = [analytics_segment_filter_clause_object]
      analytics_simple_segment_object = Google::Apis::AnalyticsreportingV4::SimpleSegment.new
      analytics_simple_segment_object.or_filters_for_segment = [analytics_or_filters_for_segment_object]
      analytics_segment_filter_object = Google::Apis::AnalyticsreportingV4::SegmentFilter.new
      analytics_segment_filter_object.simple_segment = analytics_simple_segment_object
      analytics_segment_definition_object = Google::Apis::AnalyticsreportingV4::SegmentDefinition.new
      analytics_segment_definition_object.segment_filters = [analytics_segment_filter_object]
      analytics_dynamic_segment_object = Google::Apis::AnalyticsreportingV4::DynamicSegment.new
      analytics_dynamic_segment_object.name = segment[:name]
      analytics_dynamic_segment_object.session_segment = analytics_segment_definition_object
      analytics_segment_object = Google::Apis::AnalyticsreportingV4::Segment.new
      analytics_segment_object.dynamic_segment = analytics_dynamic_segment_object
      report_request_object.segments.push(analytics_segment_object) }
  end
  unless order_bys.empty?
    report_request_object.order_bys = []
    order_bys.each { |orderby|
      analytics_orderby_object = Google::Apis::AnalyticsreportingV4::OrderBy.new
      analytics_orderby_object.field_name = orderby
      analytics_orderby_object.sort_order = 'DESCENDING'
      report_request_object.order_bys.push(analytics_orderby_object) }
  end
  unless filter.nil?
    report_request_object.filters_expression = filter
  end
  unless page_size.nil?
    report_request_object.page_size = page_size
  end
  get_reports_request_object.report_requests = [report_request_object]
  response = analytics.batch_get_reports(get_reports_request_object)
end
If using dimensions, you can report data like this:
response = get_analytics_data(analytics, VIEW_ID, metrics: ['ga:pageviews'], dimensions: ['ga:pagePath'], order_bys: ['ga:pageviews'], page_size: 25)
response.reports.first.data.rows.each do |row|
  puts row.dimensions
  puts row.metrics.first.values.first.to_i
  puts
end
I have these Syslog messages:
N 4000000 PROD 15307 23:58:12.13 JOB78035 00000000 $HASP395 GGIVJS27 ENDED\r
NI0000000 PROD 15307 23:58:13.41 STC81508 00000200 $A J78036 /* CA-JOBTRAC JOB RELEASE */\r
I would like to parse these messages into various fields in a Hash, e.g.:
event['recordtype'] #=> "N"
event['routingcode'] #=> "4000000"
event['systemname'] #=> "PROD"
event['datetime'] #=> "15307 23:58:12.13"
event['jobid'] #=> "JOB78035"
event['flag'] #=> "00000000"
event['messageid'] #=> "$HASP395"
event['logmessage'] #=> "$HASP395 GGIVJS27 ENDED\r"
This is the code I have currently:
message = event["message"];
if message.to_s != "" then
  if message[2] == " " then
    array = message.split(%Q[ ]);
    event[%q[recordtype]] = array[0];
    event[%q[routingcode]] = array[1];
    event[%q[systemname]] = array[2];
    event[%q[datetime]] = array[3] + " " +array[4];
    event[%q[jobid]] = message[38,8];
    event[%q[flags]] = message[47,8];
    event[%q[messageid]] = message[57,8];
    event[%q[logmessage]] = message[56..-1];
  else
    array = message.split(%Q[ ]);
    event[%q[recordtype]] = array[0][0,2];
    event[%q[routingcode]] = array[0][2..-1];
    event[%q[systemname]] = array[1];
    event[%q[datetime]] = array[2] + " "+array[3];
    event[%q[jobid]] = message[38,8];
    event[%q[flags]] = message[47,8];
    event[%q[messageid]] = message[57,8];
    event[%q[logmessage]] = message[56..-1];
  end
end
I'm looking to improve the above code. I think I could use a regular expression, but I don't know how to approach it.
You can't use split(' ') or a default split to process your fields because you are dealing with columnar data that has fields that have no whitespace between them, resulting in your array being off. Instead, you have to pick apart each record by columns.
There are many ways to do that but the simplest and probably fastest, is indexing into a string and grabbing n characters:
'foo'[0, 1] # => "f"
'foo'[1, 2] # => "oo"
The first means "starting at index 0 in the string, grab one character." The second means "starting at index 1 in the string, grab two characters."
Alternately, you could tell Ruby to extract by ranges:
'foo'[0 .. 0] # => "f"
'foo'[1 .. 2] # => "oo"
These are documented in the String class.
This makes writing code that's easily understood:
record_type = message[ 0 .. 1 ].rstrip
routing_code = message[ 2 .. 8 ]
system_name = message[ 10 .. 17 ]
Once you have your fields captured add them to a hash:
{
'recordtype' => record_type,
'routingcode' => routing_code,
'systemname' => system_name,
'datetime' => date_time,
'jobid' => job_id,
'flags' => flags,
'messageid' => message_id,
'logmessage' => log_message,
}
While you could use a regular expression there's not much gained using one, it's just another way of doing it. If you were picking data out of free-form text it'd be more useful, but in columnar data it tends to result in visual noise that makes maintenance more difficult. I'd recommend simply determining your columns then cutting the data you need based on those from each line.
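The column-slicing approach can be sketched end to end as follows. The column table is an assumption: the first three ranges come from the snippet above, the jobid, flags, messageid and logmessage offsets come from the code in the question, and the datetime range is inferred to fill the gap between them. The sample record is rebuilt with explicit padding because the whitespace in the messages shown in the question appears to have been collapsed. The names FIELDS and parse_syslog are made up for this example.

```ruby
# Assumed column layout (0-based, inclusive ranges); adjust to the real record format.
FIELDS = {
  'recordtype'  => 0..1,
  'routingcode' => 2..8,
  'systemname'  => 10..17,
  'datetime'    => 19..35,
  'jobid'       => 38..45,
  'flags'       => 47..54,
  'messageid'   => 57..64,
  'logmessage'  => 56..-1
}.freeze

# Cut every field out of the record by column and trim the padding.
# Note that strip also removes a trailing "\r".
def parse_syslog(message)
  FIELDS.each_with_object({}) do |(name, range), event|
    event[name] = message[range].to_s.strip
  end
end

# Hypothetical fixed-width record built to match the assumed layout.
sample = 'N ' + '4000000' + ' ' + 'PROD'.ljust(8) + ' ' + '15307 23:58:12.13' +
         '  ' + 'JOB78035' + ' ' + '00000000' + '  ' + '$HASP395 GGIVJS27 ENDED'

event = parse_syslog(sample)
event['recordtype']  # => "N"
event['systemname']  # => "PROD"
event['jobid']       # => "JOB78035"
event['logmessage']  # => "$HASP395 GGIVJS27 ENDED"
```

Keeping the offsets in one table means a layout change touches a single place, and the same code handles records where the record type fills both columns (for example "NI") without a separate branch.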
positions = {
--table 1
[1] = {pos = {fromPosition = {x=1809, y=317, z=8},toPosition = {x=1818, y=331, z=8}}, m = {"100 monster"}},
--table 2
[2] = {pos = {fromPosition = {x=1809, y=317, z=8},toPosition = {x=1818, y=331, z=8}}, m = {"100 monster"}},
-- table3
[3] = {pos = {fromPosition = {x=1809, y=317, z=8},toPosition = {x=1818, y=331, z=8}}, m = {"100 monster"}}
}
tb = positions[?]--what need place here?
for _,x in pairs(tb.m) do --function
  for s = 1, tonumber(x:match("%d+")) do
    pos = {x = math.random(tb.pos.fromPosition.x, tb.pos.toPosition.x), y = math.random(tb.pos.fromPosition.y, tb.pos.toPosition.y), z = tb.pos.fromPosition.z}
    doCreateMonster(x:match("%s(.+)"), pos)
  end
end
Here is the problem: I use tb = positions[1], which only handles one of the tables inside the "positions" table. How can I apply this function to every table in it?
I don't know Lua very well but you could loop over the table:
for i = 0, table.getn(positions), 1 do
tb = positions[i]
...
end
Sources :
http://lua.gts-stolberg.de/en/schleifen.php and http://www.lua.org/pil/19.1.html
You need to iterate over positions with a numerical for.
Note that, unlike Antoine Lassauzay's answer, the loop starts at 1 and not 0, and uses the # operator instead of table.getn (deprecated function in Lua 5.1, removed in Lua 5.2).
for i=1,#positions do
tb = positions[i]
...
end
Use the pairs() built-in. There isn't any reason to use a numeric for loop here.
for index, position in pairs(positions) do
tb = positions[index]
-- tb is now exactly the same value as variable 'position'
end