Using a CSV file to insert values using Ruby

I have some sample code I can execute for our Nexpose server and I need to do some mass asset tagging. Here is an example of the code.
nsc = Nexpose::Connection.new('your_nexpose_instance', 'username', 'password', 3780)
nsc.login
criterion = Nexpose::Tag::Criterion.new('IP_RANGE', 'IN', ['ip1', 'ip2'])
criteria = Nexpose::Tag::Criteria.new(criterion)
tag = Nexpose::Tag.new("tagname", Nexpose::Tag::Type::Generic::CUSTOM)
tag.search_criteria = criteria
tag.save(nsc)
I have a file with the following data.
ip1,ip2,tagname
192.168.1.1,192.168.1.255,Workstations
How would I go about running a loop over the CSV and using each row to quickly run the above code? I have no experience with Ruby and tried to follow some examples, but I'm confused at this point.

There's a CSV library in Ruby's standard library that you can use.
Here's a basic example based on your code and data (not tested):
require 'csv'
nsc = Nexpose::Connection.new('your_nexpose_instance', 'username', 'password', 3780)
nsc.login
CSV.foreach("path/to/file.csv", headers: true) do |row|
criterion = Nexpose::Tag::Criterion.new('IP_RANGE', 'IN', [row['ip1'], row['ip2'])
criteria = Nexpose::Tag::Criteria.new(criterion)
tag = Nexpose::Tag.new(row['tagname'], Nexpose::Tag::Type::Generic::CUSTOM)
tag.search_criteria = criteria
tag.save(nsc)
end

I made a directory with input.csv and main.rb
input.csv
ip1,ip2,tagname
192.168.1.1,192.168.1.255,Workstations
main.rb
require "csv"
CSV.foreach("input.csv", headers: true) do |row|
puts "ip1: #{row['ip1']}"
puts "ip2: #{row['ip2']}"
puts "tagname: #{row['tagname']}"
end
The output is:
ip1: 192.168.1.1
ip2: 192.168.1.255
tagname: Workstations
I hope this can help. If you have questions I'm here :)

If you just need to loop through each line of the file and fire that chunk of code for each line, you could do something like this:
require 'net/http'

file = Net::HTTP.get(URI(<whatever_your_file_name_is>))

file.each_line.with_index do |line, index|
  next if index == 0

  split_line = line.chomp.split(',')
  ip1 = split_line[0]
  ip2 = split_line[1]
  tagname = split_line[2]

  nsc = Nexpose::Connection.new('your_nexpose_instance', 'username', 'password', 3780)
  nsc.login

  criterion = Nexpose::Tag::Criterion.new('IP_RANGE', 'IN', [ip1, ip2])
  criteria = Nexpose::Tag::Criteria.new(criterion)
  tag = Nexpose::Tag.new(tagname, Nexpose::Tag::Type::Generic::CUSTOM)
  tag.search_criteria = criteria
  tag.save(nsc)
end
NOTE: This code example assumes that the CSV file is stored remotely, not locally.
ALSO: In case you're wondering, the next if index == 0 is there to skip your header record.
UPDATE
To use this approach for a local file, you can use File.open() instead of Net::HTTP.get(), like so:
file = File.open(<whatever_your_file_name_is>).read
Two things to note:
Make sure you use the fully-qualified name of the file, e.g. ~/folder/folder/filename.csv instead of just filename.csv.
If the files you're going to be loading are enormous, this might not be an ideal approach because it's actually reading the whole file into memory. But considering your file only has 3 columns, you'd have to have an extreme number of rows in the file for this to be an issue.
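If memory did become a concern, one option is to stream the rows instead of reading the whole file at once. Here's a rough, untested sketch using Ruby's CSV library with the same column layout (the path is just an example):
require 'csv'

# Stream the file row by row; only the current row is held in memory.
path = File.expand_path('~/folder/folder/filename.csv')
CSV.foreach(path, headers: true) do |row|
  ip1     = row['ip1']
  ip2     = row['ip2']
  tagname = row['tagname']
  # ... build and save the Nexpose tag here, exactly as in the loop above ...
end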

Related

Rails Not Reading a Specific Column from CSV

I'm trying to read my CSV file in Rails 5.2.3 (Ruby 2.6.3) and it's not reading a specific column.
CSV file:
barcode,hardware,title,price
2900007868390,PS4,title1,300
3499550362923,PS4,title2,1800
3499550370973,Nintendo Switch,title3,5000
and so on...
Code:
csv = CSV.read('path/to/my_csv.csv', headers: true)
csv.each do |row|
  puts "#{row['barcode']}, #{row['hardware']}, #{row['title']}, #{row['price']}"
end
Result:
, PS4, title1, 300
, PS4, title2, 1800
, Nintendo Switch, title3, 5000
and so on...
As you can see above, it's not reading the barcode column for some reason.
I've managed to get the barcode value if I write row[0] instead of row['barcode'].
Any ideas why this is happening?
This was because my CSV file had a BOM (byte order mark)...
https://en.wikipedia.org/wiki/Byte_order_mark
The code below solved the problem:
csv = CSV.read("resources/games/original_#{date}.csv", 'r:BOM|UTF-8', headers: true)
Thank you @dimitry_n for your help!
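An alternative is to strip the BOM at the IO level and hand the IO to CSV yourself. A small sketch (untested, with a placeholder path), assuming the file is UTF-8:
require 'csv'

# 'r:BOM|UTF-8' tells Ruby to discard a leading byte order mark, if present,
# so the first header parses as 'barcode' rather than "\xEF\xBB\xBFbarcode".
File.open('path/to/my_csv.csv', 'r:BOM|UTF-8') do |io|
  CSV.new(io, headers: true).each do |row|
    puts "#{row['barcode']}, #{row['hardware']}, #{row['title']}, #{row['price']}"
  end
end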

How to amend the code to read the query from an external file and save the result to a file

I have some code which looks like this:
require 'net/http'
base = 'www.uniprot.org'
tool = 'mapping'
params = {
  'from' => 'ACC+ID', 'to' => 'P_ENTREZGENEID', 'format' => 'tab',
  'query' => 'A0A0K3AVS5 A0A0K3AVV4 A0A0K3AW32 A0A0K3AWP0'
}
http = Net::HTTP.new base

$stderr.puts "Submitting...\n"

response = http.request_post '/' + tool + '/',
  params.keys.map { |key| key + '=' + params[key] }.join('&')

loc = nil
while response.code == '302'
  loc = response['Location']
  response = http.request_get loc
end

while loc
  wait = response['Retry-After'] or break
  $stderr.puts "Waiting (#{wait})...\n"
  sleep wait.to_i
  response = http.request_get loc
end
response.value # raises http error if not 2xx
puts response.body
which gives me what I need. However, I have two questions:
1. How do I load the query list from an external file instead of hard-coding it in the script? Let's say I save a txt file with all the queries I want to the desktop of a Mac.
2. How do I export the output to a file?
If I have
B2D6P1
G5EC52
B2FDA8-2
B2MZB1
B3CJ34
B3CKG1
B3GWA1
what @tadman showed gives me the answer.
However, if I have the following:
B2D6P1
G5EC52;B2D6P4
B2FDA8-2;B2FDA8
B2MZB1;P18834
B3CJ34
B3CKG1
B3GWA1;Q8I7K5
then the answer looks like this:
B2D6P1 rmd-2
G5EC52 tlf-1
B2D6P4 tlf-1
B2FDA8 smc-3
B2MZB1 col-14
P18834 col-14
B3CJ34 gcn-1
B3CKG1 urm-1
B3GWA1 nono-1
Q8I7K5 nono-1
What I want is: if two entries in a row (separated with ;) lead to the same output, give me only one line; otherwise give me as many lines as there are entries. For the example above, my desired output is:
B2D6P1 rmd-2
G5EC52;B2D6P4 tlf-1
B2FDA8-2;B2FDA8 smc-3
B2MZB1;P18834 col-14
B3CJ34 gcn-1
B3CKG1 urm-1
B3GWA1;Q8I7K5 nono-1
Is this possible?
Reading query data:
query = File.readlines('ids.txt').map(&:chomp).join(' ')
That way you can have them on separate lines, easier to edit, and they're space separated when submitted.
That makes your params look like:
params = {
  'query' => query,
  ...
}
Writing data:
File.open('output.txt', 'w') do |f|
  f.write(response.body)
end
That's all there is to it. If it's a string, or can be converted to a string, you can write it to a file.
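The de-duplication part of the question (collapsing entries whose ;-separated IDs map to the same gene) isn't covered above. Here's a rough sketch of one way to post-process the mapping result, assuming each line of response.body is 'accession<TAB>gene' as in the example output (IDs that didn't map, like B2FDA8-2, won't appear):
# Group the returned accessions by gene name, then join each group with ';'.
pairs   = response.body.each_line.map { |line| line.chomp.split("\t") }
grouped = pairs.group_by { |_accession, gene| gene }

File.open('output.txt', 'w') do |f|
  grouped.each do |gene, rows|
    f.puts "#{rows.map(&:first).join(';')}\t#{gene}"
  end
end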

How to pull data from tags based on other tags

I have the following example document:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<n1:Form109495CTransmittalUpstream xmlns="urn:us:gov:treasury:irs:ext:aca:air:7.0" xmlns:irs="urn:us:gov:treasury:irs:common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:us:gov:treasury:irs:msg:form1094-1095Ctransmitterupstreammessage IRS-Form1094-1095CTransmitterUpstreamMessage.xsd" xmlns:n1="urn:us:gov:treasury:irs:msg:form1094-1095Ctransmitterupstreammessage">
<Form1095CUpstreamDetail RecordType="String" lineNum="1">
<RecordId>1</RecordId>
<CorrectedInd>0</CorrectedInd>
<irs:TaxYr>2015</irs:TaxYr>
<EmployeeInfoGrp>
<OtherCompletePersonName>
<PersonFirstNm>JOHN</PersonFirstNm>
<PersonMiddleNm>B</PersonMiddleNm>
<PersonLastNm>Doe</PersonLastNm>
</OtherCompletePersonName>
<PersonNameControlTxt/>
<irs:TINRequestTypeCd>INDIVIDUAL_TIN</irs:TINRequestTypeCd>
<irs:SSN>123456790</irs:SSN>
</Form1095CUpstreamDetail>
<Form1095CUpstreamDetail RecordType="String" lineNum="1">
<RecordId>2</RecordId>
<CorrectedInd>0</CorrectedInd>
<irs:TaxYr>2015</irs:TaxYr>
<EmployeeInfoGrp>
<OtherCompletePersonName>
<PersonFirstNm>JANE</PersonFirstNm>
<PersonMiddleNm>B</PersonMiddleNm>
<PersonLastNm>DOE</PersonLastNm>
</OtherCompletePersonName>
<PersonNameControlTxt/>
<irs:TINRequestTypeCd>INDIVIDUAL_TIN</irs:TINRequestTypeCd>
<irs:SSN>222222222</irs:SSN>
</EmployeeInfoGrp>
</Form1095CUpstreamDetail>
</n1:Form109495CTransmittalUpstream>
Using Nokogiri, I want to extract the values inside <PersonFirstNm>, <PersonLastNm> and <irs:SSN> for each <Form1095CUpstreamDetail>, based on the <RecordId>.
I tried removing namespaces as well. I posted a small snippet, but I have tried many iterations of working through the XML with no success. This is my first time using XML, so I realize I am likely missing something easy.
When I set my XPath:
require 'nokogiri'
submission_doc = Nokogiri::XML(open('1094C_Request.xml'))
submissions = submission_doc.remove_namespaces!
nodes = submissions.xpath('//Form1095CUpstreamDetail')
I do not seem to have any association between the RecordId and the tags mentioned above, and I am stuck on where to go next.
The fields are not listed as children for the RecordId, so I can't think of how to approach obtaining their values. I am including the full document as an example to make sure I am not excluding anything.
I have an array of values, and I would like to pull the three tags mentioned above if the RecordId is contained within the array of numbers.
Nokogiri makes it pretty easy to do what you want (assuming the XML is syntactically correct). I'd do something like:
require 'nokogiri'
require 'pp'
doc = Nokogiri::XML(<<EOT)
<n1:Form109495CTransmittalUpstream xmlns="urn:us:gov:treasury:irs:ext:aca:air:7.0" xmlns:irs="urn:us:gov:treasury:irs:common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:us:gov:treasury:irs:msg:form1094-1095Ctransmitterupstreammessage IRS-Form1094-1095CTransmitterUpstreamMessage.xsd" xmlns:n1="urn:us:gov:treasury:irs:msg:form1094-1095Ctransmitterupstreammessage">
<Form1095CUpstreamDetail RecordType="String" lineNum="1">
<RecordId>1</RecordId>
<PersonFirstNm>JOHN</PersonFirstNm>
<PersonLastNm>Doe</PersonLastNm>
<irs:SSN>123456790</irs:SSN>
</Form1095CUpstreamDetail>
<Form1095CUpstreamDetail RecordType="String" lineNum="1">
<RecordId>2</RecordId>
<PersonFirstNm>JANE</PersonFirstNm>
<PersonLastNm>DOE</PersonLastNm>
<irs:SSN>222222222</irs:SSN>
</Form1095CUpstreamDetail>
</n1:Form109495CTransmittalUpstream>
EOT
info = doc.search('Form1095CUpstreamDetail').map { |form|
  {
    record_id: form.at('RecordId').text,
    person_first_nm: form.at('PersonFirstNm').text,
    person_last_nm: form.at('PersonLastNm').text,
    ssn: form.at('irs|SSN').text
  }
}
pp info
# >> [{:record_id=>"1",
# >> :person_first_nm=>"JOHN",
# >> :person_last_nm=>"Doe",
# >> :ssn=>"123456790"},
# >> {:record_id=>"2",
# >> :person_first_nm=>"JANE",
# >> :person_last_nm=>"DOE",
# >> :ssn=>"222222222"}]
While it's possible to do this with XPath, Nokogiri's implementation of CSS selectors tends to result in more easily read selectors, which translates to code that's easier to maintain, which is a very good thing.
You'll see the use of | in 'irs|SSN' which is Nokogiri's way of defining a namespace for CSS. This is documented in "Namespaces".
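To handle the last part of the question, pulling those fields only when the RecordId is in your array of values, you can filter the info array built above (wanted_ids here is a hypothetical example list):
wanted_ids = ['1', '2'] # RecordId values you care about, as strings

selected = info.select { |record| wanted_ids.include?(record[:record_id]) }
pp selected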
First of all, the XML validator reports an error:
The default (no prefix) Namespace URI for XPath queries is always '' and it cannot be redefined to 'urn:us:gov:treasury:irs:ext:aca:air:7.0'.
so you must set this default xmlns to ''.
You can use this code:
require 'nokogiri'
doc = Nokogiri::XML(open('1094C_Request.xml'))
doc.namespaces['xmlns'] = ''
details = doc.xpath("//:Form1095CUpstreamDetail")
elem_a = ["PersonFirstNm", "PersonLastNm", "irs:SSN"]
output = details.each_with_object({}) do |element, exp|
  exp[element.xpath("./:RecordId").text] = elem_a.each_with_object({}) do |elem_n, exp_h|
    exp_h[elem_n] = element.xpath(".//#{elem_n.include?(':') ? elem_n : ":#{elem_n}"}").text
  end
end
output
p output
# {
# "1" => {"PersonFirstNm" => "JOHN", "PersonLastNm" => "Doe", "irs:SSN" => "123456790"},
# "2" => {"PersonFirstNm" => "JANE", "PersonLastNm" => "DOE", "irs:SSN" => "222222222"}
# }
I hope this helps

How to export pdf table data into csv?

I am using Rails 4.2, Ruby 2.2, Gem: 'pdf-reader'.
My application reads a PDF file which has table data and exports it into CSV, which I have already done. When I match the result with the table header and table content, they end up in the wrong positions, because a PDF table is not an actual table; we need to write some extra logic for this, which is what I am asking about.
marks.pdf has content similar to what is shown below:
School Name: ABC
Program: MicroBiology Year: Second
| Roll No | Math |
|---------|------|
| 1000001 | 65   |
Any help would be appreciated.
Working code which reads the PDF and exports it to CSV is given below:
class ExportToCsv
  # method useful to export pdf to csv
  def convert_to_csv
    pdf_reader = PDF::Reader.new("public/marks.pdf")
    csv = CSV.open("output100.tsv", "wb", { :col_sep => "\t" })
    data_header = ""
    pdf_reader.pages.each do |page|
      page.text.each_line do |line|
        # line with characters
        if /^[a-z|\s]*$/i =~ line
          data_header = line.strip
        else
          # line with number
          data_row = line.split(/[0-9]/).first
          csv_line = line.sub(data_row, '').strip.split(/[\(|\)]/)
          csv_line.unshift(data_row).unshift(data_header)
          csv << csv_line
        end
      end
    end
  end
end
I am not able to attach the original PDF here for security reasons, sorry for that. You can generate an equivalent PDF from the sample content shown above. (Screenshots of the source PDF, the generated CSV, and the desired output were attached as images.)

Output gene positions from GenBank file

Is it possible to output the gene location for a CDS feature or do I need to parse the 'location' or 'complement' field myself?
For example,
seq = Sequence.read(genbank_fp, format='genbank')
for feature in seq.metadata['FEATURES']:
    if feature['type_'] == 'CDS':
        if 'location' in feature:
            print 'location = ', feature['location']
        elif 'complement' in feature:
            print 'location = ', feature['complement']
        else:
            raise ValueError('positions for gene %s not found' % feature['protein_id'])
would output:
location = <1..206
location = 687..3158
for this sample GenBank file.
This functionality is possible in BioPython (see this thread), where I can output the positions already parsed (e.g. start = 687, end = 3158).
Thanks!
For the example, you can get the Sequence object for the feature only, using the following code:
# column index in positional metadata
col = feature['index_']
loc = seq.positional_metadata[col]
feature_seq = seq[loc]
# if the feature is on reverse strand
if feature['rc_']:
    feature_seq = feature_seq.reverse_complement()
Note: the GenBank parser is newly added in the development branches.
