The code below comes from the documentation for the Ruby gem rroc. I desperately need to calculate the AUC for my AI project. However, I have virtually no knowledge of Ruby file I/O, not having had occasion to learn it. The documentation says rroc expects an n by 2 array, but the first line of code below suggests that the data is in a CSV file and will be formatted into my_data for ROC to calculate the AUC.
I have tried every conceivable combination of CSV data and arrays, both as files for the first line to read and as direct input into the line calculating the AUC. At best the code runs without error but gives a useless output of 0. My hope is that if I had a fuller understanding of what that line does, I could either fix the problem or give up on the gem, since a previous version of this gem was shown to be obsolete and this one is 8 years old. I took the data from the article referenced by the gem author and am pretty sure it's not the problem, but then...
So, to refine the question: from that statement, can we tell what kind of data should be in 'some_data.csv'? And what will be done to it to make my_data?
require 'rroc'
my_data = open('some_data.csv').readlines.collect { |l| l.strip.split(",").map(&:to_f) }
auc = ROC.auc(my_data)
puts auc
Below I've copied the output for two runs, the first with array data read in, the second with csv values (each in separate files). I added a line to read out the input file just to be sure.
RoyiMac:ruby $ ruby PDaucT.rb
[[90, 1], [80, 1], [70,-1], [60,1], [55,1], [54,1], [53,-1], [52,-1], [51,1], [50,-1], [40,1], [39,-1], [38,1], [37,-1], [36,-1], [35,-1], [34,1], [33,-1], [30,1], [10,-1]]
0.0
RoyiMac:ruby $ ruby PDaucT.rb
90,1,80,1,70,-1,60,1,55,1,54,1,53,-1,52,-1,51,1,50,-1,40,1,39,-1,38,1,37,-1,36,-1,35,-1,34,1,33,-1,30,1,10,-1
0.0
The explanation of the code:
open('some_data.csv') # open the some_data.csv file
.readlines # returns an array with each element being a line
.collect { |l| # for each line do the following transformation
l.strip # remove leading and trailing whitespace characters
.split(',') # split the line based on the "," character (returning an array)
.map(&:to_f) # call .to_f on each element in the array, converting them to a float value
}
map/collect are aliases of each other.
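To see what that line produces for different file contents, here is a minimal sketch. It assumes the file is meant to hold one "score,label" pair per line. `parse_rows` is a hypothetical helper, not part of rroc, and a `StringIO` stands in for the opened file.

```ruby
require 'stringio'

# Hypothetical helper: the same transformation as the documented one-liner,
# applied to any IO-like object instead of a file opened by name.
def parse_rows(io)
  io.readlines.collect { |l| l.strip.split(",").map(&:to_f) }
end

# One pair per line: each line becomes one two-element row.
per_line = parse_rows(StringIO.new("90,1\n80,1\n70,-1\n"))

# Everything on a single line: one row holding all the values.
one_line = parse_rows(StringIO.new("90,1,80,1,70,-1\n"))

# A Ruby array literal stored as text: "[[90".to_f is 0.0, so the scores are lost.
literal = parse_rows(StringIO.new("[[90, 1], [80, 1]]\n"))

p per_line
p one_line
p literal
```

Only the first layout yields an n by 2 array. Both input files shown in the question correspond to the other two layouts, which may be why the result was 0.0.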
However, as tadman already said in the comments, you're better off using the csv standard library. The same can be achieved with:
require 'csv'
my_data = CSV.read('some_data.csv', converters: :float)
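# given one "score,label" pair per line, this should output (every value as a float)
#=> [[90.0, 1.0], [80.0, 1.0], [70.0, -1.0], [60.0, 1.0], [55.0, 1.0], [54.0, 1.0], [53.0, -1.0], [52.0, -1.0], [51.0, 1.0], [50.0, -1.0], [40.0, 1.0], [39.0, -1.0], [38.0, 1.0], [37.0, -1.0], [36.0, -1.0], [35.0, -1.0], [34.0, 1.0], [33.0, -1.0], [30.0, 1.0], [10.0, -1.0]]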
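As a small sketch of how the converter choice affects the parsed values, the snippet below parses an in-memory string with `CSV.parse` instead of reading a file. The sample data is hypothetical and assumes one pair per line.

```ruby
require 'csv'

# Hypothetical sample: three "score,label" pairs, one per line.
sample = "90,1\n80,1\n70,-1\n"

# :float converts every field that looks like a number into a Float.
floats = CSV.parse(sample, converters: :float)

# :numeric tries Integer first and falls back to Float.
numbers = CSV.parse(sample, converters: :numeric)

p floats
p numbers
```

Whether rroc cares about integers versus floats is not something the documentation excerpt settles, so treat the choice between the two converters as a matter of preference.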
I want to insert data at specific positions in a text file, for example in line 1 starting from position 10. How can I do it using Ruby?
I also want to put fake data into this file using the faker gem or in any other way possible, such as a phone number, name, SSN, etc.
Here's a sample script that takes two arguments and writes a modified copy of the first file's contents to the second file:
require 'faker'
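# Read all lines from the input file (first argument); File.readlines closes it for us.
lines = File.readlines(ARGV[0])
# Insert three random digits after the first 10 characters of the first line.
lines[0].gsub!(/^(.{0,10})/, '\1' + Faker::Base.numerify('###').to_s)
# Write the modified lines to the output file (second argument); the block form closes it.
File.open(ARGV[1], 'w') do |output|
  lines.each do |line|
    output.write(line)
  end
end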
If you have an input file that looks like:
12345678901234567890
^^^ fake data
the output might look like:
12345678909451234567890
^^^ fake data
Since I opened the output file after reading the input file, you can pass the same file name as both the first and the second argument. That isn't exactly inserting the string into the file, but it's as close as you'll get.
The key line is:
lines[0].gsub!(/^(.{0,10})/, '\1' + Faker::Base.numerify('###').to_s)
It takes the first line and inserts three random digits after its first 10 characters, modifying the string in place. If there are fewer than 10 characters in the first line, it'll append the random data to the end of the line. If you'd prefer to leave shorter lines untouched, you might want to remove the lower bound of the range in the regex:
/^(.{10})/
Or maybe do something else if lines[0].length < 10.
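One possible "something else" is to pad short lines with spaces so the target position always exists. This is only a sketch: `insert_at` is a hypothetical helper, and a fixed string stands in for the Faker output so the example runs without the gem.

```ruby
# Hypothetical helper: insert `text` at character offset `pos` of `line`,
# padding short lines with spaces so the offset always exists.
def insert_at(line, pos, text)
  body = line.chomp
  ending = line[body.length..-1]  # keep whatever line ending was there
  body = body.ljust(pos)          # pad with spaces up to `pos` if needed
  body.insert(pos, text) + ending
end

long_line  = insert_at("12345678901234567890\n", 10, "945")
short_line = insert_at("12345\n", 10, "945")

puts long_line
puts short_line
```

In the script above, the same helper could be applied to lines[0] in place of the gsub! call.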
How do I add a line-break/new-line in IRB/Ruby? The book I'm learning from shows this code:
print "2+3 is equal to "
print 2 + 3
without telling how to go to the second line without hitting Enter, which obviously just runs the program.
You could use a semicolon at the end of the statement, like this: puts "hello"; puts "world"
That book might be taking very tiny steps toward introducing this idea:
print "Continues..."
puts "(Up to here)"
The print function just outputs to the terminal exactly what it's given. The puts function does the same but also adds a newline, which is what you want.
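To make the difference visible, here is a small sketch that writes to a `StringIO` buffer instead of the terminal and then inspects the exact characters produced. The buffer is only there for illustration; `print` and `puts` behave the same way on standard output.

```ruby
require 'stringio'

out = StringIO.new
out.print "2+3 is equal to "  # no newline added
out.print 2 + 3               # continues on the same line
out.puts                      # puts with no arguments writes just a newline
out.puts "(Up to here)"       # puts appends a newline after its argument

text = out.string
puts text.inspect
```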
The more Ruby way of doing this is either:
puts "2+3 equals #{2+3}" # Using string interpolation
puts "2+3 equals %d" % (2 + 3) # Using sprintf-style interpolation
Now if you're using irb, that's a Read-Evaluate-Print-Loop (REPL) which means it executes everything you type in as soon as you press enter, by design. If you want to use your original code, you need to force it on one line:
print "2+3 equals "; print 2+3
Then that will work as expected. The ; statement separator is rarely used in Ruby, and most style guides encourage you to split things up onto multiple lines, but if you do need a one-liner, this is how.
When writing code in, say, a .rb file, the Return key is just used for formatting and doesn't execute any code.
You can put a semicolon after the first line, like this:
print "2+3 is equal to ";
print 2 + 3
The following code gives two errors which I am not able to resolve. Any help would be appreciated:
random.rb:10: can't find string "TEMPLATE" anywhere before EOF
random.rb:3: syntax error, unexpected end-of-input
Code:
id = 2
File.open("#{id}.json","w") do |file|
  file.write <<TEMPLATE
  {
    "submitter":"#{hash["submitter"]}",
    "quote":"#{hash["quote"]}",
    "attribution":"#{hash["attribution"]}"
  }
  TEMPLATE
end
From the documentation (emphasis mine):
The heredoc starts on the line following <<HEREDOC and ends with the next line that starts with HEREDOC
Your code doesn't contain a line starting with TEMPLATE. If your text editor (or IDE) supports regular expressions in searches, try ^TEMPLATE.
You can either remove the spaces or, if you want to keep them, change <<TEMPLATE into <<-TEMPLATE. The addition of - instructs the Ruby parser to search for a (possibly) indented TEMPLATE like you have in your code.
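As a sketch of the indented-terminator form, the methods below build a shortened version of the template. The method names and the sample hash are hypothetical. The second method uses the related <<~ form (Ruby 2.3 and later), which also strips the common leading indentation from the body.

```ruby
# Hypothetical sample data standing in for the asker's `hash`.
sample = { "submitter" => "alice" }

# <<- lets the closing TEMPLATE be indented; the body keeps its indentation.
def render_dash(hash)
  <<-TEMPLATE
  {
    "submitter":"#{hash["submitter"]}"
  }
  TEMPLATE
end

# <<~ also allows an indented terminator and removes the common leading whitespace.
def render_squiggly(hash)
  <<~TEMPLATE
    {
      "submitter":"#{hash["submitter"]}"
    }
  TEMPLATE
end

dash = render_dash(sample)
squiggly = render_squiggly(sample)
puts dash
puts squiggly
```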
From a unix script, I want to search a text file for a string and then return all the lines following the pattern up until a line that contains the word "Failed".
For example,
Test Case Name "Blah"
Error 1
Error 2
Error 3
Failed
Test Case Name "Foo"
Pass
Test Case Name "Red"
Pass
In the above, I want to search for "Blah", and then return:
Error 1
Error 2
Error 3
Up until the line "Failed". There can be any number of "Error" lines between "Blah" and "Failed".
Any solutions using sed, awk, etc. are acceptable.
Thanks!
Here is the awk version:
$ awk '/Failed/{p=0}p;/Blah/{p=1}' file
Error 1
Error 2
Error 3
And if you don't mind printing the boundary lines, you can do
awk '/Blah/,/Failed/' file
A brief explanation of how this works: an awk script is essentially a series of blocks with the structure filter{actions}, where the filter determines which input records the actions are applied to.
So the first block /Failed/{p=0} says that if we find a record that contains the regular expression Failed, we set the variable p to zero.
The second block p; uses the default action, which is to print the current record. So for each record that is read, the script checks the value of the p variable, and prints the record if p has a non-zero value (which is equivalent to the true condition).
The third block /Blah/{p=1} says that if we find a record that contains the regular expression Blah, to set the variable p to one.
So if we put them all together, the script starts reading all input lines without printing them (since the initial value of p is zero). After a record containing Blah is found, the following records are printed until a record containing Failed is found. Since the blocks are examined for each record in the order that they appear, the order of the three blocks will determine what happens to the boundary records. For example, if the boundary lines were to be printed we could write the script as awk '/Blah/{p=1}p;/Failed/{p=0}' file.
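For readers who find the flag logic easier to follow in a general-purpose language, the same three steps can be sketched in Ruby. `lines_between` is a hypothetical helper written for this illustration; it mirrors the order of the awk blocks so that neither boundary line is included.

```ruby
# Hypothetical helper mirroring awk '/Failed/{p=0}p;/Blah/{p=1}'.
def lines_between(lines, start_pattern, stop_pattern)
  printing = false
  lines.each_with_object([]) do |line, out|
    printing = false if line =~ stop_pattern   # /Failed/{p=0}
    out << line if printing                    # p;
    printing = true if line =~ start_pattern   # /Blah/{p=1}
  end
end

input = [
  'Test Case Name "Blah"',
  'Error 1',
  'Error 2',
  'Error 3',
  'Failed',
  'Test Case Name "Foo"',
  'Pass'
]

result = lines_between(input, /Blah/, /Failed/)
puts result
```

Swapping the first and last statements inside the block would include the boundary lines, just as reordering the awk blocks does.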
The second command awk '/Blah/,/Failed/' file uses a range construct (the comma). The operation of the range construct is documented nicely here: https://www.gnu.org/software/gawk/manual/html_node/Ranges.html
This might work for you:
sed -n '/Blah/,/Failed/{//!p}' file