The code below comes from the documentation for the Ruby gem rroc. I desperately need to calculate the AUC for my AI project. However, I have virtually no knowledge of Ruby file I/O, not having had occasion to learn it. The documentation says rroc expects an n-by-2 array, but the first line of the code below suggests that the data is in a CSV file and will be formatted into my_data so that ROC can calculate the AUC.
I have tried every conceivable combination of CSV data and arrays, both as files for the first line to read and as direct input to the line calculating the AUC. At best the code runs without error but gives a useless output of 0. My hope is that if I had a fuller understanding of what that line does, I could either fix the problem or give up on the gem, since a previous version of this gem was shown to be obsolete and this one is 8 years old. I took the data from the article referenced by the gem author and am pretty sure it's not the problem, but then...
So, to refine the question: from that statement, can we tell what kind of data should be in 'some_data.csv'? And what will be done to it to make my_data?
require 'rroc'
my_data = open('some_data.csv').readlines.collect { |l| l.strip.split(",").map(&:to_f) }
auc = ROC.auc(my_data)
puts auc
Below I've copied the output for two runs, the first with array data read in, the second with csv values (each in separate files). I added a line to read out the input file just to be sure.
RoyiMac:ruby $ ruby PDaucT.rb
[[90, 1], [80, 1], [70,-1], [60,1], [55,1], [54,1], [53,-1], [52,-1], [51,1], [50,-1], [40,1], [39,-1], [38,1], [37,-1], [36,-1], [35,-1], [34,1], [33,-1], [30,1], [10,-1]]
0.0
RoyiMac:ruby $ ruby PDaucT.rb
90,1,80,1,70,-1,60,1,55,1,54,1,53,-1,52,-1,51,1,50,-1,40,1,39,-1,38,1,37,-1,36,-1,35,-1,34,1,33,-1,30,1,10,-1
0.0
The explanation of the code:
open('some_data.csv') # open the some_data.csv file
.readlines # returns an array with each element being a line
.collect { |l| # for each line do the following transformation
l.strip # remove leading and trailing whitespace characters
.split(',') # split the line based on the "," character (returning an array)
.map(&:to_f) # call .to_f on each element in the array, converting them to a float value
}
map/collect are aliases of each other.
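As a minimal sketch of what that chain produces, the transformation can be run on in-memory lines instead of a file. The sample strings below are hypothetical and only illustrate the shape of the result: one inner array per line of input.

```ruby
# Hypothetical sample lines, each holding "score,label" (not taken from a real file).
sample_lines = ["90,1\n", " 80,-1 \n"]

# Same transformation as in the gem's documentation, applied to the sample lines.
parsed = sample_lines.collect { |l| l.strip.split(",").map(&:to_f) }
p parsed  # [[90.0, 1.0], [80.0, -1.0]]

# If all values sit on ONE line, the result is a single row with many columns,
# not an n-by-2 array.
single_line = ["90,1,80,-1\n"]
flat = single_line.collect { |l| l.strip.split(",").map(&:to_f) }
p flat    # [[90.0, 1.0, 80.0, -1.0]]
```

So for an n-by-2 result, the file would need one pair per line, with no brackets or other Ruby array syntax.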
However, as tadman already said in the comments, you're better off using the csv standard library. The same can be achieved with:
require 'csv'
my_data = CSV.read('some_data.csv', converters: :float)
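# with one "score,label" pair per line in the file, this should output floats:
#=> [[90.0, 1.0], [80.0, 1.0], [70.0, -1.0], [60.0, 1.0], [55.0, 1.0], [54.0, 1.0], [53.0, -1.0], [52.0, -1.0], [51.0, 1.0], [50.0, -1.0], [40.0, 1.0], [39.0, -1.0], [38.0, 1.0], [37.0, -1.0], [36.0, -1.0], [35.0, -1.0], [34.0, 1.0], [33.0, -1.0], [30.0, 1.0], [10.0, -1.0]]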
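The CSV approach can be tried without a file by parsing a string. This is a small sketch with a hypothetical three-line sample; CSV.parse accepts the same converters option as CSV.read.

```ruby
require 'csv'

# Hypothetical sample: one "score,label" pair per line.
sample = "90,1\n80,1\n70,-1\n"

# The :float converter turns every field that looks like a number into a Float.
csv_rows = CSV.parse(sample, converters: :float)
p csv_rows  # [[90.0, 1.0], [80.0, 1.0], [70.0, -1.0]]
```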
When processing a file, I used to use the special variable $. to get the last line number being read. For instance, the following program
require 'csv'
IFS=';'
CSV_OPTIONS = { col_sep: IFS, external_encoding: Encoding::ISO_8859_1, internal_encoding: Encoding::UTF_8 }
CSV.new($stdin, CSV_OPTIONS).each do |row|
puts "::::line #{$.} row=#{row}"
end
is supposed to dump a CSV file (where the fields are delimited by semicolons instead of commas, as is the case in our project) and prefix each output line with the line number.
After updating Ruby to
ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64-cygwin]
the lines are still dumped, but the line number is always displayed as zero.
What strikes me is that this Ruby Wiki page on special Ruby variables, while still having $. in its list, no longer has a description for this variable. So I wonder: is this variable gone, or was it never supposed to work with the CSV class and just worked for me by accident in earlier versions?
I'm not sure why $. isn't working for you, but it's also not the best solution here. When it works, $. gives you the number of lines read from input, but since quoted fields in a CSV file can span multiple lines the number you get from $. won't always be the number of rows that have been read.
As mentioned above, each_with_index is a good alternative:
# The double splat passes the hash as keyword arguments (required on Ruby 3.0+)
CSV.new($stdin, **CSV_OPTIONS).each_with_index do |row, i|
  puts "::::row #{i} row=#{row}"
end
Another alternative is CSV#lineno:
lineno()
The line number of the last row read from this file. Fields with nested line-end characters will not affect this count.
You would use it like this:
# The double splat passes the hash as keyword arguments (required on Ruby 3.0+)
csv = CSV.new($stdin, **CSV_OPTIONS)
csv.each do |row|
  puts "::::row #{csv.lineno} row=#{row}"
end
Note that each_with_index will start counting at 0, whereas lineno starts at 1.
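To illustrate why the number of physical lines and the number of CSV rows can differ, here is a small sketch that reads from a StringIO instead of $stdin. The sample data is hypothetical: its first record contains a quoted field with an embedded newline.

```ruby
require 'csv'
require 'stringio'

# Hypothetical input: two CSV rows spread over three physical lines,
# because the quoted field in the first row contains a newline.
data = "a;\"multi\nline\";c\nd;e;f\n"

csv = CSV.new(StringIO.new(data), col_sep: ';')
rows = []
csv.each_with_index do |row, i|
  rows << row
  # Print both counters side by side for comparison.
  puts "index=#{i} lineno=#{csv.lineno} row=#{row.inspect}"
end

puts "physical lines: #{data.lines.size}"  # 3
puts "CSV rows:       #{rows.size}"        # 2
```

A counter based on lines read from the input would reach 3 here, while only two rows were parsed.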
You can see both approaches in action on repl.it: https://repl.it/#jrunning/LoudBlushingCharactercode
I want to write a command in the terminal like config.section.key, parse the command, and get the strings "section" and "key". I want to use these two keys in my function to search a hash.
Is there any way to parse a command from the terminal to do this?
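To execute terminal commands you can use either backticks, which return the command's output as a string, or a system call, which only reports whether the command succeeded. Here are some examples. Keep in mind that this is only a rough sketch and I have not tested it:
# Create an empty file by shelling out to touch.
def create_file
  `touch test.txt`
end

# Backticks return the output of ls as a string;
# system('ls') would return only true, false, or nil.
def cmd
  `ls`
end

def check_file
  results = cmd
  if results.include?('test.txt')
    puts 'File exists.'
  else
    puts 'Creating file..'
    create_file
  end
end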
Now to the parsing part: depending on what you want to do, you can either save the output in a variable (results = cmd) or use a regex to extract the information. For example, /\d+/ extracts digits.
I hope this answers your question.
To split the information, you could use the split method for example:
def cmd
  `prt_jobs`
end

def check_jobs
  res = cmd
  res.split(".")
end
This will split the results of a print jobs command by periods and make them into an array. I'd show you more except I'm on my phone so it will have to wait
As Tadman commented, you can use the String#split method to split the first command-line argument on period characters, if that is your desire:
config, section, key, *rest = ARGV.first.to_s.split('.')
Another good option for parsing command lines is the Ruby standard library's OptionParser class. Rather than rebuilding all of the CLI parsing by hand, you get it built in, along with much more. The resulting scripts can feel much more Linux-like and be familiar to anyone who has used bash before.
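As a minimal sketch of the split approach applied to the original goal, the two extracted strings can be used to look up a value in a nested hash. The SETTINGS hash, the lookup helper, and the default command below are hypothetical names invented for illustration.

```ruby
# Hypothetical nested settings hash, keyed by section and then by key.
SETTINGS = {
  "database" => { "host" => "localhost", "port" => 5432 }
}

# Split "config.section.key" into its parts and look the value up.
# Returns nil when the section or key is missing.
def lookup(command, settings)
  _prefix, section, key = command.split(".", 3)
  settings.dig(section, key)
end

# Use the first command-line argument, or a sample command when none is given.
command = ARGV.first || "config.database.host"
puts lookup(command, SETTINGS).inspect
```

Running the script with config.database.port as its argument would look up the "port" entry of the "database" section.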
The following code is something I am beginning to test for use within a "Texas Hold Em" style game I am working on.
My question is why, when running the following code, the puts involving a "♥" returns a "\u" in its place. I feel certain it is this multibyte character that is causing the issue, because for the second puts I replaced the ♦ with a d in the array of strings and it returned what I was expecting. See below:
My Code:
#! /usr/bin/env ruby
# encoding: utf-8
table_cards = ["|2♥|", "|8♥|", "|6d|", "|6♣|", "|Q♠|"]
# Array of cards
player_1_face_1 = "8"
player_1_suit_1 = "♦"
# Player 1's face and suit of first card he has
player_1_face_2 = "6"
player_1_suit_2 = "♥"
# Player 1's face and suit of second card he has
test_str_1 = /(\D8\D{2})/.match(table_cards.to_s)
# EX: Searching for match between face values on (player 1's |8♦|) and the |8♥| on the table
test_str_2 = /(\D6\D{2})/.match(table_cards.to_s)
# EX: Searching for match between face values on (player 1's |6♥|) and the |6d| on the table
puts "#{test_str_1}"
puts "#{test_str_2}"
Puts to Screen:
|8\u
|6d|
-- My goal would be to get the first puts to return: |8♥|
I am not so much looking for a solution to this (there may not even be one) as for an "as simple as possible" explanation of what is causing this issue and why. Thanks ahead of time for any information on what is happening here and how I can tackle the goal.
The "\u" you're seeing is the start of a Unicode escape sequence.
For example, Unicode character 'HEAVY BLACK HEART' (U+2764) can be printed as "\u2764".
A friendly Unicode character listing site is http://unicode-table.com/en/sets/
Are you able to launch interactive Ruby in your shell and print a heart like this?
irb
irb> puts "\u2764"
❤
When I run your code in my Ruby, I get the answer you expect:
test_str_1 = /(\D8\D{2})/.match(table_cards.to_s)
=> #<MatchData "|8♥|" 1:"|8♥|">
What happens if you try a regex that is more specific to your cards?
test_str_1 = /(\|8[♥♦♣♠]\|)/.match(table_cards.to_s)
In your example output, you're not seeing the Unicode heart symbol as you want. Instead, your output is printing the "\u", which is the start of a Unicode escape, but then not printing the rest of the escape sequence, which for ♥ (U+2665, BLACK HEART SUIT) would be "2665".
See the comment by the Tin Man that describes encoding for your console. If he's correct, then I expect the more-specific regex will succeed, but still print the wrong output.
See the comment by David Knipe that says it looks like it gets truncated because the regex only matches 4 characters. If he's correct, then I expect the more-specific regex will succeed and also print the right output.
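The truncation scenario can be reproduced in isolation. The sketch below assumes that Array#to_s (which uses inspect) produced the escaped form \u2665 on the asker's system; that is one plausible reading of the output, not a confirmed diagnosis. The escaped text is written with single quotes so the backslash and the u stay literal characters.

```ruby
pattern = /(\D8\D{2})/

# What table_cards.to_s looks like when the heart is kept as a real character.
utf8_text = '["|2♥|", "|8♥|"]'
# What it looks like if inspect escaped the heart (assumed scenario).
escaped_text = '["|2\u2665|", "|8\u2665|"]'

puts pattern.match(utf8_text)[1]     # |8♥|
puts pattern.match(escaped_text)[1]  # |8\u  (the backslash and u are the two non-digits)

# Joining the cards avoids inspect, so nothing is escaped before matching.
cards = ["|2♥|", "|8♥|"]
puts pattern.match(cards.join(" "))[1]  # |8♥|
```

In the escaped text, \D{2} consumes the backslash and the u, and the match ends there because the next character is the digit 2.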
(The rest of this answer is typical for Unix; if you're on Windows, ignore the rest here...)
To show your system language settings, try this in your shell:
echo $LC_ALL
echo $LC_CTYPE
If they are not "UTF-8" or something like that, try this in your shell:
export LC_ALL=en_US.UTF-8
export LC_CTYPE=en_US.UTF-8
Then re-run your code -- be sure to use the same shell.
If this works, and you want to make this permanent, one way is to add these here:
# /etc/environment
LC_ALL=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
Then source that file from your .bashrc or .zshrc or whatever shell startup file you use.
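To check from inside Ruby what it picked up from the environment, a short script like the following can help. It is only a diagnostic sketch: it prints the locale variables the process inherited and the encodings Ruby is currently using.

```ruby
# Print the locale-related environment variables as Ruby sees them.
%w[LC_ALL LC_CTYPE LANG].each do |name|
  puts "#{name}=#{ENV[name].inspect}"
end

# The external encoding is what Ruby assumes for console and file I/O by default.
puts "default_external: #{Encoding.default_external}"
# The encoding of string literals in this source file.
puts "source encoding:  #{__ENCODING__}"
```

If default_external is not UTF-8 after the locale variables have been exported, the shell running the script is probably not the one where they were set.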
I was following the advice from this question when trying to read in multi-line input from the command line:
# change line separator
$/ = 'END'
answer = gets
pp answer
However, I get weird behavior from STDIN#gets when I try to change $/ back:
# put it back to normal
$/ = "\n"
answer = gets
pp answer
pp 'magic'
This produces output like this when executed with Ruby:
$ ruby multiline_input_test.rb
this is
a multiline
awesome input string
FTW!!
END
"this is\n\ta multiline\n awesome input string\n \t\tFTW!!\t\nEND"
"\n"
"magic"
(I input up to the END and the rest is output by the program, then the program exits.)
It does not pause to get input from the user after I change $/ back to "\n". So my question is simple: why?
As part of a larger (but still small) application, I'm trying to devise a way of recording notes; as it is, this weird behavior is potentially devastating, as the rest of my program won't be able to function properly if I can't reset the line separator. I've tried all manner of using double- and single-quotes, but that doesn't seem to be the issue. Any ideas?
The problem is that your input ends with END\n. The first gets stops as soon as it has read END, which leaves a \n in the input buffer. You do successfully set the input record separator back to \n, so the second gets immediately consumes that leftover character and returns without waiting for input.
You therefore have two easy options:
1. Set the input record separator to END\n (use double quotes so that \n is interpreted as a newline character):
$/ = "END\n"
2. Clear the buffer with an extra call to gets:
$/ = 'END'
answer = gets
gets # Consume extra `\n`
I consider option 1 clearer.
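Both behaviours can be reproduced without a keyboard by reading from a StringIO. Note that this sketch passes the separator directly to gets rather than assigning $/, which has the same effect for a single call and avoids changing global state; the sample input is hypothetical.

```ruby
require 'stringio'

# Simulated input: a multi-line note terminated by END, followed by a normal line.
text = "line one\nline two\nEND\ntest\n"

# Separator "END": the newline after END is left behind for the next read.
input = StringIO.new(text)
note_a = input.gets("END")
rest_a = input.gets("\n")
p note_a  # "line one\nline two\nEND"
p rest_a  # "\n"

# Separator "END\n" (option 1): the newline is consumed together with END.
input = StringIO.new(text)
note_b = input.gets("END\n")
rest_b = input.gets("\n")
p note_b  # "line one\nline two\nEND\n"
p rest_b  # "test\n"
```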
This shows it working on my system using option 1:
$ ruby multiline_input_test.rb
this is
a multiline
awesome input string
FTW!!
END
"this is\n a multiline\n awesome input string\n FTW!!\nEND\n"
test
"test\n"
"magic"