Saving and calling multiple variables from ruby method - ruby

Maybe I've been staring at my Ruby book to the point where words don't mean anything anymore, but I thought I might as well ask.
What I'm looking to do is, rather than passing a bulk of variables through my function headers, or instead of using global variables, I'm looking to save my variables within a method, and call upon them at multiple times, throughout my functions. What I'm realistically having an issue with, is scope.
require 'date'

def DateGrab()
  print "\nEnter the date you're looking for (Month/Day): "
  longdate = gets.strip.split(/\/| /)
  if longdate[0].length > 3
    month = longdate[0].slice(0..2)
  else
    month = longdate[0]
  end
  day = longdate[1]
  year = `date | awk '{print $6}'`.strip
  grepdate = "#{day}/#{month}/#{year}"
  date = Date.parse("#{day}-#{month}-#{year}").strftime('%m%d%Y').strip
end
I'm looking to pass "grepdate" and "date" through multiple functions, and I feel that using a method would be easier, but every time I try to call the variable, I get an "undefined local variable or method" error.

You want to look at instance variables. You can set them via @grepdate = "something". These instance variables are accessible throughout your class and in all instance methods of that class.
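A minimal sketch of that suggestion, assuming the date handling moves into a small class. The class name DateQuery, the method report_name and the optional year argument are hypothetical and not part of the original code; the parsing follows the question's approach.

```ruby
require 'date'

# Hypothetical wrapper class: the two values are stored once as instance
# variables and are then visible to every instance method.
class DateQuery
  attr_reader :grepdate, :date

  # input is "Month/Day" or "Month Day", e.g. "January/15"
  def initialize(input, year = Date.today.year)
    month, day = input.strip.split(%r{/| })
    month = month[0, 3] # keep at most the first three letters
    @grepdate = "#{day}/#{month}/#{year}"
    @date = Date.parse("#{day}-#{month}-#{year}").strftime('%m%d%Y')
  end

  # Any other method can read @grepdate and @date without receiving them
  # as arguments.
  def report_name
    "report_#{@date}.txt"
  end
end

query = DateQuery.new("January/15", 2024)
puts query.grepdate
puts query.report_name
```

Outside the class, the values are reached through the readers (query.grepdate, query.date) rather than through globals.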

Related

Global variables always detrimental to Julia's performance?

Every corner of Julia's documentation is filled with reminders to "avoid global scope variables". But I fail to see how this could be beneficial even in some of the most common data analysis scenarios, probably due to a misunderstanding regarding how Julia's compiler works.
For example, one function I use checks whether each token of a document belongs to a huge lexicon of acceptable tokens. Currently, I use something like this:
using CSV, DataFrames, Chain
accepted_tokens = @chain begin
    CSV.read("accepted_tokens.csv", DataFrame)
    Set{String}(_.tokens)
end
function redact_document(doc::String)
    tokens = split(doc, " ")
    redacted_tokens = [token in accepted_tokens ? token : "REDACTED" for token in tokens]
    return join(redacted_tokens, " ")
end
Now, since redact_document is the only function that uses accepted_tokens, I could of course just assign the variable inside the function, like this:
function redact_document(doc::String)
    accepted_tokens = @chain begin
        CSV.read("accepted_tokens.csv", DataFrame)
        Set{String}(_.tokens)
    end
    tokens = split(doc, " ")
    redacted_tokens = [token in accepted_tokens ? token : "REDACTED" for token in tokens]
    return join(redacted_tokens, " ")
end
The reason I don't do this is that it seems to me that in this case accepted_tokens would need to be assigned each time redact_document is called, which seems like a total waste of time, given that I'd have to read a huge file from disk every time, instead of creating/assigning the variable just once (albeit in the global scope). I also don't want to declare accepted_tokens as a constant, since I might want to tweak the lexicon as I develop my script.
Am I right on my reading of the code? Or, as I suspect, the compiler is smarter than what I take it to be, and I should still be wrapping my variables within the functions that use them?
Although everything has already been said in the comments, for the sake of cleanliness your code should look like this (pass accepted_tokens as an argument rather than using a global variable):
function redact_document(doc::AbstractString, accepted_tokens::AbstractSet{<:AbstractString})
    tokens = split(doc, " ")
    redacted_tokens = [token in accepted_tokens ? token : "REDACTED" for token in tokens]
    return join(redacted_tokens, " ")
end
The type declarations for function arguments are optional (they do not affect performance), but if you use them, it is usually better to use their abstract counterparts.

Overwrite a variable without repeating the name

Is there a method to overwrite a variable without copying its name? For example, when I want to change my_var = '3' to an integer, I must do something like this:
my_var = my_var.to_i
Is there a way to do this without copying the variable's name? I want to do something like this:
my_var = something_const.to_i
For numbers there are +=, -= etc., but is there a universal way to do this for all methods?
There is no way to convert a string to an integer like that without repeating the variable name. Methods such as String#upcase! and Array#flatten! work by mutating the object; however, it is not possible to define a method like String#to_i!, because that would mean converting the object to an instance of a different class.
For example, here is a (failed) attempt to define such a method:
# What I want to be able to do:
# my_var = "123"
# my_var.to_i! # => my_var == 123
class String
  def to_i!
    replace(Integer(self))
  end
end
my_var = "123"
my_var.to_i! # TypeError: no implicit conversion of Integer into String (Fixnum before Ruby 2.4)
...And even if this code were valid, it would still offer no performance gain since a new object is still being created.
As for your examples of += and -=, these are in fact simply shorthand for:
x += 1
# Is equivalent to:
x = x + 1
So again, there is no performance gain here either; just slightly nicer syntax. A good question to ask is, why doesn't Ruby support a ++ operator? If such an operator existed, it would offer a performance gain... But I'll let you research for yourself why this is missing from the language.
So to summarise,
is there universal way to do this for all methods?
No. The special operators like +=, -=, |= and &= are all predefined; there is no "generalised" version such as method_name=.
You can also define methods that mutate the object, but only when appropriate. Such methods are usually named with a !, are called "bang methods", and have a "non-bang" counterpart. On String objects, for example, there is String#capitalize! (and String#capitalize), String#delete! (and String#delete), String#encode! (and String#encode), and so on, but no String#to_i!, for the reasons discussed above.
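To make the difference concrete, here is a small sketch (variable names are illustrative only) contrasting a bang method, which changes the object every reference points to, with reassignment, which only rebinds one variable name:

```ruby
# Mutation: upcase! changes the String object itself, so every variable
# that references that object observes the change.
text = "hello".dup # dup so the string is mutable even with frozen literals
same_object = text
text.upcase!
puts same_object # the other reference sees the upcased text

# Reassignment: to_i builds a new Integer and the name `number` is rebound
# to it. The original String is untouched, which is why the variable name
# has to appear on both sides of the assignment.
number = "3"
other_reference = number
number = number.to_i
puts number.class
puts other_reference.class
```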

Ruby: evaluate string with dynamic binding of variables

I have a database of "formulas" stored as strings. Let's assume for simplicity that each formula contains 2 variables denoted by a and b, that the formulas are all well-formed, and that each is guaranteed to consist only of characters from the set ()ab+-*.
At runtime, formulas are fetched from this database, and from another source, numeric values for a and b are fetched, and the formulas are evaluated. The evaluation can be programmed like this:
# This is how it works right now
formula = fetch_formula(....)
a = fetch_left_arg(....)
b = fetch_right_arg(....)
result = eval(formula)
This design works, but I'm not entirely happy with it. It requires that my program names the free variables exactly the same as they are named in the formula, which is ugly.
If my "formula" would not be a string, but a Proc object or Lambda which accepts two parameters, I could do something like
# No explicitly named variables
result = fetch_proc(...).call(fetch_left_arg(....),fetch_right_arg(....))
but unfortunately, the formulas have to be strings.
I tried to experiment in the following way: What if the method, which fetches the formula from the database, would wrap the string into something, which behaves like a block, and where I could pass parameters to it?
# This does not work of course, but maybe you get the idea:
block_string = "|a,b| #{fetch_formula(....)}"
Of course I can't eval such a block_string, but is there something similar which I could use? I know that instance_eval can pass parameters, but what object should I apply it to? So this is perhaps not an option either....
This is a very nasty approach, but for the simple formulas you've mentioned it should work:
▶ formula = 'a + b'
▶ vars = formula.scan(/[a-z]+/).uniq.join(',') # getting vars names
#⇒ "a,b"
▶ pr = eval("proc { |#{vars}| #{formula} }") # preparing proc
▶ pr.call 3, 5
#⇒ 8
Here we rely on the fact that parameters are passed to the proc in the same order as they first appear in the formula.
If I get your question correctly, it is something that I have done recently, and is fairly easy. Given a string:
s = "{|x, y| x + y}"
You can create a proc by doing:
eval("Proc.new#{s}")
One way to avoid creating the variables in the local scope could be to use a Binding:
bind = binding
formula = fetch_formula(....)
bind.local_variable_set :a, fetch_left_arg(....)
bind.local_variable_set :b, fetch_right_arg(....)
result = bind.eval(formula)
The variables a and b now only exist in the binding, and do not pollute the rest of your code.
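A self-contained sketch of the Binding approach, under a few assumptions: the module name FormulaEvaluator and its methods are hypothetical, the formula is passed in directly instead of coming from fetch_formula, and the character check mirrors the ()ab+-* restriction stated in the question (plus spaces).

```ruby
# Hypothetical helper built around Binding#local_variable_set.
module FormulaEvaluator
  # A binding taken inside a method with no locals of its own, so the
  # formula cannot see or clobber the caller's variables.
  def self.empty_binding
    binding
  end

  # formula: String such as "(a+b)*a"; values: Hash such as { a: 3, b: 5 }
  def self.evaluate(formula, values)
    # Refuse anything outside the character set the question guarantees.
    unless formula.match?(/\A[()ab+\-* ]+\z/)
      raise ArgumentError, "unexpected characters in formula: #{formula.inspect}"
    end

    scope = empty_binding
    values.each { |name, value| scope.local_variable_set(name, value) }
    scope.eval(formula)
  end
end

puts FormulaEvaluator.evaluate("(a+b)*a", a: 3, b: 5)
```

The calling code never has to name its own variables a and b; the mapping from formula names to values lives only in the hash handed to evaluate.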
You can create a lambda from string, as shown below:
formula = "a + b"
lambda_template = "->(a,b) { %s }"
formula_lambda = eval(lambda_template % formula)
p formula_lambda.call(1,2)
#=> 3

Does awk support dynamic user-defined variables?

awk supports this:
awk '{print $(NF-1);}'
but not for user-defined variables:
awk '{a=123; b="a"; print $($b);}'
by the way, shell supports this:
a=123;
b="a";
eval echo \${$b};
How can I achieve my purpose in awk?
OK, since some of us like to eat spaghetti through their nose, here is some actual code that I wrote in the past :-)
First of all, getting self-modifying code to work in a language that does not support it is extremely non-trivial.
The idea behind allowing dynamic variables and function names in a language that does not support them is very simple: at some point in the program, you want something dynamic to modify your own code and then resume execution from where you left off. An eval(), that is.
This is all trivial if the language supports eval() or an equivalent. However, awk has no such function. Therefore you, the programmer, have to provide an interface for it.
To allow all this to happen, you have three main problems
How to get our self so we can modify it
How to load the modified code, and resume from where we left off
Finding a way for the interpreter to accept our modified code
How to get our self so we can modify it
Here is some example code, suitable for direct execution.
This is the infrastructure that I inject for environments running gawk, as it requires PROCINFO:
echo ""| awk '
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
function dbg_printarray(ary , x , s,e, this , i ){
x=(x=="")?"A":x;for(i=((s)?s:1);i<=((e)?e:ary[0]);i++){print x"["i"]=["ary[i]"]"}}
function dbg_argv(A ,this,p){
A[0]=0;p="/proc/"PROCINFO["pid"]"/cmdline";push(RS);RS=sprintf("%c",0);
while((getline v <p)>0)A[A[0]+=1]=v;RS=pop();close(p);}
{
print "foo";
dbg_argv(A);
dbg_printarray(A);
print "bar";
}'
Result:
foo
A[1]=[awk]
A[2]=[
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
function dbg_printarray(ary , x , s,e, this , i ){
x=(x=="")?"A":x;for(i=((s)?s:1);i<=((e)?e:ary[0]);i++){print x"["i"]=["ary[i]"]"}}
function dbg_argv(A ,this,p){
A[0]=0;p="/proc/"PROCINFO["pid"]"/cmdline";push(RS);RS=sprintf("%c",0);
while((getline v <p)>0)A[A[0]+=1]=v;RS=pop();close(p);}
{
print "foo";
dbg_argv(A);
dbg_printarray(A);
print "bar";
}]
bar
As you can see, as long as the OS does not play with our args and /proc/ is available, it is possible to read our own source. This may appear useless at first, but we need it for push/pop of our stack, so that our execution state can be embedded within the code, letting us save/resume and survive OS shutdowns and reboots.
I have left out the OS detection function and the bootloader (written in awk), because if I publish that, kids can build platform-independent polymorphic code, and it is easy to cause havoc with it.
How to load the modified code, and resume from where we left off
Now, normally you have push() and pop() for registers, so you can save your state, modify yourself, and resume from where you left off. A call followed by reading your stack is a typical way to get the memory address.
Unfortunately, in awk, under normal circumstances we cannot use pointers (without a lot of dirty work) or registers (unless you can inject other stuff along the way).
However, you need a way to suspend and resume your code.
The idea is simple. Instead of letting awk control your loops, your while and if/else conditions, your recursion depth, and which function you are in, your code should do it.
Keep a stack, a list of variable names, and a list of function names, and manage them yourself.
Just make sure that your code calls self_modify( bool ) constantly, so that even after a sudden failure, as soon as the script is re-run, we can enter self_modify( bool ) and resume our state.
When you want to self-modify your code, you must provide custom write_stack() and read_stack() routines that write the state of the stack out as a string, read the values back from the string embedded in the code itself, and resume the execution state.
Here is a small piece of code that demonstrates the whole flow
echo ""| awk '
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
function dbg_printarray(ary , x , s,e, this , i ){
x=(x=="")?"A":x;for(i=((s)?s:1);i<=((e)?e:ary[0]);i++){print x"["i"]=["ary[i]"]"}}
function _(s){return s}
function dbg_argv(A ,this,p){
A[0]=0;p="/proc/"PROCINFO["pid"]"/cmdline";push(RS);RS=sprintf("%c",0);
while((getline v <p)>0)A[A[0]+=1]=v;RS=pop();close(p);}
{
_(BEGIN_MODIFY"|");print "#foo";_("|"END_MODIFY)
dbg_argv(A);
sub( \
"BEGIN_MODIFY\x22\x5c\x7c[^\x5c\x7c]*\x5c\x7c\x22""END_MODIFY", \
"BEGIN_MODIFY\x22\x7c\x22);print \"#"PROCINFO["pid"]"\";_(\x22\x7c\x22""END_MODIFY" \
,A[2])
print "echo \x22\x22\x7c awk \x27"A[2]"";
print "function bar_"PROCINFO["pid"]"_(s){print \x22""doe\x22}";
print "\x27"
}'
Result:
Exactly the same as our original code, except
_(BEGIN_MODIFY"|");print "65964";_("|"ND_MODIFY)
and
function bar_56228_(s){print "doe"}
at the end of code
Now, this may seem useless, as we are only replacing the code print "foo"; with our pid.
But it becomes useful when there are multiple _() calls with separate MAGIC strings to identify BLOCKS, and a custom multi-line string replacement routine instead of sub().
You must provide BLOCKS for the stack, the function list, and the execution point, as a bare minimum.
And notice that the last line contains bar.
This itself is just a string, but when this code gets executed repeatedly, notice that
function bar_56228_(s){print "doe"}
function bar_88128_(s){print "doe"}
...
and it keeps growing. While the example is intentionally made so that it does nothing useful, if we provide a routine to call bar_pid_(s) instead of that print "foo" code, suddenly we have eval() on our hands :-)
Now, isn't eval() useful :-)
Don't forget to provide a custom remove_block() function so that the code stays at a reasonable size instead of growing every time you execute it.
Finding a way for the interpreter to accept our modified code
Normally, calling a binary is trivial. However, doing so from within awk is difficult.
You may say system() is the way.
There are two problems with that:
system() may not work in some environments
it blocks while your code is executing, so you cannot perform recursive calls and keep the user happy at the same time.
If you must use system(), ensure that it does not block.
A normal call to system("sleep 20 && echo from-sh & ") will not work.
The solution is simple:
echo ""|awk '{print "foo";E="echo ep ; sleep 20 && echo foo & disown ; "; E | getline v;close(E);print "bar";}'
Now you have an async system() call that does not block :-)
Not at the moment. However, if you provide a wrapper, it is possible (in a somewhat hacky and dirty way).
The idea is to use the @ operator, introduced in recent versions of gawk.
This @ operator is normally used to call a function by name.
So if you had
function foo(s){print "Called foo "s}
function bar(s){print "Called bar "s}
{
    var = "";
    if(today_i_feel_like_calling_foo){
        var = "foo";
    }else{
        var = "bar";
    }
    @var( "arg" ); # This calls function foo(), or function bar() with "arg"
}
Now, this is useful on its own.
Assuming we know the var names beforehand, we can write a wrapper to indirectly modify and obtain vars:
function get(varname, this, call){call="get_"varname;return @call();}
function set(varname, arg, this, call){call="set_"varname; @call(arg);}
So now, for each var name you want to provide access to by name, you declare these two functions:
function get_my_var(){return my_var;}
function set_my_var(arg){my_var = arg;}
And perhaps, somewhere in your BEGIN{} block,
BEGIN{ my_var = ""; }
to declare it for global access.
Then you can use
get("my_var");
set("my_var", "whatever");
This may appear useless at first, but there are perfectly good use cases, such as keeping a linked list of vars by holding one var's name in another var's array, and so on.
It works for arrays too, and to be honest, I use this for nesting and linking arrays within arrays, so I can walk through multiple arrays as if using pointers.
You can also write configuration scripts that refer to var names inside awk this way, in effect having an interpreter-inside-an-interpreter type of thing, too...
Not the best way to do things, but it gets the job done, and I do not have to worry about null pointer exceptions, GC, and such :-)
The $ notation is not a mark for variables, as in shell, PHP, Perl etc. It is rather an operator, which receives an integer value n and returns the n-th column from the input. So, what you did in the first example is not the setting/getting of a variable dynamically but rather a call to an operator/function.
As stated by commenters, you can achieve the behavior you are looking for with arrays:
awk '{a=123; b="a"; v[b] = a; print v[b];}'
I had a similar problem to solve, to load the settings from a '.ini' file and I've used arrays to set the variables dynamically.
It works with Awk or Gawk, Linux or Windows (GnuWin32)
gawk -v Settings_File="my_settings_file.ini" -f awk_script.awk <processing_file>
[my_settings_file.ini]
#comment
first_var=foo
second_var=bar
[awk_script.awk]
BEGIN{
    FS="=";
    while((getline < Settings_File)>0) {
        if($0 !~ /^[#;]|^(\s*)$/) {
            var_array[$1] = $2;
        }
    }
    print var_array["first_var"];
    print var_array["second_var"];
    if (var_array["second_var"] == "bar") {
        print "works!";
    }
}
{
    #more processing
}
END {
    #finish processing
}

Modifying module level variables in an anonymous array in Ruby

I am in the midst of learning Ruby and thought I was clever with the following piece of code:
[@start,@end].map!{ |time| time += operation == :add ? amount : -(amount) }
where @start, @end are two module level variables, operation can be one of :add or :sub, and amount is a float amount to adjust both @start and @end by.
Granted it only saves me a line of code, but why doesn't this approach work, and how can I get something similar that does?
(My expected output is for @start/@end to be modified accordingly, however unit tests show that they stay at their original values.)
It's important in Ruby to remember the distinction between variables and the objects they hold. Simply setting a variable will never change the object referenced by that variable. When you do a += b, it's just shorthand for a = a + b. So you're assigning a new value to the variable a, not changing the object that used to be there or changing any other references to that object. So changing the variable time doesn't change @start.
In order to assign to an instance variable, you need to actually assign to that instance variable. Here's a way to do what you were looking for:
operation = :+
amount = 12
@start, @end = [@start, @end].map {|time| time.send(operation, amount)}
You'll notice that we're not faffing around with that :add and :sub business either — we can just pass the actual name of the message we want to send (I used + in this case, but it could be anything).
If you had a big, dynamically generated list of ivars you wanted to set, it's only a little bit more complicated. The only difference there is that you need to get and set the ivars by name.
ivars = [:@start, :@end, :@something_else]
operation = :+
amount = 12
ivars.each {|ivar| instance_variable_set(ivar, instance_variable_get(ivar).send(operation, amount))}
The += operator only rebinds the block-local variable time; it does not touch @start or @end. The block's result has to be assigned back to the instance variables, so the right code is:
@start,@end = [@start,@end].map!{ |time| time + (operation == :add ? amount : -amount) }
EDIT
Updated the code to actually change @start and @end.
The addition operation in the block doesn't modify 'time', it returns a new value. So the elements in the array aren't modified, they're replaced.
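The behaviour described in these answers can be checked with a short sketch. The class TimeWindow and both method names are hypothetical, introduced only to give @start and @end a home:

```ruby
# Hypothetical container for the two instance variables from the question.
class TimeWindow
  attr_reader :start, :end

  def initialize(start_time, end_time)
    @start = start_time
    @end = end_time
  end

  # The original attempt: map! replaces the elements of a temporary array,
  # so the adjusted values are computed and then thrown away.
  def adjust_without_assignment(operation, amount)
    delta = operation == :add ? amount : -amount
    [@start, @end].map! { |time| time += delta }
  end

  # The fix: assign the mapped values back to the instance variables.
  def adjust(operation, amount)
    delta = operation == :add ? amount : -amount
    @start, @end = [@start, @end].map { |time| time + delta }
    self
  end
end

window = TimeWindow.new(1.0, 5.0)
discarded = window.adjust_without_assignment(:add, 2.0)
puts discarded.inspect # adjusted values exist only in the temporary array
puts window.start      # still the original value

window.adjust(:sub, 0.5)
puts window.start
puts window.end
```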
