Bash Functions Order and Timing - bash

This should be easy to answer, but I couldn't find exactly what I was asking on Google or Stack Overflow.
I have a bash script with 18 functions (785 lines). Ridiculous, I know; I need to learn another language for the lengthy stuff. I have to run these functions in a particular order because the functions later in the sequence use info from the database and/or text files that were modified by the preceding functions. I am pretty much done with the core functionality of each function individually, and I would like one function to run them all (One ring to rule them all!).
So my questions are, if I have a function like so:
function precious()
{
rings_of #Functions in Sequence
elves #This function Modifies DB
men #This function uses DB to modify text
dwarves #This function uses that modified text
}
Would variables be carried from one function to the next if declared like so? (inside of a function):
function men()
{
...
frodo_sw_name=`some DB query returning the name of Frodo's sword`
...
}
Also, if the functions are called in a specific order, as seen above, will Bash wait for one function to finish before starting the next? - I am pretty sure the answer is yes, but I have a lot of typing to do either way, and since I couldn't find this answer quickly on the internet, I figured it might benefit others to have this answer posted as well.
Thanks!

Variables persist unless you run the function in a subshell. This happens if you run it as part of a pipeline or group it with (...). Use { ... } for grouping instead if you don't want to create a subshell.
The exception is if you explicitly declare the variables in the function with declare, typeset, or local, which makes them local to that function rather than global to the script. You can, however, pass the -g option to declare or typeset to create a global variable from inside a function (local has no such option).
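A minimal sketch of these scoping rules, in case it helps. The function and variable names are made up for illustration, and declare -g assumes bash 4.2 or newer.

```shell
#!/usr/bin/env bash

set_global() {
    frodo_sw_name="Sting"        # plain assignment: global to the script
}

set_local() {
    local ring_bearer="Frodo"    # local: gone once the function returns
}

set_declared_global() {
    declare -g wizard="Gandalf"  # declare -g: global despite using declare
}

set_elsewhere() {
    staff="oak"                  # global, but only in the shell that runs it
}

set_global
set_local
set_declared_global
( set_elsewhere )                # (...) is a subshell, so the assignment is lost
echo "sw=${frodo_sw_name:-unset} bearer=${ring_bearer:-unset} wizard=${wizard:-unset} staff=${staff:-unset}"

{ set_elsewhere; }               # { ...; } groups in the current shell
echo "staff=${staff:-unset}"
```

The first echo shows only the plain and declare -g assignments surviving; the second shows that the same function, run inside braces instead of parentheses, does change the calling shell.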
See this tutorial on variable scope in bash.
Commands are all run sequentially, unless you deliberately background them with & at the end. There's no difference between functions and other commands in this regard.
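To see the ordering for yourself, here is a rough sketch. The function names come from the question; the temporary file and the sleep are only there to make the order visible and are not part of the original script.

```shell
#!/usr/bin/env bash

order_file=$(mktemp)

elves() { sleep 1; echo elves >> "$order_file"; }   # deliberately slow
men()   { echo men >> "$order_file"; }

precious() {
    elves     # bash waits here until elves returns
    men       # starts only after elves has finished
}

precious
sequential=$(tr '\n' ' ' < "$order_file")

: > "$order_file"     # empty the file for the second run
elves &               # backgrounded: the script moves on immediately
men
wait                  # block until the background job is done
backgrounded=$(tr '\n' ' ' < "$order_file")

rm -f "$order_file"
echo "sequential: $sequential"
echo "backgrounded: $backgrounded"
```

Keep in mind that a backgrounded function runs in a subshell, so any variables it sets are not visible to the rest of the script.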

Related

Refactor eval(some_variable).is_a?(Proc) to not use eval

I have some old code that looks like:
some_variable = "-> (params) { Company.search_by_params(params) }"
if eval(some_variable).is_a?(Proc)
...
Rubocop is complaining about the use of eval. Any ideas on how to remove the usage of eval?
I don't really understand Procs so any guidance on that would be appreciated.
Simple. Don't define your variable as a string; define it as a lambda (a kind of Proc):
my_lambda = ->(params) { Company.search_by_params(params) }
if my_lambda.is_a?(Proc)
#do stuff
end
But why would you instantiate a string object which contains what appears to be a normal lambda which is a Proc, when you can define a Proc instead?
I am going to answer the question "If I want to run code at a later time, what is the difference between using a proc and an eval'd string?" (which I think is part of your question and confusion):
What eval does is take a string, parse it as code, and then run it. This string can come from anywhere, including user input. But eval is very unsafe and problematic, especially when used with raw user input.
The problems with eval are usually:
There is almost always a better way to do it
Very dangerous and insecure
Makes debugging difficult
Slow
Using eval allows full control of the Ruby process and, if the Ruby process has been given high permissions, potentially even root access to the machine. So the general recommendation is to use eval only if you have absolutely no other option, and especially not with user input.
Procs/lambdas/blocks also let you save code for later (and solve most of the problems with eval; they are the "better way"), but instead of storing arbitrary code as a string to read later, they are code already, parsed and ready to go. In some ways, they are methods you can pass around. Making a proc/lambda gives you an object with a #call method. When you later want to run the proc/block/lambda, you call call([arguments...]). What you can't do with procs, though, is let users write arbitrary code (and generally that's good). You have to write the code for the proc in a file Ruby loads (most of the time). Eval does get around that, but you should rethink whether you really want that to be possible.
Your code sample oddly combines both these methods: it evaluates a string to a lambda. So what's happening here is that eval runs the code in the string right away and returns the (last) result, which in this case happens to be a lambda/proc. (Note that this happens every time you run eval, which results in multiple copies of the proc with different identities but the same behavior.) Since the code inside the string happens to make a lambda, the value returned is a Proc which can later be #call'd. So when eval is run, the code is parsed and a new lambda is created, with the code in the lambda stored to be run at a later time. If the code inside the string did not create a lambda, then all that code would be run immediately when eval was called with the string.
This behavior might be desired, but there is probably a better way to do this, and this is definitely a foot-gun: there are at least a half dozen subtle ways this code could do unintended things if you weren't really careful with it.

How to find variable names in Bash source code

I'm writing an experimental Bash module system that would allow local function namespaces, and my first idea was to write a Bash function parser that would read the function code line by line and prepend each function/variable name with <module-name>. (i.e. function func in module module would become module.func - which could again be imported in another module like module_2.module.func and so on; variables inside functions would be name-mangled - variable var within function func in module module would become __module_func_var).
However, in order to do that, I need a way to detect which names are variables and replace all their occurrences in the function with the transported import-name. Trivial cases like variable=[...] are easily parsable, but there are countless other cases where it's not that trivial - what about while read variable; do [...] done and variable2="asdf${variable//_/+}"?
It seems to me that in order to do this I need to dive into the parsing mechanisms of Bash or read a book on programming languages - but where do I start in order to achieve what I have explained above?
I need a way to detect which names are variables
I'm sorry to say this, but in general it's impossible.
Supporting only the static cases where variables can occur is possible but very tricky. Consider only variable assignments: Besides x= there are declare x=, printf -v x, read x, mapfile x, readarray x and probably many more. Even mature tools like shellcheck still have problems parsing all these cases correctly (for instance, see this issue).
However, even if you mastered parsing all the static cases correctly, there could still be dynamic variables, for instance:
x=$(someCommand)
declare "$x=something"
In this example you cannot know the name of the new variable without executing someCommand. Other things which are equally bad (or even worse) are bash's indirection operator ${!x}, implicit indirection in arithmetic contexts (e.g. x=y; echo $((x))), and eval.
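A small runnable sketch of these three dynamic cases. The names ring_one, ref, x and y are invented for the example, and printf stands in for someCommand.

```shell
#!/usr/bin/env bash

# The variable name is computed at run time; a static scan sees only "$name".
name="ring_$(printf '%s' one)"
declare "$name=precious"

# Indirect expansion: ${!ref} reads the variable whose name is stored in ref.
ref=$name
indirect=${!ref}

# Arithmetic context resolves names recursively: x holds "y", y holds 7.
y=7
x=y
arith=$((x + 1))

echo "$name=$indirect arith=$arith"
```

Nothing in the source text spells out ring_one, yet the variable exists after the declare line runs, which is exactly what makes purely textual renaming unreliable.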
tl;dr: The only way to get all the variables in a script is to interpret/execute the script.
But here comes another problem: Executing the script is also not an option if there is non-determinism (declare "$(tr -cd a-z < /dev/urandom | head -c1)=..."). Note that user input is also non-deterministic (read x; declare "var$x=..."). You would have to write a static analyzer. But this is also not an option because of the halting problem. From the halting problem we can deduce that it is (in general) impossible to tell whether a given bash script has a finite number of variables.
To implement your module system you could use another approach. For instance, anyone who wants to implement a module for your framework could be required to list the functions/variables of that module in an easily parsable format.
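As a rough illustration of that idea, the sketch below reads a manifest and prints the import names using the naming scheme from the question. The manifest format (function <name>, variable <function> <name>) and the helper mangle_manifest are purely hypothetical, not an existing convention.

```shell
#!/usr/bin/env bash

# Hypothetical manifest format, one declaration per line:
#   function <name>
#   variable <function> <name>
manifest='function func
variable func var'

# Print the import name for every declaration read from stdin.
mangle_manifest() {
    local module=$1 kind first second
    while read -r kind first second; do
        case $kind in
            function) printf '%s.%s\n' "$module" "$first" ;;
            variable) printf '__%s_%s_%s\n' "$module" "$first" "$second" ;;
        esac
    done
}

mangled=$(mangle_manifest module <<< "$manifest")
echo "$mangled"
```

With an explicit manifest the loader never has to guess which names are variables; it only has to trust what the module author declared.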

Setting up default overwritable constructors and destructors or other functions for a set of commands

alias cmd_name="source mainshell cmd_name"
My plan is to alias a set of script names to a single main script. On invocation of any of those names, the main script is called, and it can define certain functions, including a default constructor and destructor. It then checks whether the target script file has its own constructor definition; if it has one, it calls that constructor, otherwise the default one. It then sources the script and finally calls the destructor. This also gives the script access to the default functions set up by the main script. This should work fine, but these aliases can't be exported to subshells.
To add to that, I want these default functions to be available only to that particular set of aliased commands, and I want them destroyed once command execution is complete. That's why I can't just put them in .bash_profile, which would make them global.
command_name() {
# initial code
source path/to/command_name
# destructing code
}
Another option I found was to create a function for each name and call my script inside it. This one is exportable too. That way I could encapsulate every command in a function of the same name and easily add initialization and teardown code. The problem here is that I can't define any more functions inside that function, and doing everything inside a function would get really clumsy.
Another thought I had was symbolic links, but they seem to have a limit to how many I can create to a particular script.
What would be the best way to achieve this? Or, if it's somehow an inappropriate design, can someone please explain why?
IIUC you're trying to achieve the following:
A set of commands that necessarily take place in the context of the current shell rather than a new shell process.
These commands have a lot of common functionality that needs to be factored out of these commands.
The common functionality must not be accessible to the current shell.
In that case, the common code would be e.g. functions & variables that you have to explicitly unset after the command has been executed. Therefore, your best bet is to have a function per-command, have that function source the common code, and have the common code also have another function (called before you return) to unset everything.
Notes:
You can actually declare functions inside other functions, but the nested functions will actually be global - so name them uniquely and don't forget to unset them.
If they don't need to affect the current shell then you can just put them in their own file, source the common code, and not unset anything.
There is generally no limit to how many symlinks you can create to a single file. The limit on symlink chains (symlink to symlink to symlink etc.), however, is low.
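Here is one way the function-per-command approach could look. It is only a sketch: common_greet, common_cleanup and the temporary file are invented stand-ins for your shared code and its location.

```shell
#!/usr/bin/env bash

# Stand-in for the shared file; in practice it would already exist on disk.
common=$(mktemp)
cat > "$common" <<'EOF'
common_greet()   { echo "hello from $1"; }
common_cleanup() { unset -f common_greet common_cleanup; }
EOF

# One wrapper function per command, named after the command.
cmd_name() {
    source "$common"          # "constructor": load the shared helpers
    common_greet cmd_name     # the command's own work goes here
    common_cleanup            # "destructor": remove the helpers again
}

cmd_name

# After the command returns, the helpers are gone from the current shell.
if declare -F common_greet > /dev/null; then leaked=yes; else leaked=no; fi
echo "helpers leaked: $leaked"

rm -f "$common"
```

The helpers are global while cmd_name runs, as noted above for nested functions, so unique names still matter if two such commands can overlap.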

Best way to use variable across functions in python

I need to refer to the same variable in several functions in a python script. (The write context of a CGPDFDocument).
I can either write global writeContext in every function;
or add the variable as an argument to every function;
....both of which seem to be excessive duplication.
Other answers to questions about global variables in python suggest that they are "bad".
So what's the better way of handling it?

Looking for difference between xdmp:value() vs xdmp:eval()

Can someone provide a better explanation of the xdmp:eval() and xdmp:value() functions?
I tried to follow the Developer API documentation, but I am not really satisfied with the examples there, and it's a bit vague to me. I would really appreciate it if someone could help me understand those functions and their differences with examples.
Both functions are for executing strings of code dynamically, but xdmp:value is evaluated against the current context, such that if you have variables defined in the current scope or modules declared, you can reference them without redeclaring them.
xdmp:eval necessitates the creation of an entirely new context that has no knowledge of the context calling xdmp:eval. One must define a new XQuery prolog, and variables from the main context are passed to the xdmp:eval call as parameters and declared as external variables in the eval script.
Generally, if you can use xdmp:value, it's probably the best choice; however, xdmp:eval has some capabilities that xdmp:value doesn't, namely everything defined in the <options> argument. Through these options, it's possible to control the user executing the query, the database it's executed against, transaction mode, etc.
There is another function for executing dynamic strings: xdmp:unpath, and it's similar to xdmp:value, but more limited in that it can only execute XPath.
