I've got a config .cfg file that has the hostname hard coded in it. I'm trying to find a way for the hostname to be gotten locally (dynamically) by running a command similar to hostname -f to have it configure the variable in the .cfg, without running a script, like python, to write the config file ahead time. Is it possible to run a 'yum' command that gets the hostname to use in the YAML/yml file?
Thanks to Wikipedia, I think I found out why no one is helping me with this:
Wiki YAML -> Security
Security
YAML is purely a data representation language and thus has no executable commands. While validation and safe parsing is inherently possible in any data language, implementation is such a notorious pitfall that YAML's lack of an associated command language may be a relative security benefit.
However, YAML allows language-specific tags so that arbitrary local objects can be created by a parser that supports those tags. Any YAML parser that allows sophisticated object instantiation to be executed opens the potential for an injection attack. Perl parsers that allow loading of objects of arbitrary class create so-called "blessed" values. Using these values may trigger unexpected behavior, e.g. if the class uses overloaded operators. This may lead to execution of arbitrary Perl code.
The situation is similar for Python or Ruby parsers. According to the PyYAML documentation
Related
Is there a way to use dictionary rather than using a yaml config for parameters.yml? I want to keep it as a Python Object because my IDE can then track the dependency easily. For my parameters, I am injecting functions in it.
If i need to use yml, I will have to use
def steps1(x, func1):
func1 = eval(func1)
And this will break the refactoring features easily.
You could overwrite the params property in your src.package_name.run.ProjectContext so that it uses a Python dictionary instead of the config loader. You’re also welcome to write up your custom ConfigLoader and use that instead (by overriding _create_config_loader), but that’s probably more effort.
Please bear in mind that parameters in Kedro are meant to be “as dumb as possible” though, as it’s considered static configuration and it’s better separated out of code. What you describe, with expressions, sounds more suited for nodes.
I'm into unit testing of some legacy shell scripts.
In the real world scripts are often used to call utility programs
like find, tar, cpio, grep, sed, rsync, date and so on with some rather complex command lines containing a lot of options. Sometimes regular expressions or wildcard patterns are constructed and used.
An example: A shell script which is usually invoked by cron in regular intervals has the task to mirror some huge directory trees from one computer to another using the utility rsync.
Several types of files and directories should be excluded from the
mirroring process:
#!/usr/bin/env bash
...
function mirror() {
...
COMMAND="rsync -aH$VERBOSE$DRY $PROGRESS $DELETE $OTHER_OPTIONS \
$EXCLUDE_OPTIONS $SOURCE_HOST:$DIRECTORY $TARGET"
...
if eval $COMMAND
then ...
else ...
fi
...
}
...
As Michael Feathers wrote in his famous book Working Effectively with Legacy Code, a good unit test runs very fast and does not touch the network, the file-system or opens any database.
Following Michael Feathers advice the technique to use here is: dependency injection. The object to replace here is utility program rsync.
My first idea: In my shell script testing framework (I use bats) I manipulate $PATH in a way that a mockup rsync is found instead of
the real rsync utility. This mockup object could check the supplied command line parameters and options. Similar with other utilities used in this part of the script under test.
My past experience with real problems in this area of scripting were often bugs caused by special characters in file or directory names, problems with quoting or encodings, missing ssh keys, wrong permissions and so on. These kind of bugs would have escaped this technique of unit testing. (I know: for some of these problems unit testing is simply not the cure).
Another disadvantage is that writing a mockup for a complex utility like rsync or find is error prone and a tedious engineering task of its own.
I believe the situation described above is general enough that other people might have encountered similar problems. Who has got some clever ideas and would care to share them here with me?
You can mockup any command using a function, like this:
function rsync() {
# mock things here if necessary
}
Then export the function and run the unittest:
export -f rsync
unittest
Cargill's quandary:
" Any design problem can be solved by adding an additional level of indirection, except for too many levels of indirection."
Why mock system commands ? After all if you are programming Bash, the system is your target goal and you should evaluate your script using the system.
Unit test, as the name suggests, will give you a confidence in a unitary part of the system you are designing. So you will have to define what is your unit in the case of a bash script. A function ? A script file ? A command ?
Given you want to define the unit as a function I would then suggest writing a list of well known errors as you listed above:
Special characters in file or directory names
Problems with quoting or encodings
Missing ssh keys
Wrong permissions and so on.
And write a test case for it. And try to not deviate from the system commands, since they are integral part of the system you are delivering.
I am looking for a way to emit YAML files avoiding the use of aliases (mostly for simplified human readability). I think extending Psych::Visitors::Emitter or
Psych::Visitors::Visitor is the way to go, but I cannot actually find where Ruby decides whether to dump an anchor in full, or reference it with an alias.
I wouldn't even mind if the anchors were used repeatedly (with their &...... references), I just need to expand aliases to the full structures.
I am aware of similar questions being asked in the past, but:
Ruby YAML write without aliases remained unanswered
Is it possible to emit valid YAML with anchors / references disabled using Ruby or Python? gave answer for Python but not for Ruby
One simple (hacky) approach I used was convert the yaml to json. and then convert it back to YAML. new YAML does not contain aliases/anchors.
require 'json'
jsonObj = oldYaml.to_json
newYaml = YAML.load(jsonObj)
print newYaml.to_yaml
The only way I've found to do this is to perform a deep clone of the object being dumped to YAML. This is because YAML will identify the anchors and aliases based on their identity, and if you clone or dup them, the new object will be equal, but have a different identity.
There are many ways to perform a deep clone, including library support, or writing your own helper function -- I'll leave that as an exercise for the reader.
I have one large project with components in multiple languages that each depend on some of the same enum values. What solutions have you come up with to unify enums across multiple arbitrary languages? I can think of a few, but I'm looking for the best solution.
(In my implementation, I'm using Php, Java, Javascript, and SQL.)
You can put all of the enums in a text file, then use a code generator to write out the appropriate syntax for each language from that common file so that each component has the enums. Make that text file the authoritative source of information.
You can express the text file in XML but I'd think a tab-delimited flat file would work just fine.
Make them in a format that every language can understand or has a library for. I am using JSON for this at the moment.
Then you can include it with two ways:
For development: Load it from a file/URL at runtime
good for small changes you want too see immediately
slow
For productive usage: Include it in the files
using a build script
fast
no instant feedback
I would apply the dry principle and using code generator as such you could add anew language easely even if it has not enum natively existing.
I'm working on a project which will do some complicated analyzing on some user-supplied input. There will be 3 parts of the code:
1) Input supplied by user, such as keywords
2) Rules, such as if keyword 1 is repeated 3 times in keyword 5, do this, etc.
3) And the analyzing itself which executes the rules and processes the user input, and generates the output necessary based on the processing.
Naturally this will lead to a lot of spaghetti code and many, many if statements in the processing code. I want to avoid that, and keep the rules (i.e. the if statements) separately from the code which loops through the user input and generates the output.
How can I do that, i.e. what is the best way?
If you have enough rules that you want to externalize, you could try using a business rules engines, like Drools in Java.
A business rules engine is a software system that executes one or more business rules in a runtime production environment. The rules might come from legal regulation ("An employee can be fired for any reason or no reason but not for an illegal reason"), company policy ("All customers that spend more than $100 at one time will receive a 10% discount"), or other sources. (Wikipedia)
It could be a little bit overhead depending of what you're trying to do. In my company we're using such kind of tools for our quality analysis tool.
Store it in XML. Easy to parse and update.
I had designed a code generator, which can be controllable from a xml file.
For each command I had a entry in the xml. I was processing the node to generate the opcode for that command. Node itself contains the actions I need to do for getting the opcode. For some commands I had to look into database, all those things I had put in this xml file.
Well, i doubt that it is necessary to have hughe if statements if polymorphism is applied correctly.
Actually, you need a proper domain model for your rules. This goes somehow into the direction of the command pattern, depending on the complexitiy of your code maybe in combination with the state machine pattern.
Once you have your model, defining rules is instantiate them correctly.
This could be done by having an xml definition, which is parsed and transformed into your model. But the new modern and even more fancy way would be using DSLs. If you program in Java and have a certain freedom about your libraries, this would be a proper use case for Embedded DSLs with Groovy. Basically you would need a Builder which constructs your model, that's all.
You always can implement factory that will create certain strategies according to passed parameters. And then you will use those strategies in your code without any if.
If it's just detecting keywords, a finite state machine or similar. If it's doing more, then other pattern matching systems, such as rules engines.
Adding an embedded scripting language to your application might help. The rules would then be expressed in scripts, executed by the applications on processing.
The idea is that scripts are easy to change and contain high level logic that will be executed by your application in details.
There are a lot of scripting languages available to do this : lua, Python, Falcon, squirrel, angelscript, etc.
Have a look at rule engines!
The approach from Lars may also be arguable.