scala process with spaces not working correctly - shell

I have a Scala process command like the one below, using the Linux egrep command. But the search results are not the same in the terminal and in my Scala-generated file: the Scala results contain everything that has "new" and "Exception", while I want the output to contain only lines having "new Exception". Am I missing something here? Please help.
if (("egrep -r -I -n -E \"*new Exception*\" /mysource/" #|
"grep -v .svn").! == 0) {
out.println(("egrep -r -I -n -E \"*new Exception*\" /mysource/" #|
"grep -v .svn").!!)
}

The docs say (under "What to Run and How"): Implicitly, each process is created either out of a String, with arguments separated by spaces -- no escaping of spaces is possible -- or out of a scala.collection.Seq, where the first element represents the command name, and the remaining elements are arguments to it. In this latter case, arguments may contain spaces
So, apparently, if you need to pass the command line a single argument containing spaces, like "new Exception", you need to create the process builder from a Seq instead of a single String.
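For example, a minimal sketch of the same pipeline built from Seqs (untested; out and /mysource/ are taken from the question, and the pattern is simplified to a plain "new Exception", since the original *new Exception* is not a valid ERE anyway):

import scala.sys.process._

// Build each stage from a Seq so "new Exception" stays one argument.
val search = Seq("egrep", "-r", "-I", "-n", "new Exception", "/mysource/")
val filter = Seq("grep", "-v", ".svn")
val pipeline = search #| filter

if (pipeline.! == 0) {
  out.println(pipeline.!!)  // as in the original, this runs the pipeline a second time
}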

Related

How to remove duplicates with xargs in a bash script when the string has some quotes ""?

I am a newbie at bash scripting.
Here is my environment:
Mac OS X Catalina
/bin/bash
I found a mix of several commands to remove duplicate strings within a string.
I needed it for my program, which updates the .zshrc profile file.
Here is my code:
#!/bin/bash
a='export PATH="/Library/Frameworks/Python.framework/Versions/3.8/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/local/bin:"'
myvariable=$(echo "$a" | tr ':' '\n' | sort | uniq | xargs)
echo "myvariable : $myvariable"
Here is the output:
xargs: unterminated quote
myvariable :
After some testing, I know that the source of the issue is the double quotes "" inside my variable $a.
Why am I so sure?
Because when I execute this code for example:
#!/bin/bash
a="/Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home:/Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home"
myvariable=$(echo "$a" | tr ':' '\n' | sort | uniq | xargs)
echo "myvariable : $myvariable"
where $a doesn't contain any quotes, I get the correct output:
myvariable : /Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home
I tried to search for a solution to "xargs: unterminated quote", but each answer I found on the web is for a particular case which doesn't correspond to my problem.
As I am a newbie and this command line uses several complex commands, I was wondering if anyone knows the magic trick to make it work.
Basically, you want to remove duplicates from a colon-separated list.
I don't know if this is considered cheating, but I would do this in another language and invoke it from bash. First I would write a script for this purpose in zsh: it accepts as a parameter a string with colon separators and outputs a colon-separated list with duplicates removed:
#!/bin/zsh
original=${1?Parameter missing} # Original string
# Auxiliary array, which is set up to act like a Set, i.e. without
# duplicates
typeset -aU nodups_array
# Split the original strings on the colons and store the pieces
# into the array, thereby removing duplicates. The core idea for
# this is stolen from:
# https://stackoverflow.com/questions/2930238/split-string-with-zsh-as-in-python
nodups_array=("${(@s/:/)original}")
# Join the array back with colons and write the resulting string
# to stdout.
echo ${(j':')nodups_array}
If we call this script nodups_string, you can invoke it from your bash script as:
#!/bin/bash
a_path="/Library/Frameworks/Python.framework/Versions/3.8/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/local/bin:"
nodups_a_path=$(nodups_string "$a_path")
my_variable="export PATH=$nodups_a_path"
echo "myvariable : $myvariable"
The overall effect would be literally what you asked for. However, there is still an open problem I should point out: if one of the PATH components happens to contain a space, the resulting export statement cannot validly be executed. This problem is also inherent in your original code; you just didn't mention it. You could do something like
my_variable=export\ PATH='"'"$nodups_a_path"'"'
to avoid this. Of course, I wonder why you go to such effort to generate a syntactically valid export command, instead of simply building the PATH directly where it is needed.
Side note: if you used zsh as your shell instead of bash, and only wanted to keep your PATH free of duplicates, a simple
typeset -U path
would suffice, and zsh takes care of the rest.
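For example, a quick sketch (path is the zsh array tied to PATH):

path=(/bin /usr/bin /bin)
typeset -U path
print -rl -- $path

This prints /bin and /usr/bin, the later duplicate having been dropped.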
With awk:
awk -v RS='[:"]' 'NR > 1 { pth[$0]="" } END { for (i in pth) { if (i !~ /[[:space:]]+/ && i != "" ) { printf "%s:",i } } }' <<< "$a"
Set the record separator to : or a double quote (note this relies on awk treating a multi-character RS as a regex, which GNU awk does). Then, when the record number is greater than one, store the path as an index of an array called pth, which removes duplicates. At the end, loop through the array, printing the paths separated with :.
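For example, with the asker's $a from above:

awk -v RS='[:"]' 'NR > 1 { pth[$0]="" } END { for (i in pth) { if (i !~ /[[:space:]]+/ && i != "" ) { printf "%s:",i } } }' <<< "$a"

One possible output (the order depends on awk's hash iteration and may vary):
/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/opt/local/bin:/Library/Frameworks/Python.framework/Versions/3.8/bin: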

Linux script text substitutions

I want to make a few configuration files (for homeassistant) that are very similar to each other. I am aiming to use a template file as the base, put a few substitution strings at the top of the file, and use a bash script to read the substitutions and run sed with the applicable strings.
i.e.
# substitutions
# room = living_room
# switch = hallway_motion
# delay = 3
automations:
foo......
.........
entity_id: $switch
When I run the script, it will look for any line beginning with a # that has a word (the key), then an =, and another word or string (the value), and replace every occurrence of that key, prefixed with $, in the rest of the file.
Like what is done by esphome. https://esphome.io/guides/configuration-types.html#substitutions
I am getting stuck at finding the "keys" in the file. How can I script this so it can find all the "keys" recursively?
Or is there something that does this, or something similar, out there already?
You can do this with sed in two stages. The first stage will generate a second stage sed script to fill in your template. I'd make a small adjustment to your syntax and recommend that you require curly braces around your variable name. In other words, write your variable expansions like this:
# foo = bar
myentry: ${foo}
This makes it easier to avoid pitfalls when you have one variable name that's a prefix of another (e.g., foo and foobar).
#!/bin/bash
in="$1"
stage2=$(mktemp)
trap 'rm -f "$stage2"' EXIT
sed -n -e 's,^# \([[:alnum:]_]\+\) = \([^#]\+\),s#\${\1}#\2#g,p' "$in" > "$stage2"
sed -f "$stage2" "$in"
Provide a filename as the first argument, and it will print the filled out template on stdout.
This example code is pretty strict about white space on variable definition lines, but that can obviously be adjusted to your liking.
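For instance, given a template like this (reusing the asker's names, with the recommended braces):

# room = living_room
# switch = hallway_motion
entity_id: ${switch}

the first sed pass generates a stage-2 script containing:

s#${room}#living_room#g
s#${switch}#hallway_motion#g

and the second pass applies it to the template, so the last line prints as entity_id: hallway_motion. The definition lines pass through unchanged; adding a /^#/d expression to the second sed would drop them from the output.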

Is there a way to prevent injection attacks when building a command-line from untrusted input in bash?

I have a situation where a Bash script runs and parses a user-supplied JSON file using jq. Since it's supplied by the user, it's possible for them to include values in the JSON to perform an injection attack.
I'd like to know if there's a way to overcome this. Please note, the setup of: 'my script parsing a user-supplied JSON file' cannot be changed, as it's out of my control. Only thing I can control is the Bash script.
I've tried using jq with and without the -r flag, but in each case, I was successfully able to inject.
Here's what the Bash script looks like at the moment:
#!/bin/bash
set -e
eval "INCLUDES=($(cat user-supplied.json | jq '.Include[]'))"
CMD="echo Includes are: "
for e in "${INCLUDES[#]}"; do
CMD="$CMD\\\"$e\\\" "
done
eval "$CMD"
And here is an example of a sample user-supplied.json file that demonstrates an injection attack:
{
  "Include": [
    "\\\";ls -al;echo\\\""
  ]
}
The above JSON file results in the output:
Includes are: ""
, followed by a directory listing (an actual attack would probably be something far more malicious).
What I'd like instead is something like the following to be outputted:
Includes are: "\\\";ls -al;echo\\\""
Edit 1
I used echo as an example command in the script, which probably wasn’t the best example, as then the solution is simply not using eval.
However, the actual command that will be needed is dotnet test, and each array item from Includes needs to be passed as an option using /p:<Includes item>. What I was hoping for was a way to globally neutralise injection regardless of the command, but perhaps that's not possible, i.e. the technique you go for depends heavily on the actual command.
You don't need to use eval for dotnet test either. Many bash extensions not present in POSIX sh exist specifically to make eval usage unnecessary; if you think you need eval for something, you should provide enough details to let us explain why it isn't actually required. :)
#!/usr/bin/env bash
# ^^^^- Syntax below is bash-only; the shell *must* be bash, not /bin/sh
include_args=( )
IFS=$'\n' read -r -d '' -a includes < <(jq -r '.Include[]' user-supplied.json && printf '\0')
for include in "${includes[@]}"; do
  include_args+=( "/p:$include" )
done
dotnet test "${include_args[@]}"
To speak a bit to what's going on:
IFS=$'\n' read -r -d '' -a arrayname reads up to the next NUL character in stdin (-d specifies a single character to stop at; since C strings are NUL-terminated, the first character in an empty string is a NUL byte), splits on newlines, and puts the result into arrayname.
The shorter way to write this in bash 4.0 or later is readarray -t arrayname, but that doesn't have the advantage of letting you detect whether the program generating the input failed: Because we have the && printf '\0' attached to the jq code, the NUL terminator this read expects is only present if jq succeeds, thus causing the read's exit status to reflect success only if jq reported success as well.
< <(...) is redirecting stdin from a process substitution, which is replaced with a filename which, when read from, returns the output of running the code ....
The reason we can set include_args+=( "/p:$include" ) and have it be exactly the same as include_args+=( /p:"$include" ) is that the quotes are read by the shell itself and used to determine where to perform string-splitting and globbing; they're not persisted in the generated content (and thus later passed to dotnet test).
Some other useful references:
BashFAQ #50: I'm trying to put a command in a variable, but the complex cases always fail! -- explains in depth why you can't store commands in strings without using eval, and describes better practices to use instead (storing commands in functions; storing commands in arrays; etc).
BashFAQ #48: Eval command and security issues -- Goes into more detail on why eval is widely frowned on.
You don't need eval at all.
INCLUDES=( $(jq '.Include[]' user-supplied.json) )
echo "Includes are: "
for e in "${INCLUDES[#]}"; do
echo "$e"
done
The worst that can happen is that the unquoted command substitution may perform word-splitting or pathname expansion where you don't want it (which is a problem in your original as well), but there's no possibility for arbitrary command execution.
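If you don't need the jq failure detection the first answer provides, a shorter variant of the array-safe approach is the readarray builtin mentioned above (bash 4+; a sketch, with the same newline-splitting behavior):

readarray -t INCLUDES < <(jq -r '.Include[]' user-supplied.json)
echo "Includes are: "
for e in "${INCLUDES[@]}"; do
  echo "$e"
done

Each element of INCLUDES is exactly one Include entry, and no eval is involved.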

Unix: Remove date from filename using sed without modifying existing one

I have legacy code which transmits the file only if the command contains a date. But the client transmission doesn't want a date appended to the filename. The legacy code cannot be modified, since many other transmissions depend on it. So my requirement is: I want to have the date parameter in the command, but it also has to be removed again using a single command.
Condition in legacy code:
grep '\`date' $COMMAND
Note: COMMAND will contain the complete command defined below and not the filename (not CMD output).
So ideally my command should have `date added. I wrote a command like the one below.
CMD=`echo prefix_filename.txt | sed 's/^prefix_//'`_`date +%m%d%Y`
The above command is used to remove prefix_ and send the filename. Here I get the output filename.txt_09232016. Since the legacy code logic only checks whether the command has `date in it, I added it. Is there a way to remove the date again in the same command, so that my output will be filename.txt?
Current output:
filename.txt_09232016
Expected output:
filename.txt
Get the file name before date part:
echo 'filename.txt_09232016' | grep -o '^.*\.txt'
Or remove the date from the end of the filename:
echo 'filename.txt_09232016' | sed 's/_[0-9]\+$//'
There are a number of things you can do to improve/simplify your code. The main thing is that bash has very nice built-in string manipulation. Another is that you should probably use $(...) instead of `...` notation:
CMD=`echo prefix_filename.txt | sed 's/^prefix_//'`_`date +%m%d%Y`
Can be replaced with
ORIG=prefix_filename.txt
CMD=${ORIG#prefix_}_$(date +%m%d%Y)
Continuing,
echo $CMD
NODATE=${CMD%_*}
echo $NODATE
This prints
filename.txt_09232016
filename.txt
The construct ${var#pattern} removes the shortest match of pattern from the start of your variable: in this case, prefix_. Similarly, the construct ${var%pattern} removes the shortest match of pattern from the end of your string: in this case, _*.
In the first case, you could just as well have used ${var##pattern}, since prefix_ is a fixed string. However, in the second case you could not use ${var%%pattern}: _* is a wildcard pattern, and the longest match would truncate starting at the first underscore rather than the last one.
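For example, with a name that itself contains an underscore (a made-up value; for the asker's filename.txt_09232016 the two forms agree, since there is only one underscore):

v=file_name.txt_09232016
echo "${v%_*}"    # file_name.txt
echo "${v%%_*}"   # file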
Just as an FYI, the links point to www.tldp.org, which has the best Bash manual I have come across by far. It gets dense sometimes, but the explanations are generally worth it in the end.
Just do this:
echo filename.txt_09232016 | sed 's/_[^_]*$//'
Here you are replacing (with nothing) the _ and all subsequent characters up to the end of the string ($), since they are all different ([^_]) from _.

Create variable by combining text + another variable

Long story short, I'm trying to grep a value contained in the first column of a text file by using a variable.
Here's a sample of the script, with the grep command that doesn't work:
for ii in `cat list.txt`
do
  grep '^$ii' >outfile.txt
done
Contents of list.txt :
123,"first product",description,20.456789
456,"second product",description,30.123456
789,"third product",description,40.123456
If I perform grep '^123' list.txt, it produces the correct output... just the first line of list.txt.
If I try to use the variable (i.e. grep '^ii' list.txt) I get a "^ii command not found" error. I tried to combine text with the variable to get it to work:
VAR1= "'^"$ii"'"
but the VAR1 variable contained a carriage return after the $ii variable:
'^123
'
I've tried a laundry list of things to remove the CR/LF (i.e. sed & awk), but to no avail. There has to be an easier way to perform the grep command using the variable. I would prefer to stay with grep because it works perfectly when I run it manually.
You have things mixed up in the command grep '^ii' list.txt. The character ^ is for the beginning of the line, and a $ is for the value of a variable.
When you want to grep for 123, held in the variable ii, at the beginning of the line, use
ii="123"
grep "^$ii" list.txt
(You should use double quotes here)
A good moment for learning good habits: continue using lowercase variable names (well done), and use curly braces (they don't hurt and are needed in other cases):
ii="123"
grep "^${ii}" list.txt
Now we are both forgetting something: our grep would also match
1234,"4-digit product",description,11.1111
Include a , in the grep:
ii="123"
grep "^${ii}," list.txt
And how did you get the "^ii command not found" error? I think you used backquotes (the old way of nesting a command; better is echo "example: $(date)") and wrote
grep `^ii` list.txt # wrong !
#!/bin/sh
# Read every character before the first comma into the variable ii.
while IFS=, read -r ii rest; do
  # Echo the value of ii. If these values are what you want, you're done; no
  # need for grep.
  echo "ii = $ii"
  # If you want to find something associated with these values in another
  # file, however, you can grep the file for the values. Use double quotes so
  # that the value of $ii is substituted in the argument to grep. Append with
  # >> so each iteration's matches are kept (a plain > would truncate the
  # file on every pass through the loop).
  grep "^$ii" some_other_file.txt >>outfile.txt
done <list.txt
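With the asker's list.txt, the echo lines print:

ii = 123
ii = 456
ii = 789

while the grep matches, if any, are appended to outfile.txt.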
