Sed variable too long - bash

I need to substitute a unique string in a json file: {FILES} by a bash variable that contains thousands of paths: ${FILES}
sed -i "s|{FILES}|$FILES|" ./myFile.json
What would be the most elegant way to achieve that ? The content of ${FILES} is a result of an "aws s3" command. The content would look like :
FILES="/file1.ipk, /file2.ipk, /subfolder1/file3.ipk, /subfolder2/file4.ipk, ..."
I can't think of a solution where xargs would help me.

The safest way is probably to let Bash itself expand the variable. You can create a Bash script containing a here document with the full contents of myFile.json, with the placeholder {FILES} replaced by a reference to the variable $FILES (not the contents itself). Execution of this script would generate the output you seek.
For example, if myFile.json would contain:
{foo: 1, bar: "{FILES}"}
then the script should be:
#!/bin/bash
cat << EOF
{foo: 1, bar: "$FILES"}
EOF
You can generate the script with a single sed command:
sed -e '1i#!/bin/bash\ncat << EOF' -e 's/\$/\\$/g;s/{FILES}/$FILES/' -e '$aEOF' myFile.json
Notice sed is doing two replacements; the first one (s/\$/\\$/g) to escape any dollar signs that might occur within the JSON data (replace every $ by \$). The second replaces {FILES} by $FILES; the literal text $FILES, not the contents of the variable.
Now we can combine everything into a single Bash one-liner that generates the script and immediately executes it by piping it to Bash:
sed -e '1i#!/bin/bash\ncat << EOF' -e 's/\$/\\$/g;s/{FILES}/$FILES/' -e '$aEOF' myFile.json | /bin/bash
Or even better, execute the script without spawning a subshell (useful if $FILES is set without export):
sed -e '1i#!/bin/bash\ncat << EOF' -e 's/\$/\\$/g;s/{FILES}/$FILES/' -e '$aEOF' myFile.json | source /dev/stdin
Output:
{foo: 1, bar: "/file1.ipk, /file2.ipk, /subfolder1/file3.ipk, /subfolder2/file4.ipk, ..."}

Maybe perl would have fewer limitations?
perl -pi -e "s#{FILES}#${FILES}#" ./myFile.json

It's a little gross, but you can do it all within shell...
while read l
do
if ! echo "$l" | grep -q '{DATA}'
then
echo "$l"
else
echo "$l" | sed 's/{DATA}.*$//'
echo "$FILES"
echo "$l" | sed 's/^.*{DATA}//'
fi
done <./myfile.json >newfile.json
#mv newfile.json myfile.json
Obviously I'd leave the final line commented until you were confident it worked...

Maybe just don't do it? Can you just :
echo "var f = " > myFile2.json
echo $FILES >> myFile2.json
And reference myFile2.json from within your other json file? (You should put the global f variable into a namespace if this works for you.)

Instead of putting all those variables in an environment variable, put them in a file. Then read that file in perl:
foo.pl:
open X, "$ARGV[0]" or die "couldn't open";
shift;
$foo = <X>;
while (<>) {
s/world/$foo/;
print;
}
Command to run:
aws s3 ... >/tmp/myfile.$$
perl foo.pl /tmp/myfile.$$ <myFile.json >newFile.json
Hopefully that will bypass the limitations of the environment variable space and the argument length by pulling all the processing within perl itself.

Related

Modify bash variable with sed

Why doesn't the follow bash script work? I would like it to output
two lines like this:
XXXXXXX
YYYYYYY
It works if I change the sed line to use a filename instead of the variable, but I want to use the variable.
#!/bin/bash
input=$(echo -e '=======\n-------\n')
for sym in = -; do
if [ "$sym" == '-' ]; then
replace=Y
else
replace=X
fi
printf "%s\n" "s/./$replace/g"
done | sed -f- <<<"$input"
The main problem is that you're giving sed two sources to read standard input from: the for loop that is fed through the pipe, and the variable coming through the here-string. As it turns out, the here-string gets precedence and sed complains that there are extra characters after a command (= is a command).
Instead of a here-string, you could use process substitution:
for sym in = -; do
if [ "$sym" == '-' ]; then
replace=Y
else
replace=X
fi
printf "%s\n" "s/./$replace/g"
done | sed -f- <(printf '%s\n' '=======' '-------')
You'll notice that the output isn't what you want, though, namely
YYYYYYY
YYYYYYY
This is because the sed script you end up with looks like this:
s/./X/g
s/./Y/g
No matter what you do first, the last command replaces everything with Y.

Turning a list of abs pathed files to a comma delimited string of files in bash

I have been working in bash, and need to create a string argument. bash is a newish for me, to the point that I dont know how to build a string in bash from a list.
// foo.txt is a list of abs file names.
/foo/bar/a.txt
/foo/bar/b.txt
/delta/test/b.txt
should turn into: a.txt,b.txt,b.txt
OR: /foo/bar/a.txt,/foo/bar/b.txt,/delta/test/b.txt
code
s = ""
for file in $(cat foo.txt);
do
#what goes here? s += $file ?
done
myShellScript --script $s
I figure there was an easy way to do this.
with for loop:
for file in $(cat foo.txt);do echo -n "$file",;done|sed 's/,$/\n/g'
with tr:
cat foo.txt|tr '\n' ','|sed 's/,$/\n/g'
only sed:
sed ':a;N;$!ba;s/\n/,/g' foo.txt
This seems to work:
#!/bin/bash
input="foo.txt"
while IFS= read -r var
do
basename $var >> tmp
done < "$input"
paste -d, -s tmp > result.txt
output: a.txt,b.txt,b.txt
basename gets you the file names you need and paste will put them in the order you seem to need.
The input field separator can be used with set to create split/join functionality:
# split the lines of foo.txt into positional parameters
IFS=$'\n'
set $(< foo.txt)
# join with commas
IFS=,
echo "$*"
For just the file names, add some sed:
IFS=$'\n'; set $(sed 's|.*/||' foo.txt); IFS=,; echo "$*"

Read a file and replace ${1}, ${2}... value with string

I have a file template.txt and its content is below:
param1=${1}
param2=${2}
param3=${3}
I want to replace ${1},{2},${3}...${n} string values by elements of scriptParams variable.
The below code, only replaces first line.
scrpitParams="test1,test2,test3"
cat template.txt | for param in ${scriptParams} ; do i=$((++i)) ; sed -e "s/\${$i}/$param/" ; done
RESULT:
param1=test1
param2=${2}
param3=${3}
EXPECTED:
param1=test1
param2=test2
param3=test3
Note: I don't want to save replaced file, want to use its replaced value.
If you intend to use an array, use a real array. sed is not needed either:
$ cat template
param1=${1}
param2=${2}
param3=${3}
$ scriptParams=("test one" "test two" "test three")
$ while read -r l; do for((i=1;i<=${#scriptParams[#]};i++)); do l=${l//\$\{$i\}/${scriptParams[i-1]}}; done; echo "$l"; done < template
param1=test one
param2=test two
param3=test three
Learn to debug:
cat template.txt | for param in ${scriptParams} ; do i=$((++i)) ; echo $i - $param; done
1 - test1,test2,test3
Oops..
scriptParams="test1 test2 test3"
cat template.txt | for param in ${scriptParams} ; do i=$((++i)) ; echo $i - $param; done
1 - test1
2 - test2
3 - test3
Ok, looks better...
cat template.txt | for param in ${scriptParams} ; do i=$((++i)) ; sed -e "s/\${$i}/$param/" ; done
param1=test1
param2=${2}
param3=${3}
Ooops... so what's the problem? Well, the first sed command "eats" all the input. You haven't built a pipeline, where one sed command feeding the next... You have three seds trying to read the same input. Obviously the first one processed the whole input.
Ok, let's take a different approach, let's create the arguments for a single sed command (note: the "" is there to force echo not to interpret -e as an command line switch).
sedargs=$(for param in ${scriptParams} ; do i=$((++i)); echo "" -e "s/\${$i}/$param/"; done)
cat template.txt | sed $sedargs
param1=test1
param2=test2
param3=test3
That's it. Note that this isn't perfect, you can have all sort of problems if the replace texts are complex (e.g.: contain space).
Let me think how to do this in a better way... (well, the obvious solution which comes to mind is not to use a shell script for this task...)
Update:
If you want to build a proper pipeline, here are some solutions: How to make a pipe loop in bash
You can do that with just bash alone:
#!/bin/bash
scriptParams=("test1" "test2" "test3") ## Better store it as arrays.
while read -r line; do
for i in in "${!scriptParams[#]}"; do ## Indices of array scriptParams would be populated to i starting at 0.
line=${line/"\${$((i + 1))}"/"${scriptParams[i]}"} ## ${var/p/r} replaces patterns (p) with r in the contents of var. Here we also add 1 to the index to fit with the targets.
done
echo "<br>$line</br>"
done < template.txt
Save it in a script and run bash script.sh to get an output like this:
<br>param1=test1</br>
<br>param2=test2</br>
<br>param3=test3</br>

Speed up bash filter function to run commands consecutively instead of per line

I have written the following filter as a function in my ~/.bash_profile:
hilite() {
export REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g")
while read line
do
echo $line | egrep "$1" | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
done
exit 0
}
to find lines of anything piped into it matching a regular expression, and highlight matches using ANSI escape codes on a VT100-compatible terminal.
For example, the following finds and highlights the strings bin, U or 1 which are whole words in the last 10 lines of /etc/passwd:
tail /etc/passwd | hilite "\b(bin|[U1])\b"
However, the script runs very slowly as each line forks an echo, egrep and sed.
In this case, it would be more efficient to do egrep on the entire input, and then run sed on its output.
How can I modify my function to do this? I would prefer to not create any temporary files if possible.
P.S. Is there another way to find and highlight lines in a similar way?
sed can do a bit of grepping itself: if you give it the -n flag (or #n instruction in a script) it won't echo any output unless asked. So
while read line
do
echo $line | egrep "$1" | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
done
could be simplified to
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
EDIT:
Here's the whole function:
hilite() {
REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g");
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
}
That's all there is to it - no while loop, reading, grepping, etc.
If your egrep supports --color, just put this in .bash_profile:
hilite() { command egrep --color=auto "$#"; }
(Personally, I would name the function egrep; hence the usage of command).
I think you can replace the whole while loop with simply
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
because sed can read from stdin line-by-line so you don't need read
I'm not sure if running egrep and piping to sed is faster than using sed alone, but you can always compare using time.
Edit: added -n and p to sed to print only highlighted lines.
Well, you could simply do this:
egrep "$1" $line | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
But I'm not sure that it'll be that much faster ; )
Just for the record, this is a method using a temporary file:
hilite() {
export REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g")
export FILE=$2
if [ -z "$FILE" ]
then
export FILE=~/tmp
echo -n > $FILE
while read line
do
echo $line >> $FILE
done
fi
egrep "$1" $FILE | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
return $?
}
which also takes a file/pathname as the second argument, for case like
cat /etc/passwd | hilite "\b(bin|[U1])\b"

Substitution with sed + bash function

my question seems to be general, but i can't find any answers.
In sed command, how can you replace the substitution pattern by a value returned by a simple bash function.
For instance, I created the following function :
function parseDates(){
#Some process here with $1 (the pattern found)
return "dateParsed;
}
and the folowing sed command :
myCatFile=`sed -e "s/[0-3][0-9]\/[0-1][0-9]\/[0-9][0-9]/& parseDates &\}/p" myfile`
I found that the caracter '&' represents the current pattern found, i'd like it to be passed to my bash function and the whole pattern to be substituted by the pattern found +dateParsed.
Does anybody have an idea ?
Thanks
you can use the "e" option in sed command like this:
cat t.sh
myecho() {
echo ">>hello,$1<<"
}
export -f myecho
sed -e "s/.*/myecho &/e" <<END
ni
END
you can see the result without "e":
cat t.sh
myecho() {
echo ">>hello,$1<<"
}
export -f myecho
sed -e "s/.*/myecho &/" <<END
ni
END
Agree with Glenn Jackman.
If you want to use bash function in sed, something like this :
sed -rn 's/^([[:digit:].]+)/`date -d #&`/p' file |
while read -r line; do
eval echo "$line"
done
My file here begins with a unix timestamp (e.g. 1362407133.936).
Bash function inside sed (maybe for other purposes):
multi_stdin(){ #Makes function accepet variable or stdin (via pipe)
[[ -n "$1" ]] && echo "$*" || cat -
}
sans_accent(){
multi_stdin "$#" | sed '
y/àáâãäåèéêëìíîïòóôõöùúûü/aaaaaaeeeeiiiiooooouuuu/
y/ÀÁÂÃÄÅÈÉÊËÌÍÎÏÒÓÔÕÖÙÚÛÜ/AAAAAAEEEEIIIIOOOOOUUUU/
y/çÇñÑߢÐð£Øø§µÝý¥¹²³ªº/cCnNBcDdLOoSuYyY123ao/
'
}
eval $(echo "Rogério Madureira" | sed -n 's#.*#echo & | sans_accent#p')
or
eval $(echo "Rogério Madureira" | sed -n 's#.*#sans_accent &#p')
Rogerio
And if you need to keep the output into a variable:
VAR=$( eval $(echo "Rogério Madureira" | sed -n 's#.*#echo & | desacentua#p') )
echo "$VAR"
do it step by step. (also you could use an alternate delimiter , such as "|" instead of "/"
function parseDates(){
#Some process here with $1 (the pattern found)
return "dateParsed;
}
value=$(parseDates)
sed -n "s|[0-3][0-9]/[0-1][0-9]/[0-9][0-9]|& $value &|p" myfile
Note the use of double quotes instead of single quotes, so that $value can be interpolated
I'd like to know if there's a way to do this too. However, for this particular problem you don't need it. If you surround the different components of the date with ()s, you can back reference them with \1 \2 etc and reformat however you want.
For instance, let's reverse 03/04/1973:
echo 03/04/1973 | sed -e 's/\([0-9][0-9]\)\/\([0-9][0-9]\)\/\([0-9][0-9][0-9][0-9]\)/\3\/\2\/\1/g'
sed -e 's#[0-3][0-9]/[0-1][0-9]/[0-9][0-9]#& $(parseDates &)#' myfile |
while read -r line; do
eval echo "$line"
done
You can glue together a sed-command by ending a single-quoted section, and reopening it again.
sed -n 's|[0-3][0-9]/[0-1][0-9]/[0-9][0-9]|& '$(parseDates)' &|p' datefile
However, in contrast to other examples, a function in bash can't return strings, only put them out:
function parseDates(){
# Some process here with $1 (the pattern found)
echo dateParsed
}

Resources