Bash, sed: replacing pattern with multiline file content

If I have a file named some_file with the following content:
first line
second line
third line
and inside a script:
VAR1="first line\nsecond line\nthird line"
VAR2="`cat some_file`"
I expect VAR1 and VAR2 to be the same, but sed shows that they are not:
sed "s/^a/${VAR1}/" some_another_file # this is OK
sed "s/^a/${VAR2}/" some_another_file # this fails with a syntax error
I suppose the newline representation is somehow different, but I can't find a way to make VAR2 equal to VAR1.
Thanks in advance.

This will read in the file and replace the line with its contents:
sed -e '/^a/{r some_file' -e 'd}' some_another_file

Change VAR1 to:
VAR1=$(echo -e "first line\nsecond line\nthird line")
Then test them:
$ [ "$VAR1" == "$VAR2" ] && echo equal
equal
Update:
To get sed to work, change VAR2 so that it contains literal "\n" sequences instead of newline characters.
VAR2=$(sed ':a;N;$!ba;s/\n/\\n/g' some_file)
sed "s/^a/${VAR2}/" some_another_file
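As a rough check of the two answers above, the following sketch runs both approaches against throwaway files and compares the results. The temporary directory and the sample contents of some_another_file are assumptions made for illustration, and GNU sed is assumed for both the r/d one-liner and the \n handling in the replacement.

```shell
#!/usr/bin/env bash
# Sketch only: the work directory and sample contents are hypothetical.
workdir=$(mktemp -d)
printf 'first line\nsecond line\nthird line\n' > "$workdir/some_file"
printf 'x\na\ny\n' > "$workdir/some_another_file"

# Approach 1: queue the file with "r", then delete the matching line.
result_r=$(sed -e "/^a/{r $workdir/some_file" -e 'd}' "$workdir/some_another_file")

# Approach 2: turn real newlines into the two characters \n, which GNU sed
# turns back into newlines inside the replacement text.
VAR2=$(sed ':a;N;$!ba;s/\n/\\n/g' "$workdir/some_file")
result_s=$(sed "s/^a/${VAR2}/" "$workdir/some_another_file")

printf '%s\n' "$result_r"
rm -rf "$workdir"
```

Note that the second approach pastes the file content into the sed program, so it breaks if the file contains /, & or backslashes; the r command does not have that limitation.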

Related

How to parse multiple line output as separate variables

I'm relatively new to bash scripting and I would like someone to explain this properly, thank you. Here is my code:
#! /bin/bash
echo "first arg: $1"
echo "first arg: $2"
var="$( grep -rnw $1 -e $2 | cut -d ":" -f1 )"
var2=$( grep -rnw $1 -e $2 | cut -d ":" -f1 | awk '{print substr($0,length,1)}')
echo "$var"
echo "$var2"
The problem I have is with the output, the script I'm trying to write is a c++ function searcher, so upon launching my script I have 2 arguments, one for the directory and the second one as the function name. This is how my output looks like:
first arg: Projekt
first arg: iseven
Projekt/AX/include/ax.h
Projekt/AX/src/ax.cpp
h
p
Now my question is: how can I save the line-by-line output in variables, so that later on I can use var as a path, or use var2 as a character to compare? My plan was to use if statements to determine the type. The idea: IF(last_char == p){echo:"something"}. What I've tried is the approach from this question: Capturing multiple line output into a Bash variable, and then treating the result as an array, so my code looked like "${var[0]}". Please explain how I can use each line of output later on as a variable.
I'd use readarray to populate an array variable, in case there are spaces in your command's output that shouldn't be treated as field separators and would otherwise mess up foo=( ... ). You can also use the shell's substring parameter expansion to get the last character of a variable; there's no need for the awk step in your var2:
#!/usr/bin/env bash
readarray -t lines < <(printf "%s\n" "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")
for line in "${lines[@]}"; do
printf "%s\n%s\n" "$line" "${line: -1}" # Note the space before the -1
done
will display
Projekt/AX/include/ax.h
h
Projekt/AX/src/ax.cpp
p
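Building on that, here is a minimal sketch of the comparison the question describes: each line is kept as an array element and its last character decides the file type. The kinds array and the header/source/other labels are made up for illustration, and readarray assumes bash 4 or later.

```shell
#!/usr/bin/env bash
# Sample paths stand in for the real grep | cut output.
readarray -t paths < <(printf '%s\n' "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")

kinds=()   # hypothetical: one label per path
for path in "${paths[@]}"; do
  case "${path: -1}" in          # last character of the path
    h) kinds+=("header") ;;
    p) kinds+=("source") ;;
    *) kinds+=("other") ;;
  esac
done

# Each line is now usable on its own, e.g. paths[0] as a file path.
for i in "${!paths[@]}"; do
  printf '%s is a %s file\n' "${paths[$i]}" "${kinds[$i]}"
done
```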

How to insert in bash a char at specific line and position in line

I have a file where I want to add a * char on a specific line, at a specific position in that line.
Is that possible?
Thank you
You can use an external tool such as sed or awk to manipulate the data, either directly from the command line or from within your bash script.
Example:
$ a="This is a test program that will print
Hello World!
Test programm Finished"
$ sed -E '2s/(.{4})/&\*/' <<<"$a" #Or <file
#Output:
This is a test program that will print
Hell*o World!
Test programm Finished
In the test above, we insert an asterisk after the 4th character of line 2.
If you want to operate on a file and change it in place, use sed -E -i '....'
The same result can also be achieved with GNU awk:
awk 'BEGIN{OFS=FS=""}NR==2{sub(/./,"&*",$4)}1' <<<"$a"
In pure bash you can achieve the above output with something like this:
while read -r line;do
let ++c
[[ $c == 2 ]] && printf '%s*%s\n' "${line:0:4}" "${line:4}" || printf '%s\n' "${line}"
# alternative:
# [[ $c == 2 ]] && echo "${line:0:4}*${line:4}" || echo "${line}"
done <<<"$a"
#Alternative for file read:
# done <file >newfile
If your variable is just a single line, you don't need the loop. You can do it directly like:
printf '%s*%s\n' "${a:0:4}" "${a:4}"
# Or even
printf '%s\n' "${a:0:4}*${a:4}" #or echo "${a:0:4}*${a:4}"
I suggest using sed. If you want to insert an asterisk on the 2nd line, after the 5th character:
sed -r "2s/^(.{5})(.*)$/\1*\2/" myfile.txt
2s says you are going to perform a substitution on the 2nd line. ^(.{5})(.*)$ says you are taking 5 characters from the beginning of the line and all characters after it. \1*\2 says you are building the string from the first match (i.e. 5 beginning characters) then a * then the second match (i.e. characters until the end of the line).
If your line and column are in variables, you can do something like this:
_line=5
_column=2
sed -r "${_line}s/^(.{${_column}})(.*)$/\1*\2/" myfile.txt
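If the character to insert may itself be special to sed, one alternative is to pass everything to awk as variables. The sketch below is not taken from the answers above; insert_at is a hypothetical helper that reads standard input and inserts the given text after the given column of the given line.

```shell
#!/usr/bin/env bash
# insert_at LINE COL TEXT: hypothetical helper, reads stdin and writes stdout.
insert_at() {
  awk -v l="$1" -v c="$2" -v ch="$3" '
    NR == l { $0 = substr($0, 1, c) ch substr($0, c + 1) }  # split the line at column c
    { print }
  '
}

a="This is a test program that will print
Hello World!
Test programm Finished"

out=$(insert_at 2 4 '*' <<<"$a")
printf '%s\n' "$out"
```

Because the text travels through awk -v, backslash escapes in it are interpreted by awk, so a literal backslash would need to be doubled.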

search lines in bash for specific character and display line

I am trying to search the input in bash and echo any line that contains the + character, followed by some text saying it is a special case. The code runs, but both lines of the input file are displayed. Thank you :)
#!/bin/bash
printf "Please enter the variant the following are examples"
echo " c.274G>T or c.274-10G>A"
printf "variant(s), use a comma between multiple: "; IFS="," read -a variant
for ((i=0; i<${#variant[@]}; i++))
do printf "NM_000163.4:%s\n" ${variant[$i]} >> c:/Users/cmccabe/Desktop/Python27/input.txt
done
awk '{for(i=1;i<=NF;++i)if($i~/+/)print $i}' input.txt
echo "$i" "is a special case"
input.txt
NM_000163.4:c.138C>A
NM_000163.4:c.266+83G>T
desired output ( this line contains a + in it)
NM_000163.4:c.266+83G>T is a special case
Edit:
It looks like I need to escape the +, and that is part of my problem.
You can change your awk script as below and get rid of the echo. The + is escaped because a bare leading + is not a portable regular expression:
$ awk '/\+/{print $0,"is a special case"}' file
NM_000163.4:c.266+83G>T is a special case
As far as I understand your problem, you can do it with a single sed command:
sed -n '/+/ {s/$/ is a special case/ ; p}' input.txt
On lines containing +, it replaces the end of the line ($) with your text, thus appending it. The line is then printed.
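Since + is a regular expression operator, another way to avoid the escaping question altogether is a plain substring test. This is a sketch of that idea using awk's index function rather than a regex match; the input variable simply holds the two sample lines from the question.

```shell
#!/usr/bin/env bash
# Sample data copied from the question's input.txt.
input=$'NM_000163.4:c.138C>A\nNM_000163.4:c.266+83G>T'

# index() returns the position of the literal "+", or 0 when it is absent,
# so no regular expression escaping is involved.
special=$(awk 'index($0, "+") { print $0, "is a special case" }' <<<"$input")
printf '%s\n' "$special"
```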

Bash script get item from array

I'm trying to read a file line by line in bash.
Every line has the format text|number.
I want to produce a file with the format text,text,text and so on, so the new file would contain just the text from the previous file, separated by commas.
Here is what I've tried and couldn't get to work:
FILENAME=$1
OLD_IFS=$IFS
IFS=$'\n'
i=0
for line in $(cat "$FILENAME"); do
array=(`echo $line | sed -e 's/|/,/g'`)
echo ${array[0]}
i=i+1;
done
IFS=$OLD_IFS
But this prints both the text and the number, in a different format: text number
Here is the sample input:
dsadadq-2321dsad-dasdas|4212
dsadadq-2321dsad-d22as|4322
Here is the sample output:
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
What did I do wrong?
Not pure bash, but you could do this in awk:
awk -F'|' 'NR>1{printf(",")} {printf("%s",$1)}'
Alternately, in pure bash and without having to strip the final comma:
#!/bin/bash
# You can get your input from somewhere else if you like. Even stdin to the script.
input=$'dsadadq-2321dsad-dasdas|4212\ndsadadq-2321dsad-d22as|4322\n'
# Output should be reset to empty, for safety.
output=""
# Step through our input. (I don't know your column names.)
while IFS='|' read left right; do
# Only add a field if it exists. Salt to taste.
if [[ -n "$left" ]]; then
# Append data to output string
output="${output:+$output,}$left"
fi
done <<< "$input"
echo "$output"
No need for arrays and sed:
while IFS='' read line ; do
echo -n "${line%|*}",
done < "$FILENAME"
You just have to remove the last comma :-)
Using sed:
$ sed ':a;N;$!ba;s/|[0-9]*\n*/,/g;s/,$//' file
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Alternatively, here is a bit more readable sed with tr:
$ sed 's/|.*$/,/g' file | tr -d '\n' | sed 's/,$//'
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Choroba has the best answer (imho) except that it does not handle blank lines and it adds a trailing comma. Also, mucking with IFS is unnecessary.
This is a modification of his answer that solves those problems:
while read line ; do
if [ -n "$line" ]; then
if [ -n "$afterfirst" ]; then echo -n ,; fi
afterfirst=1
echo -n "${line%|*}"
fi
done < "$FILENAME"
The first if just filters out blank lines. The second if and the $afterfirst variable just prevent the extra comma: the loop echoes a comma before every entry except the first one. ${line%|*} is a bash parameter expansion that deletes the end of a parameter if it matches a pattern. line is the parameter, % is the symbol that indicates a trailing pattern should be deleted, and |* is the pattern to delete.
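For comparison, the same join can be sketched with two standard tools instead of a loop. This is a different technique from the answers above: cut keeps the first |-separated field and paste -s joins the lines with commas, so there is no trailing comma to strip. It assumes the input has no blank lines; the input variable holds the sample data from the question.

```shell
#!/usr/bin/env bash
# Sample data copied from the question.
input=$'dsadadq-2321dsad-dasdas|4212\ndsadadq-2321dsad-d22as|4322'

# cut: keep the text before the first "|"; paste -s: join all lines with ",".
joined=$(printf '%s\n' "$input" | cut -d'|' -f1 | paste -sd, -)
printf '%s\n' "$joined"
```

With a real file, the printf stage would be replaced by reading "$FILENAME" directly in cut.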

How can I capture the text between specific delimiters into a shell variable?

I have a little problem with setting my variable. I have a file with normal text, and somewhere in it there are brackets [ ] (only one pair in the whole file) with some text between them. I need to capture the text within these brackets in a shell (bash) variable. How can I do that, please?
Bash/sed:
VARIABLE=$(tr -d '\n' < filename | sed -n -e '/\[[^]]/s/^[^[]*\[\([^]]*\)].*$/\1/p')
If that is unreadable, here's a bit of an explanation:
VARIABLE=$(subexpression) Assigns the output of the subexpression to the variable VARIABLE.
tr -d '\n' < filename Reads filename (tr only reads standard input, hence the redirection), deletes newline characters, and prints the result to sed's input
sed -n -e 'command' Executes the sed command without printing any lines
/\[[^]]/ Execute the command only on lines which contain [some text]
s/ Substitute
^[^[]* Match any non-[ text
\[ Match [
\([^]]*\) Match any non-] text into group 1
] Match ]
.*$ Match any text
/\1/ Replaces the line with group 1
p Prints the line
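To see the breakdown above in action, here is a small sketch that runs the pipeline against a throwaway file. The temporary file and its contents are invented for the example. Because tr deletes the newlines, text that spanned two lines inside the brackets comes out joined together.

```shell
#!/usr/bin/env bash
# Hypothetical sample file with the bracketed text spread over two lines.
tmpfile=$(mktemp)
printf 'some text\nmore [captured\ntext] trailing\n' > "$tmpfile"

# tr reads only standard input, so the file is redirected into it.
VARIABLE=$(tr -d '\n' < "$tmpfile" | sed -n -e '/\[[^]]/s/^[^[]*\[\([^]]*\)].*$/\1/p')

printf '%s\n' "$VARIABLE"
rm -f "$tmpfile"
```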
May I point out that while most of the suggested solutions might work, there is absolutely no reason why you should fork another shell, and spawn several processes to do such a simple task.
The shell provides you with all the tools you need:
$ var='foo[bar] pinch'
$ var=${var#*[}; var=${var%%]*}
$ echo "$var"
bar
See: http://mywiki.wooledge.org/BashFAQ/073
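The same two expansions can be applied to a whole file, as sketched below. Reading the file with $(<file) and the sample contents are assumptions added for illustration; unlike the tr-based pipeline, newlines inside the brackets are preserved.

```shell
#!/usr/bin/env bash
# Hypothetical sample file with the bracketed text spread over two lines.
tmpfile=$(mktemp)
printf 'some text\nmore [captured\ntext] trailing\n' > "$tmpfile"

content=$(<"$tmpfile")      # whole file in one variable (bash)
content=${content#*\[}      # drop everything up to and including the first [
inner=${content%%\]*}       # drop everything from the first ] onwards

printf '%s\n' "$inner"
rm -f "$tmpfile"
```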
Sed is not necessary:
var=`egrep -o '\[.*\]' FILENAME | tr -d ][`
But it only works with single-line matches.
Using Bash builtin regex matching seems like yet another way of doing it:
var='foo[bar] pinch'
[[ "$var" =~ [^\]\[]*\[([^\[]*)\].* ]] # Bash 3.0
var="${BASH_REMATCH[1]}"
echo "$var"
Assuming you are asking about a bash variable:
$ export YOUR_VAR=$(perl -ne'print $1 if /\[(.*?)\]/' your_file.txt)
The above works if brackets are on the same line.
What about:
shell_variable=$(sed -ne '/\[/,/\]/{s/^.*\[//;s/\].*//;p;}' $file)
Worked for me on Solaris 10 under Korn shell; should work with Bash too. Replace '$(...)' with back-ticks in Bourne shell.
Edit: worked when given [ on one line and ] on another. For the single line case as well, use:
shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
The first '-e' deals with the multi-line spread; the second '-e' deals with the single-line case. The first '-e' says:
From the line containing an open bracket [ not followed by a close bracket ] on the same line
Until the line containing close bracket ],
substitute anything up to and including the open bracket with an empty string,
substitute anything from the close bracket onwards with an empty string, and
print the result
The second '-e' says:
For any line containing both open bracket and close bracket
Substitute the pattern consisting of 'characters up to and including open bracket', 'characters up to but excluding close bracket' (and remember this), 'stuff from close bracket onwards' with the remembered characters in the middle, and
print the result
For the multi-line case:
$ file=xxx
$ cat xxx
sdsajdlajsdl
asdajsdkjsaldjsal
sdasdsad [aaaa
bbbbbbb
cccc] asdjsalkdjsaldjlsaj
asdjsalkdjlksjdlaj
asdasjdlkjsaldja
$ shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
$ echo $shell_variable
aaaa bbbbbbb cccc
$
And for the single-line case:
$ cat xxx
sdsajdlajsdl
asdajsdkjsaldjsal
sdasdsad [aaaa bbbbbbb cccc] asdjsalkdjsaldjlsaj
asdjsalkdjlksjdlaj
asdasjdlkjsaldja
$
$ shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
$ echo $shell_variable
aaaa bbbbbbb cccc
$
Somewhere about here, it becomes simpler to do the whole job in Perl, slurping the file and editing the result string in two multi-line substitute operations.
var=`grep -e '\[.*\]' test.txt | sed -e 's/.*\[\(.*\)\].*/\1/'`
Thanks to everyone. I used Strager's version and it works perfectly. Thanks a lot once again.
Backslashes (BSL) got munched up ... :
var='foo[bar] pinch'
[[ "$var" =~ [^\]\[]*\[([^\[]*)\].* ]] # Bash 3.0
# Just in case ...:
[[ "$var" =~ [^BSL]BSL[]*BSL[([^BSL[]*)BSL].* ]] # Bash 3.0
var="${BASH_REMATCH[1]}"
echo "$var"
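One way to sidestep the backslash problem described above is to keep the regular expression in a variable and use it unquoted on the right-hand side of =~. This is a sketch of that common idiom, not the original poster's code; the names re and inner are made up here.

```shell
#!/usr/bin/env bash
var='foo[bar] pinch'

# Single quotes keep the backslashes intact until the regex engine sees them.
re='\[([^]]*)\]'            # literal [, then a group of non-] characters, then literal ]

inner=""
if [[ $var =~ $re ]]; then
  inner=${BASH_REMATCH[1]}  # text captured by the parenthesised group
fi
printf '%s\n' "$inner"
```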
2 simple steps to extract the text.
split var at [ and get the right part
split var at ] and get the left part
cb0$ var='foo[bar] pinch'
cb0$ var=${var#*[}
cb0$ var=${var%]*} && echo $var
bar

Resources