How to grep all characters in file - bash

I have a CSV file with these lines:
----------+79975532211,----------+79975532212
4995876655,4995876658
I try to grep these lines in a Bash script:
#!/bin/bash
config='/test/config.conf'
sourcecsv=/test/sourse.csv
cat $sourcecsv | while read line
do
Oldnumber=$(echo $line | cut -d',' -f1)
cat $config | grep "\\$Oldnumber" -B 8
done
But when the script greps for the value 4995876655 I get an error:
grep: Invalid back reference
How can I grep for all the values in my file?

Instead of:
cat $config | grep "\\$Oldnumber" -B 8
You should do:
grep -B 8 -F -- "$Oldnumber" "$config"
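A minimal sketch of why the -F and -- options matter, using a throwaway file in place of /test/config.conf. The sample lines and the find_number helper are made up for illustration:

```shell
# Throwaway stand-in for the config file; its contents are invented sample data.
config=$(mktemp)
trap 'rm -f "$config"' EXIT
printf '%s\n' 'number=4995876655' 'number=----------+79975532211' > "$config"

# find_number is a hypothetical helper, not part of the original script.
find_number() {
    # -F: treat the pattern as a fixed string, not a regular expression.
    # --: end of options, so a pattern starting with '-' is not read as a flag.
    grep -F -- "$1" "$config"
}

find_number '4995876655'              # prints number=4995876655
find_number '----------+79975532211'  # prints number=----------+79975532211
```

Without --, the second call would most likely fail, because grep would try to parse the leading hyphens as options.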

If you really mean to grep for all strings between commas, you can do it all in one go.
tr ',' '\n' </test/sourse.csv |
grep -F -f - -B 8 /test/config.conf
If you need to obtain the matches in sequence (all matches for the first string followed by all matches for the second, etc) then maybe loop over them with a proper while loop:
tr ',' '\n' </test/sourse.csv |
while read -r Oldnumber; do
grep -F -B 8 -e "$Oldnumber" /test/config.conf
done
Keeping the file names in variables does not seem to offer any advantage here.
If you mean to search for the strings preceded by a literal backslash, you can add it back; the -F option I added turns all strings into literals. If you need metacharacters, and take out the -F option, you need to double the backslashes (inside double quotes, a single backslash needs to be represented as double; and to get a literal backslash in a regular expression, you need two of them).
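To see where the "Invalid back reference" message comes from, it can help to print the pattern that the shell actually hands to grep. A small sketch (the variable name pattern and the sample line are only for illustration):

```shell
Oldnumber=4995876655

# Inside double quotes, \\ collapses to a single backslash before grep runs.
pattern="\\$Oldnumber"
printf '%s\n' "$pattern"    # prints \4995876655

# grep then reads \4 as a back reference to a fourth group that does not exist.
# With -F the value is matched literally, so no escaping is needed at all.
printf '%s\n' 'old=4995876655' | grep -F -- "$Oldnumber"    # prints old=4995876655
```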

Related

Users who are logged on, in alphabetical order, printed on one line

Users who are logged on, in alphabetical order, printed on one line.
What is the minimum number of changes needed to get this to work in a bash script?
This is the given script:
for name in $@
do
who | grep -w "^name" | sed 's/ .*//' | uniq
done | sort | tr '\n' ' '
echo
Single line commands:
who | awk '{print $1}' | sort | uniq | tr '\n' ' '
who: list of logged in users.
awk '{print $1}': keep only the first word of each line, which is the username.
sort: put the usernames in alphabetical order.
uniq: remove duplicates.
tr '\n' ' ': replace newlines with spaces.
Example:
$ who
steve tty7 Mar 5 16:25 (:0)
bernard tty7 Mar 5 16:25 (:0)
sarah tty7 Mar 5 16:25 (:0)
$ who | awk '{print $1}' | sort | uniq | tr '\n' ' '
bernard sarah steve
Your code ran grep -w "^name", which tells grep to output the lines that start with the literal text "name", not the lines that begin with the value of the variable name. For that you would need grep -w "^$name".
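The difference can be checked without a multi-user machine. In this sketch, sample_who is a made-up function that prints invented lines in the shape of who output:

```shell
# Invented sample standing in for the output of who.
sample_who() {
    printf '%s\n' 'steve tty7 Mar 5 16:25 (:0)' \
                  'bernard tty7 Mar 5 16:25 (:0)' \
                  'sarah tty7 Mar 5 16:25 (:0)'
}

name=sarah

# "^name" is the literal text "name": no sample line starts with it.
literal_hits=$(sample_who | grep -cw "^name" || true)

# "^$name" expands to "^sarah" before grep sees it.
expanded_hits=$(sample_who | grep -cw "^$name" || true)

echo "literal: $literal_hits, expanded: $expanded_hits"    # prints literal: 0, expanded: 1
```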
Try this Shellcheck-clean code:
for name in "$@"
do
who | sed 's/[[:space:]].*//' | grep -xF -- "$name"
done | sort -u | paste -sd ' '
$@ should always have double quotes around it ("$@"). See Accessing bash command line args $@ vs $*. Shellcheck correctly complains if double quotes are not used.
sed 's/[[:space:]].*//' removes the first whitespace character, and everything after it, on every input line. Using [[:space:]] instead of a literal space character means that the code will still work if the who output uses tabs as separators. It may be easier to read too. The sed command is run first to ensure that usernames occupy whole lines so it's easier to avoid spurious matches at the next pipeline stage.
grep -xF -- "$name" searches for whole lines in the input that are the "$name" string. The -x option forces matching of whole lines. That prevents, for instance, the username mary matching the username mary.jane (a valid username on at least some Linux systems). The -F option means that regular expression patterns in "$name" are treated as literal strings. That prevents, for instance, the name t.m matching the name tim. The -- prevents a leading hyphen in "$name" being treated as a grep option. No system that I know of allows usernames to have leading hyphens, but there's nothing to stop such an invalid name being provided as a command line argument to the code. The -w option to grep wouldn't be useful here because valid names may contain non-word characters (e.g. t.m).
sort -u takes the output of the for loop (an unsorted list of usernames, one per line, possibly with repetitions) and sorts it. The -u option causes it to remove duplicates (like piping to uniq, but saves a process creation).
paste -sd ' ' puts all the lines of its input on a single line, separated by spaces (specified by the -d option and its ' ' (space) option argument), and terminated with a newline character. tr '\n' ' ' would have a similar effect but it produces an unterminated line with a trailing space character.
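The difference between the two joining commands is easy to miss on screen, so here is a small sketch that makes the trailing space visible. The names function and its contents are invented sample data:

```shell
# Invented list of names, one per line, with a duplicate.
names() { printf '%s\n' steve bernard sarah bernard; }

# paste joins the lines with spaces and ends the result with one newline.
# The trailing "-" names standard input explicitly, which some paste
# implementations require.
with_paste=$(names | sort -u | paste -sd ' ' -)

# tr turns every newline into a space, including the last one.
with_tr=$(names | sort -u | tr '\n' ' ')

# Brackets show exactly where each result ends.
printf '[%s]\n' "$with_paste" "$with_tr"
```

The first bracketed line ends right after the last name, while the second keeps a space before the closing bracket.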
All you need is:
who | sort -k1,1 -u | awk '{u=u s $1; s=OFS} END{print u}'
That will output a blank-separated list of all logged in users, all on 1 line, with a terminating newline to make it a valid POSIX text file, and without an undesirable trailing blank char.

bash cat exclude multiple files based on grep results

I have the following cat command that I use in a bash script. I look for $SAMPLE.txt file in subfolders 20* and combine them into 1 output.txt
cat /$FOLDER/20*/$SAMPLE.txt > /$OUTPUTFOLDER/output.txt
I now want to exclude certain files conditionally.
I found the following here https://unix.stackexchange.com/questions/246048/cat-files-except-one
$ shopt -s extglob
$ cat -- !(DISCARD).txt > catKEPT
I want to do something like this.
Look for $SAMPLE and a pattern '$PAT1' in a $SAMPLEFILE. This $SAMPLEFILE is comma-separated. If there is a match, I want to store the first field of this line and use it to exclude files from cat.
I would use this command to look for $SAMPLE and $PAT1 and then cut to keep the first field. I would assign that to a variable 'EXCLUDE_FOLDER':
EXCLUDE_FOLDER=grep '$SAMPLE' $SAMPLEFILE | grep '$PAT1' | cut -d "," -f 1
And then use it like this
cat /$FOLDER/20*/$SAMPLE.txt -- !($FOLDER/$EXLUDE_FOLDER/$SAMPLE.txt) > /$OUTPUTFOLDER/output.txt
I'm stuck at putting this into an if/statement and dealing with situations where grep results in multiple matches, so multiple files should be excluded
If SAMPLE and PAT are variables, you presumably want them expanded to their contents, which means you must put them in double quotes, not single quotes. Example:
SAMPLE=3
# Compare single quotes versus double
echo '$SAMPLE' # outputs $SAMPLE
echo "$SAMPLE" # outputs 3
If SAMPLEFILE is the name of a file, you must double-quote it, else it will fail if your filename has spaces in it, so you must use:
grep "$SAMPLE" "$SAMPLEFILE"
So, now you can test if your grep works like this:
grep "$SAMPLE" "$SAMPLEFILE" | grep "$PAT1" | cut -d "," -f 1
So, if that works, the next thing is that you want to capture the output of the command, so you need to use $(...). That means:
EXCLUDE_FOLDER=$(grep "$SAMPLE" "$SAMPLEFILE" | grep "$PAT1" | cut -d "," -f 1)
So, test whether that works now:
echo "$EXCLUDE_FOLDER"
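One possible way to go from there to the actual exclusion, including the case of several matches, is to loop over the candidate files and skip those whose folder is in the list. This is only a sketch under assumptions: the directory layout, the sample CSV contents and the values of SAMPLE and PAT1 are all invented, and a temporary directory stands in for $FOLDER and $OUTPUTFOLDER.

```shell
# All paths, names and sample data below are made up for illustration.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
SAMPLE=s1
PAT1=bad
SAMPLEFILE=$work/samples.csv
for d in 2001 2002 2003; do
    mkdir -p "$work/$d"
    echo "data from $d" > "$work/$d/$SAMPLE.txt"
done
printf '%s\n' '2002,s1,bad' '2003,s1,good' '2001,s2,bad' > "$SAMPLEFILE"

# One excluded folder per line; there may be zero, one or many matches.
EXCLUDE_FOLDER=$(grep -- "$SAMPLE" "$SAMPLEFILE" | grep -- "$PAT1" | cut -d "," -f 1)

for f in "$work"/20*/"$SAMPLE.txt"; do
    folder=$(basename "$(dirname "$f")")
    # -x: whole-line match, -F: literal string, -q: no output, exit status only.
    if printf '%s\n' "$EXCLUDE_FOLDER" | grep -qxF -- "$folder"; then
        continue    # this folder is on the exclusion list
    fi
    cat "$f"
done > "$work/output.txt"

cat "$work/output.txt"
```

With this sample data only folder 2002 is excluded, so the output holds the contents of the 2001 and 2003 files. A loop avoids building an extglob pattern from the grep result, at the cost of one grep call per file.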

Delete words in a line using grep or sed

I want to delete three words with a special character on a line such as
Input:
\cf4 \cb6 1749,1789 \cb3 \
Output:
1749,1789
I have tried a couple sed and grep statements but so far none have worked, mainly due to the character \.
My unsuccessful attempt:
sed -i 's/ [.\c ] //g' inputfile.ext >output file.ext
Awk accepts a regex Field Separator (in this case, comma or space):
$ awk -F'[ ,]' '$0 = $3 "." $4' <<< '\cf4 \cb6 1749,1789 \cb3 \'
1749.1789
-F'[ ,]' - Use a single character from the set space/comma as Field Separator
$0 = $3 "." $4 - Set the entire line $0 to Field 3 ($3) followed by a literal period "." followed by Field 4 ($4); if the assignment yields a non-empty string, awk performs the default action (print the entire line)
Replace <<< 'input' with the file name if every line of that file has the same delimiters (spaces/comma) and number of fields. If your input file is more complex than the sample you shared, please edit your question to show actual input.
The backslash is a special meta-character that confuses bash.
We treat it like any other meta-character, by escaping it, with--you guessed it--a backslash!
But first, we need to grep this pattern out of our file
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file # Close enough! -E is needed so that + means "one or more"
Now, just sed out those pesky backslashes
| sed -e 's/\\//g' # Don't forget the g, otherwise it'll only strip out 1 backslash
Now, finally, sed out the clusters of 2 alpha followed by a number and a space!
| sed -e 's/[a-z][a-z][0-9] //g'
And, finally....
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file | sed -e 's/\\//g' | sed -e 's/[a-z][a-z][0-9] //g'
Output:
1749,1789
My guess is you are having trouble because you have backslashes in input and can't figure out how to get backslashes into your regex. Since backslashes are escape characters to shell and regex you end up having to type four backslashes to get one into your regex.
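The backslash counting can be checked directly. In this sketch the same sed substitution is written once in single quotes and once in double quotes; the variable names are only for illustration, and the input is the sample line from the question:

```shell
line='\cf4 \cb6 1749,1789 \cb3 \'

# In single quotes the shell passes \\ through unchanged; sed reads it as
# one literal backslash.
single=$(printf '%s\n' "$line" | sed 's/\\//g')

# In double quotes the shell halves the backslashes first, so four are
# needed for sed to receive the same two.
double=$(printf '%s\n' "$line" | sed "s/\\\\//g")

# Brackets make the trailing space visible.
printf '[%s]\n' "$single" "$double"
```

Both commands print the same text, which is why single quotes are usually the less error-prone choice for patterns that contain backslashes.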
Ben Van Camp already posted an answer that uses single quotes to make the escaping a little easier; however I shall now post an answer that simply avoids the problem altogether.
grep -o '[0-9]*,[0-9]*' | tr , .
Locks on to the comma and selects the digits on either side and outputs the number. Alternately if comma is not guaranteed we can do it this way:
grep -E -o ' [0-9,]*|^[0-9,]*' | tr , . | tr -d ' '
Both of these assume there's only one usable number per line.
$ awk '{sub(/,/,".",$3); print $3}' file
1749.1789
$ sed 's/\([^ ]* \)\{2\}\([^ ]*\).*/\2/; s/,/./' file
1749.1789

Bash variables not acting as expected

I have a bash script which parses a file line by line, extracts the date using a cut command and then makes a folder using that date. However, it seems like my variables are not being populated properly. Do I have a syntax issue? Any help or direction to external resources is very appreciated.
#!/bin/bash
ls | grep .mp3 | cut -d '.' -f 1 > filestobemoved
cat filestobemoved | while read line
do
varYear= $line | cut -d '_' -f 3
varMonth= $line | cut -d '_' -f 4
varDay= $line | cut -d '_' -f 5
echo $varMonth
mkdir $varMonth'_'$varDay'_'$varYear
cp ./$line'.mp3' ./$varMonth'_'$varDay'_'$varYear/$line'.mp3'
done
You have many errors and non-recommended practices in your code. Try the following:
for f in *.mp3; do
f=${f%%.*}
IFS=_ read -r _ _ varYear varMonth varDay <<< "$f"
echo $varMonth
mkdir -p "${varMonth}_${varDay}_${varYear}"
cp "$f.mp3" "${varMonth}_${varDay}_${varYear}/$f.mp3"
done
The actual error is that you need to use command substitution. For example, instead of
varYear= $line | cut -d '_' -f 3
you need to use
varYear=$(cut -d '_' -f 3 <<< "$line")
A secondary error is that $foo | some_command on its own line does not pipe the contents of $foo to the next command as input; instead, the contents of $foo are executed as a command, and the output of that command is passed to the next one.
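A short sketch of the command substitution fix applied to all three fields. The file name is hypothetical, and printf piped into cut is used here instead of a here-string only so that the snippet also runs in plain sh:

```shell
line='show_ep_2021_03_15'    # hypothetical file name without the .mp3 suffix

# $( ... ) runs the pipeline and stores what it prints in the variable.
varYear=$(printf '%s\n' "$line" | cut -d '_' -f 3)
varMonth=$(printf '%s\n' "$line" | cut -d '_' -f 4)
varDay=$(printf '%s\n' "$line" | cut -d '_' -f 5)

echo "${varMonth}_${varDay}_${varYear}"    # prints 03_15_2021
```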
Some best practices and tips to take into account:
Use a portable shebang line - #!/usr/bin/env bash (disclaimer: That's my answer).
Don't parse ls output.
Avoid useless uses of cat.
Use More Quotes™
Don't use files for temporary storage if you can use pipes. It is literally orders of magnitude faster, and generally makes for simpler code if you want to do it properly.
If you have to use files for temporary storage, put them in the directory created by mktemp -d. Preferably add a trap to remove the temporary directory cleanly.
There's no need for a var prefix in variables.
grep searches for basic regular expressions by default, so .mp3 matches any single character followed by the literal string mp3. If you want to search for a dot, you need to either use grep -F to search for literal strings or escape the regular expression as \.mp3.
You generally want to use read -r (defined by POSIX) to treat backslashes in the input literally.
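Two of the points above, the unescaped dot and read -r, can be seen in a few lines. The file names and the test string are invented for this sketch:

```shell
# .mp3 as a regular expression: any character followed by mp3.
regex_hits=$(printf '%s\n' 'song.mp3' 'xmp3list' | grep -c '.mp3')
# \.mp3 (or grep -F '.mp3'): a real dot followed by mp3.
dot_hits=$(printf '%s\n' 'song.mp3' 'xmp3list' | grep -c '\.mp3')

# read without -r treats the backslash as an escape character and drops it.
read without_r <<'EOF'
a\tb
EOF
# read -r keeps the backslash as ordinary data.
read -r with_r <<'EOF'
a\tb
EOF

echo "$regex_hits $dot_hits"                # prints 2 1
printf '%s\n' "$without_r" "$with_r"        # prints atb, then a\tb
```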

How to remove the last character from a bash grep output

COMPANY_NAME=`cat file.txt | grep "company_name" | cut -d '=' -f 2`
outputs something like this
"Abc Inc";
What I want to do is remove the trailing ";" as well. How can I do that? I am a beginner to bash. Any thoughts or suggestions would be helpful.
This will remove the last character contained in your COMPANY_NAME var, regardless of whether it is a semicolon:
echo "$COMPANY_NAME" | rev | cut -c 2- | rev
I'd use sed 's/;$//'. eg:
COMPANY_NAME=`cat file.txt | grep "company_name" | cut -d '=' -f 2 | sed 's/;$//'`
foo="hello world"
echo ${foo%?}
hello worl
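I'd use head --bytes -2, or head -c-2 for short (GNU head). The count is 2 rather than 1 because the output of cut ends with a newline, which head counts as the last byte.
COMPANY_NAME=`cat file.txt | grep "company_name" | cut -d '=' -f 2 | head --bytes -2`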
head outputs only the beginning of a stream or file. Typically it counts lines, but it can be made to count characters/bytes instead. head --bytes 10 will output the first ten characters, but head --bytes -10 will output everything except the last ten.
NB: you may have issues if the final character is multi-byte, but a semi-colon isn't
I'd recommend this solution over sed or cut because
It's exactly what head was designed to do, thus less command-line options and an easier-to-read command
It saves you having to think about regular expressions, which are cool/powerful but often overkill
It saves your machine having to think about regular expressions, so will be imperceptibly faster
I believe the cleanest way to strip a single character from a string with bash is:
echo ${COMPANY_NAME:: -1}
but I haven't been able to embed the grep piece within the curly braces, so your particular task becomes a two-liner:
COMPANY_NAME=$(grep "company_name" file.txt); COMPANY_NAME=${COMPANY_NAME:: -1}
This will strip any character, semicolon or not, but can get rid of the semicolon specifically, too.
To remove ALL semicolons, wherever they may fall:
echo ${COMPANY_NAME//;/}
To remove only a semicolon at the end:
echo ${COMPANY_NAME%;}
Or, to remove multiple semicolons from the end (this needs shopt -s extglob):
echo ${COMPANY_NAME%%+(;)}
For great detail and more on this approach, The Linux Documentation Project covers a lot of ground at http://tldp.org/LDP/abs/html/string-manipulation.html
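A quick way to check what each expansion form does is to run them side by side on a value with semicolons in the middle and at the end. The value and the variable names below are invented for this sketch, and it assumes bash, since // and extglob are not available in plain sh:

```shell
shopt -s extglob    # enables the +(...) pattern used below

name='"Abc;Inc";;'    # invented value with semicolons in the middle and at the end

first_removed=${name/;/}       # first semicolon only
all_removed=${name//;/}        # every semicolon
one_trailing=${name%;}         # one trailing semicolon
all_trailing=${name%%+(;)}     # the whole run of trailing semicolons

printf '%s\n' "$first_removed" "$all_removed" "$one_trailing" "$all_trailing"
```

This prints "AbcInc";; then "AbcInc" then "Abc;Inc"; and finally "Abc;Inc", one per line.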
Using sed, if you don't know what the last character actually is:
$ grep company_name file.txt | cut -d '=' -f2 | sed 's/.$//'
"Abc Inc"
Don't abuse cats. Did you know that grep can read files, too?
The canonical approach would be this:
grep "company_name" file.txt | cut -d '=' -f 2 | sed -e 's/;$//'
The smarter approach would use a single perl or awk statement, which can do the filtering and the transformations at once. For example, something like this:
COMPANY_NAME=$( perl -ne '/company_name=(.*);/ && print $1' file.txt )
You don't have to chain so many tools. Just one awk command does the job:
COMPANY_NAME=$(awk -F"=" '/company_name/{gsub(/;$/,"",$2) ;print $2}' file.txt)
In Bash using only one external utility:
IFS='= ' read -r discard COMPANY_NAME <<< $(grep "company_name" file.txt)
COMPANY_NAME=${COMPANY_NAME/%?}
Assuming the quotation marks are actually part of the output, couldn't you just use the -o switch to return everything between the quote marks?
COMPANY_NAME="\"ABC Inc\";"; echo "$COMPANY_NAME" | grep -o "\"*.*\""
You can strip the beginning and end of a string by N characters using this bash construct, as someone said already:
$ fred=abcdefg.rpm
$ echo ${fred:1:-4}
bcdefg
HOWEVER, this is not supported in older versions of bash, as I discovered just now while writing a script for a Red Hat EL6 install process. This is the sole reason for posting here.
A hacky way to achieve this is to use sed with extended regex like this:
$ fred=abcdefg.rpm
$ echo $fred | sed -re 's/^.(.*)....$/\1/g'
bcdefg
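Another option that should also work on old shells, offered here as a sketch rather than as part of the answer above, is to trim the two ends in separate steps with the # and % expansions. The tmp and trimmed variable names are made up:

```shell
fred=abcdefg.rpm

tmp=${fred#?}         # drop the first character
trimmed=${tmp%????}   # drop the last four characters

echo "$trimmed"       # prints bcdefg
```

Each ? in the pattern stands for exactly one character, so the number of question marks sets how many characters are removed.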
Some refinements to answer above. To remove more than one char you add multiple question marks. For example, to remove last two chars from variable $SRC_IP_MSG, you can use:
SRC_IP_MSG=${SRC_IP_MSG%??}
cat file.txt | grep "company_name" | cut -d '=' -f 2 | cut -d ';' -f 1
I am not finding that sed 's/;$//' works. It doesn't trim anything, though I'm wondering whether it's because the character I'm trying to trim off happens to be a "$". What does work for me is sed 's/.\{1\}$//'.

Resources