How to get unique results with grep? - bash

The scenario below is part of the logic that I want to implement in a Jenkins job. I am trying to write a shell script.
I am using the grep command to search recursively for a particular string. A sample of what grep returns looks like this:
./src/test/java/com/ABC/st/test/pricing/Test1.java: @Tags({ "B-05256" })
./src/test/java/com/ABC/st/test/pricing/Test1.java: @MapToVO(storyID = "B-05256: prices in ST")
./src/test/java/com/ABC/st/test/pricing/Test1.java: @Tags({ "B-05256" })
./src/test/java/com/ABC/st/test/pricing/Test2.java: @Tags({ "B-05256" })
./src/test/java/com/ABC/st/test/pricing/Test2.java: @MapToVO(storyID = "B-05256:Lowest Price of the Season")
./src/test/java/com/ABC/st/test/pricing/Test2.java: @Tags({ "B-05256" })
I want to extract unique file paths such as:
/src/test/java/com/ABC/st/test/pricing/Test1.java
/src/test/java/com/ABC/st/test/pricing/Test2.java
and then use each unique path in a Maven command. So:
How can I extract the unique file paths from the result set returned by grep?
How do I write a loop that runs the mvn command once per unique file path?

If you need only the name of the matching files, grep has a command line switch for this:
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output
would normally have been printed. The scanning will stop on the first match. (-l is
specified by POSIX.)
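A minimal sketch of how -l could be combined with a loop, assuming a recursive search as in the question. The helper name list_matching_files, the demo directory and the echo placeholder standing in for the real mvn command are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical helper: print each file under directory $2 that contains
# the fixed string $1. With -l every file is listed at most once.
list_matching_files() {
    grep -rlF -- "$1" "$2"
}

# Demo data standing in for the project tree (made up for this sketch)
demo_src=$(mktemp -d)
printf '%s\n' '@Tags({ "B-05256" })' '@Tags({ "B-05256" })' > "$demo_src/Test1.java"
printf '%s\n' '@Tags({ "B-05256" })' > "$demo_src/Test2.java"
printf '%s\n' '@Tags({ "B-99999" })' > "$demo_src/Test3.java"

list_matching_files "B-05256" "$demo_src" | while IFS= read -r path
do
    # Placeholder: replace echo with the actual mvn invocation
    echo "would run mvn for: $path"
done
```

Because grep already prints each file name only once, no sort -u step is needed in this variant.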

Pipe your text into
sed 's/:.*//' | sort -u | while IFS= read -r path
do
    echo now execute your command using "$path"
done
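To see what the sed and sort -u stages do on their own, here is a small sketch that feeds them an abridged copy of the grep output from the question. The variable names are illustrative. Note that the substitution cuts at the first colon, so it assumes the file paths themselves contain no colons.

```shell
# Sample grep output, abridged from the question
grep_output='./src/test/java/com/ABC/st/test/pricing/Test1.java: @Tags({ "B-05256" })
./src/test/java/com/ABC/st/test/pricing/Test1.java: @MapToVO(storyID = "B-05256: prices in ST")
./src/test/java/com/ABC/st/test/pricing/Test2.java: @Tags({ "B-05256" })'

# Drop everything from the first colon onward, then remove duplicate lines
unique_paths=$(printf '%s\n' "$grep_output" | sed 's/:.*//' | sort -u)

printf '%s\n' "$unique_paths"
```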

This is what the -l flag to grep is for.
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match. (-l is specified by POSIX.)

Related

Is there a better way to run a repeat command in terminal?

I need to run the same command on several files to get a header from each one.
However, I currently have to run it on each file separately.
dfits *.fit | grep MSBTITLE
Is there a command I can run on several files that shows both the file name and the header I need?
grep does not know the filename, so you see only the matching lines, but not which file they come from originally. I would in your case write an explicit loop:
for file in *.fit
do
    if titleline=$(dfits "$file" | grep MSBTITLE)
    then
        echo "$file : $titleline"
    fi
done
Since dfits already obscures the file name in its output, we store the output from grep into a variable, and if there is a match, output this line together with the file name.
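As a possible variation on the loop above, GNU grep can add the file name itself when it is told which name to use for standard input. The sketch below assumes GNU grep (for --label). Because dfits may not be installed, print_header is a hypothetical stand-in for it, and the demo files are made up.

```shell
# Hypothetical stand-in for `dfits FILE`; here it simply prints the file.
print_header() {
    cat -- "$1"
}

# Demo files (made up for this sketch)
demo_fits=$(mktemp -d)
printf 'MSBTITLE= first target\nOBSERVER= x\n' > "$demo_fits/a.fit"
printf 'OBSERVER= y\n' > "$demo_fits/b.fit"

labelled=$(
    for file in "$demo_fits"/*.fit
    do
        # -H forces a file-name prefix; --label sets the name shown for
        # standard input. "|| true" ignores files without a match.
        print_header "$file" | grep -H --label="$file" MSBTITLE || true
    done
)
printf '%s\n' "$labelled"
```

Only a.fit contains MSBTITLE, so only that file produces a line, prefixed with its path and a colon.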

Sort files in directory then execute command on each one of them

I have a directory containing files numbered like this
1>chr1:2111-1111_mask.txt
1>chr1:2111-1111_mask2.txt
1>chr1:2111-1111_mask3.txt
2>chr2:345-678_mask.txt
2>chr2:345-678_mask2.txt
2>chr2:345-678_mask3.txt
100>chr19:444-555_mask.txt
100>chr19:444-555_mask2.txt
100>chr19:444-555_mask3.txt
each file contains a name like >chr1:2111-1111 in the first line and a series of characters in the second line.
I need to sort the files in this directory numerically, using the number before the > as a guide, and then execute a command for each of the _mask3 files.
I have this code
ls ./"$INPUT"_temp/*_mask3.txt | sort -n | for f in ./"$INPUT"_temp/*_mask3.txt
do
read FILE
Do something with each file and list the results in output file including the name of the string
done
It works, but when I check the list of the strings inside the output file they are like this
>chr19:444-555
>chr1:2111-1111
>chr2:345-678
why?
I'm not sure what "works" means here, given what your question describes.
It seems like you have two problems:
1. Your files are not in sorted order
2. The file names have the leading digits removed
Addressing 1, your command ls ./"$INPUT"_temp/*_mask3.txt | sort -n | for f in ./"$INPUT"_temp/*_mask3.txt here doesn't make a whole lot of sense. You are getting a list of files from ls, and then piping that to sort. That probably gives you the output you are looking for, but then you pipe that to for, which doesn't make any sense.
In fact you can rewrite your entire script to
for f in ./"$INPUT"_temp/*_mask3.txt
do
read FILE
Do something with each file and list the results in output file including the name of the string
done
And you'll have the exact same output. To get this sorted you could do something like:
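for f in $(cd ./"$INPUT"_temp && ls *_mask3.txt | sort -n)
do
    # sort -n needs the number at the start of each line, so the names are listed from inside the directory
    FILE=./"$INPUT"_temp/"$f"
    # Do something with "$FILE" and list the results in the output file, including the name of the string
done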
As for the unexpected truncation, the > character in your file names is significant to bash, since it redirects the stdout of the preceding command to the named file. Make sure you put quotes around the variable $f wherever you use it in the loop, so that bash cannot misinterpret the file name as a command > file redirection.
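One detail worth checking when sorting these names: sort -n compares a number at the very start of each line. When every line begins with a directory prefix such as ./dir/, no line starts with a digit, so all keys count as equal and sort falls back to plain byte order. The sketch below reproduces this with a throwaway directory; demo_dir and the variable names are assumptions for illustration, standing in for ./"$INPUT"_temp.

```shell
# Throwaway directory with three of the file names from the question
demo_dir=$(mktemp -d)
touch "$demo_dir/1>chr1:2111-1111_mask3.txt" \
      "$demo_dir/2>chr2:345-678_mask3.txt" \
      "$demo_dir/100>chr19:444-555_mask3.txt"

# Lines start with the directory path, so -n has no leading number to use
# and the order is decided byte by byte (LC_ALL=C makes that explicit).
with_prefix=$(printf '%s\n' "$demo_dir"/*_mask3.txt | LC_ALL=C sort -n | sed 's|.*/||')

# Lines start with the number itself, so -n sorts 1, 2, 100.
bare_names=$(cd "$demo_dir" && printf '%s\n' *_mask3.txt | sort -n)

printf 'with prefix:\n%s\n\nbare names:\n%s\n' "$with_prefix" "$bare_names"
```

With the prefix, the 100> file sorts first because the character 0 precedes > in byte order, which matches the chr19, chr1, chr2 order reported in the question.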

How to fetch the file names present in the text file and delete that files using shell

I want to delete some files mentioned in a text file. The text is on a single line, like the one below, along with some other data:
Cannot Handle File:C:\patches\BUG2\abc.javaCannot Handle File:C:\patches\BUG2\xyz.javaErrors .
So now I want to fetch the file names, like abc.java and xyz.java, from the text file and delete those files. How can I do this using shell? Please help me resolve this.
Perl to the rescue:
perl -lne 'unlink $1 while /File:(.*?)(?:Cannot|Errors)/g' input.txt
-l adds a newline to prints
-n processes the input line by line
(.*?) matches "frugally", i.e. finds the shortest possible match
/g matches globally, i.e. as many times as it can.
unlink removes a file.
So, the file name must be preceded by File: and followed by Cannot or Errors.
Using grep -o and xargs:
grep -Eo '[[:alnum:]_$-]+\.java' file | xargs rm
The grep part on its own produces this output:
grep -Eo '[[:alnum:]_$-]+\.java' file
abc.java
xyz.java
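Before deleting anything, it may help to check what would be passed to rm. This sketch runs the same grep pattern on the sample line from the question and then does a dry run, where echo prints the rm command that xargs would build instead of executing it. The variable names are illustrative.

```shell
# The sample line from the question (single quotes keep the backslashes literal)
log_line='Cannot Handle File:C:\patches\BUG2\abc.javaCannot Handle File:C:\patches\BUG2\xyz.javaErrors .'

# -o prints each match on its own line
names=$(printf '%s\n' "$log_line" | grep -Eo '[[:alnum:]_$-]+\.java')
printf '%s\n' "$names"

# Dry run: show the command instead of deleting files
printf '%s\n' "$names" | xargs echo rm --
```

Note that the pattern extracts bare file names, so rm would look for them in the current directory.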

Display just one file per match using grep (in a shellscript)

How do I limit grep's output to just one line per file?
(Since this is part of a shellscript function, I can use everything, but I'm too nooby to figure out how to pipe the specific parts the right way.)
The function I'm trying to write is basically "Given a string, display every file (in this directory and all subdirectories), which contains it and display a list of those files as clickable links"
(By the way, could you point me to scripts or commands that do something like this?)
If you are interested: The functions in .bashrc are these:
(And should be used like: "where foobar")
function where(){
    grep -rHoiIm1 "$@" | cut -d":" -f1-1 | asURL
}
function asURL() {
    PREFIX="file://$(pwd)/";
    sed "s*^*$PREFIX*" |
    sed 's/ /%20/g';
}
If you're only interested in the paths of matching files, use the -l / --files-with-matches option:
function where(){
    grep -riIl "$@" | asURL
}
Note that I've omitted several options that don't apply anymore once you use -l.
As an aside: while your asURL() function will work in simple cases, it's not fully robust and can result in invalid URLs. Aside from that, there's no reason for two invocations of sed; simply string the two s calls together in a single script, separated with ;.
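A sketch of what the single-sed version might look like. It uses two -e expressions, which is equivalent to separating the commands with ;. The | delimiter is a choice made here so that the slashes in the path need no escaping, on the assumption that the current directory name contains neither | nor &. As noted above, only spaces are encoded, so this is still not a complete URL encoder.

```shell
# One sed process; the two substitutions run in order on every input line.
asURL() {
    sed -e "s|^|file://$(pwd)/|" -e 's/ /%20/g'
}

# Example: a relative path with a space in it
printf '%s\n' 'docs/my notes.txt' | asURL
```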
Add the -l option to grep to tell it to output file names only.
From the grep man page:
-l
--files-with-matches
Suppress normal output; instead print the name of each input file from
which output would normally have been printed. The scanning of each file
stops on the first match. (-l is specified by POSIX.)

Trying to extract a number from a plist with grep

I'll try and make this question short. Basically, I am working on a shell script, and I have a .plist file containing an integer value that I am trying to "extract" and put into a variable in my shell script.
I'm able to refine the contents of the .plist file to a few lines, but I am still getting a bunch of characters I don't need.
I am declaring and running the following command in my shell script, and it gives me the following result.
file_refine=`grep -C 2 CFBundleVersion $file | grep '[0-9]\{3\}'`
Output
<string>645</string>
I just need the numeral digits not the string tags, but I can't seem to figure that out.
Try this
file_refine=$(grep -C 2 CFBundleVersion "$file" | grep -o '[0-9]\{3\}')
The -o option, from the grep man page:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with
each such part on a separate output line.
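The difference -o makes can be seen on a two-line snippet. The snippet is a hypothetical fragment shaped like the output shown in the question, not the asker's actual file.

```shell
# Hypothetical plist fragment for illustration
plist_snippet='<key>CFBundleVersion</key>
<string>645</string>'

# Without -o, grep prints the whole matching line
whole_line=$(printf '%s\n' "$plist_snippet" | grep '[0-9]\{3\}')

# With -o, grep prints only the part that matched the pattern
digits_only=$(printf '%s\n' "$plist_snippet" | grep -o '[0-9]\{3\}')

echo "$whole_line"    # <string>645</string>
echo "$digits_only"   # 645
```

The pattern matches exactly three digits at a time, so a version number of a different length would need a pattern such as '[0-9][0-9]*' instead.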

Resources