grep in for loop not functioning as expected in git-bash (shell)

I have a collection of stored procedures (SPs) being called in some C# code. I simply want to find which lines in which C# files are using these SPs.
I have installed git-bash, and am working in a Win10 environment.
No matter what I try, grep either outputs nothing, or dumps the entire contents of every file that has a match. I simply want the filename
and the line number where the SP regex matches.
In a terminal, here is what I have done:
procs=( $(cat procs.txt) ) #load the procs into an array
echo ${#procs[@]} #echo the size to make sure each proc got read in separately
output: 235
files=( $(find . -type f -iregex '.*\.cs') ) #load the file paths into an array,
#this similarly returns a filled out array
output: #over 1000
I have also tried this variant which removes the initial './' in the path, thinking that the relative pathing was causing an issue
files=( $(find . -type f -iregex '.*\.cs' | sed 's/..//') )
The rest is a simple nested for loop:
for i in ${procs[@]}
do
for j in ${files[@]}
do
grep -nie "$i" "$j"
done
done
I have tried many other variants of this basic idea, like redirecting the grep output to a text file, adding and subtracting flags,
quoting and unquoting the variables, and the like.
I also tried this approach, but was similarly unsuccessful
for i in ${procs[@]}
do
grep -r --include='*.cs' -F $i
#and I also tried
grep -F $i *
done
At this point I am thinking there is something I don't understand about how git-bash works in a Windows environment, because it seems like it should have worked by now.
Thanks for your help.
EDIT:
So after hours of heartache I finally got it to work with this:
for i in "${!procs[#]}"
do
for j in "${!files[#]}"
do
egrep -nH $(echo "${procs[$i]}") $(echo "${files[$j]}")
done
done
I looked it up, and my git-bash version is GNU bash 4.4.12(1) x86_64-pc-msys.
I'm still not sure why git-bash needs such weird quoting and echoing just to get everything to run properly. On Debian Linux it worked with just a simple
for i in ${procs[@]}
do
for j in ${files[@]}
do
grep $i $j
done
done
Running this version of bash: GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
If anyone can tell me why git-bash behaves so oddly, I would still love to know the answer.
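One possible culprit, offered only as a guess: if procs.txt has Windows (CRLF) line endings, word splitting leaves a trailing carriage return on every array element (bash splits on spaces, tabs and newlines, not on \r), so each grep pattern ends in an invisible character that rarely matches anything. Whatever the cause, the nested loop can be avoided entirely by letting grep do the iteration itself; a minimal sketch, where procs_unix.txt is just a made-up scratch file name and -F assumes the procedure names are literal strings rather than regexes:
#strip any carriage returns left by Windows editors
tr -d '\r' < procs.txt > procs_unix.txt
#-r recurse, -n line numbers, -F fixed strings, -f read the patterns from a file
grep -rnF --include='*.cs' -f procs_unix.txt .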

Related

How to add leading zeros to sequential file names

I have images files that when they are created have these kind of file names:
Name of file-1.jpg
Name of file-2.jpg
Name of file-3.jpg
Name of file-4.jpg
..etc
This causes problems for sorting between Windows and Cygwin Bash. When I process these files in Cygwin Bash, they get processed out of order because of the differences between how the Windows file system and Cygwin Bash sort them. However, if the files are manually renamed and numbered with leading zeroes, the issue goes away. How can I use Bash to rename these files automatically so I don't have to process them manually? I'd like to add a few lines of code to my Bash script to rename them and add the leading zeroes before they are processed by the rest of the script.
Since I use this Bash script interchangeably between Windows Cygwin and Mac, I would like something that works in both environments, if possible. Also all files will have names with spaces.
You could use something like this:
files="*.jpg"
regex="(.*-)(.*)(\.jpg)"
for f in $files
do
if [[ "$f" =~ $regex ]]
then
number=`printf %03d ${BASH_REMATCH[2]}`
name="${BASH_REMATCH[1]}${number}${BASH_REMATCH[3]}"
mv "$f" "${name}"
fi
done
Put that in a script, like rename.sh, and run it in the folder where you want to convert the files. Modify as necessary...
Shamelessly ripped from here:
Capturing Groups From a Grep RegEx
and here:
How to Add Leading Zeros to Sequential File Names
#!/bin/bash
#cygcheck (cygwin) 2.3.1
#GNU bash, version 4.3.42(4)-release (i686-pc-cygwin)
namemodify()
{
bname="${1##*/}"
dname="${1%/*}"
mv "$1" "${dname}/00${bname}" # Add any number of leading zeroes.
}
export -f namemodify
find . -type f -iname "*jpg" -exec bash -c 'namemodify "$1"' _ {} \;
I hope this won't break on Mac too :) good luck

Bypass ls argument limit

So, I need to list a bunch of files in reverse order from a certain directory. The only problem is that there are a lot of files in the directory (since I'm extracting video frames into it in order to reverse the entire video), and when I run ls I get an error that says /bin/ls: argument list too long. I was wondering how to get around this error?
Operating System: Ubuntu 14.04
If ls won't do, find -type f is usually your friend (and it can also use options like -print0 to avoid problems with exotic filenames).
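A sketch of that idea applied to this question's reverse-sorted frame list; it assumes GNU sort, which accepts NUL-separated input via -z, and borrows the '*.jpg' pattern the next answer assumes:
#NUL separators survive any filename; -maxdepth 1 keeps find in the current directory
find . -maxdepth 1 -type f -name '*.jpg' -print0 | sort -zr | xargs -0 printf '%s\n'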
I assume that you are using something like
ls -1 -r *.jpg
to produce the reverse-sorted list of images. Since Bash sorts filename expansions (aka globs) itself, you can get the same effect by just reversing the expansion of *.jpg. This is one way to do it:
printf '%s\n' *.jpg | tac
If you haven't got tac, you can do it all in pure Bash:
images=( *.jpg )
for (( i=${#images[*]}-1 ; i>=0 ; i-- )) ; do
printf '%s\n' "${images[i]}"
done

Difference between using ls and find to loop over files in a bash script

I'm not sure I understand exactly why:
for f in `find . -name "strain_flame_00*.dat"`; do
echo $f
mybase=`basename $f .dat`
echo $mybase
done
works and:
for f in `ls strain_flame_00*.dat`; do
echo $f
mybase=`basename $f .dat`
echo $mybase
done
does not, i.e. the filename does not get stripped of the suffix. I think it's because what comes out of ls is formatted differently but I'm not sure. I even tried to put eval in front of ls...
The correct way to iterate over filenames here would be
for f in strain_flame_00*.dat; do
echo "$f"
mybase=$(basename "$f" .dat)
echo "$mybase"
done
Using for with a glob pattern, and then quoting all references to the filename is the safest way to use filenames that may have whitespace.
First of all, never parse the output of the ls command.
If you MUST use ls and you DON'T know what ls aliases might be lurking out there, then do this:
(
COLUMNS=
LANG=
NLSPATH=
GLOBIGNORE=
LS_COLORS=
TZ=
unalias ls 2>/dev/null
unset -f ls
for f in `ls -1 strain_flame_00*.dat`; do
echo $f
mybase=`basename $f .dat`
echo $mybase
done
)
It is surrounded by parentheses to protect the existing environment, aliases and shell variables.
Various environment variables were NUKED (as ls does look them up).
One unalias command (self-explanatory).
One unset command (again, protection against an unscrupulous, over-lording 'ls' function).
Now you can see why NOT to use 'ls'.
Another difference that hasn't been mentioned yet is that find searches recursively by default, whereas ls does not (although both can be switched between recursive and non-recursive behaviour through options, and find can be told to recurse only up to a specified depth).
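For illustration, a small sketch (not from the original answer) of how find's depth can be limited; the '*.dat' filter is just an example:
find . -maxdepth 1 -type f #roughly equivalent to a non-recursive listing of regular files
find . -maxdepth 2 -name '*.dat' #descend into immediate subdirectories but no deeper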
And, as others have mentioned, if it can be achieved by globbing, you should avoid using either.

Properly handle lists of files with whitespace in filename

I want to iterate over a list of files in Bash and perform some action. The problem: the file names may contain whitespace, which creates an obvious problem with wildcards or ls:
touch a\ b
FILES=* # or $(ls)
for FILE in $FILES; do echo $FILE; done
yields
a
b
Now, the conventional way to handle this is to use find … -print0 instead. However, this only works (well) in conjunction with xargs -0, not with Bash variables / loops.
My idea was to set $IFS to the null character to make this work. However, the comp.unix.shell newsgroup seems to think that this is impossible in Bash.
Bummer. Well, it’s theoretically possible to use another character, such as : (after all, $PATH uses this format, too):
IFS=$':'
FILES=$(find . -print0 | xargs -0 printf "%s:")
for FILE in $FILES; do echo $FILE; done
(The output is slightly different but fair enough.)
However, I can’t help but feel that this is clumsy and that there should be a more direct way of doing it, preferably using wildcards or ls.
The best way to handle this is to store the file list as an array, rather than a string (and be sure to double-quote all variable substitutions):
files=(*)
for file in "${files[#]}"; do
echo "$file"
done
If you want to generate an array from find's output (e.g. if you need to search recursively), see this previous answer.
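That previous answer isn't reproduced here, but the common pattern for it looks roughly like this (a sketch; the '*.jpg' filter is only an example):
files=()
while IFS= read -r -d '' f; do
files+=("$f")
done < <(find . -type f -name '*.jpg' -print0)
#on bash 4.4 or newer, the same array can be built in one line:
#mapfile -d '' files < <(find . -type f -name '*.jpg' -print0)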
Exactly what you have in the first example works fine for me in Msys Bash, Cygwin and on my Fedora box:
FILES=*
for FILE in $FILES
do
echo $FILE
done
It's very important to precede this with
IFS=""
otherwise files containing two consecutive spaces will not be found.

Handle special characters in bash for...in loop

Suppose I've got a list of files
file1
"file 1"
file2
A for...in loop breaks it up on whitespace, not newlines:
for x in $( ls ); do
echo $x
done
results:
file
1
file1
file2
I want to execute a command on each file. "file" and "1" above are not actual files. How can I do that if the filenames contain things like spaces or commas?
It's a little trickier than I think find -print0 | xargs -0 can handle, because I actually want the command to be something like "convert input/file1.jpg .... output/file1.jpg", so I need to permute the filename in the process.
Actually, Mark's suggestion works fine without even doing anything to the internal field separator. The problem is that running ls in a subshell, whether via backticks or $( ), makes the for loop unable to tell spaces within names apart from separators between names. Simply using
for f in *
instead of ls solves the problem.
#!/bin/bash
for f in *
do
echo "$f"
done
UPDATE BY OP: this answer sucks and shouldn't be on top ... @Jordan's post below should be the accepted answer.
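To cover the convert example from the question: the same glob loop lets you permute the filename with parameter expansion. A sketch only, assuming input/ and output/ directories as in the question's example and ImageMagick's convert on the PATH:
for f in input/*.jpg
do
#strip the input/ prefix so the output file keeps the same base name
convert "$f" "output/${f#input/}"
done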
One possible way:
ls -1 | while read x; do
echo $x
done
I know this one is LONG past "answered", and with all due respect to eduffy, I came up with a better way and I thought I'd share it.
What's "wrong" with eduffy's answer isn't that it's wrong, but that it imposes what for me is a painful limitation: there's an implied creation of a subshell when the output of the ls is piped and this means that variables set inside the loop are lost after the loop exits. Thus, if you want to write some more sophisticated code, you have a pain in the buttocks to deal with.
My solution was to take the "readline" function and write a standalone program out of it, in which you can specify any particular line number that you want from the output of a given command. ... As a simple example, starting with eduffy's:
ls_output=$(ls -1)
# The cut at the end of the following line keeps just the number from wc's output
declare -i line_count=$(echo "$ls_output" | wc -l | cut -d ' ' -f 1)
declare -i cur_line=1
while [ $cur_line -le $line_count ] ;
do
# NONE of the values assigned to variables inside this loop get trapped in a subshell.
filename=$(echo "$ls_output" | readline -n $cur_line)
# Now filename contains one filename from the preceding ls command
cur_line=cur_line+1
done
Now you have wrapped up all the subshell activity into neat little contained packages and can go about your shell coding without having to worry about the scope of your variable values getting trapped in subshells.
I wrote my version of readline in GNU C; if anyone wants a copy, it's a little too big to post here, but maybe we can find a way...
Hope this helps,
RT
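For completeness, plain Bash can sidestep the same subshell problem without a helper program by feeding the loop through process substitution instead of a pipe; a sketch only, not part of RT's answer:
count=0
while IFS= read -r f; do
echo "$f"
count=$((count + 1))
done < <(find . -maxdepth 1 -type f)
#count survives because the loop ran in the current shell, not in a pipe's subshell
echo "processed $count files"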
