list all imports recursively from a file - bash

I am reorganising a React codebase and would like to write a bash script that, given the path to a component, lists all the files imported from that file recursively.
The output should be filtered to unique occurrences. I am only interested in our source files (aliased to #browse), not node modules etc.
So far I have:
#!/bin/bash
list_browse_imports() {
    rg "from '#browse" $1 | sed "s,.*#browse,," | sed "s/'//"
}
list_browse_imports $1
which yields:
/search/template/template
/product/button/button
I then want to feed these lines onward to rg again (feel free to substitute grep) to match all the files in the tree, like so:
#!/bin/bash
list_browse_imports() {
    rg "from '#browse" $1 | sed "s,.*#browse,," | sed "s/'//" | xargs rg -l
}
list_browse_imports $1
but this yields nothing.
I have tried to extract the second part into its own function but get stuck there as well.
EDIT:
Made some progress with this very ugly (and slow) code:
#!/bin/bash
list_imports_rec() {
    declare IMPORTS=$(rg "from '#browse" $1 | sed "s,.*#browse,," | sed "s/'//")
    declare FILES=$(rg -l '')
    #echo "RESULT IS $IMPORTS RESULT IS"
    for import in $(rg "from '#browse" $1 | sed "s,.*#browse,," | sed "s/'//")
    do
        for file in $(rg -l '')
        do
            declare import_stripped=$(echo $import | sed "s,/,,g")
            declare file_stripped=$(echo $file | sed "s,/,,g")
            #echo "$import_stripped and $file_stripped"
            if [[ "$file_stripped" =~ .*"$import_stripped".* ]]; then
                echo "$file"
                list_imports_rec $file
            fi
        done
    done
}
list_imports_rec $1
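For reference, here is a rough sketch of the same recursion without the nested rg loop: extract the #browse paths, resolve each one to a file with find, recurse into that file, and deduplicate everything once at the end with sort -u (which also covers the "unique occurrences" requirement). The src/browse root and the assumption that imports simply omit the file extension are guesses that would need to match the real alias config, and there is no protection against import cycles.
#!/bin/bash
# Sketch only: assumes '#browse' resolves to ./src/browse and that imports
# like '#browse/search/template/template' omit the file extension.
list_browse_imports_rec() {
    local file=$1
    # print just the path part after '#browse' for every import in this file
    rg -o "from '#browse([^']*)'" -r '$1' "$file" | while read -r import; do
        # resolve the import to the source file whose path ends with it plus an extension
        match=$(find src/browse -type f -path "*${import}.*" | head -n 1)
        if [[ -n "$match" ]]; then
            echo "$match"
            list_browse_imports_rec "$match"
        fi
    done
}
list_browse_imports_rec "$1" | sort -u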

Related

Sed replace substring only if expression exists

In a bash script, I am trying to remove the directory name from filenames:
documents/file.txt
direc/file5.txt
file2.txt
file3.txt
So I first try to see if there is a "/" and, if so, delete everything before it:
for i in **/*.scss *.scss; do
    echo "$i" | sed -n '^/.*\// s/^.*\///p'
done
But it doesn't work for files in the current directory; it gives me a blank string.
I get:
file.txt
file5.txt
When you only want the filename, use basename instead of sed.
# basename /path/to/file
returns file
See man basename for details.
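Applied to the loop from the question, that might look like this (a sketch; the optional second argument to basename also strips the suffix):
for i in **/*.scss *.scss; do
    basename "$i" .scss
done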
Your sed attempt is basically fine, but you should print regardless of whether you performed a substitution; take out the -n and the p at the end. (Also there was an unrelated syntax error.)
Also, don't needlessly loop over all files.
printf '%s\n' **/*.scss *.scss |
sed 's%^.*/%%'
This can also be done with the awk utility.
Example:
echo "1/2/i.py" | awk 'BEGIN {FS="/"} {print $NF}'
output: i.py
Eventually, I did:
for i in **/*.scss *.scss; do
    # for i in *.scss; do
    # for i in _hm-globals.scss; do
    name=${i##*/}      # remove dir name
    name=${name%.scss} # remove extension
    name=`echo "$name" | sed -n "s/^_hm-//p"` # remove _hm-
    if [[ $name = *"."* ]]; then
        name=`echo "$name" | sed -n 's/\./-/p'` # replace . to --
    fi
    echo "$name" >&2
done

Bash: Filter directory when piping from `ls` to `tee`

(background info)
Writing my first bash pseudo-program. The program downloads a bunch of files from the network, stores them in a sub-directory called ./network-files/, then removes all the files it downloaded. It also logs the result to several log files in ./logs/.
I want to log the filenames of each file deleted.
Currently, I'm doing this:
echo -e "$(date -u) >>> Removing files: $(ls -1 "$base_directory"/network-files/* | tr '\n' ' ')" | tee -a $network_files_log $verbose_log $network_log
($base_directory is a variable defining the base directory for the app, $network_files_log etc are variables defining the location of various log files)
This produces some pretty grody and unreadable output:
Tue Jun 21 04:55:46 UTC 2016 >>> Removing files: /home/vagrant/load-simulator/network-files/207822218.png /home/vagrant/load-simulator/network-files/217311040.png /home/vagrant/load-simulator/network-files/442119100.png /home/vagrant/load-simulator/network-files/464324101.png /home/vagrant/load-simulator/network-files/525787337.png /home/vagrant/load-simulator/network-files/581100197.png /home/vagrant/load-simulator/network-files/640387393.png /home/vagrant/load-simulator/network-files/650797708.png /home/vagrant/load-simulator/network-files/827538696.png /home/vagrant/load-simulator/network-files/833069509.png /home/vagrant/load-simulator/network-files/8580204.png /home/vagrant/load-simulator/network-files/858174053.png /home/vagrant/load-simulator/network-files/998266826.png
Any good way to strip out the /home/vagrant/load-simulator/network-files/ part from each of those file paths? I suspect there's something I should be doing with sed or grep, but haven't had any luck so far.
You might also consider using find. It's perfect for walking directories, removing files and using a customized printf for output:
find $PWD/x -type f -printf "%f\n" -delete >>$YourLogFile.log
Don't use ls at all; use a glob to populate an array with the desired files. You can then use parameter expansion to shorten each array element.
d=$base_directory/network-files
files=( "$d"/* )
printf '%s Removing files: %s' "$(date -u)" "${files[*]#$d/}" | tee ...
You could do it a couple of ways. To directly answer the question, you could use sed's substitution command, like:
echo -e "$(date -u) >>> Removing files: $(ls -1 "$base_directory"/network-files/* | tr '\n' ' ')" | sed -e "s,$base_directory/network-files/,," | tee -a $network_files_log $verbose_log $network_log
which adds sed -e "s,$base_directory/network-files/,," to the pipeline. It substitutes the string found in base_directory with the empty string, as long as there are no commas in base_directory. If there are, you could try a different separator for the parts of the sed command, like underscore: sed -e "s_$base_directory/network-files/__"
Instead though, you could just have the subshell cd to that directory and then the string wouldn't be there in the first place:
echo -e "$(date -u) >>> Removing files: $(cd "$base_directory/network-files/"; ls -1 | tr '\n' ' ')" | tee -a "$network_files_log" "$verbose_log" "$network_log"
Or you could avoid some potential pitfalls with echo and use printf like
{ printf '%s >>>Removing files: '; printf '%s ' "$(cd "$base_directory/network-files"; ls -1)"; printf '\n'; } | tee -a ...
testdata="/home/vagrant/load-simulator/network-files/207822218.png /home/vagrant/load-simulator/network-files/217311040.png"
echo -e $testdata | sed -e 's/\/[^ ]*\///g'
Pipe your output to sed to replace the matched part with nothing.
The regex: \/[^ ]*\/
It starts at a /, matches everything that is not a space, and (being greedy) ends at the last / before the next space.
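With the test data above, that should leave just the basenames:
output: 207822218.png 217311040.png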

bash: sed: unexpected behavior: displays everything

I wrote what I thought was a quick script I could run on a bunch of machines. Instead it prints what looks like it might be directory contents from a recursive search:
version=$(mysql Varnish -B --skip-column-names -e "SELECT value FROM sys_param WHERE param='PatchLevel'" | sed -n 's/^.*\([0-9]\.[0-9]*\).*$/\1/p')
if [[ $(echo "if($version == 6.10) { print 1; } else { print 0; }" | bc) -eq 1 ]]; then
    status=$(dpkg-query -l | awk '{print $2}' | grep 'sg-status-polling');
    cons=$(dpkg-query -l | awk '{print $2}' | grep 'sg-consolidated-poller');
    if [[ "$status" != "" && "$cons" != "" ]]; then
        echo "about to change /var/www/Varnish/lib/Extra/SG/ObjectPoller2.pm"; echo;
        cp /var/www/Varnish/lib/Extra/SG/ObjectPoller2.pm /var/www/Varnish/lib/Extra/SG/ObjectPoller2.pm.bkup;
        sed -ir '184s!\x91\x93!\x91\x27--timeout=35\x27\x93!' /var/www/Varnish/lib/Extra/SG/ObjectPoller2.pm;
        sed -n 183,185p /var/www/Varnish/lib/Extra/SG/ObjectPoller2.pm; echo;
    else
        echo "packages not found. Assumed to be not applicable";
    fi
else
    echo "This is 4.$version, skipping";
fi
The script is supposed to make sure Varnish is version 4.6.10 and has 2 custom .deb packages installed (not through apt-get). It then makes a backup and edits a single line in a Perl module from [] to ['--timeout=35'].
It looks like it's tripping up on the sed replace one-liner.
There are two major problems (minor ones are addressed in the comments). The first is that you used the decimal codes for [] instead of the hex ones, so you should use \x5b\x5d instead of \x91\x93. The second problem is that if you do use the proper codes, sed will still interpret them syntactically as [], so you can't escape escaping that way. Here's what you should call:
sed -ri'.bkup' '184s!\[\]![\x27--timeout=35\x27]!' /var/www/Varnish/lib/Extra/SG/ObjectPoller2.pm
And this will create the backup for you (but you should double check).

Speed up bash filter function to run commands consecutively instead of per line

I have written the following filter as a function in my ~/.bash_profile:
hilite() {
    export REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g")
    while read line
    do
        echo $line | egrep "$1" | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
    done
    exit 0
}
to find lines of anything piped into it matching a regular expression, and highlight matches using ANSI escape codes on a VT100-compatible terminal.
For example, the following finds and highlights the strings bin, U or 1, where they appear as whole words, in the last 10 lines of /etc/passwd:
tail /etc/passwd | hilite "\b(bin|[U1])\b"
However, the script runs very slowly as each line forks an echo, egrep and sed.
In this case, it would be more efficient to do egrep on the entire input, and then run sed on its output.
How can I modify my function to do this? I would prefer to not create any temporary files if possible.
P.S. Is there another way to find and highlight lines in a similar way?
sed can do a bit of grepping itself: if you give it the -n flag (or #n instruction in a script) it won't echo any output unless asked. So
while read line
do
    echo $line | egrep "$1" | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
done
could be simplified to
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
EDIT:
Here's the whole function:
hilite() {
    REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g");
    sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
}
That's all there is to it - no while loop, reading, grepping, etc.
If your egrep supports --color, just put this in .bash_profile:
hilite() { command egrep --color=auto "$@"; }
(Personally, I would name the function egrep; hence the usage of command).
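That variant would be the one-liner below; the command builtin keeps the function from calling itself recursively:
egrep() { command egrep --color=auto "$@"; }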
I think you can replace the whole while loop with simply
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
because sed reads from stdin line by line, so you don't need read.
I'm not sure if running egrep and piping to sed is faster than using sed alone, but you can always compare using time.
Edit: added -n and p to sed to print only highlighted lines.
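For instance, with each version of hilite loaded in turn, you could compare roughly like this (on only 10 lines the difference will be tiny; a larger input makes it clearer):
time (tail /etc/passwd | hilite "\b(bin|[U1])\b" > /dev/null)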
Well, you could simply do this:
egrep "$1" $line | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
But I'm not sure that it'll be that much faster ; )
Just for the record, this is a method using a temporary file:
hilite() {
    export REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g")
    export FILE=$2
    if [ -z "$FILE" ]
    then
        export FILE=~/tmp
        echo -n > $FILE
        while read line
        do
            echo $line >> $FILE
        done
    fi
    egrep "$1" $FILE | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
    return $?
}
which also takes a file/pathname as the second argument, for cases like
cat /etc/passwd | hilite "\b(bin|[U1])\b"

extract characters from filename of newest file

I am writing a bash script where I will need to check a directory for existing files and look at the last 4 digits of the first segment of the file name to set the counter when adding new files to the directory.
Naming structure:
yymmddHNAZXLCOM0001.835
I need to put the 0001 portion of the example into a CTR variable so that the next file it puts into the directory will be
yymmddHNAZXLCOM0002.835
and so on.
What would be the easiest and shortest way to do this?
You can do this with sed:
filename="yymmddHNAZXLCOM0001.835"
first_part=$(echo $filename | sed -e 's/\(.*\)\([0-9]\{4,4\}\)\.\(.*\)/\1/')
counter=$(echo $filename | sed -e 's/\(.*\)\([0-9]\{4,4\}\)\.\(.*\)/\2/')
suffix=$(echo $filename | sed -e 's/\(.*\)\([0-9]\{4,4\}\)\.\(.*\)/\3/')
echo "$first_part$(printf "%04u" $(($counter + 1))).$suffix"
=> "yymmddHNAZXLCOM0002.835"
All three sed calls use the same regular expression. The only thing that changes is the group selected to return. There's probably a way to do all of that in one call, but my sed-fu is rusty.
Alternate version, using a Bash array:
filename="yymmddHNAZXLCOM0001.835"
ary=($(echo $filename | sed -e 's/\(.*\)\([0-9]\{4,4\}\)\.\(.*\)/\1 \2 \3/'))
echo "${ary[0]}$(printf "%04u" $((${ary[1]} + 1))).${ary[2]}"
=> "yymmddHNAZXLCOM0002.835"
Note: This version assumes that the filename does not have spaces in it.
Try this...
current=`echo yymmddHNAZXLCOM0001.835 | cut -d . -f 1 | rev | cut -c 1-4 | rev`
next=`echo $current | awk '{printf("%04i",$0+1)}'`
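To rebuild the full next filename from that, a follow-up along the same lines might be (sketch, hard-coding the same example name and extension):
prefix=`echo yymmddHNAZXLCOM0001.835 | cut -d . -f 1 | rev | cut -c 5- | rev`
echo "${prefix}${next}.835"
# yymmddHNAZXLCOM0002.835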
f() {
    if [[ $1 =~ (.*)([[:digit:]]{4})(\.[^.]*)$ ]]; then
        # capture groups: prefix, 4-digit counter, extension
        local -a ctr=("${BASH_REMATCH[@]:1}")
        # force base 10 (leading zeros would otherwise be read as octal) and re-pad
        touch "${ctr}$(printf '%04d' $((10#${ctr[1]} + 1)))${ctr[2]}"
        # ...
    else
        echo 'no matches'
    fi
}
shopt -s nullglob
f *
