This question already has answers here:
Is there a way to get the git root directory in one command?
I'm attempting to find the "root" of a folder. I'm doing this in a Bash script with the following (at least in my head):
# Get current directory (e.g. /foo/bar/my/subdir)
CURR_DIR=`pwd`
# Break down into array of folder names
DIR_ARRAY=(${CURR_DIR//\// })
# Iterate over items in DIR_ARRAY starting with "subdir"
<HELP WITH FOR LOOP SYNTAX>
# Each loop:
# build path to current item in DIR_ITER; e.g.
# iter N: DIR_ITER=/foo/bar/my/subdir
# iter N-1: DIR_ITER=/foo/bar/my
# iter N-2: DIR_ITER=/foo/bar
# iter 0: DIR_ITER=/foo
# In each loop:
# get the contents of directory using "ls -a"
# look for .git
# set ROOT=DIR_ITER
export ROOT
I've Googled looping in Bash, but everything uses the "for i in ARRAY" form, which iterates forward rather than in reverse. What's the recommended way to achieve what I want to do?
Here's one idea based on reverse index referencing.
First our data:
$ CURR_DIR=/a/b/c/d/e/f
$ DIR_ARRAY=( ${CURR_DIR//\// } )
$ typeset -p DIR_ARRAY
declare -a DIR_ARRAY=([0]="a" [1]="b" [2]="c" [3]="d" [4]="e" [5]="f")
Our list of indices:
$ echo "${!DIR_ARRAY[#]}"
0 1 2 3 4 5
Our list of indices in reverse:
$ echo "${!DIR_ARRAY[#]}" | rev
5 4 3 2 1 0
Looping through our reverse list of indices (note: this string-reversal trick assumes all indices are single digits; with ten or more elements, rev would mangle multi-digit indices):
$ for i in $(echo "${!DIR_ARRAY[@]}" | rev)
do
echo $i
done
5
4
3
2
1
0
As for working your way up the directory structure using this 'reverse' index strategy:
$ LOOP_DIR="${CURR_DIR}"
$ for i in $(echo "${!DIR_ARRAY[@]}" | rev)
do
echo "${DIR_ARRAY[${i}]}:${LOOP_DIR}"
LOOP_DIR="${LOOP_DIR%/*}"
done
f:/a/b/c/d/e/f
e:/a/b/c/d/e
d:/a/b/c/d
c:/a/b/c
b:/a/b
a:/a
Though we could accomplish the same thing a) without the array and b) using some basic parameter expansions, e.g.:
$ LOOP_DIR="${CURR_DIR}"
$ while [ "${LOOP_DIR}" != '' ]
do
subdir="${LOOP_DIR##*/}"
echo "${subdir}:${LOOP_DIR}"
LOOP_DIR="${LOOP_DIR%/*}"
done
f:/a/b/c/d/e/f
e:/a/b/c/d/e
d:/a/b/c/d
c:/a/b/c
b:/a/b
a:/a
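Tying this back to the original question: the same upward walk can test each level for a .git directory and export the first hit as ROOT. A minimal sketch, assuming we start from the current directory:
ROOT=''
LOOP_DIR="$(pwd)"
while [ "${LOOP_DIR}" != '' ]
do
    if [ -d "${LOOP_DIR}/.git" ]      # does this level contain .git?
    then
        ROOT="${LOOP_DIR}"
        break
    fi
    LOOP_DIR="${LOOP_DIR%/*}"         # drop the last path component
done
export ROOT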
You can use dirname in a loop to find the parent folder, then move up until you find, e.g., the .git folder.
Quick example:
#!/usr/bin/env bash
set -eu
for arg in "$#"
do
current=$arg
while true
do
if [ -d "$current/.git" ]
then
echo "$arg: .git in $current"
break
fi
parent="$(dirname "$current")"
if [ "$parent" == "$current" ]
then
echo "No .git in $arg"
break
fi
current=$parent
done
done
For each parameter you pass to this script, it will print where it found the .git folder up the directory tree, or print an error if it didn't find it.
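As an aside (and as the linked duplicate points out), when git itself is available you don't need a loop at all; git rev-parse --show-toplevel prints the repository root directly:
ROOT=$(git rev-parse --show-toplevel) && export ROOT    # fails cleanly outside a repo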
Related
I have a directory with a lot of files, which can be grouped based on their names. For example here I have 4 groups with 5 files in each:
ls ./
# group 1
NpXynWT_apo_300K_0.pdb
NpXynWT_apo_300K_1.pdb
NpXynWT_apo_300K_2.pdb
NpXynWT_apo_300K_3.pdb
NpXynWT_apo_300K_4.pdb
# group 2
NpXynWT_apo_340K_0.pdb
NpXynWT_apo_340K_1.pdb
NpXynWT_apo_340K_2.pdb
NpXynWT_apo_340K_3.pdb
NpXynWT_apo_340K_4.pdb
# group 3
NpXynWT_com_300K_0.pdb
NpXynWT_com_300K_1.pdb
NpXynWT_com_300K_2.pdb
NpXynWT_com_300K_3.pdb
NpXynWT_com_300K_4.pdb
# group 4
NpXynWT_com_340K_0.pdb
NpXynWT_com_340K_1.pdb
NpXynWT_com_340K_2.pdb
NpXynWT_com_340K_3.pdb
NpXynWT_com_340K_4.pdb
So here each of the 5 files of the same group is different by the end suffix from 0 to 4:
NpXynWT_apo_300K_0 ... NpXynWT_apo_300K_4
NpXynWT_apo_340K_0 ... NpXynWT_apo_340K_4
etc
I need to loop over all of these files and:
1) pre-process each file: add "MODEL" plus the file's number (a number between 0 and 4) before the first line, and "ENDMDL" after the last line;
2) cat together the pre-processed files of the same group.
In summary, my script should create 4 new "combined" files, each consisting of the 5 subfiles from the initial list.
To implement this I created an array of the groups and looped over it with an index from 0 to 4, using two loops: 1) pre-process each file; 2) cat the pre-processed files together:
# list of 4 groups
systems=(NpXynWT_apo_300K NpXynWT_apo_340K NpXynWT_com_300K NpXynWT_com_340K)
# pre-process files
for model in "${systems[#]}"; do
i="0"
while [ $i -lt 5 ]; do
# EDIT EXISTING FILES
sed -i "1 i\MODEL $i" "${pdbs}"/"${model}"_"$i"_FA.pdb
echo "ENDMDL" >> "${pdbs}"/"${model}"_"$i"_FA.pdb
i=$[$i+1]
done
done
# cat pre-processed files
for model in "${systems[@]}"; do
cat "${pdbs}"/"${model}"_[0-4]_FA.pdb > "${output}/${model}.pdb"
done
1 - Would it be possible to merge the two loops? E.g. would the following be the same:
# pre-processing PBDs and it catting
for model in "${systems[#]}"; do
##echo "$model"
i="0"
while [ $i -lt 5 ]; do
k=$[$i+1]
## do something with pdb
sed -i "1 i\MODEL $k" "${pdbs}"/"${model}"_"$i"_FA.pdb
echo "ENDMDL" >> "${pdbs}"/"${model}"_"$i"_FA.pdb
#gedit "${pdbs}"/"${model}"_"$i"_FA.pdb
i=$[$i+1]
done
# now we cat together the post-processed files
cat "${pdbs}"/"${model}"_[0-4]_FA.pdb > "${output}/${model}.pdb"
done
2 - Would it be possible to simplify the two file-editing operations from the first loop?
sed -i "1 i\MODEL $i" "${pdbs}"/"${model}"_"$i"_FA.pdb
echo "ENDMDL" >> "${pdbs}"/"${model}"_"$i"_FA.pdb
How can I match the entries of the "groups" array to the files present in the folder?
Use find. It is there to find files.
groups=(NpXynWT_apo_300K NpXynWT_apo_340K NpXynWT_com_300K NpXynWT_com_340K)
for group in "${groups[@]}"; do
find . -name "${group}_*.pdb" -type f
done
You can be even more exact by using -regex and similar find options.
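For instance, a sketch assuming GNU find (which provides -regextype) and the _0 to _4 naming from the question's listing:
for group in "${groups[@]}"; do
    find . -maxdepth 1 -type f -regextype posix-extended \
         -regex "\./${group}_[0-4]\.pdb"   # exact-match the whole name
done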
I am writing a bash script named safeDel.sh with base functionalities including:
file [file1, file2, file3...]
-l
-t
-d
-m
-k
-r arg
For the single-letter arguments I am using the getopts builtin, which works fine. The issue I'm having now is with the 'file' argument. The 'file' argument should take a list of files to be moved to a directory like this:
$ ./safeDel.sh file file1.txt file2.txt file3.txt
The following is a snippet of the start of my program :
#! /bin/bash
files=("$#")
arg="$1"
echo "arguments: $arg $files"
The echo statement shows the following:
arguments: file file
How can I split up the file argument from the files that have to be moved to the directory?
Assuming that the options processed by getopts have been shifted off the command line arguments list, and that a check has been done to ensure that at least two arguments remain, this code should do what is needed:
arg=$1
files=( "${#:2}" )
echo "arguments: $arg ${files[*]}"
files=( "${#:2}" ) puts all the command line arguments after the first into an array called files. See Handling positional parameters [Bash Hackers Wiki] for more information.
${files[*]} expands to the list of files in the files array inside the argument to echo. To safely expand the list in files for looping, or to pass to a command, use "${files[#]}". See Arrays [Bash Hackers Wiki].
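For example, a sketch of the follow-on move step (the destination directory name here is hypothetical):
dest="$HOME/.safeDel"                  # hypothetical trash directory
mkdir -p "$dest"
for f in "${files[@]}"; do
    mv -- "$f" "$dest"/                # quoted expansion survives spaces in names
done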
This is a way you can achieve your needs:
#!/bin/bash
declare -a files="$#"
for fileToManage in ${files}; do
echo "Managing ... $fileToManage"
done
But it works only if there are no spaces in your file names; otherwise you need to do some additional work.
Let me know if you need further help.
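A space-safe variant of the same idea (a minimal sketch) stores the arguments in a real array and quotes the expansion:
#!/bin/bash
files=( "$@" )                          # each argument becomes one array element
for fileToManage in "${files[@]}"; do   # quoted: names with spaces survive intact
    echo "Managing ... $fileToManage"
done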
function getting_arguments {
# using windows powershell
echo @($args).GetType()
echo @($args).length
echo "@($args)[0]"
echo @($args)[0]
echo "@($args)[1..(@($args).length)]"
echo @($args)[1..(@($args).length)]
echo "debug: $(@($args)[0])" @($args)[1..(@($args).length)]
}
OUTPUT
PS C:\monkey> getting_arguments 1 2 3 4 5 6
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Object[] System.Array
6
@(1 2 3 4 5 6)[0]
1
@(1 2 3 4 5 6)[1..(@(1 2 3 4 5 6).length)]
2
3
4
5
6
debug: 1
2
3
4
5
6
I have a directory full of directories containing exam subjects I would like to work on randomly to simulate the real exam.
They are classified by difficulty level:
0-0, 0-1 .. 1-0, 1-1 .. 2-0, 2-1 ..
I am trying to write a shell script allowing me to pick one subject (directory) randomly based on the parameter I pass when executing the script (0, 1, 2 ..).
I can't quite figure it, here is my progress so far:
ls | find . -name "1$~" | sort -r | head -n 1
What am I missing here?
There's no need for any external commands (ls, find, sort, head) for this at all:
#!/usr/bin/env bash
set -o nullglob # make globs expand to nothing, not themselves, when no matches found
dirs=( "$1"*/ ) # list directories starting with $1 into an array
# Validate that our glob actually had at least one match
(( ${#dirs[@]} )) || { printf 'No directories start with %q at all\n' "$1" >&2; exit 1; }
idx=$(( RANDOM % ${#dirs[#]} )) # pick a random index into our array
echo "${dirs[$idx]}" # and look up what's at that index
I have this in my local directory ~/Report:
Rep_{ReportType}_{Date}_{Seq}.csv
Rep_0001_20150102_0.csv
Rep_0001_20150102_1.csv
Rep_0102_20150102_0.csv
Rep_0503_20150102_0.csv
Using a shell script:
How do I get multiple files from a local directory with a fixed batch size?
How do I segregate/group the files together by report type (0001 files are grouped together, 0102 grouped together, 0503 grouped together, etc.)
I will generate a sequence file (using forqlift) for EACH group/report type. The output would be Report0001.seq, Report0102.seq, Report0503.seq (3 sequence files). In which I will save to a different directory.
Note: In sequence files, the key is the filename of csv (Rep_0001_20150102.csv), and the value is the content of the file. It is stored as [String, BytesWritable].
This is my code:
1 reportTypes=(0001 0102 8902)
2
3 # collect all files matching expression into an array
4 filesWithDir=(~/Report/Rep_[0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-1].csv)
5
6 # take only the first hundred
7 filesWithDir=( "${filesWithDir[@]:0:100}" )
8
9 # files="${filesWithDir[#]##*/}" #### commented out since forqlift cannot create sequence file without the path/to/file
10 # echo ${files[#]}
11
12 shopt -s nullglob
13
14 # Line 21 is commented out since it has a bug. It collects files in
15 # current directory when it should be filtering the "files array" created
16 # in line 7
17
18
19 for i in ${reportTypes[@]}; do
20 printf -v val '%04d' "$i"
21 # files=("Rep_${val}_"*.csv)
# solution to BUG: (filter files array)
groupFiles=( $( for j in ${filesWithDir[@]} ; do echo $j ; done | grep ${val} ) )
22
23 # Generate sequence file for EACH Report Type
24 forqlift create --file="Report${val}.seq" "${groupFiles[#]}"
25 done
(Note: The sequence file output should be in current directory, not in ~/Report)
It's easy to take only a subset of an array:
# collect all files matching expression into an array
files=( ~/Report/Rep_[0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-1].csv )
# take only the first hundred
files=( "${files[#]:0:100}" )
The second part is trickier: Bash has associative arrays ("maps"), but the only legal values which can be stored in arrays are strings -- not other arrays -- so you can't store a list of filenames as a value associated with a single entry (without serializing the array to and from a string -- a moderately tricky thing to do safely, since file paths in UNIX can contain any character other than NUL, newlines included).
It's better, then, to just generate the array as you need it.
shopt -s nullglob # allow a glob to expand to zero arguments
for ((i=1; i<=1000; i++)); do
printf -v val '%04d' "$i" # pad digits: 12 -> 0012
files=( "Rep_${val}_"*.csv ) # collect files that match
## emit NUL-separated list of files, if any were found
#(( ${#files[@]} )) && printf '%s\0' "${files[@]}" >"Reports.$val.txt"
# Create a sequence file with forqlift
forqlift create --file="Reports-${val}.seq" "${files[#]}"
done
If you really don't want to do that, then we can put something together that uses namevars for redirection:
#!/bin/bash
# This requires bash 4.3 or newer (for namerefs)
re='^Rep_([[:digit:]]{4})_[[:digit:]]{8}_[01]\.csv$'
counter=0
for f in *; do
[[ $f =~ $re ]] || continue # skip files not matching regex
if ((++counter > 100)); then break; fi # stop after 100 files
group=${BASH_REMATCH[1]} # retrieve first regex group
declare -g -a "array${group}" # declare an array
declare -n group_arr="array${group}" # redirect group_arr to that array
group_arr+=( "$f" ) # append to the array
done
for varname in "${!array#}"; do
declare -n group_arr="$varname"
## NUL-delimited form
#printf '%s\0' "${group_arr[@]}" \
# >"collection${varname#array}" # write to files named collection0001, etc.
# forqlift sequence file form
forqlift create --file="Reports-${varname#array}.seq" "${group_arr[#]}"
done
I would move away from shell scripts and start to look towards perl.
#!/usr/bin/env perl
use strict;
use warnings;
my %groups;
while ( my $filename = glob ( "~/Report/Rep_*.csv" ) ) {
my ( $group, $id ) = ( $filename =~ m,/Rep_(\d{4})_(\d{8})_\d\.csv$, );
next unless $group; #undefined means it didn't match;
$groups{$group} //= [];    # first file seen for this group: start its list
# anything past 100 in a group is discarded:
if ( @{ $groups{$group} } < 100 ) {
    push( @{ $groups{$group} }, $filename );
}
}
foreach my $group ( keys %groups ) {
print "$group contains:\n";
print join( "\n", @{ $groups{$group} } ), "\n";
}
Another alternative is to cobble together some bash commands with a regexp.
See the implementation below:
# Explanation:
# ls -p = list all files and directories in the local directory, appending / to directory names
# grep -v / = ignore subdirectories
# grep -E "..." = look for files matching your regexp
# tail -100 = get the last 100 results
for file in $(ls -p | grep -v / | grep -E "^Rep_[0-9]{4}_[0-9]{8}_[01]\.csv$" | tail -100);
do echo "$file";
# Use a regexp to extract the desired sequence (report type)
re="^Rep_([[:digit:]]{4})_([[:digit:]]{8})_[01]\.csv$";
if [[ $file =~ $re ]]; then
sequence=${BASH_REMATCH[1]};
# Didn't end up using the date, but in case you want it
# date=${BASH_REMATCH[2]};
# Just in case the sequence file doesn't exist
if [ ! -f "$sequence" ] ; then
touch "$sequence"
fi
# Append your filename to the sequence file, which you can
# read in later to do whatever administrative tasks you wish to do
# to them
echo "$file" >> "$sequence"
fi
done;
I am creating a script to run on OS X which will be run often by a novice user, and so want to protect a directory structure by creating a fresh one each time with an n+1 over the last:
target001 with the next run creating target002
I have so far:
lastDir=$(find /tmp/target* | tail -1 | cut -c 6-)
let n=$n+1
mkdir "$lastDir""$n"
However, the math isn't working here.
What about
mktemp?
Create a temporary file or directory, safely, and print its name.
TEMPLATE must contain at least 3 consecutive 'X's in last component.
If TEMPLATE is not specified, use tmp.XXXXXXXXXX, and --tmpdir is
implied. Files are created u+rw, and directories u+rwx, minus umask
restrictions.
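For example (a sketch; note the trade-off: mktemp guarantees a fresh, safely created directory, but its suffix is random rather than the sequential n+1 the question asks for):
dir=$(mktemp -d /tmp/target.XXXXXXXX)    # e.g. /tmp/target.h3KpQ2rX
echo "$dir"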
Use this line to calculate the new sequence number:
...
n=$(printf "%03d" $(( 10#$n + 1 )) )
mkdir "$lastDir""$n"
10# forces base-10 arithmetic. This assumes $n already holds the last sequence number, e.g. "001".
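A quick demonstration of why the base prefix matters: without it, a leading zero makes bash parse the value as octal, and 008/009 are invalid octal numbers:
$ n=009
$ echo $(( n + 1 ))
bash: 009: value too great for base (error token is "009")
$ printf '%03d\n' $(( 10#$n + 1 ))
010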
No pipes and subprocesses:
targets=( /tmp/target* ) # all dirs in an array
lastdir=${targets[@]: (-1):1} # select filename from last array element
lastdir=${lastdir##*/} # remove path
lastnumber=${lastdir/target/} # remove 'target'
lastnumber=00$(( 10#$lastnumber + 1 )) # increment number (base 10), add leading zeros
mkdir /tmp/target${lastnumber: -3} # make dir; last 3 chars from lastnumber
A version with 2 parameters:
path='/tmp/x/y/z' # path without last part
basename='target' # last part
targets=( $path/${basename}* ) # all dirs in an array
lastdir=${targets[@]: (-1):1} # select path from last entry
lastdir=${lastdir##*/} # select filename
lastnumber=${lastdir/$basename/} # remove 'target'
lastnumber=00$(( 10#$lastnumber + 1 )) # increment number (base 10), add leading zeros
mkdir $path/$basename${lastnumber: -3} # make dir; last 3 chars from lastnumber
Complete solution using extended test [[ and BASH_REMATCH :
[[ $(find /tmp/target* | tail -1) =~ ^(.*)([0-9]{3})$ ]]
mkdir $(printf "${BASH_REMATCH[1]}%03d" $(( 10#${BASH_REMATCH[2]} + 1 )) )
Provided /tmp/target001 is your directory pattern.
Like this:
lastDir=$(find /tmp/target* | tail -1)
let n=1+10#${lastDir##/tmp/target}
mkdir /tmp/target$(printf "%03d" $n)