I have a bunch of files that need to be copied over to a tmp/ directory and then compressed.
I tried cp -rf $SRC $DST but the job is terminated before the command is complete. The verbose option int help either because the log file exceeds the size limit.
I wrote a small function to print only a percentage bar, but I get the same problem with the log size limit so maybe I need to redirect stdout to stderr but I'm not sure.
This is the snippet with the function:
function cp_p() {
local files=0
while IFS= read -r -d '' file; do ((files++)); done < <(find -L $1 -mindepth 1 -name '*.*' -print0)
local duration=$(tput cols)
duration=$(($duration<80?$duration:80-8))
local count=1
local elapsed=1
local bar=""
already_done() {
bar="\r|"
for ((done=0; done<$(( ($elapsed)*($duration)/100 )); done++)); do
printf -v bar "$bar▇"
done
}
remaining() {
for ((remain=$(( ($elapsed)*($duration)/100 )); remain<$duration; remain++)); do
printf -v bar "$bar "
done
printf -v bar "$bar|"
}
percentage() {
printf -v bar "$bar%3d%s" $elapsed '%%'
}
mkdir -p "$2/$1"
chmod `stat -f %A "$1"` "$2/$1"
while IFS= read -r -d '' file; do
file=$(echo $file | sed 's|^\./\(.*\)|"\1"|')
elapsed=$(( (($count)*100)/($files) ))
already_done
remaining
percentage
printf "$bar"
if [[ -d "$file" ]]; then
dst=$2/$file
test -d "$dst" || (mkdir -p "$dst" && chmod `stat -f %A "$file"` "$dst")
else
src=${file%/*}
dst=$2/$src
test -d "$dst" || (mkdir -p "$dst" && chmod `stat -f %A "$src"` "$dst")
cp -pf "$file" "$2/$file"
fi
((count++))
done < <(find -L $1 -mindepth 1 -name '*.*' -print0)
printf "\r"
}
This is the error I get
packaging files (this may take several minutes) ...
|▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ | 98%
The log length has exceeded the limit of 4 MB (this usually means that the test suite is raising the same exception over and over).
The job has been terminated
Have you tried travis_wait cp -rf $SRC $DST? See https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received for details.
Also, I believe that generally disk operations are rather slow on macOS builds. You might be better off compressing the file structure while the files are touched. Assuming you want to gzip the thing:
travis_wait tar -zcf $DST.tar.gz $SRC
Related
I have files in this format:
2022-03-5344-REQUEST.jpg
2022-03-5344-IMAGE.jpg
2022-03-5344-00imgtest.jpg
2022-03-5344-anotherone.JPG
2022-03-5343-kdijffj.JPG
2022-03-5343-zslkjfs.jpg
2022-03-5343-myimage-2010.jpg
2022-03-5343-anotherone.png
2022-03-5342-ebee5654.jpeg
2022-03-5342-dec.jpg
2022-03-5341-att.jpg
2022-03-5341-timephoto_december.jpeg
....
about 13k images like these.
I want to create folders like:
2022-03-5344/
2022-03-5343/
2022-03-5342/
2022-03-5341/
....
I started manually moving them like:
mkdir name
mv name-* name/
But of course I'm not gonna repeat this process for 13k files.
So I want to do this using bash scripting, and since I am new to bash, and I am working on a production environment, I want to play it safe, but it doesn't give me my results. This is what I did so far:
#!/bin/bash
name = $1
mkdir "$name"
mv "${name}-*" $name/
and all I can do is: ./move.sh name for every folder, I didn't know how to automate this using loops.
With bash and a regex. I assume that the files are all in the current directory.
for name in *; do
if [[ "$name" =~ (^....-..-....)- ]]; then
dir="${BASH_REMATCH[1]}"; # dir contains 2022-03-5344, e.g.
echo mkdir -p "$dir" || exit 1;
echo mv -v "$name" "$dir";
fi;
done
If output looks okay, remove both echo.
Try this
xargs -i sh -c 'mkdir -p {}; mv {}-* {}' < <(ls *-*-*-*|awk -F- -vOFS=- '{print $1,$2,$3}'|uniq)
Or:
find . -maxdepth 1 -type f -name "*-*-*-*" | \
awk -F- -vOFS=- '{print $1,$2,$3}' | \
sort -u | \
xargs -i sh -c 'mkdir -p {}; mv {}-* {}'
Or find with regex:
find . -maxdepth 1 -type f -regextype posix-extended -regex ".*/[0-9]{4}-[0-9]{2}-[0-9]{4}.*"
You could use awk
$ cat awk.script
/^[[:digit:]-]/ && ! a[$1]++ {
dir=$1
} /^[[:digit:]-]/ {
system("sudo mkdir " dir )
system("sudo mv " $0" "dir"/"$0)
}
To call the script and use for your purposes;
$ awk -F"-([0-9]+)?[[:alpha:]]+.*" -f awk.script <(ls)
You will see some errors such as;
mkdir: cannot create directory ‘2022-03-5341’: File exists
after the initial dir has been created, you can safely ignore these as the dir now exist.
The content of each directory will now have the relevant files
$ ls 2022-03-5344
2022-03-5344-00imgtest.jpg 2022-03-5344-IMAGE.jpg 2022-03-5344-REQUEST.jpg 2022-03-5344-anotherone.JPG
I am i need to find & remove empty files. The definition of empty files in my use case is a file which has zero lines.
I did try testing the file to see if it's empty However, this behaves strangely as in even though the file is empty it doesn't detect it so.
Hence, the best thing I could write up is the below script which i way too slow given it has to test several hundred thousand files
#!/bin/bash
LOOKUP_DIR="/path/to/source/directory"
cd ${LOOKUP_DIR} || { echo "cd failed"; exit 0; }
for fname in $(realpath */*)
do
if [[ $(wc -l "${fname}" | awk '{print $1}') -eq 0 ]]
then
echo "${fname}" is empty
rm -f "${fname}"
fi
done
Is there a better way to do what I'm after or alternatively, can the above logic be re-written in a way that brings better performance please?
Your script is slow beacuse wc reads every file to the end, which is not needed for your purpose. This might be what you're looking for:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || echo rm -f -- "$file"
fi
done
Drop the echo after making sure it works as intended.
Another version, calling the rm only once, could be:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || files_to_be_deleted+=("$file")
fi
done
rm -f -- "${files_to_be_deleted[#]}"
Explanation:
The core logic is in the line
read < "$file" || rm -f -- "$file"
The read < "$file" command attempts to read a line from the $file. If it succeeds, that is, a line is read, then the rm command on the right-hand side of the || won't be executed (that's how the || works). If it fails then the rm command will be executed. In any case, at most one line will be read. This has great advantage over the wc command because wc would read the whole file.
if ! read < "$file"; then rm -f -- "$file"; fi
could be used instead. The two lines are equivalent.
To check a "$fname" is a file and is empty or not, use [ -s "$fname" ]:
#!/usr/bin/env sh
LOOKUP_DIR="/path/to/source/directory"
for fname in "$LOOKUP_DIR"*/*; do
if ! [ -s "$fname" ]; then
echo "${fname}" is empty
# remove echo when output is what you want
echo rm -f "${fname}"
fi
done
See: help test:
File operators:
...
-s FILE True if file exists and is not empty.
Yet another method
wc -l ~/tmp/* 2>/dev/null | awk '$1 == 0 {print $2}' | xargs echo rm
This will break if any of your files have whitespace in the name.
To work around that, with awk still
wc -l ~/tmp/* 2>/dev/null \
| awk 'sub(/^[[:blank:]]+0[[:blank:]]+/, "")' \
| xargs echo rm
This works because the sub function returns the number of substitutions made, which can be treated as a boolean zero/not-zero condition.
Remove the echo to actually delete the files.
I am currently stuck with a problem in my Bash script and seem to run even deeper in the dark with every attempt of trying to fix it.
Background:
We have a folder which is getting filled with numbered crash folders, which get filled with crash files. Someone is exporting a list of these folders on a daily basis. During that export, the numbered crash folders get an attribute "user.exported=1".
Some of them do not get exported, so they will not have the attribute and these should be deleted only if they are older than 30 days.
My problem:
I am setting up a bash script, which is being run via Cron in the end to check on a regular basis for folders, which have the attribute "user.exported=1" and are older than 14 days and deletes them via rm -rfv FOLDER >> deleted.log
We however also have folders which do not have or get the attribute "user.exported=1" which then need to be deleted after they are older than 30 days. I created an IF ELIF FI comparison to check for that but that is where I got stuck.
My Code:
#!/bin/bash
# Variable definition
LOGFILE="/home/crash/deleted.log"
DATE=`date '+%d/%m/%Y'`
TIME=`date '+%T'`
FIND=`find /home/crash -maxdepth 1 -mindepth 1 -type d`
# Code execution
printf "\n$DATE-$TIME\n" >> "$LOGFILE"
for d in $FIND; do
# Check if crash folders are older than 14 days and have been exported
if [[ "$(( $(date +"%s") - $(stat -c "%Y" $d) ))" -gt "1209600" ]] && [[ "$(getfattr -d --absolute-names -n user.exported --only-values $d)" == "1" ]]; then
#echo "$d is older than 14 days and exported"
"rm -rfv $d" >> "$LOGFILE"
# Check if crash folders are older than 30 days and delete regardless
elif [[ "$(( $(date +"%s") - $(stat -c "%Y" $d) ))" -gt "1814400" ]] && [[ "$(getfattr -d --absolute-names -n user.exported $d)" == FALSE ]]; then
#echo "$d is older than 30 days"
"rm -rfv $d" >> "$LOGFILE"
fi
done
The IF part is working fine and it deleted the folders with the attribute "user.exported=1" but the ELIF part does not seem to work, as I only get an output in my bash such as:
/home/crash/1234: user.exported: No such attribut
./crash_remove.sh: Line 20: rm -rfv /home/crash/1234: File or Directory not found
When I look into the crash folder after the script ran, the folder and its content is still there.
I definitely have an error in my script but cannot see it. Please could anyone help me out with this?
Thanks in advance
Only quote the expansions, not the whole command.
Instead of:
"rm -rfv $d"
do:
rm -rfv "$d"
If you quote it all, bash tries to run a command named literally rm<space>-rfv<space><expansion of d>.
Do not use backticks `. Use $(...) instead. Bash hackers wiki obsolete deprecated syntax.
Do not for i in $(cat) or var=$(...); for i in $var. Use a while IFS= read -r loop. How to read a file line by line in bash.
Instead of if [[ "$(( $(date +"%s") - $(stat -c "%Y" $d) ))" -gt "1814400" ]] just do the comparison in the arithmetic expansion, like: if (( ( $(date +"%s") - $(stat -c "%Y" $d) ) > 1814400 )).
I think you could just do it all in find, like::
find /home/crash -maxdepth 1 -mindepth 1 -type d '(' \
-mtime 14 \
-exec sh -c '[ "$(getfattr -d --absolute-names -n user.exported --only-values "$1")" = "1" ]' -- {} \; \
-exec echo rm -vrf {} + \
')' -o '(' \
-mtime 30 \
-exec sh -c '[ "$(getfattr -d --absolute-names -n user.exported "$1")" = FALSE ]' -- {} \; \
-exec echo rm -vrf {} + \
')' >> "$LOGFILE"
I am looking to collect yang models from my project .jar files.Though i came with an approach but it takes time and my colleagues are not happy.
#!/bin/sh
set -e
# FIXME: make this tuneable
OUTPUT="yang models"
INPUT="."
JARS=`find $INPUT/system/org/linters -type f -name '*.jar' | sort -u`
# FIXME: also wipe output?
[ -d "$OUTPUT" ] || mkdir "$OUTPUT"
for jar in $JARS; do
artifact=`basename $jar | sed 's/.jar$//'`
echo "Extracting modules from $artifact"
# FIXME: better control over unzip errors
unzip -q "$jar" 'META-INF/yang/*' -d "$artifact" \
2>/dev/null || true
dir="$artifact/META-INF/yang"
if [ -d "$dir" ]; then
for file in `find $dir -type f -name '*.yang'`; do
module=`basename "$file"`
echo -e "\t$module"
# FIXME: better duplicate detection
mv -n "$file" "$OUTPUT"
done
fi
rm -rf "$artifact"
done
If the .jar files don't all change between invocations of your script then you could make the script significantly faster by caching the .jar files and only operating on the ones that changed, e.g.:
#!/bin/env bash
set -e
# FIXME: make this tuneable
output='yang models'
input='.'
cache='/some/where'
mkdir -p "$cache" || exit 1
readarray -d '' jars < <(find "$input/system/org/linters" -type f -name '*.jar' -print0 | sort -zu)
# FIXME: also wipe output?
mkdir -p "$output" || exit 1
for jarpath in "${jars[#]}"; do
diff -q "$jarpath" "$cache" || continue
cp "$jarpath" "$cache"
jarfile="${jarpath##*/}"
artifact="${jarfile%.*}"
printf 'Extracting modules from %s\n' "$artifact"
# FIXME: better control over unzip errors
unzip -q "$jarpath" 'META-INF/yang/*' -d "$artifact" 2>/dev/null
dir="$artifact/META-INF/yang"
if [ -d "$dir" ]; then
readarray -d '' yangs < <(find "$dir" -type f -name '*.yang' -print0)
for yangpath in "${yangs[#]}"; do
yangfile="${yangpath##*/}"
printf '\t%s\n' "$yangfile"
# FIXME: better duplicate detection
mv -n "$yangpath" "$output"
done
fi
rm -rf "$artifact"
done
See Correct Bash and shell script variable capitalization, http://mywiki.wooledge.org/BashFAQ/082, https://mywiki.wooledge.org/Quotes, How can I store the "find" command results as an array in Bash for some of the other changes I made above.
I assume you have some reason for looping on the .yang files and not moving them if a file by the same name already exists rather than unzipping the .jar file into the final output directory.
AIM: To find files with a word count less than 1000 and move them another folder. Loop until all under 1k files are moved.
STATUS: It will only move one file, then error with "Unable to move file as it doesn't exist. For some reason $INPUT_SMALL doesn't seem to update with the new file name."
What am I doing wrong?
Current Script:
Check for input files already under 1k and move to Split folder
INPUT_SMALL=$( ls -S /folder1/ | grep -i reply | tail -1 )
INPUT_COUNT=$( cat /folder1/$INPUT_SMALL 2>/dev/null | wc -l )
function moveSmallInput() {
while [[ $INPUT_SMALL != "" ]] && [[ $INPUT_COUNT -le 1003 ]]
do
echo "Files smaller than 1k have been found in input folder, these will be moved to the split folder to be processed."
mv /folder1/$INPUT_SMALL /folder2/
done
}
I assume you are looking for files that has the word reply somewhere in the path. My solution is:
wc -w $(find /folder1 -type f -path '*reply*') | \
while read wordcount filename
do
if [[ $wordcount -lt 1003 ]]
then
printf "%4d %s\n" $wordcount $filename
#mv "$filename" /folder2
fi
done
Run the script once, if the output looks correct, then uncomment the mv command and run it for real this time.
Update
The above solution has trouble with files with embedded spaces. The problem occurs when the find command hands its output to the wc command. After a little bit of thinking, here is my revised soltuion:
find /folder1 -type f -path '*reply*' | \
while read filename
do
set $(wc -w "$filename") # $1= word count, $2 = filename
wordcount=$1
if [[ $wordcount -lt 1003 ]]
then
printf "%4d %s\n" $wordcount $filename
#mv "$filename" /folder2
fi
done
A somewhat shorter version
#!/bin/bash
find ./folder1 -type f | while read f
do
(( $(wc -w "$f" | awk '{print $1}' ) < 1000 )) && cp "$f" folder2
done
I left cp instead of mv for safery reasons. Change to mv after validating
I you also want to filter with reply use #Hai's version of the find command
Your variables INPUT_SMALL and INPUT_COUNT are not functions, they're just values you assigned once. You either need to move them inside your while loop or turn them into functions and evaluate them each time (rather than just expanding the variable values, as you are now).