BASH script more smart with cat - bash

I have multiple files in multiple folders
[tiagocastro@cascudo clean_reads]$ ls
11 13 14 16 17 18 3 4 5 6 8 9
and I want to write a small bash script to concatenate the files inside each folder:
11]$ ls
FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L6_1.fq FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L7_1.fq
FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L6_2.fq FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L7_2.fq
But only L6 with L6 and L7 with L7.
Right now I am at a basic level. I want to learn how to do this more smartly, instead of just reproducing in the script the commands I would type in the terminal.
Thank you, everybody, for helping me.

This isn't a free programming service, but you can learn something from the following:
#!/bin/bash
# Print a message on stderr.
echo2() { echo "$@" >&2; }
# List the distinct lane tags (L6, L7, ...) found in the current directory.
get_Lnums() {
    find . -maxdepth 1 -type f -regextype posix-extended -iregex '.*_L[0-9]+_[0-9]+\.fq' -print | grep -oP '_\KL\d+' | sort -u
}
docat() {
    echo2 "doing $(pwd)"
    for lnum in $(get_Lnums)
    do
        echo cat *_${lnum}_*.fq "> new_${lnum}.all" #remove (comment out) this line when satisfied
        #cat *_${lnum}_*.fq > new_${lnum}.all #and uncomment this
    done
}
while IFS= read -r -d '' dir
do
    (cd "$dir" && docat) #subshell - don't need cd back
done < <(find . -maxdepth 1 -mindepth 1 -type d -print0)
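The same per-lane grouping can also be sketched without find and grep, using a bash regex to pull the lane tag out of each file name. This is only a sketch: the function name concat_by_lane and the output name merged_<lane>.fq are assumptions, and like the script above it merges the _1 and _2 files of a lane into one output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge the .fq files of one directory, one output per lane.
concat_by_lane() {
    local dir=$1 file lane
    # Remove earlier results so that re-running does not append twice.
    rm -f -- "$dir"/merged_L*.fq
    for file in "$dir"/*_L[0-9]*_[0-9]*.fq; do
        [[ -e $file ]] || continue          # the glob matched nothing
        # Capture the lane tag (L6, L7, ...) from names like ..._L6_1.fq
        if [[ ${file##*/} =~ _(L[0-9]+)_[0-9]+\.fq$ ]]; then
            lane=${BASH_REMATCH[1]}
            cat -- "$file" >> "$dir/merged_${lane}.fq"
        fi
    done
}
```

Calling concat_by_lane 11 would then write 11/merged_L6.fq and 11/merged_L7.fq; a loop over the sample folders can call it once per folder.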

Related

How to change multiple directories' name by computation on the numbers in the names?

I have some directories. Their names are as follows,
s1_tw
s2_tw
s3_tw
s4_tw
How can I change their names by adding a fixed integer to the number following "s"? How can I change the directories' names to
s22_tw
s23_tw
s24_tw
s25_tw
by changing s1 to s22 (1 + 21 = 22), s2 to s23, etc? (Here adding 21 is expected)
I tried with
for f in s*_tw
do
for i in `seq 1 1 4`
do
mv -i "${f//s${i}/s(${i}+21)}"
done
done
But I know it is not correct, because I can not perform addition operation in this command. Could you please give some suggestions?
This will rename your directories:
#!/bin/bash
find . -maxdepth 1 -type d -name "s?_tw" -print0 | while IFS= read -r -d '' dir
do
digit=$(echo "$dir" | sed 's#./s\([0-9]\)_tw#\1#')
echo "DIGIT=$digit"
(( newdigit = digit + 21 ))
echo "NEWDIGIT=$newdigit"
mv "$dir" "s${newdigit}_tw"
done
The find -print0 with while and read comes from https://mywiki.wooledge.org/BashFAQ/001.
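A variant that also copes with numbers of more than one digit can be sketched with a bash regex and arithmetic expansion. The helper names shifted_name and rename_shifted are made up for illustration, and the sketch assumes a positive offset and directory names without newlines. It renames the highest number first, so a new name such as s22_tw cannot collide with a directory that has not been renamed yet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print NAME with the number after "s" shifted by OFFSET.
shifted_name() {
    local name=$1 offset=$2
    if [[ $name =~ ^s([0-9]+)(_tw)$ ]]; then
        # 10# forces base 10, so a name like s08_tw is not parsed as octal.
        printf 's%d%s\n' "$(( 10#${BASH_REMATCH[1]} + offset ))" "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

# Rename every s<number>_tw directory inside DIR, highest number first.
rename_shifted() {
    local dir=$1 offset=$2 name
    while IFS= read -r name; do
        [[ -d $dir/$name ]] || continue     # skip an unmatched glob
        mv -- "$dir/$name" "$dir/$(shifted_name "$name" "$offset")"
    done < <(cd "$dir" && printf '%s\n' s*_tw | sort -t s -k 2 -n -r)
}
```

For the directories in the question, rename_shifted . 21 would turn s1_tw through s4_tw into s22_tw through s25_tw.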

Find groups of files that end with the same 17 characters

I'm grabbing files that have a unique and common pattern. I'm trying to match on the common. Currently trying with bash. I can use python or whatever.
file1_02_01_2021_002244.mp4
file2_02_01_2021_002244.mp4
file3_02_01_2021_002244.mp4
# _02_01_2021_002244.mp4 should be the 'match all files that contain this string'
file1_03_01_2021_092200.mp4
file2_03_01_2021_092200.mp4
file3_03_01_2021_092200.mp4
# _03_01_2021_092200.mp4 is the match
...
file201_01_01_2022_112230.mp4
file202_01_01_2022_112230.mp4
file203_01_01_2022_112230.mp4
# _01_01_2022_112230.mp4 is the match
The goal is to find all files whose names match from the very end back to the first unique character, then move them into a folder. The actionable part will be easy. I just need help with the matching.
find -type f $("all that match the same last 17 characters of the file name"); do
do things
done
this is my example directory:
total 28480
drwxr-xr-x 2 user user 64B Feb 24 10:49 dir1
drwxr-xr-x 2 user user 64B Feb 24 10:49 dir2
-rw-r--r-- 2 user user 6.8M Feb 24 08:59 file1_02_01_2021_002244.mp4
-rw-r--r-- 2 user user 468K Feb 24 09:06 file1_03_01_2021_092200.mp4
-rw-r--r-- 2 user user 4.5M Feb 24 08:59 file2_02_01_2021_002244.mp4
-rw-r--r-- 2 user user 665K Feb 24 09:06 file2_03_01_2021_092200.mp4
-rw-r--r-- 1 user user 0B Feb 24 10:49 otherfile1
-rw-r--r-- 1 user user 0B Feb 24 10:49 otherfile2
I've got it to work with suggestions from the answer marked as correct. The Python method could probably work better (especially with file names that have spaces in them), but I'm not proficient enough with Python to make it do everything I want. The script in full is found below:
#!/usr/local/bin/bash
# this is my solution
# create array with patterns
aPATTERN=($(find . -type f -name "*.mp4" | sed 's/^[^_]*//' | sort -u))
# iterate through all patterns, do things
for each in "${aPATTERN[@]}"; do
    # create a temp working directory for files that match the pattern
    vDIR=$(gmktemp -d -p "$(pwd)")
    # create array of all files found matching the pattern
    aFIND+=($(find . -mindepth 1 -maxdepth 1 -type f -iname "*$each"))
    # move all files that match the pattern to the working temp directory
    for file in "${aFIND[@]}"; do
        mv -iv "$file" "$vDIR"
    done
    # reset the found files array, get ready for next pattern
    aFIND=()
done
In Python:
import os

os.chdir("folder_path")
# Pair each file name with its last 22 characters (e.g. "_02_01_2021_002244.mp4").
data = [[file[-22:], file] for file in os.listdir()]
output = {}
for pattern, filename in data:
    output.setdefault(pattern, []).append(filename)
print(output)
This will create a dict mapping each pattern to the list of files that end with it.
Output:
{
'_03_01_2021_092200.mp4': ['file1_03_01_2021_092200.mp4', 'file3_03_01_2021_092200.mp4', 'file2_03_01_2021_092200.mp4'],
'_01_01_2022_112230.mp4': ['file202_01_01_2022_112230.mp4', 'file201_01_01_2022_112230.mp4', 'file203_01_01_2022_112230.mp4'],
'_02_01_2021_002244.mp4': ['file1_02_01_2021_002244.mp4', 'file2_02_01_2021_002244.mp4', 'file3_02_01_2021_002244.mp4']
}
Try playing with this.
First get all the patterns, sorted and unique:
find ./data -type f -name "*.mp4" | sed 's/^[^_]*//'|sort -u
or with regex
find ./data -type f -regextype sed -regex '.*_[0-9]\{2\}_[0-9]\{2\}_[0-9]\{4\}_[0-9]\{6\}\.mp4$'| sed 's/^[^_]*//'|sort -u
Then iterate over the patterns with a while loop to find the files for each pattern:
while IFS= read -r pattern
do
    # find and exec
    find ./data -type f -name "*$pattern" -exec mv {} /to/whatever/you/want/ \;
    # or find and xargs (use one of the two, not both)
    #find ./data -type f -name "*$pattern" -print0 | xargs -0 -I {} mv {} /to/whatever/you/want/
done < <(find ./data -type f -name "*.mp4" | sed 's/^[^_]*//' | sort -u)
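The two-step approach (collect the patterns, then search again for each one) can also be collapsed into a single pass that tolerates spaces in file names. The following is a minimal sketch under a few assumptions: the name group_by_suffix and the layout of one destination folder per suffix are invented for illustration, and the shared suffix is taken to start at the first underscore, as in the sed expression above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move every .mp4 in SRC into DEST/<shared suffix>/.
group_by_suffix() {
    local src=$1 dest=$2 path name suffix
    for path in "$src"/*.mp4; do
        [[ -f $path ]] || continue          # the glob matched nothing
        name=${path##*/}
        [[ $name == *_* ]] || continue      # no underscore, nothing to group on
        # Drop everything before the first underscore, like sed 's/^[^_]*//'
        suffix=_${name#*_}
        mkdir -p -- "$dest/${suffix%.mp4}"
        mv -- "$path" "$dest/${suffix%.mp4}/"
    done
}
```

With the example directory, group_by_suffix . ./sorted would create ./sorted/_02_01_2021_002244 and ./sorted/_03_01_2021_092200 and leave otherfile1 and otherfile2 where they are.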
There are several ways to approach this, including writing a bash script, but if it were me, I'd take the quick and easy road. Use grep and read:
PATTERN=_02_01_2021_002244.mp4
find . -name '*.mp4' | grep -F -- "$PATTERN" | while IFS= read -r A; do echo "$A"; done
There are probably better ways that I haven't thought of but this gets the job done.
Try this:
#!/bin/bash
while IFS= read -r line
do
if [[ "$line" == *_+([0-9])_+([0-9])_+([0-9])_+([0-9])\.mp4 ]]
then
echo "MATCH: $line"
else
echo "no match: $line"
fi
done < <(/bin/ls -c1)
Remember that it uses globbing, not regex, when you build your pattern.
That is why I did not use [0-9]{2} to match 2 digits: {} does not do that in globbing like it does in regex.
To use regex, use:
#!/bin/bash
while IFS= read -r line
do
if [[ $(echo "$line" | grep -cE '_[0-9]{2}_[0-9]{2}_[0-9]{4}_[0-9]{6}\.mp4$') -ne 0 ]]
then
echo "MATCH: $line"
else
echo "no match: $line"
fi
done < <(/bin/ls -c1)
This is a more precise match since you can specify how many digits to accept in each sub-pattern.
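Bash can also test a regex directly with the =~ operator, which avoids one grep process per line and avoids reading the output of ls. A minimal sketch, in which the names stamp_re, has_stamp and report_matches are assumptions made for illustration:

```shell
#!/usr/bin/env bash
# Regex kept in a variable so that bash treats it as a regex, not a literal.
stamp_re='_[0-9]{2}_[0-9]{2}_[0-9]{4}_[0-9]{6}\.mp4$'

# Hypothetical helper: succeed when NAME ends in _DD_DD_DDDD_DDDDDD.mp4
has_stamp() {
    [[ $1 =~ $stamp_re ]]
}

# Print MATCH / no match for each entry of DIR, using a glob instead of ls.
report_matches() {
    local dir=$1 path
    for path in "$dir"/*; do
        [[ -e $path ]] || continue          # the glob matched nothing
        if has_stamp "${path##*/}"; then
            echo "MATCH: ${path##*/}"
        else
            echo "no match: ${path##*/}"
        fi
    done
}
```

Running report_matches . prints one line per directory entry, in the same format as the scripts above.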

Bash new line feed in results [duplicate]

Trying to create a mysql backup script.
However, I am finding that I am getting line feeds in the results:
#!/bin/bash
cd /home
for i in $(find $PWD -type f -name "wp-config.php" );
do echo "'$i'";
done
And the results show:
'/home/site1/public_html/folders/wp-config.php'
'/home/site2/public_html/New'
'Website/wp-config.php'
'/home/site3/public_html/wp-config.php'
'/home/site4/public_html/old'
'website/wp-config.php'
'/home/site5/public_html/wp-config.php'
Running ls from the command line shows, for the folders in question:
New\ website
old\ website
and it seems to be treating the '\' as a newline character.
OK.. Doing some research:
https://stackoverflow.com/a/5928254/175063
${foo/ /.}
Updating for what we may want:
${i/\ /}
The code now becomes:
#!/bin/bash
cd /home
for i in $(find $PWD -type f -name "wp-config.php" |${i/\ /});
do echo "'$i'";
done
Ref. https://tomjn.com/2014/03/01/wordpress-bash-magic/
Ultimately, I really want something like this:
#!/bin/bash
# delete files older than 7 days
## find /home/dummmyacount/backups/ -type f -name '*.7z' -mtime +7 -exec rm {} \;
# set a date variable
DT=$(date +"%m-%d-%Y")
cd /home
for i in $(find $PWD -type f -name "wp-config.php" );
do
WPDBNAME=`cat $i | grep DB_NAME | cut -d \' -f 4`
WPDBUSER=`cat $i | grep DB_USER | cut -d \' -f 4`
WPDBPASS=`cat $i | grep DB_PASSWORD | cut -d \' -f 4`
echo "$i";
#echo $File;
#mysqldump...
done
You can do this:
find . -type f -name "wp-config.php" -print0 | while IFS= read -r -d '' f
do
    printf '[%s]\n' "$f"
done
This uses the NUL character as the delimiter, so a path containing spaces or other special characters is read as a single item.
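The NUL-delimited loop can be combined with the value extraction from the question. The following is a sketch only: the function names wp_define and list_wp_databases are hypothetical, and it assumes each setting is written on one line with single quotes, as in define('DB_NAME', 'value');.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of define('KEY', 'value') found in FILE.
wp_define() {
    local file=$1 key=$2
    # Same idea as grep | cut -d \' -f 4: the value is the 4th quote-delimited field.
    awk -F"'" -v key="$key" '$2 == key { print $4; exit }' "$file"
}

# Print "path<TAB>database name" for every wp-config.php below ROOT.
list_wp_databases() {
    local root=$1 cfg
    while IFS= read -r -d '' cfg; do
        printf '%s\t%s\n' "$cfg" "$(wp_define "$cfg" DB_NAME)"
    done < <(find "$root" -type f -name 'wp-config.php' -print0)
}
```

Because the loop reads from process substitution rather than a pipe, variables set inside it would still be visible after the loop, which is convenient when the values feed a later mysqldump call.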

bash loop in parallel

I am trying to run this script in parallel, with at most 4 iterations of the i loop running at a time. runspr.py is itself parallel, and that's fine. What I want is for only 4 iterations of the loop to be running at any instant.
My present code launches everything at once.
#!/bin/bash
for i in *
do
    if [[ -d $i ]]; then
        echo "$i is dir"
        cd "$i"
        python3 ~/bin/runspr.py SCF &
        cd ..
    else
        echo "$i not dir"
    fi
done
I have followed https://www.biostars.org/p/63816/ and https://unix.stackexchange.com/questions/35416/four-tasks-in-parallel-how-do-i-do-that
but I am unable to implement the code in parallel.
You don't need a for loop. You can use GNU parallel with find like this:
find . -mindepth 1 -maxdepth 1 -type d -print0 |
parallel -0 --jobs 4 'cd {} && python3 ~/bin/runspr.py SCF'
Another possible solution is:
find . -mindepth 1 -maxdepth 1 -type d -print0 |
xargs -0 -I {} -P 4 sh -c 'cd "$1" && python3 ~/bin/runspr.py SCF' _ {}
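If neither GNU parallel nor xargs -P is available, the same limit can be sketched in plain bash with wait -n, which returns as soon as any one background job finishes. It requires bash 4.3 or newer. The names run_limited and scf_job are assumptions made for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: call CMD once per remaining argument, at most MAX at a time.
run_limited() {
    local max=$1 cmd=$2 item running=0
    shift 2
    for item in "$@"; do
        if (( running >= max )); then
            wait -n                         # block until any one job ends
            running=$(( running - 1 ))
        fi
        "$cmd" "$item" &
        running=$(( running + 1 ))
    done
    wait                                    # let the last batch finish
}

# Worker matching the question: run the SCF job inside one directory.
scf_job() {
    ( cd "$1" && python3 ~/bin/runspr.py SCF )
}

# Usage (not executed here): run_limited 4 scf_job */
```

The subshell in scf_job keeps the cd local to each job, so no cd .. is needed.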

Shell win32 delete oldest directories (recursive)

I need to delete the oldest folders (including their contents) from a certain path. E.g. if there are more than 10 directories, delete the oldest ones until you are below 8 directories. The log would show count of directories before/after + the filesystem before/after and what dirs were deleted.
Thank you in advance!
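You should test this first on a backup directory:
#!/bin/bash
DIRCOUNT="$(find . -mindepth 1 -maxdepth 1 -type d -printf x | wc -c)"
if [ "$DIRCOUNT" -gt 10 ]; then
    # ls -t lists newest first, so everything from line 9 on is older than the 8 newest
    ls -A1td */ | tail -n +9 | xargs rm -r
fi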
If I have not misunderstood your intentions, this is what you want:
#! /usr/bin/env bash
DIRCOUNT="$(find . -mindepth 1 -maxdepth 1 -type d -printf x | wc -c)"
echo "Now you have $DIRCOUNT dirs"
[[ "$DIRCOUNT" -gt 10 ]] && ls -A1td */ | tail -n $((DIRCOUNT-8)) | xargs rm -r && echo "Now you have 8 dirs"
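For the logging part of the question, it can help to separate "which directories would go" from the deletion itself, so the list can be reviewed or written to a log first. A minimal sketch, assuming directory names without newlines; dirs_to_prune and prune_old_dirs are hypothetical names, and age is judged by modification time as in the ls -t commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the subdirectories of DIR that fall outside the
# KEEP most recently modified ones (newest of the doomed first, oldest last).
dirs_to_prune() {
    local dir=$1 keep=$2
    # ls -t sorts newest first; tail -n +N starts printing at line N.
    ( cd "$dir" && ls -1td -- */ 2>/dev/null | tail -n +"$(( keep + 1 ))" )
}

# Delete what dirs_to_prune reports, logging each removal.
prune_old_dirs() {
    local dir=$1 keep=$2 name
    while IFS= read -r name; do
        echo "removing $dir/$name"
        rm -r -- "${dir:?}/$name"           # :? aborts if dir is empty
    done < <(dirs_to_prune "$dir" "$keep")
}
```

Running dirs_to_prune . 8 alone acts as a dry run; prune_old_dirs . 8 then removes everything except the 8 newest directories.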

Resources