Bash script to automatically create symlinks from a file list

I have a file, list.txt, containing a list of files:
file_1.txt
file_2.txt
file_3.txt
file_4.txt
.....
.....
.....
file_50.txt
I need to create a symlink for each file.
Example:
file_1.txt > newfile_1.txt
file_2.txt > newfile_2.txt
file_3.txt > newfile_3.txt
file_4.txt > newfile_4.txt
.....
.....
.....
file_50.txt > newfile_50.txt
I tried this:
cat list.txt | egrep -v '^#|^[[:space:]]*$' | xargs ln -sf
but it doesn't work.

If all you need is a constant filename prefix and the list contains only basenames without directories, the following shell loop should work:
while IFS= read -r f; do
    ln -sf "$f" "new$f"
done < list.txt
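The xargs attempt failed because each filename was passed to ln only as the link target, so no `new` prefix was ever applied. With xargs' `-I` option you can substitute each line into both arguments instead; a sketch in a hypothetical scratch directory:

```shell
# Demo setup in a scratch directory (hypothetical path)
rm -rf /tmp/symlink-demo && mkdir -p /tmp/symlink-demo && cd /tmp/symlink-demo
touch file_1.txt file_2.txt
printf '%s\n' file_1.txt file_2.txt > list.txt

# -I{} runs one ln per line, substituting it into both the target and link name
grep -Ev '^#|^[[:space:]]*$' list.txt | xargs -I{} ln -sf {} new{}
```

Note that `-I` implies one `ln` invocation per file, so it is slower than the batched form for very large lists, but it is the only way to build two arguments per line with plain xargs.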

Related

Create import file for a dataset with single-label images in Google Cloud Vertex AI

I have a bucket in GCS with the following hierarchy:
dataset/class1/image1.png
dataset/class1/image2.png
..
dataset/class2/image1.png
dataset/class2/image2.png
..
dataset/class3/image1.png
dataset/class3/image2.png
..
So, all the examples for the same class are in the same folder.
I would like to create an import file that, for each image, adds a line with its URI and class. It would look like this:
gs://dataset/class1/image1.png, class1
gs://dataset/class1/image2.png, class1
..
gs://dataset/class2/image1.png, class2
gs://dataset/class2/image2.png, class2
..
gs://dataset/class3/image1.png, class3
gs://dataset/class3/image2.png, class3
..
I tried this, but it doesn't work:
export BUCKET=<bucket name>
export IMPORT_DATA=<import file>
gsutil ls -r gs://$BUCKET/** > $IMPORT_DATA
sed -i '1d' $IMPORT_DATA
sed -e 's/$/$(basename $)/' -i filename
I might have found one way to do it.
export BUCKET=<bucket name>
export IMPORT_DATA=<import file>
gsutil ls -r gs://$BUCKET/** > tmp.csv
sed -i '1d' tmp.csv # the first line is not a file
while IFS= read -r line ; do echo "$line, $(basename "$(dirname "$line")")" ; done < tmp.csv > "$IMPORT_DATA"
wc -l $IMPORT_DATA
rm tmp.csv
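The per-line `basename`/`dirname` subshells are slow on large listings. Since the class label is simply the second-to-last path component, a single awk pass does the same job; a sketch, assuming the listing (as after the `sed '1d'`) contains only object URIs, simulated here with sample data instead of a real `gsutil ls -r` run:

```shell
# Simulated listing (in practice: gsutil ls -r "gs://$BUCKET/**" > /tmp/tmp.csv)
cat > /tmp/tmp.csv <<'EOF'
gs://dataset/class1/image1.png
gs://dataset/class1/image2.png
gs://dataset/class2/image1.png
EOF

# One pass: with / as the field separator, the class is the second-to-last field
awk -F/ 'NF > 1 { print $0 ", " $(NF-1) }' /tmp/tmp.csv > /tmp/import.csv
cat /tmp/import.csv
```

This also produces the `URI, class` format from the question exactly, without the stray space before the comma that the `echo $line ','` version emits.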

Is there a way to add a suffix to files where the suffix comes from a list in a text file?

My searches so far have only turned up single-word renaming solutions, where the (static) suffix is defined within the code. I need to rename based on a list in a text file, so:
I have a list of files in /home/linux/test/ :
1000.ext
1001.ext
1002.ext
1003.ext
1004.ext
Then I have a txt file (labels.txt) containing the labels I want to use:
Alpha
Beta
Charlie
Delta
Echo
I want to rename the files to look like (example1):
1000 - Alpha.ext
1001 - Beta.ext
1002 - Charlie.ext
1003 - Delta.ext
1004 - Echo.ext
How would you write a script that renames all the files in /home/linux/test/ to the list in example1?
Loop through the two lists in parallel: iterate over the files with a glob while reading the labels from labels.txt on the loop's standard input. Split each filename into the prefix and extension, then combine everything to make the new filename.
dir=/home/linux/test
for file in "$dir"/*.ext
do
read -r label
prefix=${file%.*} # remove everything from last .
ext=${file##*.} # remove everything before last .
mv "$file" "$prefix - $label.$ext"
done < labels.txt
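An alternative pairs the two lists explicitly with paste (a sketch; it assumes equal numbers of files and labels and no tabs or newlines in the filenames):

```shell
# Demo setup (the question uses /home/linux/test; a scratch dir here)
dir=/tmp/paste-demo
rm -rf "$dir" && mkdir -p "$dir"
touch "$dir"/1000.ext "$dir"/1001.ext "$dir"/1002.ext
printf '%s\n' Alpha Beta Charlie > "$dir"/labels.txt

# paste joins each filename with its label, tab-separated;
# the loop splits on that tab and renames
ls "$dir"/*.ext | paste - "$dir"/labels.txt |
while IFS=$'\t' read -r file label; do
    mv "$file" "${file%.*} - $label.${file##*.}"
done
```

The glob expands in sorted order, so files and labels line up as long as both lists are in the intended order.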
I originally partly got the request wrong, although this step is still useful, because it gives you the filenames you need.
#!/bin/sh
count=1000
cp labels.txt stack
cat > ed1 <<EOF
1p
q
EOF
cat > ed2 <<EOF
1d
wq
EOF
next () {
[ -s stack ] && main
}
main () {
line="$(ed -s stack < ed1)"
echo "${count} - ${line}.ext" >> newfile
ed -s stack < ed2
count=$(($count+1))
next
}
next
Now we just need to move the files:-
cp newfile stack
for i in *.ext
do
newname="$(ed -s stack < ed1)"
mv -v "${i}" "${newname}"
ed -s stack < ed2
done
rm -v ./ed1
rm -v ./ed2
rm -v ./stack
rm -v ./newfile
On the possibility that you don't have exactly the same number of files as labels, I set it up to cycle a couple of arrays in pseudo-parallel.
$: cat script
#!/bin/env bash
lst=( *.ext ) # array of files to rename
mapfile -t labels < labels.txt # array of labels to attach
for ndx in ${!lst[@]} # for each filename's numeric index
do # assign the new name
new="${lst[ndx]/.ext/ - ${labels[ndx%${#labels[@]}]}.ext}"
# show the command to rename the file
echo "mv \"${lst[ndx]}\" \"$new\""
done
$: ls -1 *ext # I added an extra file
1000.ext
1001.ext
1002.ext
1003.ext
1004.ext
1005.ext
$: ./script # loops back if more files than labels
mv "1000.ext" "1000 - Alpha.ext"
mv "1001.ext" "1001 - Beta.ext"
mv "1002.ext" "1002 - Charlie.ext"
mv "1003.ext" "1003 - Delta.ext"
mv "1004.ext" "1004 - Echo.ext"
mv "1005.ext" "1005 - Alpha.ext"
$: ./script > do # use ./script to write ./do
$: ./do # use ./do to change the names
$: ls -1
'1000 - Alpha.ext'
'1001 - Beta.ext'
'1002 - Charlie.ext'
'1003 - Delta.ext'
'1004 - Echo.ext'
'1005 - Alpha.ext'
do
labels.txt
script
You can just remove the echo to have ./script rename the files there.
I renamed labels to labels.txt to match your example.
If you aren't using bash, this will need a call to something like sed or awk. Here's a short awk-based script that does the same (note that gensub requires GNU awk):
$: cat script2
#!/bin/env sh
printf "%s\n" *.ext > files.txt
awk 'NR==FNR{label[i++]=$0}
NR>FNR{ if (! label[i] ) { i=0 } cmd="mv \""$0"\" \""gensub(/[.]ext/, " - "label[i++]".ext", 1)"\"";
print cmd;
# system(cmd);
}' labels.txt files.txt
Uncomment the system line to make it actually do the renames as well.
It does assume your filenames don't have embedded newlines. Let us know if that's a problem.

find only the first file from many directories

I have a lot of directories:
13R
613
AB1
ACT
AMB
ANI
Each directory contains a lot of files:
20140828.13R.file.csv.gz
20140829.13R.file.csv.gz
20140830.13R.file.csv.gz
20140831.13R.file.csv.gz
20140901.13R.file.csv.gz
20131114.613.file.csv.gz
20131115.613.file.csv.gz
20131116.613.file.csv.gz
20131117.613.file.csv.gz
20141114.ab1.file.csv.gz
20141115.ab1.file.csv.gz
20141116.ab1.file.csv.gz
20141117.ab1.file.csv.gz
etc..
The goal is to get the first file from each directory.
The result I expect is:
13R|20140828
613|20131114
AB1|20141114
That is, the directory name, a pipe, and the date from the filename.
I guess I need find and head plus awk, but I can't make it work. I need your help.
Here is what I have tested:
for f in $(ls -1);do ls -1 $f/ | head -1;done
But the directory name is missing.
By "first file" I mean the first file returned in alphabetical order within the folder.
Thanks.
You can do this with a Bash loop.
Given:
/tmp/test
/tmp/test/dir_1
/tmp/test/dir_1/file_1
/tmp/test/dir_1/file_2
/tmp/test/dir_1/file_3
/tmp/test/dir_2
/tmp/test/dir_2/file_1
/tmp/test/dir_2/file_2
/tmp/test/dir_2/file_3
/tmp/test/dir_3
/tmp/test/dir_3/file_1
/tmp/test/dir_3/file_2
/tmp/test/dir_3/file_3
/tmp/test/file_1
/tmp/test/file_2
/tmp/test/file_3
Just loop through the directories and form an array from a glob and grab the first one:
prefix="/tmp/test"
cd "$prefix"
for fn in dir_*; do
cd "$prefix"/"$fn"
arr=(*)
echo "$fn|${arr[0]}"
done
Prints:
dir_1|file_1
dir_2|file_1
dir_3|file_1
If your definition of 'first' differs from Bash's, just sort the array arr according to your definition before taking the first element.
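The loop above prints the whole first filename; the question asks for just the leading date. Since the filenames begin with `YYYYMMDD.`, stripping everything from the first dot onward gives the requested `dir|date` form. A sketch assuming that naming pattern, with a demo tree in a hypothetical scratch path:

```shell
# Demo tree matching the question's layout (hypothetical scratch path)
prefix=/tmp/first-file-demo
rm -rf "$prefix"; mkdir -p "$prefix"/13R "$prefix"/613
touch "$prefix"/13R/20140828.13R.file.csv.gz "$prefix"/13R/20140829.13R.file.csv.gz
touch "$prefix"/613/20131114.613.file.csv.gz "$prefix"/613/20131115.613.file.csv.gz

cd "$prefix"
for d in */; do
    arr=( "$d"* )                 # glob expands in sorted (alphabetical) order
    first=${arr[0]##*/}           # basename of the first file
    echo "${d%/}|${first%%.*}"    # dir|date: drop the trailing / and everything from the first dot
done
```

For the demo tree this prints `13R|20140828` and `613|20131114`, matching the expected output.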
You can also do this with find and awk:
$ find /tmp/test -mindepth 2 -print0 | awk -v RS="\0" '{s=$0; sub(/[^/]+$/,"",s); if (s in paths) next; paths[s]; print $0}'
/tmp/test/dir_1/file_1
/tmp/test/dir_2/file_1
/tmp/test/dir_3/file_1
And insert a sort (or use gawk) to sort as desired
sort has a unique option (-u). Only the directory part should be unique, so sort on the first field: -k1,1. This works when the list of files is already sorted.
printf "%s\n" */* | sort -k1,1 -t/ -u | sed 's#\(.*\)/\([0-9]*\).*#\1|\2#'
You will need to change the sed command if the date field can be followed by another number.
This works for me:
for dir in $(find "$FOLDER" -type d); do
    FILE=$(ls -1 -p "$dir" | grep -v / | head -n1)
    if [ -n "$FILE" ]; then
        echo "$dir/$FILE"
    fi
done

Remove all occurrences of lines in file B from file A

I have two files: A and B.
Contents of A:
http://example.com/1
http://example.com/2
http://example.com/3
http://example.com/4
http://example.com/5
http://example.com/6
http://example.com/7
http://example.com/8
http://example.com/9
http://example.com/4
Contents from file B:
http://example.com/1
http://example.com/3
http://example.com/9
http://example.com/4
Now, I would like to remove all occurrences of the lines in file B from file A.
I have tried the following:
for LINK in $(sort -u B);do sed -i -e 's/"$LINK"//g' A; echo "Removed $LINK";done
But it didn't do anything at all.
grep will be simpler for this (your sed attempt fails because the single quotes keep $LINK from expanding): -v inverts the match, -x matches whole lines only, -F treats the patterns as fixed strings, and -f reads them from a file:
grep -vxFf file2 file1
http://example.com/2
http://example.com/5
http://example.com/6
http://example.com/7
http://example.com/8
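An awk version does the same set difference and preserves A's order without needing sorted input; the `NR==FNR` idiom loads the first file (B) into a lookup table, then prints only the lines of A that are absent from it:

```shell
# Sample inputs matching the question (scratch files)
printf '%s\n' http://example.com/1 http://example.com/2 http://example.com/4 > /tmp/A
printf '%s\n' http://example.com/1 http://example.com/4 > /tmp/B

# First pass (B): record each line as a key. Second pass (A): print lines not recorded.
awk 'NR==FNR { skip[$0]; next } !($0 in skip)' /tmp/B /tmp/A
```

For these samples it prints only `http://example.com/2`. Unlike grep -F with a huge pattern file, this stays a single hash lookup per line.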

Concatenate files with prefix

Assume a folder with files named like this:
FOO.1
FOO.2
...
BAR-1.1
BAR-1.2
...
BAR-2.1
BAR-2.2
...
And I would like to concatenate them such that it results in 3 files:
FOO (consisting of FOO.1 + FOO.2 + FOO.N)
BAR-1 (consisting of BAR-1.1 + BAR-1.2 + BAR-1.N)
BAR-2 (consisting of BAR-2.1 + BAR-2.2 + BAR-2.N)
How would this be done in bash/shell script? Assume all the files are in one folder (no need to go into subfolders)
This requires no advance knowledge of the filename prefixes:
for file in *.*
do
prefix="${file%.*}"
echo "Adding $file to $prefix ..."
cat "$file" >> "$prefix"
done
Alternatively, deriving the prefixes from the filenames first (note the sed assumes a single-character suffix after the dot):
for i in $(ls | sed 's/\(.*\)\..$/\1/' | sort -u)
do
    cat "$i".* > "$i"
done
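A quick sanity check of the first loop in a scratch directory. The `${file%.*}` expansion drops only the final `.N` segment, so `BAR-1.1` and `BAR-1.2` both land in `BAR-1`; the output files contain no dot, so they never match `*.*` themselves, but re-running the loop would append everything a second time:

```shell
# Demo in a scratch directory
rm -rf /tmp/concat-demo && mkdir -p /tmp/concat-demo && cd /tmp/concat-demo
printf 'foo one\n' > FOO.1
printf 'foo two\n' > FOO.2
printf 'bar one\n' > BAR-1.1
printf 'bar two\n' > BAR-1.2

for file in *.*; do
    prefix="${file%.*}"    # FOO.1 -> FOO, BAR-1.2 -> BAR-1
    cat "$file" >> "$prefix"
done
```

Afterwards FOO contains the contents of FOO.1 and FOO.2 in glob (alphabetical) order, and BAR-1 likewise.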
