I need to add a consecutive index to a searchable string inside a file - bash

So I have a file called, say, 'test', with the contents:
39366
39371
45005
45005
216274
216277
216345
396480
396480
396480
396480
I need to append an index to every line. The index should be the running count of that string within the file so far. It should look like this:
39366_1
39371_1
45005_1
45005_2
216274_1
216277_1
216345_1
396480_1
396480_2
396480_3
396480_4
Then I should repeat the process in another file, say 'test2' in a more complicated format, something like this:
+39366
ffasd
+39371
fasdasd
+45005
fasdfdf
+45005
asdfasdf
My first question here, I've never asked for anything, please help :)

$ cat script.sh
#!/bin/bash
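FILE=$1
n=$(wc -l < "$FILE")
for (( i=1; i<=n; i++ ))
do
# fetch line number i
LINE=$(sed "${i}q;d" "$FILE")
if [ -n "$LINE" ]
then
# count this string in lines 1..i of the file being edited, not a hardcoded 'test'
# (earlier lines already carry a _N suffix; grep matches substrings, so 39366 also counts 139366)
INDEX=$(head -n "$i" "$FILE" | grep -c "$LINE")
LINE="${LINE}_${INDEX}"
sed -i "${i}s/.*/$LINE/" "$FILE"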
fi
done
$ cat test
39366
39371
45005
45005
216274
216277
216345
396480
396480
396480
396480
$ ./script.sh test
$ cat test
39366_1
39371_1
45005_1
45005_2
216274_1
216277_1
216345_1
396480_1
396480_2
396480_3
396480_4
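A single-pass alternative, as a sketch rather than a drop-in replacement: awk can keep a running counter per string in an associative array, which avoids re-reading the file for every line and matches whole lines instead of substrings. The function names add_index and add_index_plus are hypothetical, and the second one assumes that only the lines starting with + in 'test2' should be numbered.

```shell
#!/bin/bash
# Append _N to every line, where N is how often that exact line has been seen so far.
add_index() {
    awk '{ print $0 "_" (++seen[$0]) }' "$1"
}

# Same idea for the 'test2' layout: only lines starting with "+" get an index,
# all other lines are printed unchanged.
add_index_plus() {
    awk '/^[+]/ { $0 = $0 "_" (++seen[$0]) } { print }' "$1"
}
```

Both functions print to standard output. To update the file itself, write to a temporary file and move it over the original, for example add_index test > test.new && mv test.new test.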

Related

Shell: Add string to the end of each line, which match the pattern. Filenames are given in another file

I'm still new to the shell and need some help.
I have a file stapel_old.
Also I have in the same directory files like english_old_sync, math_old_sync and vocabulary_old_sync.
The content of stapel_old is:
english
math
vocabulary
The content of e.g. english is:
basic_grammar.md
spelling.md
orthography.md
I want to manipulate all files which are given in stapel_old like in this example:
take the first line of stapel_old 'english', (after that math, and so on)
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
search in english_old_sync line by line for the pattern '.md'
And append to each line after .md :::#a1
The result should be e.g. of english_old_sync:
basic_grammar.md:::#a1
spelling.md:::#a1
orthography.md:::#a1
of math_old_sync:
geometry.md:::#a1
fractions.md:::#a1
and so on. stapel_old should stay unchanged.
How can I realize that?
I tried with sed -n, while loop (while read -r line), and I'm feeling it's somehow the right way - but I still get errors and not the expected result after 4 hours inspecting and reading.
Thank you!
EDIT
Here is the working code (The files are stored in folder 'olddata'):
clear
echo -e "$(tput setaf 1)$(tput setab 7)Learning directories:$(tput sgr 0)\n"
# put here directories which should not become flashcards, command: | grep -v 'name_of_directory_which_not_to_learn1' | grep -v 'directory2'
ls ../ | grep -v 00_gliederungsverweise | grep -v 0_weiter | grep -v bibliothek | grep -v notizen | grep -v Obsidian | grep -v z_nicht_uni | tee olddata/stapel_old
# count folders
echo -ne "\nHow much different folders: " && wc -l olddata/stapel_old | cut -d' ' -f1 | tee -a olddata/stapel_old
echo -e "Are this learning directories correct? [j ODER y]--> yes; [Other]-->no\n"
read lernvz_korrekt
if [ "$lernvz_korrekt" = j ] || [ "$lernvz_korrekt" = y ];
then
read -n 1 -s -r -p "Learning directories correct. Press any key to continue..."
else
read -n 1 -s -r -p "Learning directories not correct, please change in line 4. Press any key to continue..."
exit
fi
echo -e "\n_____________________________\n$(tput setaf 6)$(tput setab 5)Found cards:$(tput sgr 0)$(tput setaf 6)\n"
#GET && WRITE FOLDER NAMES into olddata/stapel_old
anzahl_zeilen=$(cat olddata/stapel_old |& tail -1)
#GET NAMES of .md files of every stapel and write All to 'stapelname'_old_sync
i=0
name="var_$i"
for (( num=1; num <= $anzahl_zeilen; num++ ))
do
i="$((i + 1))"
name="var_$i"
name=$(cat olddata/stapel_old | sed -n "$num"p)
find ../$name/ -name '*.md' | grep -v trash | grep -v Obsidian | rev | cut -d'/' -f1 | rev | tee olddata/$name"_old_sync"
done
(tput sgr 0)
I tried to add:
input="olddata/stapel_old"
while IFS= read -r line
do
sed -n "$line"p olddata/stapel_old
done < "$input"
The code to change only the english_old_sync is:
lines=$(wc -l olddata/english_old_sync | cut -d' ' -f1)
for ((num=1; num <= $lines; num++))
do
content=$(sed -n "$num"p olddata/english_old_sync)
sed -i "s/"$content"/""$content":::#a1/g"" olddata/english_old_sync
done
So now this needs to be an inner for loop inside an outer for loop that holds the variable for english, right?
stapel_old should stay unchanged.
You could try a while + read loop and embed sed inside the loop.
#!/usr/bin/env bash
while IFS= read -r files; do
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
cp copies the file under a new name. If the goal is to rename the original files listed in stapel_old instead, change cp to mv.
The -n and -i flags of sed were omitted; add them if needed (without -i, sed prints the result instead of editing the file).
The script also assumes that stapel_old contains no empty or blank lines. If it might, add a test right after the do line:
[[ -n $files ]] || continue
It also assumes that the entries in stapel_old are existing files. Just in case, add another test:
[[ -e $files ]] || { printf >&2 '%s no such file or directory.\n' "$files"; continue; }
Or an if statement.
if [[ ! -e $files ]]; then
printf >&2 '%s no such file or directory\n' "$files"
continue
fi
See also help test
See also help continue
Combining them all together should be something like:
#!/usr/bin/env bash
while IFS= read -r files; do
[[ -n $files ]] || continue
[[ -e $files ]] || {
printf >&2 '%s no such file or directory.\n' "$files"
continue
}
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
Once you are satisfied with the output, remove the echos so that the script actually copies (or renames) and edits the files.
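If the _old_sync files already exist next to stapel_old, as the question describes, the copy step can be skipped and each file edited in place. A minimal sketch under that assumption: the function name append_marker is hypothetical, and sed -i without a suffix argument assumes GNU sed.

```shell
#!/usr/bin/env bash
# Append :::#a1 to every line ending in .md, in each <name>_old_sync file
# named in the given list file. The list file itself is only read, never written.
append_marker() {
    local list=$1 dir name target
    dir=$(dirname -- "$list")
    while IFS= read -r name; do
        [[ -n $name ]] || continue              # skip blank lines
        target="$dir/${name}_old_sync"
        [[ -f $target ]] || { printf >&2 '%s: no such file\n' "$target"; continue; }
        sed -i '/\.md$/s/$/:::#a1/' "$target"   # GNU sed: edit in place
    done < "$list"
}

# Example call, assuming the layout from the question:
# append_marker olddata/stapel_old
```

Running it twice is harmless, because a line that already ends in :::#a1 no longer matches \.md$. The folder count that the question's script appends to stapel_old has no matching _old_sync file, so it is reported on stderr and skipped.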

Issues with grep and get a count of a string in a loop

I have a set of search strings in a file (File1) and a content file (File2). I am trying to loop through all the search strings in File1, count how often each one occurs in File2, and output the result. I want to automate this and make it generic so that I can search through multiple content files. However, I don't seem to get the right count when I run this loop: I get a count of "0" for every string, although the strings are present in the file. I can't figure out what I am doing wrong and could use some help!
Below is the script I came up with:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
count=$(echo cat "$2" | grep -c "$line")
echo "$count - $line"
done < "$1"
Command I am using to run this script:
./scanscript.sh File1.log File2.log
I say this because running the command on its own gives the right value. It works by itself, but I want to put it in a loop:
cat File2.log | grep -c "Search String"
Sample Data for File 1 (Search Strings):
/SERVER_NAME/Root/DEV/Database/NJ-CONTENT/Procs/
/SERVER_NAME3/Root/DEV/Database/NJ-CONTENT/Procs/
Sample Data for File 2 (Content File):
./SERVER_NAME/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:29:
./SERVER_NAME2/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:100:
./SERVER_NAME3/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:143:
./SERVER_NAME4/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:223:
./SERVER_NAME5/Root/DEV/Database/NJ-CONTENT/Procs/test.test_proc.sql:5589:
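The problem is this line. echo prints the literal text cat File2.log instead of running cat, so grep only ever sees that one line and never the file's contents: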
count=$(echo cat "$2" | grep -c "$line")
That should be changed to:
count=$(grep -Fc "$line" "$2")
Also note that -F makes grep search for a fixed string instead of a regular expression.
Full code:
while IFS='' read -r line || [[ -n "$line" ]]; do
count=$(grep -Fc "$line" "$2");
echo "$count - $line";
done < "$1"
Run it as:
./scanscript.sh File1.log File2.log
Output:
1 - /SERVER_NAME/Root/DEV/Database/NJ-CONTENT/Procs/
1 - /SERVER_NAME3/Root/DEV/Database/NJ-CONTENT/Procs/
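The same loop can be wrapped in a function so it is easy to reuse for several content files. This is only a sketch: the name count_matches is hypothetical, and skipping empty search strings is an added assumption (an empty pattern would match every line).

```shell
#!/bin/bash
# Print "<count> - <pattern>" for every line of the pattern file.
# grep -F treats the pattern as a literal string; -c counts matching lines.
count_matches() {
    local patterns=$1 content=$2 line
    while IFS= read -r line || [[ -n $line ]]; do
        [[ -n $line ]] || continue   # an empty pattern would match every line
        printf '%s - %s\n' "$(grep -Fc -- "$line" "$content")" "$line"
    done < "$patterns"
}

# The original bug in isolation: echo prints its arguments, it does not run cat,
# so grep only sees the single line "cat File2.log".
# echo cat File2.log | grep -c "SERVER_NAME"   # prints 0
```

Note that grep -c counts matching lines, not individual occurrences, so a search string that appears twice on one line still counts once.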

Occurrence of a string in all the file names within a folder in Bash

I am trying to write a script that lets me select files with 2 or more occurrences of a string in their name.
Example:
test.txt // 1 occurrence only, not ok
test-test.txt // 2 occurrences OK
test.test.txt // 2 occurrences OK
I want the script to return only files 2 and 3. I tried the following, but it didn't work:
rows=$(ls | grep "test")
for file in $rows
do
if [[ $(wc -w $file) == 2 ]]; then
echo "the file $file matches"
fi
done
grep and wc are overkill. A simple glob will suffice:
*test*test*
You can use this like so:
ls *test*test*
or
for file in *test*test*; do
echo "$file"
done
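You can also count the matches in the name:
result=$(grep -o "test" <<< "$file" | wc -l)
grep -o prints each match on its own line, and wc -l counts those lines.
In the shell script, do the work only if $result is greater than 1.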
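The counting approach could look like the sketch below. It is one possible way to wire it up, not the answerer's exact code: the function names count_in_name and files_with_two are hypothetical, and the search string is treated as a literal (grep -F).

```shell
#!/bin/bash
# Count non-overlapping occurrences of a literal string inside a file name.
# grep -o prints each match on its own line; wc -l counts those lines.
count_in_name() {
    local name=$1 needle=$2
    printf '%s\n' "$name" | grep -oF -- "$needle" | wc -l
}

# List files in a directory whose names contain the string at least twice.
files_with_two() {
    local dir=$1 needle=$2 path name
    for path in "$dir"/*; do
        [[ -e $path ]] || continue          # the glob matched nothing
        name=${path##*/}                    # strip the directory part
        if (( $(count_in_name "$name" "$needle") >= 2 )); then
            printf '%s\n' "$name"
        fi
    done
}
```

For a fixed threshold of two, the glob *test*test* is shorter. The counting version is mainly useful when the threshold or the search string varies.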

Bash script to remove text from each line of a txt before a :

I have written this script to remove the text before the first : on each line:
#!/bin/bash
txt=test.txt
COUNT=$(cat $txt | wc -l)
while [ $COUNT -gt 0 ]; do
data=$(sed -n ${count}p $txt)
sed '$count \c
"${data#*:}"' $txt
let COUNT=COUNT-1
done
I think I have an issue with using variables in commands without spaces. Can anyone tell me what I have done wrong?
I think you are overcomplicating it. To do this you just need cut:
cut -d':' -f2- file
-d sets the field separator.
-f indicates what fields to use. By saying 2- we indicate "all from the 2nd one on".
Test
$ cat a
hello
hello:man i am here:or there
and:you are here
$ cut -d':' -f2- a
hello
man i am here:or there
you are here
Some comments regarding your code:
#!/bin/bash
txt=test.txt
COUNT=$(cat $txt | wc -l) # you can directly say 'wc -l < "$txt"'
while [ $COUNT -gt 0 ]; do
data=$(sed -n ${count}p $txt) # you are using "count", not "COUNT"
sed '$count \c # same here. And I don't know what
"${data#*:}"' $txt # this sed is supposed to work like
let COUNT=COUNT-1 # this works; quoting it as let "COUNT=COUNT-1" or using (( COUNT-- )) is safer
done
Also, it is good to indent the code, so that it shows like:
while ...
do
... things ...
done
All together, I would do:
#!/bin/bash
txt=a
count=$(wc -l < "$txt")
while (( count > 0 )); do
data=$(sed -n "${count}p" "$txt")
#sed '$COUNT \c "${data#*:}"' $txt # not using it
echo "${data#*:}"
(( count-- )) # decrement after use, so lines count..1 are printed and sed never sees line 0
done
Since you are reading the file from the bottom, you could drop the counter logic entirely and use tac to print the file in reverse:
while IFS= read -r data; do
echo "${data#*:}"
done < <(tac file)
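If the reverse order is not actually needed, the same parameter expansion works while reading the file top to bottom. A small sketch, with hypothetical function names strip_prefix and strip_prefix_sed, plus a sed equivalent for comparison:

```shell
#!/bin/bash
# Remove everything up to and including the first ":" on each line.
# ${line#*:} strips the shortest leading match of "*:"; lines without ":" stay unchanged.
strip_prefix() {
    local line
    while IFS= read -r line || [[ -n $line ]]; do
        printf '%s\n' "${line#*:}"
    done < "$1"
}

# Equivalent with sed: delete the leading run of non-colon characters and the colon.
strip_prefix_sed() {
    sed 's/^[^:]*://' "$1"
}
```

Using ## instead of # in the expansion would strip up to the last colon rather than the first.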

Concatenate strings in bash

I have in a bash script:
for i in `seq 1 10`
do
read AA BB CC <<< $(cat file1 | grep DATA)
echo ${i}
echo ${CC}
SORT=${CC}${i}
echo ${SORT}
done
so "i" is an integer, and CC is a string like "TODAY".
I would then like to get "TODAY1", etc. in SORT,
but I get "1ODAY", "2ODAY" and so on.
Where is the error?
Thanks
You should try
SORT="${CC}${i}"
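Make sure your file does not contain a carriage return ("\r") that ends up at the end of $CC.
A trailing "\r" moves the cursor back to the start of the line, so the digit overwrites the first letter. That would explain why you get "1ODAY".
Try adding
| tr -d '\r'
after the cat command.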
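The effect can be reproduced without any input file. The sketch below simulates a value that still carries a carriage return and strips it with a parameter expansion instead of tr; the variable names CC_clean and SORT_clean are made up for illustration.

```shell
#!/bin/bash
# Simulate a value read from a file with Windows (CRLF) line endings.
CC=$'TODAY\r'
i=1

# The carriage return moves the cursor to column 0, so a terminal shows "1ODAY"
# even though the variable really holds TODAY, a carriage return, then 1.
SORT="${CC}${i}"

# Strip one trailing carriage return, if present, before concatenating.
CC_clean=${CC%$'\r'}
SORT_clean="${CC_clean}${i}"
printf '%s\n' "$SORT_clean"    # prints TODAY1
```

Piping a value through od -c is a quick way to check whether it contains a hidden \r.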
try
for i in {1..10}
do
while read -r line
do
case "$line" in
*DATA* )
set -- $line
CC=$3
SORT=${CC}${i}
echo ${SORT}
esac
done <"file1"
done
Otherwise, show an example of file1 and your desired output
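ghostdog is right to use the -r option: it stops read from treating backslashes as escape characters. It does not remove a trailing carriage return from CRLF input, so that still has to be stripped separately. Reading into an array with -a avoids naming each field:
for i in `seq 1 10`
do
read -ra line <<< $(cat file1 | grep DATA)
CC="${line[2]}" # arrays are zero-based, so the third field is index 2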
echo ${i}
echo ${CC}
SORT=${CC}${i}
echo ${SORT}
done
