CVS branch name from tag name - bash

I have a number of modules in CVS with different tags. How would I go about getting the name of the branch these tagged files exist on? I've tried checking out a file from the module using cvs co -r TAG and then doing cvs log but it appears to give me a list of all of the branches that the file exists on, rather than just a single branch name.
Also this needs to be an automated process, so I can't use web based tools like viewvc to gather this info.

I have the following Korn functions that you might be able to adjust to run in bash. It should be apparent what it's doing.
Use get_ver() to determine the version number for a file path and given tag. Then pass the file path and version number to get_branch_name(). The get_branch_name() function relies on a few other helpers to fetch information and slice up the version numbers.
get_ver()
{
typeset FILE_PATH=$1
typeset TAG=$2
TEMPINFO=/tmp/cvsinfo$$
/usr/local/bin/cvs rlog -r$TAG $FILE_PATH 1>$TEMPINFO 2>/dev/null
VER_LINE=`grep "^revision" $TEMPINFO | awk '{print $2}'`
echo ${VER_LINE:-NONE}
rm -Rf $TEMPINFO 2>/dev/null 1>&2
}
get_branch_name()
{
typeset FILE=$1
typeset VER=$2
BRANCH_TYPE=`is_branch $VER`
if [[ $BRANCH_TYPE = "BRANCH" ]]
then
BRANCH_ID=`get_branch_id $VER`
BRANCH_NAME=`get_tags $FILE $BRANCH_ID`
echo $BRANCH_NAME
else
echo $BRANCH_TYPE
fi
}
get_minor_ver()
{
typeset VER=$1
END=`echo $VER | sed 's/.*\.\([0-9]*\)/\1/g'`
echo $END
}
get_major_ver()
{
typeset VER=$1
START=`echo $VER | sed 's/\(.*\.\)[0-9]*/\1/g'`
echo $START
}
is_branch()
{
typeset VER=$1
# We can work out if something is branched by looking at the version number.
# If it has only two parts (e.g. 1.123) then it's on the trunk
# If it has more parts (e.g. 1.2.2.4) then it's on a branch
# An odd number of parts means the version number is malformed
POINTS=`echo $VER | tr -dc "." | wc -c | awk '{print $1}'`
PARTS=$(($POINTS + 1))
if [[ $PARTS -eq 2 ]]
then
print "TRUNK"
elif [[ $(($PARTS % 2)) -eq 0 ]]
then
print "BRANCH"
else
print "ERROR"
fi
}
get_branch_id()
{
typeset VER=$1
MAJOR_VER=`get_major_ver $VER`
MAJOR_VER=${MAJOR_VER%.}
BRANCH_NUMBER=`get_minor_ver $MAJOR_VER`
BRANCH_POINT=`get_major_ver $MAJOR_VER`
echo ${BRANCH_POINT}0.${BRANCH_NUMBER}
}
get_tags()
{
typeset FILE_PATH=$1
typeset VER=$2
TEMP_TAGS_INFO=/tmp/cvsinfo$$
cvs rlog -r$VER $FILE_PATH 1>${TEMP_TAGS_INFO} 2>/dev/null
TEMPTAGS=`sed -n '/symbolic names:/,/keyword substitution:/p' ${TEMP_TAGS_INFO} | grep ": ${VER}$" | cut -d: -f1 | awk '{print $1}'`
TAGS=`echo $TEMPTAGS | tr ' ' '/'`
echo ${TAGS:-NONE}
rm -Rf $TEMP_TAGS_INFO 2>/dev/null 1>&2
}
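For bash, the version slicing done by get_major_ver, get_minor_ver and get_branch_id can probably be collapsed into plain parameter expansion. The following is only a sketch: branch_id_from_rev is a made-up name, not part of the functions above, and it assumes you have already confirmed the revision is on a branch (as is_branch does), because a trunk revision such as 1.5 would produce a meaningless result.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): turn a branch revision
# such as 1.2.2.4 into the "magic" branch number 1.2.0.2 that cvs rlog lists
# under "symbolic names:".
branch_id_from_rev() {
  local rev=$1
  local branch=${rev%.*}       # drop the last component: 1.2.2.4 -> 1.2.2
  local number=${branch##*.}   # branch number:           1.2.2   -> 2
  local point=${branch%.*}     # branch point revision:   1.2.2   -> 1.2
  printf '%s.0.%s\n' "$point" "$number"
}
```

The result can be passed to get_tags in place of the output of get_branch_id.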


Shell: Add string to the end of each line, which match the pattern. Filenames are given in another file

I'm still new to the shell and need some help.
I have a file stapel_old.
Also I have in the same directory files like english_old_sync, math_old_sync and vocabulary_old_sync.
The content of stapel_old is:
english
math
vocabulary
The content of e.g. english is:
basic_grammar.md
spelling.md
orthography.md
I want to manipulate all files which are given in stapel_old like in this example:
take the first line of stapel_old 'english', (after that math, and so on)
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
search in english_old_sync line by line for the pattern '.md'
And append to each line after .md :::#a1
The result should be e.g. of english_old_sync:
basic_grammar.md:::#a1
spelling.md:::#a1
orthography.md:::#a1
of math_old_sync:
geometry.md:::#a1
fractions.md:::#a1
and so on. stapel_old should stay unchanged.
How can I realize that?
I tried with sed -n, while loop (while read -r line), and I'm feeling it's somehow the right way - but I still get errors and not the expected result after 4 hours inspecting and reading.
Thank you!
EDIT
Here is the working code (The files are stored in folder 'olddata'):
clear
echo -e "$(tput setaf 1)$(tput setab 7)Learning directories:$(tput sgr 0)\n"
# put here directories which should not become flashcards, command: | grep -v 'name_of_directory_which_not_to_learn1' | grep -v 'directory2'
ls ../ | grep -v 00_gliederungsverweise | grep -v 0_weiter | grep -v bibliothek | grep -v notizen | grep -v Obsidian | grep -v z_nicht_uni | tee olddata/stapel_old
# count folders
echo -ne "\nHow much different folders: " && wc -l olddata/stapel_old | cut -d' ' -f1 | tee -a olddata/stapel_old
echo -e "Are this learning directories correct? [j ODER y]--> yes; [Other]-->no\n"
read lernvz_korrekt
if [ "$lernvz_korrekt" = j ] || [ "$lernvz_korrekt" = y ];
then
read -n 1 -s -r -p "Learning directories correct. Press any key to continue..."
else
read -n 1 -s -r -p "Learning directories not correct, please change in line 4. Press any key to continue..."
exit
fi
echo -e "\n_____________________________\n$(tput setaf 6)$(tput setab 5)Found cards:$(tput sgr 0)$(tput setaf 6)\n"
#GET && WRITE FOLDER NAMES into olddata/stapel_old
anzahl_zeilen=$(cat olddata/stapel_old |& tail -1)
#GET NAMES of .md files of every stapel and write All to 'stapelname'_old_sync
i=0
name="var_$i"
for (( num=1; num <= $anzahl_zeilen; num++ ))
do
i="$((i + 1))"
name="var_$i"
name=$(cat olddata/stapel_old | sed -n "$num"p)
find ../$name/ -name '*.md' | grep -v trash | grep -v Obsidian | rev | cut -d'/' -f1 | rev | tee olddata/$name"_old_sync"
done
(tput sgr 0)
I tried to add:
input="olddata/stapel_old"
while IFS= read -r line
do
sed -n "$line"p olddata/stapel_old
done < "$input"
The code to change only the english_old_sync is:
lines=$(wc -l olddata/english_old_sync | cut -d' ' -f1)
for ((num=1; num <= $lines; num++))
do
content=$(sed -n "$num"p olddata/english_old_sync)
sed -i "s/"$content"/""$content":::#a1/g"" olddata/english_old_sync
done
So this now needs to be an inner for-loop inside an outer for-loop that holds the variable for english, right?
stapel_old should stay unchanged.
You could try a while + read loop and embed sed inside the loop.
#!/usr/bin/env bash
while IFS= read -r files; do
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
cp copies the file under a new name. If the goal is instead to rename the files listed in stapel_old, change cp to mv.
The -n and -i flags of sed were omitted; add -i if the files should be edited in place.
The script also assumes that stapel_old contains no empty or blank lines. If it might, add a test right after the do line:
[[ -n $files ]] || continue
It also assumes that every name listed in stapel_old is an existing file. To be safe, add another test:
[[ -e $files ]] || { printf >&2 '%s no such file or directory.\n' "$files"; continue; }
Or an if statement.
if [[ ! -e $files ]]; then
printf >&2 '%s no such file or directory\n' "$files"
continue
fi
See also help test
See also help continue
Combining them all together should be something like:
#!/usr/bin/env bash
while IFS= read -r files; do
[[ -n $files ]] || continue
[[ -e $files ]] || {
printf >&2 '%s no such file or directory.\n' "$files"
continue
}
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
Once you're satisfied with the output, remove the echos so the script actually copies (or renames) and edits the files.
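If the *_old_sync files already exist (as in the EDIT of the question) and only the suffix needs appending, something along these lines might be enough. It is a sketch under assumptions: append_marker is a made-up name, the list file and the directory are passed as arguments, and the result is written through a temporary file instead of relying on sed -i, whose syntax differs between GNU and BSD sed. The list file is only read, so stapel_old stays unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for every name listed in LIST, append ":::#a1" to each
# line of DIR/<name>_old_sync that ends in ".md".
append_marker() {
  local list=$1 dir=$2 name target tmp
  while IFS= read -r name; do
    [[ -n $name ]] || continue                 # skip blank lines
    target=$dir/${name}_old_sync
    if [[ ! -f $target ]]; then
      printf >&2 '%s: no such file\n' "$target"
      continue
    fi
    tmp=$(mktemp) || return 1
    # Append the marker only to lines ending in .md, then write the result back.
    sed '/\.md$/s/$/:::#a1/' "$target" > "$tmp" && cat "$tmp" > "$target"
    rm -f "$tmp"
  done < "$list"
}
```

Usage would be something like append_marker olddata/stapel_old olddata. A line in the list that does not correspond to a file (for example the folder count appended at the end of stapel_old) only produces a warning.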

Create a backup of a file in bash

I want to write into a file in a bash script but I want to make sure that the file is backed up if it exists and I also want to avoid overwriting any existing backups.
So basically I have $FILE, if this exists, I want to move $FILE to $FILE.bak if it does not already exist, otherwise to $FILE.bak2, $FILE.bak3, etc.
Is there a shell command for this?
Using a function to find the next available name:
#!/usr/bin/env bash
function nextsuffix {
local name="$1.bak"
if [ ! -e "$name" ]; then
printf "%s" "$name"
else
local -i num=2
while [ -e "$name$num" ]; do
num+=1
done
printf "%s%d" "$name" "$num"
fi
}
mv "$1" "$(nextsuffix "$1")"
If foo.bak already exists, it just loops until a given foo.bakN filename doesn't exist, incrementing N each time.
You can just output to a file with a date.
FILE=~/test
echo "123" >> "$FILE.$(date +'%Y%m%d')"
If you want numbered backups, logrotate is probably the best fit.
cp "$FILE" "$FILE.bak$(( $(grep -Eo '[[:digit:]]+' <(sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1 )) + 1 ))"
Breaking the commands down
sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1
List the last file in the directory which is sorted in numeric form
grep -Eo '[[:digit:]]+' <(sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1 )
Strip out everything but the digits
$(( $(grep -Eo '[[:digit:]]+' <(sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1 )) + 1 ))
Add one to the result
For posterity, my function with changes inspired by @Shawn's answer
backup() {
local file new n=0
local fmt='%s.%(%Y%m%d)T_%02d'
for file; do
while :; do
printf -v new "$fmt" "$file" -1 $((++n))
[[ -e $new ]] || break
done
command cp -vp "$file" "$new"
done
}
I like to cp not mv.

Bash - Extract Matching String from GZIP Files Is Running Very Slow

Complete novice in Bash. I'm trying to iterate through 1000 gzip files; maybe GNU parallel is the solution?
#!/bin/bash
ctr=0
echo "file_name,symbol,record_count" > $1
dir="/data/myfolder"
for f in "$dir"/*.gz; do
gunzip -c $f | while read line;
do
str=`echo $line | cut -d"|" -f1`
if [ "$str" == "H" ]; then
if [ $ctr -gt 0 ]; then
echo "$f,$sym,$ctr" >> $1
fi
ctr=0
sym=`echo $line | cut -d"|" -f3`
echo $sym
else
ctr=$((ctr+1))
fi
done
done
Any help to speed the process will be greatly appreciated !!!
#!/bin/bash
ctr=0
export ctr
echo "file_name,symbol,record_count" > $1
dir="/data/myfolder"
export dir
doit() {
f="$1"
gunzip -c $f | while read line;
do
str=`echo $line | cut -d"|" -f1`
if [ "$str" == "H" ]; then
if [ $ctr -gt 0 ]; then
echo "$f,$sym,$ctr"
fi
ctr=0
sym=`echo $line | cut -d"|" -f3`
echo $sym >&2
else
ctr=$((ctr+1))
fi
done
}
export -f doit
parallel doit ::: "$dir"/*.gz >> "$1"
The Bash while read loop is probably your main bottleneck here. Calling multiple external processes for simple field splitting will exacerbate the problem. Briefly,
while IFS="|" read -r first second third rest; do ...
leverages the shell's built-in field splitting functionality, but you probably want to convert the whole thing to a simple Awk script anyway.
echo "file_name,symbol,record_count" > "$1"
for f in "/data/myfolder"/*.gz; do
gunzip -c "$f" |
awk -F '|' -v f="$f" -v OFS="," '
$1 == "H" { if(ctr) print f, sym, ctr
ctr=0; sym=$3;
print sym >"/dev/stderr"
next }
{ ++ctr }'
done >>"$1"
This vaguely assumes that printing the lone sym is just for diagnostics. It should hopefully not be hard to see how this can be refactored if this is an incorrect assumption.
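If you would rather stay in pure Bash, the built-in field splitting mentioned above could look roughly like this. It is a sketch, not a drop-in replacement: count_records is a made-up name, it reads the decompressed records from standard input, and unlike the loop in the question it also reports the last group once the input ends.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read pipe-delimited records on stdin and print
# "label,symbol,count" for every header (H) record that is followed by
# at least one detail record.
count_records() {
  local label=$1 type skip sym_field rest sym="" ctr=0
  # read splits each line on "|" itself, so no cut or echo is needed per line.
  while IFS='|' read -r type skip sym_field rest; do
    if [[ $type == H ]]; then
      if (( ctr > 0 )); then
        printf '%s,%s,%d\n' "$label" "$sym" "$ctr"
      fi
      ctr=0
      sym=$sym_field
    else
      ctr=$((ctr + 1))
    fi
  done
  # Assumption: the final group should be reported as well.
  if (( ctr > 0 )); then
    printf '%s,%s,%d\n' "$label" "$sym" "$ctr"
  fi
}
```

It would be called once per file, for example gunzip -c "$f" | count_records "$f" >> "$1". Awk is still likely to be faster on large inputs.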

Shell script: check if any files in one directory are newer than any files in another directory

I want to run a command in a shell script if files in one directory have changed more recently than files in another directory.
I would like something like this
if [ dir1/* <have been modified more recently than> dir2/* ]; then
echo 'We need to do some stuff!'
fi
As described in BashFAQ #3, broken down here into reusable functions:
newestFile() {
local latest file
for file; do
[[ $file && $file -nt $latest ]] && latest=$file
done
printf '%s\n' "$latest"
}
directoryHasNewerFilesThan() {
[[ "$(newestFile "$1"/*)" -nt "$(newestFile "$2"/*)" ]]
}
if directoryHasNewerFilesThan dir1 dir2; then
echo "We need to do something!"
else
echo "All is well"
fi
If you want to count the directories themselves as files, you can do that too; just replace "$(newestFile "$1"/*)" with "$(newestFile "$1" "$1"/*)", and likewise for the call to newestFile for $2.
Using /bin/ls
#!/usr/bin/ksh
dir1=$1
dir2=$2
#get modified time of directories
integer dir1latest=$(ls -ltd --time-style=+"%s" ${dir1} | head -n 2 | tail -n 1 | awk '{print $6}')
integer dir2latest=$(ls -ltd --time-style=+"%s" ${dir2} | head -n 2 | tail -n 1 | awk '{print $6}')
#get modified time of the latest file in the directories
integer dir1latestfile=$(ls -lt --time-style=+"%s" ${dir1} | head -n 2 | tail -n 1 | awk '{print $6}')
integer dir2latestfile=$(ls -lt --time-style=+"%s" ${dir2} | head -n 2 | tail -n 1 | awk '{print $6}')
#sort the times numerically and get the highest time
val=$(/bin/echo -e "${dir1latest}\n${dir2latest}\n${dir1latestfile}\n${dir2latestfile}" | sort -n | tail -n 1)
#check to which file the highest time belongs to
case $val in
@(${dir1latest}|${dir1latestfile})) echo $dir1 is latest ;;
@(${dir2latest}|${dir2latestfile})) echo $dir2 is latest ;;
esac
The idea is simple: get the timestamps of both directories and of their newest entries in epoch time, then compare them numerically.

Extract a certain part of a string in bash with different patterns

I have this file:
CLUSTERS=SP1,SP2,SP3
FNAME_SP1="REWARDS_BTS_SP1_<GTS>.dat"
FNAME_SP2="DUMP_LOG_SP2_<GTS>.dat"
FNAME_SP3="TEST_CASE_TABLE_SP3_<GTS>.dat"
What I want to get from these are:
REWARDS_BTS_SP1_
DUMP_LOG_SP2_
TEST_CASE_TABLE_SP3_
I loop through the CLUSTERS field, get the values, and use it to find the appropriate FNAME_<CLUSTERNAME> value. Basically, the CLUSTERS value are ALWAYS before the _<GTS> part of the string. Any string pattern will do, provided that the CLUSTERS value come before the _<GTS> at the end of the string.
Any suggestions? Here's a part of the script.
function loadClusters() {
for i in `echo ${!CLUSTER*}`
do
CLUSTER=`echo ${i} | grep $1`
if [[ -n ${CLUSTER} ]]; then
CLUSTER=${!i}
break;
fi
done
echo -e ${CLUSTER}
}
function loadClustersCampaign() {
for i in `echo ${!BPOINTS*}`
do
BPOINTS=`echo ${i} | grep $1`
if [[ -n ${BPOINTS} ]]; then
BPOINTS=${!i}
break;
fi
done
for i in `echo ${!FNAME*}`
do
FNAME=`echo ${i} | grep $1`
if [[ -n ${FNAME} ]]; then
FNAME=${!i}
break;
fi
done
echo -e ${BPOINTS}"|"${FNAME}
}
#get clusters
clusters=$(loadClusters $1)
for i in `echo $clusters | sed 's/,/ /g'`
do
file=$(loadClustersCampaign ${i/-/_} | awk -F"|" '{print $2}') ;
echo $file;
#then get the part of the $file variable
done
Fun with Shell Parameter Expansions
You can use matching-prefix notation and indirect expansion to get at the variables you want, and use the "remove suffix" expansion on each result to collect just the portions of the filename that you want. For example:
FNAME_SP1='REWARDS_BTS_SP1_<GTS>.dat'
FNAME_SP2='DUMP_LOG_SP2_<GTS>.dat'
FNAME_SP3='TEST_CASE_TABLE_SP3_<GTS>.dat'
for cluster in "${!FNAME_SP@}"; do
echo "${!cluster%%<GTS>*}"
done
This will print out the following:
REWARDS_BTS_SP1_
DUMP_LOG_SP2_
TEST_CASE_TABLE_SP3_
but you could issue any valid shell command inside the loop instead of using echo.
See Also
http://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
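Since the question loops over the CLUSTERS value, the same expansions can also be driven from that list. This is only a sketch: prefix_for_cluster is a made-up name, and it assumes every cluster name has a matching FNAME_<CLUSTERNAME> variable.

```shell
#!/usr/bin/env bash
CLUSTERS=SP1,SP2,SP3
FNAME_SP1='REWARDS_BTS_SP1_<GTS>.dat'
FNAME_SP2='DUMP_LOG_SP2_<GTS>.dat'
FNAME_SP3='TEST_CASE_TABLE_SP3_<GTS>.dat'

# Hypothetical helper: print the part of FNAME_<cluster> before "<GTS>".
prefix_for_cluster() {
  local var="FNAME_$1"
  local value=${!var}                  # indirect expansion: value of FNAME_SP1 etc.
  printf '%s\n' "${value%%<GTS>*}"     # strip "<GTS>" and everything after it
}

# Split the comma-separated list into an array and process each cluster.
IFS=, read -ra names <<< "$CLUSTERS"
for name in "${names[@]}"; do
  prefix_for_cluster "$name"
done
```

With the variables above, this should print the same three prefixes, one per line.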
If you prefer an awk solution, the following may be useful.
> echo 'FNAME_SP1="REWARDS_BTS_SP1_<GTS>.dat"' | awk -F"<GTS>" '{split($1,a,"=");print substr(a[2],2)}'
REWARDS_BTS_SP1_
In more detail:
> cat temp
CLUSTERS=SP1,SP2,SP3
FNAME_SP1="REWARDS_BTS_SP1_<GTS>.dat"
FNAME_SP2="DUMP_LOG_SP2_<GTS>.dat"
FNAME_SP3="TEST_CASE_TABLE_SP3_<GTS>.dat"
> awk -F"<GTS>" '/FNAME_SP/{split($1,a,"=");print substr(a[2],2)}' temp
REWARDS_BTS_SP1_
DUMP_LOG_SP2_
TEST_CASE_TABLE_SP3_
>
