I am currently doing this:
while read l
do
echo git add $l/
git add $l/
# sed -i -e '1,1d' data/commit-folders.csv
# echo git commit -am "'Autocommit'"
# git commit -uno -am "'Autocommit'"
# echo git push origin master
# git push origin master
done < data/commit-folders.csv
Essentially this just runs git add <folder> for a list of folders in a CSV file. I would like this to be more robust, so that whenever it restarts, it resumes from where it left off. So I added the commented-out line, which does an in-place delete: sed -i -e '1,1d' data/commit-folders.csv. However, with while read line, the loop gets confused about the current line when lines are deleted from the file underneath it. So I'm wondering how to do this properly.
How can I iterate through a CSV file with a <path> on each line, and delete each path once it has been successfully git added? It seems like I need a loop that selects the first line from the file and then deletes it afterwards, rather than using while read line.
Here is a solution without sed.
#!/bin/bash
csv="data/commit-folders.csv"
done="$(mktemp)"
# autoremove tempfile at exit
trap 'rm "$done"' EXIT
# loop over all lines in csv
while read -r file; do
printf "git add %s\n" "$file"
git add "$file"
# write processed files in tempfile
printf "%s\n" "$file" >> "$done"
#...
done < "$csv"
# create tempfile for merge result
newfile="$(mktemp)"
# sort: merge and sort $csv with $done
# uniq -u: write only unique files into tempfile
sort "$csv" "$done" | uniq -u > "$newfile"
# override $csv with tempfile
mv "$newfile" "$csv"
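One caveat with the sort | uniq -u merge above: it reorders the CSV, and it also drops any unprocessed path that happens to be listed twice. If the original order matters, a possible alternative is to filter with grep. This is only a sketch; the function name remove_processed is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove every line of $2 (processed paths) from $1 (the CSV),
# keeping the remaining lines in their original order.
remove_processed() {
    local csv=$1 processed=$2 tmp
    tmp=$(mktemp) || return 1
    # -F: fixed strings, -x: whole-line match, -v: invert, -f: read patterns from file.
    # grep exits 1 when no line is left over, which is not an error here.
    grep -vxFf "$processed" "$csv" > "$tmp" || [ "$?" -eq 1 ] || { rm -f "$tmp"; return 1; }
    mv "$tmp" "$csv"
}
```

In the script above it would replace the last three commands: remove_processed "$csv" "$done"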
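You can use sed -i "\|^${l}\$|d" to find that line and delete it. A custom delimiter (|) is needed because the paths may contain slashes, and the ^...$ anchors keep the pattern from also matching longer paths that contain $l. This assumes there are no duplicate lines and that the paths contain no regular-expression metacharacters (a . in a path matches any character).
while read -r l
do
echo git add "$l/"
git add "$l/"
# sed -i -e '1,1d' data/commit-folders.csv
sed -i "\|^${l}\$|d" data/commit-folders.csv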
# echo git commit -am "'Autocommit'"
# git commit -uno -am "'Autocommit'"
# echo git push origin master
# git push origin master
done < data/commit-folders.csv
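The loop the question describes (take the first line, process it, and delete it only on success) can also be written directly. The following is a minimal sketch, not a drop-in replacement; process_queue is a hypothetical name, and it assumes no other process writes to the file while it runs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command on the first line of a queue file and
# remove that line only if the command succeeded.
# Usage: process_queue FILE COMMAND [ARGS...]
process_queue() {
    local queue=$1 line
    shift
    while [ -s "$queue" ]; do
        # Read only the first line; tolerate a missing trailing newline.
        IFS= read -r line < "$queue" || true
        if [ -n "$line" ]; then
            # On failure, stop and leave the line in the file for the next run.
            "$@" "$line" || return 1
        fi
        # Drop the first line by writing the rest to a temp file and renaming it.
        tail -n +2 "$queue" > "$queue.tmp" && mv "$queue.tmp" "$queue"
    done
}
```

For the case in the question this would be called as: process_queue data/commit-folders.csv git add --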
Using git-filter-repo is it possible to combine N repositories into a mono-repository re-writing the commits so that the commits are interwoven, or "zippered" up by date?
Currently, I'm testing this with only 2 repos, with each repo having its own subdirectory. After the operation, the commits for each repo are on "top" of each other rather than interwoven. What I really want is to be able to have a completely linear history by authored date without the added merge commits.
rm -rf ___x
mkdir ___x
cd ___x
echo "creating the monorepo"
git init
touch "README.md"
git add .
git commit -am "Hello World!"
declare -A data
data=(
["foo"]="https://github.com/bcanzanella/foo.git"
["bar"]="https://github.com/bcanzanella/bar.git"
)
for d in "${!data[@]}";
do {
REPO_NAME=$d
REPO_REMOTE=${data[$d]}
# since we can use a foo/bar as the repo identifier, replace the / with a -
REPO_DIR_TMP="$(mktemp -d -t "${REPO_NAME/\//-}.XXXX")"
echo "REPO REMOTE: $REPO_REMOTE"
echo "REPO NAME: $REPO_NAME"
echo "REPO TMP DIR: $REPO_DIR_TMP"
echo ""
echo "Cloning..."
git clone "$REPO_REMOTE" "$REPO_DIR_TMP"
echo "filtering into ..."
cd $REPO_DIR_TMP && git-filter-repo --to-subdirectory-filter "$REPO_NAME"
# cat .git/filter-repo/commit-map
## merge the rewritten repo
git remote add "$REPO_NAME" "$REPO_DIR_TMP"
echo "fetching..."
git fetch "$REPO_NAME"
echo "merging..."
git merge --allow-unrelated-histories "$REPO_NAME/master" --no-edit
## delete the rewritten repo
echo "Removing temp dir $REPO_DIR_TMP..."
rm -rf "$REPO_DIR_TMP"
echo "Removing remote $REPO_NAME..."
# git remote rm "$REPO_NAME"
echo "$REPO_NAME done!"
}
done
To emphasize eftshift0's comment: rebasing and rewriting history can lead to commits being ordered in a seemingly absurd chronological order.
If you know for a fact that all commits are well ordered (e.g. the commit date of a parent commit is always older than the commit date of its child commit), you may be able to generate the correct list of commits to feed into a git rebase -i script.
[edit] After thinking about it, this may be enough for your use case:
Look at the history of your repo using --date-order :
git log --graph --oneline --date-order
If the sequence of commits matches what you expect, you can use git log to generate a rebase -i sequence script :
# --reverse : 'rebase -i' asks for entries starting from the oldest
# --no-merges : do not mention the "merge" commits
# sed -e 's/^/pick /' : use any way you see fit to prefix each line with 'pick '
# (another valid way is to copy paste the list of commits in an editor,
# and add 'pick ' to each line ...)
git log --reverse --no-merges --oneline --date-order |\
sed -e 's/^/pick /' > /tmp/rebase-apply.txt
Then rebase the complete history of your repo :
git rebase -i --root
In the editor, copy/paste the script you created with your first command,
save & close.
Hopefully, you will get a non conflicting unified history.
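If you would rather not paste the script by hand, the two steps can probably be combined by pointing GIT_SEQUENCE_EDITOR at a command that overwrites the todo list git hands to it. The following is a sketch under that assumption; linearize_by_date is a hypothetical name, and it rewrites the history of the current branch, so try it on a throwaway clone first.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rebuild the current branch as a linear history,
# ordered the way `git log --date-order` lists the commits.
linearize_by_date() {
    local todo status
    todo=$(mktemp) || return 1
    # Same list as above: oldest first, merge commits left out, 'pick ' prefixed.
    git log --reverse --no-merges --oneline --date-order |
        sed -e 's/^/pick /' > "$todo"
    # git runs the sequence editor with the path of its todo file as the last
    # argument, so `cp "$todo" <todo-file>` replaces the proposed todo list.
    GIT_SEQUENCE_EDITOR="cp '$todo'" git rebase -i --root
    status=$?
    rm -f "$todo"
    return "$status"
}
```

As with the manual version, the rebase stops on the first conflict, which then has to be resolved by hand.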
Background
I need to close multiple Git branches (hundreds) that have been left open on a remote repository.
I wanted to sort them by last-commit and put them in a text-file so others could confirm they were no longer needed.
I found a method that allowed me to dump them in a way that I could distribute (and open in Excel to sort by date)
git for-each-ref --format='%(committerdate:short) %09 %(authorname) %09 %(refname)' | sort -k5n -k2M -k3n -k4n >> branches.txt
And then I need to read the updated text-file back in and delete the remote branches:
#!/bin/bash
#prefix of the branches if they are all remote
prefix="refs/remotes/origin/"
prefix1="refs/remotes/"
# path to branches, compiled with:
# git for-each-ref --format='%(committerdate:short) %09 %(authorname) %09 %(refname)' | sort -k5n -k2M -k3n -k4n >> branches.txt
# read through entire input
input="branches.txt"
while IFS= read -r line
do
# echo "$line"
IFS=' ' read -r -a array <<< "$line"
# search the array for the prefix of the branch
for index in "${!array[@]}"
do
if [[ ${array[$index]} == *"refs/remotes/origin"* ]]; then
#echo -e " \e[34m ${array[$index]} \e[39m Last commit: \e[34m" ${array[0]} ${array[1]} ${array[2]} ${array[3]} ${array[4]}
# echo ${array[$index]#"$prefix"}
#remove the prefix if they are not pulled locally
branch=${array[$index]#"$prefix"}
echo $(git push origin --delete $branch)
fi
done
done < "$input"
Issue
However, I keep getting an error saying that the refspec is invalid:
fatal: invalid refspec ':PracticeBranch?'
I don't have the ' or ? in the name of the branch variable, so this is probably my ignorance of using echo with bash, but I don't know where it is coming from.
Does a command like:
git push origin --delete PracticeBranch
work when called directly from the shell?
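One possible explanation (an assumption, since the contents of branches.txt are not shown) is that the file was saved with Windows line endings after being edited in Excel. The trailing carriage return then becomes part of the branch name, and git prints it as ?. A minimal sketch of stripping it before the push, using a hypothetical helper named branch_from_line:

```bash
#!/usr/bin/env bash
prefix="refs/remotes/origin/"

# Hypothetical helper: print the branch name found in one line of branches.txt,
# or return 1 if the last field is not a ref under $prefix.
branch_from_line() {
    local line=${1%$'\r'}               # drop a trailing carriage return, if any
    local ref=${line##*[[:space:]]}     # keep only the last whitespace-separated field
    [[ $ref == "$prefix"* ]] || return 1
    printf '%s\n' "${ref#"$prefix"}"
}

# Possible use inside the loop from the question:
#   while IFS= read -r line; do
#       branch=$(branch_from_line "$line") && git push origin --delete "$branch"
#   done < branches.txt
```

Running cat -A branches.txt (GNU cat) shows a carriage return as ^M at the end of each line, which is a quick way to check whether this is the cause.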
I am using a CI/CD system to automate the building of Docker Images from a git repository. The Image Tag of the image corresponds to the short (i.e. 8-characters) hash of the corresponding git commit, e.g. myimage:123456ab.
The repository contains source code that gets packaged in the Docker Image and stuff like documentation and deployment configuration that is excluded using a .dockerignore file (similar to .gitignore).
While the process works in general, it leads to rebuilding and redeploying Docker Images that are absolutely identical, because the only changes were made to files that did not become part of the Image (e.g. the repository's README).
Using only the shell (bash in this case), git and standard *nix tools, is there a way to get the short hash of the latest commit that changed a file which is not ignored by the .dockerignore file? This should as well cover removing a non-ignored file.
You can do this through a combination of git log and git show.
The following script will go backwards through the revision history and find the first commit to have a change that would not be ignored by .dockerignore
for commit in $(git log --pretty=%H)
do
# Get the changed file names for the commit.
# Use `sed 1d` to remove the first line, which is the commit description
files=$(git show $commit --oneline --name-only | sed 1d)
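# git check-ignore exits 0 if any path is ignored and 1 if none is, so negate it.
# Note: a commit that mixes ignored and non-ignored files is still skipped.
if [ -n "$files" ] && ! docker-check-ignore $files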
then
echo $commit
exit 0
fi
done
exit 1
And then you could define docker-check-ignore as a script like the following:
#!/bin/bash
# bash is required: pushd and popd are not POSIX sh builtins
DIR=$(mktemp -d)
pushd $DIR
# Set up a temporary git repository so we can use
# git check-ignore with .dockerignore
git init
popd
cp .dockerignore $DIR/.gitignore
pushd $DIR
git check-ignore "$@"
# Store the error code
ERROR=$?
popd
rm -rf $DIR
exit $ERROR
I will leave it as an exercise to reduce the number of file system operations, so that a directory is not created and removed for each commit.
#!/usr/bin/env bash
declare -a ign_table=()
# Populates ign_table with patterns from .dockerignore
while IFS= read -r line || [[ ${line} ]]; do
ign_table+=("${line}")
done < <(sed '/^#/d;/^$/d' .dockerignore)
is_docker_ignored() {
local -i ignore=1 # false, default not ignored
for ign_patt in "${ign_table[@]}"; do
# If pattern starts with ! it is an exception rule
# when filename match !pattern, do not ignore it
# shellcheck disable=SC2053 # $ign_patt must not use quotes to match wildcards
if [[ ${ign_patt} =~ ^\!(.*) ]] && [[ ${1} == ${BASH_REMATCH[1]} ]]; then
return 1 # false: no need to check further patterns, file not ignored
fi
# Normal exclusion pattern, if file match,
# shellcheck disable=SC2053 # $ign_patt must not use quotes to match wildcards
if [[ ${1} == $ign_patt ]]; then
ignore=0 # true: it match an ignore pattern, file may not be ignored if it later matches an exception pattern
fi
done
return "${ignore}"
}
while IFS= read -r file
do
is_docker_ignored "${file}" && continue # File is in .dockerignore
commit_hash="$(git rev-list --all -1 "${file}")"
printf '%s\n' "${commit_hash:0:8}"
done < <(git ls-files)
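Note that the loop above prints one hash per tracked file, and git ls-files does not list files that have been removed. To get a single hash that also accounts for removals, one option is to walk the log newest-first and stop at the first commit touching a non-ignored path. The sketch below assumes the is_docker_ignored function defined above as the predicate; first_relevant_commit is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads the output of
#   git log --abbrev=8 --format='commit %h' --name-only
# on stdin and prints the first (newest) commit that touches at least one path
# for which the predicate command given as arguments fails, i.e. a path that
# is NOT ignored. Returns 1 if no such commit exists.
first_relevant_commit() {
    local line commit=
    while IFS= read -r line; do
        case $line in
            "commit "*) commit=${line#commit } ;;   # start of the next commit
            "") ;;                                  # separator line
            *)  if ! "$@" "$line"; then             # changed (or removed) path
                    printf '%s\n' "$commit"
                    return 0
                fi ;;
        esac
    done
    return 1
}

# Possible use, with is_docker_ignored from the script above:
#   git log --abbrev=8 --format='commit %h' --name-only |
#       first_relevant_commit is_docker_ignored
```

Paths with unusual characters are quoted by git log in this output, so they would need extra handling.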
When I use git via terminal I open a tab dedicated to git command only, then I don't want to type "git " every time. Is there any way to make some text automatically typed on every line?
You can define every git command as an alias, so that for example typing diff mybranch will invoke git diff mybranch. To invoke the normal shell command, type a backslash before it, for example \diff file ../elsewhere/file invokes /usr/bin/diff and not git diff.
Put the following code in a file ~/.git.bashrc. Configure your git terminal to run bash --rcfile ~/.git.bashrc instead of just running bash.
. ~/.bashrc
for c in $(COLUMNS=4 git help -a | sed -n 's/^ \([a-z]\)/\1/p';
git config --get-regexp '^alias.' | sed 's/alias\.//; s/ .*//')
do
alias "$c=git $c"
complete -F _complete_alias "$c"
done
The complete line requires the _complete_alias function.
I created this .bashrc function that pushes the code and tags it.
All you need to give it is the comment you want for the push.
The alias of the function is "gp" (which stands for git push).
So if you want to push and tag some code all you need after you add this code to your .bashrc is:
$ gp "test my new git push function"
gpfunction() {
git status
echo [Enter to continue...]
read a
git pull
git commit -am"$1"
git push
tag_major_min=$(git tag |sort -V|tail -1|awk -F. '{print $1 "." $2 "."}')
echo Tag major min $tag_major_min
latest_tag_number=$(git tag |sort -V|tail -1|awk -F. '{print $3}')
echo Latest tag number $latest_tag_number
next=$(echo $latest_tag_number + 1 | bc)
echo Next $next
new_tag=$(echo $tag_major_min $next | sed 's/ //g')
echo New tag $new_tag
git tag $new_tag
git push origin $new_tag
}
alias gp=gpfunction
This script uses a major.minor.patch version standard and increments the patch version.
You can tweak it as you please.
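If you prefer to avoid awk, bc and sed for the version bump, the same increment can be done with bash arithmetic alone. This is a small sketch; next_patch_tag is a hypothetical name and it assumes tags of the form major.minor.patch.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the tag that follows a major.minor.patch tag,
# e.g. 1.4.9 gives 1.4.10. Returns 1 if the patch part is not a number.
next_patch_tag() {
    local tag=$1 major minor patch
    IFS=. read -r major minor patch <<< "$tag"
    [[ $patch =~ ^[0-9]+$ ]] || return 1
    # 10# forces base 10, so a patch such as 08 is not read as octal.
    printf '%s.%s.%s\n' "$major" "$minor" "$((10#$patch + 1))"
}

# Possible use inside gpfunction:
#   new_tag=$(next_patch_tag "$(git tag | sort -V | tail -1)")
```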
Reason: I want to compare two arbitrary different commits using a difftool. I know the hashes from a search and I don't want to copy these hashes, thus I am looking for a command that does something like
$ log_str=$(git log --all -S"new_tour <-" --pretty=format:"%h")
$ git difftool -t kdiff3 log_str[1] log_str[2] myfile.txt
I would like to be able to address arbitrary indices - not always 1 and 2
It would be great if the answer also gave a hint on how to figure out what the structure of log_str is. Is it a character? An array of characters? A list? ... in Bash.
I found some related help here and here, but I can't make it work.
Now I do:
$ git log --pretty=format:"%h"
3f69dc7
b8242c6
01aa74f
903c5aa
069cfc5
and
$ git difftool -t kdiff3 3f69dc7 b8242c6 myfile.txt
I would take a two step approach using a temporary file:
git log --all -S'SEARCH' --pretty=format:"%h" > tmp_out
git diff "$(sed -n '1p' tmp_out)" "$(sed -n '2p' tmp_out)" myfile.txt
rm tmp_out
sed is used to display line 1 and line 2 of the file.
With variables:
search="foo"
index_a="1"
index_b="2"
file="myfile.txt"
git log --all -S"${search}" --pretty=format:"%h" > tmp_out
git diff "$(sed -n "${index_a}p" tmp_out)" "$(sed -n "${index_b}p" tmp_out)" "${file}"
rm tmp_out
In a bash function:
search_diff() {
search="${1}"
index_a="${2}"
index_b="${3}"
file="${4}"
git log --all -S"${search}" --pretty=format:"%h" > tmp_out
git diff "$(sed -n "${index_a}p" tmp_out)" "$(sed -n "${index_b}p" tmp_out)" "${file}"
rm tmp_out
}
search_diff "foo" 2 3 myfile.txt
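On the question of what log_str is: a command substitution yields one string with embedded newlines, not a list. A temporary file can be avoided by splitting that string into a bash array. The sketch below uses the sample hashes from the question in place of a real git log call, and declare -p is one way to inspect the structure of a variable.

```bash
#!/usr/bin/env bash
# Sample output, copied from the git log listing in the question; in practice it
# would come from: log_str=$(git log --all -S"new_tour <-" --pretty=format:"%h")
log_str=$'3f69dc7\nb8242c6\n01aa74f\n903c5aa\n069cfc5'

declare -p log_str      # shows that log_str is a single string containing newlines

# Split the string into an array, one hash per element (mapfile needs bash 4+).
mapfile -t hashes <<< "$log_str"
declare -p hashes       # shows the array together with its indices

# Bash arrays are 0-indexed, so these are the first and second hashes.
printf 'would run: git difftool -t kdiff3 %s %s myfile.txt\n' "${hashes[0]}" "${hashes[1]}"
```

Any pair of indices can be used the same way, for example "${hashes[2]}" and "${hashes[4]}".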