Git smudge and clean using local configuration branch - bash

The local configuration of the project I'm working on involves changing several files in complicated ways that cannot be committed to any submitted branches. To work around this I've committed these local configuration changes to a dedicated local branch config, and have been running this bash script config.sh after starting a new work branch:
#!/bin/bash
# put relevant config files in array
mapfile -t files < <(git diff config develop --name-only)
# overwrite only those files in my working directory
git checkout config -- "${files[@]}"
# unstage them so they aren't accidentally committed
git reset HEAD -- "${files[@]}"
echo "The following files were successfully overwritten for local configuration:"
printf '\t%s\n' "${files[@]}"
Along with another script, deconfig.sh, that does the same in reverse. Run directly from the terminal, these scripts have been working fine, but I'd like to streamline the process further using git's clean and smudge filters. So I created a .gitattributes file:
*.* filter=config
and then added this to my .git/config file:
[filter "config"]
smudge = ./config.sh
clean = ./deconfig.sh
However, it just isn't working. If I had to guess, it's because git isn't expecting me to run an additional checkout as part of a filter, since the filter itself runs during a checkout against all files. Most use cases for smudge and clean seem to involve simple find-and-replace operations, but that approach would be complicated to implement and difficult to maintain given the complexity of the changes needed. I could store the configuration files in a static, external directory somewhere, but I'd like to smudge and clean based off the same configuration branch, because the local configuration itself frequently evolves and benefits from versioning alongside the rest of the project; ideally the branch could also serve as a baseline for other devs' local configuration. Git's filter-branch might be a better fit, but git's own documentation recommends against using it at all.
Is there a way to do this? Is there something wrong with my git configuration? Could the script itself be causing a problem? Any other possible approaches?

Although it is not documented anywhere, you cannot change the state of the working tree with a smudge or clean filter. Git expects to invoke the filter once for each file, piping the file's data into it and reading the result from its standard output. In other words, these filters are intended to be invoked on a per-file basis and to transform only that file's content, not to modify the working tree state.
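For contrast, a filter that does fit this model is a pure stdin-to-stdout transform, e.g. this trivial (hypothetical) smudge script:
#!/bin/bash
# a well-behaved smudge filter: read the blob on stdin, write the
# transformed content to stdout, and touch nothing else in the worktree
exec sed 's/ENV=production/ENV=development/'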
The best solution to your problem is to avoid keeping a separate branch. Simply keep all of the files, both development and production, in some directory, and use a script to copy the correct one into place. The location of the running config file should be ignored, so the script won't cause Git to show anything as modified. Alternatively, keep a template somewhere, and have the script generate the appropriate one based on the environment. This is good if you have secrets for production that should not be checked in; you can pass them to the script through the environment and have the right values generated.
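A minimal sketch of that approach, with all file names hypothetical (app.conf, the live config, is listed in .gitignore, and secrets arrive via the environment):
#!/bin/bash
# apply-config.sh <dev|prod>: render the chosen template into the ignored live path
env="${1:-dev}"
sed "s|@DB_PASSWORD@|${DB_PASSWORD:?must be set}|" \
    "config/${env}.conf.template" > app.conf
echo "Applied ${env} configuration."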
What you're doing is related to ignoring tracked files, which, as outlined in the Git FAQ, generally can't be done successfully.

Related

Touch all files in a git repo so git thinks they are changed

I need to 'touch' (I think) all files in my git repo (lots of files) so that running git status will show them as modified (and then I can add and commit them). I need to do this because our in-house tool uses the files from a git commit to generate a report ... which I've been asked to do.
In POSIX environments I think I could just touch a directory and go from there.
I don't think that's possible, because git detects that a file changed only if its content changed. Touching the file will have no effect (even on unix).
Perhaps changing the permissions on the file could be a very dirty solution, but I'm not even sure of that, and only if you find a new permission that doesn't introduce some bad side effects!
The better solution is to update your reporting tool.
And being obliged to commit changes for ALL files to trick your tool and dirty your history is, in my opinion, a very bad idea...
If you were asked to "generate a report with all files", does that mean list all files in a commit? Because that's easily done with something like git ls-tree -r HEAD.
I had a demo repo that had a bunch of files in it with commit messages that I didn't want showing up in the demo. To be clear, the repo was "garbage", in that it was basically just a dump of files to demonstrate a folder structure.
That having been said, one way you could do this (sketched below) is to:
create a new temporary folder in your repo, for example "ez"
move all the files of the repo into it, e.g. "$ mv * ez"
commit that locally, then do the reverse and move them out again:
"$ mv ez/* .; rmdir ez"
That would show all files as having been changed. For my purposes, I then committed that change too, and pushed it up to my demo repo.
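A sketch of that whole sequence, assuming you run it from the repository root and everything at the top level is tracked ("ez" is an arbitrary temporary name):
mkdir ez
for f in *; do [ "$f" != ez ] && git mv "$f" ez/; done   # move every tracked entry, skipping ez itself
git commit -m "temp: move everything into ez"
git mv ez/* .                                            # move them all back out
rmdir ez
git commit -m "temp: move everything back"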

Programmatically overwrite a specific local file with remote file on every git pull

I have an XML file that we consider binary in git. This file is externally modified and committed.
I don't care about who edited it and what's new in the file. I just want to have the latest file version at every pull. At this time, at every git pull I have a merge conflict.
I just want that this file is overwritten on every git pull, without manually doing stuff like git fetch/checkout/reset every time I have to sync my repo.
Careful: I want to overwrite just that file, not every file.
Thanks
I thought you could use Git Hooks, but I don't see one running before a pull...
A possible workaround would be to make a script to delete this file and chain with the needed git pull...
This answer shows how to always select the local version for conflicted merges on a specific file. However, midway through the answer, the author describes also how to always use the remote version.
Essentially, you have to use git attributes to specify a specific merge driver for that specific file, with:
echo binaryfile.xml merge=keepTheirs > dir/with/binary/file/.gitattributes
git config merge.keepTheirs.name "always keep their file during merge"
git config merge.keepTheirs.driver "keepTheirs.sh %O %A %B"
git add -A
git commit -m "commit file for git attributes"
and then create keepTheirs.sh somewhere in your $PATH (and make it executable):
#!/bin/sh
# git passes %O %A %B as $1 $2 $3: ancestor, current (ours), other (theirs)
cp -f "$3" "$2"
exit 0
Please refer to that answer for a detailed explanation.
If the changes to your files are not actual changes, you should not commit them. This will clutter your version history and cause numerous problems.
From your statement I’m not quite sure which is the case, but there are 2 possibilities:
The file in question is a local storage file, the contents of which are not relevant for your actual sourcecode. In this case the file should be part of your .gitignore.
This file is actually part of your source and will thus have relevant changes in the future. By setting up the merge settings like you are planning to, you will cause trouble once this file actually changes, because merges will then be destructive.
In this case the solution is a little bit more complicated (apart from getting a fix for the crappy tool that changes stuff it doesn’t actually change …). What you are probably looking for is the assume unchanged functionality of git. You can access it with this command:
git update-index --assume-unchanged <file>
git docu (git help update-index):
You can set "assume unchanged" bit to paths you have not changed to cause git not to do this check. Note that setting this bit on a path does not mean git will check the contents of the file to see if it has changed — it makes git to omit any checking and assume it has not changed. When you make changes to working tree files, you have to explicitly tell git about it by dropping "assume unchanged" bit, either before or after you modify them.

Is Git checkout without merging a common file possible?

My goal involves having a file with the same name but different implementations in different branches. For example, I want to develop in a branch with verbose mode and another that works silently. Or, one branch uses a list, but the other uses a hash. Similar to prior question.
In my case, the changes are in a file with the same name. Unfortunately, checkout from one branch to the other merges the files of the same name (content?). In that case, the release version inherits the verbose print statements I had hoped to keep separate.
I learned and succeeded in using stash save; checkout; (edit other branch, add, commit); checkout back; and stash apply (to erase merge changes caused by checkout). It works, but the manual's examples (interrupted workflow, partial commits) suggest this is not the intended workflow. Creating an orphan branch for verbose destroys the history. Is there another way to switch between branches without carrying unintended changes to files with the same name?
Update: I can't replicate the behavior any longer, despite seeing it five times before submitting here. It used to show the text below, but I guess this question should be closed.
$ git checkout master
M Test.java
Switched to branch 'master'
I think the following command is what you are looking for:
git update-index --assume-unchanged <file>
To undo run:
git update-index --no-assume-unchanged <file>
From ""Difference Between 'assume-unchanged' and 'skip-worktree'", I would go with:
git update-index --skip-worktree -- a file
git update-index --no-skip-worktree -- a file
skip-worktree is useful when you instruct git not to touch a specific file ever.
That is handy for an already tracked config file.
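A typical round-trip for a tracked config file might then look like this (the path is hypothetical):
git update-index --skip-worktree -- config/settings.yml
# edit the file locally; git status stays clean and checkouts leave it alone
git update-index --no-skip-worktree -- config/settings.yml   # clear the bit again before pulling upstream changes to the file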

Incremental deploy from a shell script

I have a project, where I'm forced to use ftp as a means of deploying the files to the live server.
I'm developing on linux, so I hacked together a bash script that makes a backup of the ftp server's contents,
deletes all the files on the ftp, and uploads all the fresh files from the mercurial repository
(while taking care of user-uploaded files and folders, making post-deploy changes, etc.).
It's working well, but the project is starting to get big enough to make the deployment process too long.
I'd like to modify the script to look up which files have changed, and only deploy the modified files. (the backup is fine atm as it is)
I'm using mercurial as a VCS, so my idea is to somehow request the changed files between two revisions from it, iterate over them,
upload each modified file, and delete each removed file.
I can use hg log -vr rev1:rev2, and from the output, I can carve out the changed files with grep/sed/etc.
Two problems:
I have heard the horror stories that parsing the output of ls leads to insanity, so my guess is that the same applies here:
if I try to parse the output of hg log, the variables will undergo word-splitting and all kinds of transformations.
hg log doesn't tell me whether a file was modified/added/deleted. Differentiating between modified and deleted files is the minimum I'd need.
So, what would be the correct way to do this? I'm using yafc as an ftp client, in case it's needed, but willing to switch.
You could use a custom style that does the parsing for you.
hg log --rev rev1:rev2 --style mystyle
Then pipe it to sort -u to get a unique list of files. The file "mystyle" would look like this:
changeset = '{file_mods}{file_adds}\n'
file_mod = '{file_mod}\n'
file_add = '{file_add}\n'
The file_mods and file_adds templates expand to the files modified or added in each changeset; there are similar file_dels and file_del templates for deleted files.
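Putting the pieces together, the unique list of touched files could then be collected with something like:
hg log --rev rev1:rev2 --style mystyle | sort -u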
Alternatively, you could use hg status -ma --rev rev1-1:rev2, which adds an M or an A before modified/added files. You need to pass a different revision range, one less than rev1, because status reports the changes since that baseline. Deleted files are similar: add the -r (removed) flag, and an R is shown before each removed file.
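A sketch of how that output could drive the incremental deploy, with rev1 and rev2 assumed to be numeric revision variables and the upload/delete lines standing in for the corresponding yafc commands:
#!/bin/bash
# read "CODE filename" lines from hg status and act on each
while read -r code file; do
    case "$code" in
        M|A) echo "upload: $file" ;;   # put the new or changed file on the server
        R)   echo "delete: $file" ;;   # remove it from the server
    esac
done < <(hg status -mar --rev "$((rev1 - 1)):$rev2")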

How to determine which files are subjected to a filter via gitattributes when the filter is executed?

I have a bunch of ruby scripts in a git repository, and it seems to be really hard to get people to write properly indented code.
I also have a small ruby script that formats code to a specific standard, and now I would like to run that as a filter script so that junk won't get committed into the repository.
echo "*.rb filter=rubyfilter" > .gitattributes
echo "[filter \"rubyfilter\"]" >> .git/config
echo " clean = /home/rasjani/bin/rbeauty" >> .git/config
echo " smudge = /home/rasjani/bin/rbeauty" >> .git/config
does the dirty trick on the git side, but the ruby script then needs to know which files it is processing:
how / where do I look those up?
As described in the Pro Git book:
Git applies those settings only for a subdirectory or subset of files. These path-specific settings are called Git attributes and are set either in a .gitattributes file in one of your directories (normally the root of your project) or in the .git/info/attributes file if you don't want the attributes file committed with your project.
The git attributes man page mentions:
Upon checkout, when the smudge command is specified, the command is fed the blob object from its standard input, and its standard output is used to update the worktree file.
Similarly, the clean command is used to convert the contents of worktree file upon checkin.
So your script will process each *.rb file (in the directory and subdirectories where the .gitattributes file is located) on checkout and commit.
See this SO question for a concrete example.
You can test your own setup with a forced checkout:
git checkout --force
Note: as mentioned in this SO question, smudge and clean scripts receive the file content on standard input and must write the result to standard output; by default they are not told which file they are processing. However, git substitutes a %f placeholder in the filter command with the path of the file being filtered (e.g. clean = /home/rasjani/bin/rbeauty %f), which the script can use as a hint while still doing all its reading and writing through the standard streams.
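A minimal sketch of a wrapper that uses %f, assuming rbeauty reads ruby source on stdin and writes the formatted version to stdout:
#!/bin/bash
# configured as: clean = /path/to/this-wrapper %f
# $1 is the path git substituted for %f; content still flows stdin -> stdout
echo "rubyfilter: processing $1" >&2   # log the path; never write to the file itself
exec /home/rasjani/bin/rbeauty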
