How to apply a prediffer to WinMerge such that it runs on all files in the diff?

TL;DR: How can I automatically apply PrediffLineFilter on all diff results, not manually on only the open file?
WinMerge supports using plugin prediffers to, among other things, diff while ignoring items that meet conditions specified with a regular expression*. However, it appears you must manually apply the prediff after you perform the diff (huh?) and on each and every individual file if it doesn't match a file filter. The help manual indicates the file filter for PrediffLineFilter.sct is *.txt and gives no guidance on changing it.
I have thousands of files where the only difference might be a server name which follows a predictable pattern. (Example: server01, server02, etc.) I have figured out the regex for that pattern and the manual application of PrediffLineFilter after a diff works. But I can't be doing that on thousands of files.
How can I automatically apply PrediffLineFilter on all diff results, not manually on only the open file? I only want to see the files where the differences are meaningful.
*Learned this thanks to WinMerge : how to ignore specific words in a comparison?

In your WinMerge installation folder, go to MergePlugins and edit PrediffLineFilter.sct. Change the return value of get_PluginFileFilters to match the files you want this plugin to work on. For example, if you want this plugin to work on html and txt files:
Function get_PluginFileFilters()
    ' Semicolon-separated regular expressions matched against file names.
    get_PluginFileFilters = "\.html$;\.txt$"
End Function
After you have made these changes, select all the files on your folder compare window then right click > Plugin Settings > Prediffer Settings > Auto Prediffer, then refresh.
Btw, in later versions (2.16.10) you can do this in the Plugin Settings window.


Compare files in VSCode and output the differences automatically to a separate file

I use VSCode to compare two large CSV files (each almost 140k lines). I have used the "Compare Active File With..." command.
I want to extract only the difference to clean it up manually.
How can I do that?
Is there any extension or tip that will help me with this?

How do I not commit the development team lines in project.pbxproj without deselecting those lines manually?

I am collaborating with my friend on an iOS app. We use different Apple IDs in our Xcodes, so in "Signing and Capabilities" tab of project settings, we select different teams in the "Team" field:
From my observation, changing this affects the MyProject.xcodeproj/project.pbxproj file, which stores the file references that the Xcode project has, in addition to the "Team". Here's a snippet of what is changed:
buildSettings = {
CODE_SIGN_STYLE = Automatic;
DEVELOPMENT_TEAM = <my team ID>; /* this is changed */
INFOPLIST_FILE = MyProject/Info.plist;
PRODUCT_BUNDLE_IDENTIFIER = io.github.sweeper777.MyApp;
The problem arises when one of us commits this file and the other person pulls. The "puller" will now have the "Team" set to something invalid. When this person then tries to run the app on a real device, there will be code signing errors for obvious reasons. To solve this, this person must tediously go through all the targets that we have, and set each "Team" to their own team.
How can we make it so that on each person's computer, the "Team" stays the same after pulling, but any other changes to MyProject.xcodeproj/project.pbxproj are applied?
Putting the entire MyProject.xcodeproj/project.pbxproj in .gitignore doesn't work, because that would ignore every other change to it. Adding a new file to the project, for example, also changes MyProject.xcodeproj/project.pbxproj, and we want to be able to pull that change.
Manually deselecting the lines that say "DEVELOPMENT_TEAM = ..." when committing is as tedious as reselecting the correct team every time, so that's not a solution.
I found this. Apparently, I can configure git to run sed before git checkout and git add. However, that answer seems to ignore the line by deleting it completely. This means that my friend, when he pulls, would still have to reselect the correct team. What I want is the kind of "ignore" that simply stops tracking that line. That is, if there is a local version of that line, use that.
I am also aware that this all wouldn't be a problem if we are on the same team. But if I understand this correctly, I can't have multiple people on my team unless I have a Company account, and not only can I not afford that, I don't own a Company.
I don't use Xcode itself and do not know how to smuggle Git hooks and scripts past the Xcode interface, so you'll need more than just this answer. But you mention sed in comments, and given your proposed file format, that may well be the way to go:
buildSettings = {
CODE_SIGN_STYLE = Automatic;
DEVELOPMENT_TEAM = <my team ID>; /* this is changed */
INFOPLIST_FILE = MyProject/Info.plist;
PRODUCT_BUNDLE_IDENTIFIER = io.github.sweeper777.MyApp;
Git has the ability to run what it calls clean and smudge filters. These can be used to run any arbitrary program you like, including sed, the "stream editor", which is particularly good at making single-line changes based on regular expression matches.
There is another method that may also work, and may "play better" with Xcode, or may play worse. I'll go over that too, after covering clean and smudge filters.
Before we dive into writing clean and smudge filters, and using them from Git—you'll need to know all of these details as you will have to write your own custom filters—we should start with a simple fact about Git commits: No part of any commit can ever be changed. Once you make a commit, the stuff that's inside the commit—the stored data in all of its files—is the way it is, forever. So these filters have to work within that system. Remember that, as it will help with understanding what we're doing.
How Git makes and stores objects
The files inside a commit are not files, exactly: they're not the same thing as files in your file system, at least. Instead, they are what Git calls objects, specifically blob objects. A blob object holds the file's data; other objects hold the file's name; and commit objects collect everything together to be used all at once. There's one more internal object type for annotated tags but we'll stop here as we're really only interested in the blob-object part.
When Git extracts a commit, it reads the internal blob objects and runs them through internal code to decompress and format them into regular files. This can include doing end-of-line hacking (turning LFs into CRLFs) if desired. Normally, all this happens entirely inside Git, and the end result is that Git writes out an ordinary everyday file for you to use. This ordinary file is what you will work on / with, in Xcode or any other editor and compiler system and so on. These ordinary files are in your working tree.
After you've extracted some commit, you'll do some work on it, by changing some or all of the files in your working tree, to achieve whatever result you wanted. This can include changing the buildSettings, editing Swift code, editing Objective-C code, and so on. You might add all-new files to the working tree, some of which you never commit at all (you can help make sure this never happens by listing such files in .gitignore).
Eventually, though, you'd like to commit the updated code. To do so, you must run git add, or maybe have your IDE run git add for you (perhaps Xcode has clicky buttons to do this). This invokes code in Git that converts the working tree file(s) back to internal blob objects if and as needed.
Again, normally this is all handled entirely inside of Git. Git will read the working tree file, maybe do CRLF-to-LF-only changes, compress the text, search for duplicate objects, and do all the other complicated things necessary to prepare the file, so that it is ready to be committed. The resulting data need not match what's in your working tree at all: it just has to be something that, when Git later goes to extract the file, produces what you will need in your working tree.
Clean and smudge filters
This is where these clean and smudge filters come in. I said, above, that normally, Git does the extraction and insertion all on its own. For binary files, the only thing Git does here is apply lossless compression.[1] For text files, Git can do CRLF/LF substitutions as well. But what if you'd like to do your own operations?
You can: Git will let you do whatever you want during the extract process with a smudge filter, and will let you do whatever you want during the compress process with a clean filter. The clean filter replaces the in-file data, using a stream-edit type process,[2] and then Git does its CRLF hacking if any and compressing on the "cleaned" data. The smudge filter replaces the decompressed, post-CRLF-hacking data coming out of Git with the data that should go into the working tree.
Hence you can write, as your clean filter, a sed script of the form:
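A minimal sketch of such a script, wrapped in a shell function so it can be tried on its own. The regular expression and the names CLEAN_SCRIPT and clean_devteam are assumptions based on the snippet above, not something Git or Xcode provides:

```shell
# The sed script itself: replace whatever team ID is present with the
# placeholder word DEVTEAMTEMPLATE, leaving the rest of the line alone.
CLEAN_SCRIPT='s/DEVELOPMENT_TEAM = [^;]*;/DEVELOPMENT_TEAM = DEVTEAMTEMPLATE;/'

# Hypothetical wrapper: filters standard input to standard output.
clean_devteam() {
    sed "$CLEAN_SCRIPT"
}
```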
With that as the entire sed script, what sed will do is edit the incoming data stream and replace any actual development team text with the word DEVTEAMTEMPLATE.
Your smudge filter has to work slightly harder: it must find the template line and adjust it so that it contains the correct team ID. Where will you get the correct team ID? That's up to you: perhaps you can store it in a file in your home directory, or in a file that you create in the working tree but never commit in Git. You'll have to write this one or two or however-many-liner sed and/or shell script yourself.
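One way such a smudge filter might look, assuming the team ID is kept on the first line of a file in your home directory. The file name ~/.xcode_team_id, the TEAM_ID_FILE override, and the function name smudge_devteam are all made up for this sketch:

```shell
# Hypothetical smudge filter: reads the committed text on standard input and
# writes it to standard output with the placeholder replaced by a local ID.
smudge_devteam() {
    team_file="${TEAM_ID_FILE:-$HOME/.xcode_team_id}"
    if [ -r "$team_file" ]; then
        # The first line of the file is taken to be the team ID.
        IFS= read -r team_id < "$team_file" || true
        sed "s/DEVELOPMENT_TEAM = DEVTEAMTEMPLATE;/DEVELOPMENT_TEAM = $team_id;/"
    else
        # No team ID configured: pass the data through untouched.
        cat
    fi
}
```

Note that it only reads standard input and never opens the project file by name.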
[1] There are multiple phases of compression; git add does just one, and git checkout undoes all—including reading from "pack" files—as needed. The deeper level of compression, using delta encoding techniques, is entirely invisible at the "object" level, so nobody ever really has to think about it.
[2] With the advent of Git-LFS, Git gained the ability to run long-lived filters. Before that, Git always used simple stream filtering. The stream filtering is easier to understand, but is less efficient for doing en-masse operations on many files. Here, we're only interested in one file per repository anyway, so there's no need to go into the fancier long-lived filter details.
Defining clean and smudge filters
The tricky part here, with Git, is that you must define the filters in one place—in $HOME/.gitconfig or .git/config, for instance—and then tell Git to invoke them from another place, using the .gitattributes file. This is described in the gitattributes documentation. This documentation is pretty thorough, so read it. You can ignore all the long-running filter discussion, as noted above. I will quote one bit from the documentation here for emphasis, though, and expound on it:
Note that "%f" is the name of the path that is being worked on. Depending on the version that is being filtered, the corresponding file on disk may not exist, or may have different contents. So, smudge and clean commands should not try to access the file on disk, but only act as filters on the content provided to them on standard input.
When Git is running the smudge filter, it:
has opened some internal object (which may or may not be packed);
has decompressed it, or is in the process of decompressing it, and pumped / is-pumping out the data; and
this data is being fed to your filter, but is not written out to any file anywhere.
Your filter can use %f to know the name of the target output file, but the data are not in that file yet. The data bytes are only in some OS-level pipes or sockets or whatever your OS uses for connecting the output of one program (Git's internal decompressors) to another (your filter). Your smudge filter must read its standard input to get the data, and write the smudged data to standard output so that Git can read it (if necessary) and/or redirect that output to the correct file. Do not attempt to open the file by name!
(The same holds for the clean filter, except that in many cases, the input to your filter is just the raw data already in the file, so that opening the file and reading it mostly works. So this can mislead you, if you do your tests using a clean filter.)
Note that you can implement this scheme without a clean filter at all: your smudge filter can replace whatever is in the committed file even if it's a real team ID, rather than just a template. If you choose to do this, however, you'll "see" the team ID changing every time a different team-ID commits the file. The nice thing about using the clean filter is that once the committed copies of the file use the template line, every future cleaned file also uses the template line, so that it never changes.
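The two-part setup described in this section (define the filter in the Git configuration, then attach it to a path in .gitattributes) might be wired up roughly as follows. This is a sketch under assumptions: the filter name devteam, the function name, and passing the team ID as an argument are all invented here, and the sed expressions are illustrative rather than taken from any documentation.

```shell
# Run from the top level of the repository.
# Usage: setup_devteam_filter TEAM_ID
setup_devteam_filter() {
    # Part 1: define the filter. This lands in .git/config, so every clone
    # has to do it once, each with its own team ID.
    git config filter.devteam.clean \
        "sed 's/DEVELOPMENT_TEAM = [^;]*;/DEVELOPMENT_TEAM = DEVTEAMTEMPLATE;/'"
    git config filter.devteam.smudge \
        "sed 's/DEVELOPMENT_TEAM = DEVTEAMTEMPLATE;/DEVELOPMENT_TEAM = $1;/'"
    # Part 2: tell Git which path the filter applies to. Unlike the
    # configuration above, .gitattributes can be committed and shared.
    echo 'MyProject.xcodeproj/project.pbxproj filter=devteam' >> .gitattributes
}
```

After this, git add stores the template line and git checkout writes the local team ID back into the working tree copy.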
Alternative: a template file
In general, it's unwise to commit actual configurations. Clean and smudge tricks can work, but they can only go so far: this particular file format works well because the change you want made is on a single line, and Git itself shows you file changes on a line-by-line basis, and sed works well with line-oriented input, and so on.
A lot of configuration files, though, wind up storing at least slightly-sensitive data, or perhaps very-sensitive data such as cleartext passwords. Such files should not be stored in Git at all if at all possible. Instead, you would store a template file in Git.
In this case, for instance, instead of storing MyProject.xcodeproj/project.pbxproj, you might have Git store MyProject.xcodeproj/project.pbxproj.template. This file would have template-ized contents. When you clone and check out the repository, you'd subsequently copy the template file into place and do any required adjustments.
Should the MyProject.xcodeproj/project.pbxproj file itself need to change, e.g., to acquire a new SWIFT_VERSION setting, you'd instead edit the template file, add that to Git, and commit. You would then use the usual "convert template to mine" process, or manually update the MyProject.xcodeproj/project.pbxproj file. Since this file is never committed—and is listed in .gitignore—it never goes into any commit and you never have to worry about collisions within it. Only the template file goes into Git.
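The "convert template to mine" step could be a small script along these lines. It assumes the template marks the team ID with the placeholder DEVTEAMTEMPLATE (borrowed from the filter discussion above); the function name and its arguments are hypothetical.

```shell
# Usage: instantiate_pbxproj TEAM_ID [PROJECT_FILE]
instantiate_pbxproj() {
    team_id="$1"
    target="${2:-MyProject.xcodeproj/project.pbxproj}"
    # Copy the committed template into place, filling in the local team ID.
    # The target itself is listed in .gitignore and never committed.
    sed "s/DEVTEAMTEMPLATE/$team_id/g" "$target.template" > "$target"
}
```

You would rerun it whenever a pull changes the template file.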

Is it possible to configure Compare to use the Merge tool instead of the Diff tool?

I often need to compare the current version of a file with some previous one, and copy bits from one to the other. I can get close with TFS Compare:
except that by default, Compare uses the Diff tool, which only lets me view the diffs, but not copy from one file to another:
TFS comes with a Merge tool, which does let me copy, so is there a way to use this for comparing, or some other way to invoke the Merge tool, specifying the current file and a previous version? (I know that I could set this up with command line arguments, but then I'd have to do a lot of manual work to pass the current filename, get the previous version from TFS into a temp file and pass that filename. I'm looking for something that's integrated into Solution Explorer, and works like Compare, so it's just a couple of clicks away.)
I love using WinMerge for both the operations. You can find instructions on how to do it here:

how to search for a term only in non test files

I use ack and I like it.
However, from time to time I need to search for something in my code base and I want to ignore all the files residing in the test directory. Basically, all the files which have test in their absolute path should not be included in the search.
How do I achieve that?
I am willing to have a custom bash script. Something like
ack_no_test "application" -> search for "application" in all files but ignore files residing in test directory
From man ack:
Ignore directory (as CVS, .svn, etc are ignored). May be used
multiple times to ignore multiple directories. For example, mason
users may wish to include --ignore-dir=data. The --noignore-dir
option allows users to search directories which would normally be
ignored (perhaps to research the contents of .svn/props
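Building on that option, the ack_no_test wrapper asked for in the question might be as small as this. It is only a sketch: --ignore-dir=test skips directories named exactly test, which may be narrower than "test anywhere in the path".

```shell
# Search with ack, skipping any directory named "test".
# All arguments are passed through to ack unchanged.
ack_no_test() {
    ack --ignore-dir=test "$@"
}
```

With this in your shell startup file, ack_no_test "application" searches for "application" everywhere except under test directories.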
one could add "test" to the "repodirs" var in findrepo.
Personally I think ack is too complicated/slow and "non unixy",
as it doesn't reuse the existing unix toolkit.

Include only certain file types when searching in Visual Studio

Often when I want to search through my code in Visual Studio, I know the thing I'm looking for is in some C# code. However, as I've used the same variable name in a JavaScript file, I have to wade through all those search results too. This gets even worse when the text I'm looking for is also used in a third-party JavaScript library that we've brought into the project: this can result in hundreds of search results.
To compound things, our designers include HTML mock-ups of the pages in the same project, so I often find I'm hitting loads of search results in there too.
I guess what I really want is to see results in my .cs, .aspx, and .ascx files, but not .js or .htm.
Is there any way to do any of the following:
Search only in files of a particular type (search only .cs files).
Search only in files of any of a given set of types (search only .cs, .aspx and .ascx files).
Search in all file types except a particular type or types (search everything except .js).
I suspect not, in which case is there any crafty way of working around this?
In the Find in Files dialog (Ctrl+Shift+F), there should be a section called Find Options. You should be able to enter the extensions of the files you want to search in a field in this dialog:
*.cs; *.aspx; *.ascx;
Instead of Ctrl + F, I think it is Ctrl + Shift + F that lets you specify the file types you wish to look into.
You can choose file types from the defaults or type your own. Regular expressions are available for complex searches.
Another way to limit file searches is by only choosing certain folder sets.
I like to exclude js files by using the following search:
Most of the times, I end up searching for stuff in aspx, cs, cshtml files so this is quite helpful.
Notice how I use *.cs* instead of *.c* since the latter would select jquery custom files such as jquery.custom1234.js (which I usually use in most of my projects), of course if you don't you could just use *.c*.
In the Find dialog box, go to "find options->Look at these file types".
Type in your own string, e.g., *.cs, *.aspx, *.ascx. Then click the "find all" button.