I'm dealing with LF/CRLF issues in a git repository and reading git's documentation to try to understand what I need to do.
One part of the documentation is confusing to me: they write here:
...many editors on Windows silently replace existing LF-style line endings with CRLF, or insert both line-ending characters when the user hits the enter key. Git can handle this by auto-converting CRLF line endings into LF when you add a file to the index, and vice versa when it checks out code onto your filesystem. You can turn on this functionality with the core.autocrlf setting. If you’re on a Windows machine, set it to true — this converts LF endings into CRLF when you check out code.
What I don't understand is: if I'm using Windows, why would I care to convert line endings from LF to CRLF? Would it be because my editor doesn't recognize LF line endings and thus shows all the code in a file as being one line? If it's that, then it seems that if I'm using an editor that does recognize such LF line endings and shows the code correctly even when the file is using LF line endings, then I wouldn't need to do the LF-to-CRLF conversion, right?
If your editor supports LF and you don't care about other people that might want to contribute to your repository then yes, for the most part, you don't need the conversion.
Any decent code editor from the last 20 years should handle LF files. Notepad, on the other hand, supported only CRLF for a very long time; this was fixed in Windows 10 version 1809.
There may still be some command-line tools that choke on LF. Perhaps the biggest issue is the reverse case: command-line tools that fopen files in text mode using the Microsoft C run-time will write CRLF even when their code uses \n.
In the end I suppose it is a matter of preference and where you want potential conversion errors to occur; in the git auto-conversion or in tools used to parse/process the files.
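As a rough mental model of what core.autocrlf=true does, here is a conceptual sketch in Python. It is not git's actual implementation, the function names to_index and to_worktree are made up for illustration, and it assumes the file is text (git leaves files it detects as binary alone).

```python
def to_index(worktree_bytes: bytes) -> bytes:
    """Roughly what happens on `git add`: CRLF becomes LF."""
    return worktree_bytes.replace(b"\r\n", b"\n")


def to_worktree(index_bytes: bytes) -> bytes:
    """Roughly what happens on checkout: LF becomes CRLF.

    Assumes the stored content contains only LF line endings.
    """
    return index_bytes.replace(b"\n", b"\r\n")


edited_on_windows = b"int main(void)\r\n{\r\n}\r\n"
stored = to_index(edited_on_windows)
checked_out = to_worktree(stored)
```

With this model the repository only ever sees LF, and the conversion back to CRLF exists purely for the benefit of Windows tools. If none of your tools need CRLF, the checkout half of the round trip buys you nothing.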
After training and working almost exclusively under gnu-linux, I came across a multiplatform project I'll have to work on.
I was quite disappointed to find that the repo contains a mix of LF and CRLF line endings. I noticed that most CRLF endings were introduced by two committers, and I ran dos2unix on the files involved.
However, I also noticed that all CMakeLists.txt files were CRLF-terminated even if they came from commits by the main developer (who had committed all other sources with LF line endings), so for the moment I left them alone.
I'm not very familiar with windows or multiplatform development, nor with cmake and I was wondering: is there any convention? Or, is there any cmake-specific tool that generates CRLF even on gnu-linux?
If it may matter, the repo is at https://github.com/glipari/rtlib2.0
There are CMake rules:
Newlines may be encoded as either \n or \r\n but will be converted to \n as input files are read.
https://cmake.org/cmake/help/v3.3/manual/cmake-language.7.html#encoding
All projects I know use \n and ask their developers to configure their editors accordingly, much as they do for trailing whitespace and the use of tabs.
If you are using Git you can ask your fellow developers to use a Git commit hook that checks for line breaks. If they are not following your convention, the commit is rejected (unless the developer disables it).
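A minimal sketch of such a hook, written in Python. The helper names (staged_paths, staged_blob, find_crlf, main) are made up for illustration, and the policy of skipping anything that contains a NUL byte is an assumption, not a git rule. It relies only on `git diff --cached --name-only --diff-filter=ACM -z` and `git show :<path>`.

```python
import subprocess
import sys


def staged_paths():
    """Paths that are added, copied, or modified in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM", "-z"],
        check=True, capture_output=True,
    ).stdout
    return [p.decode("utf-8", "surrogateescape") for p in out.split(b"\0") if p]


def staged_blob(path):
    """Content of `path` exactly as it is staged in the index."""
    return subprocess.run(
        ["git", "show", ":" + path], check=True, capture_output=True
    ).stdout


def find_crlf(blobs):
    """Return the sorted paths whose content contains a CRLF pair.

    Blobs containing NUL bytes are treated as binary and skipped.
    """
    return sorted(
        path for path, data in blobs.items()
        if b"\0" not in data and b"\r\n" in data
    )


def main():
    blobs = {path: staged_blob(path) for path in staged_paths()}
    offenders = find_crlf(blobs)
    for path in offenders:
        print(f"CRLF line endings in {path}", file=sys.stderr)
    return 1 if offenders else 0
```

To use it, the script would be saved as .git/hooks/pre-commit, made executable, and ended with a line such as `raise SystemExit(main())`. A non-zero exit status makes git reject the commit, and a developer can still bypass it with `git commit --no-verify`.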
Using Visual Studio 2010. I have a resource.h file which TortoiseHg thinks is binary so it won't display a diff for it in the commit window. I can easily open the file in a text editor and see that it is plain text.
I saw a related question (Why does Mercurial think my SQL files are binary?) which suggests it has to do with file encoding. Indeed opening the file in Notepad++ says the file is in "UCS-2 Little Endian". How can I fix this? I, obviously, don't want to break some Visual Studio expectation.
For display purposes only, Mercurial treats all files containing NUL bytes as binary, following long-standing UNIX convention. This is just about always right, except for UTF-16 (the successor to UCS-2), where roughly half the bytes of a mostly-ASCII file are NUL!
Internally, Mercurial treats all files as binary all the time, so this issue is only relevant for things like whether or not we try to display diffs.
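The NUL-byte heuristic is easy to reproduce. The sketch below is an illustration in Python rather than Mercurial's own code, and the `#define` line is a made-up example of what a resource.h might contain.

```python
def looks_binary(data: bytes) -> bool:
    """Treat any content containing a NUL byte as binary."""
    return b"\0" in data


line = "#define IDD_DIALOG1 101\r\n"  # hypothetical resource.h content

as_utf16 = line.encode("utf-16-le")  # every ASCII character gets a NUL high byte
as_utf8 = line.encode("utf-8")       # no NUL bytes for ordinary text

print(looks_binary(as_utf16), looks_binary(as_utf8))
```

The UTF-16 version is flagged and the UTF-8 version is not, even though both hold the same text.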
So you have two options:
ignore it, Mercurial will work just fine
use an encoding other than UTF-16
Some web searches for "resource.h utf-16" suggest that VS2010 will be just fine if you save this file in UTF-8 or ASCII, which should be perfectly fine choices for C source code.
http://social.msdn.microsoft.com/Forums/en/vssetup/thread/aff0f96d-16e3-4801-a7a2-5032803c8d83
Try explicitly converting / changing the encoding to UTF-8 / ASCII and see. You can do that from Notepad++'s Encoding menu ( choose Encode in UTF-8)
Visual Studio will work with the UTF-8 file just fine.
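If you would rather script the conversion than click through an editor, something like the following Python sketch should do. The function names are made up for illustration, and it assumes the file starts with a byte-order mark, which is what lets the "utf-16" codec pick the right byte order.

```python
from pathlib import Path


def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 (byte order taken from the BOM) and re-encode as UTF-8.

    Line endings are left untouched because everything stays in bytes.
    """
    return data.decode("utf-16").encode("utf-8")


def convert_file(path):
    """Rewrite one file in place; keep a backup before trying this for real."""
    p = Path(path)
    p.write_bytes(utf16_to_utf8(p.read_bytes()))
```

Calling `convert_file("resource.h")` once and committing the result should be enough for the diff to show up as text from then on.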
I am using luadoc and running it on both Unix and Windows. Unfortunately, the output differs between the two systems because of the DOS/Unix line endings, and this really confuses my source control (Mercurial), which thinks every file has changed. How can I make Lua use one or the other?
I know nothing about Lua, but you might want to solve this at the SCM level: Mercurial has the EolExtension for that.
That being said, you're probably missing some feature of luadoc.
Mercurial FAQ 7.4
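Another workaround, independent of luadoc and Mercurial, is to normalize the generated files as a post-processing step. Below is a minimal Python sketch; the function name and the assumption that the generated documentation consists of .html files are mine, so adjust the suffix list to whatever your output actually contains.

```python
from pathlib import Path


def normalize_tree_to_lf(root, suffixes=(".html",)):
    """Rewrite matching files under `root` with LF endings.

    Returns the list of paths that were actually changed.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        data = path.read_bytes()  # bytes, so no implicit newline translation
        fixed = data.replace(b"\r\n", b"\n")
        if fixed != data:
            path.write_bytes(fixed)
            changed.append(path)
    return changed
```

Running it over the documentation directory after each luadoc run on Windows should make the output byte-identical to the Unix output, as far as line endings go.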
I was looking at an open source Mac application, and they gave some suggested values for .gitignore. They were what I would expect...
However, they also suggested an entry into a .gitattributes file:
*.pbxproj -crlf -diff -merge
I'm not the most knowledgeable in terms of git, so I was wondering: what exactly are the benefits of adding this line? What does it do in particular? I've only seen this suggested in this one project, and if it were normal practice I would have expected to see it elsewhere by now. So I was curious about how it applies to the pbxproj file specifically.
The pbxproj file isn't really human-mergeable. While it is plain ASCII text, it is a machine-generated, JSON-like structured format (an old-style property list). Essentially you want to treat it as a binary file.
Here's what the individual flags do:
-crlf: don't do CRLF <=> LF conversion (this is the older name for -text)
-diff: do not diff the file
-merge: do not attempt to merge the file
From the Pro Git book by Scott Chacon
Some files look like text files but
for all intents and purposes are to be
treated as binary data. For instance,
Xcode projects on the Mac contain a
file that ends in .pbxproj, which is
basically a JSON (plain text
javascript data format) dataset
written out to disk by the IDE that
records your build settings and so on.
Although it’s technically a text file,
because it’s all ASCII, you don’t want
to treat it as such because it’s
really a lightweight database — you
can’t merge the contents if two people
changed it, and diffs generally aren’t
helpful. The file is meant to be
consumed by a machine. In essence, you
want to treat it like a binary file.
A diff is oftentimes useful at commit time to check what has been changed. So I find it useful to keep the diffing ability but just prevent merging. So I use this in my .gitattributes file:
*.pbxproj -crlf -merge
On another note, has anybody tried using merge=union for pbxproj files? See: Should I merge .pbxproj files with git using merge=union?
I ran into corrupted *.pbxproj files after resolving merge conflicts manually. Or, more often, my files just 'disappeared' from the working tree after the merge. It drove me mad because we work in a team, so you can imagine how messy it can become very fast.
So, I have tested merge=union and it works well so far. I know that it can't help if files were deleted or renamed at the same time, but for adding new files it works as expected: there are no conflicts and files don't disappear after the merge. It also saves quite a bit of time.
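To get a feel for why union merging suits the "both sides added a file" case, here is a toy model in Python. It is a deliberately limited sketch and not git's merge driver: it only handles pure insertions relative to the common base, it raises an error for anything else, and the function names are made up.

```python
from difflib import SequenceMatcher


def insertions(base, side):
    """Map each gap index in `base` to the lines `side` inserted there."""
    added = {}
    matcher = SequenceMatcher(None, base, side, autojunk=False)
    for tag, i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            added[i1] = side[j1:j2]
        elif tag != "equal":
            raise ValueError("toy merge only handles pure insertions")
    return added


def union_merge(base, ours, theirs):
    """Keep the lines added by both sides instead of reporting a conflict."""
    ours_added = insertions(base, ours)
    theirs_added = insertions(base, theirs)
    merged = []
    for gap in range(len(base) + 1):
        mine = ours_added.get(gap, [])
        yours = theirs_added.get(gap, [])
        merged.extend(mine)
        if yours != mine:  # identical additions are kept only once
            merged.extend(yours)
        if gap < len(base):
            merged.append(base[gap])
    return merged


base = ["A.m in Sources", "B.m in Sources"]
ours = ["A.m in Sources", "X.m in Sources", "B.m in Sources"]
theirs = ["A.m in Sources", "Y.m in Sources", "B.m in Sources"]
result = union_merge(base, ours, theirs)
```

Both added entries survive in `result`, which is the behaviour described above. The same property is also why union merging cannot help with deletions or renames: it never decides that a line should go away.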
If you want to try it out, here is what I did.
1) Create a global .gitattributes file. Run in terminal:
touch ~/.gitattributes
git config --global core.attributesfile ~/.gitattributes
2) This command should open it in a text editor:
open ~/.gitattributes
3) When the file opens, add this line and save the file:
*.pbxproj binary merge=union
Done. Hope this will help new readers like it helped me.
I wrote a Python script named xUnique to solve this merge-conflict problem.
The script does the following:
Replaces every 24-character UUID with a project-wide unique 32-character MD5 digest and removes any unused UUIDs (usually left over from a careless earlier merge). This prevents duplicate UUIDs, which arise because different machines and Xcode installations generate different UUIDs in this file. Xcode recognizes the new IDs and the project can still be opened. Invalid lines in the project file are removed during this process.
Sorts the project file. I wrote a Python version of sort-Xcode-project-file from the WebKit team, with some new features:
support for sorting the PBXFileReference and PBXBuildFile sections
removal of duplicated files/refs
no new file is written if nothing changed, which means fewer commits after using this script
For more details and updates on xUnique, please refer to the README.
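The core idea of replacing machine-specific UUIDs with content-derived digests can be sketched in a few lines of Python. This is only an illustration of the principle: the function name stable_id and the choice of what goes into the hashed key are assumptions, not xUnique's actual scheme.

```python
import hashlib


def stable_id(*parts: str) -> str:
    """Derive a 32-character hex ID from stable facts about an object.

    The same inputs give the same ID on every machine, unlike a
    freshly generated UUID.
    """
    key = "/".join(parts)
    return hashlib.md5(key.encode("utf-8")).hexdigest().upper()


# Hypothetical inputs: project name, object type, and path within the project.
file_ref = stable_id("MyApp", "PBXFileReference", "Sources/AppDelegate.m")
```

Because two developers adding the same file end up with the same ID, their changes no longer differ merely because of the identifier.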
When I open a file in Eclipse, it is displayed with an extra line break between every line. When I open the same file with Notepad or WordPad, those extra line breaks are not shown; only Eclipse shows them. How do I get Eclipse to read these files the way Notepad and WordPad do, without the extra line breaks?
-edit: I don't have this problem with all files but only a select few where I have made local changes > uploaded them to our sun station > then pulled those files back to my local workstation for future modifications.
Eclipse should have a File -> Convert Line Delimiters To... option that may correct this for you. (If it doesn't work on your file, this article may help.)
Really, though, you should have your file transfer program treat your source files as ascii instead of binary. Then your line ending problem should be moot.
Is it possible that the server (or something in between) is turning each CR+LF pair into a separate CR and LF?
Try specifically setting the Text File Encoding (Window->Preferences->General->Workspace), or alternatively use File->Convert Line Delimiters To->Windows every time you get the latest version (I know, not ideal).
It turns out that the problem was solved by doing my FTP transfers in binary mode only and setting the Eclipse encoding to US-ASCII. I don't fully understand why this fixed the problem, but it worked. Thanks for the two answers; they both led me to my solution.
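One plausible explanation, which this thread does not confirm, is that a text-mode transfer converted the LF in each existing CRLF pair again, leaving CR CR LF on disk. Some editors then show the stray CR as an extra line break while others ignore it. A small Python sketch to check a file for that pattern and repair it (the function names are made up for illustration):

```python
def doubled_cr_count(data: bytes) -> int:
    """Count CR CR LF sequences, a typical sign of a double conversion."""
    return data.count(b"\r\r\n")


def repair_doubled_cr(data: bytes) -> bytes:
    """Collapse each CR CR LF back into a single CR LF."""
    return data.replace(b"\r\r\n", b"\r\n")


damaged = b"line one\r\r\nline two\r\r\n"
fixed = repair_doubled_cr(damaged)
```

Reading the file with `open(path, "rb")` and passing the bytes to `doubled_cr_count` is enough to tell whether this is what happened. Transferring in binary mode avoids the problem because the bytes are then copied unchanged.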