Visual Studio encoding problems - visual-studio

I have problems with file encoding in Visual Studio 2008. While compiling I'm getting errors like these:
When I try to open a file where a particular error occurs, an encoding window appears:
By default, auto-detect is set. When I change the encoding option to UTF-8, everything works. If I open each problematic file in my project using UTF-8 encoding, the project starts to compile. The problem is that I have too many files, and it is ridiculous to open each one and set its encoding to UTF-8. Is there any way to do this quickly?
My VS settings are:
I'm using Windows Server 2008 R2.
UPDATE:
For Hans Passant and Noah Richards: thanks for the interaction. I recently changed my operating system, so everything is fresh. I've also downloaded a fresh solution from source control.
In the OS regional settings I've changed the system locale to Polish (Poland):
In VS I've changed the international settings to be the same as Windows:
The problem is still not solved.
When I open some .cs files using auto-detection for the encoding and then check File -> Advanced Save Options..., some of these .cs files have codepage 1250:
but internally they look like this:
It is weird, because when I check the properties of those particular files in source control, they seem to have UTF-8 encoding set:
I don't understand this mismatch.
All other files have UTF-8 encoding:
and open correctly. I basically have no idea what is going wrong, because as far as I know my friend has the same options set as I do, and the same project compiles correctly for him. So far he has happily not encountered any encoding issues.

That uppercase A with circumflex tells me that the file is UTF-8 (if you look with a hex editor you will probably see that the bytes are C2 A0). That is a non-breaking space in UTF-8.
Visual Studio does not detect the encoding because (most likely) there are not enough high-ASCII characters in the file to help with a reliable detection.
Also, there is no BOM (Byte Order Mark). That would help with the detection (this is the "signature" in the "UTF-8 with signature" description).
What you can do: add a BOM to all the files that don't have one.
How to add one? Make a file containing only a BOM (empty file in Notepad, Save As, select UTF-8 as the encoding). It will be 3 bytes long (EF BB BF).
You can copy it to the beginning of each file that is missing the BOM:
copy /b/v BOM.txt + YourFile.cs YourFile_Ok.cs
ren YourFile.cs YourFile_Org.cs
ren YourFile_Ok.cs YourFile.cs
Make sure there is a + between the name of the BOM file and the name of the original file.
Try it on one or two files, and if it works you can create a batch file to do it for all of them.
Or write a small C# application (since you are a C# programmer) that can detect whether a file already has a BOM, so that you don't add it twice. Of course, you can do this in almost anything, from Perl to PowerShell to C++ :-)
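A minimal C# sketch of such a tool (my own illustration of the approach above, with the folder and the *.cs pattern as placeholders) could look like this; it only prepends the three BOM bytes when they are missing, so running it twice is harmless:

using System;
using System.IO;

class AddBom
{
    // The three bytes of the UTF-8 signature (EF BB BF).
    static readonly byte[] Utf8Bom = { 0xEF, 0xBB, 0xBF };

    static void Main(string[] args)
    {
        // args[0] is the folder to process, e.g. the solution directory.
        foreach (string path in Directory.GetFiles(args[0], "*.cs", SearchOption.AllDirectories))
        {
            byte[] content = File.ReadAllBytes(path);
            bool hasBom = content.Length >= 3 &&
                          content[0] == 0xEF && content[1] == 0xBB && content[2] == 0xBF;
            if (hasBom)
                continue; // already has a signature - don't add it twice

            byte[] withBom = new byte[Utf8Bom.Length + content.Length];
            Buffer.BlockCopy(Utf8Bom, 0, withBom, 0, Utf8Bom.Length);
            Buffer.BlockCopy(content, 0, withBom, Utf8Bom.Length, content.Length);
            File.WriteAllBytes(path, withBom);
            Console.WriteLine("Added BOM to " + path);
        }
    }
}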

Once you've opened the files in UTF-8 mode, can you try changing the Advanced Save Options for the file and saving it (as UTF-8 with signature, if you think these files should be UTF-8)?
The encoding auto-detection is best-effort, so it's likely that something in the file is causing it to be detected as something other than UTF-8, such as having only ASCII characters in the first kilobyte of the file, or having a BOM that indicates the file is something other than UTF-8. Re-saving the file as UTF-8 with signature should (hopefully) correct that.
If it continues happening after that, let me know, and we can try to track down what is causing them to be created/saved like that in the first place.
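One way to help track that down is to look at the first bytes of a suspect file directly. The following is only a diagnostic sketch (the file path comes from the command line); it dumps the first bytes in hex and says whether a UTF-8 signature is present:

using System;
using System.IO;

class DumpHead
{
    static void Main(string[] args)
    {
        // Read up to the first 16 bytes of the file given on the command line.
        byte[] head = new byte[16];
        int read;
        using (FileStream fs = File.OpenRead(args[0]))
            read = fs.Read(head, 0, head.Length);

        // Print them in hex, e.g. "EF-BB-BF-75-73-..." for a UTF-8 file with a signature.
        Console.WriteLine(BitConverter.ToString(head, 0, read));

        bool utf8Bom = read >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
        Console.WriteLine(utf8Bom
            ? "UTF-8 signature present"
            : "No UTF-8 signature - auto-detect has to guess");
    }
}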

Related

How to make ApprovalTests create UTF-8 files

I use Visual Studio 2019 and have added the ApprovalTests NuGet package. The test class is configured with [UseReporter(typeof(DiffReporter))] and approval is done with Approvals.Verify(result).
It works fine except for the file encoding. In VS I get two files opened. But I also get a warning: "These files have different encodings. Left file: Unicode (UTF-8) with signature. Right file: Western European (Windows). You can resolve the difference by saving the right file with the encoding Unicode (UTF-8) with signature."
I can obviously change the right file manually by saving it with a different encoding. That will make the comparison accept the result, but I will then have content with weird-looking escaping in both windows, which makes it much less readable. Example: a simple plus sign is exchanged with \u002B.
When debugging the code just before the approval, I can verify that the result looks good, with all characters as I expect them to look. So what is happening? My impression is that the ApprovalTests framework forces an encoding that I cannot control.

Android Studio warnings about gradle build files

How can I solve this problem? Warning: The project encoding (windows-1252) does not match the encoding specified in the Gradle build files (UTF-8). This can lead to serious bugs. More Info... Open File Encoding Settings
The answer may be in the link that you posted.
"When you encounter the above problem (which points to the this page), either change your IDE setting s or** build.gradle to UTF-8 such that the two matches**, or (if necessary) change your encoding to whatever custom encoding you have specified such that the two are in agreement.
(Note: If your source files contain more than plain ASCII characters, you can't "just" change the encoding to UTF-8. If your source files were written with a custom encoding, you'll need to convert them such that the actual characters are read in with the previous encoding and written out with the new encoding.)"
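For reference, a sketch of what the Gradle side of that typically looks like in an Android module's build.gradle (names and layout assumed; adjust to your project):

android {
    compileOptions {
        // Make the compiler's source encoding match the IDE's file encoding setting.
        encoding "UTF-8"
    }
}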

How to fix encoding of doxygen produced tex files?

I have an Eclipse CDT C project on a Windows machine with all files, including the Doxyfile, encoded as UTF-8.
UTF-8 has also been specified as the encoding within the Doxyfile.
Yet the LaTeX files produced are encoded in ISO-8859-1.
In fact, if I open a .tex file (with TeXworks), change the file encoding, save it and close it, when I re-open it the encoding is still marked as ISO-8859-1.
This means that UTF-8 symbols (such as \Delta) in the source make it through a doxygen build OK, but cause the PDF build to fail.
I'm not at all familiar with LaTeX, so I'm not sure where to even start searching on this one; Google queries to date have been fruitless. I'm also still not sure whether it is Doxygen, TeX or Windows that causes the .tex file encoding to be ISO-8859-1!
Thus it would be good to know: even though there is no specific option for setting the doxygen .tex output encoding, is it set to the same as the DOXYFILE_ENCODING setting?
Assuming that is the case, moving one of the .tex files from the project folder to the desktop and attempting the encoding change via TeXworks still fails to stick, which leads me to think either Windows or TeXworks is preventing the encoding from being UTF-8. My lack of knowledge of encodings and LaTeX has left me at a loose end here; any suggestions on what to try next?
Thanks
:\ I basically just ended up re-installing everything and making sure git ignored the .tex files and handled the PDF files separately from the code files, so that the encoding was forced. Not really a fix, but it builds.

Why does TortoiseHg think Resource.h is binary?

Using Visual Studio 2010. I have a resource.h file which TortoiseHg thinks is binary so it won't display a diff for it in the commit window. I can easily open the file in a text editor and see that it is plain text.
I saw a related question (Why does Mercurial think my SQL files are binary?) which suggests it has to do with file encoding. Indeed opening the file in Notepad++ says the file is in "UCS-2 Little Endian". How can I fix this? I, obviously, don't want to break some Visual Studio expectation.
For display purposes only, Mercurial treats all files containing NUL bytes as binary, due to a long-standing UNIX convention. This is just about always right... except for UTF-16 (formerly known as UCS-2), where half your file is NUL bytes!
Internally, Mercurial treats all files as binary all the time, so this issue is only relevant for things like whether or not we try to display diffs.
So you have two options:
ignore it, Mercurial will work just fine
use an encoding other than UTF-16
Some web searches for "resource.h utf-16" suggest that VS2010 will be just fine if you save this file in UTF-8 or ASCII, which are perfectly fine choices for C source code.
http://social.msdn.microsoft.com/Forums/en/vssetup/thread/aff0f96d-16e3-4801-a7a2-5032803c8d83
Try explicitly converting / changing the encoding to UTF-8 / ASCII and see. You can do that from Notepad++'s Encoding menu (choose Encode in UTF-8).
Visual Studio will work with the UTF-8 file just fine.
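If you would rather script the conversion than click through Notepad++, a small C# sketch along these lines (the path is a placeholder) re-saves a UTF-16 resource.h as UTF-8 with a signature:

using System.IO;
using System.Text;

class ConvertToUtf8
{
    static void Main(string[] args)
    {
        string path = args.Length > 0 ? args[0] : "resource.h"; // placeholder path
        // Read as UTF-16 LE ("UCS-2 Little Endian"); a byte order mark is honored if present.
        string text = File.ReadAllText(path, Encoding.Unicode);
        // Write back as UTF-8 with a BOM so editors still detect the encoding reliably.
        File.WriteAllText(path, text, new UTF8Encoding(true));
    }
}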

Why is mercurial (hg) treating my Visual Studio solutions (.sln) as binary?

I get the message "File or diffs not displayed: File is binary."
Why is mercurial (hg) treating my visual studio solutions (.sln) as binary?
And how do I stop it?
Thanks
I tried this out on one of my projects and the sln file was treated as a text file. Check if your sln file is in a different encoding like UTF-16. Otherwise, Hg should not be treating it as binary. Try explicitly converting / changing the encoding to UTF-8 / ASCII and see.
For actual storage Mercurial treats all files as binary. It never does line conversions or anything else that requires considering things as text or knowing the file's encoding.
However, at the UI level (separate from the storage level) it will try to avoid filling your screen with binary gookus, and to do that it uses a simple test -- a file won't be displayed in diffs if it has one or more NUL (0x00) characters in it.
So your .sln file must have a 0x00 somewhere in it. The most common cause is misbehaving editors putting a Byte Order Mark (BOM) at the front of the file.
If you can remove the NUL Mercurial will display the file contents, and if you can't I think you're out of luck.
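If you want to confirm that diagnosis, a small check like the sketch below (mirroring Mercurial's NUL-byte test) reports whether the .sln actually contains a 0x00 byte and where:

using System;
using System.IO;

class FindNul
{
    static void Main(string[] args)
    {
        // Same rough test Mercurial's UI uses: any 0x00 byte means "binary".
        byte[] bytes = File.ReadAllBytes(args[0]);
        int index = Array.IndexOf(bytes, (byte)0);
        Console.WriteLine(index >= 0
            ? "First NUL byte at offset " + index + " - the diff view will call this binary"
            : "No NUL bytes - Mercurial should show diffs for this file");
    }
}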
