Visual Studio Extensibility: Get encoding of ProjectItem/TextDocument - visual-studio-2010

If I have a ProjectItem (MSDN), how do I get its encoding as detected by Visual Studio?
I want to get the same result as shown in this dialog:

I'm not sure you can get it directly from the ProjectItem. I believe you need to wait until the document is actually open, because the encoding is detected at that point. Once it is open you should have an ITextBuffer and can get the Encoding this way:
ITextDocumentFactoryService factoryService = ...;
ITextBuffer textBuffer = ...;
ITextDocument textDocument;
if (factoryService.TryGetTextDocument(textBuffer, out textDocument)) {
    Encoding encoding = textDocument.Encoding;
    ...
}

Related

Reading and writing Windows "tags" with Python 3

In Windows, image files can be tagged. These tags can be viewed and edited by right-clicking a file, switching to the Details tab, and then clicking the Tags property value cell.
I want to be able to read and write these tags using Python 3.
This is not EXIF data so EXIF solutions won't work. I believe it's part of the Windows Property System, but I can't find a reference in Dev Center. I looked into win32com.propsys and couldn't see anything in there either.
I wrote a program that does this once before, but I've since lost it, so I know it's possible. Previously I did it without pywin32, but any solution would be great. I think I used windll, but I can't remember.
Here is some sample code that's using the IPropertyStore interface through propsys:
import pythoncom
from win32com.propsys import propsys
from win32com.shell import shellcon
# get PROPERTYKEY for "System.Keywords"
pk = propsys.PSGetPropertyKeyFromName("System.Keywords")
# get property store for a given shell item (here a file)
ps = propsys.SHGetPropertyStoreFromParsingName("c:\\path\\myfile.jpg", None, shellcon.GPS_READWRITE, propsys.IID_IPropertyStore)
# read & print existing (or not) property value, System.Keywords type is an array of string
keywords = ps.GetValue(pk).GetValue()
print(keywords)
# build an array of string type PROPVARIANT
newValue = propsys.PROPVARIANTType(["hello", "world"], pythoncom.VT_VECTOR | pythoncom.VT_BSTR)
# write property
ps.SetValue(pk, newValue)
ps.Commit()
This code is pretty generic for any Windows property.
I'm using System.Keywords because that's what corresponds to jpeg's "tags" property that you see in the property sheet.
The code works for reading properties (GetValue) from jpeg and other formats, but not all Windows codecs support writing properties (SetValue), so it doesn't work for writing extended properties back to a .png, for example.

How to cancel the binding of option key in OSX?

I know that OS X binds special characters to the Option key, such as Option+y = ¥ and Option+t = þ. Now I want to cancel all these bindings and use Option+character as shortcuts in JetBrains IDEs and other apps.
I have searched for hours and still haven't found a useful way. Can anyone share a tool or the correct way to do this? I tried adding a Library/KeyBindings/DefaultKeyBinding.dict file, but it only cancelled some of the key bindings, not all of them.
Here is my DefaultKeyBinding.dict file:
{
"~d" = "deleteWordForward:";
"^w" = "deleteWordBackward:";
"~f" = "moveWordForward:";
"~b" = "moveWordBackward:";
"~u" = "pageUp:";
}
Now Option+f/b can be used as shortcuts in my JetBrains IDEs, but Option+u still prints a special character.
I have found a way: if I select Unicode Hex Input as my input source under System Preferences -> Keyboard -> Input Sources, all the bindings disappear.

VS2010 save array/collection data to a file while debugging

Is there some way to save array/list/collection data to a file while debugging in VS2010?
For example, in this code:
var addressGraphs = from a in context.Addresses
                    where a.CountryRegion == "Canada"
                    select new { a, a.Contact };
foreach (var ag in addressGraphs) {
    Console.WriteLine("LastName: {0}, Addresses: {1}", ag.Contact.LastName.Trim(),
        ag.Contact.Addresses.Count());
    foreach (var Address in ag.Contact.Addresses) {
        Console.WriteLine("...{0} {1}", Address.Street1, Address.City);
    }
}
I'd like to set a breakpoint on the first 'foreach' line and then save the data in 'addressGraphs' to a file,
where 'a' contains fields such as:
int addressID
string Street1
string City
<Etc.>
and 'Contact' contains fields such as:
string FirstName
string LastName
int contactID
<Etc.>
I'd like the file to contain the values of each of the fields for each item in the collection.
I don't see an obvious way to do this. Is it possible?
When your breakpoint is hit, open up the Immediate window and use Tools.LogCommandWindowOutput to dump the output to a file:
>Tools.LogCommandWindowOutput c:\temp\temp.log
?addressGraphs
>Tools.LogCommandWindowOutput /off
Note: You can use Log which is an alias for Tools.LogCommandWindowOutput
Update:
The > character is important. Also, the log alias is case sensitive.
I also encountered this problem, but in VS2013. I had to save the contents of an array while debugging.
For example, to save the contents of a double array named "trimmedInput", I do the following:
Open the QuickWatch window from the Debug menu (Ctrl+D, Q).
Put your variable in Expression and press the Recalculate button.
You'll see all the values. Now you can select them all (Ctrl+A) and copy them (Ctrl+C).
Paste them (Ctrl+V) into your favorite editor, Notepad for example, and use them.
That's the simplest way I know, with no additional effort. I hope this description helps!
P.S. Sorry for the non-English interface in the screenshots. All the necessary information is written in the text.
Something similar is possible with this method:
I built an extension method that I use in all of my projects that is a general and more powerful ToString() method that shows the content of any object.
I included the source code in this link:
https://rapidshare.com/files/1791655092/FormatExtensions.cs
UPDATE:
You just have to put FormatExtensions.cs in your project and change the namespace of FormatExtensions to match the base namespace of your project. Then, when you are at your breakpoint, you can type this in the Watch window:
myCustomCollection.ToStringExtended()
Then copy the output wherever you want.
In the Visual Studio Gallery, search for: Object Exporter Extension.
Be aware: in my experience it has a bug that occasionally blocks you from exporting an object.
You can also call methods in the Immediate Window, and so I think your best bet would be to use an ObjectDumper object, like the one in the LINQ samples or this one, and then write something like this in the Immediate Window:
File.WriteAllText("myFileName.txt", ObjectDumper.Dump(addressGraphs));
Depending on which ObjectDumper you decide to use, you may be able to customize it to suit your needs, and to be able to tell it how many levels deep you want it to dig into your object when it's dumping it.
Here's a solution that takes care of collections. It's a VS visualizer that will display the collection values in a grid while debugging as well as save to the clipboard and csv, xml and text files. I'm using it in VS2010 Ultimate. While I haven't tested it extensively, I have tried it on List and Dictionary.
http://tinyurl.com/87sf6l7
It handles the following collections:
•System.Collections classes
◦System.Collections.ArrayList
◦System.Collections.BitArray
◦System.Collections.Hashtable
◦System.Collections.Queue
◦System.Collections.SortedList
◦System.Collections.Stack
◦All classes derived from System.Collections.CollectionBase
•System.Collections.Specialized classes
◦System.Collections.Specialized.HybridDictionary
◦System.Collections.Specialized.ListDictionary
◦System.Collections.Specialized.NameValueCollection
◦System.Collections.Specialized.OrderedDictionary
◦System.Collections.Specialized.StringCollection
◦System.Collections.Specialized.StringDictionary
◦All classes derived from System.Collections.Specialized.NameObjectCollectionBase
•System.Collections.Generic classes
◦System.Collections.Generic.Dictionary
◦System.Collections.Generic.List
◦System.Collections.Generic.LinkedList
◦System.Collections.Generic.Queue
◦System.Collections.Generic.SortedDictionary
◦System.Collections.Generic.SortedList
◦System.Collections.Generic.Stack
•IIS classes, as used by
◦System.Web.HttpRequest.Cookies
◦System.Web.HttpRequest.Files
◦System.Web.HttpRequest.Form
◦System.Web.HttpRequest.Headers
◦System.Web.HttpRequest.Params
◦System.Web.HttpRequest.QueryString
◦System.Web.HttpRequest.ServerVariables
◦System.Web.HttpResponse.Cookies
As well as a couple of VB6-compatible collections
In the Immediate Window, enter the following to get a hex dump of a byte array:
byte[] myArray = { 0x02, 0x01, 0x81, 0x00, 0x05, 0xF6, 0x05, 0x02, 0x01, 0x01, 0x00, 0xBA };
myArray
    .Select(b => string.Format("{0:X2}", b))
    .Aggregate((s1, s2) => s1 + s2)
This will print something like:
0201810005F60502010100BA
Change the '.Aggregate(...)' call to add blanks between bytes, or whatever separator you like.
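For comparison, the same format-then-join idea can be sketched outside the debugger. The Python below is only an illustration of the technique (two uppercase hex digits per byte, joined with an optional separator). The helper name hex_dump is hypothetical and is not part of Visual Studio or .NET.

```python
def hex_dump(data: bytes, sep: str = "") -> str:
    """Format each byte as two uppercase hex digits, joined by sep."""
    return sep.join(f"{b:02X}" for b in data)


# Same sample bytes as in the Immediate Window example above.
my_array = bytes([0x02, 0x01, 0x81, 0x00, 0x05, 0xF6,
                  0x05, 0x02, 0x01, 0x01, 0x00, 0xBA])

print(hex_dump(my_array))       # 0201810005F60502010100BA
print(hex_dump(my_array, " "))  # 02 01 81 00 05 F6 05 02 01 01 00 BA
```

Passing a separator plays the same role as changing the Aggregate lambda to `s1 + " " + s2`.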

Silverlight: Encoding a webClient stream

I've been trying to get this to work, but I'm very frustrated at this point. I am a beginner in this field, so maybe I'm just making mistakes.
What I need to do is to take in a website .html and store it into a txt file. Now the problem is that this website is in Russian (encoding windows-1251) and Silverlight only supports 3 encodings. So in order to bypass that limitation, I got my hands on an encoding class that transfers the stream into a byte array and then tries to pull the correctly encoded string from the text. The problem with this is that
1) I try to ensure that webClient receives a Unicode-encoded stream, because the other encodings do not seem to produce a retrievable string, but it still doesn't seem to work.
WebClient wc = new WebClient();
wc.Encoding = System.Text.Encoding.Unicode;
wc.DownloadStringCompleted += new DownloadStringCompletedEventHandler(wc_LoadCompleted);
wc.DownloadStringAsync(new Uri(site));
2) I fear that when I store the html into a txt file using streamWriter, the encoding is, yet again, somehow screwed up.
3) The encoding class is not doing its job.
Encoding rus = Encoding.GetEncoding(1251);
Encoding eng = Encoding.Unicode;
byte[] bytes = rus.GetBytes(string);
textBlock1.Text = eng.GetString(bytes);
Can anyone offer any help on this matter? This is a huge detriment to my project. Thanks in advance.
Since you want to handle an encoding that is alien to Silverlight, you should start by downloading with OpenReadAsync and the OpenReadCompleted event rather than DownloadStringAsync.
You should then be able to take the Stream provided by the Result property of the event args and supply it directly to the encoding component you have acquired, which generates the correct string result.
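The reasoning behind this advice is that the raw bytes must be decoded exactly once, with the page's real encoding. Re-encoding a string that was already decoded with the wrong encoding cannot reliably recover the text. Here is a minimal sketch of that idea in Python (not Silverlight). The function name decode_page and the sample text are assumptions made for illustration.

```python
def decode_page(raw: bytes, encoding: str = "cp1251") -> str:
    """Decode the bytes exactly as downloaded, using the page's real encoding."""
    return raw.decode(encoding)


original = "Привет, мир"
raw = original.encode("cp1251")  # stands in for the bytes the server sends

# Correct: decode the raw bytes with the source encoding.
print(decode_page(raw) == original)  # True

# Incorrect: interpreting the same bytes as UTF-16 first yields mojibake,
# and the original text can no longer be reliably recovered from it.
wrong = raw.decode("utf-16-le", errors="replace")
print(wrong == original)  # False
```

In the Silverlight case the equivalent step is handing the downloaded Stream, not an already-decoded string, to the windows-1251 decoder.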

SSIS - Flat file always ANSI never UTF-8 encoded

I have a pretty straightforward SSIS package:
OLE DB Source to get data via a view, (all string columns in db table nvarchar or nchar).
Derived Column to format existing date and add it on to the dataset, (data type DT_WSTR).
Multicast task to split the dataset between:
OLE DB Command to update rows as "processed".
Flat file destination - the connection manager of which is set to Code Page 65001 UTF-8 and Unicode is unchecked. All string columns map to DT_WSTR.
Every time I run this package and open the flat file in Notepad++, it's ANSI, never UTF-8. If I check the Unicode option, the file is UCS-2 Little Endian.
Am I doing something wrong - how can I get the flat file to be UTF-8 encoded?
Thanks
In Source -> Advanced Editor -> Component Properties:
Set DefaultCodePage to 65001
Set AlwaysUseDefaultCodePage to True
Then in Source -> Advanced Editor -> Input and Output Properties:
Check each column in External Columns and Output Columns and set CodePage to 65001 wherever possible.
That's it.
By the way, Excel cannot define the data inside the file to be UTF-8. Excel is just a file handler. You can also create CSV files using Notepad; as long as you fill the CSV file with UTF-8 data you should be fine.
Adding explanation to the answers ...
Setting the CodePage to 65001 (but NOT checking the Unicode checkbox on the file source) should generate a UTF-8 file. (Yes, the data types internally should also be nvarchar, etc.)
But the file that SSIS produces does not have a BOM (Byte Order Mark) header, so some programs will assume it is still ASCII, not UTF-8. I've seen this confirmed by MS employees on MSDN, and I have confirmed it by testing.
The file append solution is a way around this - by creating a blank file WITH the proper BOM, and then appending data from SSIS, the BOM header remains in place. If you tell SSIS to overwrite the file, it also loses the BOM.
Thanks for the hints here, it helped me figure out the above detail.
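The missing-BOM behaviour described above is easy to check from a script. The following is a minimal Python sketch, not SSIS code. The helper has_utf8_bom and the temporary file names are hypothetical. It relies on Python's "utf-8-sig" codec, which writes UTF-8 with a leading BOM, whereas plain "utf-8" writes none.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path: str) -> bool:
    """Return True if the file starts with the UTF-8 byte order mark EF BB BF."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


tmp_dir = tempfile.mkdtemp()
plain = os.path.join(tmp_dir, "plain.csv")
with_bom = os.path.join(tmp_dir, "with_bom.csv")

# "utf-8" writes no BOM (the situation the answers describe for SSIS output);
# "utf-8-sig" writes the same bytes preceded by the BOM.
with open(plain, "w", encoding="utf-8", newline="") as f:
    f.write("id,name\r\n1,Zoë\r\n")
with open(with_bom, "w", encoding="utf-8-sig", newline="") as f:
    f.write("id,name\r\n1,Zoë\r\n")

print(has_utf8_bom(plain))     # False
print(has_utf8_bom(with_bom))  # True
```

Both files are valid UTF-8. Only the second will be recognised by tools that look for the BOM.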
I recently worked on a problem where we came across the following situation:
You are working on a solution using SQL Server Integration Services (Visual Studio 2005).
You are pulling data from your database and trying to place the results into a flat file (.CSV) in UTF-8 format. The solution exports the data perfectly and keeps the special characters in the file because you have used 65001 as the code page.
However, when you open the text file or try to load it into another process, it is reported as ANSI instead of UTF-8. If you open the file in Notepad, choose Save As and change the encoding to UTF-8, your external process works, but this is tedious manual work.
What I have found is that when you specify the Code Page property of the flat file connection manager, it does generate a UTF-8 file. However, it generates a UTF-8 file that is missing something called the Byte Order Mark (BOM).
For example, if a CSV file contains the characters AA, a version with a BOM starts with the bytes 0xef, 0xbb and 0xbf before them. Even when the file has no BOM, it is still UTF-8.
Unfortunately, in some old legacy systems, applications look for the BOM to determine the type of the file. It appears that your process is doing the same.
To work around the problem, you can use the following piece of code in a Script Task, which can be run after the export process.
using System.IO;
using System.Text;
using System.Threading;
using System.Globalization;
static void Main(string[] args)
{
    string pattern = "*.csv";
    string[] files = Directory.GetFiles(@".\", pattern, SearchOption.AllDirectories);
    FileCodePageConverter converter = new FileCodePageConverter();
    converter.SetCulture("en-US");
    foreach (string file in files)
    {
        converter.Convert(file, file, "Windows-1252"); // Convert from code page Windows-1252 to UTF-8
    }
}
class FileCodePageConverter
{
    public void Convert(string path, string path2, string codepage)
    {
        byte[] buffer = File.ReadAllBytes(path);
        // Skip files that already start with the UTF-8 BOM (EF BB BF).
        bool hasBom = buffer.Length >= 3 && buffer[0] == 0xef && buffer[1] == 0xbb && buffer[2] == 0xbf;
        if (!hasBom)
        {
            byte[] buffer2 = Encoding.Convert(Encoding.GetEncoding(codepage), Encoding.UTF8, buffer);
            byte[] utf8 = new byte[] { 0xef, 0xbb, 0xbf };
            // Write the BOM first, then the converted content.
            using (FileStream fs = File.Create(path2))
            {
                fs.Write(utf8, 0, utf8.Length);
                fs.Write(buffer2, 0, buffer2.Length);
            }
        }
    }
    public void SetCulture(string name)
    {
        Thread.CurrentThread.CurrentCulture = new CultureInfo(name);
        Thread.CurrentThread.CurrentUICulture = new CultureInfo(name);
    }
}
When you run the package, all the CSVs in the designated folder will be converted to UTF-8 with a byte order mark.
This way your external process will be able to work with the exported CSV files.
If you only want to process a particular folder, pass it to the Script Task in a variable and use the code below instead:
string sPath;
sPath=Dts.Variables["User::v_ExtractPath"].Value.ToString();
string pattern = "*.txt";
string[] files = Directory.GetFiles(sPath, pattern);
I hope this helps!!
OK - I seem to have found an acceptable workaround on the SQL Server Forums. Essentially I had to create two UTF-8 template files, use a File Task to copy them to my destination, and then make sure I was appending data rather than overwriting.
For very large files, @Prashanthi's in-memory solution will cause out-of-memory exceptions. Here is my implementation, a variation of the code from here.
public static void ConvertFileEncoding(String path,
    Encoding sourceEncoding, Encoding destEncoding)
{
    // If the source and destination encodings are the same, do nothing.
    if (sourceEncoding == destEncoding)
    {
        return;
    }
    // Otherwise, move the file to a temporary path before processing.
    String tempPath = Path.GetDirectoryName(path) + "\\" + Guid.NewGuid().ToString() + ".csv";
    File.Move(path, tempPath);
    // Convert the file.
    try
    {
        FileStream fileStream = new FileStream(tempPath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        using (StreamReader sr = new StreamReader(fileStream, sourceEncoding, false))
        {
            using (StreamWriter sw = new StreamWriter(path, false, destEncoding))
            {
                //this seems to not work here
                //byte[] utf8 = new byte[] { 0xef, 0xbb, 0xbf };
                //sw.BaseStream.Write(utf8, 0, utf8.Length);
                int charsRead;
                char[] buffer = new char[128 * 1024];
                while ((charsRead = sr.ReadBlock(buffer, 0, buffer.Length)) > 0)
                {
                    sw.Write(buffer, 0, charsRead);
                }
            }
        }
    }
    finally
    {
        File.Delete(tempPath);
    }
}
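The chunked approach above can also be sketched in Python. This is an illustrative sketch, not a port of the C# code. The function name convert_file_encoding and the file names are hypothetical. It writes to a separate destination file instead of moving the source to a temporary path, and it uses the "utf-8-sig" codec so that the BOM is written once at the start of the output.

```python
import codecs
import os
import tempfile


def convert_file_encoding(src, dst, source_encoding,
                          dest_encoding="utf-8-sig", chunk_chars=128 * 1024):
    """Re-encode src into dst chunk by chunk so large files never sit fully in memory."""
    # newline="" disables newline translation so line endings are copied as-is.
    with open(src, "r", encoding=source_encoding, newline="") as reader, \
         open(dst, "w", encoding=dest_encoding, newline="") as writer:
        while True:
            chunk = reader.read(chunk_chars)
            if not chunk:
                break
            writer.write(chunk)


tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "export_ansi.csv")
dst = os.path.join(tmp_dir, "export_utf8.csv")
with open(src, "w", encoding="cp1252", newline="") as f:
    f.write("id,name\r\n1,Zoë\r\n")

# A tiny chunk size is used here only to exercise the loop.
convert_file_encoding(src, dst, "cp1252", chunk_chars=4)

with open(dst, "rb") as f:
    data = f.read()
print(data.startswith(codecs.BOM_UTF8))  # True
```

Because the text is read and written in fixed-size character chunks, memory use stays bounded regardless of file size.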
I know this is a very old topic, but here is another answer that may be easier to implement than the ones already posted (take your pick).
I found this; you can download the .exe file from this location. (It's free.)
Make sure to follow the instructions in the first link and copy the .exe into your C:\Windows\System32 and C:\Windows\SysWOW64 folders, so you can use it without having to type or remember complicated paths.
In SSIS, add an Execute Process Task.
Configure the task with convertcp.exe in the Process -> Executable field.
Configure the task with the following in the Process -> Arguments field: 0 65001 /b /i "<OriginalFilePath>\<OriginalFile>.csv" /o "<TargetFilePath>\<TargetFile>_UTF-8.csv"
I suggest setting the Window style to Hidden.
Done! When you run the package, the Execute Process Task will convert the original ANSI file to UTF-8. You can convert between other code pages as well. Just find the code page numbers and you are good to go!
Basically, this command-line utility gives SSIS the ability to convert from one code page to another using the Execute Process Task. It worked like a charm for me. (If you deploy to a SQL Server, you will also have to copy the executable into the system folders on the server, of course.)
Best, Raphael
