I work with CSV files and upload them to an S3 server.
Sometimes, after some light processing of a file, hidden characters appear before the first column. I want to write a script that cleans the files before upload, but I can see those characters only in certain text editors such as nano. Python didn't recognize them, and I only see them in Amazon Athena after the query has already been created, at which point I need to upload the file again.
Does anyone know a solution to this problem?
After some research I learned that the symbol is called a BOM (byte order mark) and that it was added to the files because I used encoding='utf-8'.
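One way to clean the files before upload is to work on the raw bytes and drop the three-byte UTF-8 BOM if it is present. This is only a minimal sketch: the function names strip_utf8_bom and clean_file are made up for illustration, and it assumes the files are UTF-8 encoded and small enough to read into memory.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def clean_file(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading BOM.

    Returns True if a BOM was found and removed.
    (Hypothetical helper; the paths are whatever you pass in.)
    """
    with open(src_path, "rb") as src:
        data = src.read()
    cleaned = strip_utf8_bom(data)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(cleaned) != len(data)
```

When the file is read as text instead, opening it with encoding='utf-8-sig' makes Python skip a leading BOM, whereas with encoding='utf-8' the BOM ends up as an invisible '\ufeff' character at the start of the first field.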
Is it possible to compile an Excel VSTO workbook into the Excel file itself?
This is my Project
But I want all the files compiled into the Excel file.
Is this possible?
No, that's not possible because in order to enable a document-level VSTO customization the workbook needs to have an entry _AssemblyLocation in the Workbook.CustomDocumentProperties.
This entry has to contain the path to the .vsto file.
If you use ClickOnce it will look something like this:
file:///DeploymentServer/MaterialTable.vsto|74744e4b-e4d6-41eb-84f7-ad20346fe2d9
If you use your own Setup.exe to deploy all files locally instead, you can specify this local path as well by appending vstolocal to the end:
file:///C:/Program Files/MaterialTable/MaterialTable.vsto|74744e4b-e4d6-41eb-84f7-ad20346fe2d9|vstolocal
This .vsto file contains the related names of the .manifest and .dll files that Excel also needs to load.
So unfortunately you just can't compile a document-level VSTO customization into a workbook because Excel needs to have physical access to the .vsto/.manifest/.dll files.
But if you're using the vstolocal deployment, you can at least specify an absolute file path so that your .xlsx file doesn't need to be in the same directory as your .vsto/.manifest/.dll files. So maybe this could be (at least kind of) an alternative solution for your problem.
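To make the layout of the two _AssemblyLocation values shown above easier to see, here is a small sketch that splits such a value into its parts. It is based only on the two examples in this answer (path, then a GUID, then an optional vstolocal suffix, separated by |). The names AssemblyLocation and parse_assembly_location are invented for illustration and are not part of VSTO.

```python
from typing import NamedTuple


class AssemblyLocation(NamedTuple):
    manifest_url: str   # path or URL of the .vsto file
    guid: str           # the GUID that follows the first "|"
    local: bool         # True when the value ends with "|vstolocal"


def parse_assembly_location(value: str) -> AssemblyLocation:
    """Split an _AssemblyLocation value of the form url|guid[|vstolocal]."""
    parts = value.split("|")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected number of fields: {value!r}")
    if len(parts) == 3 and parts[2] != "vstolocal":
        raise ValueError(f"unexpected suffix: {parts[2]!r}")
    return AssemblyLocation(parts[0], parts[1], len(parts) == 3)
```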
I often have the need to store a certain file version in a non-exe file, just the way I give a file version to an .exe, for example "1.0.392"
However, I have not found a way to do that for non-exe-files (such as .db files, etc.)
Is there a simple way to do that?
Thank you!
Is there a way of specifying components to remove from MS Office or OpenOffice documents via Ruby? I'm talking about removing macros and meta information, and also removing or replacing images. I've looked at a number of conversion programs with a view to converting from and to the same file format, but I can't find any that allow such options to be specified.
I've looked at:
Convert_office
Abiword - I've modified the original gem to allow conversion to doc as well as pdf.
Docx files are really zip files. You can unzip (inflate) them into a directory, delete or change the files you need, and update the references to those files. Most of the files inside the zip are XML text files, so you can use LibXML-Ruby or Nokogiri to edit them.
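The unzip, change and rezip idea can be sketched in a few lines. The sketch below is in Python rather than Ruby and uses only the standard zipfile module. The function name copy_zip_without is made up, and the member name in the usage note (word/vbaProject.bin for macros) is an assumption about where a given document stores that part. It only drops members; any XML that still refers to a dropped part has to be updated separately, as described above.

```python
import zipfile


def copy_zip_without(src_path, dst_path, should_drop):
    """Copy a zip archive, skipping members for which should_drop(name) is true.

    Returns the list of member names that were left out.
    (Hypothetical helper for illustration.)
    """
    dropped = []
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if should_drop(info.filename):
                dropped.append(info.filename)
                continue
            # Reuse the original ZipInfo so names and timestamps carry over.
            dst.writestr(info, src.read(info.filename))
    return dropped
```

A call such as copy_zip_without("in.docm", "out.docm", lambda name: name.endswith("vbaProject.bin")) would then write a copy without that member. The same pattern should carry over to Ruby with a zip library plus Nokogiri for the XML edits.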
My Windows co-workers asked me whether I could modify my non-Windows binary files so that, when their "Properties" are examined under Windows, they show a "Version" tab like the one shown for an exe compiled with Visual Studio.
Specifically, I have some gzipped binary files and was wondering if I could modify them to satisfy this demand. If there's a better way, that would be fine, too.
Is there a way I could make my binaries appear to be exe files?
I tried simply appending the VS_VERSION_INFO block from notepad.exe to the end of one of my binaries in the hope that Windows scans for the block, but it didn't work.
I tried editing the other information regarding Author, Subject, Revision, etc. That doesn't modify the file; it just creates another data fork (what's the Windows term?) for the file in NTFS.
It is not supported by Windows, since each file type has its own file format. But that doesn't mean you can't accomplish it. The resources stored inside DLLs and EXEs are part of their file format.
Display to the user:
If you want this information to be displayed to the user, this is probably best accomplished with a property page shell extension. You would create a similar-looking page, but it wouldn't be the exact same page. There is a really good multi-part tutorial on shell extensions, including property pages, starting with that link.
Where to actually store the resource:
Instead of appending a block to the file, you could store the resource in a separate alternate data stream on the same file. This leaves the original file stream uncorrupted on disk and does not change its primary file size.
Alternate data streams allow more than one data stream to be associated with a filename. Each stream is identified by a colon : at the end of the filename and an identifier.
You can create them for example by doing:
notepad test.txt:adsname1
notepad test.txt:adsname2
notepad test.txt
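The same filename:streamname syntax can be used from a script. The following is a rough sketch, assuming it runs on Windows against an NTFS volume (on other systems the colon is just part of an ordinary file name). The helper names ads_path, write_stream and read_stream, and the stream name "version" in the usage note, are invented for illustration.

```python
def ads_path(filename: str, stream: str) -> str:
    """Build the path that addresses an alternate data stream of a file."""
    return f"{filename}:{stream}"


def write_stream(filename: str, stream: str, text: str) -> None:
    """Store text in the named alternate data stream (Windows/NTFS only)."""
    with open(ads_path(filename, stream), "w", encoding="utf-8") as f:
        f.write(text)


def read_stream(filename: str, stream: str) -> str:
    """Read text back from the named alternate data stream (Windows/NTFS only)."""
    with open(ads_path(filename, stream), "r", encoding="utf-8") as f:
        return f.read()
```

For example, write_stream("test.txt", "version", "1.0.392") would attach a version string to test.txt without touching its main content, and read_stream("test.txt", "version") would return it again.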
Getting the normal Win32 APIs working:
If you wanted the normal API to work, you'd have to intercept the Win32 APIs: LoadLibraryEx, FindResource, LoadResource and LockResource. This is probably not worth the trouble though since you are already creating your own property page.
Can't think of any way to do this short of a shell extension. The approach I've taken in the past is a separate "census" program that knows how to read version information from any kind of file.
Zip files can be converted into exe files by using a program that turns a zip file into a self-extracting zip (I know that WinZip does this, there are most likely free utilities for this also; here's one that came up on a search but I haven't actually tried it). Once you've got an exe, you should be able to use a tool like Resource Hacker to change the version information.
It won't work. Either Windows would have to know every file format or no file format would be disturbed if version information were appended to it.
No, a resource section is only expected inside a PE (Portable Executable) file: exe, dll, sys.
It is more than just putting the data inside the file: the file header contains a table that points to the data.
What you can do, if you have an NTFS drive, is use an NTFS stream to store custom properties. That way the content of the binary file remains the same, but you will need a custom shell extension to show the content of the stream.