Clean hidden characters from a CSV file

I work with CSV files and upload them to an S3 server.
Sometimes, after a small processing step on the file, hidden characters that look like this:  appear before the first column. I want to write a script that cleans the files before upload, but I can see those characters only in specific text editors such as nano. Python didn't recognize them, and I only see them in Amazon Athena after the query has already been created, at which point I need to upload the file again.
Does anyone know a solution to this problem?

After some research I learned that the symbol is called a BOM (byte order mark), and it seems to have been added to the files because I added encoding='utf-8'.
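A minimal sketch of the cleaning script asked for above, assuming the hidden characters are the three-byte UTF-8 BOM (EF BB BF, which shows up as  when the bytes are displayed as Latin-1). The function name `strip_utf8_bom` is hypothetical. The file is read fully into memory, which is fine for modest CSV files but may need a streaming variant for very large ones. As a side note, Python's plain 'utf-8' codec does not write a BOM; 'utf-8-sig' does, so the BOM may come from another step in the process.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, in place.

    Returns True if a BOM was found and removed, False otherwise.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    # codecs.BOM_UTF8 is the byte string b'\xef\xbb\xbf'.
    if not data.startswith(codecs.BOM_UTF8):
        return False
    file_path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Calling `strip_utf8_bom("report.csv")` (a placeholder file name) before the upload would leave files without a BOM untouched. When reading such a file in Python, opening it with encoding='utf-8-sig' also drops the BOM transparently.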

Related

Laravel app just on remote PC that accepts .zip and executes commands contained in .txt file. Results returned to client

I've been asked to set up a Laravel app that, when running on a remote desktop, will accept two files via an API: a .txt file and a .zip file. The text file would contain command prompt commands (e.g. 'ls'). The aim is then for the app to process the contents of the .zip file using the instructions in the .txt file and then return the result to the client (e.g. perhaps another text file with a list of the files in the zip file).
I'm struggling to know how best to approach it. I know the question is very vague (it's all I have), but I would very much appreciate any suggestions.

Cannot be Compressed because it includes characters that cannot be used, but I don't have any files with special characters

I'm trying to compress some documents I made, and I get the following error message:
I have no idea what the character is, as it just looks like a blank space. I have removed the blank spaces from my documents and it still won't let me zip it. Online answers seem to refer to needing to change the language setting on my computer, but I haven't written in any foreign languages. Any help would be appreciated.
Go to the Users directory and make a new directory called 'Analytics'.
Then move your 'Account_Over_Time_Analysis' to this folder and try to compress it again.
If it fails again, please try 7-Zip in case you're using something else.
Such an error could be caused by a directory name in a different language, a name with spaces, or a name with escape characters.
To fix this you could hunt around for the correct language pack, or just install 7-Zip and use that to zip the files instead.
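Since the offending character "just looks like a blank space", one way to locate it is to scan the file and folder names for anything outside plain printable ASCII. The sketch below assumes that is what the compressor objects to (the exact rule is not stated in the error described above), and `find_suspect_names` is a hypothetical helper name.

```python
import os


def find_suspect_names(root):
    """Yield (path, offending_chars) for names under `root` that contain
    non-ASCII or non-printable characters."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            bad = [ch for ch in name if not ch.isascii() or not ch.isprintable()]
            if bad:
                yield os.path.join(dirpath, name), bad
```

Printing each result with `repr()`, for example `print(repr(path), [hex(ord(c)) for c in bad])`, makes invisible characters such as a no-break space (U+00A0) visible so the name can be corrected.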

ChemSpider refuses to accept the .MOL file I present it

I converted a .pdb file to a .MOL file with BABEL (converter software). I do get the .MOL file, but when I submit it online for a similar-structure search, ChemSpider doesn't even load the file.
To use PubChem's database you need SMILES format, which BABEL cannot properly convert my .pdb file to. So I'm out of luck there.
Any way I can search my .MOL file on any chemical database that's out there?
Thanks

How to edit the contents of the Windows index.dat file

I need to be able to edit the content of the index.dat file programmatically (C:\Documents and Settings\Username\Cookies\index.dat). More precisely, I need to modify it so that the index.dat for one user can be used for a different user name. Is there any documentation out there for this kind of binary file?
Pasco (http://www.foundstone.com/us/resources/proddesc/pasco.htm) is a free index.dat parser that comes with the source code.
Docs will be hard to come by: Microsoft has never publicly documented the structure of this file. That said, you can find docs on the web such as the one mentioned above.
However, note that IE keeps close tabs on this file. The file is locked while IE is running (meaning, you can open/read it in some modes but not in others) and you can certainly not write to it.
One method that might still work is to boot-up in safe mode and then assign yourself administrator rights and then see if you can find the files to delete them.
The method I now use is a batch file that renames the subfolder below the folder containing the index.dat files and then copies back to the original location only the folders that don't contain these files. The batch file needs to be run from a separate Windows account that has full administrator permissions.
The freeware code editor PSPad will allow you to view and edit the contents of all of the index.dat files on your computer in hexadecimal form. To clear a file, replace all of the digits in the first eight columns with zeros; this removes all of the information contained in it.
It's a tedious process, requiring you to hold down the "0" (zero) key as the edits are made, but anyone then accessing any of the index.dat files will get no information.
IE must be closed when doing this or you may receive an error message when attempting to save the modified file(s).

Can VS_VERSION_INFO be added to non-exe files?

My Windows co-workers asked me whether I could modify my non-Windows binary files so that, when their "Properties" are examined under Windows, they show a "Version" tab like the one shown for an exe compiled with Visual Studio.
Specifically, I have some gzipped binary files and was wondering if I could modify them to satisfy this demand. If there's a better way, that would be fine, too.
Is there a way I could make my binaries appear to be exe files?
I tried simply appending the VS_VERSION_INFO block from notepad.exe to the end of one of my binaries in the hope that Windows scans for the block, but it didn't work.
I tried editing the other information regarding Author, Subject, Revision, etc. That doesn't modify the file; it just creates another data fork (what's the Windows term?) for the file in NTFS.
It is not supported by Windows, since each file type has its own file format. But that doesn't mean you can't accomplish it. The resources stored inside DLLs and EXEs are part of the file format.
Display to the user:
If you want this information displayed to the user, it would probably be best accomplished with a property page shell extension. You would create a similar-looking page, but it wouldn't be the exact same page. There is a really good multi-part tutorial on shell extensions, including property pages, starting with that link.
Where to actually store the resource:
Instead of appending a block to the file, you could store the resource in a separate alternate data stream on the same file. This would leave the original file stream uncorrupted on disk and would not change its primary file size.
Alternate data streams allow more than one data stream to be associated with a filename. Each stream is identified by a colon : at the end of the filename and an identifier.
You can create them for example by doing:
notepad test.txt:adsname1
notepad test.txt:adsname2
notepad test.txt
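The same `filename:streamname` syntax can be used from a script. The sketch below assumes Windows with an NTFS volume, where Python's built-in `open()` passes the path through to the operating system; the helper names and the stream name `version` in the usage note are hypothetical. On other file systems the colon is simply treated as part of an ordinary file name (or rejected).

```python
def stream_path(filename, stream_name):
    """Build the 'filename:streamname' path that addresses an alternate data stream."""
    return f"{filename}:{stream_name}"


def write_stream(filename, stream_name, text):
    """Write `text` to the named stream; the main stream of the file is not modified."""
    # Assumption: running on Windows/NTFS, where this path syntax selects a stream.
    with open(stream_path(filename, stream_name), "w", encoding="utf-8") as f:
        f.write(text)


def read_stream(filename, stream_name):
    """Read back the text stored in the named stream."""
    with open(stream_path(filename, stream_name), "r", encoding="utf-8") as f:
        return f.read()
```

For example, `write_stream("data.bin.gz", "version", "1.2.3")` followed by `read_stream("data.bin.gz", "version")` would store and retrieve a version string without touching the gzipped content. Note that such streams are typically lost when the file is copied to a non-NTFS volume.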
Getting the normal Win32 APIs working:
If you wanted the normal API to work, you'd have to intercept the Win32 APIs: LoadLibraryEx, FindResource, LoadResource and LockResource. This is probably not worth the trouble though since you are already creating your own property page.
Can't think of any way to do this short of a shell extension. The approach I've taken in the past is a separate "census" program that knows how to read version information from any kind of file.
Zip files can be converted into exe files by using a program that turns a zip file into a self-extracting zip (I know that WinZip does this, there are most likely free utilities for this also; here's one that came up on a search but I haven't actually tried it). Once you've got an exe, you should be able to use a tool like Resource Hacker to change the version information.
It won't work. Either Windows would have to know every file format or no file format would be disturbed if version information were appended to it.
No, the resource section is only expected inside a PE (Portable Executable) file: exe, dll, sys.
It is more than just putting the data inside the file; there is a table in the file header that points to the data.
What you can do, if you have an NTFS drive, is use an NTFS stream to store custom properties. This way the content of the binary file remains the same, but you will need a custom shell extension to show the content of the stream.
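To illustrate why an appended VS_VERSION_INFO block is not picked up, here is a minimal sketch of the first checks that identify a PE file: the `MZ` magic at offset 0 and the `PE\0\0` signature at the offset stored at 0x3C in the DOS header. It only checks those two signatures and does not parse the resource table; `is_pe_file` is a hypothetical helper name.

```python
import struct


def is_pe_file(data):
    """Return True if `data` (the raw bytes of a file) begins with a PE layout."""
    # The DOS header is 64 bytes long and starts with the 'MZ' magic.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

A gzipped file begins with the bytes 1F 8B, so it fails the first check regardless of what is appended to its end.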
