How do I bundle a C program with auxiliary data files?

How do I bundle a C program with auxiliary data files? - installation

I apologize if this question is a repost. It seems like it ought to be
but I'm having an extremely difficult time finding a solution on the web.
I wrote a C program that needs to read in some auxiliary files during
its operation. I typically just put these files in the same folder as
the compiled executable, and then just load these files using a
relative path.
However, this only works if the user's working directory is the same
as the folder where the executable is stored. This is insufficient for
my purposes, as I want users to be able to call this executable from
other directories.
What is the standard way of packaging programs that span more than a
single executable binary? The program is meant to be used by near
computer-illiterate people, and I am looking for the closest to a
point-and-click solution possible.
Thanks very much.
-Patrick
Thanks very much for the replies! That gives me a starting place to look. I'll google around for determining the executable path for each OS.

This depends on your operating system. Windows has things called resources (https://msdn.microsoft.com/en-us/library/7zxb70x7.aspx) for this which allow you to bundle files with the executable. Perhaps the easiest way to do this on any platform would be to include the data in an array. If you need to edit make changes to that data at runtime you could add identifiers to the array (something like char Data[] = {'I','D','E','N','T',/*your data*/,'I','D','E','N','T','1'}) and then find that array in the binary at runtime and change its contents. I know that is a very bad solution, but it might just work.

Related

Calling os.Lstat only if the file has changed since the last time I called os.Lstat

I'm trying to write a program, calcsize, that calculates the size of all sub directories. I want to create a cache of the result and only re-walk the directory if it has changed since the last time I've run the program.
Something like:
./calcsize
//outputs
/absolute/file/path1/ 1000 Bytes
/absolute/file/path2/ 2000 Bytes
I'm already walking the dirs with my own walk implementation because the built in go filepath.Walk is already calling Lstat on every file.
Is there any way to know if a directory or set of files has changed without calling Lstat on every file? Maybe a system call I'm not aware of?

In general, no. However you might want to look at: https://github.com/mattn/go-zglob/blob/master/fastwalk/fastwalk_unix.go
And using that data you can skip some of the stat calls, if you only care about files.

Whether, and how, this is possible depends heavily on your operating system. But you might take a look at github.com/howeyc/fsnotify which claims to offer this (I've never used it--I just now found it via Google).
In general, look at any Go program that provides a 'watch' feature. GoConvey and GopherJS's serve mode come to mind as examples, but there are others (possibly even in the standard library).

identify portable executable

I want to write a program that allows or blocks processes while openning a file depending on a policy.
I could make a control by checking the name of the program. However, it would not be enough because user can change the name to pass the policy. i.e. let's say that policy doesn't allow a.exe to access txt files whereas b.exe is allowed. If user change a.exe with b.exe, i cannot block it.
On the other hand, verifying portable executable signature is not enough for me, because i don't care whether the executable signed or not. I just want to identify the executable that is wanted to execute even its name is changed.
For this type of case, what would you propose? Any solutions are welcome.
Thanks in advance

There are many ways to identify an executable file. Here is a simple list:
Name:
The most simple and straightforward approach is to identify a file by its name. But it is one of the easiest things to change, and you already ruled that out.
Date:
Files have an access, creation, and modification date, and they are managed by the operating system. They are not foolproof, or maybe not even accurate.
Also, they are very simple to change.
Version Information:
Since we are talking about executable files, then most executable files have version information attached to it. For example, original file name, file version, product version, company, description, etc. You can check these fields if you are sure the user cannot modify them by editing your executable. It doesn't require you to keep a database of allowed files. However, it does require you to have something to compare to, like company name, or a product name. Also, If someone made an executable with the same value
they can run instead of the allowed one and bypass your protection.
Location:
If the file is located in a specific place and is protected by file system access rights, and it cannot be changed, then you can use that. You can, for example, put the allowed files in a folder where the user (without admin rights) can only read/execute them, but not rename/move. Then identify the file with its location. If it is run from this location, then allow it, else block it. It is good as it doesn't need a database of
allowed/blocked files, it just compares the location, it it is a valid one, then allow, and you can keep adding and removing files to allowed locations
without affecting your program.
Size:
If the file has a specific file size, you can quickly check its size and compare it. But it is unreliable as files can be changed/patched and without any change in size. You can avoid that by also applying a CRC check to detect if the content of the file changed.
But, both size and CRC can be changed. Also, this requires you to have a list of file names and their sizes/CRC, and keeping it up to date.
Signature:
Deanna mentioned that you can self-sign your executable files. Then check if the signature matches yours and allow/deny based on that. This seems to be a good way if
it is okay for you to sign all the executable files you want to allow. It doesn't require you to keep an updated list of allowed files.
Hash:
arx also pointed out, that you can hash the files. It is one of the slowest methods, as it requires the file to be hashed every time it is executed, then compared to a list of files. But it very reliable as it can uniquely identify each file and hard to break. But, you will need to keep an up to date database of every file hash.
Finally, and depend on your needs and options, you can mix two or more ways together to get the results you want. Like, checking file name + location, etc.
I hope I covered most of things, but I'm sure there are more ways. Anyone can freely edit my post to include anything that I have missed.

I would recommend using the signature, if it has one, or the hash otherwise. Apps such as Office that update frequently are more likely to be signed, whereas smaller apps downloaded off the Internet are unlikely to ever be updated and so should have a consistent hash.

How do I determine where process executable code starts and ends when loaded in memory?

Say I have app TestApp.exe
While TestApp.exe is running I want a separate program to be able to read the executable code that is resident in memory. I'd like to ignore stack and heap and anything else that is tangential.
Put another way, I guess I'm asking how to determine where the memory-side equivalent of the .exe binary data on disk resides. I realize it's not a 1:1 stuffing into memory.
Edit: I think what I'm asking for is shown as Image in the following screenshot of vmmap.exe
Edit: I am able to get from memory all memory that is tagged with any protect flag of Execute* (PAGE_EXECUTE, etc) using VirtualQueryEx and ReadProcessMemory. There are a couple issues with that. First, I'm grabbing about 2 megabytes of data for notepad.exe which is a 189 kilobyte file on disk. Everything I'm grabbing has a protect flag of PAGE_EXECUTE. Second, If I run it on a different Win7 64bit machine I get the same data, only split in half and in a different order. I could use some expert guidance. :)
Edit: Also, not sure why I'm at -1 for this question. If I need to clear anything up please let me know.

Inject a DLL to the target process and call GetModuleHandle with the name of the executable. That will point to its PE header that has been loaded in the memory. Once you have this information, you can parse the PE header manually and find where .text section is located relative to the base address of the image in the memory.

no need to inject a dll
use native api hooking apis

I learned a ton doing this project. I ended up parsing the PE header and using that information to route me all over. In the end I accomplished what I set out to and I am more knowledgeable as a result.

Is it possible to create a file that cannot be copied?

To restrict the scope, let assume we are in Windows world only.
Also assume we don't want to play with permission policy.
Is it possible for us to create a file that cannot be copied?
Thank you in advance.

"Trying to make digital files uncopyable is like trying to make water not wet." ~ Bruce Schneier

No. You can't create a file that a SYSADMIN can't copy. You could encrypt it, though.

Well, how about creating a file that uses up more than 50% of the total space on that machine and that is not compressible?
For instance, let us assume that you want to save a boolean (true or false) in such a fashion.
Depending on its value, you could then write a bit stream of ones or zeroes and encrypt said stream using some kind of encryption algorith, such as AES in CBC mode. This gives you the added advantage of error correction. Even in case of massive data corruption, you should be able to recover your boolean by checking whether ones or zeroes are prevalent in the decrypted stream.
In that case you cannot copy it around (completely) on the machine...
Of course, any type of external memory that can be added to the system would pose a problem in this scenario. But the file would be already encrypted, so don't worry about it too much...

Any file that can be read can have its contents written to another location (such as another file, i.e. copied).
The only thing you can do is limit who/what can read the file.

What is the motivation behind? If it is a read-only file, you can have it as embedded resources within your assembly.

Nice try, RIAA.
But seriously, no you can not. It is always possible to copy, you can just make it it more difficult for people to make sense of the file or try to hide it using like encryption. Spotify does it.
If you really try hard thou, you cold make a root-kit for windows and use it to prevent windows from even knowing about the file and also prevent copies. The file will still be there and copy-able by other tools, or Linux accessing the ntfs.

If in a running process you open a file and hold an exclusive lock, then other processes cannot read the file until you close the handle or your process terminates. However, as admin you could forcibly remove the lock handle.

Short answer: No.
You can, of course, use security settings to limit who can read the file. But if someone can read it, then they can copy it. Even if you found some operating system trick to disable "ordinary" copying, if someone can read the file, they can extract the contents, store it in memory, and then write it somewhere else.
You can encrypt the contents so it's only useful to your own program, that knows how to decrypt it.
That's about it.

When using Windows 7 to copy some files from a hard drive, certain files popped up a message saying they could not be copied in their entirety; certain data would be omitted from the copy. I suspect that had something to do with slack space at the end of the files, though I thought the message was curious. I would have expected the copy operation to just ignore the slack space.

If you are running old (OLD) versions of windows, there are certain characters you can put in the filename that make it invalid, not listed in folders, etc. They were used a lot in the old pub ftp days of filesharing ;)

In the old DOS days, you used to be able to flag disk sectors as bad and still read from them. This meant the OS ignored the sector in question but your application would know where to look and be able to get the data. Not sure this would work these days.
Another old MS-DOS trick was to put a space character in the middle of the filename (yes, spaces were valid characters for filenames). Since there was no method on the command line to escape a space, the file couldn't be copied using the DOS commands.

This answer is outside Windows so yeah
Dont know if its already been said but what about a file that is an inseperable part of the firmware so that it is always on AND running, perhaps it has firmware that generates a sequence that is required for the other . AN incedental effect of its running is to prevent any 80% or more of its code from being replicated. Lets say its on an entirely different board, protected by surge protectors, heavy em proof shielding and anything else required to make it completely unerasable.
If its possible to make a program that is ALWAYS on and running as long as the copying software is running then yes.
I have another way and this IS with windows. I will come to your house and give you a disk, i will then proceed to destroy every single computer you put the disk into. This doesnt work on XP

Well technically you could create and write to a write-only network share.

Should I deal with files longer than MAX_PATH?

Just had an interesting case.
My software reported back a failure caused by a path being longer than MAX_PATH.
The path was just a plain old document in My Documents, e.g.:
C:\Documents and Settings\Bill\Some Stupid FOlder Name\A really ridiculously long file thats really very very very..........very long.pdf
Total length 269 characters (MAX_PATH==260).
The user wasn't using a external hard drive or anything like that. This was a file on an Windows managed drive.
So my question is this. Should I care?
I'm not saying can I deal with the long paths, I'm asking should I. Yes I'm aware of the "\?\" unicode hack on some Win32 APIs, but it seems this hack is not without risk (as it's changing the behaviour of the way the APIs parse paths) and also isn't supported by all APIs .
So anyway, let me just state my position/assertions:
First presumably the only way the user was able to break this limit is if the app she used uses the special Unicode hack. It's a PDF file, so maybe the PDF tool she used uses this hack.
I tried to reproduce this (by using the unicode hack) and experimented. What I found was that although the file appears in Explorer, I can do nothing with it. I can't open it, I can't choose "Properties" (Windows 7). Other common apps can't open the file (e.g. IE, Firefox, Notepad). Explorer will also not let me create files/dirs which are too long - it just refuses. Ditto for command line tool cmd.exe.
So basically, one could look at it this way: a rouge tool has allowed the user to create a file which is not accessible by a lot of Windows (e.g. Explorer). I could take the view that I shouldn't have to deal with this.
(As an aside, this isn't an vote of approval for a short max path length: I think 260 chars is a joke, I'm just saying that if Windows shell and some APIs can't handle > 260 then why should I?).
So, is this a fair view? Should I say "Not my problem"?
UPDATE: Just had another user with the same problem. This time an mp3 file. Am I missing something? How can these users be creating files that violate the MAX_PATH rule?

It's not a real problem. NTFS support filenames up to 32K (32,767 wide characters). You need only use correct API and correct syntax of filenames. The base rule is: the filename should start with '\\?\' (see http://msdn.microsoft.com/en-us/library/aa365247(v=VS.85).aspx) like \\?\C:\Temp. The same syntax you can use with UNC: \\?\UNC\Server\share\Path. Important to understand that you can use only a small subset of API function. For example look at MSDN description of functions
CreateFile
CreateDirectory
MoveFile
and so on
you will find text like :
In the ANSI version of this function,
the name is limited to MAX_PATH
characters. To extend this limit to
32,767 wide characters, call the
Unicode version of the function and
prepend "\?\" to the path. For more
information, see Naming a File.
This functions you can safe use. If you have a file handle from CreateFile you can use all other functions used hFile (ReadFile, WriteFile etc.) without any restriction.
If you write a program like virus scanner or backup software or some good software running on a server you should write your program so, that all file operations support filenames up to 32K characters and not MAX_PATH characters.

This limitation is baked into a lot of software written in C or C++. Including MSFT code, although they've been chipping away at it. It is only partly a Win32 limitation, it still has a hard upper limit on the length of a file name (not path) through WIN32_FIND_DATA for example. One reason that even .NET has length restrictions. This is not going away any time soon, Win32 is still going strong and the stone-age C string won't disappear.
Your customer will have little sympathy with it, no doubt, probably until you can show them another program that fails the same way. Do however make sure that your code reliably can detect the potential string buffer overflow, followed by a reasonable diagnostic. No sympathy for programs bombing on heap corruption.

As you mentioned many of the Windows Shell functions only work on paths up to MAX_PATH. Windows XP and I believe Vista both have problems in Explorer when nesting directories with long file names. I've not checked Windows 7 - perhaps they have fixed that. This unfortunately means that users have a hard time browsing these file.
If you really wish to support long paths you'll need to check any functions you are using in Shell32.dll that take paths to ensure they support long paths. For those that don't you'll have to use write them yourself using Kernel32 functions.
If you decide to use Shell32 and be limited to MAX_PATH, writing your code to support long file paths would be advisable. If Microsoft later change Shell32 (or create an alternative), you will be better positioned to add support for them.
Just to add another couple of dimensions to the problem, remember that filenames are UTF-16, and you may encounter non NTFS or FAT filesystems that may be case sensitive!

Your own APIs should not hard-code a fixed limit on the path length (or any other hard limits); however, you shouldn't violate the preconditions of the system APIs in order to accomplish some task. IMHO, the fact that Windows limits the length of path names is absurd and should be considered a bug. That said, no I would suggest you not attempt to use the various system APIs other than as documented, even if that results in certain undesireable behavior such as limits to the maximum path length. So, in short, your view is completely fair; if the OS doesn't support it, then the OS doesn't support it. That said, you may want to make it clear to users that this is a limitation of Windows and not of your own code.

One easy way how these files with long paths could be created even by software that does not support paths longer than MAX_PATH: through a file share.
Example:
"C:\My veeeeeeeeeeeeeeeeeeeeery looooooooooooooooooong folder" could be shared as "data". Users could then access that folder through the UNC path \\computer\data or (even shorter) through a drive letter (M:\) assuming that M: is mapped to \\computer\data.
This often happens on file servers.

Paths often can be bigger than 260, one example would be when symlinks get nested and repeat over and over sometimes even on purpose. I think programmers should think about whether they want their program to handle these insanely large paths or not. IMO, 260 is PLENTY of space but thats just me. My answer to this is:
if you have to ask yourself so deeply about breaking the 260 char limit, then thats probably what you should do. We often look for confirmation when we are about to do something that we are unsure about...
I think the maximum path anywhere in the API is about 32k long but thats up to you. Back in the day that was a pretty big chunk of change (half of an entire memory segment!! sheesh!) but nowdays, in the segment-transparent addressing environment in which we live, where all memory is heaped together on the flat, 32k is nothin'... AFAIK paths wouldn't need to be that long unless you are using some fancy unicode language that requires lots of other characters, etc, etc.. we could blab about this all day but you get the idea. I hope this helps..... or hurts?

I am doung some C programming and I was searching for a way to get the maximum length of a given filename, after a search for MAX_PATH I stumbled to this thread and after som thoughts on this matter and after reading the comments on this thread I have come to the following conclusion.
So I understand that NTFS support filenames up to 32.767 characters in length, however, according to knowledge FAT16 only support 11 character filenames, 8 + 3, so in reallity operating systems should have a function which our program can call to dertemine the maximum filename size, simply because all filesystems have different limitations including the length of the filename.
So the end conclusion must be that since us, the developers, don't know anything about which filesystem the data is going to be stored in, so therefore the only solution must be an try and error method.

Not strictly an answer to your specific question, but it might help those who do need to handle long file names.
The Delimon library is a .NET Framework 4 based library on Microsoft TechNet for overcoming the long filenames problem:
Delimon.Win32.IO Library (V4.0).
It has its own versions of key methods from System.IO. For example, you would replace:
System.IO.Directory.GetFiles
with
Delimon.Win32.IO.Directory.GetFiles
which will let you handle long files and folders.
From the website:
Delimon.Win32.IO replaces basic file functions of System.IO and
supports File & Folder names up to up to 32,767 Characters.
This Library is written on .NET Framework 4.0 and can be used either
on x86 & x64 systems. The File & Folder limitations of the standard
System.IO namespace can work with files that have 260 characters in a
filename and 240 characters in a folder name (MAX_PATH is usually
configured as 260 characters). Typically you run into the
System.IO.PathTooLongException Error with the Standard .NET Library.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio