Is it possible to modify a file without changing its hash value - winapi

I need to edit some cfg files for an application, but the application won't start if I do, since the file's hash must match. I don't have the sources of the application.
I guess that if the config file's hash doesn't match the hash the exe expects, it exits.
Could you bypass this somehow?

Actually, there is a way:
while (hash of malicious config file does not match original)
{
    make random, non-functional change to malicious config file.
}
This might take a while.
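A minimal Go sketch of that brute force, just to show why "a while" is an understatement: it assumes the check is MD5, that the "non-functional change" is appending random comment bytes, and the file names are made up. The loop is capped so it actually terminates, because with a 128-bit hash a match is effectively never found.

package main

import (
	"crypto/md5"
	"crypto/rand"
	"fmt"
	"os"
)

func main() {
	original, _ := os.ReadFile("app.cfg.orig") // hypothetical: the untouched config
	modified, _ := os.ReadFile("app.cfg")      // hypothetical: the edited config
	target := md5.Sum(original)

	// With a 128-bit hash, a match is expected after roughly 2^128 attempts,
	// so cap the loop to keep the demonstration finite.
	for i := 0; i < 1000000; i++ {
		junk := make([]byte, 8)
		rand.Read(junk)
		candidate := append(append([]byte{}, modified...), fmt.Sprintf("# %x\n", junk)...)
		if md5.Sum(candidate) == target {
			fmt.Println("found a match -- you should buy a lottery ticket")
			return
		}
	}
	fmt.Println("no match found, as expected; this approach is hopeless in practice")
}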

With certain hash algorithms, you can append data to the end of a file (inside comment tags, say, if it's an XML file). But it's probably more trouble than it's worth. E.g., http://www.schneier.com/blog/archives/2005/06/more_md5_collis.html

If the program uses a good hash, it will be difficult to change without breaking the hash. Some applications use relatively poor hashes. It's relatively easy, for example, to edit a file without affecting a CRC-32 if you can afford to set 32 bits of the file to arbitrary values. Any idea what sort of hash function is used?
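If you are not sure which hash it is, one quick experiment is to hash the original config with a few common algorithms and search the executable for the resulting bytes. A rough Go sketch; the file names are made up, and a digest may of course be stored in some transformed or non-contiguous form:

package main

import (
	"bytes"
	"crypto/md5"
	"crypto/sha1"
	"fmt"
	"hash/crc32"
	"os"
)

func main() {
	cfg, _ := os.ReadFile("app.cfg") // hypothetical: the original, unmodified config
	exe, _ := os.ReadFile("app.exe") // hypothetical: the application binary

	md5sum := md5.Sum(cfg)
	sha1sum := sha1.Sum(cfg)
	crc := crc32.ChecksumIEEE(cfg)
	// Assume the CRC-32 is stored little-endian; it could equally be big-endian or hex text.
	crcLE := []byte{byte(crc), byte(crc >> 8), byte(crc >> 16), byte(crc >> 24)}

	// If a digest is embedded verbatim in the binary, it shows up as a byte-for-byte match.
	fmt.Println("MD5 embedded:   ", bytes.Contains(exe, md5sum[:]))
	fmt.Println("SHA-1 embedded: ", bytes.Contains(exe, sha1sum[:]))
	fmt.Println("CRC-32 embedded:", bytes.Contains(exe, crcLE))
}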

You can make the app quit checking, but no, there is no practical way to produce a modified file with the same cryptographic hash as the original. That's the point.

Does a file exist having your desired settings and the same hash? Possibly.
Will you be able to find it? Almost certainly not.

It's time to break out your disassembler and pull apart the application to get rid of the hash check I'm afraid. No other solution will do what you want in a timely manner.

This kind of validation is intentionally difficult to circumvent. Hashes generally work such that small changes in the input produce widely varied output. The check in this case is doing its duty, unfortunately for your situation.
Although in theory there are other inputs that hash to the same thing, they'll be very different from your input, not just a little different. Finding these inputs will also be as time-consuming and difficult as hacking encrypted data. So basically, no.
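A quick Go illustration of that avalanche effect: two inputs that differ in a single character produce completely unrelated digests.

package main

import (
	"crypto/sha256"
	"fmt"
)

func main() {
	a := sha256.Sum256([]byte("timeout=30"))
	b := sha256.Sum256([]byte("timeout=31"))
	// A single-character change flips roughly half of the output bits.
	fmt.Printf("%x\n%x\n", a, b)
}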
As some other posts have mentioned, if you are adventurous and life and death are at stake, you could disassemble the application binary and actually remove the machine language check for the hash. This is varsity-ninja work though.

Related

Parsing STDF Files to Compare results

I am new to this site and I would like to get some input regarding parsing STDF files. Generally speaking, I am trying to parse an STDF file to gather only the results (numbers) and not the rest of the line. If I am able to achieve this, I would then like to compare all the numbers together through a bubble sort or insertion sort and see if any numbers are equal to each other. I am capable of doing this in C/C++ and Java, but I have no experience parsing documents using scripts.
Could anyone push me in the right direction? What should I be reading to learn my way around this?
Are you already using an STDF library?
You did not mention one, so I assume not.
You should find a library you are comfortable with (the list changes over time, but you can find some by Googling or looking at the STDF page on Wikipedia) rather than attempting to parse STDF yourself, unless you have a good reason to recreate the STDF parser wheel.
An STDF file contains many tests. It generally does not make sense to compare the results for different tests, so I assume you are looking for matching values within the set of results for each test.
I would use your chosen STDF parser to read the value of each test for each part. Keep a set of the results for each test. As you read each new result, check the set to see if the value already exists. If it does, you have found the case you were looking for; otherwise, add the result to the set.
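Language aside, the bookkeeping is just a set per test. A small Go sketch, assuming your STDF library hands you (test name, result value) pairs; the Result type and the sample data here are invented for illustration:

package main

import "fmt"

// Result is a stand-in for whatever your STDF parser yields per executed test.
type Result struct {
	TestName string
	Value    float64
}

func findDuplicates(results []Result) []Result {
	seen := make(map[string]map[float64]bool) // per-test set of values already seen
	var dups []Result
	for _, r := range results {
		if seen[r.TestName] == nil {
			seen[r.TestName] = make(map[float64]bool)
		}
		if seen[r.TestName][r.Value] {
			dups = append(dups, r) // the same value appeared before for this test
		} else {
			seen[r.TestName][r.Value] = true
		}
	}
	return dups
}

func main() {
	sample := []Result{
		{"vdd_leakage", 1.2}, {"vdd_leakage", 1.3}, {"vdd_leakage", 1.2},
	}
	fmt.Println(findDuplicates(sample)) // [{vdd_leakage 1.2}]
}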

How to split a large csv file into multiple files in GO lang?

I am a novice Go programmer trying to learn the language's features. I want to split a large CSV file into multiple files in Go, each file containing the header. How do I do this? I have searched everywhere but couldn't find the right solution. Any help in this regard will be greatly appreciated.
Also, please suggest a good book for reference.
Thank you.
Depending on your shell-fu, this problem might be better suited to common shell utilities, but you specifically mentioned Go.
Let's think through the problem.
How big is this CSV file? Are we talking 100 lines, or is it 5 GB?
If it's smallish I typically use this:
http://golang.org/pkg/io/ioutil/#ReadFile
However, this package also exists:
http://golang.org/pkg/encoding/csv/
Regardless - let's return to the abstraction of the problem. You have a header (which is the first line) and then the rest of the document.
So what we probably want to do (if ignoring csv for the moment) is to read in our file.
Then we want to split the file body by all the newlines in it.
You can use this to do so:
http://golang.org/pkg/strings/#Split
You didn't mention it, but do you know how many files you want to split into, or would you rather split by line count or byte count? What's the actual limitation here?
Generally it's not going to be file count, but if we pretend it is, we simply want to divide our line count by our expected file count to get lines per file.
Now we can take slices of the appropriate size and write the file back out via:
http://golang.org/pkg/io/ioutil/#WriteFile
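Putting those pieces together with encoding/csv, a rough sketch might look like the following; the input name, output naming scheme, and the rows-per-file limit are arbitrary choices:

package main

import (
	"encoding/csv"
	"fmt"
	"os"
)

func main() {
	in, err := os.Open("big.csv") // hypothetical input file
	if err != nil {
		panic(err)
	}
	defer in.Close()

	r := csv.NewReader(in)
	header, err := r.Read() // first line is the header
	if err != nil {
		panic(err)
	}

	const rowsPerFile = 10000
	part, count := 0, 0
	var w *csv.Writer
	var out *os.File

	for {
		record, err := r.Read()
		if err != nil { // io.EOF or a real error; either way, stop
			break
		}
		if w == nil || count == rowsPerFile {
			if w != nil {
				w.Flush()
				out.Close()
			}
			part++
			count = 0
			out, err = os.Create(fmt.Sprintf("part_%d.csv", part))
			if err != nil {
				panic(err)
			}
			w = csv.NewWriter(out)
			w.Write(header) // every chunk gets its own copy of the header
		}
		w.Write(record)
		count++
	}
	if w != nil {
		w.Flush()
		out.Close()
	}
}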
A trick I sometimes use to help think through these things is to write down the mission statement.
"I want to split a large csv file into multiple files in go"
Then I start breaking that up into pieces but take the divide/conquer approach - don't try to solve the entire problem in one go - just break it up to where you can think about it.
Also - make gratuitous use of pseudo-code until you can comfortably write the real code itself. Sometimes it helps to just write a short comment inline with how you think the code should flow and then get it down to the smallest portion that you can code and work from there.
By the way - many of the golang.org packages have example links where you can run the example code right in your browser and copy/paste it into your own local environment.
Also, I know I'll catch some haters with this - but as for books - imo - you are going to learn a lot faster just by trying to get things working rather than reading. Action trumps passivity always. Don't be afraid to fail.
Here is a package that might help. You can set the chunk size you need in bytes, and the file will be split into the appropriate number of chunks.

Why are files returned by a For Each loop sorted, but not always?

I'm not sure if this is the correct place to post this question because I have a hunch that the behavior I witness will also be observed using other methods. But anyway, here it goes.
I have a VBscript that contains code like this:
For Each objFile In colFiles
...
Next
I've been running this code for quite some time on many different systems. I never bothered to order the files alphabetically. But today I found out by accident that the logic of my program depends on it. I ran the code on a new system (under Citrix) and the files were returned in a seemingly random order.
Does anybody know why Windows sometimes returns the files sorted alphabetically while sometimes it doesn't?
Added note: It might be relevant to note that the script as well as the input folder are on a network share (where my script outputs randomly ordered files).
Ordering is not supported for FileSystemObject. See KB 189751 http://support.microsoft.com/kb/189751/en-us
Also check out an answer on SO about how to deal with that: Order of Files collection in FileSystemObject.
The docs do not specify an ordering. Thus, you cannot depend on it to have an order. The Files property needs to ask the underlying file system for the files, and then gives them to you as is, without any processing. If that file system happens to return the files in order, that's great. If not, you'll have to sort them. Regardless of whether it is in order, you should always sort it if you expect a certain order, because the implementation may change tomorrow (as you've just witnessed).
It depends on what data structure you are looping through.
You will obviously get a different order if you use a foreach loop over an array versus a hash set, for example.
Personally, I don't know anything about VB, but it does work this way in C#.

Why do people use plain English as translation placeholders?

This may be a stupid question, but here goes.
I've seen several projects using some translation library (e.g. gettext) working with plain English placeholders. So for example:
_("Please enter your name");
instead of abstract placeholders (which has always been my instinctive preference)
_("error_please_enter_name");
I have seen various recommendations on SO to work with the former method, but I don't understand why. What I don't get is: what do you do if you need to change the English wording? Because if the actual text is used as the key for all existing translations, you would have to edit all the translations, too, and change each key. Or don't you?
Isn't that awfully cumbersome? Why is this the industry standard?
It's definitely not proper normalization to do it this way. Are there massive advantages to this method that I'm not seeing?
Yes, you have to alter the existing translation files, and that is a good thing.
If you change the English wording, the translations probably need to change, too. Even if they don't, you need someone who speaks the other language to check.
You prep a new version, and part of the QA process is checking the translations. If the English wording changed and nobody checked the translation, it'll stick out like a sore thumb and it'll get fixed.
The main language is already existent: you don't need to translate it.
Translators have better context with a real sentence than vague placeholders.
The placeholders are just the keys, so it's still possible to change the original language by creating a translation for it. Because when the translation doesn't exist, it uses the placeholder as the translated text.
We've been using abstract placeholders for a while and it was pretty annoying having to write everything twice when creating a new function. When English is the placeholder, you just write the code in English, you have meaningful output from the start and don't have to think about naming placeholders.
So my reason would be less work for the developers.
I like your second approach. When translating texts you always have the problem of homonyms. For example, 'open' can describe the state of a window but also the verb that performs the action. In other languages these homonyms may not exist. That's why you should be able to add meaning to your placeholders. The best approach is to put this meaning in your text library. If this is not possible in the platform or framework you use, it might be a good idea to define a 'development language'. This language adds meaning to the text entries, like 'action_open' and 'state_open'. You will of course have to put extra effort into translating this language into plain English (or the language you develop for). I have applied this philosophy in some large projects, and in the long run it saves some time (and headaches).
The best way, in my opinion, is to keep the meaning separate. So if you develop your own translation library, or the one you use supports it, you can do something like this:
_(i18n("Please enter your name", "error_please_enter_name"));
Where:
i18n(text, meaning)
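As a sketch of that idea in Go (the lookup table and function here are invented purely for illustration, not any real library's API): the context key disambiguates otherwise identical English texts, and the English text doubles as the fallback.

package main

import "fmt"

// translations maps a context ("meaning") plus the source text to a translation.
// Both the table and the function below are illustrative only.
var translations = map[string]map[string]string{
	"error":  {"Please enter your name": "Veuillez saisir votre nom"},
	"prompt": {"Please enter your name": "Votre nom"},
}

// i18n returns the translation for (text, meaning), falling back to the
// English source text when no translation exists.
func i18n(text, meaning string) string {
	if t, ok := translations[meaning][text]; ok {
		return t
	}
	return text
}

func main() {
	fmt.Println(i18n("Please enter your name", "error"))
	fmt.Println(i18n("Please enter your name", "prompt"))
	fmt.Println(i18n("Please enter your age", "prompt")) // falls back to English
}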
Interesting question. I assume the main reason is that you don't have to care about translation or localization files during development as the main language is in the code itself.
Well it probably is just that it's easier to read, and so easier to translate. I'm of the opinion that your way is best for scalability, but it does just require that extra bit of effort, which some developers might not consider worth it... and for some projects, it probably isn't.
There's a fallback hierarchy, from most specific locale to the unlocalised version in the source code.
So French in France might have the following fallback route:
fr_FR
fr
Unlocalised. Source code.
As a result, having proper English sentences in the source code ensures that if a particular translation is not provided in step (1) or (2), you will at least get a proper, understandable sentence rather than random programmer garbage like “error_file_not_found”.
Plus, what do you do if it is a format string: “Sorry but the %s does not exist”? Worse still: “Written %s entries to %s, total size: %d”?
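Here is a rough Go sketch of that fallback chain (the catalog contents are invented; a real gettext implementation does this for you). Because the key is a real English sentence, the final fallback is already a usable message, and format arguments still apply after the lookup:

package main

import "fmt"

// catalogs maps a locale to its translation table; anything missing falls
// through to the less specific locale, and finally to the source text.
var catalogs = map[string]map[string]string{
	"fr_FR": {"Sorry but the %s does not exist": "Désolé, le %s n'existe pas"},
	"fr":    {},
}

func translate(locale, text string) string {
	for _, loc := range []string{locale, locale[:2]} {
		if t, ok := catalogs[loc][text]; ok {
			return t
		}
	}
	return text // unlocalised: the English sentence from the source code
}

func main() {
	// Prints the fr_FR translation with the argument substituted.
	fmt.Printf(translate("fr_FR", "Sorry but the %s does not exist")+"\n", "fichier")
}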
Quite an old question, but here is one additional reason I haven't seen in the answers yet:
With abstract keys you could end up with more placeholders than necessary, and thus more work for translators and possibly inconsistent translations. However, good editors like Poedit or Gtranslator can probably help with that.
To stick with your example:
The text "Please enter your name" could appear in a different context in a different template (that the developer is most likely not aware of and shouldn't need to be). E.g. it could be used not as an error but as a prompt like a placeholder of an input field.
If you use
_("Please enter your name");
it would be reusable, the developer can be unaware of the already existing key for an error message and would just use the same text intuitively.
However, if you used
_("error_please_enter_name");
in a previous template, developers wouldn't necessarily be aware of it and would make up a second key (most likely according to a predefined wording scheme to not end up in complete chaos), e.g.
_("prompt_please_enter_name");
which then has to be translated again.
So I think that doesn't scale very well. A pre-agreed wording scheme of suffixes/prefixes (e.g. for contexts) can never be as precise as the text itself, I think (it's either too verbose or too general; beforehand you don't know, and afterwards it's difficult to change), and it is more work for the developer than it's worth, IMHO.
Does anybody agree/disagree?

Is it possible to design our own algorithm to create unique GUIDs?

GUIDs are generated as a combination of numbers and letters, separated by hyphens,
e.g. {7B156C47-05BC-4eb9-900E-89966AD1430D}.
In Visual Studio, we have the 'Create GUID' tool to create one. I hope the same can be done programmatically through Windows APIs.
How are GUIDs made to be unique? Why don't they use any special characters like # or ^?
Also, is it possible to design our own algorithm to create unique GUIDs?
Yes, it is possible; however, you shouldn't try to reinvent the wheel without a good reason to do so. GUIDs are made unique by including elements which (statistically speaking) are very unlikely to be the same in two distinct cases, such as:
Current timestamp
The uptime of the system
The MAC address of the system
A random number
...
Also, consider the privacy implications of GUIDs if you implement one from scratch, because they can contain the above data, which some people deem sensitive.
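Purely as an illustration of what "rolling your own" amounts to, here is a random (version 4) UUID generator in Go; its uniqueness rests entirely on the statistical argument above, and in real code you would call the platform API or an existing library instead:

package main

import (
	"crypto/rand"
	"fmt"
)

func newGUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4: purely random
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	g, err := newGUID()
	if err != nil {
		panic(err)
	}
	fmt.Println(g) // e.g. 7b156c47-05bc-4eb9-900e-89966ad1430d
}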
UUIDs are defined e.g. in http://www.faqs.org/rfcs/rfc4122.html. One problem with using your own algorithm to generate something like a UUID is that you could collide with other people's UUIDs. So you should definitely use somebody else's implementation if at all possible and, if not, write your own implementation of one of the standard algorithms.
Just to answer this: why don't they use any special characters like # or ^?
A GUID is supposed to be a 128-bit integer, so the common representation is simply 32 hexadecimal digits.
You could also represent it as 128 characters of 1's and 0's, and so on.
As for the rest of the question, wiki has good answers.
You should create new GUIDs by calling the API designed for it. In native Windows land it is CoCreateGuid; in .NET land it is System.Guid.NewGuid().
Yes, although it's easier to make use of someone else's, e.g. Boost uuid.
