Toy Program - Using GDB to bypass a C-function - debugging

I'm working through an exercise with GDB, a tool that helps developers debug their code, on a program written in C, but I can't work out which direction to take.
All I know is that the toy program asks for a password, and that the goal of the exercise is to use GDB to find a vulnerability and so bypass the password check. That's really all I've been given, so any code I posted would be as confusing to the reader as it is to me. So basically, I'm asking for the set of steps that more experienced people would attempt.
I ran the binary through objdump and got a very large disassembly. I found the main function, but after it there are just a lot of library functions full of cryptic hex and assembly code.
Is there a different approach I should be taking? I am not too familiar with stacks, so maybe something to do with that? I know I can set breakpoints, but I would not know which function to start with. When the program runs, it says "Please input password", I type a random password, and the toy program says "Password is incorrect!". Somewhere in between I need to intercept something, but how to do that is beyond me.
Thanks for the assistance. You'd be making a lifelong scholar very happy.

Related

Why do mjit functions get invoked?

I'm doing research on the Ruby interpreter and mJIT.
As a first step, I would like to understand the behavior of both. So I ran a very simple Ruby program, puts ("hello world!"), without the --jit option and captured its execution trace. I found that even without mJIT enabled, some of the mJIT functions get invoked, such as mjit_add_class_serial, mjit_remove_class_serial, mjit_mark, mjit_gc_finish_hook, mjit_free_iseq, and mjit_finish.
I would like to understand why that is. My guess is that the interpreter and mJIT share some of that code, but I'm not 100% sure. In particular, the description of mjit_finish says briefly that it finishes up whatever operation the mJIT compiler is performing. In that case, why does this function get invoked during interpreter-only execution?
If anyone has an idea regarding my question, any recommendation would be very much appreciated.
Thank you.
This is for Ruby version 2.6.2. I've gone through the source code as well as the comments explaining it, but they are not very clear.

How do I reverse-engineer the "import file" feature of an abandoned pascal application?

This is the first question I've asked, and I'm not sure how to ask it clearly, or whether there will be an answer that I want to hear ;)
tl;dr: "I want to import a file into my application at work but I don't know the input format. How can I discover it?"
Forgive any pending wordiness and/or redaction.
In my work I depend on an unsupported (and proprietary) application written in Pascal. I have no experience with Pascal (yet...) and naturally have no source code access. It is an excellent (and very secret/NDA sort of deal, I think) application that allows us to deal with inventory and financial issues in my employer's organization. It is quite feature-comprehensive, reasonably stable and robust, and kind of foisted on us by a higher power.
One excellent feature that it has is the ability to load up "schedules" into our corporate system. This feature should be saving us hundreds of hours in data entry.
But it isn't.
The problem is, the schedules we receive are written in a legacy format intended for human eyes. The "new" system can't interpret them.
Our current information (which I have to read and then re-enter into the database by hand) is sent in a sort of rich-text flat-file format, which would be easy to parse with the string library of probably any mainstream language.
So I want to write a converter to convert our data into a format that the new software can interpret.
By feeding certain assorted files into the system, I have learned a little bit about what kind of file it expects:
1. I "import" a zero-byte file. Nothing happens (same as printing a report with no data).
2. I "import" an XML file that I guess might look like the system expects. It responds with an exception dialog and a stacktrace. Apparently the string <?xml contains illegal characters or something.
3. I "import" a jpeg image -- similar result to #2.
So I think that my target wants a flat-file itself. The file would need to contain a "document number" along with {entries with "incident IDs" and descriptions and numeric values}.
But I don't know this for certain.
Nobody is able to tell me exactly what these files should look like. Someone in the know said that they have seen the feature demonstrated -- somewhere out there is a utility that creates my importable schedules. But for now, the utility is lost and I am on my own.
What methods can I use to figure out the input file format? I know nothing about debugging pascal, but I assume that that is probably my best bet. Or do I have to keep on with brute force until I can afford a million monkey-operated typewriters? Do I have to decompile the target application? I don't know if I can get away with that, let alone read the decompiled source.
My google-fu has failed me.
Has anyone done something like this before or could they point me in the right direction? Are there any guides on this subject?
Thanks in advance.
PS: I am certain that I am not breaking any laws at this point, although I will have to check to find out if decompilation would get me into trouble or not, and that might be outside of my technical competence anyway.
If you have an example file, you can open it with a hexdump utility and see whether there are things you can identify. Any additional information you have (such as what should be in the file) helps with that. Better still, if you know of a program that can edit the file, you can use that editor to make minimal changes and then compare the file before and after.
IOW, the standard tricks of binary file format reverse engineering.
...If you have no existing files whatsoever, then reverse engineering the binary is your only option, and that is not pretty. Decompilation of native binaries is a black art that requires considerable time and skill. Read the various decompilation FAQs on the net.
First and foremost, I would try to contact the authors of the program. Getting the source code is options 1, 2 and 3, and you only go with other options if there is really, really, really no hope whatsoever of obtaining source or getting normal support.
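The hexdump-and-compare trick from this answer can be sketched in a few lines of Python. This is only an illustration: the names hexdump and diff_offsets are made up here, and the sample byte strings are hypothetical stand-ins for two real files saved before and after a minimal edit.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex column and printable-ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(width * 3)} {text_part}")
    return "\n".join(lines)


def diff_offsets(before: bytes, after: bytes) -> list:
    """Return the offsets at which two byte strings differ."""
    shared = min(len(before), len(after))
    changed = [i for i in range(shared) if before[i] != after[i]]
    # Bytes present in only one of the inputs count as differences too.
    changed.extend(range(shared, max(len(before), len(after))))
    return changed


if __name__ == "__main__":
    # Hypothetical contents; in practice read both files with open(path, "rb").
    before = b"DOC0001\x00\x05widget"
    after = b"DOC0002\x00\x05widget"
    print(hexdump(before))
    print(diff_offsets(before, after))
```

Changing one field in the editor and looking at which offsets move is usually enough to start labelling parts of the format.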

What is 'NAILDUMPS'?

I encountered a new term, 'NAILDUMPS', while analysing a flowchart that explains a series of JCLs. Some steps of that flowchart say "this file is naildumped". Can anyone explain what a naildump is and why it is used?
Thanks in advance.
In all my travels through the mainframe world, I've never heard this term, not with Fault Analyser (or its competition) or with system abend stuff, where you'd expect to find it.
Most likely thing is that it's an application specific thing. If you could provide the context around the comment in the JCL, such as a program name like IEBGENER or IEFBR14 (with the options), it may be easier to tell you what it's doing.
For what it's worth (a), there's one page that Google serves up showing one use for this elusive program. The link states that, to empty a dataset, you can use:
//STEP01 EXEC PGM=NAILDUMP
//FILE DD DSN=your filename,DISP=SHR
in your JCL. But given the scarcity of information on this program, the fact it doesn't seem to appear in any of the IBM z/OS docs and the fact that there are perfectly good supported ways of doing this, I'd still maintain that it's some in-house utility. Ask your local sysprogs - even if they don't know, they should be able to see inside the JCL member.
(a) It's probably not worth much since there are all sorts of wonderful things you can do with JCL just by specifying DD commands, even with programs that do absolutely nothing, a la the infamous IEFBR14 program.
NAILDUMP is not a "normal" name for any standard IBM mainframe (z/OS) utility program.
This leaves three possibilities. NAILDUMP could be:
a locally developed program, in which case you need to find the local documentation (good luck!).
a catalogued procedure fronting a standard utility. For example, DFSORT is a catalogued procedure used in many shops to front the standard system sort program.
an alias for another program. For example, ICEMAN is a commonly defined alias for the standard system sort program.
If you have access to the mainframe (or can find someone who does), the ISRDDN utility under TSO can be used to find the actual program load module that NAILDUMP relates to, provided it is a locally developed program or an alias for some other standard utility program. This link gives a brief explanation of how to do it.
If it is a catalogued procedure, you can find it by searching for a member named NAILDUMP in the system default catalogued procedure library or in those specified in the JCL.
Getting to the real name can be a bit of a challenge, but once you get there it should become clear what it is being used for through context.
It seems this is a case where the author of a document is very familiar with a term ("naildump") that the audience of the document is not.
I think you should first ask the author for clarification, because even if someone here tells you what it is supposed to mean, they could be wrong about your particular case.
Given the little context you have, it makes some sense that "NAILDUMP" empties the dataset or deletes it.

How to debug a program without a debugger? [closed]

Closed 9 years ago.
Interview question:
It is often pretty easy to debug a program once you have trouble with your code. You can set watches, breakpoints, etc. Life is much easier because of the debugger.
But how do you debug a program without a debugger?
One possible approach I know of is simply putting print statements in your code wherever you want to check for problems.
Are there any approaches other than this?
As it's a general question, it's not restricted to any specific language. So please share your thoughts on how you would do it.
EDIT: When submitting your answer, please mention a useful resource (if you have one) about any concept you bring up, e.g. logging.
This will be a lot of help for those who don't know about it at all. (This includes me, in some cases :)
UPDATE: Michal Sznajder has put up a real "best" answer and also made it a community wiki. It really deserves lots of upvotes.
Actually, you have quite a lot of possibilities, either with recompilation of the source code or without.
With recompilation.
Additional logging, either into the program's own logs or using system logging (e.g. OutputDebugString or the Event Log on Windows). Also follow these steps:
Always include a timestamp with at least one-second resolution.
Consider adding the thread ID in the case of multithreaded apps.
Add some nice output of your structures.
Do not print out enums with just %d. Use some ToString() or create some EnumToString() function (whatever suits your language).
... and beware: logging changes timings, so in heavily multithreaded code your problems might disappear.
More details on this here.
Introduce more asserts
Unit tests
"Audio-visual" monitoring: if something happens do one of
use buzzer
play system sound
flash some LED by enabling hardware GPIO line (only in embedded scenarios)
Without recompilation
If your application uses a network of any kind: a packet sniffer, or I will just choose one for you: Wireshark.
If you use a database: monitor the queries sent to the database and the database itself.
Use virtual machines to test exactly the same OS/hardware setup as your system is running on.
Use some kind of system call monitor. This includes:
On a Unix box, strace or dtrace
On Windows, the tools from the former Sysinternals, like http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx, ProcessExplorer and the like
In the case of Windows GUI stuff: check out Spy++ or, for WPF, Snoop (although I haven't used the latter)
Consider using some profiling tools for your platform. They will give you an overview of the things happening in your app.
[Real hardcore] Hardware monitoring: use an oscilloscope (aka O-Scope) to monitor signals on hardware lines
Source code debugging: you sit down with your source code and, with a piece of paper and a pencil, just pretend that you are the computer. It's so-called code analysis or "on-my-eyes" debugging.
Source control debugging. Compare diffs of your code between the time when "it" worked and now. The bug might be somewhere in there.
And some general tips in the end:
Do not forget about Text to Columns and Pivot Table in Excel. Together with some text tools (awk, grep or perl), they give you an incredible analysis pack. If you have more than 32K records, consider using Access as the data source.
Basics of Data Warehousing might help. With a simple cube you may analyse tons of temporal data in just a few minutes.
Dumping your application is worth mentioning, either as a result of a crash or just on a regular basis.
Always generate your debug symbols (even for release builds).
Almost last but not least: most major platforms have some sort of command line debugger built in (even Windows!). With some tricks like conditional debugging and break-print-continue you can get pretty good results with obscure bugs.
And really last but not least: use your brain and question everything.
In general, debugging is like science: you do not create it, you discover it. Quite often it's like looking for a murderer in a criminal case. So buy yourself a hat and never give up.
First of all, what does debugging actually do? Advanced debuggers give you machine hooks to suspend execution, examine variables and potentially modify state of a running program. Most programs don't need all that to debug them. There are many approaches:
Tracing: implement some kind of logging mechanism, or use an existing one such as dtrace(). It is usually worth it to implement some kind of printf-like function that can write generally formatted output into a system log. Then just throw state from key points in your program into this log. Believe it or not, in complex programs this can be more useful than raw debugging with a real debugger. Logs help you know how you got into trouble, while a debugger that traps on a crash assumes you can reverse engineer how you got there from whatever state you are already in. For applications that use other complex libraries that you don't own, and that crash in the middle of them, logs are often far more useful. But it requires a certain amount of discipline in writing your log messages.
Program/Library self-awareness: To solve very specific crash events, I have often implemented wrappers on system library functions such as malloc/free/realloc, with extensions that can do things like walk memory, detect double frees, catch attempts to free non-allocated pointers, check for obvious buffer overruns, etc. Often you can do this sort of thing for your important internal data types as well -- typically you can make self-integrity checks for things like linked lists (they can't loop, and they can't point into la-la land). Even for things like OS synchronization objects, often you only need to know which thread, or what file and line number (capturable by __FILE__, __LINE__), the last user of the synch object was to help you work out a race condition.
If you are insane like me, you could, in fact, implement your own mini-debugger inside of your own program. This is really only an option in a self-reflective programming language, or in languages like C with certain OS hooks. When compiling C/C++ in Windows/DOS you can implement a "crash-hook" callback which is executed when any program fault is triggered. When you compile your program you can build a .map file to figure out what the relative addresses of all your public functions are (so you can work out the loader initial offset by subtracting the address of main() from the address given in your .map file). So when a crash happens (even pressing ^C during a run, for example, so you can find your infinite loops) you can take the stack pointer and scan it for offsets within return addresses. You can usually look at your registers, and implement a simple console to let you examine all this. And voila, you have half of a real debugger implemented. Keep this going and you can reproduce the VxWorks console debugging mechanism.
Another approach is logical deduction. This is related to #1. Basically, any crash or anomalous behavior in a program occurs when it stops behaving as expected. You need to have some feedback method of knowing when the program is behaving normally and when abnormally. Your goal then is to find the exact conditions upon which your program goes from behaving correctly to incorrectly. With printf()/logs, or other feedback (such as enabling a device in an embedded system -- the PC has a speaker, but some motherboards also have a digital display for BIOS stage reporting; embedded systems will often have a COM port that you can use), you can deduce at least binary states of good and bad behavior with respect to the run state of your program through the instrumentation of your program.
A related method is logical deduction with respect to code versions. Often a program was working perfectly at one state, but some later version is no longer working. If you use good source control, and you enforce a "top of tree must always be working" philosophy amongst your programming team, then you can use a binary search to find the exact version of the code at which the failure occurs. You can then use diffs to deduce what code change exposes the error. If the diff is too large, then you have the task of trying to redo that code change in smaller steps where you can apply binary searching more effectively.
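The self-integrity check for linked lists mentioned in this answer ("they can't loop") might look roughly like this in Python. It is a sketch only: the Node class and has_cycle name are invented for illustration, and the loop detection uses Floyd's tortoise-and-hare technique, which is one common choice rather than anything the answer prescribes.

```python
class Node:
    """Minimal singly linked list node (hypothetical, for illustration)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def has_cycle(head):
    """Floyd's tortoise-and-hare check: True if the list loops back on itself."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next       # advances one node per step
        fast = fast.next.next  # advances two nodes per step
        if slow is fast:       # the pointers can only meet inside a loop
            return True
    return False


if __name__ == "__main__":
    a, b, c = Node(1), Node(2), Node(3)
    a.next, b.next = b, c
    print(has_cycle(a))  # healthy list
    c.next = a           # simulate corruption: the tail points back at the head
    print(has_cycle(a))
```

A check like this can be called from an assert at the points where the list is modified, so corruption is reported close to where it happens.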
Just a couple suggestions:
1) Asserts. These should help you work out general expectations at different states of the program, and also help you familiarize yourself with the code.
2) Unit tests. I have used these at times to dig into new code and test out APIs
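As a small illustration of both suggestions, here is a Python sketch. The average function and the test names are hypothetical; the point is only to show an assert stating an expectation and a unit test exercising it.

```python
import unittest


def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # The assert documents and enforces an expectation about program state.
    assert len(values) > 0, "average() called with no values"
    return sum(values) / len(values)


class AverageTest(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(average([2, 4, 6]), 4)

    def test_empty_input_trips_the_assert(self):
        with self.assertRaises(AssertionError):
            average([])
```

Saved in a file, the tests can be run with python -m unittest followed by the file name.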
One word: Logging.
Your program should write descriptive debug lines which include a timestamp to a log file based on a configurable debug level. Reading the resultant log files gives you information on what happened during the execution of the program. There are logging packages in every common programming language that make this a snap:
Java: log4j
.Net: NLog or log4net
Python: Python Logging
PHP: Pear Logging Framework
Ruby: Ruby Logger
C: log4c
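For instance, with Python's standard logging module, a minimal sketch of timestamped, level-configurable debug lines could look like the following. The State enum, the make_logger and describe helpers, and the logger name "demo" are all made up for the example; the thread name in the format string is an optional extra.

```python
import enum
import logging


class State(enum.Enum):
    IDLE = 0
    RUNNING = 1


def make_logger(name, level=logging.DEBUG):
    """Build a logger whose lines carry a timestamp, level and thread name."""
    logger = logging.getLogger(name)
    logger.setLevel(level)  # the configurable debug level
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s [%(threadName)s] %(message)s"))
        logger.addHandler(handler)
    return logger


def describe(state):
    """Format an enum member by its symbolic name rather than its bare number."""
    return f"state={state.name}"


if __name__ == "__main__":
    log = make_logger("demo")
    log.debug("starting up, %s", describe(State.IDLE))
    log.info("now %s", describe(State.RUNNING))
```

Raising the level to logging.INFO silences the debug lines without touching the call sites.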
I guess you just have to write fine-grain unit tests.
I also like to write a pretty-printer for my data structures.
I think the rest of the interview might go something like this...
Candidate: So you don't buy debuggers for your developers?
Interviewer: No, they have debuggers.
Candidate: So you are looking for programmers who, out of masochism or chest thumping hamartia, make things complicated on themselves even if they would be less productive?
Interviewer: No, I'm just trying to see if you know what you would do in a situation that will never happen.
Candidate: I suppose I'd add logging or print statements. Can I ask you a similar question?
Interviewer: Sure.
Candidate: How would you recruit a team of developers if you didn't have any appreciable interviewing skill to distinguish good prospects based on relevant information?
Peer review. You have been looking at the code for 8 hours and your brain is just showing you what you want to see in the code. A fresh pair of eyes can make all the difference.
Version control. Especially for large teams. If somebody changed something you rely on but did not tell you, it is easy to find the specific change set that caused your trouble by rolling the changes back one by one.
On *nix systems, strace and/or dtrace can tell you an awful lot about the execution of your program and the libraries it uses.
Binary search in time is also a method: If you have your source code stored in a version-control repository, and you know that version 100 worked, but version 200 doesn't, try to see if version 150 works. If it does, the error must be between version 150 and 200, so find version 175 and see if it works... etc.
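The search itself is short enough to sketch in Python. The function name first_bad_version and the works callback are hypothetical; in practice works would check out the given version, build it and run whatever test shows the bug. The sketch assumes that once a version fails, every later version fails too.

```python
def first_bad_version(good, bad, works):
    """Binary-search for the first failing version.

    Assumes `good` works, `bad` fails, and every version after the first
    failing one also fails.
    """
    while bad - good > 1:
        middle = (good + bad) // 2
        if works(middle):
            good = middle  # the bug was introduced after `middle`
        else:
            bad = middle   # the bug is already present at `middle`
    return bad


if __name__ == "__main__":
    # Pretend the bug was introduced in version 163.
    print(first_bad_version(100, 200, lambda version: version < 163))
```

With 100 candidate versions this needs about seven builds instead of up to a hundred.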
use println/log in code
use DB explorer to look at data in DB/files
write tests and put asserts in suspicious places
More generally, you can monitor side effects and output of the program, and trigger certain events in the program externally.
A Print statement isn't always appropriate. You might use other forms of output such as writing to the Event Log or a log file, writing to a TCP socket (I have a nice utility that can listen for that type of trace from my program), etc.
For programs that don't have a UI, you can trigger behavior you want to debug by using an external flag such as the existence of a file. You might have the program wait for the file to be created, then run through a behavior you're interested in while logging relevant events.
Another file's existence might trigger the program's internal state to be written to your logging mechanism.
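A minimal Python sketch of the file-existence trigger follows. Everything named here (dump_state_if_requested, the trigger and dump file names, the sample state) is an assumption for illustration; a real program would call the check from its main loop.

```python
import json
import os
import tempfile


def dump_state_if_requested(trigger_path, state, dump_path):
    """Write `state` to `dump_path` when the trigger file exists.

    Returns True if a dump was written. The trigger is removed afterwards so
    that each request produces exactly one dump.
    """
    if not os.path.exists(trigger_path):
        return False
    with open(dump_path, "w", encoding="utf-8") as out:
        json.dump(state, out, indent=2, sort_keys=True)
    os.remove(trigger_path)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        trigger = os.path.join(workdir, "dump.trigger")
        dump = os.path.join(workdir, "state.json")
        state = {"queue_length": 3, "mode": "idle"}
        print(dump_state_if_requested(trigger, state, dump))  # no trigger yet
        open(trigger, "w").close()  # the operator creates the file
        print(dump_state_if_requested(trigger, state, dump))
```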
Like everyone else said:
Logging
Asserts
Extra Output
& your favorite task manager or process explorer
links here and here
Another thing I have not seen mentioned here that I have had to use quite a bit on embedded systems is serial terminals.
You can connect a serial terminal to just about any type of device on the planet (I have even done it to embedded CPUs for hydraulics, generators, etc). Then you can write out to the serial port and see everything on the terminal.
You can get really fancy and even set up a thread that listens to the serial terminal and responds to commands. I have done this as well and implemented simple commands to dump a list, see internal variables, etc, all from a simple 9600 baud RS-232 serial port!
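The command-listening idea can be sketched in Python without any hardware. Here io.StringIO objects stand in for the serial port, and the serve_commands function and the "vars" command are invented for the example; on a real device the two streams would be whatever your serial library gives you, and the loop would typically run in its own thread.

```python
import io


def serve_commands(port_in, port_out, commands):
    """Read one command per line and write each reply back to the port."""
    for line in port_in:
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        handler = commands.get(name)
        reply = handler() if handler else f"unknown command: {name}"
        port_out.write(reply + "\n")


if __name__ == "__main__":
    internal = {"pressure": 42, "mode": "idle"}
    commands = {
        # Hypothetical command that dumps internal variables.
        "vars": lambda: " ".join(f"{k}={v}" for k, v in sorted(internal.items())),
    }
    fake_in = io.StringIO("vars\nreboot\n")  # stand-in for bytes typed on the terminal
    fake_out = io.StringIO()
    serve_commands(fake_in, fake_out, commands)
    print(fake_out.getvalue(), end="")
```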
Spy++ (and more recently Snoop for WPF) are tremendous for getting an insight into Windows UI bugs.
A nice read would be Delta Debugging from Andreas Zeller. It's like binary search for debugging

How detailed should error messages be?

I was wondering what the general consensus on error messages was. How detailed should they be?
I've worked on projects where there was a different error message for entering a number that was too big, too small, had a decimal, was a string, etc. That was quite nice for the user as they knew exactly where things went wrong, but the error handling code started to rival the actual business logic in size, and started to develop some of its own bugs.
On the other side I've worked on a project where you'd get very generic errors such as
COMPILE FAILED REASON 3
Which needless to say was almost entirely useless as reason 3 turned out to mean a link error.
So where is the middle ground? How do I know if I've added descriptive enough error messages? How do I know if the user will be able to understand where they've gone wrong?
There are two possible target audiences for an error message, the user, and the developer.
One should generally have the message target the user.
o what is the cause of the problem.
o why the program can't work around the problem
o what the user can do to work around the problem.
o how to report the problem.
If the problem is to be reported, the report should include as much program context information as possible.
o module name
o function name
o line number
o variables of interest in the general area of the problem
o maybe even a core dump.
Target the correct audience.
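For the developer-facing report, much of the context listed above (file, function name, line number) can be collected automatically. A minimal Python sketch, assuming the failure surfaces as an exception; describe_failure and parse_quantity are hypothetical names used only for this example:

```python
import traceback


def describe_failure(exc):
    """Collect where a raised exception came from: file, function, line, message."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1]  # innermost frame, i.e. where the exception was raised
    return {
        "file": last.filename,
        "function": last.name,
        "line": last.lineno,
        "error": f"{type(exc).__name__}: {exc}",
    }


def parse_quantity(text):
    """Hypothetical piece of business logic that can fail on bad input."""
    return int(text)


if __name__ == "__main__":
    try:
        parse_quantity("twelve")
    except ValueError as exc:
        print(describe_failure(exc))
```

The dictionary is meant for the log or the bug report; the user would still see a short message written for them.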
You should communicate what happened, and what the user's options are, in as few words as possible. The longer the error message is, the less likely the user is to read it. By the same token, short error messages are cryptic and useless. There's a sweet spot in terms of length, and it's different for every situation.
Too short:
Invalid input.
Too long:
Please enter a correctly formatted IP address, such as 192.168.0.1. An IP address is a number used to identify your computer on a network.
Just right:
Please enter a valid IP address.
As far as code bloat is concerned, if a little extra code will prevent a user from calling support or getting frustrated, then it's a good investment.
There are two types of error messages: Those that will be seen by the user and those that'll be seen by the programmer.
"How do I know if the user will be able to understand where they've gone wrong?"
I'm assuming that those messages are only going to be seen by the user, and not a very technical one, and COMPILE FAILED REASON 3 is not a typical end-user error message. It's something that the programmer will see (the user doesn't usually compile things).
So, if it's the user that'll see it:
Provide a short "This is an error message" ("Oops! Something went wrong!", etc.)
Provide a small generic description of the error ("The site you're trying to connect to seems to be unavailable"/"You don't seem to have enough permissions to perform the XYZ task"/etc.)
Add a "Details>>" button, in case your user happens to understand computers well, including detailed information (exception stack trace, error code, etc.)
Finally, provide some simple and understandable commands for the user ("Try again", "Cancel", etc.)
The real question about error messages is if they should even be displayed. A lot of error messages are presented to a user but there is NO WAY for them to correct them.
As long as there is a way to correct the error, give the user enough information to correct it on their own. If they are not able to correct it on their own, is there any reason to tell them the technical reason for the crash? No; just log it to a file for troubleshooting later.
As detailed as they need to be ;)
I know it sounds like a smart ass answer but so much of this depends on your target audience and the type of error. For errors caused by invalid user entry, you could show them what constitutes a valid entry. For errors that the user can't control, a generic "we're working on it" type message might do.
I agree with Jon B's comments regarding length as well.
Error messages should be detailed, but clear. This can be achieved by combining error messages from multiple levels:
Failed to save the image
Permission denied: /foo.jpg
Here we have two levels. There can be more. First we tell the big picture and then we tell the details. The order is such that first we have the part understood by most and then the part fewer people understand, but both can still be visible.
Additionally there could be a fix suggestion.
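In a language with chained exceptions, the two-level message above falls out fairly naturally. A Python sketch, where SaveError, save_image and layered_message are names invented for the illustration:

```python
import os
import tempfile


class SaveError(Exception):
    """High-level failure that most users can understand."""


def save_image(path, data):
    """Write image bytes, wrapping low-level failures in a SaveError."""
    try:
        with open(path, "wb") as out:
            out.write(data)
    except OSError as exc:
        # Keep the low-level cause attached to the high-level message.
        raise SaveError("Failed to save the image") from exc


def layered_message(exc):
    """Join the message of each exception in the cause chain, outermost first."""
    lines = []
    current = exc
    while current is not None:
        lines.append(str(current))
        current = current.__cause__
    return "\n".join(lines)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # The "missing" subdirectory does not exist, so the write fails.
        target = os.path.join(workdir, "missing", "foo.jpg")
        try:
            save_image(target, b"not really a jpeg")
        except SaveError as exc:
            print(layered_message(exc))
```

The first line is the big picture; each following line is one level more detailed, and a user interface could show only as many levels as its audience wants.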
I would err on the side of more detail, but I think you answered your own question. To avoid bloat in the code, provide the useful information in the code/error message itself, and give more details in the documentation, perhaps, or in help files or an FAQ.
Having too little information is worse, in my opinion.
If you are using a language with rich introspection or other capabilities, a log with the line that failed a check would be useful. The user can then forward it to tech support or otherwise get detailed information, and this is not additional code bloat but a use of your own code to provide information.

Resources