Does all code get converted into machine code? - compilation

For programs to run on a computer, does the code have to be converted into machine code so the CPU can run it?
How does this happen?

Wow! That needs a lot to explain. :D
First, machines, like humans, have their own languages, so if you want a computer to do what you say, you have to say it in its language :)
But you have probably heard about compiling and interpreting:
Compile: convert (a program) into a machine-code or lower-level form in which the program can be executed.
So basically the code is converted into something else, such as an executable file, once the programmers decide they are done programming. If you open an .exe file in Notepad, you will not be able to make sense of it, and code that has been compiled for Windows cannot be executed on a Mac.
Interpreting: the code is converted by another program at runtime, so it stays human-readable until the last moment. For example, if you right-click this page and select "View page source", you can see the HTML that was generated for this page. This means the code is flexible and works on different machines: you can view the same page on a Mac or on Windows, in different browsers such as Chrome, Firefox or IE. The price is that it is usually slower than compiled code.
What do we often do in practice?
We compile our code to an intermediate language that is understood by a virtual machine, and there is a version of that virtual machine for each kind of machine (Java and C# work this way).
Let me explain it with an example. Let's say someone wants to give a speech at the UN, in Chinese.
If he translates his whole speech into different languages beforehand and hands the translations out to people, that is compiling.
If he speaks and some people translate his words live into French, English, etc., that is interpreting. But it sucks, and you probably won't find anyone to do it for many languages.
If he gives a translated version (say, the English version) to the translators before the speech, so they can read it and render it into different languages while he speaks, that is what we do now :D
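To make the "intermediate language" idea a bit more concrete, here is a small sketch in Python, which happens to work this way: source text is first compiled to bytecode, and the CPython virtual machine then runs that bytecode. The variable names (`price`, `quantity`, `total`) are made up for the illustration, and this shows bytecode only, not CPU machine code.

```python
import dis

source = "total = price * quantity"

# compile() turns the source text into a code object holding CPython bytecode,
# an intermediate form that the CPython virtual machine executes.
code_obj = compile(source, "<example>", "exec")

# The raw bytecode is just bytes, not something meant for humans to read.
raw_bytes = code_obj.co_code

# dis translates the same bytecode into readable instruction names.
instruction_names = [ins.opname for ins in dis.get_instructions(code_obj)]

# The virtual machine runs the compiled code against some data.
namespace = {"price": 3, "quantity": 4}
exec(code_obj, namespace)

print(instruction_names)
print(namespace["total"])
```

The exact instruction names differ between Python versions, which is a reminder that the bytecode is tied to the virtual machine rather than to the CPU.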
You can read more here: Runtime vs Compile time

Related

Windows 10 bootmgr Help: viewing the source code

I am in the process of learning things in reverse order for fun, and I have decided to dissect Windows 10, bit by bit, and learn what makes a great OS function. I also suppose that my question will be useful in other ways as well.
My question is, how do I look at something like the Windows bootmgr source code properly? I have opened the file (its file type is redundantly called "File") and even though it is in Assembly language, it is completely impossible to read. My guess is that whoever wrote the file did something to encrypt it so that it is unreadable, and thus cannot be edited.
Let me be perfectly clear: my purpose is not to change the bootmgr File to change windows, but rather to get a better understanding of how an OS works via reading, and also through trial and error.
Any help that anyone can give would be greatly appreciated. I love to learn about these things, and I just have been completely unable to find the answer I am looking for on any site thus far, including this one...IDK if I need to refine my searches or what.
Thanks in advance for your help. :)
P.S. I shall include a picture of what I am seeing in Notepad++ so you can get a better understanding of what I need here.
I think you may be confusing assembly language with machine code. Machine code is the language that your computer's processor understands. Assembly language is a series of symbols that are used to represent machine code. Compiled executables are stored in machine code.
That said, the standard way to view the machine code for a compiled binary is through the use of a program called a hex editor. A hex editor will display the binary code in a numerical format, rather than attempting to interpret the binary as text, like your editor is trying to do in the screenshot you supplied. Frhed is a popular hex editor, but there are many good ones to choose from.
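As a rough idea of what a hex editor shows, here is a minimal hex dump sketch in Python. The function name `hex_dump` and the sample bytes are made up for the illustration; a real hex editor such as Frhed does far more (editing, searching, large files).

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Each byte as two hex digits, e.g. 0x4d -> "4d".
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is, everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(width * 3 - 1)}  {text_part}")
    return "\n".join(lines)


# Illustrative sample only: a few bytes, starting with the "MZ" marker
# that Windows executables begin with.
sample = b"MZ\x90\x00\x03\x00\x00\x00"
print(hex_dump(sample))
```

To look at a real file, you would pass it the result of reading the file in binary mode, which only reads the bytes and does not run anything.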

Interact with GUI Elements of a Windows Application

First of all, I want to express my appreciation for the work on the SCIDvsPC project. I know that the original SCID was discontinued many years ago, and the developer has done a great job of expanding it and doing his share for the chess field. We have a minor project to do in the 6th semester of our college course. We've decided to build a chess next-move analyzer that is based on a variety of filters and uses self-learning and machine learning.
I've been researching the project idea for the last 2 months. We need to import several games selected by some filters, then read and analyze the generated PGN file. For example, if the user chooses to get the next best move predicted for the rating range 2000-2500, our program should only export and analyze the PGN files in which both opponents are in this range. I know the project can do all this, but I'm confused about how to automate it. At the moment I have to enter the moves manually and then click on 'Generate PGN'. How can I make my program do this, i.e. take input from the user (like the first 3 moves), make the project play these moves (what I had to do manually), then generate the PGN file and keep it in a folder?
I've searched the net about interacting with GUI elements in Windows (we have no problem working with Linux either) and came across Microsoft UI Automation, Python, Java and C# software, and something called COM. Does the software support COM or any of these, or have you already developed some functionality like this? Could you please guide me on this?
To generalize, what I want to do is interact with GUI elements, whatever the application. Take Notepad as an example. Suppose I want to open a file in it, then find and replace a particular word. Now, I know how to do this manually, but when I have thousands of files I need some kind of program to do this for me. Do specific programs, like SCID in my case, have some pre-built feature (I read a bit about COM) to handle this? Which programming language domain does this fall into? Would using Linux help me more?
Take Notepad as an example. Suppose I want to open a file in it, then find and replace a particular word. Now, I know how to do this manually, but when I have thousands of files I need some kind of program to do this for me. Do specific programs, like SCID in my case, have some pre-built feature (I read a bit about COM) to handle this?
Your situation sounds quite specific, so I doubt you will be able to find a pre-existing program that does this for you. Meaning: you'll have to code it yourself.
Which programming language domain does this fall into?
Well, this could probably be done in many, many different programming languages. A simple shell script would be able to achieve the Notepad example you gave.
Would using Linux help me more?
No, your goals seem to be pretty achievable with a simple shell script, whether you write it on Windows, macOS or a Linux distro.
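For the Notepad example, no GUI is needed at all: a script can edit the files directly. The answer above suggests a shell script; the sketch below does the same job in Python instead. The function name `replace_in_files`, the `*.txt` pattern and the UTF-8 encoding are assumptions for the illustration, and the demo runs on a throwaway folder.

```python
import tempfile
from pathlib import Path


def replace_in_files(folder: str, old: str, new: str, pattern: str = "*.txt") -> int:
    """Replace `old` with `new` in every matching file; return how many files changed."""
    changed = 0
    for path in Path(folder).rglob(pattern):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8")
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed += 1
    return changed


# Demo on a temporary folder so no real files are touched.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "note.txt"
    sample.write_text("hello world", encoding="utf-8")
    changed_count = replace_in_files(folder, "world", "there")
    result_text = sample.read_text(encoding="utf-8")

print(changed_count, result_text)
```

On real data it would be sensible to work on a copy of the folder first, since the files are overwritten in place.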
#SB87 gave you some useful hints; I'd like to expand on his answers.
Sorry, I don't think you know what you're talking about. Reinforcement learning (a better term than self-learning) and machine learning are not suitable for a college project. They are at the PhD or research level, so consider getting yourself into university before even thinking about anything like that.
UI automation is possible, but error-prone and slow. If you want to do it, you'd write a console program. You mentioned something about user inputs; do you mean you want to apply machine learning to the user's mouse and keyboard inputs? It's not going to work. Machine learning for chess requires hundreds and thousands of training examples.
I think you should scale down the project and focus on something you can achieve.

c#, vb, java, python script testing environment?

I was watching a video on KhanAcademy:
http://www.khanacademy.org/video/insertion-sort-in-python?playlist=Computer+Science
I noticed the IDE he was using offered a neat little interface for testing functions: you could write as much code as you needed, then test it without having to compile and run your entire application just to test one function. Instead, in a command line, you could just say a = 100, then tell it to run a method on that value, Function(a), and have it run.
I don't know how else to explain this other than telling you to watch that video. Now, I know that in Visual Studio you can run your application, then play with watched variables to manipulate the outcome, but that's not really the same. I'm looking for something quick and snappy, similar to PyScripter in this sense. Does anyone know of any tools like this for any of the aforementioned languages?
For C# there is LINQPad, a C# scratch pad that also speaks LINQ2SQL.
The concept in Python and similar languages is called a REPL (read-eval-print loop). LINQPad is not exactly the same: it does not keep old results in the same way, and you need to run complete snippets of code, but that is usually not a problem.
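To show what "keeping old results" means in a REPL, here is a toy read-eval-print loop sketched in Python. It takes its input from a list instead of the keyboard so it can be run as a script; `tiny_repl` and `double` are made-up names, and the real Python REPL is considerably more capable (multi-line input, error reporting and so on).

```python
def tiny_repl(lines):
    """Evaluate each line in one shared namespace, like an interactive session."""
    namespace = {}          # survives between lines, so earlier results stay available
    results = []
    for line in lines:
        try:
            # Expressions produce a value that a REPL would print.
            results.append(repr(eval(line, namespace)))
        except SyntaxError:
            # Statements (assignments, def, ...) are run for their effect only.
            exec(line, namespace)
            results.append(None)
    return results


session = tiny_repl([
    "a = 100",
    "def double(x): return x * 2",
    "double(a)",
])
print(session)
```

The third line can use both `a` and `double` because the namespace from the first two lines is still there, which is the behaviour described in the question.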

Macro/Scripting language for non-developers with a simple GUI-based editor

We wish to let people add some logic to their accounts (say, given a few arguments, how to compute a particular result). Essentially, this would be tantamount to writing simple business rules with support for conditionals and expressions. However, the challenge is to provide them with a simple online editor where they can create the logic, preferably by completely visual means (drag/drop Expr-tree nodes maybe -- kinda like Y! pipes).
Does anybody know of a scripting/macro/domain-specific language that lets people do this? The challenge is the visual editor, since we don't wish to invest in developing the UI to do the editing. The basic requirements would be:
1. Embedded into another language, or run securely (no reboot -n or <JUNK-DANGEROUS-COMMAND> >> ~/.bashrc)
2. Easily accessible to users without coding background (no need of any advanced features)
3. Preferably have a simple GUI based editor to create the logic programs accessible to non-developers (kinda like spreadsheets)
4. Some ability to generate compile-time warnings (invalid code) would be good (Type safety?)
5. Ability to embed some data before execution which is available to the interpreter (Eg., name, birthday, amount)
Has anybody tried doing something like this and got any ideas? I looked at Lua, Io, Python, Ruby and a host of others, but the challenge essentially is that I don't think non-programmers will be able to understand the code all that much. Something that could be added via "meta-programming" to, say, Ruby would be good as well, if an editor could be easily developed!
As a matter of fact, Microsoft is developing Oslo, which is right up your alley.
Chris Sells has been writing a lot about it recently.
It is designed to be a way to author DSLs and also to visually author these models with a graphical tool called Quadrant. Sounds very very similar to what you are looking for.
Open source wise, Ruby I think can be close, as you can see if you look at _whytheluckystiff's Try Ruby or Hackety.
I don't think you'll find anything that isn't too generic, especially regarding the GUI editor. There are no generic tools, as far as I know, that will be able to automatically interface with your program, query data from it and interpret the script into commands in your software -- if there is one, I'd like to have a copy. Not being flippant, but you will have to do some (probably a lot of) work to get this working. It will probably result in you writing a custom DSL.
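To give a feel for what a small custom DSL could look like, here is a sketch of a restricted rule evaluator in Python. It is only one possible approach under stated assumptions: it borrows Python's expression syntax, allows just arithmetic, comparisons and a conditional, and rejects everything else (including function calls). The names `evaluate_rule`, `amount` and `discount` are made up for the illustration, and the visual editor is not covered at all.

```python
import ast
import operator

# Only these operators are allowed in a rule.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_COMPARE = {ast.Gt: operator.gt, ast.Lt: operator.lt, ast.GtE: operator.ge,
            ast.LtE: operator.le, ast.Eq: operator.eq}


def evaluate_rule(rule: str, data: dict):
    """Evaluate a rule against `data`, refusing anything outside the whitelist."""
    tree = ast.parse(rule, mode="eval")   # invalid syntax raises SyntaxError here

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float, str)):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in data:
                raise ValueError(f"unknown field: {node.id}")
            return data[node.id]          # data embedded before execution
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in _COMPARE):
            return _COMPARE[type(node.ops[0])](walk(node.left), walk(node.comparators[0]))
        if isinstance(node, ast.IfExp):
            return walk(node.body) if walk(node.test) else walk(node.orelse)
        raise ValueError(f"not allowed in a rule: {type(node).__name__}")

    return walk(tree)


rule = "amount - discount if amount > 100 else amount"
print(evaluate_rule(rule, {"amount": 200, "discount": 20}))
```

Because the evaluator walks the syntax tree itself instead of handing the text to `eval`, a rule like `__import__('os')` is rejected rather than executed.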
I would take a look at PowerShell. You could surface all the activities a user would like to script in a very readable way.
There is some talk of using PowerShell to create a DSL on the PowerShell team blog and Bruce Payette, the technical lead, talks about this in his book Windows PowerShell in Action from Manning.
At the other end of the scale is to write something simple as a HyperText Application (HTA) -- assuming Windows of course -- along the lines of my Clive tool. The article on the blog doesn't mention the HTA version, but essentially I could enter VBScript-ish code into one textarea and interpret it on the spot, output going into another text area on the form.
With HTAs giving you all the form control of HTML, plus the DOM, you could come up with something interesting fairly quickly.

Is it possible to "decompile" a Windows .exe? Or at least view the Assembly?

A friend of mine downloaded some malware from Facebook, and I'm curious to see what it does without infecting myself. I know that you can't really decompile an .exe, but can I at least view it in Assembly or attach a debugger?
Edit to say it is not a .NET executable, no CLI header.
With a debugger you can step through the program assembly interactively.
With a disassembler, you can view the program assembly in more detail.
With a decompiler, you can turn a program back into partial source code, assuming you know what it was written in. You can find that out with free tools such as PEiD (if the program is packed, you'll have to unpack it first), or Detect-it-Easy if you can't find PEiD anywhere. DIE currently has a strong developer community on GitHub.
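Identification starts with the file's headers, and the very first step is easy to do by hand. The sketch below, in Python, only checks the "MZ" marker, the "PE" signature and the machine type of a Windows executable; it is not how PEiD or Detect-it-Easy work internally, and it does not detect packers or compilers. The name `describe_pe` is made up, and the demo uses a hand-built fake header rather than a real file.

```python
import struct

# Two common values of the machine field in the PE file header.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def describe_pe(data: bytes) -> str:
    """Report whether the bytes look like a PE file and for which CPU."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not a PE file (no MZ header)"
    # The 4-byte little-endian value at offset 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return "MZ header found, but no PE signature"
    if len(data) < pe_offset + 6:
        return "PE signature found, but the header is truncated"
    # The machine type is the 2-byte field right after the signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return "PE file for " + MACHINE_NAMES.get(machine, f"machine type 0x{machine:04x}")


# Fake, minimal header built for the demo only (not a runnable program).
fake = (b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40)
        + b"PE\x00\x00" + struct.pack("<H", 0x8664))
print(describe_pe(fake))
```

Reading a suspicious file's bytes like this does not execute it, but doing so inside a virtual machine is still the safer habit.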
Debuggers:
OllyDbg, free, a fine 32-bit debugger, for which you can find numerous user-made plugins and scripts to make it all the more useful.
WinDbg, free, a quite capable debugger by Microsoft. WinDbg is especially useful for looking at the Windows internals, since it knows more about the data structures than other debuggers.
SoftICE, SICE to friends. Commercial, and development stopped in 2006. SoftICE is kind of a hardcore tool that runs beneath the operating system (and halts the whole system when invoked). SoftICE is still used by many professionals, although it might be hard to obtain and might not work on some hardware or software (namely, it will not work on Vista or with NVIDIA gfx cards).
Disassemblers:
IDA Pro (commercial) - top-of-the-line disassembler/debugger. Used by most professionals, such as malware analysts. Costs quite a few bucks though (there is a free version, but it is quite limited).
W32Dasm(free) - a bit dated but gets the job done. I believe W32Dasm is abandonware these days, and there are numerous user-created hacks to add some very useful functionality. You'll have to look around to find the best version.
Decompilers:
Visual Basic: VB Decompiler, commercial, produces somewhat identifiable bytecode.
Delphi: DeDe, free, produces good quality source code.
C: HexRays, commercial, a plugin for IDA Pro by the same company. Produces great results but costs a big buck, and won't be sold to just anyone (or so I hear).
.NET(C#): dotPeek, free, decompiles .NET 1.0-4.5 assemblies to C#. Support for .dll, .exe, .zip, .vsix, .nupkg, and .winmd files.
Some related tools that might come in handy in whatever it is you're doing are resource editors such as ResourceHacker (free) and a good hex editor such as Hex Workshop (commercial).
Additionally, if you are doing malware analysis (or use SICE), I wholeheartedly suggest running everything inside a virtual machine, namely VMware Workstation. In the case of SICE, it will protect your actual system from BSODs, and in the case of malware, it will protect your actual system from the target program. You can read about malware analysis with VMware here.
Personally, I roll with Olly, WinDbg & W32Dasm, and some smaller utility tools.
Also, remember that disassembling or even debugging other people's software is usually against the EULA at the very least :)
psoul's excellent post answers your question, so I won't replicate his good work, but I feel it'd help to explain why this is at once a perfectly valid but also terribly silly question. After all, this is a place to learn, right?
Modern computer programs are produced through a series of conversions, starting with the input of a human-readable body of text instructions (called "source code") and ending with a computer-readable body of instructions (called alternatively "binary" or "machine code").
The way that a computer runs a set of machine code instructions is ultimately very simple. Each action a processor can take (e.g., read from memory, add two values) is represented by a numeric code. If I told you that the number 1 meant scream and the number 2 meant giggle, and then held up cards with either 1 or 2 on them expecting you to scream or giggle accordingly, I would be using what is essentially the same system a computer uses to operate.
A binary file is just a set of those codes (usually called "op codes") and the information ("arguments") that the op codes act on.
Now, assembly language is a computer language where each command word in the language represents exactly one op-code on the processor. There is a direct 1:1 translation between an assembly language command and a processor op-code. This is why coding assembly for an x86 processor is different from coding assembly for an ARM processor.
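The scream/giggle idea can be sketched as a toy machine in Python. The op-codes below (0x01, 0x02, 0x03, 0xFF) and their meanings are invented for the illustration and do not belong to any real processor; the point is only that a "binary" is a sequence of numbers, some of which are op-codes and some of which are their arguments.

```python
def run(program: bytes) -> list:
    """Execute a toy binary: numeric op-codes followed by their arguments."""
    output = []
    accumulator = 0
    pc = 0                                  # position of the next op-code
    while pc < len(program):
        opcode = program[pc]
        if opcode == 0x01:                  # LOAD n: put n in the accumulator
            accumulator = program[pc + 1]
            pc += 2
        elif opcode == 0x02:                # ADD n: add n to the accumulator
            accumulator += program[pc + 1]
            pc += 2
        elif opcode == 0x03:                # PRINT: record the accumulator
            output.append(accumulator)
            pc += 1
        elif opcode == 0xFF:                # HALT: stop running
            break
        else:
            raise ValueError(f"unknown op-code 0x{opcode:02x}")
    return output


# "Load 2, add 3, print, halt" as raw bytes.
binary = bytes([0x01, 2, 0x02, 3, 0x03, 0xFF])
print(run(binary))
```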
Disassembly is simply this: a program reads through the binary (the machine code), replacing the op-codes with their equivalent assembly language commands, and outputs the result as a text file. It's important to understand this; if your computer can read the binary, then you can read the binary too, either manually with an op-code table in your hand (ick) or through a disassembler.
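Here is that lookup idea as a toy disassembler in Python. The op-code table is invented for the illustration (it is not x86 or ARM), and real disassemblers have to deal with variable-length instructions, data mixed with code and much more.

```python
# Invented op-code table: op-code -> (mnemonic, number of argument bytes).
OPCODES = {
    0x01: ("LOAD", 1),
    0x02: ("ADD", 1),
    0x03: ("PRINT", 0),
    0xFF: ("HALT", 0),
}


def disassemble(binary: bytes) -> list:
    """Replace each op-code with its mnemonic and list its arguments."""
    lines = []
    pc = 0
    while pc < len(binary):
        entry = OPCODES.get(binary[pc])
        if entry is None:
            # Unknown byte: show it as raw data rather than guessing.
            lines.append(f"DB 0x{binary[pc]:02x}")
            pc += 1
            continue
        name, arg_count = entry
        args = binary[pc + 1:pc + 1 + arg_count]
        lines.append(" ".join([name] + [str(a) for a in args]))
        pc += 1 + arg_count
    return lines


listing = disassemble(bytes([0x01, 2, 0x02, 3, 0x03, 0xFF]))
print("\n".join(listing))
```

Note that nothing in the output resembles a variable name: the table can only give back what the binary contains.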
Disassemblers have some new tricks and all, but it's important to understand that a disassembler is ultimately a search and replace mechanism. Which is why any EULA which forbids it is ultimately blowing hot air. You can't at once permit the computer reading the program data and also forbid the computer reading the program data.
(Don't get me wrong, there have been attempts to do so. They work as well as DRM on song files.)
However, there are caveats to the disassembly approach. Variable names are non-existent; such a thing doesn't exist to your CPU. Library calls are confusing as hell and often require disassembling further binaries. And assembly is hard as hell to read in the best of conditions.
Most professional programmers can't sit and read assembly language without getting a headache. For an amateur it's just not going to happen.
Anyway, this is a somewhat glossed-over explanation, but I hope it helps. Everyone can feel free to correct any misstatements on my part; it's been a while. ;)
Good news. IDA Pro is actually free for its older versions now:
http://www.hex-rays.com/idapro/idadownfreeware.htm
x64dbg is a good and open source debugger that is actively maintained.
Any decent debugger can do this. Try OllyDbg. (edit: which has a great disassembler that even decodes the parameters to WinAPI calls!)
If you are just trying to figure out what a piece of malware does, it might be much easier to run it under something like the free tool Process Monitor, which will report whenever it tries to access the filesystem, registry, ports, etc.
Also, using a virtual machine like the free VMware Server is very helpful for this kind of work. You can make a "clean" image, and then just go back to that every time you run the malware.
I'd say in 2019 (and even more so in 2022), Ghidra (https://ghidra-sre.org/) is worth checking out. It's open source (and free), and has phenomenal code analysis capabilities, including the ability to decompile all the way back to fairly readable C code.
Sure, have a look at IDA Pro. They offer an eval version so you can try it out.
You may get some information viewing it in assembly, but I think the easiest thing to do is fire up a virtual machine and see what it does. Make sure you have no open shares or anything like that that it can jump through though ;)
Boomerang may also be worth checking out.
I can't believe nobody has said anything about Immunity Debugger yet.
Immunity Debugger is a powerful tool to write exploits, analyze malware, and reverse engineer binary files. It was initially based on the OllyDbg 1.0 source code, but with the name resolution bug fixed. It has a well-supported Python API for easy extensibility, so you can write your own Python scripts to help you with the analysis.
Also, there's a good one that Peter from the Corelan team wrote, called mona.py. Excellent tool, btw.
If you want to run the program to see what it does without infecting your computer, use a virtual machine like VMware or Microsoft VPC, or a program that can sandbox it, like Sandboxie.
You can use dotPeek, which is very good for decompiling .exe files. It is free.
https://www.jetbrains.com/decompiler/
What you want is a type of software called a "Disassembler".
Quick google yields this: Link
If you have no time, submit the malware to cwsandbox:
http://www.cwsandbox.org/
http://jon.oberheide.org/blog/2008/01/15/detecting-and-evading-cwsandbox/
HTH
The explorer suite can do what you want.
