I am required to design and build an 8-bit Pseudo Random Number Generator. I have looked at possible methods, such as using background noise, user input, etc. I was wondering if anyone could give me some advice on where to start, as this would be of great help to me.
random.org is perhaps the best place to start your investigation.
Below should get you started with the basics
howstuffworks.com
Construct your own random number generator
For a simple 8-bit PRNG you could try something like a Linear Feedback Shift Register. This is very simple to implement in either software or hardware.
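To make the LFSR suggestion concrete, here is a minimal sketch in C of an 8-bit Galois LFSR. The feedback mask 0xB8 (taps 8, 6, 5, 4) and the names lfsr8_next and lfsr8_period are my own choices for illustration, not something specified in this thread.

```c
#include <stdint.h>

/* Advance an 8-bit Galois LFSR by one step.
 * The mask 0xB8 encodes taps 8, 6, 5, 4 (an assumed, maximal-length choice).
 * The state must never be 0, because 0 maps to itself. */
uint8_t lfsr8_next(uint8_t state)
{
    uint8_t lsb = state & 1u;   /* bit shifted out this step */
    state >>= 1;
    if (lsb)
        state ^= 0xB8u;         /* apply the feedback taps */
    return state;
}

/* Count the steps until the sequence returns to its seed. */
unsigned lfsr8_period(uint8_t seed)
{
    uint8_t s = seed;
    unsigned steps = 0;
    do {
        s = lfsr8_next(s);
        steps++;
    } while (s != seed);
    return steps;
}
```

Starting from any nonzero seed, the state visits all 255 nonzero values before repeating, so it is a PRNG in the strict sense: fully determined by the seed.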
My plan is to use a temperature sensor. When the temps are being processed in the ADC, I am going to amplify the noise generated. This will then give me the random 8 bit number I require which will be used as the 'seed' for the PRNG in stdlib (C programming).
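A rough sketch of that plan in C, under several assumptions: adc_read_fn stands for whatever routine returns a raw ADC count on your hardware, fake_adc is only a stand-in so the code runs on a PC, and keeping the least significant bit of eight conversions is one common way to harvest noise, not necessarily what the amplifier circuit described above would do.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical ADC read routine; on real hardware this would start a
 * conversion on the temperature sensor channel and return the raw count. */
typedef uint16_t (*adc_read_fn)(void);

/* Stand-in for the hardware: replays a fixed list of counts. */
uint16_t fake_adc(void)
{
    static const uint16_t samples[8] = {513, 512, 515, 514, 512, 513, 513, 512};
    static unsigned idx = 0;
    return samples[idx++ % 8u];
}

/* Build an 8-bit seed from the noisiest bit (the LSB) of eight readings. */
uint8_t gather_seed(adc_read_fn read_adc)
{
    uint8_t seed = 0;
    for (int i = 0; i < 8; i++)
        seed = (uint8_t)((seed << 1) | (read_adc() & 1u));
    return seed;
}

/* Seed the stdlib PRNG with the gathered byte. */
void seed_stdlib_prng(adc_read_fn read_adc)
{
    srand(gather_seed(read_adc));
}
```

Note that an 8-bit seed means rand() can only ever produce one of 256 possible sequences.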
What do you think?
I've found that the following works very well. This is implemented in MSP430 assembly, but would be easy enough to port to another processor. I've used this to generate 'white' noise for a synthesizer project, and there were no audible patterns in the output. Depending on what your requirements are, this might be sufficient. It uses two state variables, the previous output (8 bits), and a 16-bit state register. I found this online, http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=95614&highlight=radbrad, where it's listed in AVR assembly, and ported it to MSP.
Because it uses shifts and shifts the top bit out of one register into the bottom of another, it doesn't really lend itself to efficient implementation in C. Hence the assembly. I hope you find this as useful as I did.
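mov.b &rand_out, r13 ; r13 = previous output byte
mov.b r13,r12 ; keep an unmasked copy in r12
and.b #66, r13 ; isolate bits 6 and 1 (66 = 0x42), the feedback taps
jz ClearCarry ; both taps are 0, so the feedback bit is 0
cmp.b #66, r13 ; carry is set only when both taps are 1
xor.w #1, sr ; invert carry flag, so carry = bit 6 XOR bit 1
jmp SkipClearCarry
ClearCarry:
clrc ; feedback bit is 0
SkipClearCarry:
rlc.w &rand_state ; shift the feedback bit into the state, top bit goes to carry
rlc.b r12 ; shift that bit into the bottom of the output byte
mov.b r12,&rand_out ; save the new output
ret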
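For anyone who wants to try it on a PC first, here is my reading of the routine above as a C sketch. It will not be as tight as the assembly, and the seed values are arbitrary assumptions (the listing does not say how the state is initialised).

```c
#include <stdint.h>

/* Seeds are arbitrary assumptions; the assembly does not specify them. */
static uint8_t  rand_out   = 0x01;    /* previous output byte */
static uint16_t rand_state = 0xACE1;  /* 16-bit state register */

uint8_t rand_next(void)
{
    /* Feedback bit = bit 6 XOR bit 1 of the previous output (mask 66 = 0x42). */
    uint8_t feedback = (uint8_t)(((rand_out >> 6) ^ (rand_out >> 1)) & 1u);

    /* rlc.w: feedback enters bit 0 of the state, bit 15 falls out. */
    uint8_t carry = (uint8_t)(rand_state >> 15);
    rand_state = (uint16_t)((rand_state << 1) | feedback);

    /* rlc.b: the bit that fell out enters bit 0 of the output byte. */
    rand_out = (uint8_t)((rand_out << 1) | carry);
    return rand_out;
}
```

Taken together, the two variables behave like one 24-bit shift register whose feedback is tapped from the output byte.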
What is the fastest way of turning some value (stored in register) into 2 to the power of that value in assembly language? I think that some bitwise operations can be used. For example:
Value: 8
Result: 256 (2^8)
So, short answer: What you're looking for is a left shift.
In C and many other languages, your particular wish would be served by 1 << 8.
You could do it in x86 assembler with shl but there's really no sane reason to do so since pretty much any compiler you come across is going to compile the code into the native shift instruction.
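As a small illustration in C (the function name pow2 and the 32-bit width are assumptions made for the example):

```c
#include <assert.h>
#include <stdint.h>

/* Return 2 to the power n using a single left shift.
 * Shifting by 32 or more would be undefined for a 32-bit value,
 * so the exponent is checked first. */
uint32_t pow2(unsigned n)
{
    assert(n < 32);
    return (uint32_t)1 << n;
}
```

Calling pow2(8) gives 256, matching the example in the question.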
I'm currently working on a project that requires me to write a bubble sort algorithm in Harvard Machine 16 Bit Assembly Code. I tried searching for it online, however most assembly code snippets use the CMP and MOV operators.
I have the following instructions available:
ADD, SUB, AND, Copy, ADDI, SUBI, ANDI, LOADI, BZ, BEQ, BRA, SW, LW.
Could anyone please give me a nudge in the proper direction?
Thanks in advance,
You can always implement an equivalent of CMP using SUB (or even ADD if SUB isn't available).
MOV can always be constructed out of a load and a store. You could also simulate it using a load and ADD to a zero-initialized register or memory location.
Don't search. Write the algorithm in pseudo-code and see how you can construct each step with the instructions you've got.
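As a starting point for that pseudo-code, here is a sketch in C where the comparison is built only from a subtract, an AND and a test for zero, which should map onto SUB, ANDI and BZ. It assumes 16-bit two's complement values whose difference does not overflow (for example, all values between 0 and 32767), and that your machine lets you AND with the sign-bit mask; the function names are mine.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero when a < b, using only subtract, AND and a zero test.
 * Assumes a - b does not overflow 16 bits. */
static int is_less(int16_t a, int16_t b)
{
    uint16_t diff = (uint16_t)((uint16_t)a - (uint16_t)b); /* SUB */
    uint16_t sign = diff & 0x8000u;                         /* AND with sign mask */
    return sign != 0;                                       /* BZ on the result */
}

/* Plain bubble sort into ascending order. */
void bubble_sort(int16_t *v, size_t n)
{
    for (size_t pass = 0; pass + 1 < n; pass++) {
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (is_less(v[i + 1], v[i])) {
                int16_t tmp = v[i];   /* the swap is just loads and stores */
                v[i] = v[i + 1];
                v[i + 1] = tmp;
            }
        }
    }
}
```

Each C statement here is deliberately simple enough to translate into one or two of the listed instructions.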
TASK:
I'm building a set of x86 assembly reverse engineering challenges, of which I have twenty or so already completed. They're just for fun / education.
The current challenge is one of the more advanced ones, and involves some trickery that makes it look like the EP is actually in the normal program, but it's actually packed away in another PE section.
Here's the basic flow:
Starts out as if it were a normal MSVC++ application.
Injected a sneaky call away to a bunch of anti-debugger tricks.
If they pass, a DWORD in memory is set to 1.
Later in the program flow, it checks for that value being 1, and if it works it decrypts a small call table. If it fails, it sends them off on a wild goose chase of fake anti-debug tricks and eventually just crashes.
The call table points to the real decryption routines that decrypt the actual program code section.
The decryption routines are called, and they decrypt using a basic looped xor (C^k^n where C is ciphertext, k is a 32-bit key and n is the current data offset)
VirtualProtect is used to switch the section's protection flags from RW to RX.
Control flow is redirected to OEP, program runs.
The idea is that since they think they're in normal program flow, it makes them miss the anti-debug call and later checks. Anyway, that all works fine.
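For reference, the looped xor described in the flow above looks roughly like this in C. This is a sketch under assumptions: it works on 32-bit words, takes n to be the byte offset of each word, and the name xor_crypt is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR each 32-bit word with the key and with its own byte offset.
 * The same routine encrypts and decrypts, since XOR is its own inverse. */
void xor_crypt(uint32_t *data, size_t words, uint32_t key)
{
    for (size_t i = 0; i < words; i++) {
        uint32_t offset = (uint32_t)(i * sizeof(uint32_t)); /* n: assumed byte offset */
        data[i] ^= key ^ offset;
    }
}
```

Running it twice with the same key restores the original data.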
PROBLEM:
The current problem is that OllyDbg and a few other tools look at the packed section and see that it has high entropy, and throw up a warning that it's packed. The code section pointer in the PE header is correctly set, so it doesn't get this from having EP outside code - it's purely an entropy analysis thing.
QUESTION:
Is there an encryption method I can use that preserves low entropy, but is still easy to implement in x86 asm? I don't want to use a plain xor, since it's too easy, but I also don't want it to catch it as packed and give the game away.
I thought of something like a shuffler (somehow produce a keystream and use it to swap 4-byte blocks of code around), but I'm not sure that this is going to work, or even be simple.
Anyone got any ideas?
Actually, OllyDbg works like this pseudocode:
useful_bytes = number_of_bytes_in_section - count_bytes_with_values(0x00, 0x90, 0xCC)
warn about compression if useful_bytes > 0x2000 and count_bytes_with_values(0xFF, 0xE8, 0x8B, 0x89, 0x83) / useful_bytes < 0.075
So, the way to avoid that warning is to use enough bytes with the values 0xFF 0xE8 0x8B 0x89 0x83 in the compressed section.
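If you want to check a section before shipping the challenge, the pseudocode above can be sketched in C like this. It is only a transcription of that pseudocode, not OllyDbg's actual source, and the function name is made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the heuristic from the pseudocode would warn, else 0. */
int olly_would_warn(const uint8_t *section, size_t len)
{
    size_t filler = 0;  /* bytes equal to 0x00, 0x90 or 0xCC */
    size_t common = 0;  /* bytes equal to 0xFF, 0xE8, 0x8B, 0x89 or 0x83 */

    for (size_t i = 0; i < len; i++) {
        switch (section[i]) {
        case 0x00: case 0x90: case 0xCC:
            filler++;
            break;
        case 0xFF: case 0xE8: case 0x8B: case 0x89: case 0x83:
            common++;
            break;
        default:
            break;
        }
    }

    size_t useful = len - filler;
    if (useful <= 0x2000)
        return 0;       /* too little data to judge */
    return (double)common / (double)useful < 0.075;
}
```

A section made entirely of filler or entirely of the five common opcode bytes never triggers the warning; a large section with none of them always does.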
Don't pack/encrypt your entire program code. Just encrypt a small percentage of bytes, randomly selected from your program code. If they're not decrypted, the program will soon crash if it tries to run the code anyway - and because the majority of the program is unchanged, entropy-based checks won't be set off.
What about simply reversing the bytes (from last to first)? Intel assembler instructions aren't fixed length, so this would shuffle them a little. Or you could simply rotate each byte by a fixed amount...
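Both ideas are only a few lines each; a sketch in C, with function names of my own choosing:

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse a buffer in place, last byte first. */
void reverse_bytes(uint8_t *buf, size_t len)
{
    if (len == 0)
        return;
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        uint8_t tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}

/* Rotate one byte left by r bits (r is reduced modulo 8). */
uint8_t rol8(uint8_t b, unsigned r)
{
    r &= 7u;
    return (uint8_t)((b << r) | (b >> ((8u - r) & 7u)));
}
```

Reversal keeps every byte value and rotation maps byte values one-to-one, so a purely bytewise entropy figure comes out exactly the same as for the original code.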
EDIT: Wrong guess, this is not how Olly works. See my other answer. This still applies to tools other than OllyDbg that calculate entropy.
Expanding on ninjalj's comment:
While I haven't checked, the entropy value OllyDbg calculates is likely bytewise, without context. See How to calculate the entropy of a file? for a common algorithm for doing this.
This algorithm gives the sequence 0 1 2 ... 254 255 the maximum possible entropy, despite it being completely predictable. A sequence of random bytes between 0 and 255 would get slightly lower entropy, since it won't have exactly the same number of each possible value.
Some quick checks on uncompressed executables with pefile tells me that uncompressed x86 code has entropy of about 6.3 to 6.6. Compressed code with entropy 8.0, encoded with base64, has entropy 6.0. Thus, base64 is easily enough to stop this algorithm from finding compressed code.
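A sketch of that bytewise calculation in C, so you can test your own sections. It is a plain Shannon entropy over byte counts, which is my assumption about what such tools compute. The helper log2_approx is hand-rolled only so the example builds without linking the math library; log2() from math.h would normally be used instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Base-2 logarithm for x > 0, computed by normalising x into [1, 2)
 * and then extracting fraction bits by repeated squaring. */
static double log2_approx(double x)
{
    double result = 0.0;
    while (x < 1.0)  { x *= 2.0; result -= 1.0; }
    while (x >= 2.0) { x /= 2.0; result += 1.0; }

    double frac = 0.5;
    for (int i = 0; i < 52; i++) {
        x *= x;
        if (x >= 2.0) { x /= 2.0; result += frac; }
        frac /= 2.0;
    }
    return result;
}

/* Shannon entropy in bits per byte, ignoring any context between bytes. */
double byte_entropy(const uint8_t *data, size_t len)
{
    size_t counts[256] = {0};
    double h = 0.0;

    if (len == 0)
        return 0.0;
    for (size_t i = 0; i < len; i++)
        counts[data[i]]++;
    for (int v = 0; v < 256; v++) {
        if (counts[v] == 0)
            continue;
        double p = (double)counts[v] / (double)len;
        h -= p * log2_approx(p);
    }
    return h;
}
```

Feeding it the bytes 0 through 255 once each gives 8 bits per byte, the maximum, even though the sequence is completely predictable.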
Is it possible to write a sequence of instructions that will place a 1 in the least significant bit of the memory cell at address B3 without disturbing the other bits in the memory cell?
The machine instructions I am referring to are STOP, ADD, SWITCH, LOAD, ROTATE, etc.
Clarification: this question was originally tagged C#; since it wasn't the OP that re-tagged it, I'll leave this here until the OP's intentions are clearer.
C# is a high-level programming language, which compiles down to IL, not machine code. As such: no, there is absolutely no supported mechanism for performing specific machine code operations (and even if there were, it couldn't possibly port between languages).
You can do high level bit operations, using the operators on the integer-based types; and if you really want you can write IL, either building it manually (ilasm), or at runtime via DynamicMethod / ILGenerator - but these still only deal with CIL opcodes, not machine codes.
I think ORing it with 1 will do the job:
algo:
byte = [data at 0xB3]
byte = byte | 0x01
[data at 0xB3] = byte
This works fine for me when developing for 8051 MCUs.
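The same read, OR, write-back sequence as a C sketch, modelling memory as a 256-byte array (the array and the function name are assumptions for illustration):

```c
#include <stdint.h>

/* Set the least significant bit of the cell at the given address,
 * leaving the other seven bits untouched. */
void set_lsb(uint8_t *memory, uint8_t address)
{
    memory[address] |= 0x01u;
}
```

For example, a cell holding 0xF0 becomes 0xF1, and a cell that already has the bit set is left as it was.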
I am taking an assembly course now, and the guy who checks our home assignments is a very pedantic old-school optimization freak. For example he deducts 10% if he sees:
mov ax, 0
instead of:
xor ax,ax
even if it's only used once.
I am not a complete beginner in assembly programming but I'm not an optimization expert, so I need your help with something (it might be a very stupid question but I'll ask anyway):
If I need to set a register value to 1 or (-1), is it better to use:
mov ax, 1
or do something like:
xor ax,ax
inc ax
I really need a good grade, so I'm trying to get it as optimized as possible. ( I need to optimize both time and code size)
A quick google for 8086 instructions timings size turned up a listing of instruction timings which seems to have all the timings and sizes for the 8086/8088 through Pentium.
Although you should note that this probably doesn't include code fetch memory bottlenecks which can be very significant, especially on an 8088. This usually makes optimization for code-size a better choice. See here for some details on this.
No doubt you could find official Intel documentation on the web with similar information, such as the "8086/8088 User's Manual: Programmer's and Hardware Reference".
For your specific question, the table below gives a comparison that indicates mov ax, 1 is better (fewer cycles, and the same space):
Instructions            Clock cycles    Bytes
xor ax, ax / inc ax     3 + 3 = 6       2 + 1 = 3
mov ax, 1               4               3
But you might want to talk to your educational institute about this guy. A 10% penalty for a simple thing like that seems quite harsh. You should ask what should be done in the case where you have two possibilities, one faster and one shorter.
Then, once they've admitted that there are different ways to optimise code depending on what you're trying to achieve, tell them that what you're trying to do is optimise for readability and maintainability, and seriously couldn't give a damn about a wasted cycle or byte here or there(1).
Optimisation is something you generally do if and when you have a performance problem, after a piece of code is in a near-complete state - it's almost always wasted effort when the code is still subject to a not-insignificant likelihood of change.
For what it's worth, sub ax,ax appears to be on par with xor ax,ax in terms of clock cycles and size, so maybe you could throw that into the mix next time to cause him some more work.
(1) No, don't really do that, but it's fun to vent occasionally :-)
You're better off with
mov AX,1
on the 8086. If you're tracking register contents, you can possibly do better if you know that, for example, BX already has a 1 in it:
mov AX,BX
or if you know that AH is 0:
mov AL,1
etc.
Depending upon your circumstances, you may be able to get away with ...
sbb ax, ax
The result will either be 0 if the carry flag is not set or -1 if the carry flag is set.
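To see why, sbb ax, ax computes ax - ax - CF, so the old contents of ax cancel out. A small C model of the instruction (the function name is made up for the example):

```c
#include <stdint.h>

/* Model of `sbb ax, ax`: ax = ax - ax - CF.
 * The result is 0 when carry is clear and 0xFFFF (-1) when carry is set. */
uint16_t sbb_self(uint16_t ax, unsigned carry)
{
    return (uint16_t)(ax - ax - (carry & 1u));
}
```

So this only helps when the carry flag already holds the value you want.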
However, if the above example is not applicable to your situation, I would recommend the
xor ax, ax
inc ax
method. It should satisfy your professor for size. However, if your processor employs any pipelining, I would expect there to be some coupling-like delay between the two instructions (I could very well be wrong on that). If such a coupling exists, the speed could be improved slightly by reordering your instructions to have another instruction between them (one that does not use ax).
Hope this helps.
I would use mov [e]ax, 1 under any circumstances. Its encoding is no longer than the hackier xor sequence, and I'm pretty sure it's faster just about anywhere. 8086 is just weird enough to be the exception, and as that thing is so slow, a micro-optimization like this would make most difference. But anywhere else: executing 2 "easy" instructions will always be slower than executing 1, especially if you consider data hazards and long pipelines. You're trying to read a register in the very next instruction after you modify it, so unless your CPU can bypass the result from stage N of the pipeline (where the xor is executing) to stage N-1 (where the inc is trying to load the register, never mind adding 1 to its value), you're going to have stalls.
Other things to consider: instruction fetch bandwidth (moot for 16-bit code, both are 3 bytes); mov avoids changing flags (more likely to be useful than forcing them all to zero); depending on what values other registers might hold, you could perhaps do lea ax,[bx+1] (also 3 bytes, even in 32-bit code, no effect on flags); as others have said, sbb ax,ax could work too in some circumstances, and it's also shorter at 2 bytes.
When faced with these sorts of micro-optimizations you really should measure the alternatives instead of blindly relying even on processor manuals.
P.S. New homework: is xor bx,bx any faster than xor bx,cx (on any processor)?