LC-3 PC-relative offset

Can someone explain if my reasoning is right?
The book gives the following question
Question: If a control instruction is in location 5, what is the PC-relative offset of address 15? Assume that the control transfer instructions work the same way as in the LC-3.
Answer: The incremented PC is 6. This means that the PC-relative offset of address 15 is 15-6=9.
Is it because, since the instruction is in location 5 and the PC is incremented during each instruction's fetch phase, the PC is 6? And then do I just subtract to find the PC-relative offset of a given address?
The question after that one is similar, so I want to know whether my reasoning is correct and I actually know how to do the problem.

Your reasoning is correct. If the control instruction is at memory location 5, then by the time it executes the PC has already been incremented to 6. So you have to add something to the PC that will take it to memory location 15, and that something is 9. I suggest translating the hex values in the simulator into binary and checking the PCoffset bits of the instructions that have one. It will help you better understand what's going on.
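To make it concrete with made-up addresses: a BRnzp at x3005 that should branch to x300F needs

    incremented PC:  x3005 + 1 = x3006
    PCoffset9:       x300F - x3006 = x0009
    encoding:        0000 111 000001001   (opcode BR, n=z=p=1, 9-bit offset = 9)

At execution time the LC-3 computes the target as PC + SEXT(PCoffset9) = x3006 + 9 = x300F.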


Flash ECC algorithm on STM32L1xx

How does the flash ECC algorithm (Flash Error Correction Code) implemented on STM32L1xx work?
Background:
I want to do multiple incremental writes to a single word in program flash of a STM32L151 MCU without doing a page erase in between. Without ECC, one could set bits incrementally, e.g. first 0x00, then 0x01, then 0x03 (the STM32L1 erases bits to 0 rather than to 1), etc. As the STM32L1 has an 8-bit ECC per word, this method doesn't work. However, if we knew the ECC algorithm, we could easily find a short sequence of values that could be written incrementally without violating the ECC.
We could simply try different sequences of values and see which ones work (one such sequence is 0x00000001, 0x00000101, 0x00030101, 0x03030101), but if we don't know the ECC algorithm, we can't check whether a sequence violates the ECC, in which case error correction wouldn't work if a bit got corrupted.
[Edit] The functionality should be used to implement a simple file system using STM32L1's internal program memory. Chunks of data are tagged with a header, which contains a state. Multiple chunks can reside on a single page. The state can change over time (first 'new', then 'used', then 'deleted', etc.). The number of states is small, but it would make things significantly easier, if we could overwrite a previous state without having to erase the whole page first.
Thanks for any comments! As there are no answers so far, I'll summarize what I have found out so far (empirically and based on the comments):
According to the STM32L1 datasheet "The whole non-volatile memory embeds the error correction code (ECC) feature.", but the reference manual doesn't state anything about ECC in program memory.
The datasheet is in line with what we can find out empirically when subsequently writing multiple words to the same program memory location without erasing the page in between: some sequences of values work while others don't.
The following are my personal conclusions, based on empirical findings, limited research and comments from this thread. It's not based on official documentation. Don't build any serious work on it (I won't either)!
It seems that the ECC is calculated and persisted per 32-bit word. If so, the ECC must be at least 7 bits long.
The ECC of each word is probably written to the same non-volatile memory as the word itself, so the same limitations apply, i.e. between erases only additional bits can be set. As stark pointed out, we can only overwrite words in program memory with values that:
Only set additional bits but don't clear any bits
Have an ECC that also only sets additional bits compared to the previous ECC (a sketch of this check follows below).
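A minimal sketch of that overwrite check, with ecc() standing in for the unknown per-word ECC function (all names here are made up):

    #include <stdbool.h>
    #include <stdint.h>

    /* Placeholder for the undocumented per-word ECC function. */
    uint8_t ecc(uint32_t word);

    /* On the STM32L1 the erased state is 0 and programming can only set
     * bits, so 'next' can replace 'prev' without an erase only if neither
     * the data word nor its ECC needs any bit cleared. */
    bool can_overwrite(uint32_t prev, uint32_t next) {
        bool data_ok = (prev & next) == prev;                 /* no data bit cleared */
        bool ecc_ok  = (ecc(prev) & ecc(next)) == ecc(prev);  /* no ECC bit cleared  */
        return data_ok && ecc_ok;
    }

With the real ecc() plugged in, this would let us validate candidate sequences like the one above (0x00000001, 0x00000101, ...) without trial writes.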
If we write a value that only sets additional bits, but whose ECC would need to clear bits (and therefore cannot be written correctly), then:
If the ECC is wrong by one bit, the error is corrected by the ECC algorithm and the written value can be read correctly. However, ECC wouldn't work anymore if another bit failed, because ECC can only correct single-bit errors.
If the ECC is wrong by more than one bit, the ECC algorithm cannot correct the error and the read value will be wrong.
We cannot (easily) find out empirically which sequences of values can be written correctly and which can't. If a sequence of values can be written and read back correctly, we still wouldn't know whether that is only thanks to the automatic correction of a single-bit error. This aspect is the whole reason this question asks for the actual algorithm.
The ECC algorithm itself seems to be undocumented. Hamming code is a commonly used algorithm for ECC, and in AN4750 ST writes that a Hamming code is actually used for error correction in SRAM. The same algorithm may or may not be used for the STM32L1's program memory.
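For illustration only: a conventional single-error-correcting Hamming code over a 32-bit word needs 6 parity bits and could be computed as below. This is merely an instance of the class of algorithm AN4750 describes for SRAM; there is no claim that the flash ECC actually works this way.

    #include <stdint.h>

    /* Illustrative SEC Hamming encoder: 32 data bits, 6 parity bits.
     * Data bits occupy the non-power-of-two positions 3,5,6,7,9,... of a
     * 38-bit codeword; parity bit p sits at position 2^p and covers every
     * position whose index has bit p set. */
    static uint8_t hamming_ecc(uint32_t data) {
        uint64_t codeword = 0;
        for (int pos = 1, bit = 0; pos <= 38; pos++) {
            if ((pos & (pos - 1)) == 0) continue;      /* skip parity slots */
            if ((data >> bit++) & 1) codeword |= 1ull << pos;
        }
        uint8_t ecc = 0;
        for (int p = 0; p < 6; p++) {
            uint32_t parity = 0;
            for (int pos = 1; pos <= 38; pos++)
                if (pos & (1 << p)) parity ^= (uint32_t)((codeword >> pos) & 1);
            ecc |= (uint8_t)(parity << p);
        }
        return ecc;
    }

A code of this shape would also explain the empirical behaviour: flipping one data bit flips several parity bits, and any parity bit that would have to be cleared cannot be reprogrammed without an erase.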
The STM32L1 reference manual doesn't seem to explicitly forbid multiple writes to program memory without an erase, but there is no documentation stating the opposite either. In order not to rely on undocumented functionality, we will refrain from using it in our products and find workarounds.
Interesting question.
First I have to say that even if you find out the ECC algorithm, you can't rely on it: it's not documented and can be changed at any time without notice.
But finding out the algorithm does seem possible with a reasonable number of tests.
I would build tests that start with a constant value and then clear only one bit.
If the value you read back is still the start value, your single-bit change couldn't have changed all the necessary bits in the ECC.
Like:
    uint32_t startValue = 0xFFFFFFFF;             /* constant start value */
    for (int bitIdx = 0; bitIdx < 32; bitIdx++) {
        erase_cell();                              /* back to the erased state */
        write_cell(startValue);
        write_cell(startValue & ~(1u << bitIdx));  /* clear bit bitIdx */
        read_cell();                               /* still startValue? */
    }
If you find a start value for which this test works for all bits, then that start value probably has an ECC with all bits set.
Edit: This should be true for any ECC, as every ECC always needs a difference of at least two bits to reliably detect and repair one defective bit.
Since the first bit difference is in the value itself, the second change has to be in the hidden ECC bits, and those hidden bits are very limited in number.
If you repeat this test with different start values, you should be able to gather enough data to work out which error correction code is used.

Assembler passes issue

I have an issue with my 8086 assembler I am writing.
The problem is with the assembler passes.
During pass 1 you calculate the position relative to the segment for each label.
Now to do this the size of each instruction must be calculated and added to the offset.
Some instructions in the 8086 can be smaller if the position of the label is within a certain range. For example, "jmp _label" would be encoded as a short jump if it could be, and as a near jump otherwise.
The problem is that in pass 1 a forward label has not been reached yet, so the assembler cannot determine the size of the instruction, since "jmp short _label" is smaller than "jmp near _label".
So how can I decide whether "jmp _label" becomes "jmp short _label" or not?
Three passes may also be a problem, as we need to know the size of every instruction before the current one just to assign it an offset at all.
Thanks
What you can do is start with the assumption that a short jump is going to be sufficient. If the assumption becomes invalid when you find out the jump distance (or when it changes), you expand your short jump to a near jump. After this expansion you must adjust the offsets of the labels following the expanded jump (by the length of the near jump instruction minus the length of the short jump instruction). This adjustment may make some other short jumps insufficient and they will have to be changed to near jumps as well. So, there may actually be several iterations, more than 2.
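A minimal sketch of this relaxation loop in C, with made-up structures, assuming 8086-style 2-byte short jumps (disp8) and 3-byte near jumps (disp16):

    #include <stdbool.h>

    typedef struct {
        int  offset;   /* offset of the jump instruction within the segment */
        int  label;    /* index of the target label                         */
        bool is_near;  /* start optimistic: everything begins as short      */
    } Jump;

    void relax(Jump *jmp, int njmp, int *label_off, int nlab) {
        bool changed = true;
        while (changed) {                    /* may take several iterations */
            changed = false;
            for (int i = 0; i < njmp; i++) {
                if (jmp[i].is_near) continue;
                /* displacement is relative to the end of the 2-byte jump */
                int disp = label_off[jmp[i].label] - (jmp[i].offset + 2);
                if (disp >= -128 && disp <= 127) continue;  /* short fits */
                jmp[i].is_near = true;       /* grow by 3 - 2 = 1 byte */
                for (int j = 0; j < njmp; j++)
                    if (jmp[j].offset > jmp[i].offset) jmp[j].offset += 1;
                for (int l = 0; l < nlab; l++)
                    if (label_off[l] > jmp[i].offset) label_off[l] += 1;
                changed = true;
            }
        }
    }

No code bytes move here; only offsets are adjusted, and the actual bytes are emitted in a final pass once all sizes have settled.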
When implementing this you should avoid moving code in memory when expanding jump instructions. It will severely slow down assembling. You should not reparse the assembly source code either.
You may also pre-compute some kind of interdependency table between jumps and labels, so you can skip labels and jump instructions unaffected by an expanded jump instruction.
Another thing to think about is that a short jump has a maximum forward distance of 127 bytes, so when the instructions following it amount to more than 127 bytes and the target label still hasn't been encountered, you can promote the jump to a near jump right then. Keep in mind that at any moment you may have up to 64 forward short jumps pending that may become near in this fashion (each short jump is at least 2 bytes, so at most 64 of them fit in the 127-byte window).

Why is the offset of the "logical bottom" and "physical bottom" of the stack random?

I ran a program on my Windows 10 machine under WinDbg and let it break at the initial breakpoint. I took the address of the physical bottom of the stack (StackBase in the TEB) and subtracted the rsp value at ntdll!LdrInitializeThunk. I did this 5 times on the same program and got 5 different values:
0x600
0x9f0
0xa40
0x5d0
0x570
You get similar results if you do the same with ntdll!RtlUserThreadStart, etc. This suggests that the "logical bottom" of the stack is somewhat randomized. Why is that? Is this some kind of "mini-ASLR" inside of the stack? Is this documented anywhere?
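For reference, the per-run measurement can be taken at the breakpoint with a WinDbg expression along these lines (assuming the x64 TEB layout, where StackBase sits at offset 0x08):

    ? poi(@$teb+0x8) - @rsp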
After some googling for ASLR in Vista specifically (ASLR was introduced in Vista), I found this document from Symantec. On page 5, it mentions the phenomenon that my question is about (emphasis mine):
Once the stack has been placed, the initial stack pointer is further randomized by a random decremental amount. The initial offset is selected to be up to half a page (2,048 bytes), but is limited to naturally aligned addresses [...]
So it seems it's done intentionally for security reasons (this way it's harder to figure out addresses of things that reside at fixed offsets relative to the stack base).
I'm leaving this question open for some time hoping that someone can provide a more insightful answer. If no one does, I will accept this answer.

Adding bytes and determining flag values

I am trying to add two hex numbers, for example $E2 + $3C, which I can do just fine; however, I do not know how to determine the V, N, Z and C flag values.
Any help would be GREATLY appreciated. I have been scratching my head for much too long.
Thanks!
The flags are bits in the status register. They are set or cleared by some instructions, (such as ADD or ADC), but not all.
You can look at the status register, SREG, directly, but in assembly, there are branch instructions that operate according to these bits. There is a summary of the branch instructions on p. 9 of the instruction set manual.
Whether or not the flags are set is described in detail in the entries for each instruction, such as for ADD on p. 17.
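As a worked example with the numbers from the question, $E2 + $3C = $11E, which truncates to the 8-bit result $1E with a carry out of bit 7:

    C = 1  (the sum doesn't fit in 8 bits)
    Z = 0  (the result $1E is non-zero)
    N = 0  (bit 7 of $1E is 0)
    V = 0  (signed: -30 + 60 = 30, which fits in -128..127, so no signed overflow)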

Win32 EXCEPTION_INT_OVERFLOW vs EXCEPTION_INT_DIVIDE_BY_ZERO

I have a question about the EXCEPTION_INT_OVERFLOW and EXCEPTION_INT_DIVIDE_BY_ZERO exceptions.
Windows will trap the #DE error generated by the IDIV instruction and will end up generating an SEH exception with one of those two codes.
The question I have is: how does it differentiate between the two conditions? The information about IDIV in the Intel manual indicates that it generates #DE in both the divide-by-zero and overflow cases.
I took a quick look at the section on the #DE error in Volume 3 of the Intel manual, and the best I could gather is that the OS must be decoding the DIV instruction, loading the divisor argument, and then comparing it to zero.
That seems a little crazy to me though. Why would the chip designers not use a flag of some sort to differentiate between the 2 causes of the error? I feel like I must be missing something.
Does anyone know for sure how the OS differentiates between the 2 different causes of failure?
Your assumptions appear to be correct. The only information available on #DE is CS and EIP, which give the faulting instruction. Since the two status codes are different, the OS must be decoding the instruction to determine which one to raise.
I'd also suggest that the chip makers don't really need two separate interrupts for this case, since anything divided by zero is infinity, which is too big to fit into your destination register.
As for "knowing for sure" how it differentiates, all of those who do know are probably not allowed to reveal it, either to prevent people exploiting it (not entirely sure how, but jumping into kernel mode is a good place to start looking to exploit) or making assumptions based on an implementation detail that may change without notice.
Edit: Having played with kd I can at least say that on the particular version of Windows XP (32-bit) I had access to (and the processor it was running on) the nt!Ki386CheckDivideByZeroTrap interrupt handler appears to decode the ModRM value of the instruction to determine whether to return STATUS_INTEGER_DIVIDE_BY_ZERO or STATUS_INTEGER_OVERFLOW.
(Obviously this is original research, is not guaranteed by anyone anywhere, and also happens to match the deductions that can be made based on Intel's manuals.)
Zooba's answer sums it up: Windows parses the instruction to find out what to raise.
But you cannot rely on the routine choosing the correct code.
I observed the following on 64-bit Windows 7 with 64-bit DIV instructions:
If the operand (divisor) is a memory operand, it always raises EXCEPTION_INT_DIVIDE_BY_ZERO, regardless of the operand's value.
If the operand is a register and the lower dword is zero, it raises EXCEPTION_INT_DIVIDE_BY_ZERO even if the upper dword is non-zero.
Took me a day to find this out... Hope this helps.
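If you want to observe the two codes yourself from user mode, here is a minimal MSVC sketch (the operand values are arbitrary; INT_MIN / -1 triggers the overflow case for a 32-bit IDIV):

    #include <windows.h>
    #include <limits.h>
    #include <stdio.h>

    static void try_div(int dividend, int divisor) {
        __try {
            volatile int a = dividend, b = divisor;  /* volatile keeps the IDIV */
            printf("%d / %d = %d\n", a, b, a / b);
        } __except (EXCEPTION_EXECUTE_HANDLER) {
            DWORD code = GetExceptionCode();
            printf("%d / %d -> 0x%08lX (%s)\n", dividend, divisor, code,
                   code == EXCEPTION_INT_DIVIDE_BY_ZERO ? "divide by zero" :
                   code == EXCEPTION_INT_OVERFLOW       ? "overflow" : "other");
        }
    }

    int main(void) {
        try_div(1, 0);         /* expected: EXCEPTION_INT_DIVIDE_BY_ZERO */
        try_div(INT_MIN, -1);  /* expected: EXCEPTION_INT_OVERFLOW       */
        return 0;
    }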
