The ATmega8A manual defines the bits contained in the MCUCR register on page 56.
These definitions don't match the #defines contained in the ATmega8A I/O library supplied by avr-gcc, located at /usr/lib/avr/include/avr/iom8a.h.
For example, the SE bit (Sleep Enable) is defined in the manual as bit 5, while in the above library it is #define SE 7.
I haven't checked if the AVR actually misinterprets these MCUCR flags.
Am I missing something here?
The data sheet seems to be wrong.
Section "14.8.1. MCUCR – MCU Control Register" on page 56 states
while section "17.1.1. MCUCR – MCU Control Register" on page 74 states
which is not possible, since bit 2 and 3 would be ambiguous.
The register summary is correct: SE in bit 7, SM2..SM0 in bits 6..4, and ISC11..ISC00 in bits 3..0.
That means the library definitions are correct.
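For reference, here are the MCUCR bit definitions as they appear in avr-libc's iom8a.h (reproduced from memory, so verify against your copy of the header), together with a typical use:

/* MCUCR bit positions in <avr/iom8a.h>, matching the register summary */
#define ISC00 0
#define ISC01 1
#define ISC10 2
#define ISC11 3
#define SM0   4
#define SM1   5
#define SM2   6
#define SE    7

/* typical use: set Sleep Enable (bit 7) before executing SLEEP */
MCUCR |= (1 << SE);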
When I compile C code with GCC for MIPS, it contains code like:
daddiu $28,$28,%lo(%neg(%gp_rel(f)))
And I have trouble understanding the operands starting with %.
I found that they are called macros, and that the predefined macros depend on the assembler, but I couldn't find a description of macros such as %lo and %neg in the gas documentation.
So does any official documentation exist that explains the macros GCC uses when generating MIPS code?
EDIT: The snippet comes from this code.
This is a very odd instruction to find in compiled C code, since this instruction is not just using $28/$gp as a source but also updating that register, which the compiler shouldn't be doing, I would think. That register is the global data pointer, which is set up at program start and used by all code accessing near global variables, so it shouldn't ever change once established. (Share a godbolt.org example, if you would.)
The functions you're referring to are for composing the addresses of labels that are located in global data. Unlike x86, MIPS cannot load (or otherwise use) a 32-bit immediate in one instruction, so it uses multiple instructions to work with 32-bit immediates, including address immediates. A 32-bit immediate is subdivided into two parts: the top 16 bits are loaded using an LUI and the bottom 16 bits using an ADDI (or a LW/SW instruction), forming a two-instruction sequence.
MARS does not support these built-in functions. Instead, it uses the pseudo instruction la $reg, label, which the assembler expands into such a sequence. MARS also allows lw $reg, label to directly access the value of a global variable; however, that also expands to a multi-instruction sequence (sometimes 3 instructions, of which only 2 are really necessary).
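For illustration, with a hypothetical label var in MARS's default data segment, the pseudo instruction and a typical expansion are:

la   $t0, var           # pseudo instruction accepted by MARS
# expands to roughly:
lui  $at, 0x1001        # upper 16 bits of var's address
ori  $t0, $at, 0x0000   # lower 16 bits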
%lo computes the low 16 bits of the 32-bit address of the label given as the argument to the "function". %hi computes the upper 16 bits of the same, and would be used with LUI. Fundamentally, I would look at these "functions" as syntax that lets the assembly author tell the assembler to pass certain relocation information/requirements on to the linker. (In reverse, a disassembler may read the relocation information, determine the usage of %lo or %hi, and reflect that in the disassembly.)
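A minimal sketch in gas syntax, with a hypothetical label var:

# composing the full 32-bit address:
lui   $2, %hi(var)        # upper 16 bits, fixed up by the linker
addiu $2, $2, %lo(var)    # $2 = address of var

# or folding %lo into the memory access:
lui   $2, %hi(var)
lw    $3, %lo(var)($2)    # the load's offset carries the low 16 bits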
I don't know %neg() or %gp_rel(), though I could guess that %neg negates, and that %gp_rel produces the $28/$gp-relative value of the label.
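That guess matches the standard MIPS PIC calling convention, which would explain the instruction in the question: the caller puts the callee's own address in $25/$t9, and the callee's prologue rebuilds $gp from it. A sketch (not verbatim compiler output) for a function f:

lui    $gp, %hi(%neg(%gp_rel(f)))       # high half of -(f's offset from $gp)
daddu  $gp, $gp, $t9                    # $t9 holds f's own entry address
daddiu $gp, $gp, %lo(%neg(%gp_rel(f)))  # $gp now points at the global data area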
%lo and %hi are a bit odd in that the value of the high immediate is sometimes offset by +1. This is done when the low 16 bits will appear negative: ADDI and LW/SW sign-extend their 16-bit immediate, which adds -1 to the upper 16 bits loaded via LUI, so %hi offsets its value by +1 to compensate when that happens. This is part of the linker's operation, since it knows the full 32-bit address of the label.
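A worked example with the hypothetical address 0x00018000, whose low half appears negative:

# %lo(0x00018000) = 0x8000, which sign-extends to -0x8000
# %hi(0x00018000) = 0x0002, i.e. 0x0001 + 1 to compensate
lui   $2, 0x0002          # $2 = 0x00020000
addiu $2, $2, -0x8000     # $2 = 0x00020000 - 0x8000 = 0x00018000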
That generated code is super weird, and completely different from what the same compiler generates in its 32-bit configuration. I added the option -msym32 and then the generated code looks like I would expect.
So, this has something to do with the large(?) memory model on MIPS64, using a multiple-instruction sequence to locate and invoke g, and swapping the $28/$gp register as part of the call. Register $25/$t9 is somehow also involved, as the generated code sources it without defining it; later, just before where we would expect the call, it sets $25.
One thing I particularly don't understand, though, is where the actual function invocation is in that sequence! I would have expected a jalr instruction, if it's using an indirect branch because it doesn't know where g is (except as data), but there's virtually nothing but loads and stores.
There are two additional oddities in the output: one is the blank line near where the actual invocation should be (maybe those are normal, but you usually don't see them inside a function), and the other is a nop that is unnecessary but might have been intended for the delay slot following an invocation instruction.
I have some startup assembly for RISC-V which defines the .text section as beginning at .globl _start.
I know what this is, as a disassembly shows me the address, but I cannot see where it is defined. It's not in the linker script, and a grep in the build directories shows it appears in various binary files, but I cannot find a definition.
I am guessing this appears in a file somewhere as a function of the architecture, but can anyone tell me where? (This is all being built using the RISC-V GNU cross compilers on Linux.)
Unless you control it yourself, there is usually (at least in the GNU tools world) a file called crt0.s, or perhaps some other name. There should be one per architecture, since it is written in assembly language. It is the default bootstrap: it zeros .bss, copies .data as needed, etc.
I don't remember if it is part of the C library (glibc, newlib, etc.) or if it is added on later by the folks who build a toolchain targeting some specific platform.
It is certainly not required, but it is not uncommon for _start to be the label at the beginning of the binary; it is supposed to be the entry point. So if you have an operating system/loader that uses a binary format with labels present (ELF, etc.), it can load the binary and, instead of branching to the first address, branch to the entry point.
So _start is merely defined as being at the start of the .text section, and the address of the .text section is defined in the linker script.
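To make that concrete, here is a minimal sketch with hypothetical names such as _stack_top and the load address (not taken from your actual project). The startup file only declares and labels _start; the address comes from the linker script:

    # startup.S (sketch)
    .section .text
    .globl _start
_start:
    la      sp, _stack_top      # stack top symbol provided by the linker script
    call    main
1:  j       1b                  # spin if main ever returns

/* link.ld (sketch) */
ENTRY(_start)
SECTIONS
{
    . = 0x80000000;             /* this is where _start's address comes from */
    .text : { *(.text) }
}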
I'm intrigued by the DISCARDABLE flag in the section flags in PE files, specifically in the context of Windows drivers (in this case NDIS). I noticed that the INIT section was marked as RWX in a driver I'm reviewing, which seems odd, since good security practice says you should adopt a W^X policy.
The dump of the section is as follows:
Name Virtual Size Virtual Addr Raw Size Raw Addr Reloc Addr LineNums RelocCount LineNumCount Characteristics
INIT 00000B7E 0000E000 00000C00 0000B200 00000000 00000000 0000 0000 E2000020
The characteristics map to:
IMAGE_SCN_MEM_EXECUTE
IMAGE_SCN_MEM_READ
IMAGE_SCN_MEM_WRITE
IMAGE_SCN_MEM_DISCARDABLE
IMAGE_SCN_CNT_CODE
The INIT section seems to contain the driver entry, which implies that it might be used to ensure that the driver entry function resides in nonpaged memory, whereas the rest of the code is allowed to be paged. I'm not entirely sure, though. I can see no evidence in the driver code to say that the developers explicitly set the page flags, or forced the driver entry into a separate section, so it looks like the compiler did it automatically. I also manually flipped the writeable flag in the driver binary to test it out, and it works fine without writing enabled, so that implies that having it RWX is unnecessary.
So, my questions are:
What is the INIT section used for in the context of a Windows driver and why is it marked discardable?
How are discardable sections treated in the Windows kernel? I have some idea of how ReactOS handles them but that's still fuzzy and not massively helpful.
Why would the compiler move the driver entry to an INIT section?
Why would the compiler mark the section as RWX, when RX is sufficient and RWX may constitute a security issue?
References I've looked at so far:
What happens when you mark a section as DISCARDABLE? - The Old New Thing
Windows Executable Files - x86 Disassembly Book
Pageable and Discardable Code in a Protocol Driver - MSDN
EDIT, 2022: I forgot to update this, but a while after I posted this question I passed it on to Microsoft and it did turn out to be a bug in the MSVC linker. They were mistakenly marking the discard section that contained DriverEntry as RWX. The issue was fixed in VS2015.
What is the INIT section used for in the context of a Windows driver and why is it marked discardable?
It is normally used for the DriverEntry() function.
How are discardable sections treated in the Windows kernel?
It allows the page(s) that contain the DriverEntry() function code to be discarded. They are no longer needed after the driver is initialized.
Why would the compiler move the driver entry to an INIT section?
An NDIS driver normally contains
#pragma NDIS_INIT_FUNCTION(DriverEntry)
which is a macro in the WDK's inc/ddk/ndis.h header file:
#define NDIS_INIT_FUNCTION(_F) alloc_text(INIT,_F)
#pragma alloc_text is one of the ways to move a function into a particular section. Another common way is to bracket the DriverEntry function with #pragma code_seg(INIT) and #pragma code_seg().
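A minimal sketch of the alloc_text route, using the standard WDK declarations (actual initialization elided):

DRIVER_INITIALIZE DriverEntry;          /* must be declared before the pragma */
#pragma alloc_text(INIT, DriverEntry)

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(DriverObject);
    UNREFERENCED_PARAMETER(RegistryPath);
    /* one-time initialization; the INIT pages are discarded afterwards */
    return STATUS_SUCCESS;
}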
Why would the compiler mark the section as RWX
That requires an archeological dig. Many drivers were started a long time ago and are likely to still use ~VS6, back when life was still uncomplicated and programmers wore white hats. Or perhaps the programmer used #pragma section, yet another way to name sections; it permits setting the attributes directly. A modern toolchain certainly won't do this; you get RX from #pragma alloc_text. There is very little point in fretting about it, given that DriverEntry() lives for a very short time and any malware code that runs with ring0 privileges can do far more practical damage.
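For reference, the attribute-setting form looks like this (a sketch; discard is what yields IMAGE_SCN_MEM_DISCARDABLE, and read/write/execute map to the other flags you listed):

#pragma section("INIT", read, write, execute, discard)

Note that #pragma section only declares the section and its attributes; code still has to be placed there with #pragma alloc_text, #pragma code_seg, or (in newer compilers) __declspec(code_seg).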
I passed this information on to Microsoft and it did turn out to be a bug in the MSVC linker. They were mistakenly marking the discard section that contained DriverEntry as RWX. This issue was fixed in Visual Studio 2015.
I wrote about the issue in more detail here.
I'd like to generate pseudo-random ARM instructions. Via assembler directives, I can tell gcc what mode I'm in, and it will complain if I try a set of opcodes and operands that's not legal in that mode, so it must have some internal listing of what can be done in which mode. Where does that live? Would it be easier to extract that info from LLVM?
Is this question "not even wrong"? Should I try a different approach entirely?
To answer my own question, this is actually really easy to do from arm.md and constraints.md in gcc/config/arm/. I probably spent more time asking this question and answering comments on it than I did figuring this out. It turns out I just need to look for 'TARGET_THUMB1', until I get around to implementing Thumb2.
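For example, the machine description patterns look roughly like this (an illustrative sketch, not a verbatim excerpt from arm.md); the condition string is where TARGET_THUMB1 shows up:

(define_insn "*thumb1_addsi3"
  [(set (match_operand:SI 0 "register_operand" "=l")
        (plus:SI (match_operand:SI 1 "register_operand" "l")
                 (match_operand:SI 2 "register_operand" "l")))]
  "TARGET_THUMB1"
  "add\t%0, %1, %2")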
For the ARM family the buck stops at the ARM ARM (ARM Architecture Reference Manual). There is an ARM instruction set section and a Thumb instruction set section. Within both, each instruction tells you which generation it belongs to (ARMvX, where X is some number like 4 (ARM7) or 5 (ARM9 time frame), etc.). Since the opcode and pseudo-code are listed for each instruction, you should be able to figure out which are real instructions and which, if any, are syntax to save typing over another (push and pop, for example).
With the Cortex-M3 and Thumb2 in particular you also need to look at the TRM (Technical Reference Manual). ARM has a unified syntax, UAL (Unified Assembler Language), which they are trying to get working on both Thumb and ARM. For example, on ARM you have three-register instructions:
add r1,r1,r2
In Thumb there are only two-register operations:
add r1,r2
The desire is basically to meet in the middle, or, I would say more accurately, to encourage ARM assemblers to parse Thumb instructions and encode them as the equivalent ARM instruction without complaining. This may have started with Thumb and not Thumb2; I have always separated the two syntaxes in my code until recently (and I still generally use ARM syntax for ARM and Thumb for Thumb).
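With GNU binutils, that unified syntax is selected with the .syntax directive; a minimal sketch:

.syntax unified     @ accept UAL for both ARM and Thumb
.thumb
adds    r1, r1, r2  @ three-operand UAL form, encoded as a Thumb ADDS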
And then, yes, you have to see what the specific assembler implementation does, in your case binutils. It sounds like you have found the binutils/GNU secret decoder ring.
I'm really new to programming (in general - it's pathetic) and some Python-related assembly has cropped up in this app that I'm hacking to run on 64-bit.
Essentially, the code goes like this:
#define FUNCTION(name) \
.globl _##name; \
_##name: \
jmp *(_p_##name)
.text
FUNCTION(name)
The FUNCTION(name) syntax is used about 50 times to define jump stubs for an external Python library, as far as I can tell (I'm not going to pretend that I fully understand it; I'm just bugfixing).
Since I'm compiling for x86_64, the following error is spit out by GCC for each FUNCTION(name) instance:
32-bit absolute addressing is not supported for x86-64
cannot do signed 4 byte relocation
How would I go about "fixing" this to run on x86_64?
Grab a copy of the Intel Architecture Software Developer's Manuals. As you're seeing, some addressing forms of the jmp instruction are invalid in 64-bit mode; in particular, a 32-bit absolute address as the memory operand won't work. You will need to change to a relative addressing or absolute indirect addressing form of the instruction. Volume 2A (page 3-549 in my copy) has a huge pile of information about jmp.
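A minimal sketch of the fix, assuming _p_##name is a data word holding the target address (as in the original macro): make the indirect jump's memory operand RIP-relative, which is legal in 64-bit mode.

#define FUNCTION(name) \
    .globl _##name; \
_##name: \
    jmp *_p_##name(%rip)    /* RIP-relative: load pointer, then jump */

The original jmp *(_p_##name) encodes the pointer's location as a 32-bit absolute displacement, which is exactly what the assembler is rejecting; the RIP-relative form encodes it as a signed 32-bit offset from the instruction pointer instead.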