How can I get the correct function address on AIX? - gcc

See the simple code below:
int foo(int a)
{
return a;
}
int main() {
printf("%x\n", foo);
printf("%x\n", &foo);
printf("%x\n", *foo);
foo(1);
}
They all displayed the same value:
0x20453840
0x20453840
0x20453840
I used gdb to check the entry point of foo():
(gdb) p foo
$1 = {int (int)} 0x100003d8 <foo>
The value 0x20453840 is actually a pointer to a pointer to foo():
(gdb) p /x *0x20453850
$3 = 0x100003d8
(gdb) si
0x10000468 76 foo(1);
0x10000464 <main+76>: 38 60 00 01 li r3,1
=> 0x10000468 <main+80>: 4b ff ff 71 bl 0x100003d8 <foo>
(gdb)
foo (a=541407312) at insertcode.c:57
57 {
=> 0x100003d8 <foo+0>: 93 e1 ff fc stw r31,-4(r1)
0x100003dc <foo+4>: 94 21 ff e0 stwu r1,-32(r1)
0x100003e0 <foo+8>: 7c 3f 0b 78 mr r31,r1
0x100003e4 <foo+12>: 90 7f 00 38 stw r3,56(r31)
(gdb)
So I think 0x100003d8 is the entry point.
I used gcc 4.6.2 to compile.
I have two questions:
Why is the function address defined differently on AIX? Is it related to gcc?
I have to use gcc, not xlC.
How do I get the real function address in C on AIX?
Thanks in advance!

why different function address definition on AIX?
nm -Pg ./f_addr | grep foo
Try this command, and you will see that you have two symbols: foo and .foo. One of them lives in the code segment (or text segment), the other in the data segment.
The purpose is, indeed, creating an indirection in function calling; it is important when creating/using shared libraries.
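To make the indirection concrete: on AIX a C function pointer refers to a three-word function descriptor in the data segment (the foo symbol), and the first word of that descriptor holds the address of the code (the .foo symbol). The sketch below models that layout. The struct and function names are made up for illustration, and the cast shown in the trailing comment is an assumption that only makes sense on AIX.

```c
#include <stddef.h>

/* Hypothetical model of an AIX function descriptor: three pointer-sized words. */
struct aix_func_desc {
    void *entry; /* address of the first instruction (the .foo symbol) */
    void *toc;   /* TOC anchor that the callee expects in r2 */
    void *env;   /* environment pointer, not used by C code */
};

/* Return the code entry point stored in a descriptor. */
static void *entry_from_descriptor(const struct aix_func_desc *desc)
{
    return desc->entry;
}

/*
 * On AIX only (assumption: this cast relies on the platform ABI and is not
 * portable C):
 *
 *     void *entry = entry_from_descriptor((const struct aix_func_desc *)&foo);
 */
```

With the values from the question, the descriptor that printf reports would hold 0x100003d8 in its first word, which matches what gdb shows as the entry point.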
is it related to gcc? I have to use gcc not xlC.
No.
How to get real function address in C on AIX?
Please clarify your question: what do you want to do with the 'real address'?

Related

how to use SymGetSourceFile api for fetching source file in postmortem debugging

I want to use SymGetSourceFile to get a source file from a source server using information from a dump file. But the first parameter is a handle to a process, and during postmortem debugging we don't have a process. Is it meant to be used only by live debugging tools? How can I use it from a postmortem debugging tool?
BOOL IMAGEAPI SymGetSourceFile(
HANDLE hProcess,
ULONG64 Base,
PCSTR Params,
PCSTR FileSpec,
PSTR FilePath,
DWORD Size
);
https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symgetsourcefile
Update:
I have tried using the IDebugAdvanced3 interface for the same purpose, but I get HR = 0x80004002 from the GetSourceFileInformation call.
char buf[1000] = { 0 };
HRESULT hr = g_ExtAdvanced->GetSourceFileInformation(DEBUG_SRCFILE_SYMBOL_TOKEN,
"Application.cs",
0x000000dd6f5f1000, 0, buf, 1000, 0);
if (SUCCEEDED(hr))
{
dprintf("GetSourceFileInformation = %s", buf);
char buftok[5000] = { 0 };
hr = g_ExtAdvanced->FindSourceFileAndToken(0, 0x000000dd6f5f1000,
"Application.cs", DEBUG_FIND_SOURCE_TOKEN_LOOKUP,
buf, 1000, 0, buftok, 5000, 0);
if (SUCCEEDED(hr))
{
dprintf("FindSourceFileAndToken = %s", buftok);
}
else
dprintf("FindSourceFileAndToken HR = %x", hr);
}
else
dprintf("GetSourceFileInformation HR = %x", hr);
I have a dump that has this module and its pdb loaded, and I pass an address within the module, 0x000000dd6f5f1000, to GetSourceFileInformation.
This was a comment, but it grew, so I am adding it as an answer.
GetSourceFileInformation, IIRC, checks the source servers, i.e. the source path elements that start with srv or %srcsrv%.
It returns a token for use with FindSourceFileAndToken.
If you have a known offset (0x1070 == main() in the case below),
use GetLineByOffset; this has the added advantage of reloading all the modules.
I hope you have the private pdb for the dump file you open.
This is EngExt syntax:
Hr = m_Client->OpenDumpFile("criloc.dmp");
Hr = m_Control->WaitForEvent(0,INFINITE);
unsigned char Buff[BUFFERSIZE] = {0};
ULONG Buffused = 0;
DEBUG_READ_USER_MINIDUMP_STREAM MiniStream ={ModuleListStream,0,0,Buff,BUFFERSIZE,Buffused};
Hr = m_Advanced2->Request(DEBUG_REQUEST_READ_USER_MINIDUMP_STREAM,&MiniStream,sizeof(
DEBUG_READ_USER_MINIDUMP_STREAM),NULL,NULL,NULL);
MINIDUMP_MODULE_LIST *modlist = (MINIDUMP_MODULE_LIST *)&Buff;
Hr = m_Symbols->GetLineByOffset(modlist->Modules[0].BaseOfImage+0x1070,&Line,
FileBuffer,0x300,&Filesize,&Displacement);
Out("getlinebyoff returned %x\nsourcefile is at %s line number is %d\n",Hr,FileBuffer,Line);
This is only part of the source; adapt it to your needs.
The result of the extension command is pasted below:
0:000> .load .\mydt.dll
0:000> !mydt
Loading Dump File [C:\Users\xxxx\Desktop\srcfile\criloc.dmp]
User Mini Dump File with Full Memory: Only application data is available
OpenDumpFile Returned 0
WaitForEvent Returned 0
Request Returned 0
Ministream Buffer Used 28c
06 00 00 00 00 00 8d 00 00 00 00 00 00 e0 04 00
f0 9a 05 00 2d 2e a8 5f ba 14 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
43 00 00 00 4a 38 00 00 00 00 00 00 00 00 00 00
40 81 00 00 00 00 00 00 00 00 00 00 00 00 00 00
No of Modules =6
Module[0]
Base = 8d0000
Size = 4e000
getlinebyoff returned 0
sourcefile is at c:\users\xxx\desktop\misc\criloc\criloc.cpp line number is 21 <<<<<<<<<
||1:1:010> lm
start end module name
008d0000 0091e000 CRILOC (private pdb symbols) C:\Users\xxxx\Desktop\misc\CRILOC\CRILOC.pdb
||1:1:010>
And the actual source file contents at that path:
:\>grep -i -n main CRILOC.CPP
20:int main(void) << the opening curly brace is on line 21
UPDATE:
Yes, if the source file is not source indexed (CVS, Perforce, ...), GetSourceFileInformation() will not return a token.
It checks for a token using the Which parameter,
and the returned information can be used in FindSourceFileAndToken().
If your source is not source indexed and you only have a source path,
use FindSourceFileAndToken() with the DEBUG_FIND_SOURCE_FULL_PATH flag.
Be aware that you need to call SetSourcePath(), issue the .srcpath command, set the _NT_SOURCE_PATH environment variable, or use the -srcpath command-line switch before invoking FindSourceFileAndToken().
See below for a walkthrough.
Source file and contents:
:\>ls *.cpp
mydt.cpp
:\>cat mydt.cpp
#include <engextcpp.cpp>
#define BSIZE 0x1000
class EXT_CLASS : public ExtExtension {
public:
EXT_COMMAND_METHOD(mydt);
};
EXT_DECLARE_GLOBALS();
EXT_COMMAND( mydt, "mydt", "{;e,o,d=0;!mydt;}" ){
HRESULT Hr = m_Client->OpenDumpFile("criloc.dmp");
Hr = m_Control->WaitForEvent(0,INFINITE);
char Buff[BSIZE] = {0};
ULONG Buffused = 0;
DEBUG_READ_USER_MINIDUMP_STREAM MiniStream ={ModuleListStream,0,0,
Buff,BSIZE,Buffused};
Hr = m_Advanced2->Request(DEBUG_REQUEST_READ_USER_MINIDUMP_STREAM,&MiniStream,
sizeof(DEBUG_READ_USER_MINIDUMP_STREAM),NULL,NULL,NULL);
MINIDUMP_MODULE_LIST *modlist = (MINIDUMP_MODULE_LIST *)&Buff;
//m_Symbols->SetSourcePath("C:\\Users\\xxx\\Desktop\\misc\\CRILOC");
char srcfilename[BSIZE] ={0};
ULONG foundsize =0 ;
Hr = m_Advanced3->FindSourceFileAndToken(0,modlist->Modules[0].BaseOfImage,"criloc.cpp",
DEBUG_FIND_SOURCE_FULL_PATH,NULL,0,NULL,srcfilename,0x300,&foundsize);
Out("gsfi returned %x\n" , Hr);
Out("srcfilename is %s\n",srcfilename);
}
compiled and linked with
:\>cat bld.bat
@echo off
set "INCLUDE= %INCLUDE%;E:\windjs\windbg_18362\inc"
set "LIB=%LIB%;E:\windjs\windbg_18362\lib\x86"
set "LINKLIBS=user32.lib kernel32.lib dbgeng.lib dbghelp.lib"
cl /LD /nologo /W4 /Od /Zi /EHsc mydt.cpp /link /nologo /EXPORT:DebugExtensionInitialize /Export:mydt /Export:help /RELEASE %linklibs%
:\>bld.bat
mydt.cpp
E:\windjs\windbg_18362\inc\engextcpp.cpp(1849): warning C4245: 'argument': conversion from 'int' to 'ULONG64', signed/unsigned mismatch
Creating library mydt.lib and object mydt.exp
:\>file mydt.dll
mydt.dll; PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit
executing
:\>cdb cdb
Microsoft (R) Windows Debugger Version 10.0.18362.1 X86
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
ntdll!LdrpDoDebuggerBreak+0x2c:
77d805a6 cc int 3
0:000> .load .\mydt.dll
0:000> .chain
Extension DLL chain:
.\mydt.dll: API 1.0.0, built Thu Mar 18 20:40:04 2021
[path: C:\Users\xxxx\Desktop\srcfile\New folder\mydt.dll]
0:000> !mydt
Loading Dump File [C:\Users\xxxx\Desktop\srcfile\New folder\criloc.dmp]
User Mini Dump File with Full Memory: Only application data is available
gsfi returned 80004002
srcfilename is
||1:1:010> .srcpath "c:\\users\\xxxx\\desktop\\misc\\criloc\\"
Source search path is: c:\\users\\xxxx\\desktop\\misc\\criloc\\
************* Path validation summary **************
Response Time (ms) Location
OK c:\\users\\xxxx\\desktop\\misc\\criloc\\
||1:1:010> !mydt
Loading Dump File [C:\Users\xxxx\Desktop\srcfile\New folder\criloc.dmp]
gsfi returned 0
srcfilename is c:\\users\\xxxx\\desktop\\misc\\criloc\\criloc.cpp
||2:2:021>

mprotect subset of functions in shared library

I'm trying to mprotect a subset of functions bundled in a shared library for the purposes of a larger level feature.
Given the page-alignment requirements of mprotect, I've set up a separate section in my linker script that ensures that alignment:
SECTIONS
{
.protectedsection ALIGN(4096) : {
*(.protectedsection)
}
}
INSERT AFTER .rodata;
And in the declaration of the function I want protection on, I add the relevant GCC attributes:
// Setup with protection
void bar(int a) __attribute__ ((section ("protectedsection")));
// Setup without protection
void foo(int a);
I compile the resultant C code with GCC, along with the -T option to pass in the linker file:
gcc -fpic -shared -T linkerscript.ld libfuncs.c -o libfuncs.so
Objdump'ing it reveals that while it is in the right section, the alignment isn't right:
Disassembly of section protectedsection:
00000000000010da <_Z3bari>:
10da: 55 push %rbp
10db: 48 89 e5 mov %rsp,%rbp
10de: 48 83 ec 10 sub $0x10,%rsp
10e2: 89 7d fc mov %edi,-0x4(%rbp)
10e5: 48 8d 3d e1 00 00 00 lea 0xe1(%rip),%rdi # 11cd <_fini+0x9>
10ec: e8 df f4 ff ff callq 5d0 <puts@plt>
10f1: 90 nop
10f2: c9 leaveq
10f3: c3 retq
Is what I'm trying to do here possible, and if so, how?
Found the problem: changing 'INSERT AFTER .rodata' to 'INSERT AFTER .text' fixed it. With that, I was able to set up a 'buffered' region for the code I want to have mprotect'ed, and it all works like a charm!
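For context on why the alignment matters: mprotect requires a page-aligned start address and always changes whole pages. The sketch below rounds an arbitrary address down to its page and protects every page that overlaps a range. The helper names are hypothetical, and the page size is taken from sysconf rather than hard-coded to 4096.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of the page that contains it.
 * page_size must be a power of two. */
static uintptr_t page_floor(uintptr_t addr, uintptr_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Change the protection of every page overlapping [addr, addr + len). */
static int protect_range(void *addr, size_t len, int prot)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = page_floor((uintptr_t)addr, page_size);
    uintptr_t end = (uintptr_t)addr + len;

    /* The kernel rounds the length up to a whole number of pages. */
    return mprotect((void *)start, (size_t)(end - start), prot);
}
```

With 4096-byte pages, the function at 0x10da in the objdump output lives on the page starting at 0x1000, so protecting it also affects anything else placed on that page. That is the reason for giving the protected code its own aligned region.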

What's the proper way of calling a Win32/64 function from LLVM?

I'm attempting to call a method from LLVM IR back to C++ code. I'm working in 64-bit Visual C++, or as LLVM describes it:
Machine CPU: skylake
Machine info: x86_64-pc-windows-msvc
For integer types and pointer types my code works fine as-is. However, floating point numbers seem to be handled a bit strangely.
Basically the call looks like this:
struct SomeStruct
{
static void Breakpoint(uint8_t* ptr) { return; } // used to set a breakpoint
static void Set(uint8_t* ptr, double foo) { return foo * 2; }
};
and LLVM IR looks like this:
define i32 @main(i32, i8**) {
varinit:
// omitted here: initialize %ptr from i8**.
%5 = load i8*, i8** %instance0
// call to some method. This works - I use it to set a breakpoint
call void @"Helper::Breakpoint"(i8* %5)
// this call fails:
call void @"Helper::Set"(i8* %5, double 0xC19EC46965A6494D)
ret i32 0
}
declare double @"SomeStruct::Callback"(i8*, double)
I figured that the problem is probably in the way the calling conventions work. So I've attempted to make some adjustments to correct for that:
// during initialization of the function
auto function = llvm::Function::Create(functionType, llvm::Function::ExternalLinkage, name, module);
function->setCallingConv(llvm::CallingConv::X86_64_Win64);
...
// during calling of the function
call->setCallingConv(llvm::CallingConv::X86_64_Win64);
Unfortunately no matter what I try, I end up with 'invalid instruction' errors, which this user reports to be an issue with calling conventions: Clang producing executable with illegal instruction . I've tried this with X86-64_Win64, Stdcall, Fastcall and no calling convention specs - all with the same result.
I've read up on https://msdn.microsoft.com/en-us/library/ms235286.aspx in an attempt to figure out what's going on. Then I looked at the assembly output that's supposed to be generated by LLVM (using the targetMachine->addPassesToEmitFile API call) and found:
movq (%rdx), %rsi
movq %rsi, %rcx
callq "Helper2<double>::Breakpoint"
vmovsd __real@c19ec46965a6494d(%rip), %xmm1
movq %rsi, %rcx
callq "Helper2<double>::Set"
xorl %eax, %eax
addq $32, %rsp
popq %rsi
According to MSDN, argument 2 should be in %xmm1 so that also seems correct. However, when checking if everything works in the debugger, Visual Studio reports a lot of question marks (e.g. 'illegal instruction').
Any feedback is appreciated.
The disassembly code:
00000144F2480007 48 B8 B6 48 B8 C8 FA 7F 00 00 mov rax,7FFAC8B848B6h
00000144F2480011 48 89 D1 mov rcx,rdx
00000144F2480014 48 89 54 24 20 mov qword ptr [rsp+20h],rdx
00000144F2480019 FF D0 call rax
00000144F248001B 48 B8 C0 48 B8 C8 FA 7F 00 00 mov rax,7FFAC8B848C0h
00000144F2480025 48 B9 00 00 47 F2 44 01 00 00 mov rcx,144F2470000h
00000144F248002F ?? ?? ??
00000144F2480030 ?? ?? ??
00000144F2480031 FF 08 dec dword ptr [rax]
00000144F2480033 10 09 adc byte ptr [rcx],cl
00000144F2480035 48 8B 4C 24 20 mov rcx,qword ptr [rsp+20h]
00000144F248003A FF D0 call rax
00000144F248003C 31 C0 xor eax,eax
00000144F248003E 48 83 C4 28 add rsp,28h
00000144F2480042 C3 ret
Some of the information about the memory is missing. Memory view:
0x00000144F248001B 48 b8 c0 48 b8 c8 fa 7f 00 00 48 b9 00 00 47 f2 44 01 00 00 62 f1 ff 08 10 09 48 8b 4c 24 20 ff d0 31 c0 48 83 c4 28 c3 00 00 00 00 00 ...
The bytes hidden behind the question marks are '62 f1'.
Some code is helpful to see how I get the JIT to compile etc. I'm afraid it's a bit long, but helps to get the idea... and I have no clue how to create a smaller piece of code.
// Note: FunctionBinderBase basically holds an llvm::Function* object
// which is bound using the above code and a name.
llvm::ExecutionEngine* Module::Compile(std::unordered_map<std::string, FunctionBinderBase*>& externalFunctions)
{
// DebugFlag = true;
#if (LLVMDEBUG >= 1)
this->module->dump();
#endif
// -- Initialize LLVM compiler: --
std::string error;
// Helper function, gets the current machine triplet.
llvm::Triple triple(MachineContextInfo::Triplet());
const llvm::Target *target = llvm::TargetRegistry::lookupTarget("x86-64", triple, error);
if (!target)
{
throw error.c_str();
}
llvm::TargetOptions Options;
// Options.PrintMachineCode = true;
// Options.EnableFastISel = true;
std::unique_ptr<llvm::TargetMachine> targetMachine(
target->createTargetMachine(MachineContextInfo::Triplet(), MachineContextInfo::CPU(), "", Options, llvm::Reloc::Default, llvm::CodeModel::Default, llvm::CodeGenOpt::Aggressive));
if (!targetMachine.get())
{
throw "Could not allocate target machine!";
}
// Create the target machine; set the module data layout to the correct values.
auto DL = targetMachine->createDataLayout();
module->setDataLayout(DL);
module->setTargetTriple(MachineContextInfo::Triplet());
// Pass manager builder:
llvm::PassManagerBuilder pmbuilder;
pmbuilder.OptLevel = 3;
pmbuilder.BBVectorize = false;
pmbuilder.SLPVectorize = true;
pmbuilder.LoopVectorize = true;
pmbuilder.Inliner = llvm::createFunctionInliningPass(3, 2);
llvm::TargetLibraryInfoImpl *TLI = new llvm::TargetLibraryInfoImpl(triple);
pmbuilder.LibraryInfo = TLI;
// Generate pass managers:
// 1. Function pass manager:
llvm::legacy::FunctionPassManager FPM(module.get());
pmbuilder.populateFunctionPassManager(FPM);
// 2. Module pass manager:
llvm::legacy::PassManager PM;
PM.add(llvm::createTargetTransformInfoWrapperPass(targetMachine->getTargetIRAnalysis()));
pmbuilder.populateModulePassManager(PM);
// 3. Execute passes:
// - Per-function passes:
FPM.doInitialization();
for (llvm::Module::iterator I = module->begin(), E = module->end(); I != E; ++I)
{
if (!I->isDeclaration())
{
FPM.run(*I);
}
}
FPM.doFinalization();
// - Per-module passes:
PM.run(*module);
// Fix function pointers; the PM.run will ruin them, this fixes that.
for (auto it : externalFunctions)
{
auto name = it.first;
auto fcn = module->getFunction(name);
it.second->function = fcn;
}
#if (LLVMDEBUG >= 2)
// -- ASSEMBLER dump code
// 3. Code generation pass manager:
llvm::legacy::PassManager CGP;
CGP.add(llvm::createTargetTransformInfoWrapperPass(targetMachine->getTargetIRAnalysis()));
pmbuilder.populateModulePassManager(CGP);
std::string result;
llvm::raw_string_ostream str(result);
llvm::buffer_ostream os(str);
targetMachine->addPassesToEmitFile(CGP, os, llvm::TargetMachine::CodeGenFileType::CGFT_AssemblyFile);
CGP.run(*module);
str.flush();
auto stringref = os.str();
std::string assembly(stringref.begin(), stringref.end());
std::cout << "ASM code: " << std::endl << "---------------------" << std::endl << assembly << std::endl << "---------------------" << std::endl;
// -- end of ASSEMBLER dump code.
for (auto it : externalFunctions)
{
auto name = it.first;
auto fcn = module->getFunction(name);
it.second->function = fcn;
}
#endif
#if (LLVMDEBUG >= 2)
module->dump();
#endif
// All done, *RUN*.
llvm::EngineBuilder engineBuilder(std::move(module));
engineBuilder.setEngineKind(llvm::EngineKind::JIT);
engineBuilder.setMCPU(MachineContextInfo::CPU());
engineBuilder.setMArch("x86-64");
engineBuilder.setUseOrcMCJITReplacement(false);
engineBuilder.setOptLevel(llvm::CodeGenOpt::None);
llvm::ExecutionEngine* engine = engineBuilder.create();
// Define external functions
for (auto it : externalFunctions)
{
auto fcn = it.second;
if (fcn->function)
{
engine->addGlobalMapping(fcn->function, const_cast<void*>(fcn->FunctionPointer())); // Yuck... LLVM only takes non-const pointers
}
}
// Finalize
engine->finalizeObject();
return engine;
}
Update (progress)
Apparently my Skylake has problems with the vmovsd instruction. When running the same code on a Haswell (server), the test succeeds. I've checked the assembly output on both - they are exactly the same.
Just to be sure: XSAVE/XRESTORE shouldn't be the problem on Win10-x64, but let's find out anyways. I've checked the features with the code from https://msdn.microsoft.com/en-us/library/hskdteyh.aspx and the XSAVE/XRESTORE from https://insufficientlycomplicated.wordpress.com/2011/11/07/detecting-intel-advanced-vector-extensions-avx-in-visual-studio/ . The latter runs just fine. As for the former, these are the results:
GenuineIntel
Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
3DNOW not supported
3DNOWEXT not supported
ABM not supported
ADX supported
AES supported
AVX supported
AVX2 supported
AVX512CD not supported
AVX512ER not supported
AVX512F not supported
AVX512PF not supported
BMI1 supported
BMI2 supported
CLFSH supported
CMPXCHG16B supported
CX8 supported
ERMS supported
F16C supported
FMA supported
FSGSBASE supported
FXSR supported
HLE supported
INVPCID supported
LAHF supported
LZCNT supported
MMX supported
MMXEXT not supported
MONITOR supported
MOVBE supported
MSR supported
OSXSAVE supported
PCLMULQDQ supported
POPCNT supported
PREFETCHWT1 not supported
RDRAND supported
RDSEED supported
RDTSCP supported
RTM supported
SEP supported
SHA not supported
SSE supported
SSE2 supported
SSE3 supported
SSE4.1 supported
SSE4.2 supported
SSE4a not supported
SSSE3 supported
SYSCALL supported
TBM not supported
XOP not supported
XSAVE supported
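As a side note on where the AVX2 and AVX512F lines in this listing come from: both are bits of the EBX register returned by CPUID leaf 7, subleaf 0 (bit 5 for AVX2, bit 16 for AVX512F). A minimal sketch of decoding them follows. The macro and function names are made up, and obtaining the EBX value (for example with the __cpuidex intrinsic on MSVC) is left to the caller.

```c
#include <stdint.h>

/* Feature bits in EBX of CPUID leaf 7, subleaf 0. */
#define CPUID7_EBX_AVX2    (UINT32_C(1) << 5)
#define CPUID7_EBX_AVX512F (UINT32_C(1) << 16)

/* Return nonzero if the given leaf-7 EBX value reports AVX2. */
static int has_avx2(uint32_t leaf7_ebx)
{
    return (leaf7_ebx & CPUID7_EBX_AVX2) != 0;
}

/* Return nonzero if the given leaf-7 EBX value reports AVX-512 Foundation. */
static int has_avx512f(uint32_t leaf7_ebx)
{
    return (leaf7_ebx & CPUID7_EBX_AVX512F) != 0;
}
```

On the machine above, has_avx2 would report true and has_avx512f false, matching the "AVX2 supported" and "AVX512F not supported" lines.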
It's weird, so I figured: why not simply emit the instruction directly.
int main()
{
const double value = 1.2;
const double value2 = 1.3;
auto x1 = _mm_load_sd(&value);
auto x2 = _mm_load_sd(&value2);
std::string s;
std::getline(std::cin, s);
}
This code runs fine. The disassembly:
auto x1 = _mm_load_sd(&value);
00007FF7C4833724 C5 FB 10 45 08 vmovsd xmm0,qword ptr [value]
auto x1 = _mm_load_sd(&value);
00007FF7C4833729 C5 F1 57 C9 vxorpd xmm1,xmm1,xmm1
00007FF7C483372D C5 F3 10 C0 vmovsd xmm0,xmm1,xmm0
Apparently it won't use register xmm1, but still proves that the instruction itself does the trick.
I just checked on another Intel Haswell what's going on here, and found this:
0000015077F20110 C5 FB 10 08 vmovsd xmm1,qword ptr [rax]
Apparently on Intel Haswell it emits another byte code instruction than on my Skylake.
@Ha. was actually kind enough to point me in the right direction here. Yes, the hidden bytes indeed indicate VMOVSD, but apparently it's encoded as EVEX. That's all nice and well, but the EVEX prefix / encoding will be introduced in the latest Skylake architecture as part of AVX512, which won't be supported until Skylake Purley in 2017. In other words, on this CPU it is an invalid instruction.
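The difference between the two encodings is visible in the first byte of the instruction: in 64-bit mode, 0xC5 and 0xC4 start a VEX prefix (AVX), while 0x62 starts an EVEX prefix (AVX-512). A small sketch that classifies that leading byte, with hypothetical names:

```c
/* Kinds of leading byte relevant to the listings above (64-bit mode only). */
enum simd_prefix {
    PREFIX_OTHER, /* anything else, including legacy SSE encodings */
    PREFIX_VEX2,  /* 0xC5: two-byte VEX prefix */
    PREFIX_VEX3,  /* 0xC4: three-byte VEX prefix */
    PREFIX_EVEX   /* 0x62: four-byte EVEX prefix */
};

/* Classify the first byte of an instruction encoding. */
static enum simd_prefix classify_prefix64(unsigned char first_byte)
{
    switch (first_byte) {
    case 0xC5: return PREFIX_VEX2;
    case 0xC4: return PREFIX_VEX3;
    case 0x62: return PREFIX_EVEX;
    default:   return PREFIX_OTHER;
    }
}
```

Applied to the bytes in this question, the failing sequence 62 f1 ff 08 10 09 classifies as EVEX, whereas the Haswell sequence C5 FB 10 08 classifies as two-byte VEX.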
To check, I've put a breakpoint in X86MCCodeEmitter::EmitMemModRMByte. At some point, I do see a bool HasEVEX = [...] evaluating to true. This confirms that the codegen / emitter is producing the wrong output.
My conclusion is therefore that this has to be a bug in the target information of LLVM for Skylake CPUs. That means there are only two things remaining to do: figure out where exactly this bug is in LLVM so we can solve it, and report the bug to the LLVM team...
So where is it in LLVM? That's tough to tell... x86.td.def defines skylake features as 'FeatureAVX512' which will probably trigger X86SSELevel to AVX512F. That in turn will give the wrong instructions. As a workaround, it's best to simply tell LLVM that we have an Intel Haswell instead and all will be well:
// MCPU is used to call createTargetMachine
llvm::StringRef MCPU = llvm::sys::getHostCPUName();
if (MCPU.str() == "skylake")
{
MCPU = llvm::StringRef("haswell");
}
Test, works.

JNI DLL crashes JVM (32-bit only)

I have a JNI DLL that crashes when using GetFieldID() on a class object that was passed into a function. The library works fine on Linux with 32-bit and 64-bit JVMs and only crashes when using a 32-bit JVM under Windows; 64-bit is fine.
The original DLL was cross-compiled on ubuntu 13.10 x86_64 using MinGW-w64 GCC 4.6.3, but I also compiled it natively under Windows using MinGW-w64 GCC 4.6.3 and I still got the same crash. Using ubuntu 14.04 with MinGW-w64 4.8.2 still produces the same error.
It appears there is some memory corruption going on, since with an unoptimized DLL the crash doesn't happen on the first call to GetFieldID(), but on a later one (the original DLL has way more code than the stripped-down example below) or even after the function has finished, somewhere in the JVM garbage collection.
The JVM I am using is Java 7u60, but I also tried it with 8u5 and got the same results. I tested it with the 32-bit JVM on both 64-bit and 32-bit systems, as I came across an article that said that a 32-bit JVM might not be reliable on 64-bit Windows operating systems (sounded a bit bogus to me, but just to be sure).
Also, there are other JNI DLLs that don't use GetFieldID() at all, and they work just fine with 32-bit.
The crash data from the hs_err_pid.log
Current thread (0x00d5e000): JavaThread "main" [_thread_in_native, id=1104, stack(0x00dd0000,0x00e20000)]
siginfo: ExceptionCode=0xc0000005, ExceptionInformation=0x00000008 0x3462c9e8
Registers:
EAX=0x00000000, EBX=0x00e1f1fc, ECX=0x97254d7c, EDX=0x00d5eac4
ESP=0x00e1f1dc, EBP=0x00e1f1ec, ESI=0x3462c6e8, EDI=0x00d5e000
EIP=0x3462c9e8, EFLAGS=0x00010246
Top of Stack: (sp=0x00e1f1dc)
0x00e1f1dc: 00000000 3462c6e8 00000000 00e1f1fc
0x00e1f1ec: 00e1f224 025f334f 246970c0 025f88c9
0x00e1f1fc: 24695668 2460b700 00e1f204 34628d1b
0x00e1f20c: 00e1f22c 34628ee8 00000000 34628d40
0x00e1f21c: 00e1f1fc 00e1f22c 00e1f25c 025f3207
0x00e1f22c: 24693760 24693760 00000001 24693758
0x00e1f23c: 00e1f234 34628c56 00e1f264 34628ee8
0x00e1f24c: 00000000 34628c88 00e1f22c 00e1f268
Instructions: (pc=0x3462c9e8)
0x3462c9c8: 78 bc 62 34 50 bb 62 34 c0 bd 62 34 30 bd 62 34
0x3462c9d8: 00 00 00 00 00 00 00 00 0c 00 00 00 02 00 00 00
0x3462c9e8: 01 00 00 00 60 f9 5f 39 02 00 00 00 a0 b9 62 34
0x3462c9f8: 0a 00 b8 00 10 d6 00 39 00 00 00 00 01 00 40 80
Register to memory mapping:
EAX=0x00000000 is an unknown value
EBX=0x00e1f1fc is pointing into the stack for thread: 0x00d5e000
ECX=0x97254d7c is an unknown value
EDX=0x00d5eac4 is an unknown value
ESP=0x00e1f1dc is pointing into the stack for thread: 0x00d5e000
EBP=0x00e1f1ec is pointing into the stack for thread: 0x00d5e000
ESI=0x3462c6e8 is an oop
{method}
- klass: {other class}
EDI=0x00d5e000 is a thread
Stack: [0x00dd0000,0x00e20000], sp=0x00e1f1dc, free space=316k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C 0x3462c9e8
j jnitest.JNIClass.<init>()V+27
j jnitest.JNIClass.getInstance()Ljnitest/JNIClass;+22
j jnitest.Program.main([Ljava/lang/String;)V+0
v ~StubRoutines::call_stub
V [jvm.dll+0x140e6a]
V [jvm.dll+0x20529e]
V [jvm.dll+0x140eed]
V [jvm.dll+0x14d2ee]
V [jvm.dll+0x14d515]
V [jvm.dll+0xf1f99]
C [java.dll+0x7d82]
j sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+87
j sun.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+6
j java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+57
j com.intellij.rt.execution.application.AppMain.main([Ljava/lang/String;)V+163
v ~StubRoutines::call_stub
V [jvm.dll+0x140e6a]
V [jvm.dll+0x20529e]
V [jvm.dll+0x140eed]
V [jvm.dll+0xca5c5]
V [jvm.dll+0xd5267]
C [java.exe+0x2063]
C [java.exe+0xa5d1]
C [java.exe+0xa65b]
C [kernel32.dll+0x1338a]
C [ntdll.dll+0x39f72]
C [ntdll.dll+0x39f45]
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j jnitest.JNIWrapper.createUuid(Ljnitest/JNIWrapper$sender_id_t;)I+25
j jnitest.JNIClass.<init>()V+27
j jnitest.JNIClass.getInstance()Ljnitest/JNIClass;+22
j jnitest.Program.main([Ljava/lang/String;)V+0
v ~StubRoutines::call_stub
j sun.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+0
j sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+87
j sun.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+6
j java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+57
j com.intellij.rt.execution.application.AppMain.main([Ljava/lang/String;)V+163
v ~StubRoutines::call_stub
The Java class:
package jnitest;
public class JNIClass {
static final Object _mutex = new Object();
static JNIClass _instance = null;
public static JNIClass getInstance()
{
if (_instance == null)
{
synchronized (_mutex)
{
if (_instance == null)
_instance = new JNIClass();
}
}
return _instance;
}
JNIWrapper.sender_id_t sid = null;
JNIClass() {
//create uuid
sid = new JNIWrapper.sender_id_t();
System.out.print(JNIWrapper.createUuid(sid));
}
}
The JNI wrapper class:
package jnitest;
public final class JNIWrapper {
static {
System.loadLibrary("JNIWrapper");
}
public static class sender_id_t
{
public long phy_idx;
}
public static native int createUuid(JNIWrapper.sender_id_t id);
}
The application:
package jnitest;
public class Program
{
public static void main(String[] args)
{
JNIClass.getInstance();
System.exit(0);
}
}
The auto-generated JNI DLL header:
/* DO NOT EDIT THIS FILE - it is machine generated */
#include <jni.h>
/* Header for class jnitest_JNIWrapper */
#ifndef _Included_jnitest_JNIWrapper
#define _Included_jnitest_JNIWrapper
#ifdef __cplusplus
extern "C" {
#endif
/*
* Class: jnitest_JNIWrapper
* Method: createUuid
* Signature: (Ljnitest/JNIWrapper/sender_id_t;)I
*/
JNIEXPORT jint JNICALL Java_jnitest_JNIWrapper_createUuid
(JNIEnv *, jclass, jobject);
#ifdef __cplusplus
}
#endif
#endif
The JNI DLL implementation (updated to be able to use either C or C++ interface):
#include "jnitest_JNIWrapper.h"
#ifdef __cplusplus
extern "C" {
#endif
#ifdef __cplusplus
#define JNIFUNC(e,f) e->f()
#define JNIFUNCV(e,f,...) e->f(__VA_ARGS__)
#else
#define JNIFUNC(e,f) (*e)->f(e)
#define JNIFUNCV(e,f,...) (*e)->f(e,__VA_ARGS__)
#endif
JNIEXPORT jint JNICALL Java_jnitest_JNIWrapper_createUuid(JNIEnv *env, jclass clazz, jobject sid)
{
(void)clazz;
jclass cls = JNIFUNCV(env,GetObjectClass, sid);
jfieldID phyID = JNIFUNCV(env,GetFieldID, cls, "phy_idx", "J");
(void)phyID;
if (JNIFUNC(env,ExceptionCheck))
return 100;
return 0;
}
#ifdef __cplusplus
}
#endif
Update:
The compilation command:
i686-w64-mingw32-gcc -std=c99 -O3 -s -Wall -Wextra -Werror -o ../bin/JNIWrapper.dll -shared -Wl,--subsystem,windows dllmain.c JNIWrapper.c -I /usr/lib/jvm/java-7-openjdk-amd64/include
You are trying to get a field ID of an inner class, but your cls variable is the outer class JNIWrapper. You probably need to run something like (*env)->FindClass(env, "jnitest/JNIWrapper$sender_id_t") to get the correct cls to call GetFieldID on. The javap -c tool can tell you what "jnitest/JNIWrapper$sender_id_t" should be.
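The class name passed to FindClass uses '/' between packages and '$' between an outer class and its nested class. As a rough illustration of that naming rule only, the hypothetical helper below assembles such a name into a caller-supplied buffer; it does not call into JNI itself.

```c
#include <stdio.h>

/* Build the JNI name of a nested class as "<outer>$<inner>".
 * Returns 0 on success, -1 if the buffer is too small. */
static int jni_nested_class_name(char *buf, size_t size,
                                 const char *outer, const char *inner)
{
    int n = snprintf(buf, size, "%s$%s", outer, inner);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Called with "jnitest/JNIWrapper" and "sender_id_t", it yields the string used in the FindClass call above.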

Unexpected global variable read result in C++ using avr-gcc (local variable access is as expected)

I am getting unexpected global variable read results when compiling the following code in avr-gcc 4.6.2 for ATmega328:
#include <avr/io.h>
#include <util/delay.h>
#define LED_PORT PORTD
#define LED_BIT 7
#define LED_DDR DDRD
uint8_t latchingFlag;
int main() {
LED_DDR = 0xFF;
for (;;) {
latchingFlag=1;
if (latchingFlag==0) {
LED_PORT ^= 1<<LED_BIT; // Toggle the LED
_delay_ms(100); // Delay
latchingFlag = 1;
}
}
}
This is the entire code. I would expect the LED toggling to never execute, seeing as latchingFlag is set to 1, however the LED blinks continuously. If latchingFlag is declared local to main() the program executes as expected: the LED never blinks.
The disassembled code doesn't reveal any gotchas that I can see, here's the disassembly of the main loop of the version using the global variable (with the delay routine call commented out; same behavior)
59 .L4:
27:main.cpp **** for (;;) {
60 .loc 1 27 0
61 0026 0000 nop
62 .L3:
28:main.cpp **** latchingFlag=1;
63 .loc 1 28 0
64 0028 81E0 ldi r24,lo8(1)
65 002a 8093 0000 sts latchingFlag,r24
29:main.cpp **** if (latchingFlag==0) {
66 .loc 1 29 0
67 002e 8091 0000 lds r24,latchingFlag
68 0032 8823 tst r24
69 0034 01F4 brne .L4
30:main.cpp **** LED_PORT ^= 1<<LED_BIT; // Toggle the LED
70 .loc 1 30 0
71 0036 8BE2 ldi r24,lo8(43)
72 0038 90E0 ldi r25,hi8(43)
73 003a 2BE2 ldi r18,lo8(43)
74 003c 30E0 ldi r19,hi8(43)
75 003e F901 movw r30,r18
76 0040 3081 ld r19,Z
77 0042 20E8 ldi r18,lo8(-128)
78 0044 2327 eor r18,r19
79 0046 FC01 movw r30,r24
80 0048 2083 st Z,r18
31:main.cpp **** latchingFlag = 1;
81 .loc 1 31 0
82 004a 81E0 ldi r24,lo8(1)
83 004c 8093 0000 sts latchingFlag,r24
27:main.cpp **** for (;;) {
84 .loc 1 27 0
85 0050 00C0 rjmp .L4
The lines 71-80 are responsible for port access: according to the datasheet, PORTD is at address 0x2B, which is decimal 43 (cf. lines 71-74).
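The port access in lines 71-80 boils down to a read, an XOR with the mask 0x80 (which the listing shows as lo8(-128)), and a write back. A minimal sketch of that operation in portable C, using a hypothetical helper name:

```c
#include <stdint.h>

/* Flip a single bit of an 8-bit port value, as LED_PORT ^= 1<<LED_BIT does. */
static uint8_t toggle_bit(uint8_t value, unsigned bit)
{
    return (uint8_t)(value ^ (uint8_t)(1u << bit));
}
```

For LED_BIT = 7 the mask is 1<<7 = 0x80, so toggling twice returns the port to its original value.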
The only difference between local/global declaration of the latchingFlag variable is how latchingFlag is accessed: the global variable version uses sts (store direct to data space) and lds (load direct from data space) to access latchingFlag, whereas the local variable version uses ldd (Load Indirect from Data Space to Register) and std (Store Indirect From Register to Data Space) using register Y as the address register (which can be used as a stack pointer, by avr-gcc AFAIK). Here are the relevant lines from the disassembly:
63 002c 8983 std Y+1,r24
65 002e 8981 ldd r24,Y+1
81 004a 8983 std Y+1,r24
The global version also has latchingFlag in the .bss section. I am really not sure what to attribute the different global vs. local variable behavior to. Here's the avr-gcc command line (notice -O0):
/usr/local/avr/bin/avr-gcc \
-I. -g -mmcu=atmega328p -O0 \
-fpack-struct \
-fshort-enums \
-funsigned-bitfields \
-funsigned-char \
-D CLOCK_SRC=8000000UL \
-D CLOCK_PRESCALE=8UL \
-D F_CPU="(CLOCK_SRC/CLOCK_PRESCALE)" \
-Wall \
-ffunction-sections \
-fdata-sections \
-fno-exceptions \
-Wa,-ahlms=obj/main.lst \
-Wno-uninitialized \
-c main.cpp -o obj/main.o
With the -Os compiler flag the loop is gone from the disassembly, but it can be forced back by declaring latchingFlag volatile, in which case the unexpected behavior persists for me.
According to your disassembler listing, the latchingFlag global variable is located at RAM address 0. This address corresponds to the mirrored register r0 and is not a valid RAM address for a global variable.
After a couple of checks and code comparisons in EE chat, I noticed that my version of avr-gcc (4.7.0) stores the value of latchingFlag at 0x0100, whereas Egor Skriptunoff mentioned SRAM address 0 being in the OP's assembly listing.
Looking at the OP's disassembly (the avr-dump version), I noticed that the OP's compiler (4.6.2) stores the latchingFlag value at a different address (specifically, 0x060) than my compiler (version 4.7.0), which stores it at address 0x0100.
My advice is to update avr-gcc to at least version 4.7.0. The advantage of 4.7.0 over the latest and greatest available version is the ability to compare the generated code with my findings again.
Of course, if 4.7.0 solves the issue, then there is no harm in upgrading to a more recent version (if available).
Egor Skriptunoff's suggestion is almost exactly right: the SRAM variable is mapped to the wrong memory address. The latchingFlag variable is not at address 0x0100, which is the first valid SRAM address, but is mapped to 0x060, overlapping the WDTCSR register. This can be seen in disassembly lines like the following one:
lds r24, 0x0060
This line is supposed to load the value of latchingFlag from SRAM, and we can see that location 0x060 is used instead of 0x100.
The problem has to do with a bug in binutils that is triggered when two conditions are met:
The linker is invoked with --gc-sections flag (compiler options: -Wl,--gc-sections) to save code space
None of your SRAM variables are initialized (i.e. initialized to non-zero values)
When both of these conditions are met, the .data section gets removed. When the .data section is missing, the SRAM variables start at address 0x060 instead of 0x100.
One solution is to reinstall binutils: the current versions have this bug fixed. Another solution is to edit your linker scripts: on Ubuntu these are probably in /usr/lib/ldscripts. For the ATmega168/328 the script that needs to be edited is avr5.x, but you should really edit all of them, otherwise you could run into this bug on other AVR platforms. The change that needs to be made is the following one:
.data : AT (ADDR (.text) + SIZEOF (.text))
{
PROVIDE (__data_start = .) ;
- *(.data)
+ KEEP(*(.data))
So replace the line *(.data) with KEEP(*(.data)). This ensures that the .data section is not discarded, and consequently the SRAM variable addresses start at 0x0100.
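To see at a glance why 0x060 is a bad place for a variable, it helps to map addresses onto the ATmega328P data space: general-purpose registers, I/O registers, extended I/O registers, and only then internal SRAM. The sketch below encodes that map; the enum and function names are made up for illustration.

```c
#include <stdint.h>

/* Regions of the ATmega328P data address space. */
enum avr_region {
    REGION_REGISTERS, /* 0x0000-0x001F: r0..r31 */
    REGION_IO,        /* 0x0020-0x005F: I/O registers, e.g. PORTD at 0x2B */
    REGION_EXT_IO,    /* 0x0060-0x00FF: extended I/O, e.g. WDTCSR at 0x60 */
    REGION_SRAM,      /* 0x0100-0x08FF: 2 KiB of internal SRAM */
    REGION_INVALID    /* beyond the end of internal SRAM */
};

/* Report which region a data-space address falls into. */
static enum avr_region classify_data_address(uint16_t addr)
{
    if (addr <= 0x001F) return REGION_REGISTERS;
    if (addr <= 0x005F) return REGION_IO;
    if (addr <= 0x00FF) return REGION_EXT_IO;
    if (addr <= 0x08FF) return REGION_SRAM;
    return REGION_INVALID;
}
```

Address 0x060 classifies as extended I/O, while 0x0100 is the first address that classifies as SRAM.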
