What are these "Expectation failed" messages in VS2010 PGO and how do I fix them?

When I perform the PGO optimization step (using LINK.EXE /LTCG:PGU), the Visual Studio 2010 linker complains:
Merging foo!1.pgc
'FOO_EDGE::get_input': Arc 2 --> 4 has negative count (-414343)
Expectation failed: f line 4241
'FOO_DELAY::set_delay': Block 18 outgoing counts differ from block count (-9 diff)
Expectation failed: f line 4261
Expectation failed: f line 4211
'FOO_DELAY::set_delay': Arc 12 --> 23 has negative count (-3)
Expectation failed: f line 4220
Generating code
907 of 4948 ( 18.33%) profiled functions will be compiled for speed
4948 of 4948 functions (100.0%) were optimized using profile data
42912225037 of 42912225037 instructions (100.0%) were optimized using profile data
What's causing these "expectation failures"? How should I address them? It seems like PGO is still optimizing the code, but I'm a little suspicious of the quality/completeness of the optimizations in the presence of these messages.

It seems that these errors occur when performing PGO-instrumented runs of multithreaded applications. They can be avoided by compiling (not linking) with the /PogoSafeMode flag on x64.
I didn't find the MSDN documentation on this flag particularly clear; the correct procedure for performing PGO on multithreaded code is as follows (a concrete command sketch follows the list):
1) Compile with cl.exe /PogoSafeMode
2) Link with link.exe /LTCG:PGI
3) Execute your multithreaded profiling run(s)
4) Re-link with link.exe /LTCG:PGO
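For example, a minimal sketch of the whole sequence (foo.cpp and the single-file layout are hypothetical; /GL is needed because PGO relies on link-time code generation):
cl.exe /c /GL /PogoSafeMode foo.cpp
link.exe /LTCG:PGI /OUT:foo.exe foo.obj
rem Run the instrumented binary under a representative multithreaded workload;
rem this writes the foo!N.pgc count files seen in the question:
foo.exe
link.exe /LTCG:PGO /OUT:foo.exe foo.obj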

Related

Can valgrind memcheck be used with CGo?

We have an application that is mostly Go (1.17) and makes a lot of calls through CGo (GCC 7.5) to CUDA on an ARM processor. We occasionally see panics that look like something has done bad things to the heap on the C side. I tried running the whole application under valgrind, but I get too many messages like
==14869== Thread 1:
==14869== Invalid read of size 8
==14869== at 0x4783AC: runtime.startm (proc.go:2508)
==14869== by 0x47890B: runtime.wakep (proc.go:2584)
==14869== by 0x47CF8F: runtime.newproc.func1 (proc.go:4261)
==14869== by 0x4A476B: runtime.systemstack (asm_arm64.s:230)
==14869== by 0x4A465F: runtime.mstart (asm_arm64.s:117)
==14869== Address 0x1fff0001a8 is on thread 1's stack
==14869== 8 bytes below stack pointer
to see anything useful. I am assuming these are false positives and that the Go runtime is not in fact riddled with undefined behaviour. I can't see a flag to suppress that check. Have I missed it? Is there some other way to investigate this problem? I could write test harnesses in C++, but that would change the usage pattern, which I suspect is key to the problem.
I build the cgo software to an executable with
go build
which produces an executable named after the directory you are in. Let's say it's called "mycgoprog".
I then run valgrind against it with this command:
G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind -v --tool=memcheck --leak-check=full --num-callers=40 --log-file=valgrind.log ./mycgoprog
The valgrind.log then contains all the indiscretions valgrind detected in the linked C code.
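If the Go runtime false positives still drown out the C-side errors, they can be silenced with a memcheck suppressions file. A minimal sketch (the rule name and pattern are illustrative; valgrind's --gen-suppressions=all prints exact rules you can copy):
{
   go-runtime-stack-access
   Memcheck:Addr8
   fun:runtime.*
}
Saved as, say, go-runtime.supp, it is passed with valgrind --suppressions=go-runtime.supp ./mycgoprog. Note that each error kind (Addr4, Addr8, ...) needs its own rule.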

ModelSim: Intel On-Chip Flash IP: Error: (vsim-3033) Instantiation of 'altera_onchip_flash_block' failed

I am getting this vsim error when I try to use an Intel On-Chip Flash IP generated by Quartus. There is an altera_onchip_flash_block.v file in the submodules/rtl folder, but it contains only hex numbers, so ModelSim cannot compile it.
# Time: 0 ps Iteration: 0 Instance: /ufm_testbench/ufm_inst/flash/onchip_flash_0 File: ../../FFB900_UFM/verilog/altera_onchip_flash.v Line: 309
# Searched libraries:
(all my libraries)
The altera_onchip_flash_block gets instantiated in altera_onchip_flash.v, as seen above.
Compiling the IP on its own works, but when I use it from my top-level testbench I always get this error. All my files are VHDL, except for the Verilog files generated by Quartus.
Any help is appreciated.
As you have already realized, this is a precompiled IP core. Normally these precompiled IP cores come with files for simulation.
This PDF includes a step-by-step tutorial on instantiating the IP core and generating simulation files with Quartus (page 12 of 36, section 4-2).
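For reference, a typical ModelSim flow with the generated files looks like the sketch below. This is only a sketch: the exact directory layout and macro names depend on the Quartus version, which normally emits an msim_setup.tcl script next to the IP.
# from the IP's generated .../simulation/mentor directory, inside vsim
set TOP_LEVEL_NAME ufm_testbench  ;# assumption: your top-level testbench
do msim_setup.tcl                 ;# script generated by Quartus
dev_com                           ;# compile the Altera device libraries
com                               ;# compile the IP simulation sources
# compile your own VHDL sources here, then elaborate with the provided macro:
elab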

How can I specify a minimum compute capability to the mexcuda compiler to compile a mexfunction?

I have a CUDA project in a .cu file that I would like to compile to a .mex file using mexcuda. Because my code makes use of the 64-bit floating-point atomic operation atomicAdd(double *, double), which is only supported on GPU devices of compute capability 6.0 or higher, I need to specify this as a flag when compiling.
In my standard IDE, this works fine, but when compiling with mexcuda, this is not working as I would like. In this post on MathWorks, it was suggested to use the following command (edited from the comment by Joss Knight):
mexcuda('-v', 'mexGPUExample.cu', 'NVCCFLAGS=-gencode=arch=compute_60,code=sm_60')
but when I use this command on my file, the verbose option prints the following last:
Building with 'NVIDIA CUDA Compiler'.
nvcc -c --compiler-options=/Zp8,/GR,/W3,/EHs,/nologo,/MD -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=\"sm_70,compute_70\"
(and so on), which signals to me that the specified flag was not passed to nvcc properly. And indeed, compilation fails with the following error:
C:/path/mexGPUExample.cu(35): error: no instance of overloaded function "atomicAdd" matches
the argument list. Argument types are: (double *, double)
The only other post I could find on this topic was this post on SO, but it is almost three years old and seemed to me more like a workaround - one which I do not understand even after some research, otherwise I would have tried it - rather than a true solution to the problem.
Is there a setting I missed, or can this simply not be done without a workaround?
I was able to work around this problem after some experimenting with the standard XML files in the MATLAB folder. The following steps allowed me to compile using mexcuda:
1) Go to the folder C:\Program Files\MATLAB\-version-\toolbox\distcomp\gpu\extern\src\mex\win64, which contains XML files for the different versions of msvcpp;
2) Make a backup of the file that corresponds to the version you are using. In my case, I made a copy of the file nvcc_msvcpp2017 and named it nvcc_msvcpp2017_old, to always have the original;
3) Open nvcc_msvcppYEAR with Notepad and scroll to the following block of lines:
COMPILER="nvcc"
COMPFLAGS="--compiler-options=/Zp8,/GR,/W3,/EHs,/nologo,/MD $ARCHFLAGS"
ARCHFLAGS="-gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=\"sm_70,compute_70\" $NVCC_FLAGS"
COMPDEFINES="--compiler-options=/D_CRT_SECURE_NO_DEPRECATE,/D_SCL_SECURE_NO_DEPRECATE,/D_SECURE_SCL=0,$MATLABMEX"
MATLABMEX="/DMATLAB_MEX_FILE"
OPTIMFLAGS="--compiler-options=/O2,/Oy-,/DNDEBUG"
INCLUDE="-I"$MATLABROOT\extern\include" -I"$MATLABROOT\simulink\include""
DEBUGFLAGS="--compiler-options=/Z7"
4) Remove the architectures that will not allow your code to compile, i.e. all the architecture flags below 60 in my case:
ARCHFLAGS="-gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=\"sm_70,compute_70\" $NVCC_FLAGS"
5) I was able to compile using mexcuda after this. You do not need to specify any architecture flags in the mexcuda call.
6) (optional) Revert this change once you are done with the project that required it, to keep maximum portability of the code you compile afterwards.
Note: you will need administrator permission to make these changes.
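As an alternative that avoids patching MATLAB's toolbox files: the CUDA C Programming Guide documents a double-precision atomicAdd fallback for pre-6.0 architectures built on atomicCAS. A sketch of it, guarded so it only exists where the hardware instruction is missing, could go at the top of the .cu file:
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 600
// Software double atomicAdd for compute capability < 6.0, adapted from
// the CUDA C Programming Guide; read-modify-write loop via atomicCAS.
__device__ double atomicAdd(double* address, double val)
{
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
        // Integer comparison avoids looping forever on NaN (NaN != NaN).
    } while (assumed != old);
    return __longlong_as_double(old);
}
#endif
With that guard in place, the stock architecture list compiles unmodified: pre-6.0 targets use the fallback and sm_60+ targets use the native instruction.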

How to run gcov on test application with Nuttx OS on STM discovery board?

Setup:
Toolchain: gcc-arm-none-eabi-5_2-2015q4-20151219
Target: STM429i-disco board
I want to run gcov and get a real-time report generated on the target, as described in this link:
https://mcuoneclipse.com/2014/12/26/code-coverage-for-embedded-target-with-eclipse-gcc-and-gcov/
First, I successfully compiled my code (the POSIX-compliant NuttX OS) with the -fprofile-arcs and -ftest-coverage flags and got the .gcno files generated for my source files.
Second, I successfully linked with -fprofile-arcs enabled, using the libgcov.a that is part of the toolchain, and the final binary was generated.
Now I don't know what changes are needed in my test application to invoke gcov, generate the report, and dump it.
Another problem is that the gcov functions carry the HIDDEN attribute in libgcov.a, as shown below:
9: 00000000 4 FUNC GLOBAL HIDDEN 1 __gcov_flush
9: 00000000 4 FUNC GLOBAL HIDDEN 1 __gcov_init
so I cannot invoke them as I need to.
Any inputs in getting the .gcda file generated would be of great help.
Can you look for gcov_exit instead? It is similar to __gcov_flush. Typically only one of gcov_exit and __gcov_flush is present, and you can use either.
In case it is missing or also hidden, you can use an approach I tried for one of my projects: I picked the implementation of gcov_exit from the gcc source code (of the version matching my toolchain), modified it for various reasons, and plugged it into my project (available at https://github.com/reeteshranjan/libgcov-embedded). With everything else remaining the same (compiler flags etc.), I was able to break into gcov_exit and follow the rest of the approach in the blog link you mentioned.
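For illustration, the hook in the test application could be as small as the sketch below (assumptions: the flush symbol is visible to your build, run_all_tests is a stand-in for your test entry point, and the exact symbol name varies across gcc versions):
/* Provided by libgcov (or by a libgcov-embedded port); writes the .gcda data. */
extern void __gcov_flush(void);

int main(void)
{
    run_all_tests();   /* exercise the instrumented code paths */
    __gcov_flush();    /* dump the coverage counters before the target halts */
    return 0;
}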

Is there a known issue relating to Windows 7 Kernel Symbols?

I have a few Windows 7 machines whose memory dumps I am not able to read. I found something that I suspect may be related, but I am not positive:
https://twitter.com/aionescu/status/634028737458114560
I also found this: http://support.microsoft.com/kb/2528507
However, the scenario message regarding wow64exts given in the doc is not seen in any of my dumps. I also cannot apply that hotfix at this time to test it. So I'm just looking for some more information or opinions.
I'm able to open any other OS dump, as well as my own system's Windows 7 dump, but there are two other machines running Windows 7 whose dumps tell me I have the wrong kernel symbols.
I have tried clearing out my symbol cache, reinstalling the Windows SDK, and opening the dumps on two other machines, all with the same result. If it matters, the crash is created manually using the scroll lock method.
Symbol path: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols;
I am seeing these errors, followed by "Type referenced: nt!_KPRCB".
Does anyone know about the issue mentioned by Alex in the Twitter link, and whether it is possibly related to what I'm seeing?
Update 2015-10-22:
With the Microsoft patch day (2015-10-13) and KB3088195, symbols are available again.
However, symbols for the broken version have not been provided, so below may still be useful.
In the past, Microsoft published "good" symbols for ntdll, containing type information such as _TEB or _KPRCB. Since mid-July 2015, Microsoft has still been publishing symbols for ntdll, but without that type information.
So it depends on the version of ntdll whether you get type information or not. Old dumps referencing an old version of ntdll will download old PDBs containing type information while new dumps reference new versions of ntdll and WinDbg (or any other debugger) downloads PDBs without type information.
Could Microsoft remove type information of "good" symbols retroactively, thus making them "bad"?
Yes. As described in this answer, there is a tool to remove type information from existing PDBs. Doing that and replacing the PDB would result in such an effect.
Can Microsoft publish the "good" version of those PDBs which are currently "bad"?
That's hard to tell, since we don't know whether Microsoft has kept copies of the "good" versions that could replace the "bad" ones on the symbol server. Rebuilding ntdll from the same source code and thus creating new PDBs sounds possible, but the PDB would get a new timestamp and checksum. That could potentially be corrected manually, especially by Microsoft, since they should have knowledge of the internal PDB format, but IMHO it's unlikely they'll do that: things may go wrong, and MS will hardly have tests to guarantee the correctness of such a step.
So what can I do?
IMHO you can do nothing to really correct the situation.
You could assume that the types in ntdll have not changed much. That would allow you to take an older version of wntdll.pdb and the new version of ntdll.dll and apply ChkMatch -m to them, which copies the timestamp and checksum from the DLL to the PDB. After you have done that (in an empty folder), replace the existing wntdll.pdb in your symbols directory with the hacked one.
WinDbg walkthrough (with output shortened to relevant things). I am using the latest version of wntdll.pdb I could find on my PC.
WARNING: doing the following may fix the type information, but it will likely destroy the correctness of the call stacks, since any changes in the implementation (which are likely, given security fixes) will change the method offsets.
0:005> dt nt!_PEB
*************************************************************************
*** ***
*** Either you specified an unqualified symbol, or your debugger ***
...
*** Type referenced: nt!_PEB ***
*** ***
*************************************************************************
Symbol nt!_PEB not found.
0:005> lm m ntdll
start end module name
773f0000 77570000 ntdll (pdb symbols) e:\debug\symbols\wntdll.pdb\FA9C48F9C11D4E0894B8970DECD92C972\wntdll.pdb
0:005> .shell cmd /c copy C:\Windows\SysWOW64\ntdll.dll e:\debug\temp\ntdllhack\ntdll.dll
1 file(s) copied.
0:005> .shell cmd /c copy "E:\Windows SDk\8.0\Debuggers\x86\sym\wntdll.pdb\B081677DFC724CC4AC53992627BEEA242\wntdll.pdb" e:\debug\temp\ntdllhack\wntdll.pdb
1 file(s) copied.
0:005> .shell cmd /c E:\debug\temp\ntdllhack\chkmatch.exe -m E:\debug\temp\ntdllhack\ntdll.dll E:\debug\temp\ntdllhack\wntdll.pdb
...
Executable: E:\debug\temp\ntdllhack\ntdll.dll
Debug info file: E:\debug\temp\ntdllhack\wntdll.pdb
Executable:
TimeDateStamp: 55a69e20
Debug info: 2 ( CodeView )
TimeStamp: 55a68c18 Characteristics: 0 MajorVer: 0 MinorVer: 0
Size: 35 RVA: 000e63e0 FileOffset: 000d67e0
CodeView format: RSDS
Signature: {fa9c48f9-c11d-4e08-94b8-970decd92c97} Age: 2
PdbFile: wntdll.pdb
Debug info: 10 ( Unknown )
TimeStamp: 55a68c18 Characteristics: 0 MajorVer: 565 MinorVer: 6526
Size: 4 RVA: 000e63dc FileOffset: 000d67dc
Debug information file:
Format: PDB 7.00
Signature: {b081677d-fc72-4cc4-ac53-992627beea24} Age: 4
Writing to the debug information file...
Result: Success.
0:005> .shell cmd /c copy E:\debug\temp\ntdllhack\wntdll.pdb E:\debug\symbols\wntdll.pdb\FA9C48F9C11D4E0894B8970DECD92C972\wntdll.pdb
1 file(s) copied.
0:005> .reload
Reloading current modules
.............................
0:005> dt nt!_PEB
ntdll!_PEB
+0x000 InheritedAddressSpace : UChar
+0x001 ReadImageFileExecOptions : UChar
...
0:005> !heap -s
LFH Key : 0x219ab08b
Termination on corruption : DISABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-----------------------------------------------------------------------------
Virtual block: 00920000 - 00920000 (size 00000000)
Virtual block: 02c60000 - 02c60000 (size 00000000)
Virtual block: 02e10000 - 02e10000 (size 00000000)
...
Note: using ChkMatch like this has the benefit that you do not need to turn on .symopt- 100, since that option would affect all PDB files and could hide other symbol issues. If you don't mind using .symopt, you could simply copy an old wntdll.pdb over the new one.
According to Microsoft the issue is now fixed; Microsoft told me to clear the symbol cache to get the new PDBs, otherwise WinDbg would keep using the old symbols, which lack the type information.
