SonarQube, JaCoCo: (11 of 22 conditions) when they are supposed to be 3 (or 4) - gradle

I am lost on how SonarQube calculates conditions covered by tests.
Versions of tools used:
* JaCoCo 0.8.1
* SonarQube 7.4
This is my Groovy code:
boolean condition1(boolean b1, boolean b2) {
    !b1 || !b2
}

boolean condition2(boolean b1, boolean b2) {
    b1 || b2
}

boolean condition3(boolean b1, boolean b2) {
    !b1 && !b2
}

boolean condition4(boolean b1, boolean b2) {
    b1 && b2
}

boolean condition5(boolean b1, boolean b2) {
    b1 && !b2
}

boolean condition6(boolean b1, boolean b2, boolean b3) {
    b1 && b2 && b3
}
Here are the tests
void "test condition 1"() {
expect:
service.condition1(c1,c2)
where:
c1 | c2
true | true
true | false
false | true
false | false
}
void "test condition 2"() {
expect:
service.condition2(c1,c2)
where:
c1 | c2
true | true
true | false
false | true
false | false
}
void "test condition 3"() {
expect:
service.condition3(c1,c2)
where:
c1 | c2
true | true
true | false
false | true
false | false
}
void "test condition 4"() {
expect:
service.condition4(c1,c2)
where:
c1 | c2
true | true
true | false
false | true
false | false
}
void "test condition 5"() {
expect:
service.condition5(c1,c2)
where:
c1 | c2
true | true
true | false
false | true
false | false
}
void "test condition 6"() {
expect:
service.condition6(c1, c2, c3)
where:
c1 | c2 | c3
true | true | true
true | true | false
true | false | true
true | false | false
false | true | true
false | true | false
false | false | true
false | false | false
}
The code coverage report says those conditions are not fully covered, and the following is the only info I get:
condition1. (11 of 22 conditions)
condition2. (7 of 14 conditions)
condition3. (11 of 22 conditions)
condition4. (7 of 14 conditions)
condition5. (9 of 18 conditions)
condition6. (11 of 22 conditions)
That means I am not able to reach 100% condition coverage, although I believe I logically have.
I am aware of SonarQube documentation
https://docs.sonarqube.org/latest/user-guide/metric-definitions/
where it says
On each line of code containing some boolean expressions, the condition coverage simply answers the following question: 'Has each boolean expression been evaluated both to true and false?'. This is the density of possible conditions in flow control structures that have been followed during unit tests execution
Does anyone have an idea how this actually works and what I am doing wrong here?

As noted in similar questions (such as "Why is JaCoCo not covering my String switch statements?" and "How does assert groupType != null contain 4 branches"), JaCoCo performs analysis of bytecode, so similarly, take a look at the bytecode.
For Example.groovy
class Example {
    boolean condition2(boolean b1, boolean b2) {
        b1 || b2
    }
}
Groovy compiler (groovyc --version)
Groovy compiler version 3.0.0-rc-1
Copyright 2003-2019 The Apache Software Foundation. http://groovy-lang.org/
generates (groovyc Example.groovy) the following bytecode (javap -v -p Example.class):
public boolean condition2(boolean, boolean);
descriptor: (ZZ)Z
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=4, args_size=3
0: invokestatic #20 // Method $getCallSiteArray:()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
3: astore_3
4: invokestatic #38 // Method org/codehaus/groovy/runtime/BytecodeInterface8.isOrigZ:()Z
7: ifeq 25
10: getstatic #40 // Field __$stMC:Z
13: ifne 25
16: invokestatic #43 // Method org/codehaus/groovy/runtime/BytecodeInterface8.disabledStandardMetaClass:()Z
19: ifne 25
22: goto 42
25: iload_1
26: ifne 33
29: iload_2
30: ifeq 37
33: iconst_1
34: goto 38
37: iconst_0
38: ireturn
39: nop
40: nop
41: athrow
42: iload_1
43: ifne 50
46: iload_2
47: ifeq 54
50: iconst_1
51: goto 55
54: iconst_0
55: ireturn
56: nop
57: nop
58: nop
59: nop
60: nop
61: nop
62: nop
63: nop
64: athrow
LineNumberTable:
line 2: 4
line 3: 25
line 4: 39
line 3: 42
line 4: 56
LocalVariableTable:
Start Length Slot Name Signature
0 56 0 this LExample;
0 56 1 b1 Z
0 56 2 b2 Z
which contains 14 branches (2 branches per conditional jump instruction, i.e. per ifeq and ifne), and your tests cover only half of them (the execution path where the first ifeq at offset 7 jumps to offset 25), which is entirely consistent with what is reported by JaCoCo and hence shown in SonarQube.
And the following discussion seems to be relevant here: http://groovy.329449.n5.nabble.com/Branch-coverage-issues-td5686725.html , because compiling the same class with the --indy option (groovyc --indy Example.groovy) produces the following bytecode:
public boolean condition2(boolean, boolean);
descriptor: (ZZ)Z
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=3, args_size=3
0: iload_1
1: ifne 8
4: iload_2
5: ifeq 12
8: iconst_1
9: goto 13
12: iconst_0
13: ireturn
14: nop
15: nop
16: nop
17: nop
18: nop
19: nop
20: nop
21: nop
22: athrow
LineNumberTable:
line 3: 0
line 4: 14
LocalVariableTable:
Start Length Slot Name Signature
0 14 0 this LExample;
0 14 1 b1 Z
0 14 2 b2 Z
which contains only 4 branches.

Related

Find the optimal combination of setting values for `number of processes` and `OMP_NUM_THREADS` in a particular computing task

The testing environment is Ubuntu 20.04.3 LTS installed on a machine with dual Intel Xeon E5-2699 v4 CPUs and a Supermicro X10DAi motherboard. I am trying to compile and test VASP 6.3.0 with the recent/latest Intel oneAPI Base and HPC Toolkits.
The test commands are as follows:
VASP_TESTSUITE_EXE_STD="mpirun -np $nranks -genv OMP_NUM_THREADS=$nthrds -genv I_MPI_PIN_DOMAIN=omp -genv KMP_AFFINITY=verbose,granularity=fine,compact,1,0 -genv KMP_STACKSIZE=512m /home/werner/Public/hpc/vasp/vasp.6.3.0/testsuite/../bin/vasp_std"
VASP_TESTSUITE_EXE_NCL="mpirun -np $nranks -genv OMP_NUM_THREADS=$nthrds -genv I_MPI_PIN_DOMAIN=omp -genv KMP_AFFINITY=verbose,granularity=fine,compact,1,0 -genv KMP_STACKSIZE=512m /home/werner/Public/hpc/vasp/vasp.6.3.0/testsuite/../bin/vasp_ncl"
VASP_TESTSUITE_EXE_GAM="mpirun -np $nranks -genv OMP_NUM_THREADS=$nthrds -genv I_MPI_PIN_DOMAIN=omp -genv KMP_AFFINITY=verbose,granularity=fine,compact,1,0 -genv KMP_STACKSIZE=512m /home/werner/Public/hpc/vasp/vasp.6.3.0/testsuite/../bin/vasp_gam"
I found that the time performance can be very different for a given job with different combinations of np (i.e., the number of processes) and OMP_NUM_THREADS. In my test, I found that the combination of -np 16 and OMP_NUM_THREADS=16 was very time-consuming, and I terminated this testing step before it was over. For a summary of the time benchmarks corresponding to the tests here, see this file, and see the discussion here for more detailed information.
So a natural question is: how do I find the optimal combination of values for the number of processes and OMP_NUM_THREADS for a particular computing task? Is there a rule of thumb?
The following is supplementary information as a reply to the comments given by Victor Eijkhout, Homer512 and Jérôme Richard:
See the related info given by inxi:
werner#X10DAi-00:~$ inxi -Cxxx
CPU: Topology: 2x 22-Core model: Intel Xeon E5-2699 v4 bits: 64 type: MT MCP SMP arch: Broadwell rev: 1
L2 cache: 110.0 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 387287
Speed: 1200 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 1200 2: 1202 3: 1202 4: 1202 5: 1200
6: 1202 7: 1203 8: 1201 9: 1204 10: 1201 11: 1654 12: 2007 13: 2204 14: 2200 15: 1245 16: 1202
17: 1202 18: 1202 19: 1203 20: 1202 21: 1203 22: 1202 23: 1202 24: 1201 25: 1202 26: 1202 27: 1201
28: 1202 29: 1202 30: 1202 31: 2066 32: 1202 33: 1202 34: 1202 35: 1203 36: 1202 37: 1202 38: 1202
39: 1202 40: 1202 41: 1200 42: 1516 43: 1200 44: 1200 45: 1200 46: 1202 47: 1200 48: 1200 49: 1200
50: 1200 51: 1201 52: 1201 53: 1201 54: 1201 55: 1200 56: 1201 57: 1204 58: 1200 59: 1200 60: 1609
61: 1871 62: 2200 63: 1251 64: 1201 65: 1201 66: 1201 67: 1200 68: 1203 69: 1200 70: 1201 71: 1201
72: 1201 73: 1201 74: 1201 75: 1200 76: 1200 77: 1200 78: 1201 79: 1203 80: 1523 81: 1201 82: 1200
83: 1200 84: 1201 85: 1201 86: 1200 87: 1200 88: 1204
werner#X10DAi-00:~$ inxi -Mxxx
Machine: Type: Desktop System: Supermicro product: X10DAi v: 123456789 serial: <superuser/root required>
Mobo: Supermicro model: X10DAI v: 1.02 serial: <superuser/root required> UEFI: American Megatrends
v: 3.2 date: 12/16/2019
werner#X10DAi-00:~$ inxi -Sxxx
System: Host: X10DAi-00 Kernel: 5.8.0-43-generic x86_64 bits: 64 compiler: N/A Desktop: GNOME 3.36.9
tk: GTK 3.24.20 wm: gnome-shell dm: GDM3 3.36.3 Distro: Ubuntu 20.04.3 LTS (Focal Fossa)
I reran the test discussed here. See the following for the time baselines and the corresponding combinations of options:
nranks=4 nthrds=2
real 0m13.666s
user 1m20.643s
sys 0m4.314s
nranks=8 nthrds=2
real 0m11.908s
user 2m9.973s
sys 0m7.549s
nranks=12 nthrds=2
real 0m11.043s
user 2m55.062s
sys 0m11.161s
nranks=16 nthrds=2
real 0m11.087s
user 3m45.074s
sys 0m15.343s
nranks=4 nthrds=2
real 0m13.511s
user 1m19.949s
sys 0m4.185s
nranks=6 nthrds=4
real 0m13.736s
user 3m38.704s
sys 0m12.471s
nranks=8 nthrds=5
real 0m12.378s
user 5m13.113s
sys 0m18.022s
It seems that the above results are consistent with the comments given by Homer512:
Typical setups to test are one process per core (1-2 threads) or one
per LLC with as many threads as appropriate.
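To check how a given combination is actually being mapped onto the cores before committing to a full run, a minimal hybrid MPI+OpenMP "hello world" can help. The following is a sketch (not part of the original test suite); it prints one line per rank/thread pair and assumes a glibc system for sched_getcpu():

#define _GNU_SOURCE
#include <mpi.h>
#include <omp.h>
#include <sched.h>   /* sched_getcpu(), glibc-specific */
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    #pragma omp parallel
    {
        /* one line per (rank, thread) pair, showing which CPU it landed on */
        printf("rank %d/%d thread %d/%d on cpu %d\n",
               rank, nranks,
               omp_get_thread_num(), omp_get_num_threads(),
               sched_getcpu());
    }
    MPI_Finalize();
    return 0;
}

Compiled with mpiicc -qopenmp (or mpicc -fopenmp) and launched with the same -np, OMP_NUM_THREADS and I_MPI_PIN_DOMAIN=omp settings as the real job, it shows whether the intended placement (e.g., one rank per core or one per LLC) is what you actually get.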
Regards,
HZ

transpose lines to columns [duplicate]

I am trying to transpose a table (10K rows x 10K cols) using the following script.
A simple data example
$ cat rm1
t1 t2 t3
n1 1 2 3
n2 2 3 44
n3 1 1 1
$ sh transpose.sh rm1
n1 n2 n3
t1 1 2 1
t2 2 3 1
t3 3 44 1
However, I am getting a memory error. Any help would be appreciated.
awk -F "\t" '{
    for (f = 1; f <= NF; f++)
        a[NR, f] = $f
}
NF > nf { nf = NF }
END {
    for (f = 1; f <= nf; f++)
        for (r = 1; r <= NR; r++)
            printf a[r, f] (r==NR ? RS : FS)
}'
Error
awk: cmd. line:2: (FILENAME=input FNR=12658) fatal: dupnode: r->stptr: can't allocate 10 bytes of memory (Cannot allocate memory)
Here's one way to do it in chunks, as I mentioned in my comments. Here I show the mechanics on a tiny 12r x 10c file, but I also ran a chunk of 1000 rows on a 10K x 10K file in not much more than a minute (Mac Powerbook).
EDIT: The following was updated to handle an M x N matrix with unequal numbers of rows and columns. The previous version only worked for an N x N matrix.
$ cat et.awk
BEGIN {
    start = chunk_start
    limit = chunk_start + chunk_size - 1
}
{
    n = (limit > NF) ? NF : limit
    for (f = start; f <= n; f++) {
        a[NR, f] = $f
    }
}
END {
    n = (limit > NF) ? NF : limit
    for (f = start; f <= n; f++)
        for (r = 1; r <= NR; r++)
            printf a[r, f] (r==NR ? RS : FS)
}
$ cat t.txt
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
A0 A1 A2 A3 A4 A5 A6 A7 A8 A9
B0 B1 B2 B3 B4 B5 B6 B7 B8 B9
C0 C1 C2 C3 C4 C5 C6 C7 C8 C9
$ cat et.sh
inf=$1
outf=$2
rm -f $outf
for i in $(seq 1 2 12); do
echo chunk for rows $i $(expr $i + 1)
awk -v chunk_start=$i -v chunk_size=2 -f et.awk $inf >> $outf
done
$ sh et.sh t.txt t-transpose.txt
chunk for rows 1 2
chunk for rows 3 4
chunk for rows 5 6
chunk for rows 7 8
chunk for rows 9 10
chunk for rows 11 12
$ cat t-transpose.txt
10 20 30 40 50 60 70 80 90 A0 B0 C0
11 21 31 41 51 61 71 81 91 A1 B1 C1
12 22 32 42 52 62 72 82 92 A2 B2 C2
13 23 33 43 53 63 73 83 93 A3 B3 C3
14 24 34 44 54 64 74 84 94 A4 B4 C4
15 25 35 45 55 65 75 85 95 A5 B5 C5
16 26 36 46 56 66 76 86 96 A6 B6 C6
17 27 37 47 57 67 77 87 97 A7 B7 C7
18 28 38 48 58 68 78 88 98 A8 B8 C8
19 29 39 49 59 69 79 89 99 A9 B9 C9
And then running the first chunk on the huge file looks like:
$ time awk -v chunk_start=1 -v chunk_size=1000 -f et.awk tenk.txt > tenk-transpose.txt
real 1m7.899s
user 1m5.173s
sys 0m2.552s
Doing that ten times with the next chunk_start set to 1001, etc. (and appending with >> to the output, of course) should finally give you the full transposed result.
There is a simple and quick algorithm based on sorting:
1) Make a pass through the input, prepending the column number and row number to each field. The output is a three-tuple of column, row, value for each cell in the matrix. Write the output to a temporary file.
2) Sort the temporary file by column, then row.
3) Make a pass through the sorted temporary file, reconstructing the transposed matrix.
The two outer passes are done by awk. The sort is done by the system sort. Here's the code:
$ echo '1 2 3
2 3 44
1 1 1' |
awk '{ for (i=1; i<=NF; i++) print i, NR, $i}' |
sort -n |
awk ' NR>1 && $2==1 { print "" }; { printf "%s ", $3 }; END { print "" }'
1 2 1
2 3 1
3 44 1

Is there a way to output the assembly of a single function in isolation?

I am learning how a C file is compiled to machine code. I know I can generate assembly from gcc with the -S flag, however it also produces a lot of code to do with main() and printf() that I am not interested in at the moment.
Is there a way to get gcc or clang to "compile" a function in isolation and output the assembly?
I.e., get the assembly for the following C code in isolation:
int add( int a, int b ) {
    return a + b;
}
There are two ways to do this for a specific object file:
The -ffunction-sections option to gcc instructs it to create a separate ELF section for each function in the source file being compiled.
The symbol table contains the section name, start address and size of a given function; these can be fed into objdump via the --start-address/--stop-address arguments.
The first example:
$ readelf -S t.o | grep ' .text.'
[ 1] .text PROGBITS 0000000000000000 00000040
[ 4] .text.foo PROGBITS 0000000000000000 00000040
[ 6] .text.bar PROGBITS 0000000000000000 00000060
[ 9] .text.foo2 PROGBITS 0000000000000000 000000c0
[11] .text.munch PROGBITS 0000000000000000 00000110
[14] .text.startup.mai PROGBITS 0000000000000000 00000180
This has been compiled with -ffunction-sections and there are four functions, foo(), bar(), foo2() and munch() in my object file. I can disassemble them separately like so:
$ objdump -w -d --section=.text.foo t.o
t.o: file format elf64-x86-64
Disassembly of section .text.foo:
0000000000000000 <foo>:
0: 48 83 ec 08 sub $0x8,%rsp
4: 8b 3d 00 00 00 00 mov 0(%rip),%edi # a <foo+0xa>
a: 31 f6 xor %esi,%esi
c: 31 c0 xor %eax,%eax
e: e8 00 00 00 00 callq 13 <foo+0x13>
13: 85 c0 test %eax,%eax
15: 75 01 jne 18 <foo+0x18>
17: 90 nop
18: 48 83 c4 08 add $0x8,%rsp
1c: c3 retq
The other option can be used like this (nm dumps symbol table entries):
$ nm -f sysv t.o | grep bar
bar |0000000000000020| T | FUNC|0000000000000026| |.text
$ objdump -w -d --start-address=0x20 --stop-address=0x46 t.o --section=.text
t.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000020 <bar>:
20: 48 83 ec 08 sub $0x8,%rsp
24: 8b 3d 00 00 00 00 mov 0(%rip),%edi # 2a <bar+0xa>
2a: 31 f6 xor %esi,%esi
2c: 31 c0 xor %eax,%eax
2e: e8 00 00 00 00 callq 33 <bar+0x13>
33: 85 c0 test %eax,%eax
35: 75 01 jne 38 <bar+0x18>
37: 90 nop
38: bf 3f 00 00 00 mov $0x3f,%edi
3d: 48 83 c4 08 add $0x8,%rsp
41: e9 00 00 00 00 jmpq 46 <bar+0x26>
In this case, the -ffunction-sections option hasn't been used, hence the start offset of the function isn't zero and it's not in its separate section (but in .text).
Beware though when disassembling object files ...
This isn't exactly what you want, because, for object files, the call targets (as well as the addresses of global variables) aren't resolved - you can't see here that foo calls printf, because that resolution at the binary level happens only at link time. The assembly source would have the call printf in there, though. The information that this callq actually targets printf is in the object file, but separate from the code (it's in the so-called relocation section, which lists locations in the object file to be 'patched' by the linker); the disassembler can't resolve this.
The best way to go would be to copy your function into a single temp.c file and compile it with the -c and -S flags like this: gcc -c -S temp.c -o temp.s
It should produce tighter assembly code with no other distractions (except for the header and footer).
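For instance, a minimal temp.c containing only the function from the question might look like this (a sketch; the exact assembly you get will vary with compiler version and optimization level):

/* temp.c - just the function of interest, nothing else */
int add( int a, int b ) {
    return a + b;
}

/*
 * Compile to assembly only:
 *   gcc -S temp.c -o temp.s        (unoptimized)
 *   gcc -O2 -S temp.c -o temp.s    (optimized)
 * temp.s then contains only add's assembly plus a few assembler directives.
 */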

asm error message: `(%rax,%edx,4)' is not a valid base/index expression

line 96: Error: `(%rax,%edx,4)' is not a valid base/index expression
line 97: Error: `-4(%rax,%edx,4)' is not a valid base/index expression
line 101: Error: `(%rax,%edx,4)' is not a valid base/index expression
line 102: Error: `-4(%rax,%edx,4)' is not a valid base/index expression
I get these error messages and am not sure how to fix them.
This is my code:
   
__asm__ (
"loop: \n\t"
"movl $1,%3\n\t"
"movl $0, %6\n"
"start: \n\t"
"movl (%1,%3,4),%4\n\t"
"movl -4(%1, %3, 4), %5\n\t"
"cmpl %4, %5\n\t"
"jle next\n\t"
"xchgl %4, %5\n\t"
"movl %4, (%1, %3, 4)\n\t"
"movl %5, -4(%1, %3, 4)\n\t"
"movl $1, %6\n\t"
"next: \n\t"
"incl %3 \n\t"
"cmpl %3, %2\n\t"
"jge start\n\t"
"cmpl $0, %6\n\t"
"je end\n\t"
"jmp loop\n\t"
"end: \n\t"
Some help explaining how to fix these error messages would be appreciated.
I am trying to make a bubble sort in ASM.
You didn't say what processor you are targeting, but it appears to be x64. On x64, (%rax, %edx, 4) is not a legal combination. Consult the processor manual for a list of valid addressing modes. My guess is that you meant (%rax, %rdx, 4).
The most likely cause of your problem is the use of an explicit 32-bit integer type for the %3 operand. You haven't shown the constraint list for your inline assembly, but the above occurs if you do:
int foobar[8];   /* global array used below; assumed here, not shown in the original snippet */

int main(int argc, char **argv)
{
    int result, foobaridx;
    foobaridx = foobar[4];
    __asm__ (
        " dec %2\n\t"
        " movl (%1, %2, 4), %0\n\t"
        : "=r"(result) : "r"(foobar), "r"(foobaridx) : "memory", "cc");
    return result;
}
Compiling this in 32bit mode works alright:
$ gcc -O8 -m32 -c tt.c
$ objdump -d tt.o
tt.o: file format elf32-i386
00000000 <main>:
0: 55 push %ebp
1: b8 00 00 00 00 mov $0x0,%eax
6: 89 e5 mov %esp,%ebp
8: 83 ec 08 sub $0x8,%esp
b: 8b 15 10 00 00 00 mov 0x10,%edx
11: 83 e4 f0 and $0xfffffff0,%esp
14: 4a dec %edx
15: 8b 04 90 mov (%eax,%edx,4),%eax
18: 83 ec 10 sub $0x10,%esp
1b: c9 leave
1c: c3 ret
But in 64bit mode, the compiler/assembler doesn't like it:
$ gcc -O8 -c tt.c
/tmp/cckylXxC.s: Assembler messages:
/tmp/cckylXxC.s:12: Error: `(%rax,%edx,4)' is not a valid base/index expression
The way to fix this is to #include <stdint.h> and cast register operands that will end up being used in addressing (as base or index registers) to uintptr_t (an integer type guaranteed to be 'size-compatible' with pointers, no matter whether you're on 32bit or 64bit). With that change, the 64bit compile succeeds and creates the following output:
$ gcc -O8 -c tt.c
$ objdump -d tt.o
tt.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
0: 48 63 15 00 00 00 00 movslq 0(%rip),%rdx # 7
7: b8 00 00 00 00 mov $0x0,%eax
c: 48 ff ca dec %rdx
f: 8b 04 90 mov (%rax,%rdx,4),%eax
12: c3 retq
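For reference, the source with the casts applied might look like the following sketch (the answer describes the change but only shows the resulting disassembly; foobar is again assumed to be a global int array):

#include <stdint.h>

int foobar[8];   /* assumed global array, as in the earlier snippet */

int main(int argc, char **argv)
{
    int result, foobaridx;
    foobaridx = foobar[4];
    __asm__ (
        " dec %2\n\t"
        " movl (%1, %2, 4), %0\n\t"
        : "=r"(result)
        : "r"((uintptr_t)foobar), "r"((uintptr_t)foobaridx)   /* casts force pointer-sized registers */
        : "memory", "cc");
    return result;
}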
Good luck making your inline assembly "32/64bit agnostic" !
I also had the same issue with a simple array sum, and after using @FrankH.'s advice (cheers @FrankH.) it works.
Here is my inline asm code for gcc:
#include <stdint.h>   // for uintptr_t

inline int array_sum(const int *value, const int &size) {
    int sum = 0;
    // for (int i = 0; i < size; i++) {
    //     sum = sum + value[i];
    // }
    for (int i = 0; i < size; i++)
        asm("addl (%1,%2,4),%0"
            : "=r"(sum)
            : "r"((uintptr_t)value), "r"((uintptr_t)i), "0"(sum)
            : "cc");
    return (sum);
}

Longitudinal Redundancy Check fails

I have an application that decodes data from a magnetic stripe reader. But, I'm having difficulty getting my calculated LRC check byte to match the one on the cards. If I were to grab 3 cards each with 3 tracks, I would guess the algorithm below would work on 4 of the 9 tracks in those cards.
The algorithm I'm using looks like this (C#):
private static char GetLRC(string s, int start, int end)
{
    int result = 0;
    for (int i = start; i <= end; i++)
    {
        result ^= Convert.ToByte(s[i]);
    }
    return Convert.ToChar(result);
}
This is an example of track 3 data that fails the check. On this card, track 2 matched, but track 1 also failed.
0 1 2 3 4 5 6 7 8 9 A B C D E F
00 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5
10 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 7
20 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8
30 8 8 8 9 9 9 9 9 9 9 9 9 9 0 0 0
40 0 0 0 0 0 0 0 1 2 3 4 1 1 1 1 1
50 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3
60 3 3 3 3 3 3 3 3
The sector delimiter is ';' and it ends with a '?'.
The LRC byte from this track is 0x30. Unfortunately, the algorithm above computes an LRC of 0x00 per the following calculation (apologies for its length. I want to be thorough):
00 ^ 3b = 3b ';'
3b ^ 33 = 08
08 ^ 34 = 3c
3c ^ 34 = 08
08 ^ 34 = 3c
3c ^ 34 = 08
08 ^ 34 = 3c
3c ^ 34 = 08
08 ^ 34 = 3c
3c ^ 34 = 08
08 ^ 34 = 3c
3c ^ 34 = 08
08 ^ 35 = 3d
3d ^ 35 = 08
08 ^ 35 = 3d
3d ^ 35 = 08
08 ^ 35 = 3d
3d ^ 35 = 08
08 ^ 35 = 3d
3d ^ 35 = 08
08 ^ 35 = 3d
3d ^ 35 = 08
08 ^ 36 = 3e
3e ^ 36 = 08
08 ^ 36 = 3e
3e ^ 36 = 08
08 ^ 36 = 3e
3e ^ 36 = 08
08 ^ 36 = 3e
3e ^ 36 = 08
08 ^ 36 = 3e
3e ^ 36 = 08
08 ^ 37 = 3f
3f ^ 37 = 08
08 ^ 37 = 3f
3f ^ 37 = 08
08 ^ 37 = 3f
3f ^ 37 = 08
08 ^ 37 = 3f
3f ^ 37 = 08
08 ^ 37 = 3f
3f ^ 37 = 08
08 ^ 38 = 30
30 ^ 38 = 08
08 ^ 38 = 30
30 ^ 38 = 08
08 ^ 38 = 30
30 ^ 38 = 08
08 ^ 38 = 30
30 ^ 38 = 08
08 ^ 38 = 30
30 ^ 38 = 08
08 ^ 39 = 31
31 ^ 39 = 08
08 ^ 39 = 31
31 ^ 39 = 08
08 ^ 39 = 31
31 ^ 39 = 08
08 ^ 39 = 31
31 ^ 39 = 08
08 ^ 39 = 31
31 ^ 39 = 08
08 ^ 30 = 38
38 ^ 30 = 08
08 ^ 30 = 38
38 ^ 30 = 08
08 ^ 30 = 38
38 ^ 30 = 08
08 ^ 30 = 38
38 ^ 30 = 08
08 ^ 30 = 38
38 ^ 30 = 08
08 ^ 31 = 39
39 ^ 32 = 0b
0b ^ 33 = 38
38 ^ 34 = 0c
0c ^ 31 = 3d
3d ^ 31 = 0c
0c ^ 31 = 3d
3d ^ 31 = 0c
0c ^ 31 = 3d
3d ^ 31 = 0c
0c ^ 31 = 3d
3d ^ 31 = 0c
0c ^ 31 = 3d
3d ^ 31 = 0c
0c ^ 32 = 3e
3e ^ 32 = 0c
0c ^ 32 = 3e
3e ^ 32 = 0c
0c ^ 32 = 3e
3e ^ 32 = 0c
0c ^ 32 = 3e
3e ^ 32 = 0c
0c ^ 32 = 3e
3e ^ 32 = 0c
0c ^ 33 = 3f
3f ^ 33 = 0c
0c ^ 33 = 3f
3f ^ 33 = 0c
0c ^ 33 = 3f
3f ^ 33 = 0c
0c ^ 33 = 3f
3f ^ 33 = 0c
0c ^ 33 = 3f
3f ^ 3f = 00 '?'
If anybody can point out how to fix my algorithm, I would appreciate it.
Thanks,
PaulH
Edit:
So that you can see whether I'm accidentally missing any bytes in my LRC calculation or including the wrong ones, here is the complete data from all three tracks (the final '.' is actually a '\r'):
0 1 2 3 4 5 6 7 8 9 A B C D E F
00 % U V W X Y Z 0 1 2 3 4 5 6 7 8
10 9 9 A B C D E F G H I J K L M N
20 O P Q R S T U V W X Y Z 1 2 3 0
30 1 2 3 4 5 6 7 8 9 A B C D E F G
40 H I J K L M N O P Q R S T ? 3 ;
50 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9
60 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
70 6 7 8 9 0 ? 5 ; 3 4 4 4 4 4 4 4
80 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6
90 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7
A0 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9
B0 9 9 9 9 9 0 0 0 0 0 0 0 0 0 0 1
C0 2 3 4 1 1 1 1 1 1 1 1 1 1 2 2 2
D0 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3
E0 ? 0 .
The GetLRC() algorithm re-instrumented as suggested to only XOR bytes that appear an odd number of times:
private static char GetLRC(string s, int start, int end)
{
    int result = 0;
    byte cur_byte = Convert.ToByte(s[start]);
    int count = 0;
    for (int i = start; i <= end; i++)
    {
        byte b = Convert.ToByte(s[i]);
        if (cur_byte != b)
        {
            if (count % 2 != 0)
            {
                result ^= cur_byte;
            }
            cur_byte = b;
            count = 0;
        }
        ++count;
    }
    if (count % 2 != 0)
    {
        result ^= cur_byte;
    }
    return Convert.ToChar(result);
}
The calculation steps taken by the new GetLRC() function:
00 ^ 3b = 3b ';'
3b ^ 33 = 08
08 ^ 31 = 39
39 ^ 32 = 0b
0b ^ 33 = 38
38 ^ 34 = 0c
0c ^ 33 = 3f
3f ^ 3f = 00 '?'
Question: Does the LRC byte come from the card itself or is it being added by the reader firmware? (i.e. perhaps this is a firmware bug)
Can I make a suggestion? Store your data as run lengths and only do the XOR if the run length is odd - and then only do it once, i.e. (runLength & 0x01) times. That will get rid of a ton of the worthless bit work and make it clearer what is occurring. Doing that yields:
Run Lengths:
(01,3b)(01,33)(10,34)(10,35)(10,36)(10,37)(10,38)(10,39)(10,30)
(01,31)(01,32)(01,33)(01,34)(10,31)(10,32)(09,33)(1,3f)
Doing the even/odd thing gives:
3b ^ 33 ^ 31 ^ 32 ^ 33 ^ 34 ^ 33 ^ 3f
08-->39-->0B-->38-->0C-->3F-->00
Much simpler and cleaner to look at. My guess, looking at your data, is that there is an extra 30 somewhere in your data stream, or that it is one short. Adding that extra 30 gets you your answer:
3b ^ 33 ^ 31 ^ 32 ^ 33 ^ 34 ^ 33 ^ 30 ^ 3F
08-->39-->0B-->38-->0C-->3F-->0F-->30
Beyond that, I'll keep digging...
Can you add some asserts or other validation to your input parameters? I'd hate to see an out-of-bounds start/end, or a null string, causing excitement. Also, is there a possibility of an off-by-one with start/end? Is the data range inclusive or exclusive? That could account for an extra 0x30 at the end of your data, from a 0 stored at the end of your track 3 being converted to a 0x30. Also, is there any possibility of having either corrupt data or a corrupt LRC? Obviously, this is the kind of thing your check is trying to catch. Perhaps it caught something?
The LRC algorithm is correct, but the format of the data used to calculate the LRC may be wrong (it depends on your MSR reader).
There are two track formats defined by ANSI/ISO (Alpha and BCD).
The binary coding is different from ASCII.
In this case, the start sentinel is ';', so the format should be BCD.
(The Alpha start sentinel is '%'.)
The LRC is calculated over the "real track data" (not including the parity bit).
Conversion rule:
ASCII to BCD -> (ASCII - 0x30)
--Data Bits-- Parity
b1 b2 b3 b4 b5 Character Function
0 0 0 0 1 0 (0H) Data
1 0 0 0 0 1 (1H) "
0 1 0 0 0 2 (2H) "
1 1 0 0 1 3 (3H) "
0 0 1 0 0 4 (4H) "
1 0 1 0 1 5 (5H) "
0 1 1 0 1 6 (6H) "
1 1 1 0 0 7 (7H) "
0 0 0 1 0 8 (8H) "
1 0 0 1 1 9 (9H) "
0 1 0 1 1 : (AH) Control
1 1 0 1 0 ; (BH) Start Sentinel
0 0 1 1 1 < (CH) Control
1 0 1 1 0 = (DH) Field Separator
0 1 1 1 0 > (EH) Control
1 1 1 1 1 ? (FH) End Sentinel
In your sample:
Convert the ASCII track data to BCD format.
Use the BCD data to calculate the LRC; the result is 0x00.
Then convert the LRC back (BCD to ASCII), which finally gives LRC = 0x30.
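A minimal C sketch of that calculation (the question's code is C#, but the arithmetic is the same; the shortened track string in main() is a made-up example):

#include <stdio.h>

/* XOR the 4 BCD data bits of each character (parity excluded), then map back to ASCII */
static char lrc_bcd(const char *s, int start, int end)
{
    int result = 0;
    for (int i = start; i <= end; i++)
        result ^= (s[i] - 0x30) & 0x0F;   /* ASCII -> BCD value */
    return (char)(result + 0x30);         /* BCD -> ASCII representation of the LRC */
}

int main(void)
{
    const char *track = ";1234?";         /* start sentinel, data, end sentinel */
    printf("LRC = 0x%02X\n", (unsigned char)lrc_bcd(track, 0, 5));   /* prints LRC = 0x30 */
    return 0;
}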
P.S. ASCII conversion to Alpha:
if(bASCII >= 0x20 && bASCII <= 0x5B)
{
    return(bASCII - 0x20);
}
else if(bASCII >= 0x5C && bASCII <= 0x5F)
{
    return(bASCII - 0x1F);
}
Your algorithm doesn't match the LRC algorithm in Wikipedia's article. Are you sure you're using the correct algorithm?
