Understanding blocks in gcov files - gcc

I'm trying to understand the output of the gcov tool. Running it with the -a option makes sense so far, and I want to understand the block coverage information. Unfortunately, it's hard to make sense of what the blocks are and why they aren't taken. Below is the output.
I ran the add function in my calculator program once. I have no clue why it shows block 0.
-: 0:Source:calculator.c
-: 0:Graph:calculator.gcno
-: 0:Data:calculator.gcda
-: 0:Runs:1
-: 0:Programs:1
-: 1:#include "calculator.h"
-: 2:#include <stdio.h>
-: 3:#include <stdlib.h>
-: 4:
1: 5:int main(int argc, char *argv[])
1: 5-block 0
-: 6:{
-: 7: int a,b, result;
-: 8: char opr;
-: 9:
1: 10: if(argc!=4)
1: 10-block 0
-: 11: {
#####: 12: printf("Invalid arguments...\n");
$$$$$: 12-block 0
#####: 13: return -1;
-: 14: }
-: 15:
-: 16: //get values
1: 17: a = atoi(argv[1]);
1: 18: b = atoi(argv[3]);
-: 19:
-: 20: //get operator
1: 21: opr=argv[2][0];
-: 22:
-: 23: //calculate according to operator
1: 24: switch(opr)
1: 24-block 0
-: 25: {
1: 26: case '+':
1: 27: result = add_(a, b);
1: 27-block 0
-: 28:
1: 29: break;
#####: 30: case '-':
#####: 31: result=sub_(a,b);
$$$$$: 31-block 0
#####: 32: break;
#####: 33: case '*':
#####: 34: result=multiply_(a,b);
$$$$$: 34-block 0
#####: 35: break;
#####: 36: case '/':
#####: 37: result = div_(a,b);
$$$$$: 37-block 0
#####: 38: default:
#####: 39: result=0;
#####: 40: break;
$$$$$: 40-block 0
-: 41: }
-: 42:
1: 43: if(opr=='+' || opr=='-' || opr=='*'|| opr== '/')
1: 43-block 0
$$$$$: 43-block 1
$$$$$: 43-block 2
$$$$$: 43-block 3
1: 44: printf("Result: %d %c %d = %d\n",a,opr,b,result);
1: 44-block 0
-: 45: else
#####: 46: printf("Undefined Operator...\n");
$$$$$: 46-block 0
-: 47:
1: 48: return 0;
1: 48-block 0
-: 49:}
-: 50:
-: 51:/**
-: 52: * Function to add two numbers
-: 53: */
1: 54:float add_(float num1, float num2)
1: 54-block 0
-: 55:{
1: 56: return num1 + num2;
1: 56-block 0
-: 57:}
-: 58:
-: 59:/**
-: 60: * Function to subtract two numbers
-: 61: */
#####: 62:float sub_(float num1, float num2)
$$$$$: 62-block 0
-: 63:{
#####: 64: return num1 - num2;
$$$$$: 64-block 0
-: 65:}
-: 66:
-: 67:/**
-: 68: * Function to multiply two numbers
-: 69: */
#####: 70:float multiply_(float num1, float num2)
$$$$$: 70-block 0
-: 71:{
#####: 72: return num1 * num2;
$$$$$: 72-block 0
-: 73:}
-: 74:
-: 75:/**
-: 76: * Function to divide two numbers
-: 77: */
#####: 78:float div_(float num1, float num2)
$$$$$: 78-block 0
-: 79:{
#####: 80: return num1 / num2;
$$$$$: 80-block 0
-: 81:}
If anyone knows how to decipher the block info, especially for lines 5, 12, 13, 43 and 64, or knows of any detailed documentation on what it all means, I'd appreciate the help.

Each block is marked by a line with the same line number as the last line of the block and the number of branches and calls in the block. Note that a block here is a basic block, that is, a straight-line run of code that control flow cannot enter or leave in the middle; a pair of curly braces ({}) does not by itself create one. Line 5 marks the beginning of the main block. After that, as I mentioned, a block number is shown for every branch or function call. Your if statement on line 43 has four conditions joined by ||, so it is split into four blocks, labeled 0, 1, 2, 3. All blocks that were not executed are marked $$$$$. That is the case here: you must have passed '+' as the argument, so the first condition was already true, the program never evaluated the checks for the other operators, and hence blocks 1, 2 and 3 are marked $$$$$.
Hope this helps.
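To see why only block 0 of line 43 ran, it can help to model the short-circuit evaluation by hand. The sketch below is a simplified Python model, not gcov's actual algorithm: the function name `blocks_reached` and the one-block-per-condition mapping are assumptions made purely for illustration.

```python
def blocks_reached(first_true, n_conditions=4):
    """Return the indices of the || operands that get evaluated.

    first_true   -- index of the first operand that is true, or None if none is
    n_conditions -- number of operands in the || chain (4 on line 43)

    Simplified model: each operand is treated as one block, and evaluation
    stops at the first true operand because || short-circuits.
    """
    last = n_conditions - 1 if first_true is None else first_true
    return list(range(last + 1))


print(blocks_reached(0))     # operator '+' matches the first operand
print(blocks_reached(3))     # operator '/' matches the last operand
print(blocks_reached(None))  # undefined operator: every operand is checked
```

With the first operand true, only index 0 is reached, which matches the single executed block in the listing.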

Optimal way of assigning items throughout "bins"

I was asked to research possible solutions to the following problem: We have a certain number of "bins" (~50 of them). Every bin has its own capacity. Every bin can only take items of some specific types. Items have a type and a weight (which reduces the bin's remaining capacity once the item is stored). What I'm aiming for is a situation in which all the items are assigned "evenly" (I will discuss this in a second) over the bins. It might happen, though, that the bins can't store all the items because the sum of the items' weights is greater than the sum of the bins' capacities. In such a situation I want the script to be able to go over the bins' limits.
What I mean by "evenly" is that, when comparing bins, we look at the percentage of each bin's capacity that is filled rather than the number of items stored in it. So even if bin 1 holds only one item and bin 2 holds 10 items, we call the assignment even if both bins are filled to 60% of their capacity. This also applies when bins are overfilled.
At first this looked like a multiple knapsack problem to me, but the custom constraints (and the overfilling aspect) make me think that this might not be the best approach. Is it even possible to find an exact solution to this (or an algorithm), or does it need some approach like using ML and hoping to find reasonably good solutions?
#Edit:
I was asked to add some sample, anonymized data if possible, so I'll do my best to illustrate the problem.
Let's say, that we have a set of bins as follows:
A: capacity: 1, types: a, b,
B: capacity: 0.5, types: b,
C: capacity: 0.5, types: a,
D: capacity: 0.75, types: c
E: capacity: 0.4, types: a, b, c
The thing is - capacity is always a number between 0 and 1 and there are multiple types of things that can get to a bin: a bin can either take one type of items to it or multiple.
Now, let's say we have some sample set of items:
a: weight: 0.1, type: a,
b: weight: 0.75, type: b,
c: weight: 0.5, type: a,
d: weight: 0.1, type: c,
e: weight: 0.25, type: a,
f: weight: 0.1, type: b
You can easily see that every item has its weight and type (only one per item).
Now, what we want to achieve is a mapping bin:set_of_items (like A -> {a, b}) in which the distribution of items over bins is such that every bin is, if possible, filled to the same percentage of its capacity (so, for example, every bin would be filled to ~60%). Most importantly, all items must be distributed, even if this means that we overfill bins.
A few things to note: there are situations in which a bin's capacity is really small, but it is almost certain that the granularity of the data is such that every bin will hold at least one item. There can be a situation in which sum_of_bins_capacities < sum_of_items_weights and we have to overfill the bins; that's fine.
It's really hard for me to provide a reasonable dummy data set, but I hope this makes the problem a little clearer.
This is a variant on the Fair subset sum problem. It is NP-complete, and with a large number of bins (let alone type restrictions) you can't even use dynamic programming to solve it.
I would therefore suggest using an approximation algorithm to solve it.
First assign some reasonable cost function per bin. For example:
cost_of_bin = weight * percent_to_capacity * (1 if percent_to_capacity <= 1 else 2)
What this means is that moving an item from a fuller bin to an emptier one will reduce cost. It also means that adding to a bin past capacity is possible, but severely penalized.
You can then try a simple greedy approach. Sort the items from largest to smallest. Add each item to the available bin with lowest cost. Then cycle through the list and attempt to see if you can gain anything by moving it. When nothing wants to move, you've got a solution.
If you want to get better results, you can experiment with something like Simulated Annealing.
You can also play with different cost functions. For example squaring percent_to_capacity will more severely penalize the most full bin, while making it OK for light ones to be lighter. Just make sure that moving an item from a more full bin to a less full one always improves maximum cost.
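The greedy pass and the "move items until nothing wants to move" step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the names `bin_cost` and `greedy_assign` are made up, "bin with lowest cost" is interpreted as the allowed bin whose cost increases least, and the data is the sample from the question.

```python
BINS = {  # name -> (capacity, allowed types), sample data from the question
    "A": (1.0, {"a", "b"}), "B": (0.5, {"b"}), "C": (0.5, {"a"}),
    "D": (0.75, {"c"}), "E": (0.4, {"a", "b", "c"}),
}
ITEMS = {  # name -> (weight, type)
    "a": (0.1, "a"), "b": (0.75, "b"), "c": (0.5, "a"),
    "d": (0.1, "c"), "e": (0.25, "a"), "f": (0.1, "b"),
}


def bin_cost(load, capacity):
    """Cost from the answer: weight * fill ratio, doubled once over capacity."""
    ratio = load / capacity
    return load * ratio * (1 if ratio <= 1 else 2)


def greedy_assign(bins, items, max_passes=1000):
    load = {b: 0.0 for b in bins}
    placed = {}
    # Greedy pass: largest items first, each into the allowed bin
    # whose cost grows the least (an assumed reading of "lowest cost").
    for name, (w, kind) in sorted(items.items(), key=lambda kv: -kv[1][0]):
        allowed = [b for b, (_, kinds) in bins.items() if kind in kinds]
        if not allowed:
            raise ValueError(f"no bin accepts item {name}")
        best = min(allowed, key=lambda b: bin_cost(load[b] + w, bins[b][0])
                   - bin_cost(load[b], bins[b][0]))
        load[best] += w
        placed[name] = best
    # Improvement passes: move an item whenever that lowers the combined
    # cost of the two bins involved; stop when a full pass changes nothing.
    for _ in range(max_passes):
        improved = False
        for name, (w, kind) in items.items():
            for b, (cap, kinds) in bins.items():
                cur = placed[name]
                if b == cur or kind not in kinds:
                    continue
                cur_cap = bins[cur][0]
                before = bin_cost(load[cur], cur_cap) + bin_cost(load[b], cap)
                after = bin_cost(load[cur] - w, cur_cap) + bin_cost(load[b] + w, cap)
                if after < before - 1e-12:
                    load[cur] -= w
                    load[b] += w
                    placed[name] = b
                    improved = True
        if not improved:
            break
    return placed, load


placed, load = greedy_assign(BINS, ITEMS)
for b, (cap, _) in BINS.items():
    print(b, sorted(i for i, where in placed.items() if where == b),
          f"{100 * load[b] / cap:.0f}% full")
```

Swapping in a different `bin_cost` (for example one that squares the fill ratio) is the only change needed to experiment with other cost functions.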
I have implemented an algorithm based on the Monte Carlo strategy. A cost function seeks to fill the bins as evenly as possible by minimizing a variance defined as the sum of the squared bin costs. The cost of a bin is the difference between its fill-adjusted capacity and the combined weight of its items.
In each iteration, the algorithm selects an item and a bin at random. It then moves the item into the bin if this decreases the cost function, but with a small probability it also accepts a move that increases it. That helps avoid getting stuck in local minima. In this way, the algorithm roams the solution space in the general direction of a minimum. It is up to the user to decide when enough is enough and it is time to stop.
The algorithm is in C++ but is fairly C-ish to make it easier to port. The core algorithm is not extensive. A large part of the code is print statements for tracing. I post the code and its output as it stands. The best solution came after 315 iterations, and almost 100,000 further iterations did not change that.
Code:
#include <iostream>
#include <string>
#include <vector>
#include <unordered_set>
#include <random>
void binitem() {
const int N = 100000; // number of iterations in simulation
const int TL = 100; // trace limit. Tracing stops at TL but "winners" are always printed.
// TL=0 (no trace), TL=N (trace all).
const int SEED = 31; // Seed to the random number generator (change for another random sequence)
const double fairness_capacity = 0.6; // (60%) The evenness of the bins.
const double another_chance = 0.1; // (10%) The acceptance chance of a rejected move
using BinID = int;
using ItemID = int;
enum class TypeID {a,b,c};
struct Item {
std::string symbol;
double weight;
TypeID type;
};
std::vector<Item> items = { // item definitions
{"a", 0.1, TypeID::a},
{"b", 0.75, TypeID::b},
{"c", 0.5, TypeID::a},
{"d", 0.1, TypeID::c},
{"e", 0.25, TypeID::a},
{"f", 0.1, TypeID::b}
};
const int ITEMS = static_cast<ItemID>(items.size());
struct Bin {
std::string symbol;
double capacity;
std::unordered_set<TypeID> allowed_types; // contains() below needs C++20
};
std::vector<Bin> bins = { // bin definitions
{"A", 1.0, {TypeID::a, TypeID::b}},
{"B", 0.5, {TypeID::b}},
{"C", 0.5, {TypeID::a}},
{"D", 0.75, {TypeID::c}},
{"E", 0.4, {TypeID::a, TypeID::b, TypeID::c}}
};
const int BINS = static_cast<BinID>(bins.size());
struct ItemState {
BinID is_in_bin = -1; // this item is in this bin (-1 means no bin)
};
std::vector<ItemState> itemStates(ITEMS);
struct BinState {
std::unordered_set<ItemID> items_in_bin; // items in this bin
double cost = 0.0; // bin cost
};
std::vector<BinState> binStates(BINS);
// cost of a bin (squared difference between fairness adjusted capacity and total weight of items)
auto square_cost_of_bin = [&](BinID binID) {
double cost = fairness_capacity * bins[binID].capacity; // adjusted capacity
for (int itemID : binStates[binID].items_in_bin) {
cost -= items[itemID].weight; // minus weight of all items
}
return cost * cost; // squared
};
auto square_cost_of_all_bins = [&]() { // sum up costs of all bins
double sum = 0.0;
for (BinID binID = 0; binID<BINS; ++binID) sum += binStates[binID].cost;
return sum;
};
for (BinID binID = 0; binID<BINS; ++binID) { // initialize bin costs
binStates[binID].cost = square_cost_of_bin(binID);
}
auto print_bins = [&] (int i) {
for (BinID binID = 0; binID<BINS; ++binID) {
std::string s = "";
for (ItemID itemID : binStates[binID].items_in_bin) {
s += items[itemID].symbol + ",";
}
if (s.empty()) s = "-"; else s.pop_back();
if (i>=0) std::cout << i;
std::cout << ": *** Bin=" << bins[binID].symbol <<
" holds items=[" << s << "] (bin cost=" << binStates[binID].cost << ")" << std::endl;
}
};
std::mt19937 rng(SEED); // Mersenne Twister random number generator
std::uniform_int_distribution<int> rnd_item(0, ITEMS-1);
std::uniform_int_distribution<int> rnd_bin(0, BINS-1);
std::uniform_real_distribution<double> rnd_real(0.0, 1.0);
std::cout << ": *** Binitem - a Monte Carlo solution to an assignment probleM" << std::endl;
double total_cost = square_cost_of_all_bins();
std::cout << ": *** Ititial total cost with all bins empty=" << total_cost << std::endl;
for (ItemID itemID=0; itemID<ITEMS; ++itemID) { // insert all items into bins
bool ok=false;
BinID rnd_binID = rnd_bin(rng);
for (BinID binID=0; binID<BINS; ++binID) {
if (bins[rnd_binID].allowed_types.contains(items[itemID].type)) {
binStates[rnd_binID].items_in_bin.insert(itemID); // add item to bin
itemStates[itemID].is_in_bin = rnd_binID; // mark item as present in bin
binStates[rnd_binID].cost = square_cost_of_bin(rnd_binID); // update bin cost
ok=true;
break;
}
rnd_binID = (rnd_binID + 1) % BINS;
}
if (!ok) {
std::cout << ": *** Fatal error: No bin accepts item=" << items[itemID].symbol << std::endl;
return;
}
}
total_cost = square_cost_of_all_bins();
std::cout << ": *** Ititial total cost with all items inserted=" << total_cost << std::endl;
print_bins(-1);
for (int i=0; i<N; ++i) {
if (i==TL) std::cout << ": *** Tracing ends." << std::endl;
const ItemID rnd_itemID = rnd_item(rng); // get a random item
const BinID rnd_binID = rnd_bin(rng); // and try it out with a random bin
double new_total_cost = total_cost;
const bool allowed = bins[rnd_binID].allowed_types.contains(items[rnd_itemID].type);
if (allowed) { // random item is allowed in random bin
const BinID old_binID = itemStates[rnd_itemID].is_in_bin;
if (old_binID != rnd_binID) { // the random item is not in the random bin - move it there
// current costs
const double cur_rnd_bin_cost = binStates[rnd_binID].cost;
const double cur_old_bin_cost = binStates[old_binID].cost;
// make move
binStates[rnd_binID].items_in_bin.insert(rnd_itemID); // add item to random bin
binStates[old_binID].items_in_bin.erase(rnd_itemID); // remove item from old bin
itemStates[rnd_itemID].is_in_bin = rnd_binID; // mark item as present in bin
// new costs
const double new_rnd_bin_cost = square_cost_of_bin(rnd_binID);
binStates[rnd_binID].cost = new_rnd_bin_cost;
const double new_old_bin_cost = square_cost_of_bin(old_binID);
binStates[old_binID].cost = new_old_bin_cost;
bool accept = new_rnd_bin_cost+new_old_bin_cost <= cur_rnd_bin_cost+cur_old_bin_cost;
bool second_chance = false;
if (!accept) {
second_chance = (rnd_real(rng) < another_chance); // a second chance
accept = second_chance;
}
if (accept) {
new_total_cost = square_cost_of_all_bins();
} else {
// undo move and restore costs
binStates[rnd_binID].items_in_bin.erase(rnd_itemID); // remove item from random bin
binStates[old_binID].items_in_bin.insert(rnd_itemID); // put it back into old bin
itemStates[rnd_itemID].is_in_bin = old_binID; // mark item as present in old bin
binStates[rnd_binID].cost = cur_rnd_bin_cost;
binStates[old_binID].cost = cur_old_bin_cost;
}
if (i<TL) {
std::cout << i << ": Try move item=" << items[rnd_itemID].symbol <<
" to bin:" << bins[rnd_binID].symbol <<
" : " << ((accept) ? "accepted" : "rejected");
if (second_chance) std::cout << " (on second chance)";
std::cout << std::endl;
}
} else { // random item is already in random bin - better luck next time
if (i<TL) std::cout << i << ": Item=" << items[rnd_itemID].symbol <<
" already present in bin=" << bins[rnd_binID].symbol << std::endl;
}
} else { // random item is not allowed in random bin - better luck next time
if (i<TL) std::cout << i << ": Item=" << items[rnd_itemID].symbol <<
" not allowed in bin=" << bins[rnd_binID].symbol << std::endl;
}
if (new_total_cost < total_cost) { // a new winner
std::cout << i << ": *** New winner:" << std::endl;
std::cout << i << ": *** New total cost=" << new_total_cost <<
", old cost=" << total_cost << std::endl;
total_cost = new_total_cost;
print_bins(i);
}
}
std::cout << ": *** Exit after " << N << " iterations." << std::endl;
} // binitem
int main() { // entry point so the listing builds as a standalone program
binitem();
return 0;
}
Output:
: *** Binitem - a Monte Carlo solution to an assignment probleM
: *** Ititial total cost with all bins empty=0.8001
: *** Ititial total cost with all items inserted=0.6251
: *** Bin=A holds items=[b,c] (bin cost=0.4225)
: *** Bin=B holds items=[f] (bin cost=0.04)
: *** Bin=C holds items=[a] (bin cost=0.04)
: *** Bin=D holds items=[d] (bin cost=0.1225)
: *** Bin=E holds items=[e] (bin cost=0.0001)
0: Try move item=e to bin:C : rejected
1: Item=a not allowed in bin=D
2: Try move item=a to bin:A : rejected
3: Try move item=e to bin:C : rejected
4: Try move item=b to bin:E : rejected
5: Item=d not allowed in bin=A
6: Try move item=f to bin:E : rejected
7: Item=f not allowed in bin=D
8: Try move item=e to bin:C : rejected
9: Item=e not allowed in bin=B
10: Item=b not allowed in bin=D
11: Item=d not allowed in bin=B
12: Item=a already present in bin=C
13: Item=a not allowed in bin=D
14: Item=d already present in bin=D
15: Item=a not allowed in bin=D
16: Try move item=b to bin:E : accepted (on second chance)
17: Item=f not allowed in bin=C
18: Item=c not allowed in bin=D
19: Item=c not allowed in bin=B
20: Try move item=e to bin:A : accepted
20: *** New winner:
20: *** New total cost=0.4851, old cost=0.6251
20: *** Bin=A holds items=[c,e] (bin cost=0.0225)
20: *** Bin=B holds items=[f] (bin cost=0.04)
20: *** Bin=C holds items=[a] (bin cost=0.04)
20: *** Bin=D holds items=[d] (bin cost=0.1225)
20: *** Bin=E holds items=[b] (bin cost=0.2601)
21: Item=c already present in bin=A
22: Item=e not allowed in bin=B
23: Item=e not allowed in bin=B
24: Item=b not allowed in bin=D
25: Item=d not allowed in bin=B
26: Try move item=a to bin:E : rejected
27: Try move item=a to bin:E : rejected
28: Item=c not allowed in bin=D
29: Try move item=a to bin:A : rejected
30: Item=c already present in bin=A
31: Item=f not allowed in bin=C
32: Item=b not allowed in bin=D
33: Item=a already present in bin=C
34: Item=a not allowed in bin=B
35: Try move item=a to bin:A : rejected
36: Item=e not allowed in bin=B
37: Item=a not allowed in bin=D
38: Item=d not allowed in bin=A
39: Try move item=b to bin:A : rejected
40: Try move item=c to bin:C : rejected
41: Item=c already present in bin=A
42: Try move item=a to bin:A : rejected
43: Try move item=a to bin:A : accepted (on second chance)
44: Try move item=c to bin:C : accepted
45: Item=e already present in bin=A
46: Item=e not allowed in bin=D
47: Item=b not allowed in bin=D
48: Try move item=a to bin:E : rejected
49: Try move item=f to bin:E : rejected
50: Item=b not allowed in bin=D
51: Try move item=a to bin:E : accepted (on second chance)
52: Try move item=c to bin:A : accepted
53: Item=b not allowed in bin=D
54: Item=b already present in bin=E
55: Item=c already present in bin=A
56: Item=a not allowed in bin=D
57: Try move item=c to bin:C : rejected
58: Item=a already present in bin=E
59: Try move item=f to bin:A : rejected
60: Item=d not allowed in bin=B
61: Try move item=f to bin:E : rejected
62: Item=e not allowed in bin=B
63: Item=d already present in bin=D
64: Try move item=b to bin:A : rejected
65: Item=c not allowed in bin=B
66: Item=e already present in bin=A
67: Try move item=e to bin:E : rejected
68: Item=a not allowed in bin=B
69: Item=c already present in bin=A
70: Item=f not allowed in bin=D
71: Item=c already present in bin=A
72: Try move item=c to bin:C : rejected
73: Item=d already present in bin=D
74: Item=a not allowed in bin=B
75: Item=f not allowed in bin=D
76: Item=d not allowed in bin=B
77: Item=f not allowed in bin=C
78: Item=d not allowed in bin=B
79: Try move item=b to bin:B : accepted
80: Try move item=a to bin:C : accepted
81: Item=e not allowed in bin=B
82: Try move item=b to bin:E : accepted
83: Item=b not allowed in bin=C
84: Item=f not allowed in bin=D
85: Item=f not allowed in bin=C
86: Item=a already present in bin=C
87: Item=d not allowed in bin=C
88: Try move item=c to bin:C : rejected
89: Item=b not allowed in bin=D
90: Item=f not allowed in bin=D
91: Try move item=a to bin:A : accepted (on second chance)
92: Item=b already present in bin=E
93: Item=f already present in bin=B
94: Try move item=a to bin:C : accepted
95: Item=e already present in bin=A
96: Item=a not allowed in bin=D
97: Try move item=d to bin:E : rejected
98: Try move item=e to bin:C : accepted
98: *** New winner:
98: *** New total cost=0.4351, old cost=0.4851
98: *** Bin=A holds items=[c] (bin cost=0.01)
98: *** Bin=B holds items=[f] (bin cost=0.04)
98: *** Bin=C holds items=[a,e] (bin cost=0.0025)
98: *** Bin=D holds items=[d] (bin cost=0.1225)
98: *** Bin=E holds items=[b] (bin cost=0.2601)
99: Item=f not allowed in bin=D
: *** Tracing ends.
153: *** New winner:
153: *** New total cost=0.3171, old cost=0.4351
153: *** Bin=A holds items=[b] (bin cost=0.0225)
153: *** Bin=B holds items=[f] (bin cost=0.04)
153: *** Bin=C holds items=[e] (bin cost=0.0025)
153: *** Bin=D holds items=[d] (bin cost=0.1225)
153: *** Bin=E holds items=[a,c] (bin cost=0.1296)
173: *** New winner:
173: *** New total cost=0.2551, old cost=0.3171
173: *** Bin=A holds items=[b] (bin cost=0.0225)
173: *** Bin=B holds items=[f] (bin cost=0.04)
173: *** Bin=C holds items=[e,a] (bin cost=0.0025)
173: *** Bin=D holds items=[d] (bin cost=0.1225)
173: *** Bin=E holds items=[c] (bin cost=0.0676)
315: *** New winner:
315: *** New total cost=0.2371, old cost=0.2551
315: *** Bin=A holds items=[b] (bin cost=0.0225)
315: *** Bin=B holds items=[f] (bin cost=0.04)
315: *** Bin=C holds items=[c] (bin cost=0.04)
315: *** Bin=D holds items=[d] (bin cost=0.1225)
315: *** Bin=E holds items=[a,e] (bin cost=0.0121)
: *** Exit after 100000 iterations.
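As a cross-check of the totals in the trace, the cost function described above (squared difference between 60% of each bin's capacity and the weight it holds, summed over all bins) can be recomputed independently. The Python sketch below uses the sample data and the final assignment printed at iteration 315; the helper name `total_cost` is made up for this illustration.

```python
FAIRNESS = 0.6  # same value as fairness_capacity in the C++ listing

capacity = {"A": 1.0, "B": 0.5, "C": 0.5, "D": 0.75, "E": 0.4}
weight = {"a": 0.1, "b": 0.75, "c": 0.5, "d": 0.1, "e": 0.25, "f": 0.1}


def total_cost(assignment):
    """Sum over bins of (fairness-adjusted capacity - total item weight)^2."""
    total = 0.0
    for bin_name, cap in capacity.items():
        held = sum(weight[item] for item in assignment.get(bin_name, []))
        total += (FAIRNESS * cap - held) ** 2
    return total


all_empty = {}
final = {"A": ["b"], "B": ["f"], "C": ["c"], "D": ["d"], "E": ["a", "e"]}

print(round(total_cost(all_empty), 4))  # 0.8001, the initial cost in the trace
print(round(total_cost(final), 4))      # 0.2371, the cost of the last winner
```

Both values agree with the figures in the output above.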

Conversion of integer -> char

I'm learning scheme and I stumbled upon this in a textbook:
(integer->char 50) ⇒ #\2
Why does integer->char 50 evaluate to 2? Is it because "50" is too big to be a character, so it just takes the length/number of digits?
It doesn't evaluate to 2: it evaluates to the character #\2, which is a completely different thing:
> (for ([i (in-range 32 128)])
(let ([c (integer->char i)])
(printf "~S: ~S / '~A'~%" i c c)))
32: #\space / ' '
33: #\! / '!'
34: #\" / '"'
35: #\# / '#'
36: #\$ / '$'
37: #\% / '%'
38: #\& / '&'
39: #\' / '''
40: #\( / '('
41: #\) / ')'
42: #\* / '*'
43: #\+ / '+'
44: #\, / ','
45: #\- / '-'
46: #\. / '.'
47: #\/ / '/'
48: #\0 / '0'
49: #\1 / '1'
50: #\2 / '2'
51: #\3 / '3'
52: #\4 / '4'
53: #\5 / '5'
54: #\6 / '6'
55: #\7 / '7'
56: #\8 / '8'
57: #\9 / '9'
58: #\: / ':'
59: #\; / ';'
60: #\< / '<'
61: #\= / '='
62: #\> / '>'
63: #\? / '?'
64: #\@ / '@'
65: #\A / 'A'
66: #\B / 'B'
67: #\C / 'C'
68: #\D / 'D'
69: #\E / 'E'
70: #\F / 'F'
71: #\G / 'G'
72: #\H / 'H'
73: #\I / 'I'
74: #\J / 'J'
75: #\K / 'K'
76: #\L / 'L'
77: #\M / 'M'
78: #\N / 'N'
79: #\O / 'O'
80: #\P / 'P'
81: #\Q / 'Q'
82: #\R / 'R'
83: #\S / 'S'
84: #\T / 'T'
85: #\U / 'U'
86: #\V / 'V'
87: #\W / 'W'
88: #\X / 'X'
89: #\Y / 'Y'
90: #\Z / 'Z'
91: #\[ / '['
92: #\\ / '\'
93: #\] / ']'
94: #\^ / '^'
95: #\_ / '_'
96: #\` / '`'
97: #\a / 'a'
98: #\b / 'b'
99: #\c / 'c'
100: #\d / 'd'
101: #\e / 'e'
102: #\f / 'f'
103: #\g / 'g'
104: #\h / 'h'
105: #\i / 'i'
106: #\j / 'j'
107: #\k / 'k'
108: #\l / 'l'
109: #\m / 'm'
110: #\n / 'n'
111: #\o / 'o'
112: #\p / 'p'
113: #\q / 'q'
114: #\r / 'r'
115: #\s / 's'
116: #\t / 't'
117: #\u / 'u'
118: #\v / 'v'
119: #\w / 'w'
120: #\x / 'x'
121: #\y / 'y'
122: #\z / 'z'
123: #\{ / '{'
124: #\| / '|'
125: #\} / '}'
126: #\~ / '~'
127: #\rubout / ''
It's important to know that it does not convert a numeric value to its digit; rather, it converts the ASCII value to its corresponding character. E.g. the ASCII value 50 represents the character #\2 (digit 2) and 65 represents #\A (capital letter A).
You can find the documentation in the report:
procedure: (char->integer char)
procedure: (integer->char n)
Given a character, char->integer returns an exact integer
representation of the character. Given an exact integer that is the
image of a character under char->integer, integer->char returns that
character.
Scheme has a procedure called number->string which converts a number to a string representation:
(number->string 50 10) ; ==> "50" (base 10 representation)
(number->string #x32 10) ; ==> "50" (base 10 representation)
(number->string 50 16) ; ==> "32" (hex, base 16)
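The same distinction between "the character with code 50" and "the text of the number 50" exists in most languages. As a rough analogy in Python (not Scheme; the variable names are arbitrary), `chr` and `ord` play the roles of integer->char and char->integer, while `str` and `format` correspond to number->string:

```python
code_point = 50

as_char = chr(code_point)          # like (integer->char 50): the character '2'
back = ord(as_char)                # like (char->integer #\2): the integer 50
as_text = str(code_point)          # like (number->string 50 10): the text "50"
as_hex = format(code_point, "x")   # like (number->string 50 16): the text "32"

print(repr(as_char), back, repr(as_text), repr(as_hex))
```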

Which condition is technically more efficient, i >= 0 or i > -1?

This kind of usage is common while writing loops.
I was wondering whether i >= 0 needs more CPU cycles than i > -1, since it checks two conditions (greater than OR equal to). Is one known to be better than the other, and if so, why?
This is not correct. The JIT will implement both tests as a single machine language instruction.
And the number of CPU clock cycles is not determined by the number of comparisons to zero or -1, because the CPU should do one comparison and set flags to indicate whether the result of the comparison is <, > or =.
It's possible that one of those instructions will be more efficient on certain processors, but this kind of micro-optimization is almost always not worth doing. (It's also possible that the JIT - or javac - will actually generate the same instructions for both tests.)
On the contrary, comparisons with zero (including non-strict ones) take one instruction fewer. On x86, arithmetic instructions set the status flags, so a conditional jump can follow them without a separate compare. This is reflected in the Java bytecode instruction set: there is a group of instructions that compare the value on top of the stack with zero and jump: ifeq/ifgt/ifge/iflt/ifle/ifne. (See the full list). A comparison with -1 requires an additional iconst_m1 operation (loading the constant -1 onto the stack).
Here are two loops with different comparisons:
@GenerateMicroBenchmark
public int loopZeroCond() {
int s = 0;
for (int i = 1000; i >= 0; i--) {
s += i;
}
return s;
}
@GenerateMicroBenchmark
public int loopM1Cond() {
int s = 0;
for (int i = 1000; i > -1; i--) {
s += i;
}
return s;
}
The second version is one byte longer:
public int loopZeroCond();
Code:
0: iconst_0
1: istore_1
2: sipush 1000
5: istore_2
6: iload_2
7: iflt 20 //
10: iload_1
11: iload_2
12: iadd
13: istore_1
14: iinc 2, -1
17: goto 6
20: iload_1
21: ireturn
public int loopM1Cond();
Code:
0: iconst_0
1: istore_1
2: sipush 1000
5: istore_2
6: iload_2
7: iconst_m1 //
8: if_icmple 21 //
11: iload_1
12: iload_2
13: iadd
14: istore_1
15: iinc 2, -1
18: goto 6
21: iload_1
22: ireturn
The zero-comparison version is slightly faster on my machine (to my surprise; I expected the JIT to compile these loops into identical assembly).
Benchmark                  Mode  Thr   Mean  Mean error  Units
t.LoopCond.loopM1Cond      avgt    1  0,319       0,004  usec/op
t.LoopCond.loopZeroCond    avgt    1  0,302       0,004  usec/op
Conclusion
Compare with zero whenever sensible.
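Whichever form is chosen, the two conditions are interchangeable for integer counters, so the choice affects speed only, never the result. The Python sketch below (used for brevity; it says nothing about JVM timing, and the function names merely mirror the Java benchmark methods) checks that both loops produce the same sum:

```python
def loop_zero_cond():
    """Count down from 1000 while i >= 0, summing i."""
    s, i = 0, 1000
    while i >= 0:
        s += i
        i -= 1
    return s


def loop_m1_cond():
    """Count down from 1000 while i > -1, summing i."""
    s, i = 0, 1000
    while i > -1:
        s += i
        i -= 1
    return s


# The two predicates agree on every integer, so the loops are equivalent.
same_predicate = all((i >= 0) == (i > -1) for i in range(-1000, 1001))
print(loop_zero_cond(), loop_m1_cond(), same_predicate)
```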

scala implicit performance

This comes up regularly. Functions coded up using generics are significantly slower in Scala. See the example below. The type-specific version performs about 1/3 faster than the generic version. This is doubly surprising given that the generic component is outside of the expensive loop. Is there a known explanation for this?
def xxxx_flttn[T](v: Array[Array[T]])(implicit m: Manifest[T]): Array[T] = {
val I = v.length
if (I <= 0) Array.ofDim[T](0)
else {
val J = v(0).length
for (i <- 1 until I) if (v(i).length != J) throw new utl_err("2D matrix not symetric. cannot be flattened. first row has " + J + " elements. row " + i + " has " + v(i).length)
val flt = Array.ofDim[T](I * J)
for (i <- 0 until I; j <- 0 until J) flt(i * J + j) = v(i)(j)
flt
}
}
def flttn(v: Array[Array[Double]]): Array[Double] = {
val I = v.length
if (I <= 0) Array.ofDim[Double](0)
else {
val J = v(0).length
for (i <- 1 until I) if (v(i).length != J) throw new utl_err("2D matrix not symetric. cannot be flattened. first row has " + J + " elements. row " + i + " has " + v(i).length)
val flt = Array.ofDim[Double](I * J)
for (i <- 0 until I; j <- 0 until J) flt(i * J + j) = v(i)(j)
flt
}
}
You can't really tell what you're measuring here--not very well, anyway--because the for loop isn't as fast as a pure while loop, and the inner operation is quite inexpensive. If we rewrite the code with while loops--the key double-iteration being
var i = 0
while (i<I) {
var j = 0
while (j<J) {
flt(i * J + j) = v(i)(j)
j += 1
}
i += 1
}
flt
then we see that the bytecode for the generic case is actually dramatically different. Non-generic:
133: checkcast #174; //class "[D"
136: astore 6
138: iconst_0
139: istore 5
141: iload 5
143: iload_2
144: if_icmpge 191
147: iconst_0
148: istore 4
150: iload 4
152: iload_3
153: if_icmpge 182
// The stuff above implements the loop; now we do the real work
156: aload 6
158: iload 5
160: iload_3
161: imul
162: iload 4
164: iadd
165: aload_1
166: iload 5
168: aaload // v(i)
169: iload 4
171: daload // v(i)(j)
172: dastore // flt(.) = _
173: iload 4
175: iconst_1
176: iadd
177: istore 4
// Okay, done with the inner work, time to jump around
179: goto 150
182: iload 5
184: iconst_1
185: iadd
186: istore 5
188: goto 141
It's just a bunch of jumps and low-level operations (daload and dastore being the key ones that load and store a double from an array). If we look at the key inner part of the generic bytecode, it instead looks like
160: getstatic #30; //Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
163: aload 7
165: iload 6
167: iload 4
169: imul
170: iload 5
172: iadd
173: getstatic #30; //Field scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
176: aload_1
177: iload 6
179: aaload
180: iload 5
182: invokevirtual #107; //Method scala/runtime/ScalaRunTime$.array_apply:(Ljava/lang/Object;I)Ljava/lang/Object;
185: invokevirtual #111; //Method scala/runtime/ScalaRunTime$.array_update:(Ljava/lang/Object;ILjava/lang/Object;)V
188: iload 5
190: iconst_1
191: iadd
192: istore 5
which, as you can see, has to call methods to do the array apply and update. The bytecode for that is a huge mess of stuff like
2: aload_3
3: instanceof #98; //class "[Ljava/lang/Object;"
6: ifeq 18
9: aload_3
10: checkcast #98; //class "[Ljava/lang/Object;"
13: iload_2
14: aaload
15: goto 183
18: aload_3
19: instanceof #100; //class "[I"
22: ifeq 37
25: aload_3
26: checkcast #100; //class "[I"
29: iload_2
30: iaload
31: invokestatic #106; //Method scala/runtime/BoxesRunTime.boxToInteger:
34: goto 183
37: aload_3
38: instanceof #108; //class "[D"
41: ifeq 56
44: aload_3
45: checkcast #108; //class "[D"
48: iload_2
49: daload
50: invokestatic #112; //Method scala/runtime/BoxesRunTime.boxToDouble:(
53: goto 183
which basically has to test each type of array and box it if it's the type you're looking for. Double is pretty near the front (3rd of 10), but it's still a pretty major overhead, even if the JVM can recognize that the code ends up being box/unbox and therefore doesn't actually need to allocate memory. (I'm not sure it can do that, but even if it could it wouldn't solve the problem.)
So, what to do? You can try [@specialized T], which will expand your code tenfold for you, as if you wrote each primitive array operation by yourself. Specialization is buggy in 2.9 (should be less so in 2.10), though, so it may not work the way you hope. If speed is of the essence, then first write while loops instead of for loops (or at least compile with -optimise, which helps for loops out by a factor of two or so!), and then consider either specialization or writing the code by hand for the types you require.
This is due to boxing, when you apply the generic to a primitive type and use containing arrays (or the type appearing plain in method signatures or as member).
Example
In the following trait, after compilation, the process method will take an erased Array[Any].
trait Foo[A]{
def process(as: Array[A]): Int
}
If you choose A to be a value/primitive type, like Double it has to be boxed. When writing the trait in a non-generic way (e.g. with A=Double), process is compiled to take an Array[Double], which is a distinct array type on the JVM. This is more efficient, since in order to store a Double inside the Array[Any], the Double has to be wrapped (boxed) into an object, a reference to which gets stored inside the array. The special Array[Double] can store the Double directly in memory as a 64-Bit value.
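The boxed-versus-unboxed storage difference is not specific to the JVM. As a loose analogy only (Python rather than Scala, and not a description of how scalac or the JVM lay out memory), a Python list holds references to separate float objects, while `array.array('d')` stores the raw 64-bit values back to back:

```python
from array import array

values = [1.5, 2.5, 3.5]

boxed = list(values)          # each element is a reference to a float object
packed = array("d", values)   # elements stored inline as C doubles

# Both containers behave the same when indexed ...
print(boxed[1], packed[1])

# ... but only the packed array has a fixed per-element size in bytes.
print(packed.itemsize, packed.itemsize * len(packed))
```

The generic, reference-holding container is the flexible one; the packed one is tied to a single element type, which is roughly the trade-off specialization makes.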
The @specialized annotation
If you feel adventurous, you can try the @specialized annotation (it's pretty buggy and crashes the compiler often). This makes scalac compile special versions of a class for all or selected primitive types. This only makes sense if the type parameter appears plain in type signatures (get(a: A), but not get(as: Seq[A])) or as a type parameter to Array. I think you'll receive a warning if specialization is pointless.

bitwise not in byte compare

Say I got a byte like this: 00010001 (with 2 bits ON)
And I wish to compare it to these bytes: {0000 0110, 0000 0011, 0011 0000, 0000 1100 }
The idea is to get the bytes that don't match, i.e. those where (byteA & byteX) == 0
For the example I should get/find: {0000 0110, 0000 1100 }
This may be easy if we write code that loops over the array of bytes.
Here is an example:
byte seek = 17;
byte[] pool = {6, 3, 48, 12 };
for(int p=0; p<pool.Length; p++)
{
if((pool[p] & seek)==0)
{
//Useful
}
}
Now I wish to do the same without looping the array.
Say the array is huge; and I wish to compare each byte with the rest.
for(int p1=0; p1<pool.Length; p1++)
{
for(int p2=0; p2<pool.Length; p2++)
{
if((pool[p1] & pool[p2])==0)
{
//byte at p1 works with byte at p2.
}
}//for p2
}//for p1
So what are my options?
A dictionary won't help me (I think), because if my seek byte is 0001 0001
I will want to find a byte like this: XXX0 XXX0
Any ideas?
Thanks a lot for your help;
I welcome C#, C++ or any pseudocode.
I am looking for an algorithm; not so much the code
Mike
Here's an entirely different idea, that may or may not work well, depending on what's in your pool.
Put the entire pool into a zero suppressed binary decision diagram. The items from pool would be sets, where the indices for which the bit is 1 are elements of that set. The ZDD is the family of all those sets.
To do a query, form another ZDD - the family of all sets that do not include the bits which are 1 in seek (that will be a small ZDD, in terms of nodes), then enumerate all sets in the intersection of those ZDDs.
Enumerating all those sets from the intersection is an output sensitive algorithm, but calculating the intersection takes time depending on how big the ZDD's are, so whether it works well depends on whether pool is a nice ZDD (the query zdd is definitely nice). And of course you have to prepare that ZDD, so in any case it'll only help if you plan to query the same pool often.
The great thing about bytes is there are only 256 possibilities.
You could initially create a 2d array 256x256 then just do a look-up into the array with your two values.
You could create the array beforehand and then store the result in your main program as a static instance.
// A rectangular array is declared as bool[,] and indexed as LookUp[a, b].
static bool[,] LookUp = {
{true, true, true ... },
...
{false, false, false ...}
};
static bool IsUseful(byte a, byte b) {
return LookUp[a, b];
}
edit *
Or use Arrays of answer Arrays
The inner array would ONLY contain the bytes that are 'Useful'.
static List<byte[]> LookUp = new List<byte[]>(256);
static byte[] IsUseful(byte a) {
return LookUp[a];
}
If 'a' = 0 then IsUseful would return every byte, since (0 & x) == 0 for any x. This would avoid the inner loop from your example.
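A minimal Python 3 sketch of the "array of answer arrays" idea, assuming the pool is known up front. The names `build_lookup` and `compatible_pairs` are made up here for illustration and are not part of the answer above.

```python
def build_lookup(pool):
    """Return a 256-entry table; entry a lists the pool values sharing no set bit with a."""
    distinct = sorted(set(pool))
    return [[v for v in distinct if (a & v) == 0] for a in range(256)]


def compatible_pairs(pool):
    """Yield (a, b) for each pool byte a and each distinct pool value b with (a & b) == 0."""
    table = build_lookup(pool)
    for a in pool:
        for b in table[a]:  # only the matches are visited, no test per candidate
            yield a, b


pool = [6, 3, 48, 12]
table = build_lookup(pool)
print(table[17])                     # pool values compatible with seek = 17
print(list(compatible_pairs(pool)))
```

Building the table costs 256 passes over the distinct pool values, so it only pays off when the same pool is queried many times.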
One fairly general solution is to "bit-transpose" your data so that you have e.g. a chunk of words containing all the high-order bits of your data, a chunk of words containing all the bits one position down from there, and so on. Then for your two-bit query, you or together two such chunks of words and look for 0 bits - so if a result word is -1 you can skip over it completely. To find where all the 0 bits are in word x, look at popcnt(x ^ (x + 1)): If x = ...10111, then x + 1 = ...11000 so x ^ (x + 1) = 000..01111 - and popcnt will then tell you where the lowest order 0 is. In practice, the big win may be when most of your data does not satisfy the query and you can skip over whole words: when you have a lot of matches the cost of query under any scheme may be small compared to the cost of whatever you plan to do with the matches. In a database, this is http://en.wikipedia.org/wiki/Bitmap_index - lots of info there and pointers to source code.
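Here is a rough Python 3 sketch of the bit-transposed layout, under some simplifying assumptions: one arbitrary-precision int per bit position stands in for the "chunk of words", and instead of the popcnt step on 0 bits it complements the result and peels off the lowest set bit. The function names `transpose` and `query` are illustrative only.

```python
def transpose(pool):
    """planes[k] is an int whose bit j is set when pool[j] has bit k set."""
    planes = [0] * 8
    for j, value in enumerate(pool):
        for k in range(8):
            if (value >> k) & 1:
                planes[k] |= 1 << j
    return planes


def query(planes, n, seek):
    """Return the indices j < n with (pool[j] & seek) == 0."""
    blocked = 0
    for k in range(8):
        if (seek >> k) & 1:
            blocked |= planes[k]        # items that share bit k with seek
    candidates = ~blocked & ((1 << n) - 1)  # 1 bits now mark the matches
    hits = []
    while candidates:
        low = candidates & -candidates  # isolate the lowest set bit
        hits.append(low.bit_length() - 1)
        candidates ^= low
    return hits


pool = [6, 3, 48, 12]
planes = transpose(pool)
matches = query(planes, len(pool), 17)
print([pool[j] for j in matches])
```

With fixed-width machine words the same loop would run per word, and a word with no candidate bits is skipped in one comparison.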
There are a number of ideas for querying 0/1 data in Knuth Vol 3 section 6.5 - "Binary attributes". Most of these require you to have some idea of the distribution of your data to recognise where they are applicable. One idea from there is generally applicable - if you have any sort of tree structure or index structure, you can keep in the nodes of the tree information on the or/and of everything under it. You can then check your query against that information and you may sometimes find that nothing below that node can possibly match your query, in which case you can skip it all. This is probably most useful if there are connections between the bits, so that e.g. if you divide the pool up just by sorting it and cutting it into chunks, even bits which do not affect the division into chunks are always set in some chunks and never set in other chunks.
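A small Python 3 sketch of the summary idea, assuming a flat one-level index (sorted pool cut into chunks) rather than a real tree, and keeping only the AND of each chunk because that is the summary this particular query can use. `build_chunks` and `query` are hypothetical names.

```python
def build_chunks(pool, size):
    """Sort the pool, cut it into chunks, and store the AND of each chunk."""
    ordered = sorted(pool)
    chunks = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        all_and = 0xFF
        for value in chunk:
            all_and &= value
        chunks.append((all_and, chunk))
    return chunks


def query(chunks, seek):
    """Return the values v with (v & seek) == 0, skipping hopeless chunks."""
    hits = []
    for all_and, chunk in chunks:
        if all_and & seek:
            continue  # some bit of seek is set in every value here, so nothing matches
        hits.extend(v for v in chunk if (v & seek) == 0)
    return hits


pool = [6, 3, 48, 12, 49, 51, 17, 19]
chunks = build_chunks(pool, 4)
print(query(chunks, 17))
```

How often a chunk can be skipped depends entirely on the data; with unrelated bits the AND summaries tend towards 0 and nothing is pruned.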
The only thing I can think of is to reduce number of tests:
for(int p1=1; p1<pool.Length; p1++)
{
for(int p2=0; p2<p1; p2++)
{
if((pool[p1] & pool[p2])==0)
{
//byte at p1 works with byte at p2.
//byte at p2 works with byte at p1.
}
}
}
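The same halving of the work can be sketched in Python 3 with `itertools.combinations`, which visits each unordered pair exactly once. The function name `disjoint_pairs` is just an illustration.

```python
from itertools import combinations


def disjoint_pairs(pool):
    """Return index pairs (i, j), i < j, whose bytes share no set bit."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(pool), 2)
            if (a & b) == 0]


pool = [6, 3, 48, 12]
pairs = disjoint_pairs(pool)
print(pairs)
```

This still performs n(n-1)/2 tests, so it halves the constant but does not change the quadratic growth.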
First of all, my English is poor, but I hope you understand. Also, I know my answer is a bit late, but I think it is still useful.
As someone has pointed out, the best solution is to generate a lookup table.
To do that, you have to hardcode the result of every loop iteration in
an array.
Fortunately, we are working with bytes, so there are only 256 possible
cases. For instance, if we take your pattern list {3, 6, 12, 48} we
get this table:
0: { 3, 6, 12, 48 }
1: { 6, 12, 48 }
2: { 12, 48 }
3: { 12, 48 }
4: { 3, 48 }
...
252: { 3 }
253: -
254: -
255: -
We use the input byte as an index into the lookup table to get the list of pattern values that share no set bit with the input byte.
Implementation:
I've used Python to generate two header files: one with the lookup table
definition, the other with the desired pattern list values. Then I include these files in a new C project and that's all!
Python Code
#!/usr/bin/env python3
from copy import copy
from time import asctime
import getopt
import sys


class LUPattern:
    __LU_SZ = 256
    BASIC_TYPE_CODE = "const uint8_t"
    BASIC_ID_CODE = "p"
    LU_ID_CODE = "lu"
    LU_HEADER_CODE = "__LUPATTERN_H__"
    LU_SZ_PATTLIST_ID_CODE = "SZ"
    PAT_HEADER_CODE = "__PATTERNLIST_H__"
    PAT_ID_CODE = "patList"

    def __init__(self, patList):
        self.patList = copy(patList)

    def genLU(self):
        # lu[i] holds the sorted pattern values that share no set bit with i
        lu = []
        pl = list( set(self.patList) )
        pl.sort()
        for i in range(LUPattern.__LU_SZ):
            e = []
            for p in pl:
                if (i & p) == 0:
                    e.append(p)
            lu.append(e)
        return lu

    def begCode(self):
        code = "// " + asctime() + "\n\n" \
            + "#ifndef " + LUPattern.LU_HEADER_CODE + "\n" \
            + "#define " + LUPattern.LU_HEADER_CODE + "\n" \
            + "\n#include <stdint.h>\n\n"
        return code

    def luCode(self):
        lu = self.genLU()
        pDict = {}
        luSrc = LUPattern.BASIC_TYPE_CODE \
            + " * const " \
            + LUPattern.LU_ID_CODE \
            + "[%i] = { \n\t" % LUPattern.__LU_SZ
        for i in range(LUPattern.__LU_SZ):
            if lu[i]:
                # One zero-terminated C array per distinct result list
                pId = "_%i" * len(lu[i])
                pId = pId % tuple(lu[i])
                pId = LUPattern.BASIC_ID_CODE + pId
                pBody = "{" + "%3i, " * len(lu[i]) + " 0 }"
                pBody = pBody % tuple(lu[i])
                pDict[pId] = pBody
                luSrc += pId
            else:
                luSrc += "0"
            luSrc += (i & 3) == 3 and (",\n\t") or ", "
        luSrc += "\n};"
        pCode = ""
        for pId in pDict.keys():
            pCode += "static " + \
                LUPattern.BASIC_TYPE_CODE + \
                " " + pId + "[] = " + \
                pDict[pId] + ";\n"
        return (pCode, luSrc)

    def genCode(self):
        (pCode, luSrc) = self.luCode()
        code = self.begCode() \
            + pCode + "\n\n" \
            + luSrc + "\n\n#endif\n\n"
        return code

    def patCode(self):
        code = "// " + asctime() + "\n\n" \
            + "#ifndef " + LUPattern.PAT_HEADER_CODE + "\n" \
            + "#define " + LUPattern.PAT_HEADER_CODE + "\n" \
            + "\n#include <stdint.h>\n\n"
        code += "enum { " \
            + LUPattern.LU_SZ_PATTLIST_ID_CODE \
            + " = %i, " % len(self.patList) \
            + "};\n\n"
        code += "%s %s[] = { " % ( LUPattern.BASIC_TYPE_CODE,
                                   LUPattern.PAT_ID_CODE )
        for p in self.patList:
            code += "%i, " % p
        code += "};\n\n#endif\n\n"
        return code

#########################################################

def msg():
    hmsg = "Usage: "
    hmsg += "%s %s %s" % (
        sys.argv[0],
        "-p",
        "\"{pattern0, pattern1, ... , patternN}\"\n\n")
    hmsg += "Options:"
    fmt = "\n%5s, %" + "%is" % ( len("input pattern list") + 3 )
    hmsg += fmt % ("-p", "input pattern list")
    fmt = "\n%5s, %" + "%is" % ( len("output look up header file") + 3 )
    hmsg += fmt % ("-l", "output look up header file")
    fmt = "\n%5s, %" + "%is" % ( len("output pattern list header file") + 3 )
    hmsg += fmt % ("-f", "output pattern list header file")
    fmt = "\n%5s, %" + "%is" % ( len("print this message") + 3 )
    hmsg += fmt % ("-h", "print this message")
    print(hmsg)
    sys.exit(0)

def getPatternList(patStr):
    pl = (patStr.strip("{}")).split(',')
    return [ int(i) & 255 for i in pl ]

def parseOpt():
    patList = [ 255 ] # Default pattern
    luFile = sys.stdout
    patFile = sys.stdout
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hp:l:f:", ["help", "patterns="])
    except getopt.GetoptError:
        msg()
    for op in opts:
        if op[0] == '-p':
            patList = getPatternList(op[1])
        elif op[0] == '-l':
            luFile = open(op[1], 'w')
        elif op[0] == '-f':
            patFile = open(op[1], 'w')
        elif op[0] == '-h':
            msg()
    return (patList, luFile, patFile)

def main():
    (patList, luFile, patFile) = parseOpt()
    lug = LUPattern(patList)
    print(lug.genCode(), file=luFile)
    print(lug.patCode(), file=patFile)
    patFile.close()
    luFile.close()

if __name__ == "__main__":
    main()
C Code
After running the script above we get two files, lu.h and pl.h, which we must
include in our new C project.
Here is a simple C code example:
#include "pl.h"
#include "lu.h"
#include <stdio.h>
int main(void)
{
uint32_t stats[SZ + 1] = { 0 };
uint8_t b;
while( fscanf(stdin, "%c", &b) != EOF )
{
(void)lu[b];
// lu[b] has bytes that don't match with b
}
return 0;
}
Test and benchmark:
I've done some extra work to check the approach and get results. There is
more code that I used as unit tests, but I don't paste it here
(if you wish, I'll paste it later).
I made two similar versions of the same utility: one uses the lookup table (noloop version)
and the other uses a typical loop (loop version).
The loop code is slightly different from the noloop code, but I tried to minimize the differences.
noloop version:
#include "pl.h"
#include "lu.h"
#include <stdio.h>
void doStats(const uint8_t * const, uint32_t * const);
void printStats(const uint32_t * const);
int main(void)
{
uint32_t stats[SZ + 1] = { 0 };
uint8_t b;
while( fscanf(stdin, "%c", &b) != EOF )
{
/* lu[b] has pattern values that not match with input b */
doStats(lu[b], stats);
}
printStats(stats);
return 0;
}
void doStats(const uint8_t * const noBitMatch, uint32_t * const stats)
{
uint8_t i, j = 0;
if(noBitMatch)
{
for(i = 0; noBitMatch[i] != 0; i++)
for(; j < SZ; j++)
if( noBitMatch[i] == patList[j] )
{
stats[j]++;
break;
}
}
else
stats[SZ]++;
}
void printStats(const uint32_t * const stats)
{
const uint8_t * const patList = lu[0];
uint8_t i;
printf("Stats: \n");
for(i = 0; i < SZ; i++)
printf(" %3i%-3c%9i\n", patList[i], ':', stats[i]);
printf(" ---%-3c%9i\n", ':', stats[SZ]);
}
loop version:
#include "pl.h"
#include <stdio.h>
#include <stdint.h>
#include <string.h>
void getNoBitMatch(const uint8_t, uint8_t * const);
void doStats(const uint8_t * const, uint32_t * const);
void printStats(const uint32_t * const);
int main(void)
{
uint8_t b;
uint8_t noBitMatch[SZ];
uint32_t stats[SZ + 1] = { 0 };
while( fscanf(stdin, "%c", &b ) != EOF )
{
getNoBitMatch(b, noBitMatch);
doStats(noBitMatch, stats);
}
printStats(stats);
return 0;
}
void doStats(const uint8_t * const noBitMatch, uint32_t * const stats)
{
uint32_t i;
uint8_t f;
for(i = 0, f = 0; i < SZ; i++)
{
f = ( (noBitMatch[i]) ? 1 : f );
stats[i] += noBitMatch[i];
}
stats[SZ] += (f) ? 0 : 1;
}
void getNoBitMatch(const uint8_t b, uint8_t * const noBitMatch)
{
uint8_t i;
for(i = 0; i < SZ; i++)
noBitMatch[i] = ( (b & patList[i]) == 0 ) ? 1 : 0;
}
void printStats(const uint32_t * const stats)
{
uint8_t i;
printf("Stats: \n");
for(i = 0; i < SZ; i++)
printf(" %3i%-3c%9i\n", patList[i], ':', stats[i]);
printf(" ---%-3c%9i\n", ':', stats[SZ]);
}
Both programs perform the same action: for each pattern in the pattern list (pl.h), count the input bytes that share no set bit with it.
Makefile to compile them:
###
CC = gcc
CFLAGS = -c -Wall
SPDUP = -O3
DEBUG = -ggdb3 -O0
EXECUTABLE = noloop
AUXEXEC = loop
LU_SCRIPT = ./lup.py
LU_HEADER = lu.h
LU_PATLIST_HEADER = pl.h
#LU_PATLIST = -p "{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }"
#LU_PATLIST = -p "{ 3, 6, 12, 15, 32, 48, 69, 254 }"
LU_PATLIST = -p "{ 3, 6, 12, 48 }"
#LU_PATLIST = -p "{ 1, 2 }"
#LU_PATLIST = -p "{ 1 }"
LU_FILE = -l $(LU_HEADER)
LU_PAT_FILE = -f $(LU_PATLIST_HEADER)
SRC= noloop.c loop.c
SOURCE = $(EXECUTABLE).c
OBJECTS = $(SOURCE:.c=.o)
AUXSRC = $(AUXEXEC).c
AUXOBJ = $(AUXSRC:.c=.o)
all: $(EXECUTABLE) $(AUXEXEC)
lookup:
$(LU_SCRIPT) $(LU_PATLIST) $(LU_FILE) $(LU_PAT_FILE)
touch $(SRC)
$(EXECUTABLE): lookup $(OBJECTS)
$(CC) $(OBJECTS) -o $@
$(AUXEXEC): $(AUXOBJ)
$(CC) $(AUXOBJ) -o $@
.c.o:
$(CC) $(CFLAGS) $(SPDUP) -c $<
debug: lookup dbg
$(CC) $(OBJECTS) -o $(EXECUTABLE)
$(CC) $(AUXOBJ) -o $(AUXEXEC)
dbg: *.c
$(CC) $(CFLAGS) $(DEBUG) -c $<
clean:
rm -f $(EXECUTABLE) $(AUXEXEC) *.o &> /dev/null
.PHONY: clean
I've used three plain-text inputs: the GPL v3 text, the Holy Bible text, and the Linux kernel sources concatenated with a recursive cat.
Running this code with different pattern lists gives me these results:
Sat Sep 24 15:03:18 CEST 2011
Test1: test/gpl.txt (size: 35147)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 18917
2: 22014
3: 12423
4: 19015
5: 11111
6: 12647
7: 7791
8: 23498
9: 13637
10: 16032
11: 9997
12: 14059
13: 9225
14: 8609
15: 6629
16: 25610
---: 0
real 0m0.016s
user 0m0.008s
sys 0m0.016s
Loop version:
------------------------
Stats:
1: 18917
2: 22014
3: 12423
4: 19015
5: 11111
6: 12647
7: 7791
8: 23498
9: 13637
10: 16032
11: 9997
12: 14059
13: 9225
14: 8609
15: 6629
16: 25610
---: 0
real 0m0.020s
user 0m0.020s
sys 0m0.008s
Test2: test/HolyBible.txt (size: 5918239)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 3392095
2: 3970343
3: 2325421
4: 3102869
5: 1973137
6: 2177366
7: 1434363
8: 3749010
9: 2179167
10: 2751134
11: 1709076
12: 2137823
13: 1386038
14: 1466132
15: 1072405
16: 4445367
---: 3310
real 0m1.048s
user 0m1.044s
sys 0m0.012s
Loop version:
------------------------
Stats:
1: 3392095
2: 3970343
3: 2325421
4: 3102869
5: 1973137
6: 2177366
7: 1434363
8: 3749010
9: 2179167
10: 2751134
11: 1709076
12: 2137823
13: 1386038
14: 1466132
15: 1072405
16: 4445367
---: 3310
real 0m0.926s
user 0m0.924s
sys 0m0.016s
Test3: test/linux-kernel-3.0.4 (size: 434042620)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 222678565
2: 254789058
3: 137364784
4: 239010012
5: 133131414
6: 146334792
7: 83232971
8: 246531446
9: 145867949
10: 161728907
11: 103142808
12: 147836792
13: 93927370
14: 87122985
15: 66624721
16: 275921653
---: 16845505
real 2m22.900s
user 3m43.686s
sys 1m14.613s
Loop version:
------------------------
Stats:
1: 222678565
2: 254789058
3: 137364784
4: 239010012
5: 133131414
6: 146334792
7: 83232971
8: 246531446
9: 145867949
10: 161728907
11: 103142808
12: 147836792
13: 93927370
14: 87122985
15: 66624721
16: 275921653
---: 16845505
real 2m42.560s
user 3m56.011s
sys 1m26.037s
Test4: test/gpl.txt (size: 35147)
---------------------------------------------------
Look up table version:
------------------------
Stats:
3: 12423
6: 12647
12: 14059
15: 6629
32: 2338
48: 1730
69: 6676
254: 0
---: 11170
real 0m0.011s
user 0m0.004s
sys 0m0.016s
Loop version:
------------------------
Stats:
3: 12423
6: 12647
12: 14059
15: 6629
32: 2338
48: 1730
69: 6676
254: 0
---: 11170
real 0m0.021s
user 0m0.020s
sys 0m0.008s
Test5: test/HolyBible.txt (size: 5918239)
---------------------------------------------------
Look up table version:
------------------------
Stats:
3: 2325421
6: 2177366
12: 2137823
15: 1072405
32: 425404
48: 397564
69: 1251668
254: 0
---: 1781959
real 0m0.969s
user 0m0.936s
sys 0m0.048s
Loop version:
------------------------
Stats:
3: 2325421
6: 2177366
12: 2137823
15: 1072405
32: 425404
48: 397564
69: 1251668
254: 0
---: 1781959
real 0m1.447s
user 0m1.424s
sys 0m0.032s
Test6: test/linux-kernel-3.0.4 (size: 434042620)
---------------------------------------------------
Look up table version:
------------------------
Stats:
3: 137364784
6: 146334792
12: 147836792
15: 66624721
32: 99994388
48: 64451562
69: 89249942
254: 5712
---: 105210728
real 2m38.851s
user 3m37.510s
sys 1m26.653s
Loop version:
------------------------
Stats:
3: 137364784
6: 146334792
12: 147836792
15: 66624721
32: 99994388
48: 64451562
69: 89249942
254: 5712
---: 105210728
real 2m32.041s
user 3m36.022s
sys 1m27.393s
Test7: test/gpl.txt (size: 35147)
---------------------------------------------------
Look up table version:
------------------------
Stats:
3: 12423
6: 12647
12: 14059
48: 1730
---: 11277
real 0m0.013s
user 0m0.016s
sys 0m0.004s
Loop version:
------------------------
Stats:
3: 12423
6: 12647
12: 14059
48: 1730
---: 11277
real 0m0.014s
user 0m0.020s
sys 0m0.000s
Test8: test/HolyBible.txt (size: 5918239)
---------------------------------------------------
Look up table version:
------------------------
Stats:
3: 2325421
6: 2177366
12: 2137823
48: 397564
---: 1850018
real 0m0.933s
user 0m0.916s
sys 0m0.036s
Loop version:
------------------------
Stats:
3: 2325421
6: 2177366
12: 2137823
48: 397564
---: 1850018
real 0m0.892s
user 0m0.860s
sys 0m0.052s
Test9: test/linux-kernel-3.0.4 (size: 434042620)
---------------------------------------------------
Look up table version:
------------------------
Stats:
3: 137364784
6: 146334792
12: 147836792
48: 64451562
---: 132949214
real 2m31.187s
user 3m31.289s
sys 1m25.909s
Loop version:
------------------------
Stats:
3: 137364784
6: 146334792
12: 147836792
48: 64451562
---: 132949214
real 2m34.942s
user 3m33.081s
sys 1m24.381s
Test10: test/gpl.txt (size: 35147)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 18917
2: 22014
---: 6639
real 0m0.014s
user 0m0.016s
sys 0m0.008s
Loop version:
------------------------
Stats:
1: 18917
2: 22014
---: 6639
real 0m0.017s
user 0m0.016s
sys 0m0.008s
Test11: test/HolyBible.txt (size: 5918239)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 3392095
2: 3970343
---: 881222
real 0m0.861s
user 0m0.848s
sys 0m0.032s
Loop version:
------------------------
Stats:
1: 3392095
2: 3970343
---: 881222
real 0m0.781s
user 0m0.760s
sys 0m0.044s
Test12: test/linux-kernel-3.0.4 (size: 434042620)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 222678565
2: 254789058
---: 84476465
real 2m29.894s
user 3m30.449s
sys 1m23.177s
Loop version:
------------------------
Stats:
1: 222678565
2: 254789058
---: 84476465
real 2m21.103s
user 3m22.321s
sys 1m24.001s
Test13: test/gpl.txt (size: 35147)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 18917
---: 16230
real 0m0.015s
user 0m0.020s
sys 0m0.008s
Loop version:
------------------------
Stats:
1: 18917
---: 16230
real 0m0.016s
user 0m0.016s
sys 0m0.008s
Test14: test/HolyBible.txt (size: 5918239)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 3392095
---: 2526144
real 0m0.811s
user 0m0.808s
sys 0m0.024s
Loop version:
------------------------
Stats:
1: 3392095
---: 2526144
real 0m0.709s
user 0m0.688s
sys 0m0.040s
Test15: test/linux-kernel-3.0.4 (size: 434042620)
---------------------------------------------------
Look up table version:
------------------------
Stats:
1: 222678565
---: 201900739
real 2m21.510s
user 3m23.009s
sys 1m23.861s
Loop version:
------------------------
Stats:
1: 222678565
---: 201900739
real 2m22.677s
user 3m26.477s
sys 1m23.385s
Sat Sep 24 15:28:28 CEST 2011
Conclusions:
In my opinion, using a lookup table improves execution time at the cost of
larger code size, but the improvement is not very significant.
To start noticing a difference, the amount of input data has to be
huge.

Resources