Remove initial directives from preprocessor output - gcc

I have the following in test.c:
#if 1
foo boo bar
#endif
Then I run gcc like this:
gcc -E test.c -o test.pp
This is the test.pp output:
# 1 "test.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "test.c"
foo boo bar
Is there a way, using gcc flags, to remove these # [something] directives and leave only foo boo bar? That is, I'd like the preprocessor output to be just foo boo bar in this case.

Related

The format of the output of gcc preprocessing

I see the following output from gcc preprocessing. I can't find documentation of the output format. Could anybody let me know what it is? Thanks.
$ cat a.h
#include "b.h"
$ cat b.h
#define X Y
$ gcc -E -dD - <<< '#include "a.h"'
...
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "<stdin>" 2
# 1 "./a.h" 1
# 1 "./b.h" 1
#define X Y
# 2 "./a.h" 2
# 2 "<stdin>" 2
When I try the following, I see more numbers, which differ from those above. I'm not sure what they mean either.
$ gcc -E -dD - <<< '#include <sys/socket.h>'
...
# 19 "/usr/include/x86_64-linux-gnu/asm/posix_types_64.h" 2 3 4
# 8 "/usr/include/x86_64-linux-gnu/asm/posix_types.h" 2 3 4
# 37 "/usr/include/linux/posix_types.h" 2 3 4
# 6 "/usr/include/asm-generic/socket.h" 2 3 4
# 1 "/usr/include/x86_64-linux-gnu/asm/sockios.h" 1 3 4
# 1 "/usr/include/asm-generic/sockios.h" 1 3 4
...
The unusual lines are linemarkers, which specify a line number and file name. The numbers after the file name are special flags.
These are explained in GCC Preprocessor online documentation:
Source file name and line number information is conveyed by lines of
the form
# linenum filename flags
These are called linemarkers. They are inserted as needed into the
output (but never within a string or character constant). They mean
that the following line originated in file filename at line linenum.
filename will never contain any non-printing characters; they are
replaced with octal escape sequences.
After the file name comes zero or more flags, which are ‘1’, ‘2’, ‘3’,
or ‘4’. If there are multiple flags, spaces separate them. Here is
what the flags mean:
‘1’ This indicates the start of a new file.
‘2’ This indicates returning to a file (after having included another file).
‘3’ This indicates that the following text comes from a system header file, so certain warnings should be suppressed.
‘4’ This indicates that the following text should be treated as being wrapped in an implicit extern "C" block.

How would you structure Alpha Nodes in a Rete Network that has a rule with two conditions found in other rules?

Let's say I have three rules:
When Object's foo property is 1, output "foo"
When Object's bar property is 1, output "bar"
When Object's foo property is 1 and bar property is 1, output "both foo and bar"
What would the structure of alpha nodes look like for this scenario? I've seen examples where, given rules 1 and 2, it might look like:
       foo == 1 - "foo"
root <
       bar == 1 - "bar"
And, given 3:
root - foo == 1 - bar == 1 - "both foo and bar"
And, given 3 and 1:
                  "foo"
root - foo == 1 <
                  bar == 1 - "both foo and bar"
Given 3, 2 and 1, would it look something like:
       foo == 1 - "foo"
root <
                  "bar"
       bar == 1 <
                  foo == 1 - "both foo and bar"
or
        foo == 1 - "foo"
       /
root -- bar == 1 - "bar"
       \
        foo == 1 - bar == 1 - "both foo and bar"
Or some other way?
If you are sharing nodes and preserving the order in which properties are tested, it would look like this:
       bar == 1 - "bar"
root <
                  "foo"
       foo == 1 <
                  bar == 1 - "both foo and bar"

Split text from bash variable

I have a variable which has groups of numbers. It looks like this:
foo 3
foo 5
foo 2
bar 8
bar 8
baz 2
qux 3
qux 5
...
I would like to split this data so I can work on one 'group' at a time. I feel this would be achievable with a loop somehow. The end goal is to take the mean of each group, such that I could have:
foo 3.33
bar 8.50
baz 5.00
qux 4.00
...
The mean-taking step has already been implemented; I mention it only so the context is known.
It's important to note that each group (eg. foo, bar, baz) is of arbitrary length.
How would I go about splitting up these groups?
I would use awk (tested with the GNU version, gawk, but I believe this is portable) for both the collecting and the averaging. It's a standard POSIX tool, so it should be available on just about anything bash is installed on.
# print_avg.awk
{
    sums[$1] += $2
    counts[$1] += 1
}
END {
    for (key in sums)
        print key, sums[key] / counts[key]
}
data.txt:
foo 3
foo 5
bar 8
bar 8
baz 2
qux 3
qux 5
Run it like:
$ awk -f print_avg.awk data.txt
foo 4
baz 2
qux 4
bar 8
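If the data lives in a shell variable (as in the question) and you want the two-decimal, alphabetically ordered output, here is a sketch (the variable name data is assumed); note that awk's for (key in ...) iterates in unspecified order, hence the final sort:

```shell
data='foo 3
foo 5
foo 2
bar 8
bar 8
baz 2
qux 3
qux 5'

# same idea inline: accumulate sums and counts per group,
# format means to two decimals, then sort for stable output order
awk '{ sums[$1] += $2; counts[$1]++ }
     END { for (k in sums) printf "%s %.2f\n", k, sums[k] / counts[k] }' <<< "$data" | sort
```

(The question's sample averages don't all follow from the truncated sample data, so the numbers here reflect only the rows shown above.)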

Error when concatenate in macro gcc preprocessor

I'm getting an error when I try to use ## in a macro. Here is what I'm trying to do.
With these defines:
#define PORT 2
#define PIN 3
I want that preprocessor generates:
PM2.3=1
when I call a macro like this:
SetPort(PORT,PIN)
Then, I read that I can't substitute PORT and PIN at the same time as the concatenation, so I think I must use two defines:
#define SetP2(PORT,PIN) PM##PORT.PIN = 1
#define SetPort(PORT,PIN) SetP2(PORT,PIN)
but I get an error on:
#define PIN 3 --> expected identifier before numeric constant
and a warning on:
SetPort(PORT,PIN) --> Syntax error
Any idea?
This works for me:
$ cat portpin.c
#define PORT 2
#define PIN 3
#define SetP2(prefix,prt) prefix ## prt
#define SetPort(prt,pn) SetP2(PM,prt).pn = 1
SetPort(PORT,PIN)
$ gcc -E portpin.c
# 1 "portpin.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "portpin.c"
PM2. 3 = 1
$
I don't know how important it is for there to be no space between the . and the 3, but the preprocessor seems to want to insert it (presumably so the output re-lexes as the same tokens: without the space, .3 would be read back as a single pp-number).
UPDATE:
Actually I tried your original code, and it seems to produce the same result, so my answer above is probably not much use to you.
UPDATE 2:
It turns out the OP is expecting the pre-processor to generate PM2.no3=1 and not PM2.3=1. This can easily be done as follows:
$ cat portpin.c
#define PORT 2
#define PIN 3
#define SetP2(PORT,PIN) PM##PORT.no##PIN=1
#define SetPort(PORT,PIN) SetP2(PORT,PIN)
SetPort(PORT,PIN)
$ gcc -E portpin.c
# 1 "portpin.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "portpin.c"
PM2.no3=1
$

Perform highly customized sort based on multiple columns of a CSV file?

I have a four-column CSV file, using # as the separator, e.g.:
0001 # fish # animal # eats worms
The first column is the only column guaranteed to be unique.
I need to perform three sort operations, involving columns 2, 3, and 4.
First, column 2 is sorted alphanumerically. The important feature of this sort is that it must guarantee any duplicate entries within column 2 end up next to each other, e.g.:
# a # #
# a # #
# a # #
# a # #
# a # #
# b # #
# b # #
# c # #
# c # #
# c # #
# c # #
# c # #
Next, within the first sort, sort the lines into two categories. The first lines are those which do not contain the words “arch.”, “var.”, “ver.”, “anci.” or “fam.” anywhere within column 4. The second lines (which are sorted after), are those containing those words, e.g.:
# a # # Does not have one of those words.
# a # # Does not have one of those words.
# a # # Does not have one of those words.
# a # # Does not have one of those words.
# a # # This sentence contains arch.
# b # # Does not have one of those words.
# b # # Has the word ver.
# c # # Does not have one of those words.
# c # # Does not have one of those words.
# c # # Does not have one of those words.
# c # # This sentence contains var.
# c # # This sentence contains fam.
# c # # This sentence contains fam.
Finally, sorting only within the separate categories of the second sort, sort the lines from “contains the most duplicate entries within column 3” to “contains the least number of duplicate entries within column 3”, e.g.:
# a # fish # Does not have one of those words.
# a # fish # Does not have one of those words.
# a # fish # Does not have one of those words.
# a # tiger # Does not have one of those words.
# a # bear # This sentence contains arch.
# b # fish # Does not have one of those words.
# b # fish # Has the word ver.
# c # bear # Does not have one of those words.
# c # bear # Does not have one of those words.
# c # fish # Does not have one of those words.
# c # tiger # This sentence contains var.
# c # tiger # This sentence contains fam.
# c # bear # This sentence contains fam.
How can I sort the file alphanumerically by column 2, by the appearance of some key words in column 4, and by most common duplicate to least common duplicate in column 3?
TXR (http://www.nongnu.org/txr):
#(bind special-words ("arch." "var." "ver." "anci." "fam."))
#(bind ahash #(hash :equal-based))
#(repeat)
#id ## #alpha ## #animal ## #words
# (rebind words #(split-str words " "))
# (bind record (id alpha animal words))
# (do (push record [ahash alpha]))
#(end)
#(bind sorted-rec-groups nil)
#(do
(defun popularity-sort (recs)
(let ((histogram [group-reduce (hash)
third (do inc #1)
recs 0]))
[sort recs > [chain third histogram]]))
(dohash (key records ahash)
(let (contains does-not combined)
(each* ((r records)
(w [mapcar fourth r]))
(if (isec w special-words)
(push r contains)
(push r does-not)))
(push (append (popularity-sort does-not)
(popularity-sort contains))
sorted-rec-groups)))
(set sorted-rec-groups [sort sorted-rec-groups :
[chain first second]]))
#(output)
# (repeat)
# (repeat)
#(rep)#{sorted-rec-groups} ## #(last)#{sorted-rec-groups " "}#(end)
# (end)
# (end)
#(end)
Data:
0001 # b # fish # Does not have one of those words.
0002 # a # bear # Does not have one of those words.
0003 # b # bear # Has the word ver.
0004 # a # fish # Does not have one of those words.
0005 # c # bear # Does not have one of those words.
0006 # c # bear # Does not have one of those words.
0007 # a # fish # Does not have one of those words.
0008 # c # fish # Does not have one of those words.
0009 # a # fish # Does not have one of those words.
0010 # c # tiger # This sentence contains var.
0011 # c # bear # This sentence contains fam.
0012 # a # fish # Does not have one of those words.
0013 # c # tiger # This sentence contains fam.
Run:
$ txr sort.txr data.txt
0004 # a # fish # Does not have one of those words.
0007 # a # fish # Does not have one of those words.
0009 # a # fish # Does not have one of those words.
0012 # a # fish # Does not have one of those words.
0002 # a # bear # Does not have one of those words.
0001 # b # fish # Does not have one of those words.
0003 # b # bear # Has the word ver.
0005 # c # bear # Does not have one of those words.
0006 # c # bear # Does not have one of those words.
0008 # c # fish # Does not have one of those words.
0010 # c # tiger # This sentence contains var.
0013 # c # tiger # This sentence contains fam.
0011 # c # bear # This sentence contains fam.
Here's an answer to your first question to help you get started:
sort data -t "#" -k 2,2 -k 3,4
How it works:
-t specifies the field separator which for you is the "#" sign.
-k 2,2 means sort on field two
-k 3,4 means break ties with a secondary key spanning fields 3 through 4
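Building on that, the full three-stage ordering can be sketched as a decorate-sort-undecorate pipeline: one awk pass counts duplicates of column 3 within each (column 2, keyword flag) group, a second pass prepends the three sort keys, and sort/cut finish the job. Assumptions: a \001 byte never occurs in the data, bash's $'...' quoting is available, and data.txt holds the sample data from the question:

```shell
awk -F' # ' '
    NR == FNR {   # pass 1: count duplicates of column 3 per (column 2, flag)
        flag = ($4 ~ /(arch|var|ver|anci|fam)\./) ? 1 : 0
        dup[$2 SUBSEP flag SUBSEP $3]++
        next
    }
    {             # pass 2: prepend keys; the dup count is inverted so that
                  # larger counts sort first under an ascending sort
        flag = ($4 ~ /(arch|var|ver|anci|fam)\./) ? 1 : 0
        printf "%s\001%d\001%07d\001%s\n", \
               $2, flag, 9999999 - dup[$2 SUBSEP flag SUBSEP $3], $0
    }
' data.txt data.txt | sort -t $'\001' -k1,1 -k2,2n -k3,3n | cut -d $'\001' -f4
```

Remaining ties fall through to sort's whole-line comparison, which here means the unique column-1 id, so the output is deterministic.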
Here's a solution in Ruby.
#!/usr/bin/env ruby

class Row
  SEPARATOR = " # "

  attr_accessor :cols

  def initialize(text)
    @cols = text.chomp.split(SEPARATOR)
    @cols.size == 4 or raise "Expected text to have four columns: #{text}"
    duplicate_increment
  end

  def has_words?
    cols[3] =~ /arch\.|var\.|ver\.|anci\.|fam\./ ? true : false
  end

  def to_s
    SEPARATOR +
      @cols[1, 3].join(SEPARATOR) +
      " -- id:#{cols[0]} duplicates:#{duplicate_count}"
  end

  ### Comparison

  def <=>(other)
    other or raise "Expected other to exist"
    cmp = self.cols[1] <=> other.cols[1]
    return cmp if cmp != 0
    cmp = (self.has_words? ? 1 : -1) <=> (other.has_words? ? 1 : -1)
    return cmp if cmp != 0
    other.duplicate_count <=> self.duplicate_count
  end

  ### Track duplicate entries

  @@duplicate_count = Hash.new { |h, k| h[k] = 0 }

  def duplicate_key
    [cols[1], has_words?]
  end

  def duplicate_count
    @@duplicate_count[duplicate_key]
  end

  def duplicate_increment
    @@duplicate_count[duplicate_key] += 1
  end
end

### Main

lines = ARGF
rows = lines.map { |line| Row.new(line) }
sorted_rows = rows.sort
sorted_rows.each { |row| puts row }
Input:
0001 # b # fish # text
0002 # a # bear # text
0003 # b # bear # ver.
0004 # a # fish # text
0005 # c # bear # text
0006 # c # bear # text
0007 # a # fish # text
0008 # c # fish # text
0009 # a # fish # text
0010 # c # lion # var.
0011 # c # bear # fam.
0012 # a # fish # text
0013 # c # lion # fam.
Output:
$ cat data.txt | ./sorter.rb
# a # fish # text -- id:0007 duplicates:5
# a # bear # text -- id:0002 duplicates:5
# a # fish # text -- id:0012 duplicates:5
# a # fish # text -- id:0004 duplicates:5
# a # fish # text -- id:0009 duplicates:5
# b # fish # text -- id:0001 duplicates:1
# b # bear # ver. -- id:0003 duplicates:1
# c # bear # text -- id:0005 duplicates:3
# c # fish # text -- id:0008 duplicates:3
# c # bear # text -- id:0006 duplicates:3
# c # lion # var. -- id:0010 duplicates:3
# c # bear # fam. -- id:0011 duplicates:3
# c # lion # fam. -- id:0013 duplicates:3
q:
First, I load the "csv" and get it into the right shape. The test data is called "worms" on my computer, but because q doesn't use strings as the file-name "type" (to protect against e.g. injection attacks), I need to use hsym to make a "file name":
t:flip `id`a`b`c!("SSSS";"#")0:hsym`worms;
Then I worked on which "fourth field" entries contained one of your words. I built a bitmap using like, applying it to each row (left) then each pattern (right), to get 0 where no word is present, or 1 where one of them is:
t:update p:any each c like/:\:("*arch.*";"*var.*";"*ver.*";"*anci.*";"*fam.*") from t;
Then I want to find the number of duplicates. This is simply the count of rows by column 2 (a), column 3 (b) and within the present-category:
t:update d:neg count i by a,b,p from t;
Finally, because I negated the count, all of my values "go the same way", so I can simply sort by those three columns:
`a`p`d xasc t
This might work for you (very inelegant!):
sed 's/[^#]*#\([^#\]*\)#\([^#]*\)/\1\t\2\t&/;h;s/#/&\n/3;s/.*\n//;/\(arch\|var\|ver\|anci\|fam\)\./!ba;s/.*/1/;bb;:a;s/.*/0/;:b;G;s/\(.\)\n\([^\t]*\)/\2\t\1/' file |
sort |
tee file1 |
sed 's/\(.*\)\t.*/\1/' |
uniq -c |
sed 's|^\s*\(\S*\) \(.*\t.*\t\(.*\)\)|/^\2/s/\3/\1/|' >file.sed
sed -f file.sed file1 |
sort -k1,2 -k3,3nr |
sed 's/\t/\n/3;s/.*\n//'
# a # fish # Does not have one of those words.
# a # fish # Does not have one of those words.
# a # fish # Does not have one of those words.
# a # tiger # Does not have one of those words.
# a # bear # This sentence contains arch.
# b # fish # Does not have one of those words.
# b # fish # Has the word ver.
# c # bear # Does not have one of those words.
# c # bear # Does not have one of those words.
# c # fish # Does not have one of those words.
# c # tiger # This sentence contains var.
# c # tiger # This sentence contains fam.
# c # bear # This sentence contains fam.
Explanation:
Make sort keys consisting of:
The 2nd field
0/1: 0 represents 4th field without arch./var./etc. 1 represents those with.
The count of 3rd field duplicates after sorting the above 2.
The file is eventually sorted using the above keys and then the keys deleted.

Resources