Swap the 2 columns in output ---- Sed or Awk? - shell

Input file:
GET /static_register_ad_request_1_2037_0_0_0_1_1_4_8335086462.gif?pa=99439_50491&country=US&state_fips_code=US_CA&city_name=Los%2BAngeles&dpId=2&dmkNm=apple&dmlNm=iPod%2Btouch&osNm=iPhone%2BOS&osvNm=5.1.1&bNm=Safari&bvNm=null&spNm=SBC%2BInternet%2BServices&kv=0_0&sessionId=0A80187E0138A0AE42E4DE3F783E7A08&sdk_version=4.0.5.6%20&domain=805AOEtUaMu&ad_catalog=99439_50491&make=APPLE&width=320&height=460&slot_type=PREROLL&model=iPod%20touch%205.1.1&iabcat=artsandentertainment&iabsubcat=music&age=113&gender=2&zip=92869 HTTP/1.1
Output file:
domain sdk_version
805AOEtUaMu 4.0.5.6%20
I could use sed -n 's/.*sdk_version=\([^&]*\).*domain=\([^&]*\).*/\1 \2/p' to get the result, but that puts sdk_version in the first column; what I need is to swap the sdk_version and domain columns in the output file.
Could anyone help me with this? Thank you so much in advance :)

Just swap your backreferences:
sed -n 's/.*sdk_version=\([^&]*\).*domain=\([^&]*\).*/\2 \1/p'
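For example, against the sample request (assuming the log lines live in a file named access.log; the name is just for illustration):
sed -n 's/.*sdk_version=\([^&]*\).*domain=\([^&]*\).*/\2 \1/p' access.log
805AOEtUaMu 4.0.5.6%20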

One way using awk:
awk '
BEGIN {
    FS = "&"                       # split each request on "&"
}
{
    # in this log line, field 16 holds domain= and field 15 holds sdk_version=;
    # walk backwards so domain prints first, as requested
    for ( i = 16; i >= 15; i-- ) {
        split( $i, f, /=/ );
        printf( "%s ", f[2] );
    }
}
END {
    printf "\n";
}
' infile
Output:
805AOEtUaMu 4.0.5.6%20

If you want to handle arbitrary order, I would suggest switching to awk or Perl.
perl -ne 'm/[?&]domain=([^&]+)/ and $d = $1;
m/[?&]sdk_version=([^&]+)/ and $s = $1;
print "$d\t$s\n"' logfile
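For completeness, here is a rough awk sketch of the same order-independent idea; it splits each request on ?, & and spaces, then looks both keys up by name (assuming, as in the Perl version, that both parameters are present on every line):
awk -F '[?& ]' '{
    d = s = ""
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^domain=/)      d = substr($i, 8)     # strip "domain="
        if ($i ~ /^sdk_version=/) s = substr($i, 13)    # strip "sdk_version="
    }
    print d "\t" s
}' logfile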

Related

awk to process the first two lines then the next two and so on

Suppose I have a file which I created from two files, one old and the other updated, by using cat & sort on the primary key.
File1
102310863||7097881||6845193||271640||06007709532577||||
102310863||7097881||6845123||271640||06007709532577||||
102310875||7092992||6840808||023740||10034500635650||||
102310875||7092992||6840818||023740||10034500635650||||
So the pattern of this file is: line 1 = old value, line 2 = updated value, and so on.
Now I want to process the file in such a way that awk first processes the first two lines of the file, finds the difference, and then moves on to the next two lines.
The process is:
if($[old record]!=$[new record])
i= [new record]#[old record];
Desired output
102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||
$ cat tst.awk
BEGIN { FS="[|][|]"; OFS="||" }
NR%2 { split($0,old); next }
{
for (i=1;i<=NF;i++) {
if (old[i] != $i) {
$i = $i "#" old[i]
}
}
print
}
$
$ awk -f tst.awk file
102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||
This awk could help:
$ awk -F '\\|\\|' '{
    getline new;
    split(new, new_array, "\\|\\|");
    for (i=1; i<=NF; i++) {
        if ($i != new_array[i]) {
            $i = new_array[i] "#" $i;
        }
    }
} 1' OFS="||" < input_file
102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||
I think you are good enough in awk to understand the above code, so I'll skip the explanation.
Updated version, and thanks @martin for the double | trick:
$ cat join.awk
BEGIN { new=0; FS="[|]{2}"; OFS="||" }
new==0 {
    split($0, old_data, "[|]{2}")
    new=1
    next
}
new==1 {
    split($0, new_data, "[|]{2}")
    for (i = 1; i <= 7; i++) {
        if (new_data[i] != old_data[i]) new_data[i] = new_data[i] "#" old_data[i]
    }
    print new_data[1], new_data[2], new_data[3], new_data[4], new_data[5], new_data[6], new_data[7]
    new = 0
}
$ awk -f join.awk data.txt
102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||

How to rearrange text file output result

I would like to write a Unix script that does the following to produce the result below:
textfile1 contains the following text:
keyval1,1
keyval1,2
keyval1,3
keyval1,4
keyval2,1
keyval2,2
keyval3,1
keyval4,1
keyval4,3
keyval4,4
Expected result:
keyval1 (1,2,3,4)
keyval2 (1,2)
keyval3 (1)
keyval4 (1,3,4)
Thank you.
I'm new to unix and this is what I have done so far. It's not working yet though :(
#!/bin/ksh
f1 = 'cut -d "," -f 1 keyval.txt'
f2 = 'cut -d "," -f 2 keyval.txt'
while f1 <> f2
do
echo f1 "("f2")"
done > output.txt
You can do this in a breeze using AWK:
#!/usr/bin/awk -f
BEGIN {
    FS = ","
    closeBracket = ""
}
{
    if (key != $1)
    {
        key = $1
        printf "%s%s (%s", closeBracket, key, $2
    }
    else
    {
        printf ",%s", $2
    }
    closeBracket = ")\n"
}
END {
    printf "%s", closeBracket
}
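Assuming the script is saved as group.awk (a name used here just for illustration), running it over the sample file produces the expected grouping:
awk -f group.awk textfile1
keyval1 (1,2,3,4)
keyval2 (1,2)
keyval3 (1)
keyval4 (1,3,4)
Note that this relies on the input already being grouped by key, as in the sample.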
A bit late to the party, but I had this one lying around, almost:
#!/usr/bin/perl
while (<>)
{
    /(.+),(.+?)\s*$/;
    push @{$h{$1}}, $2;
}
print map {"$_ (" . join(',', @{$h{$_}}) . ")\n"} sort keys %h;
Not particularly beautiful, but it gets the job done.

how to sum each column in a file using bash

I have a file in the following format:
id_1,1,0,2,3,lable1
id_2,3,2,2,1,lable1
id_3,5,1,7,6,lable1
and I want the summation of each column (I have over 300 columns):
9,3,11,10,lable1
How can I do that using bash?
I tried using what is described here, but it didn't work.
Using awk:
$ awk -F, '{for (i=2;i<NF;i++)a[i]+=$i}END{for (i=2;i<NF;i++) printf a[i]",";print $NF}' file
9,3,11,10,lable1
This will print the sum of each column (from i=2 .. i=n-1) as a comma-separated line, followed by the value of the last column from the last row (i.e. lable1).
If the totals would need to be grouped by the label in the last column, you could try this:
awk -F, '
{
    L[$NF]                                # remember each label seen
    for (i=2; i<NF; i++) T[$NF,i] += $i   # totals keyed by label and column
}
END {
    for (i in L) {
        s = i
        for (j=NF-1; j>1; j--) s = T[i,j] FS s
        print s
    }
}
' file
If the labels in the last column are sorted then you could try without arrays and save memory:
awk -F, '
function labelsum() {
    s = p
    for (i=NF-1; i>1; i--) s = T[i] FS s
    print s
    split(x, T)               # empty split: clears the totals array
}
p != $NF {
    if (p) labelsum()
    p = $NF
}
{
    for (i=2; i<NF; i++) T[i] += $i
}
END {
    labelsum()
}
' file
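For the sample input above, which contains only the label lable1, both grouped versions print a single line:
9,3,11,10,lable1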
Here's a Perl one-liner:
<file perl -lanF, -E 'for ( 0 .. $#F ) { $sums{ $_ } += $F[ $_ ]; } END { say join ",", map { $sums{ $_ } } sort keys %sums; }'
It will only do sums, so the first and last column in your example will be 0.
This version will follow your example output:
<file perl -lanF, -E 'for ( 1 .. $#F - 1 ) { $sums{ $_ } += $F[ $_ ]; } END { $sums{ $#F } = $F[ -1 ]; say join ",", map { $sums{ $_ } } sort keys %sums; }'
A modified version based on the solution you linked:
#!/bin/bash
colnum=6            # total number of columns in the file
filename="temp"
for ((i=2; i<colnum; ++i))
do
    # join column i with "+" and let bc do the addition
    sum=$(cut -d ',' -f $i $filename | paste -sd+ | bc)
    echo -n $sum','
done
head -1 $filename | cut -d ',' -f $colnum   # append the label column
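With the sample file saved as temp (the name hard-coded above), this prints 9,3,11,10,lable1 as well.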
Pure bash solution:
#!/usr/bin/bash
while IFS=, read -r -a arr
do
    for ((i=1; i<${#arr[*]}-1; i++))
    do
        ((farr[i] += arr[i]))       # running total per column
    done
    farr[i]=${arr[i]}               # keep the label from the last column
done < file
(IFS=,; echo "${farr[*]}")

Shell programming -- Extract values of 2 keywords

Input file (HTTP request log file):
GET /dynamic_branding_playlist.fmil?domain=915oGLbNZhb&pluginVersion=3.2.7_2.6&pubchannel=usa&sdk_ver=2.4.6.3&width=680&height=290&embeddedIn=http%3A%2F%2Fviewster.com%2Fsplash%2FOscar-Videos-1.aspx%3Futm_source%3Dadon_272024_113535_24905_24905%26utm_medium%3Dcpc%26utm_campaign%3DUSYME%26adv%3D573900%26req%3D5006e9ce1ca8b26347b88a7.1.825&sdk_url=http%3A%2F%2Fdivaag.vo.llnwd.net%2Fo42%2Fhttp_only%2Fviewster_com%2Fv25%2Fyume%2F&viewport=42
Output file:
domain sdk_version
915oGLbNZhb 2.4.6.3
I have thousands of log lines similar to the example above, so I need to find a way to extract the values of domain and sdk_version. The positions of domain and sdk_version are not fixed: sometimes they appear in the 2nd field, sometimes in the last field (if split by &).
Could anyone help me with this problem (using the sed command)? Thanks so much in advance.
Using sed:
sed -n 's/.*domain=\([^&]*\).*sdk_ver=\([^&]*\).*/\1 \2/p' input_file
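Note that this regex assumes domain appears before sdk_ver on the line, as it does in the sample; if the two keys can occur in either order, the awk approach below handles that.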
This might work for you (GNU sed):
sed 's/.*\<\(domain\)=\([^&]*\).*\<\(sdk_ver\)=\([^&]*\).*/\1 \3sion\n\2 \4/p;d' file
Using awk:
BEGIN {
    FS = "[&?]"
    printf "domain\tsdk_version\n"
}
{
    for (i = 1; i <= NF; i++) {
        split($i, array, "=")
        if (array[1] == "domain") {
            printf "%s", array[2]       # print the value literally (it may contain "%")
        }
        if (array[1] == "sdk_ver") {
            printf "\t%s", array[2]
        }
    }
    printf "\n"
}
Or as a one-liner:
awk -F "[&?]" 'BEGIN { printf "domain\tsdk_version\n" } { for (i = 1; i <= NF; i++) { split ($i, array, "="); if (array[1] == "domain") printf array[2]; if (array[1] == "sdk_ver") printf "\t%s", array[2]; } printf "\n"; }' file.txt
Results:
domain sdk_version
915oGLbNZhb 2.4.6.3

change random line with shellscript

How can I easily (quick and dirty) change, say, 10 random lines of a file with a simple shell script?
I thought about abusing ed and generating random commands and line ranges, but I'd like to know if there is a better way.
awk 'BEGIN{ srand() }
{ lines[++c] = $0 }                  # read the whole file into memory
END{
    while (d < 10) {
        RANDOM = int(1 + rand() * c)
        if (!(RANDOM in r)) {
            r[RANDOM]                # mark this line number as used
            print "do something with " lines[RANDOM]
            ++d
        }
    }
}' file
or if you have the shuf command
shuf -n 10 "$file" | while read -r line
do
    sed -i "s/$line/replacement/" "$file"
done
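Note that sed treats $line as a regular expression here, so lines containing characters like /, ., or & would need escaping first; for a quick-and-dirty job on plain lines it should be fine.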
Playing off @Dennis' version, this will always output 10. Doing random numbers in a separate array could create duplicates and, consequently, fewer than 10 modifications.
file=~/testfile
c=$(wc -l < "$file")
awk -v c=$c '
BEGIN {
    srand();
    count = 10;
}
{
    if (c*rand() < count) {
        --count;
        print "do something with " $0;
    } else
        print;
    --c;
}
' "$file"
This seems to be quite a bit faster:
file=/your/input/file
c=$(wc -l < "$file")
awk -v c=$c 'BEGIN {
    srand();
    for (i=0; i<10; i++) lines[i] = int(1 + rand() * c);
    asort(lines);           # gawk-specific: sort the chosen line numbers
    p = 1
}
{
    if (NR == lines[p]) {
        ++p
        print "do something with " $0
    }
    else print
}' "$file"