Text processing using bash

I have a vmstat dump file that has the header and values in this format
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
12 0 5924396 20810624 548548 935160 0 0 0 5 0 0 60 5 34 0 0
12 0 5924396 20768588 548548 935160 0 0 0 0 1045 1296 99 0 0 0 0
12 0 5924396 20768968 548548 935452 0 0 0 32 1025 1288 100 0 0 0 0
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
4 0 5924396 20768724 548552 935408 0 0 0 92 1093 1377 33 0 67 0 0
I'm writing a bash script that extracts just the lines containing numbers (i.e. the values) and removes all lines containing letters. How do I do that?

If you need lines with numbers and tabs/spaces only, grep -P "^[0-9\ \t]*$" should help you:
$ grep -P "^[0-9\ \t]*$" ./text
12 0 5924396 20810624 548548 935160 0 0 0 5 0 0 60 5 34 0 0
12 0 5924396 20768588 548548 935160 0 0 0 0 1045 1296 99 0 0 0 0
12 0 5924396 20768968 548548 935452 0 0 0 32 1025 1288 100 0 0 0 0
4 0 5924396 20768724 548552 935408 0 0 0 92 1093 1377 33 0 67 0 0
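Note that grep -P requires GNU grep built with PCRE support, which isn't available everywhere (BusyBox and BSD grep lack it). As a sketch of a POSIX-compatible alternative, grep -E with a bracket expression does the same job; the file name text and the inline sample are stand-ins for your real dump. Using + instead of * also avoids matching blank lines.

```shell
# Create a small stand-in for the vmstat dump, then keep only the value lines.
cat > text <<'EOF'
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
12 0 5924396 20810624 548548 935160 0 0 0 5 0 0 60 5 34 0 0
EOF
# [0-9[:blank:]] matches digits, spaces, and tabs; + (not *) skips empty lines.
grep -E '^[0-9[:blank:]]+$' text
# prints: 12 0 5924396 20810624 548548 935160 0 0 0 5 0 0 60 5 34 0 0
```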

grep "[0-9]" filename (note this keeps any line containing at least one digit, which happens to work here because the header lines contain none)


Generating and plotting an event window relative to a shock (Repost)

Repost from: https://www.statalist.org/forums/forum/general-stata-discussion/general/1648042-generating-and-plotting-of-an-event-window-relative-to-a-shock
Dear all,
I am (still) struggling with the generation of an event_window variable (relative to the time of the event). The esplot package (by Dylan Balla-Elliott) defines event_window as follows.
event_indicator = <current_time> == <time of event>
event_time = <current_time> - <time of event>
Here is a data example, with a time variable, a continuous variable, and a set of event indicator dummies (which are basically random shocks).
* Example generated by -dataex-. For more info, type help dataex
clear
input str7 modate float epeu_lvl byte(cop_shock unpri_reg_shock eu_reg_shock) float tid
"2004/1" 75.34063 0 0 0 1
"2004/2" 76.99823 0 0 0 2
"2004/3" 125.02164 0 0 0 3
"2004/4" 109.83804 0 0 0 4
"2004/5" 114.84982 0 0 0 5
"2004/6" 99.84531 0 0 0 6
"2004/7" 115.9254 0 0 0 7
"2004/8" 77.3424 0 0 0 8
"2004/9" 89.59677 0 0 0 9
"2004/10" 120.00146 0 0 0 10
"2004/11" 127.93832 0 0 0 11
"2004/12" 83.33497 1 0 1 12
"2005/1" 58.94662 0 0 0 13
"2005/2" 74.97708 0 0 0 14
"2005/3" 81.45479 0 0 0 15
"2005/4" 89.07868 0 0 0 16
"2005/5" 99.44091 0 0 0 17
"2005/6" 99.41497 0 0 0 18
"2005/7" 85.08384 0 0 0 19
"2005/8" 82.83349 0 0 0 20
"2005/9" 160.47383 0 0 0 21
"2005/10" 71.51886 0 0 0 22
"2005/11" 95.44765 0 0 0 23
"2005/12" 61.47662 1 0 1 24
"2006/1" 83.96114 0 0 0 25
"2006/2" 60.63415 0 0 0 26
"2006/3" 79.82993 0 0 0 27
"2006/4" 89.04356 0 0 0 28
"2006/5" 82.44514 0 0 0 29
"2006/6" 89.85152 0 0 0 30
"2006/7" 82.00437 0 0 0 31
"2006/8" 58.86663 0 0 0 32
"2006/9" 76.82971 0 0 0 33
"2006/10" 71.2218 0 0 0 34
"2006/11" 73.84509 1 0 0 35
"2006/12" 74.91799 0 0 0 36
"2007/1" 62.33881 0 0 0 37
"2007/2" 58.51786 0 0 0 38
"2007/3" 71.11645 0 0 0 39
"2007/4" 65.16531 0 0 0 40
"2007/5" 54.99327 0 0 0 41
"2007/6" 60.84606 0 0 0 42
"2007/7" 47.69234 0 0 0 43
"2007/8" 94.66286 0 0 0 44
"2007/9" 166.7332 0 0 0 45
"2007/10" 96.88046 0 0 0 46
"2007/11" 97.73734 0 0 0 47
"2007/12" 98.01473 1 0 1 48
"2008/1" 160.25905 0 0 1 49
"2008/2" 128.78455 0 0 0 50
"2008/3" 139.87073 0 0 0 51
"2008/4" 96.74758 0 0 0 52
"2008/5" 76.82344 0 0 0 53
"2008/6" 106.42784 0 0 0 54
"2008/7" 87.93302 0 0 0 55
"2008/8" 92.29639 0 0 0 56
"2008/9" 156.0435 0 0 0 57
"2008/10" 216.5918 0 0 0 58
"2008/11" 156.77446 1 0 0 59
"2008/12" 136.78456 0 0 0 60
"2009/1" 159.99384 0 0 0 61
"2009/2" 139.69698 0 0 0 62
"2009/3" 133.46071 0 0 0 63
"2009/4" 119.9992 0 0 1 64
"2009/5" 122.9601 0 0 0 65
"2009/6" 113.23891 0 0 0 66
"2009/7" 95.94823 0 0 0 67
"2009/8" 91.37744 0 0 0 68
"2009/9" 104.3236 0 0 0 69
"2009/10" 105.04014 0 0 0 70
"2009/11" 133.00749 1 0 1 71
"2009/12" 115.2626 0 0 1 72
"2010/1" 142.00356 0 0 0 73
"2010/2" 136.73906 0 0 0 74
"2010/3" 137.8383 0 0 0 75
"2010/4" 152.78447 0 0 0 76
"2010/5" 203.30525 0 0 0 77
"2010/6" 171.40266 0 0 1 78
"2010/7" 186.55524 0 0 0 79
"2010/8" 172.81606 0 0 0 80
"2010/9" 161.69014 0 0 0 81
"2010/10" 186.1411 0 1 0 82
"2010/11" 172.68817 1 0 0 83
"2010/12" 183.076 0 0 0 84
"2011/1" 143.03174 0 0 0 85
"2011/2" 122.44579 0 0 0 86
"2011/3" 154.4015 0 0 0 87
"2011/4" 145.5086 0 0 0 88
"2011/5" 134.21507 0 0 1 89
"2011/6" 168.2959 0 0 0 90
"2011/7" 183.40234 0 0 0 91
"2011/8" 230.29893 0 0 0 92
"2011/9" 280.05814 0 0 0 93
"2011/10" 241.75185 0 0 0 94
"2011/11" 304.60022 1 0 0 95
"2011/12" 228.8716 0 0 0 96
"2012/1" 216.73445 0 0 0 97
"2012/2" 193.44435 0 0 0 98
"2012/3" 177.4927 0 0 0 99
"2012/4" 216.99586 0 0 0 100
end
At a glance, I thought to create a loop that generates event_window. But some questions arise about how to handle the variable with two sequential shocks (i.e. in 2009/11 and 2009/12 for eu_reg_shock), or where two or more shocks fall within the time window. If the window is too large, it will be problematic, I assume.
My main goal is to analyze whether these shocks affect the continuous variable before and after the event. Ideally, I need to normalize the continuous variable (to a mean of one) before the shock. Here are the study and the plot that I wish to replicate, from Scott R. Baker, Nicholas Bloom, and Stephen J. Terry (2022).
I thought about the following plot, but I have no idea about the normalization part.
graph bar (mean) epeu_lvl, over(event_time)
References:
Scott R. Baker, Nicholas Bloom, and Stephen J. Terry (2022). https://www.nber.org/papers/w27167
Dylan Balla-Elliott. https://dballaelliott.github.io/esplot/index.html

AWK Formatting Using First Row as a Header and Iterating by column

I'm struggling to format a collectd plot file so I can later import it into an InfluxDB instance.
This is what the file looks like:
#Date Time [CPU]User% [CPU]Nice% [CPU]Sys% [CPU]Wait% [CPU]Irq% [CPU]Soft% [CPU]Steal% [CPU]Idle% [CPU]Totl% [CPU]Intrpt/sec [CPU]Ctx/sec [CPU]Proc/sec [CPU]ProcQue [CPU]ProcRun [CPU]L-Avg1 [CPU]L-Avg5 [CPU]L-Avg15 [CPU]RunTot [CPU]BlkTot [MEM]Tot [MEM]Used [MEM]Free [MEM]Shared [MEM]Buf [MEM]Cached [MEM]Slab [MEM]Map [MEM]Anon [MEM]Commit [MEM]Locked [MEM]SwapTot [MEM]SwapUsed [MEM]SwapFree [MEM]SwapIn [MEM]SwapOut [MEM]Dirty [MEM]Clean [MEM]Laundry [MEM]Inactive [MEM]PageIn [MEM]PageOut [MEM]PageFaults [MEM]PageMajFaults [MEM]HugeTotal [MEM]HugeFree [MEM]HugeRsvd [MEM]SUnreclaim [SOCK]Used [SOCK]Tcp [SOCK]Orph [SOCK]Tw [SOCK]Alloc [SOCK]Mem [SOCK]Udp [SOCK]Raw [SOCK]Frag [SOCK]FragMem [NET]RxPktTot [NET]TxPktTot [NET]RxKBTot [NET]TxKBTot [NET]RxCmpTot [NET]RxMltTot [NET]TxCmpTot [NET]RxErrsTot [NET]TxErrsTot [DSK]ReadTot [DSK]WriteTot [DSK]OpsTot [DSK]ReadKBTot [DSK]WriteKBTot [DSK]KbTot [DSK]ReadMrgTot [DSK]WriteMrgTot [DSK]MrgTot [INODE]NumDentry [INODE]openFiles [INODE]MaxFile% [INODE]used [NFS]ReadsS [NFS]WritesS [NFS]MetaS [NFS]CommitS [NFS]Udp [NFS]Tcp [NFS]TcpConn [NFS]BadAuth [NFS]BadClient [NFS]ReadsC [NFS]WritesC [NFS]MetaC [NFS]CommitC [NFS]Retrans [NFS]AuthRef [TCP]IpErr [TCP]TcpErr [TCP]UdpErr [TCP]IcmpErr [TCP]Loss [TCP]FTrans [BUD]1Page [BUD]2Pages [BUD]4Pages [BUD]8Pages [BUD]16Pages [BUD]32Pages [BUD]64Pages [BUD]128Pages [BUD]256Pages [BUD]512Pages [BUD]1024Pages
20190228 00:01:00 12 0 3 0 0 1 0 84 16 26957 20219 14 2991 3 0.05 0.18 0.13 1 0 198339428 197144012 1195416 0 817844 34053472 1960600 76668 158641184 201414800 0 17825788 0 17825788 0 0 224 0 0 19111168 3 110 4088 0 0 0 0 94716 2885 44 0 5 1982 1808 0 0 0 0 9739 9767 30385 17320 0 0 0 0 0 0 12 13 3 110 113 0 16 16 635592 7488 0 476716 0 0 0 0 0 0 0 0 0 0 0 8 0 0 22 0 1 0 0 0 0 48963 10707 10980 1226 496 282 142 43 19 6 132
20190228 00:02:00 11 0 3 0 0 1 0 85 15 26062 18226 5 2988 3 0.02 0.14 0.12 2 0 198339428 197138128 1201300 0 817856 34054692 1960244 75468 158636064 201398036 0 17825788 0 17825788 0 0 220 0 0 19111524 0 81 960 0 0 0 0 94420 2867 42 0 7 1973 1842 0 0 0 0 9391 9405 28934 16605 0 0 0 0 0 0 9 9 0 81 81 0 11 11 635446 7232 0 476576 0 0 0 0 0 0 0 0 0 0 0 3 0 0 8 0 1 0 0 0 0 49798 10849 10995 1241 499 282 142 43 19 6 132
20190228 00:03:00 11 0 3 0 0 1 0 85 15 25750 17963 4 2980 0 0.00 0.11 0.10 2 0 198339428 197137468 1201960 0 817856 34056400 1960312 75468 158633880 201397832 0 17825788 0 17825788 0 0 320 0 0 19111712 0 75 668 0 0 0 0 94488 2869 42 0 5 1975 1916 0 0 0 0 9230 9242 28411 16243 0 0 0 0 0 0 9 9 0 75 75 0 10 10 635434 7232 0 476564 0 0 0 0 0 0 0 0 0 0 0 2 0 0 6 0 1 0 0 0 0 50029 10817 10998 1243 501 282 142 43 19 6 132
20190228 00:04:00 11 0 3 0 0 1 0 84 16 25755 17871 10 2981 5 0.08 0.11 0.10 3 0 198339428 197140864 1198564 0 817856 34058072 1960320 75468 158634508 201398088 0 17825788 0 17825788 0 0 232 0 0 19111980 0 79 2740 0 0 0 0 94488 2867 4 0 2 1973 1899 0 0 0 0 9191 9197 28247 16183 0 0 0 0 0 0 9 9 0 79 79 0 10 10 635433 7264 0 476563 0 0 0 0 0 0 0 0 0 0 0 5 0 0 12 0 1 0 0 0 0 49243 10842 10985 1245 501 282 142 43 19 6 132
20190228 00:05:00 12 0 4 0 0 1 0 83 17 26243 18319 76 2985 3 0.06 0.10 0.09 2 0 198339428 197148040 1191388 0 817856 34059808 1961420 75492 158637636 201405208 0 17825788 0 17825788 0 0 252 0 0 19112012 0 85 18686 0 0 0 0 95556 2884 43 0 6 1984 1945 0 0 0 0 9176 9173 28153 16029 0 0 0 0 0 0 10 10 0 85 85 0 12 12 635473 7328 0 476603 0 0 0 0 0 0 0 0 0 0 0 3 0 0 7 0 1 0 0 0 0 47625 10801 10979 1253 505 282 142 43 19 6 132
What I'm trying to do, is to get it in a format that looks like this:
cpu_value,host=mxspacr1,instance=5,type=cpu,type_instance=softirq value=180599 1551128614916131663
cpu_value,host=mxspacr1,instance=2,type=cpu,type_instance=interrupt value=752 1551128614916112943
cpu_value,host=mxspacr1,instance=4,type=cpu,type_instance=softirq value=205697 1551128614916128446
cpu_value,host=mxspacr1,instance=7,type=cpu,type_instance=nice value=19250943 1551128614916111618
cpu_value,host=mxspacr1,instance=2,type=cpu,type_instance=softirq value=160513 1551128614916127690
cpu_value,host=mxspacr1,instance=1,type=cpu,type_instance=softirq value=178677 1551128614916127265
cpu_value,host=mxspacr1,instance=0,type=cpu,type_instance=softirq value=212274 1551128614916126586
cpu_value,host=mxspacr1,instance=6,type=cpu,type_instance=interrupt value=673 1551128614916116661
cpu_value,host=mxspacr1,instance=4,type=cpu,type_instance=interrupt value=701 1551128614916115893
cpu_value,host=mxspacr1,instance=3,type=cpu,type_instance=interrupt value=723 1551128614916115492
cpu_value,host=mxspacr1,instance=1,type=cpu,type_instance=interrupt value=756 1551128614916112550
cpu_value,host=mxspacr1,instance=6,type=cpu,type_instance=nice value=21661921 1551128614916111032
cpu_value,host=mxspacr1,instance=3,type=cpu,type_instance=nice value=18494760 1551128614916098304
cpu_value,host=mxspacr1,instance=0,type=cpu,type_instance=interrupt value=552 1551
What I have managed to do so far is just to convert the date string into EPOCH format.
I was thinking of somehow using the first value "[CPU]" as the measurement and "User%" as the type; the host I can take from the system where the script will run.
I would really appreciate your help, because I have only basic knowledge of text editing.
Thanks.
EDIT: this is what I would expect to get for the second line, using the first row as a header:
cpu_value,host=mxspacr1,type=cpu,type_instance=user% value=0 1551128614916131663
EDIT: This is what I have so far, and I'm stuck here.
awk -v HOSTNAME="$HOSTNAME" 'BEGIN { FS="[][]"; getline; NR==1; f1=$2; f2=$3 } { RS=" "; printf f1"_measurement,host="HOSTNAME",type="f2"value="$3" ", system("date +%s -d \""$1" "$2"\"") }' mxmcaim01-20190228.tab
And this is what I get, but this is only for 1 column; now I don't know how to process the remaining columns such as Nice, Sys, Wait and so on.
CPU_measurement,host=mxmcamon05,type=User% value= 1552014000
CPU_measurement,host=mxmcamon05,type=User% value= 1551960000
CPU_measurement,host=mxmcamon05,type=User% value= 1551343500
CPU_measurement,host=mxmcamon05,type=User% value= 1551997620
CPU_measurement,host=mxmcamon05,type=User% value= 1551985200
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
CPU_measurement,host=mxmcamon05,type=User% value= 1551949200
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
CPU_measurement,host=mxmcamon05,type=User% value= 1551945600
CPU_measurement,host=mxmcamon05,type=User% value= 1551938400
Please help.
EDIT. First of all, Thanks for your help.
Taking advantage of your knowledge of text editing, I was expecting to use this for 3 separate files, but unfortunately, and I don't know why, the format is different, like this:
#Date Time SlabName ObjInUse ObjInUseB ObjAll ObjAllB SlabInUse SlabInUseB SlabAll SlabAllB SlabChg SlabPct
20190228 00:01:00 nfsd_drc 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_delegations 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_stateids 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_files 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfsd4_stateowners 0 0 0 0 0 0 0 0 0 0
20190228 00:01:00 nfs_direct_cache 0 0 0 0 0 0 0 0 0 0
So I don't know how to handle the fields in a way that I can use nfsd_drc as the type and then iterate through ObjInUse ObjInUseB ObjAll ObjAllB SlabInUse SlabInUseB SlabAll SlabAllB SlabChg SlabPct as the type_instance, with the value in this case for ObjInUse being 0, ObjInUseB = 0, ObjAll = 0, and so on, making something like this:
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjectInUse value=0 1551128614916131663
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjInuseB value=0 1551128614916131663
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjAll value=0 1551128614916112943
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=ObjAllB value=0 1551128614916128446
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabInUse value=0 1551128614916111618
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabInUseB value=0 1551128614916127690
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabAll value=0 1551128614916127265
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabAllB value=0 1551128614916126586
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabChg value=0 1551128614916116661
slab_value,host=mxspacr1,type=nfsd_drc,type_instance=SlabPct value=0 1551128614916115893
slab_value is a hard-coded value.
Thanks.
It is not clear where instance and type_instance=interrupt come from in your desired output. Otherwise, the awk code below should work.
Note: it doesn't strip % from the tag values, and it prints the timestamp at the end of each line in seconds (append extra zeros if you want nanoseconds).
gawk -v HOSTNAME="$HOSTNAME" 'NR==1 { split($0,h,/[ \t\[\]]+/); for(i=1;i<=length(h);i++) h[i]=tolower(h[i]) } NR>1 { for(j=2;j<NF;j++) { k=2*j-1; printf("%s_value,host=%s,type=%s,type_instance=%s value=%s %s\n", h[k], HOSTNAME, h[k], h[k+1], $(j+1), mktime(substr($1,1,4)" "substr($1,5,2)" "substr($1,7,2)" "substr($2,1,2)" "substr($2,4,2)" "substr($2,7,2))) } }' mxmcaim01-20190228.tab
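For the slab-format files shown in the EDIT, the shape is different: the type comes from a data column ($3) rather than from the header. Here is a minimal sketch under the assumption that those files keep the 13-column layout shown above; slab_value is the hard-coded measurement, and "myhost", slab.tab, and the inline sample are stand-ins (gawk is required for mktime).

```shell
# Sketch for the slab-format files: $3 is the slab name (-> type) and
# columns 4..NF line up with the header names (-> type_instance).
cat > slab.tab <<'EOF'
#Date Time SlabName ObjInUse ObjInUseB ObjAll ObjAllB SlabInUse SlabInUseB SlabAll SlabAllB SlabChg SlabPct
20190228 00:01:00 nfsd_drc 0 0 0 0 0 0 0 0 0 0
EOF

gawk -v HOSTNAME="myhost" '
NR==1 { for (i=4; i<=NF; i++) hdr[i]=$i; next }    # remember header names
{
  ts = mktime(substr($1,1,4)" "substr($1,5,2)" "substr($1,7,2)" "substr($2,1,2)" "substr($2,4,2)" "substr($2,7,2))
  for (i=4; i<=NF; i++)                            # one output line per metric
    printf("slab_value,host=%s,type=%s,type_instance=%s value=%s %s\n",
           HOSTNAME, $3, hdr[i], $i, ts)
}' slab.tab
```

This prints one line per metric column (10 lines for the sample row), e.g. slab_value,host=myhost,type=nfsd_drc,type_instance=ObjInUse value=0 followed by the epoch timestamp.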

Extract column from file with shell [duplicate]

This question already has answers here:
bash: shortest way to get n-th column of output
(8 answers)
Closed 4 years ago.
I would like to extract column number 8 from the following table using shell (ash):
0xd024 2 0 32 20 3 0 1 0 2 1384 1692 -61 27694088
0xd028 0 1 5 11 1 0 46 0 0 301 187 -74 27689154
0xd02c 0 0 35 14 1 0 21 0 0 257 250 -80 27689410
0xd030 1 1 15 13 1 0 38 0 0 176 106 -91 27689666
0xd034 1 1 50 20 1 0 8 0 0 790 283 -71 27689980
0xd038 0 0 0 3 4 0 89 0 0 1633 390 -90 27690291
0xd03c 0 0 8 3 3 0 82 0 0 1837 184 -95 27690603
0xd040 0 0 4 5 1 0 90 0 0 0 148 -97 27690915
0xd064 0 0 36 9 1 0 29 0 0 321 111 -74 27691227
0xd068 0 0 5 14 14 0 40 0 0 8066 2270 -85 27691539
0xd06c 1 1 39 19 1 0 15 0 0 1342 261 -74 27691850
0xd070 0 0 12 11 1 0 53 0 0 203 174 -73 27692162
0xd074 0 0 18 2 1 0 75 0 0 301 277 -94 27692474
How can I do that?
The following command works fine: awk '{print $8}' file
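As a sketch of an alternative for minimal ash/BusyBox environments where awk may not be at hand: cut can pull the 8th field once runs of blanks are squeezed to single spaces (the sample line is copied from the first row of the table above).

```shell
# One row from the table above; tr -s squeezes repeated blanks so that
# cut's single-character delimiter lines up with the fields.
line='0xd024 2 0 32 20 3 0 1 0 2 1384 1692 -61 27694088'
printf '%s\n' "$line" | tr -s ' \t' ' ' | cut -d ' ' -f 8   # prints 1
```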

Determine whether matrix is sparse?

I have a matrix and want to know whether it is sparse or not. Is there any function in MATLAB to evaluate that property? I tried the issparse function, but it always returns 0 (not sparse). For example, my matrix (27 by 27) is
A=
[ 1 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0
1 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0
0 1 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0
0 0 1 1 1 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0
250 243 247 245 244 244 244 122 61 144 72 36 18 9 4 2 1 1 0 0 0 0 0 0 0 0 0
151 197 236 118 181 212 106 53 26 13 136 68 34 17 8 4 2 0 1 0 0 0 0 0 0 0 0
24 12 6 3 143 201 234 117 180 90 45 152 76 38 19 9 4 0 0 1 0 0 0 0 0 0 0
18 9 138 69 172 86 165 220 224 112 56 28 128 64 32 16 8 0 0 0 1 0 0 0 0 0 0
27 131 207 103 189 94 47 153 194 239 119 59 29 128 64 32 16 0 0 0 0 1 0 0 0 0 0
44 22 133 204 232 116 58 147 199 237 248 124 62 31 129 64 32 0 0 0 0 0 1 0 0 0 0
238 119 181 90 45 152 76 38 19 135 205 232 116 58 29 128 64 0 0 0 0 0 0 1 0 0 0
48 24 12 6 3 143 201 100 50 25 130 207 233 116 58 29 128 0 0 0 0 0 0 0 1 0 0
168 84 42 21 132 66 33 158 79 39 19 135 205 232 116 58 29 0 0 0 0 0 0 0 0 1 0
235 117 58 29 128 64 32 16 8 4 2 1 142 201 234 117 58 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
0 1 1 0 0 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0
0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 1
0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0
0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0
0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0]
(A figure plotting the matrix above was attached here.)
This seemingly easy question is quite difficult to answer: there is actually no standard that determines whether a matrix is sparse or full. Note that issparse only checks whether a matrix uses MATLAB's sparse storage class (i.e. it was created with sparse or a similar function), not how many zeros it contains, which is why it returns 0 for your matrix.
However, the most common measure I know of is the matrix's sparsity. This is simply the fraction of the total number of zeroes over the total number of elements. If this exceeds some sensible threshold, then you could say that the matrix is sparse.
If you're given the matrix A, perhaps something like this:
sparsity = (numel(A) - nnz(A)) / numel(A);
numel determines the total number of elements in the matrix A and nnz determines the total number of non-zero elements. Therefore, numel(A) - nnz(A) should give you the total number of zero elements.
So, going with the threshold idea, this is what I was talking about:
is_sparse = sparsity > tol;
tol would be a fraction from [0,1], so something like 0.75 could work. This would mean that if 75% of your matrix consisted of zeroes, this could be a sparse matrix. It's all heuristic though. Choose a threshold that you think makes the most sense.

How to join general string to every first column of every sub row

I want to join each standalone header string (in the case below, "ADMIN" and "DB") to the data rows it represents, placing it each time in the first column.
Example:
ADMIN
ADMIN_DB Running 1 0 1 0 0 0 80
ADMIN_CATALOG Running 0 0 1 0 0 0 452
ADMIN_CAT Running 0 0 1 0 0 0 58
DB
SLAVE_DB Running 2 0 3 0 0 0 94
DB_BAK Running 1 0 1 0 0 0 54
HISTORY_DB Running 0 0 1 0 0 0 40
HISTORY_DB_BAK Running 0 0 1 0 0 0 59
Expectation:
ADMIN ADMIN_DB Running 1 0 1 0 0 0 80
ADMIN ADMIN_CATALOG Running 0 0 1 0 0 0 452
ADMIN ADMIN_CAT Running 0 0 1 0 0 0 58
DB SLAVE_DB Running 2 0 3 0 0 0 94
DB DB_BAK Running 1 0 1 0 0 0 54
DB HISTORY_DB Running 0 0 1 0 0 0 40
DB HISTORY_DB_BAK Running 0 0 1 0 0 0 59
In the past I came across an example that could be a starting point for this, but I'm not very familiar with that kind of scripting: perl -ne 'chomp; if($. % 2){print "$_,";next;}
How about
awk 'NF==1{ val=$0; next} {print val" "$0}' input
You can format the output using the column utility:
$ awk 'NF==1{ val=$0; next} { print val" "$0}' input | column -t
ADMIN ADMIN_DB Running 1 0 1 0 0 0 80
ADMIN ADMIN_CATALOG Running 0 0 1 0 0 0 452
ADMIN ADMIN_CAT Running 0 0 1 0 0 0 58
DB SLAVE_DB Running 2 0 3 0 0 0 94
DB DB_BAK Running 1 0 1 0 0 0 54
DB HISTORY_DB Running 0 0 1 0 0 0 40
DB HISTORY_DB_BAK Running 0 0 1 0 0 0 59
