Parsing Data from SQL*Plus - bash

Someone kindly dumped the data out of a number of tables using SQL*Plus.
Is there a nice awk or similar script to turn it into CSV, or something else more easily parsed, so it can be loaded into another system? Sadly, getting it re-run is not an option.
They used
SQL> set pages 10000 lines 10000
followed by SELECT * from the table
The output has the column names, a row of -----'s and then the data lines. The structure looks like spaces or tabs between the column names, and runs of --- ---- where the number of dashes is probably the field length. The following is the column names, the -----'s and the first 2 lines from one of the tables.
CM D ORDR_REF LINE_NUM SUPP BYR LINE_REVN TXT_NUM L L T G ACCPT_US A PERF ITEM MANUF PART_NO EC_ CMDTY CLSFCTN RCPT_CNT DESCR ST IN STORE EAN QUOM QTY_ON_ORDR QTY_OUTSTG QTY_ADVD QTY_ADVD_OUTSTG QTY_RECV QTY_REJECT QTY_CR QTY_INVCE_OUTSTG QTY_INVCD QTY_INVCE_HELD QTY_CR_OUTSTG QTY_CRDTD QTY_CR_HELD DLVRY_SI DATE_DUE DATE_ACK DATE_XPCT DATE_XPED XPED_USR XP LEASE CMMT_DATE A A MIN_AUTH ACT_AUTH CURR_AUTH_SEQ_NUM TAX TAX_DATE HA PUOM DSCNT_1 DSCNT_2 DSCNT_3 ENTRD_PRC PRC MIN_PRC P ENTRD_VAL MIN_ENTRD_VAL UNIT_COST VAL_ON_ORDR VAL_RECV VAL_OUTSTG VAL_ACCRU VAL_INVCE_OUTSTG VAL_INVCD VAL_INVCE_HELD VAL_CR_OUTSTG VAL_CRDTD VAL_CR_HELD VAL_REJECT VAL_CR VAL_TAX MIN_ORDR_VAL MIN_VAL_TAX L S CNTRCT_REF CNTRCT_LINE_NUM C GL_TRA AIRCRFT_RE AIRL FLGHT_ LEG_NUM SRVC_QTY RATE_PRC CHRG_VAL UPDT_DATE UPDT_TIME USR_DATA L VAT_NON_REC_VALUE VAT_REC_VALUE PEV_LINE_COST A
-- - -------------------- ---------- ------------ -------- ---------- ---------- - - - - -------- - ---- -------------------- ------------ -------------------- --- ---------------------- ---------- -------- ---------------------------------------- -- -- -------- ------------- ---- ----------- ---------- ---------- --------------- ---------- ---------- ---------- ---------------- ---------- -------------- ------------- ---------- ----------- -------- --------- --------- --------- --------- -------- -- -------------------- --------- - - ---------- ---------- ----------------- --- --------- -- ---- ---------- ---------- ---------- ---------- --- ---------- - ---------- ------------- ---------- ----------- ---------- ---------- ---------- ---------------- ---------- -------------- ------------- ---------- ----------- ---------- ---------- ---------- ------------ ----------- - - -------------------- --------------- - ------ ---------- ---- ------ ---------- ---------- ---------- ---------- --------- --------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - ----------------- ------------- ------------- -
AR O PO415966 1 040960 LFOSTER 0 0 2 2 Y Stirrers and cleaning tabs - ivan 0 0 0 0 0 0 0 0 0 0 0 0 0 CIVIC 01-APR-20 01-JAN-00 01-APR-20 01-JAN-00 31-MAR-20 0 0 0 0 0 01 01-JAN-00 ER 0 0 0 0 0 1 75.51 0 75.51 75.51 75.51 0 0 0 75.51 0 0 0 0 0 0 15.1 0 0 0 0 022704 0 0 0 0 03-APR-20 01-JAN-00 2 0 15.1 75.51
AR O PO415967 1 015552 LFOSTER 0 0 2 2 Y extras to PO414840 - Sam 0 0 0 0 0 0 0 0 0 0 0 0 0 CIVIC 01-APR-20 01-JAN-00 01-APR-20 01-JAN-00 31-MAR-20 0 0 0 0 0 01 01-JAN-00 ER 0 0 0 0 0 1 60 0 60 60 60 0 0 0 60 0 0 0 0 0 0 12 0 0 0 0 022705 0 0 0 0 01-APR-20 01-JAN-00 2 0 12 60

You may not need to go external at all; you can use the CSV options natively in SQL itself.
Something like this?
MySQL Query to CSV
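For reference, if a re-run ever does become possible, SQL*Plus 12.2 and later can spool CSV natively; a minimal sketch (the table and spool file names are just placeholders):
set markup csv on delimiter ',' quote on
set feedback off
spool po_lines.csv
select * from po_lines;
spool off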

Figuring out the format was a challenge since it is a mix of tabs and spaces; however, it aligned with standard UNIX tab stops, so expand converted the tabs into correctly aligned spaces and then the awk script did the rest.
The column names are whitespace delimited, and the --- lines give the column widths to use with FIELDWIDTHS for the data.
A couple of bash one-liners to expand and then process:
for f in *.txt;do expand "$f" > "${f%.txt}.fix"; done
Then call awk to convert into delimited format:
for f in *.fix;do awk -f parse.awk "$f" > "${f%.fix}.del"; done
The awk script (parse.awk) uses a couple of tricks. $1=$1 forces awk to rebuild the record: line 1 is split on the default input FS, and everything after line 2 is split using FIELDWIDTHS (a GNU awk feature). The print that follows writes the fields out joined by the output field separator (OFS), which is set to ¬ in BEGIN; use whatever you like, it just wasn't in the data so nothing needs escaping.
NR==1 prints out the column names.
NR==2 gets the field widths by measuring the --- --- ----- runs.
NR>2 processes the rest of the file using the fixed widths set at line 2.
BEGIN {
    OFS = "¬"                       # output delimiter; anything not present in the data
}
NR == 1 {                           # header line: fields split on default whitespace FS
    $1 = $1                         # force the record to be rebuilt with OFS
    print
}
NR == 2 {                           # the ----- line: measure each run to get the widths
    fw = ""
    for (i = 1; i <= NF; i++) {
        fw = fw length($i) + 1 " "  # dash-run length plus the separating space
    }
    FIELDWIDTHS = fw                # gawk: fixed-width splitting from here on
}
NR > 2 {                            # data lines: fields split by FIELDWIDTHS
    $1 = $1                         # rebuild with OFS
    print
}

Related

Column value based on previous column value

I have a result set in Oracle like the table below.
Is there a way to add a new column with values based on the previous TEMREGIONAL values, to end up with something like this:
311,1,1,0
430,2,0,1
329,3,0,1
What I want: based on the TEMREGIONAL value, if it is 1, then all rows after that one should be 1 too.
So if I have something like this:
311,1,0
430,2,0
329,3,1
334,4,0
323,5,0
324,6,0
326,7,0
The result should be:
311,1,0,0
430,2,0,0
329,3,1,0
334,4,0,1
323,5,0,1
324,6,0,1
326,7,0,1
What I want is to add a new column so that, after the row with the value 1 in the third column, all subsequent rows have the value 1 in the new column.
Can anybody help me?
You may use the IGNORE NULLS option of LAG to find the previous 1, turning zeroes into NULLs first. This can be done in one pass.
with a(
ID_ORGAO_INTELIGENCIA
, ORD
, TEMREGIONAL
) as (
select 311,1,0 from dual union all
select 430,2,0 from dual union all
select 329,3,1 from dual union all
select 334,4,0 from dual union all
select 323,5,0 from dual union all
select 324,6,0 from dual union all
select 326,7,0 from dual
)
select
a.*
, coalesce(
lag(nullif(TEMREGIONAL, 0))
ignore nulls
over(order by ord asc)
, 0) as prev
from a
ID_ORGAO_INTELIGENCIA | ORD | TEMREGIONAL | PREV
--------------------: | --: | ----------: | ---:
311 | 1 | 0 | 0
430 | 2 | 0 | 0
329 | 3 | 1 | 0
334 | 4 | 0 | 1
323 | 5 | 0 | 1
324 | 6 | 0 | 1
326 | 7 | 0 | 1
db<>fiddle here
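An equivalent one-pass formulation, just a sketch of the same idea, is a windowed MAX over all preceding rows, which works because TEMREGIONAL only ever holds 0 or 1:
select
a.*
, nvl(
    max(TEMREGIONAL) over(
      order by ord
      rows between unbounded preceding and 1 preceding)
  , 0) as prev
from a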
For sample data
SQL> select * from test order by ord;
ID_ORGAO_INTELIGENCIA ORD TERMREGIONAL
--------------------- ---------- ------------
311 1 0
430 2 0
329 3 1
334 4 0
323 5 0
324 6 0
326 7 0
7 rows selected.
this might be one option:
SQL> with
2 temp as
3 -- find minimal ORD for which TERMREGIONAL = 1
4 (select min(a.ord) min_ord
5 from test a
6 where a.termregional = 1
7 )
8 select t.id_orgao_inteligencia,
9 t.ord,
10 t.termregional,
11 case when t.ord > m.min_ord then 1 else 0 end new_column
12 from temp m cross join test t
13 order by t.ord;
ID_ORGAO_INTELIGENCIA ORD TERMREGIONAL NEW_COLUMN
--------------------- ---------- ------------ ----------
311 1 0 0
430 2 0 0
329 3 1 0
334 4 0 1
323 5 0 1
324 6 0 1
326 7 0 1
7 rows selected.
SQL>

Changing the value of a field for all duplicates with a later date using Oracle

I would like to identify the duplicates in a field and change the value of another field for all of the later-dated duplicates. For example:
---------------------------------------------------
id | color | ref | date
---------------------------------------------------
1 | orange | 0 | 20200101
2 | orange | 0 | 20200102
3 | black | 0 | 20200117
4 | red | 0 | 20200202
5 | black | 0 | 20200104
6 | black | 0 | 20200115
7 | red | 0 | 20200101
8 | orange | 0 | 20200210
The above table is just an example. I would like to identify the duplicates based on the color field and update all of the duplicates with later dates to ref = 1.
SELECT *
from colorful
where (color) in
(SELECT color
from colorful
group by color
HAVING COUNT(*) > 1
)
ORDER BY color;
How do I write an UPDATE statement to do the above? I have tried a few times and have not been able to do it successfully.
MERGE can be one option.
Sample data:
SQL> alter session set nls_Date_Format = 'yyyy-mm-dd';
Session altered.
SQL> select * from test order by color, datum;
ID COLOR REF DATUM
---------- ------ ---------- ----------
5 black 0 2020-01-04
6 black 0 2020-01-15
3 black 0 2020-01-17
1 orange 0 2020-01-01
2 orange 0 2020-01-02
8 orange 0 2020-02-10
7 red 0 2020-01-01
4 red 0 2020-02-02
9 white 0 2020-03-15
9 rows selected.
Let's update all REFs to 1 if there are duplicates whose date column's value isn't minimal for that color.
SQL> merge into test t
2 using (select color, min(datum) min_datum
3 from test
4 group by color
5 ) x
6 on (x.color = t.color)
7 when matched then update set
8 t.ref = 1
9 where t.datum > x.min_datum;
5 rows merged.
SQL> select * From test order by color, datum;
ID COLOR REF DATUM
---------- ------ ---------- ----------
5 black 0 2020-01-04
6 black 1 2020-01-15
3 black 1 2020-01-17
1 orange 0 2020-01-01
2 orange 1 2020-01-02
8 orange 1 2020-02-10
7 red 0 2020-01-01
4 red 1 2020-02-02
9 white 0 2020-03-15
9 rows selected.
SQL>
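For completeness, a plain correlated UPDATE (a sketch against the same TEST table) should produce the same result as the MERGE:
update test t
   set t.ref = 1
 where t.datum > (select min(x.datum)
                    from test x
                   where x.color = t.color);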

What is the Starts data in TKPROF output?

When I look at tkprof output, I see "starts=1" in the execution plan area, between the time and cost data. So what does that mean?
Basically, this part:
(cr=15 pr=0 pw=0 time=514 us starts=1 cost=3 size=7383 card=107)
********************************************************************************
SQL ID: 7jk33n4f4mpy9 Plan Hash: 1445457117
select *
from
hr.employees
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.04 0.03 0 351 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 9 0.00 0.00 0 15 0 107
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 11 0.04 0.03 0 366 0 107
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
107 107 107 TABLE ACCESS FULL EMPLOYEES (cr=15 pr=0 pw=0 time=514 us starts=1 cost=3 size=7383 card=107)
********************************************************************************
STARTS is the number of times that line in the plan was started. It is easier to see when using (say) a join. Here's an example:
SQL> select /*+ leading(d) use_nl(e) gather_plan_statistics */
2 e.ename, d.dname
3 from scott.dept d,
4 scott.emp e
5 where e.deptno = d.deptno
6 and e.sal > 1000;
ENAME DNAME
---------- --------------
CLARK ACCOUNTING
KING ACCOUNTING
MILLER ACCOUNTING
JONES RESEARCH
SCOTT RESEARCH
ADAMS RESEARCH
FORD RESEARCH
ALLEN SALES
WARD SALES
MARTIN SALES
BLAKE SALES
TURNER SALES
12 rows selected.
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
--------------------------------
SQL_ID 37nwzk5qypud3, child number 0
-------------------------------------
select /*+ leading(d) use_nl(e) gather_plan_statistics */
e.ename, d.dname from scott.dept d, scott.emp e where
e.deptno = d.deptno and e.sal > 1000
Plan hash value: 4192419542
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 12 |00:00:00.01 | 32 |
| 1 | NESTED LOOPS | | 1 | 13 | 12 |00:00:00.01 | 32 |
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 4 |00:00:00.01 | 7 |
|* 3 | TABLE ACCESS FULL| EMP | 4 | 3 | 12 |00:00:00.01 | 25 |
-------------------------------------------------------------------------------------
We scanned DEPT and got 4 rows. For each of those 4 rows, we then did a full scan of EMP, hence line 3 started 4 times.

SQL execution plan: there's an index on the column but it uses 'table access full' in a hash join

A "TABLE ACCESS FULL" operation appears in the execution plan of a SQL statement.
I want to know why it does not use the index, or whether there are any other optimizations.
After checking, I have the following information:
The "TABLE ACCESS FULL" appears in a hash join.
The join columns of the two tables in the hash join have the same type.
The DB version is 10.2.0.3.
The column accessed by "TABLE ACCESS FULL" has an index.
The last collection of table and index statistics was dated June 12th.
When I run a single-table query on the column, the index is used.
Details:
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------
SQL_ID 5wz1z56666666, child number 13
--------------------------------------
SELECT count(Id) num FROM ccshare.blue WHERE id in (select id from(select
a.*,b.documentid,b.variablename,b.variablecname,b.variablevalue from ccshare.blue
a,ccshare.blueEx b where a.id= b.documentid and a.formid= :1 and
a.MdlId= :2 and (a.tflag= :3 or a.tflag is null) and (a.overflag =
:4 or a.overflag = 0) and (a.pages <> :5 and a.pages <> 9) and a.subject like :6)
cc group by id having count(id)>= 0) AND ((hisuserids like :7 or curuserids like :8 or
curuserids1 like :9 or curuserids2 like :10) and (hisdeluserids not like :11 or hisdeluserids
is null)) AND (tflag= :12 or tflag is null) AND formid> :13
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------
Plan hash value: 1599999999
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 45356 (100)| |
| 1 | SORT AGGREGATE | | 1 | 344 | | | |
| 2 | NESTED LOOPS | | 1410 | 473K| | 45356 (2)| 00:09:05 |
| 3 | VIEW | VW_NSO_1 | 1410 | 18330 | | 43937 (2)| 00:08:48 |
|* 4 | FILTER | | | | | | |
| 5 | HASH GROUP BY | | 1410 | 103K| | 43937 (2)| 00:08:48 |
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------
|* 6 | HASH JOIN | | 405K| 29M| 2232K| 43903 (2)| 00:08:47 |
|* 7 | TABLE ACCESS FULL | blue | 28186 | 1899K| | 11964 (2)| 00:02:24 |
| 8 | INDEX FAST FULL SCAN | blueEX_DOCUID | 15M| 87M| | 18550 (2)| 00:03:43 |
|* 9 | TABLE ACCESS BY INDEX ROWID| blue | 1 | 331 | | 2 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | blue_KEY | 1 | | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$09D7319C
PLAN_TABLE_OUTPUT
---------------------------------------------------------------
3 - SEL$833EDA65 / VW_NSO_1#SEL$09D7319C
4 - SEL$833EDA65
7 - SEL$833EDA65 / a#SEL$3
8 - SEL$833EDA65 / b#SEL$3
9 - SEL$09D7319C / blue#SEL$1
10 - SEL$09D7319C / blue#SEL$1
Outline Data
-------------
/*+
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
ALL_ROWS
OUTLINE_LEAF(#"SEL$833EDA65")
OUTLINE_LEAF(#"SEL$09D7319C")
UNNEST(#"SEL$335DD26A")
OUTLINE(#"SEL$335DD26A")
MERGE(#"SEL$3")
OUTLINE(#"SEL$833EDA65")
OUTLINE(#"SEL$1")
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
OUTLINE(#"SEL$2")
OUTLINE(#"SEL$3")
NO_ACCESS(#"SEL$09D7319C" "VW_NSO_1"#"SEL$09D7319C")
INDEX_RS_ASC(#"SEL$09D7319C" "blue"#"SEL$1" ("blue"."ID"))
LEADING(#"SEL$09D7319C" "VW_NSO_1"#"SEL$09D7319C" "blue"#"SEL$1")
USE_NL(#"SEL$09D7319C" "blue"#"SEL$1")
FULL(#"SEL$833EDA65" "a"#"SEL$3")
INDEX_FFS(#"SEL$833EDA65" "b"#"SEL$3" ("blueEX"."DOCUMENTID"))
LEADING(#"SEL$833EDA65" "a"#"SEL$3" "b"#"SEL$3")
USE_HASH(#"SEL$833EDA65" "b"#"SEL$3")
END_OUTLINE_DATA
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------
*/
Peeked Binds (identified by position):
--------------------------------------
1 - :1 (NUMBER): 107889
2 - :2 (NUMBER): 188
3 - :3 (VARCHAR2(30), CSID=852): (null)
4 - :4 (NUMBER): -1
5 - :5 (NUMBER): 7
6 - :6 (VARCHAR2(30), CSID=852): '%he has a pretty dog%'
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------
7 - :7 (VARCHAR2(30), CSID=852): '%,4026722,%'
8 - :8 (VARCHAR2(30), CSID=852): '%,4026722*,%'
9 - :9 (VARCHAR2(30), CSID=852): '%,4026722*,%'
10 - :10 (VARCHAR2(30), CSID=852): '%,4026722*,%'
11 - :11 (VARCHAR2(30), CSID=852): '%,4026722,%'
12 - :12 (VARCHAR2(30), CSID=852): (null)
13 - :13 (NUMBER): 0
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
4 - filter(COUNT(*)>=0)
6 - access("a"."ID"="b"."DOCUMENTID")
7 - filter(("a"."SUBJECT" LIKE :6 AND "a"."FORMID"=:1 AND "a"."MDLID"=:2 AND
"a"."PAGES"<>:5 AND "a"."PAGES"<>9 AND INTERNAL_FUNCTION("a"."OVERFLAG") AND
("a"."tflag" IS NULL OR "a"."tflag"=:3)))
9 - filter((("HISUSERIDS" LIKE :7 OR "CURUSERIDS" LIKE :8 OR "CURUSERIDS1" LIKE :9 OR
"CURUSERIDS2" LIKE :10) AND "FORMID">:13 AND ("HISDELUSERIDS" IS NULL OR "HISDELUSERIDS" NOT
LIKE :11) AND ("tflag" IS NULL OR "tflag"=:12)))
10 - access("ID"="$nso_col_1")
Column Projection Information (identified by operation id):
PLAN_TABLE_OUTPUT
---------------------------------------------------------------
1 - (#keys=0) COUNT(*)[22]
3 - "$nso_col_1"[NUMBER,22]
4 - "a"."ID"[NUMBER,22]
5 - "a"."ID"[NUMBER,22], COUNT(*)[22]
6 - (#keys=1) "a"."ID"[NUMBER,22]
7 - "a"."ID"[NUMBER,22]
8 - "b"."DOCUMENTID"[NUMBER,22]
10 - "blue".ROWID[ROWID,10]
SQL> explain plan for select count(id) from ccshare.blue;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 302********
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 890 (2)| 00:00:11 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| blue_OVERFLAG | 1155K| 890 (2)| 00:00:11 |
10046 event info (tkprof) of the SQL:
TKPROF: Release 10.2.0.3.0 - Production on Thu Jun 21 22:48:56 2018
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Trace file: /opt/app/oracle/admin/testdb/udump/testdb1_ora_7668044.trc
Sort options: default
********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 6.67 16.00 163 139223 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 6.68 16.02 163 139223 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=139223 pr=163 pw=0 time=16007522 us)
0 NESTED LOOPS (cr=139223 pr=163 pw=0 time=16007499 us)
2 VIEW VW_NSO_1 (cr=139215 pr=163 pw=0 time=16005404 us)
2 FILTER (cr=139215 pr=163 pw=0 time=16005400 us)
2 HASH GROUP BY (cr=139215 pr=163 pw=0 time=16005391 us)
32 HASH JOIN (cr=139215 pr=163 pw=0 time=15614187 us)
2 TABLE ACCESS FULL blue (cr=53972 pr=0 pw=0 time=1725549 us)
15511417 INDEX FAST FULL SCAN blueEX_DOCUID (cr=85243 pr=163 pw=0 time=15516352 us)(object id 60473)
0 TABLE ACCESS BY INDEX ROWID GREEN (cr=8 pr=0 pw=0 time=1964 us)
2 INDEX UNIQUE SCAN blue_KEY (cr=6 pr=0 pw=0 time=961 us)(object id 60454)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
library cache lock 1 0.00 0.00
SQL*Net message to client 2 0.00 0.00
gc current block 2-way 1645 0.00 0.62
db file sequential read 23 0.00 0.01
db file parallel read 8 0.00 0.00
gc cr multi block request 50254 0.00 7.45
db file scattered read 29 0.00 0.02
SQL*Net message from client 2 13.69 13.69
********************************************************************************
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 6.67 16.00 163 139223 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 6.68 16.02 163 139223 0 1
Misses in library cache during parse: 1
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 4 0.00 0.00
SQL*Net message from client 4 19.75 39.08
library cache lock 1 0.00 0.00
gc current block 2-way 1645 0.00 0.62
db file sequential read 23 0.00 0.01
db file parallel read 8 0.00 0.00
gc cr multi block request 50254 0.00 7.45
db file scattered read 29 0.00 0.02
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 0 0.00 0.00 0 0 0 0
Execute 0 0.00 0.00 0 0 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 0 0.00 0.00 0 0 0 0
Misses in library cache during parse: 0
1 user SQL statements in session.
0 internal SQL statements in session.
1 SQL statements in session.
********************************************************************************
Trace file: /opt/app/oracle/admin/testdb/udump/testdb1_ora_7668044.trc
Trace file compatibility: 10.01.00
Sort options: default
1 session in tracefile.
1 user SQL statements in trace file.
0 internal SQL statements in trace file.
1 SQL statements in trace file.
1 unique SQL statements in trace file.
52007 lines in trace file.
16 elapsed seconds in trace file.

Bash: How to extract table-like structures from text file

I have a log file which contains some data plus important table-like parts, like the following:
//Some data
--------------------------------------------------------------------------------
----- Output Table -----
--------------------------------------------------------------------------------
NAME Attr1 Attr2 Attr3 Attr4 Attr5
--------------------------------------------------------------------------------
fooooooooo 0 0 3 0 0
boooooooooooooooooooooo 0 0 30 0 0
abv 0 0 16 0 0
bhbhbhbh 0 0 3 0 0
foooo 0 0 198 0 0
WARNING: Some message...
WARNING: Some message...
aaaaaaaaa 0 0 60 0 7
bbbbbbbb 0 0 48 0 7
ccccccc 0 0 45 0 7
rrrrrrr 0 0 50 0 7
abcabca 0 0 42 0 6
// Some data...
--------------------------------------------------------------------------------
----- Another Output Table -----
--------------------------------------------------------------------------------
NAME Attr1 Attr2 Attr3 Attr4 Attr5
--------------------------------------------------------------------------------
$$foo12 0 0 3 0 0
$$foo12_720_720_14_2 0 0 30 0 0
I want to extract all tables of that kind from the given file and save each one in a separate file.
Notes:
A table starts at a line which contains the words {NAME, Attr1, ..., Attr5}.
WARNING messages may appear within a table and should be ignored.
A table ends when an empty line occurs and the line after that blank line is not a "WARNING" line.
So I expect the following 2 files as output:
NAME Attr1 Attr2 Attr3 Attr4 Attr5
--------------------------------------------------------------------------------
fooooooooo 0 0 3 0 0
boooooooooooooooooooooo 0 0 30 0 0
abv 0 0 16 0 0
bhbhbhbh 0 0 3 0 0
foooo 0 0 198 0 0
aaaaaaaaa 0 0 60 0 7
bbbbbbbb 0 0 48 0 7
ccccccc 0 0 45 0 7
rrrrrrr 0 0 50 0 7
abcabca 0 0 42 0 6
NAME Attr1 Attr2 Attr3 Attr4 Attr5
--------------------------------------------------------------------------------
$$foo12 0 0 3 0 0
$$foo12_720_720_14_2 0 0 30 0 0
Following your directions, I would write the awk script below.
#! /usr/bin/awk -f
# start a table with a NAME line
/^ +NAME/ {
titles = $0
print
next
}
# don't print if not in table
! titles {
next
}
# blank line may mean end-of-table
/^$/ {
EOT = 1
next
}
# warning is not EOT
/^WARNING/ {
EOT = 0
next
}
# end of table means we're not in a table anymore, Toto
EOT {
titles = 0
EOT = 0
next
}
# print what's in the table
{ print }
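Since the requirement is to save each table in a separate file, a small variation of the script above (a sketch; the table_N.txt names are only placeholders) can redirect each table's lines to its own output file:
#! /usr/bin/awk -f
# same logic as above, but each table is written to its own numbered file
/^ +NAME/ {
    n++                           # table counter
    out = "table_" n ".txt"       # hypothetical output file name
    titles = $0
    print > out
    next
}
! titles   { next }               # not in a table: skip
/^$/       { EOT = 1; next }      # blank line may mean end-of-table
/^WARNING/ { EOT = 0; next }      # warning is not EOT, and is not printed
EOT        { titles = 0; EOT = 0; close(out); next }
           { print > out }        # table body goes to the current file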
Try this -
awk -F'[[:space:]]+' 'NF>6 || ($0 ~ /-/ && $0 !~ "Output") {print $0}' f
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
NAME Attr1 Attr2 Attr3 Attr4 Attr5
--------------------------------------------------------------------------------
fooooooooo 0 0 3 0 0
boooooooooooooooooooooo 0 0 30 0 0
abv 0 0 16 0 0
bhbhbhbh 0 0 3 0 0
foooo 0 0 198 0 0
aaaaaaaaa 0 0 60 0 7
bbbbbbbb 0 0 48 0 7
ccccccc 0 0 45 0 7
rrrrrrr 0 0 50 0 7
abcabca 0 0 42 0 6
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
NAME Attr1 Attr2 Attr3 Attr4 Attr5
--------------------------------------------------------------------------------
$$foo12 0 0 3 0 0
$$foo12_720_720_14_2 0 0 30 0 0
