Merge all the data within the (...) into one line in a shell script

I am new to shell scripting and need some help. I have a SQL file like:
SELECT DISTINCT F1.COL1,
F1.COL5 ADDRESS ,
COALESCE(COL1,
COL2,
COL3,
COL4),
F1.COL7
FROM TABLE1 F1
I need it printed with the parenthesized list joined on one line, like:
SELECT DISTINCT F1.COL1,
F1.COL5 ADDRESS ,
COALESCE(COL1,COL2,COL3,COL4),
F1.COL7
FROM TABLE1 F1
Thanks

With sed:
sed '/(/{:a;N;s/^ *//;s/\n *//;/)/!{ba}}' file
To edit the file in place, add the -i option:
sed -i '/(/{:a;N;s/^ *//;s/\n *//;/)/!{ba}}' file
Lines are joined from a line containing ( until the next line containing ).
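For comparison, here is a rough awk take on the same join, a sketch under the same assumption as the sed answer: a line containing ( but no ) opens a run, and the next line containing ) closes it.

```shell
cd "$(mktemp -d)"

# Recreate the sample SQL file from the question.
cat > query.sql <<'EOF'
SELECT DISTINCT F1.COL1,
F1.COL5 ADDRESS ,
COALESCE(COL1,
COL2,
COL3,
COL4),
F1.COL7
FROM TABLE1 F1
EOF

# Buffer lines from an unclosed ( until the closing ), stripping
# leading blanks from continuation lines, then emit the joined line.
awk '/\(/ && !/\)/ {buf = $0; joining = 1; next}
     joining {sub(/^ */, ""); buf = buf $0
              if (/\)/) {print buf; joining = 0}
              next}
     {print}' query.sql
```

Lines that contain both ( and ) on one line (e.g. COUNT(*)) fall through to the final {print} untouched.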

Related

Export hql output to csv in beeline

I am trying to export my hql output to csv in beeline using the command below:
beeline -u "jdbc:hive2://****/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"?tez.queue.name=devices-jobs --outputformat=csv2 -e "use schema_name; select * from table_name where open_time_new>= '2020-07-13' and open_time_new < '2020-07-22'" > filename.csv
The problem is that some column values in the table contain commas, which pushes part of a column's data into the next column.
For eg:
| abcd | as per data,outage fault,xxxx.
| xyz |as per the source,ghfg,hjhjg.
The above data will get saved as 4 columns instead of 2.
Need help!
Try the approach with local directory:
insert overwrite local directory '/tmp/local_csv_report'
row format delimited fields terminated by "," escaped by '\\'
select *
from table_name
where open_time_new >= '2020-07-13'
and open_time_new < '2020-07-22'
This will create several csv files under your local /tmp/local_csv_report directory, so a simple cat afterwards will merge the results into a single file.
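The merge step might look like this. This is a sketch: the part-file names and row contents below just simulate what Hive writes under the output directory, with commas escaped by a backslash as the row format above specifies.

```shell
# Simulate the per-reducer part files Hive leaves in the local directory.
dir=$(mktemp -d)
printf '%s\n' 'abcd,as per data\,outage fault\,xxxx.'   > "$dir/000000_0"
printf '%s\n' 'xyz,as per the source\,ghfg\,hjhjg.'     > "$dir/000001_0"

# Merge every part file into one CSV.
cat "$dir"/* > "$dir/report.csv"
cat "$dir/report.csv"
```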

Replace an array of strings (passed as an argument to the script) in an HQL file using a Bash shell script?

I have a script which accepts 3 arguments: $1, $2 and $3,
but $3 is an array like ("2018" "01"),
so I am executing my script as:
sh script.sh Employee IT "2018 01"
and there is an HQL file (emp.hql) in which I want to replace my partition columns with the passed array, like below:
"select deptid , employee_name from {TBL_NM} where year={par_col[i]} and month={par_col[i]}"
so below is the code I have tried:
Table=$1
dept=$2
Par_cols=($3)
for i in "${par_cols[#]}" ;do
sed -i "/${par_col[i]}/${par_col[i]}/g" /home/hk/emp.hql
done
Error:
sed: -e expression #1, char 0: no previous regular expression
sed: -e expression #2, char 0: no previous regular expression
But I think my logic for replacing the partition columns is wrong; could you please help me with this?
Desired Output in HQL file :
select deptid ,employee_name from employee where year=2018 and month=01
This is a little bit related to:
Shell script to find, search and replace array of strings in a file
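No answer is shown for this one, but for what it's worth: the reported error comes from the sed expression missing the s command (sed "/a/b/g" is not a substitution; it must be sed "s/a/b/g"), and the variable names don't match (Par_cols, par_cols, par_col). A minimal POSIX-shell sketch of the intended substitution, assuming the placeholders are {TBL_NM}, {par_col[0]} and {par_col[1]} as in the template shown above:

```shell
# Hypothetical inputs standing in for $1, $2, $3 of the real script.
table=${1:-Employee}
dept=${2:-IT}            # unused here, mirroring the original script
par_cols=${3:-"2018 01"}

# The template line from emp.hql, inlined for a self-contained demo.
hql='select deptid , employee_name from {TBL_NM} where year={par_col[0]} and month={par_col[1]}'

# Replace the table-name placeholder first.
hql=$(printf '%s' "$hql" | sed "s/{TBL_NM}/$table/g")

# Substitute each array element into its indexed placeholder;
# note the s command and the escaped brackets in the pattern.
i=0
for val in $par_cols; do
  hql=$(printf '%s' "$hql" | sed "s/{par_col\[$i\]}/$val/g")
  i=$((i + 1))
done

printf '%s\n' "$hql"
```

Run against a real emp.hql, the same s expression would go into sed -i instead of the pipeline.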

Insert multiple lines Containing ' and $variable using sed not working

I am new to scripting and stuck at one place that may be really simple; still, I would be grateful if anyone can help.
Below is my issue in the simplest terms:
Input file new.txt
Hello team
Output file expected: new_2.txt
Select '/backup/path1_' from dual;
Select '/backup/path2_' from dual;
Hello team
Note : $var1=path1 and $var2=path2
sed command used:
sed '1i\
Select '/backup/"$var1"_' from dual;\
Select '/backup/"$var2"_' from dual;\
' new.txt > new_2.txt
Output received:
new_2.txt
Select /backup/path1_ from dual;
Select /backup/path2_from dual;
Hello team
After trying various quote combinations as well, either the single quote ' is not displayed in the output or the variable value is not inserted.
Would you please try the following:
var1=path1
var2=path2
sed "1i\\
Select '/backup/${var1}_' from dual;\\
Select '/backup/${var2}_' from dual;
" new.txt > new_2.txt
Result:
Select '/backup/path1_' from dual;
Select '/backup/path2_' from dual;
Hello team
You can also escape the quote marks with a backslash:
sed '1i\
Select '\'/backup/"$var1"_\'' from dual;\
Select '\'/backup/"$var2"_\'' from dual;
' new.txt > new_2.txt
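If the sed quoting keeps fighting back, one way to sidestep it entirely is to build the new lines with printf and prepend them. A sketch; the input file is recreated inline to keep it self-contained:

```shell
cd "$(mktemp -d)"
printf 'Hello team\n' > new.txt

var1=path1
var2=path2

# printf reuses its format string for each argument, and the single
# quotes sit safely inside the double-quoted format string.
{
  printf "Select '/backup/%s_' from dual;\n" "$var1" "$var2"
  cat new.txt
} > new_2.txt

cat new_2.txt
```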

bash: separate blocks of lines between pattern x and y

I have a similar question to this one Sed/Awk - pull lines between pattern x and y, however, in my case I want to output each block-of-lines to individual files (named after the first pattern).
Input example:
-- filename: query1.sql
-- sql comments goes here or else where
select * from table1
where id=123;
-- eof
-- filename: query2.sql
insert into table1
(id, date) values (1, sysdate);
-- eof
I want the bash script to generate 2 files: query1.sql and query2.sql with the following content:
query1.sql:
-- sql comments goes here or else where
select * from table1
where id=123;
query2.sql:
insert into table1
(id, date) values (1, sysdate);
Thank you
awk '/-- filename/{if(f)close(f); f=$3;next} !/eof/&&/./{print $0 >> f}' input
Brief explanation:
/-- filename/{if(f)close(f); f=$3;next}: locate a record containing the filename and assign it to f
!/eof/&&/./{print $0 >> f}: if the following lines neither contain 'eof' nor are empty, append them to the corresponding file.
This might work for you (GNU sed):
sed -r '/-- filename: (\S+)/!d;s##/&/,/-- eof/{//d;w \1#p;s/.*/}/p;d' file |
sed -nf - file
Create a sed script from the input file, then run it against the input file.
N.B. Two lines are needed for each query as the program for the query must be surrounded by braces and the w command must end in a newline.
Using GNU awk to handle multiple open files for you:
awk '/^-- eof/{f=0} f{print > out} /^-- filename/{out=$3; f=1}' file
or with any awk:
awk '/^-- eof/{f=0} f{print > out} /^-- filename/{close(out); out=$3; f=1}' file
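The portable version can be exercised against the sample input like this (a sketch recreating the input inline, with a small if (out) guard added so the first filename line does not close a not-yet-opened file):

```shell
cd "$(mktemp -d)"

# Recreate the sample input from the question.
cat > input.txt <<'EOF'
-- filename: query1.sql
-- sql comments goes here or else where
select * from table1
where id=123;
-- eof
-- filename: query2.sql
insert into table1
(id, date) values (1, sysdate);
-- eof
EOF

# f gates printing: set after a filename line, cleared at -- eof.
awk '/^-- eof/{f=0} f{print > out} /^-- filename/{if (out) close(out); out=$3; f=1}' input.txt

cat query1.sql
cat query2.sql
```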

bash / sed / awk Remove or gsub timestamp pattern from text file

I have a text file like this:
1/7/2017 12:53 DROP TABLE table1
1/7/2017 12:53 SELECT
1/7/2017 12:55 --UPDATE #dat_recency SET
Select * from table 2
into table 3;
I'd like to remove all of the timestamp patterns (M/D/YYYY HH:MM, M/DD/YYYY HH:MM, MM/D/YYYY HH:MM, MM/DD/YYYY HH:MM). I can find the patterns using grep but can't figure out how to use gsub. Any suggestions?
DESIRED OUTPUT:
DROP TABLE table1
SELECT
--UPDATE #dat_recency SET
Select * from table 2
into table 3;
You can use this sed command to remove date/time stamps from the start of a line:
sed -i.bak -E 's~^([0-9]{1,2}/){2}[0-9]{4} [0-9]{2}:[0-9]{2} *~~' file
cat file
DROP TABLE table1
SELECT
--UPDATE #dat_recency SET
Select * from table 2
into table 3;
Using the default space separator, set the first and second columns to empty strings, then print the whole line:
awk '/^[0-9]/{$1=$2="";gsub(/^[ \t]+|[ \t]+$/, "")} !/^[0-9]/{print}' sample.csv
The command checks whether each line starts with a digit; if it does, it replaces the first 2 columns with empty strings and removes leading spaces; otherwise it prints the original line. (Note that after the substitution the modified record no longer starts with a digit, so the second pattern matches it and prints it too.)
output:
DROP TABLE table1
SELECT
--UPDATE #dat_recency SET
Select * from table 2
into table 3;
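Since the question specifically asks about gsub, here is a direct gsub-based sketch. It assumes, per the samples, that timestamps only appear at the start of a line; interval expressions like {1,2} are spelled out for portability across awk implementations.

```shell
cd "$(mktemp -d)"

# Recreate the sample log from the question.
cat > log.txt <<'EOF'
1/7/2017 12:53 DROP TABLE table1
1/7/2017 12:53 SELECT
1/7/2017 12:55 --UPDATE #dat_recency SET
Select * from table 2
into table 3;
EOF

# Delete a leading M/D/YYYY HH:MM stamp (1- or 2-digit month, day and
# hour) plus the spaces after it; the trailing "1" prints every line.
awk '{gsub(/^[0-9][0-9]?\/[0-9][0-9]?\/[0-9][0-9][0-9][0-9] [0-9][0-9]?:[0-9][0-9] */, "")} 1' log.txt
```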
