How could I put these lines in range format? - bash

I have a text file with 826,838 lines. Text file looks like this (sorry, couldn't get the image uploader to work).
I'm using sed (sed -n '2p;$p') to print the second and last line but can't figure out how to put the lines in range format.
Current output:
1 3008.00 7380.00 497724.00 3158482.00 497724.00 3158482.00
826838 4744.00 7409.00 480729.00 3207718.00 480729.00 3207718.00
Desired output:
1-826838 3008.00-4744.00 7380.00-7409.00 497724.00-480729.00 3158482.00-3207718.00 497724.00-480729.00 3158482.00-3207718.00
Thank you for your help!

This might work for you (GNU sed):
sed -r '2H;$!d;H;x;:a;s/\n\s*(\S+)\s*(.*\n)\s*(\S+\s*)/\1-\3\n\2/;ta;P;d' file
Store line 2 and the last line in the hold space (HS). Following the last line, swap to the HS and then repeatedly move the first fields of the second and third lines to the first line. Finally print the first line only.

With a single awk expression (it picks out the needed lines and builds the ranges):
awk 'NR==2{ split($0,a) }END{ for(i=1;i<=NF;i++) printf("%s\t",a[i]"-"$i); print "" }' file
The output:
1-826838 3008.00-4744.00 7380.00-7409.00 497724.00-480729.00 3158482.00-3207718.00 497724.00-480729.00 3158482.00-3207718.00
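The one-liner above reads the last record from `$0` inside END. POSIX specifies that behaviour, but some older awk implementations do not preserve `$0` there. A minimal sketch that remembers the last line explicitly and joins the pairs with single spaces could look like the following; the four-line sample file and the variable names are made up for illustration:

```shell
# Made-up sample standing in for the real file: a header line plus three rows.
sample=$(mktemp)
cat > "$sample" <<'EOF'
id x y
1 3008.00 7380.00
2 3500.00 7390.00
3 4744.00 7409.00
EOF

# Keep line 2 as the range start and keep overwriting "last" so that END
# does not rely on $0 still holding the final record.
ranges=$(awk '
    NR == 2 { n = split($0, first) }
    { last = $0 }
    END {
        split(last, final)
        for (i = 1; i <= n; i++)
            printf "%s%s-%s", (i > 1 ? " " : ""), first[i], final[i]
        print ""
    }' "$sample")

printf '%s\n' "$ranges"
rm -f "$sample"
```

On the sample this should print `1-3 3008.00-4744.00 7380.00-7409.00`.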


copy and paste between two txt file

Hi, I am new to bash and sed and need a little help.
I have two text files and need to copy text from one into the other. For the first file I know both the text and its position; for the second file I don't know the text, only where it is.
I want to take the two words or numbers from file2 and place them in file1 as shown below.
When I create file2, all I will know is that it has two words or numbers on line 4.
I have been trying with this
sed $'10{e sed "4!d" /home/Desktop/file1.txt\n;d}' /home/Desktop/file2.txt
and
awk 'NR==4{a=$0}NR==FNR{next}FNR==10{print a}4' /home/Desktop/file2.txt /home/Desktop/file1.txt
This is what my files would look like
file1.txt
cat
hat
sat
fat
mat
rat
file2.txt
line1
line2
line3
text1 text2
line5
I need it to look like this
file1.txt
cat
hat
sat text1
fat text2
mat
rat
thanks for any help
This might work for you (GNU sed):
sed -E '1{x;s#^#sed -n 4p file2#e;x};3{G;s/\n(\S+).*/ \1/};4{G;s/\n\S+//}' file1
Stuff the line from file2 into the hold space when processing file1 and append and manipulate that line when needed.
A more explicit explanation:
By default, sed reads each line of a file. For each cycle, it removes the newline, places the result in the pattern space, goes through a sequence of commands, re-appends the newline and prints the result, e.g. sed '' file replicates the cat command. The sed commands are usually placed between '...' and represent a cycle, thus:
1{x;s#^#sed -n 4p file2#e;x}
1{..} executes the commands between the braces on the first line of file1. Commands are separated by semicolons.
x sed provides two buffers. After removing the newline that delimits each line of a file, the result is placed in the pattern space. Another buffer, called the hold space, is provided empty at the start of each invocation. The x swaps the pattern space with the hold space.
s#^#sed -n 4p file2#e this inserts another sed invocation into the empty buffer that was swapped in from the hold space and evaluates it by use of the e flag. The second invocation turns off implicit printing (-n option) and then prints line 4 of file2 only.
x the hold space is now swapped with the pattern space. Thus, line 4 of file2 is placed in the hold space.
3{G;s/\n(\S+).*/ \1/}
3{..} executes the commands between the braces on the third line of file1.
G appends the contents of the hold space to the pattern space using a newline as a separator.
s/\n(\S+).*/ \1/ matches on the appended hold space and replaces it by a space and the first column.
4{G;s/\n\S+//}
4{..} executes the commands between the braces on the fourth line of file1.
G appends the contents of the hold space to the pattern space using a newline as a separator.
s/\n\S+// matches on the appended hold space and removes the newline and the first column, thus leaving a space and the second column.
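The hold-space mechanics described above can be tried in isolation. This is only a small sketch with made-up input lines, using h (copy to hold space) instead of the x/e combination:

```shell
# Line 1 is copied to the hold space (h); on line 3, G appends a newline
# plus the held text to the pattern space, and p prints the result.
held=$(printf '%s\n' first second third | sed -n '1h;3{G;p;}')
printf '%s\n' "$held"
```

The pattern space for line 3 becomes `third`, a newline, then `first`, which is the same state the s commands above operate on after G.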
Assuming you want to append the fields of the 4th line of file2.txt
to the 3rd and the following lines of file1.txt, how about:
awk 'FNR==NR {if (FNR==4) split($0, ary, " "); next} {print $0 " " ary[FNR - 3 + 1]}' /home/Desktop/file2.txt /home/Desktop/file1.txt
Result:
cat
hat
sat text1
fat text2
mat
rat
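Note that the one-liner above appends a space to every line of file1.txt, including those that receive no word. A possible variant that only touches the lines that get a word is sketched below; the temporary directory and the variable names are assumptions made for the example:

```shell
# Recreate the two sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' cat hat sat fat mat rat > "$dir/file1.txt"
printf '%s\n' line1 line2 line3 'text1 text2' line5 > "$dir/file2.txt"

# First pass (file2): remember the words of line 4.
# Second pass (file1): append word 1 to line 3, word 2 to line 4, and so on.
merged=$(awk '
    NR == FNR { if (FNR == 4) n = split($0, word); next }
    FNR >= 3 && FNR - 2 <= n { $0 = $0 " " word[FNR - 2] }
    { print }
' "$dir/file2.txt" "$dir/file1.txt")

printf '%s\n' "$merged"
rm -rf "$dir"
```

As with any `NR == FNR` script, this assumes file2.txt is not empty.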

paste a line with a pattern to the previous line

I have a file with structure
"name1";"surname1";23;44
"name2";"surname2
www.so.org/443";56;33
"name3";"surname3";223;4554
"name4";"surname5
surname#so.net";77;889
I need an output:
"name1";"surname1";23;44
"name2";"surname2 www.so.org/443";56;33
"name3";"surname3";223;4554
"name4";"surname5 surname#so.net";77;889
The pattern here is alphanum at the start of the line and not \". I would like to paste the line with this pattern to the line above.
Edit:
I am using Debian stable.
I have used sed, but since it is a stream editor I thought it could not paste a line onto the previous one (which is false).
sed -e 's/^[a-z:A-Z]/ /g' only helps me find the correct lines.
My second attempt was with a text editor. I opened the file in emacs, used M-x replace-regexp to find the corresponding lines with ^J[a-zA-Z], and replaced the match with nothing. It did the job, but it also deletes the first character, which I need to keep after a single space.
This awk one-liner should give you a hand:
awk '{printf "%s%s",(/^"/&&NR>1?RS:""),$0}END{print ""}' file
The key to the problem is deciding when to print the line break.
This one-liner even works for this format:
cat f
"name1";"surname1";23;44
"name2";"surname2
w
ww.
so.
org/
44
3";5
6;33
"name3";"surname3";223;4554
"name4";"surname5
surname#so.net";77;889
awk '{printf "%s%s",(/^"/&&NR>1?RS:""),$0}END{print ""}' f
"name1";"surname1";23;44
"name2";"surname2www.so.org/443";56;33
"name3";"surname3";223;4554
"name4";"surname5surname#so.net";77;889
This might work for you (GNU sed):
sed ':a;s/;/&/3;t;N;s/\n//;ta' file
If the current line does not contain 3 or more ;'s, append the next line, remove the introduced newline and repeat.
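Both commands above concatenate a continuation line directly onto the previous line. If a single space is wanted at each join, as in the desired output, one possible awk variant is sketched here on a shortened copy of the sample data:

```shell
# Print a newline before each line that starts a new record (leading "),
# a space before each continuation line, and nothing before line 1.
joined=$(printf '%s\n' \
    '"name1";"surname1";23;44' \
    '"name2";"surname2' \
    'www.so.org/443";56;33' |
  awk '{ printf "%s%s", (NR == 1 ? "" : (/^"/ ? "\n" : " ")), $0 }
       END { if (NR) print "" }')

printf '%s\n' "$joined"
```

With several continuation lines per record, every join gets its own space, so this suits input where each break replaced a space.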

Sed range and removing last matching line

I have this data:
One
  two
  three
Four
  five
  six
Seven
  eight
And this command:
sed -n '/^Four$/,/^[^[:blank:]]/p'
I get the following output:
Four
  five
  six
Seven
How can I change this sed expression to not match the final line of the output? So the ideal output should be:
Four
  five
  six
I've tried many things involving exclamation points but haven't managed to get close to getting this working.
Use a "do..while()" loop:
sed -n '/^Four$/{:a;p;n;/^[[:blank:]]/ba}'
details:
/^Four$/ {
:a # define the label "a"
p # print the pattern-space
n # load the next line in the pattern space
/^[[:blank:]]/ba # if the pattern succeeds, go to label "a"
}
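The same loop can also be spelled with one -e per command, which avoids relying on how a particular sed parses a label followed directly by a closing brace. This is a sketch that assumes the lowercase lines of the sample are indented with blanks:

```shell
# Print the "Four" line, then keep printing while the next line starts
# with a blank; the first non-blank-leading line ends the loop unprinted.
block=$(printf '%s\n' 'One' '  two' 'Four' '  five' '  six' 'Seven' '  eight' |
    sed -n -e '/^Four$/{' -e ':a' -e 'p' -e 'n' -e '/^[[:blank:]]/ba' -e '}')

printf '%s\n' "$block"
```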
You may pipe to another sed and skip last line:
sed -n '/^Four$/,/^[^[:blank:]]/p' file | sed '$d'
Four
  five
  six
Alternatively you may use:
sed -n '/^Four$/,/^[^[:blank:]]/{/^Four$/p; /^[^[:blank:]]/!p;}' file
You're using the wrong tool. sed is for doing s/old/new, that is all. Just use awk:
$ awk '/^[^[:blank:]]/{f=/^Four$/} f' file
Four
  five
  six
How it works: Every time it finds a line that doesn't start with a blank (/^[^[:blank:]]/) it sets a flag f (for "found") to 1 if that line is exactly Four and 0 otherwise (f=/^Four$/). Whenever f is non-zero, that is interpreted as a true condition and so invokes awk's default behavior, which is to print the current line. So when it hits a block starting with Four it prints every line in that block because f is 1/true, and for every other block it doesn't print since f is 0/false.
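If the block name should not be hard-coded in the regular expression, the same flag idea can take it from a variable. This is a sketch; the variable name hdr is made up, and the sample assumes the lowercase lines are indented:

```shell
# On every unindented line, set f to 1 when the line equals hdr, else 0;
# the bare pattern "f" prints lines while the flag is set.
section=$(printf '%s\n' 'One' '  two' 'Four' '  five' '  six' 'Seven' '  eight' |
    awk -v hdr='Four' '/^[^[:blank:]]/ { f = ($0 == hdr) } f')

printf '%s\n' "$section"
```

A string comparison is used instead of a regex match, so characters such as `.` in the header name are taken literally.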
Following awk may help you here.
awk '!/^ /{flag=""} /Four/{flag=1} flag' Input_file
Output will be as follows.
Four
  five
  six
Also, in case you need to save the output into Input_file itself, append > temp_file && mv temp_file Input_file to the above code.
grep -Pzo '\n\KFour\n(\s.+\n)+' input.txt
Output
Four
  five
  six
This might work for you (GNU sed):
sed '/^Four/{:a;n;/^\s/ba};d' file
If the line begins with Four print it and any following lines beginning with a space.
Another way:
sed '/^\S/h;G;/^Four/MP;d' file
If a line begins with a non-space, copy it to the hold space (HS). Append the HS to each line and if either line begins with Four print the first line and delete the rest. This will delete all lines other than the section beginning with Four.

How to strip date in csv output using shell script?

I have a few csv extracts that I am trying to fix up the date on, they are as follows:
"Time Stamp","DBUID"
2016-11-25T08:28:33.000-8:00,"5tSSMImFjIkT0FpiO16LuA"
The first column is always the "Time Stamp", I would like to convert this so it only keeps the date "2016-11-25" and drops the "T08:28:33.000-8:00".
The end result would be..
"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"
There are plenty of files with different dates.
Is there a way to do this in ksh? Some kind of for each loop to loop through all the files and replace the long time-stamp and leave just the date?
Use sed:
$ sed '2,$s/T[^,]*//' file
"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"
How it works:
2,$ # Skip header (first line) removing this will make a
# replacement on the first line as well.
s/T[^,]*// # Replace everything between T (inclusive) and , (exclusive)
# `[^,]*' Matches everything but `,' zero or more times
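The question also asks for a loop over many files. A minimal sketch of one, which should behave the same in ksh and other POSIX-style shells, is shown below. The scratch directory, the file names and the second extract are invented for the example:

```shell
# Scratch directory with two made-up extracts; real file names will differ.
workdir=$(mktemp -d)
printf '%s\n' '"Time Stamp","DBUID"' \
    '2016-11-25T08:28:33.000-8:00,"5tSSMImFjIkT0FpiO16LuA"' > "$workdir/a.csv"
printf '%s\n' '"Time Stamp","DBUID"' \
    '2017-01-02T23:59:59.000-8:00,"AbCdEfGh"' > "$workdir/b.csv"

for f in "$workdir"/*.csv; do
    [ -f "$f" ] || continue          # skip if the glob matched nothing
    # Write to a temporary file first; replace the original only on success.
    sed '2,$s/T[^,]*//' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

a_result=$(cat "$workdir/a.csv")
b_result=$(cat "$workdir/b.csv")
printf '%s\n' "$a_result" "$b_result"
rm -rf "$workdir"
```

Writing to a temporary file and renaming avoids depending on sed -i, which not every sed provides.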
Here's one solution using a standard AIX utility:
awk -F, -v OFS=, 'NR>1{sub(/T.*$/,"",$1)}1' file > file.cln && mv file.cln file
output
"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"
(I no longer have access to an AIX environment, so this is only tested with my local awk.)
NR>1 skips the header line, and the sub() is limited to the first field (up to the first comma). The trailing 1 is awk shorthand for {print $0}.
If your data layout changes and you get extra commas in your data, this may require fixing.
IHTH
Using sed:
sed -i "s/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\).*,/\1-\2-\3,/" file.csv
Output:
"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"
-i edit files inplace
s substitute
This is a perfect job for awk, but unlike the previous answer, I recommend using the substring function.
awk -F, -v OFS=, 'NR > 1{$1 = substr($1,1,10)} {print $0}' file.txt
Explanation
-F,: The -F flag sets the input field separator, in this case a comma
-v OFS=,: Sets the output field separator to a comma as well; without it, assigning to $1 makes awk rebuild the row with spaces between the fields
NR > 1: Ignore the first row
$1: Refers to the first field
$1 = substr($1,1,10): Sets the first field to the first 10 characters of the field. In the example, this is the date portion
print $0: This will print the entire row
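Assigning to a field makes awk rebuild the whole record using the output field separator (OFS), which defaults to a single space. A small sketch of the difference on the sample row, with illustrative variable names:

```shell
line='2016-11-25T08:28:33.000-8:00,"5tSSMImFjIkT0FpiO16LuA"'

# Only the input separator is set: the rebuilt row is joined with a space.
without_ofs=$(printf '%s\n' "$line" | awk -F, '{ $1 = substr($1, 1, 10) } 1')

# Input and output separators both set: the comma survives.
with_ofs=$(printf '%s\n' "$line" | awk -F, -v OFS=, '{ $1 = substr($1, 1, 10) } 1')

printf '%s\n' "$without_ofs" "$with_ofs"
```

The first result is `2016-11-25 "5tSSMImFjIkT0FpiO16LuA"` and the second is `2016-11-25,"5tSSMImFjIkT0FpiO16LuA"`.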

Merge two blank lines into one

I am looking for a way to turn file A into file B, which requires merging two blank lines into one.
File-A:
// Comment 1
// Comment 2
// Comment 3
// Comment 4
// Comment 5
File-B:
// Comment 1
// Comment 2
// Comment 3
// Comment 4
// Comment 5
From this post, I know how to delete empty lines, I am wondering how to merge two consecutive blank lines into one.
PS: blank means that it could be empty OR there might be a tab or a space in the line.
sed -r 's/^\s+$//' infile | cat -s > outfile
sed removes any whitespace on a blank line. The -s option to cat squeezes consecutive blank lines into one.
This might work for you (GNU sed):
sed '$!N;s/^\s*\n\s*$//;P;D' file
This will convert 2 blank lines into one.
If you want to replace multiple blank lines with one:
sed ':a;$!N;s/^\s*\n\s*$//;ta;P;D' file
On reflection a far simpler solution is:
sed ':a;N;s/\n\s*$//;ta' file
Which squeezes one or more blank lines to a single blank line.
An even easier solution uses the range condition:
sed '/\S/,/^\s*$/!d' file
This deletes any blank lines other than those following a non-blank line.
Here is a simple solution with awk:
awk '!NF && !a++; NF {print;a=0}' file
// Comment 1
// Comment 2
// Comment 3
// Comment 4
// Comment 5
NF counts the number of fields; note that a line composed entirely of spaces and tabs counts as a blank line, too.
a is used to count blank lines, and if it's more than 1, skip it.
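The one-liner above prints the first blank line of each run unchanged, so a line holding only a space or a tab keeps that whitespace. If a truly empty line is preferred, one possible variant is sketched here on made-up input; the counter name blank is arbitrary:

```shell
# Non-blank lines are printed and reset the counter; for each run of
# blank lines only the first one triggers output, as an empty line.
squeezed=$(printf '// Comment 1\n \n\n// Comment 2\n\t\n// Comment 3\n' |
    awk 'NF { print; blank = 0; next } !blank++ { print "" }')

printf '%s\n' "$squeezed"
```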
This page might come in handy. TL;DR as follows:
# delete all CONSECUTIVE blank lines from file except the first; also
# deletes all blank lines from top and end of file (emulates "cat -s")
sed '/./,/^$/!d' # method 1, allows 0 blanks at top, 1 at EOF
sed '/^$/N;/\n$/D' # method 2, allows 1 blank at top, 0 at EOF
This should work:
sed 'N;s/^\([[:space:]]*\)\n\([[:space:]]*\)$/\1\2/;P;D' file
awk -v RS='([[:blank:]]*\n){2,}' -v ORS="\n\n" 1 file
I had hoped to produce a shorter Perl version, but Perl does not use regular expressions for its record separator.
awk does not edit in-place. You would have to do this:
awk -v RS='([[:blank:]]*\n){2,}' -v ORS="\n\n" 1 file > tmp && mv tmp file
