Modify a line below a specific line - shell

I have a big file like this small example:
>ENSG00000002587|ENST00000002596
ATGGCCGCGCTGCTCCTGGGCGCGGTGCTGCTGGTGGCCCAGCCCCAGCTAGTGCCTTCC
>ENSG00000004059|ENST00000000233
ATGGGCCTCACCGTGTCCGCGCTCTTTTCGCGGATCTTCGGGAAGAAGCAGATGCGGATT
>ENSG00000003249|ENST00000002501
ATGGAGCCCCCGGAGGGCGCCGGCACCGGAGAGATCGTTAAGGAGGCTGAGGTGCCGCAG
GCTGCGCTGGGCGTCCCAGCCCAGGGGACAGGGGACAATGGCCACACGCCTGTGGAGGAG
>ENSG00000048028|ENST00000003302
ATGACTGCGGAGCTGCAGCAGGACGACGCGGCCGGCGCGGCAGACGGCCACGGCTCGAGC
TGCCAAATGCTGTTAAATCAACTGAGAGAAATCACAGGCATTCAGGACCCTTCCTTTCTC
CATGAAGCTCTGAAGGCCAGTAATGGTGACATTACTCAGGCAGTCAGCCTTCTCACTGAT
I want to remove the first 5 characters of every line that comes directly below a line starting with >.
I do not know how to do that on the command line. Do you know?
Here is the expected output:
>ENSG00000002587|ENST00000002596
CGCGCTGCTCCTGGGCGCGGTGCTGCTGGTGGCCCAGCCCCAGCTAGTGCCTTCC
>ENSG00000004059|ENST00000000233
CCTCACCGTGTCCGCGCTCTTTTCGCGGATCTTCGGGAAGAAGCAGATGCGGATT
>ENSG00000003249|ENST00000002501
GCCCCCGGAGGGCGCCGGCACCGGAGAGATCGTTAAGGAGGCTGAGGTGCCGCAG
GCTGCGCTGGGCGTCCCAGCCCAGGGGACAGGGGACAATGGCCACACGCCTGTGGAGGAG
>ENSG00000048028|ENST00000003302
TGCGGAGCTGCAGCAGGACGACGCGGCCGGCGCGGCAGACGGCCACGGCTCGAGC
TGCCAAATGCTGTTAAATCAACTGAGAGAAATCACAGGCATTCAGGACCCTTCCTTTCTC
CATGAAGCTCTGAAGGCCAGTAATGGTGACATTACTCAGGCAGTCAGCCTTCTCACTGAT

sed -E '/^>/{N;s/\n.{5}/\n/}' file
How it works:
/^>/ finds a line starting with >
N joins that line with the next one
s/\n.{5}/\n/ replaces the newline and the five characters after it with just a newline
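For cases where the sed in use does not accept \n in the replacement, the same idea can be sketched in awk. This is only an illustration under two assumptions: header lines always start with >, and only the first line after each header should be trimmed. The sample records and the flag name trim are made up for the example.

```shell
# Shortened sample input: two records, the first with two sequence lines.
input=$(printf '%s\n' '>id1' 'ATGGCCGCGC' 'TTTTTTTTTT' '>id2' 'ATGGGCCTCA')

# trim is set on a header line and cleared after the next line is shortened.
# substr($0, 6) keeps everything from the 6th character onwards.
result=$(printf '%s\n' "$input" |
  awk '/^>/ { print; trim = 1; next }
       trim  { $0 = substr($0, 6); trim = 0 }
       { print }')

printf '%s\n' "$result"
```

The second sequence line of the first record is printed unchanged, because the flag is cleared as soon as one line has been trimmed.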

Related

How to detect a blank line between filled lines in .txt file and convert it to a single tab

I am running a bash/.dat script (Mac terminal) and part of it is converting each line return into a TAB (to get it ready for nicely importing into Excel). The problem is that I also want to remove all extra blank lines except a single blank line when it comes between two filled lines. So...
Line pre-A is blank
Line A has text
Line B has text
Line C is blank
Line D has text
Line E is blank
Line F is blank
Line C above would become a TAB and Line E and F (and pre-A) would be deleted. Also, sometimes there is a blank line before Line A (labelled Line pre-A above), so I'd want it removed but not replaced with a TAB.
So the result would be:
Line A text [TAB] Line B text [TAB] [TAB] Line D text
...and it'd be OK if Line D text was followed by a [TAB]. Make sense? Is this doable and, if so, how?
Thanks!
If perl is your option, would you please try:
perl -0777pe 's/^\n+//; s/\n{3,}/\t/g; s/\n/\t/g' file.txt
The -0777 option tells perl to slurp the whole file at once, so the
substitutions can see the newline characters between lines.
The -p option loops over the input and prints the result; -e supplies the code on the command line.
The first substitution s/^\n+// removes the leading blank line(s).
The next s/\n{3,}/\t/g converts three or more consecutive newline
characters (meaning two or more blank lines) into a tab character.
The last s/\n/\t/g converts the newline characters into the same number
of tab characters.

How to append a string to a file from the nth line onwards in shell script?

I want to add a string to the end of each line of the file, starting from the second line.
This is my file budget.txt:
id,budget
d4385ff7-247f-407a-97c6-366d8128c6c7,
50548d0a-257c-44f5-b175-2e7efa53dc35,
e15965cf-ffc1-40ae-94c4-b450ab190233,
b9286b97-2575-4c98-bd24-1393d5309e76,
The output I am expecting is below. I want to add the string 'True' at the end of every line from the second line onwards.
id,budget
d4385ff7-247f-407a-97c6-366d8128c6c7,True
50548d0a-257c-44f5-b175-2e7efa53dc35,True
e15965cf-ffc1-40ae-94c4-b450ab190233,True
b9286b97-2575-4c98-bd24-1393d5309e76,True
What would be the shortest bash command?
thank you so much
appreciate any help
If the file was created on Windows it may have CRLF line endings, in which case True would end up after the carriage return. If so, run dos2unix budget.txt on it before running the commands below.
awk 'NR>1{$0=$0"True"}1' file
id,budget
d4385ff7-247f-407a-97c6-366d8128c6c7,True
50548d0a-257c-44f5-b175-2e7efa53dc35,True
e15965cf-ffc1-40ae-94c4-b450ab190233,True
b9286b97-2575-4c98-bd24-1393d5309e76,True
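Here, NR is the number of the current record, and by default an awk record is a line. So NR>1 tells awk to perform the action inside {..} only on lines numbered greater than 1. The trailing 1 is an always-true condition that prints every line.
Or use sed; here 2,$ restricts the substitution to line 2 through the last line, and s/$/True/ appends True at the end of each line: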
sed '2,$s/$/True/' file
id,budget
d4385ff7-247f-407a-97c6-366d8128c6c7,True
50548d0a-257c-44f5-b175-2e7efa53dc35,True
e15965cf-ffc1-40ae-94c4-b450ab190233,True
b9286b97-2575-4c98-bd24-1393d5309e76,True
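If dos2unix is not installed, one possible workaround is to strip a trailing carriage return inside awk before appending. This is a sketch with shortened, made-up ids, not a replacement for checking the file's line endings.

```shell
# Sample data with Windows (CRLF) line endings; the ids are placeholders.
result=$(printf 'id,budget\r\nabc,\r\ndef,\r\n' |
  awk '{ sub(/\r$/, "") }          # drop a trailing CR if there is one
       NR > 1 { $0 = $0 "True" }   # append only from the second line on
       { print }')

printf '%s\n' "$result"
```

The output has plain LF line endings, so the result is also converted to Unix format as a side effect.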

How do I delete all lines from a file after (and including) a line that contains a defined string in a Bash script?

I'm hacking about a text file in the middle of a Bash script (on an RPI3B+ with OSMC installed) and trying to crop a file at the first line that contains the text "BLAH DE BLAH" (deleting everything in the same file after and including the first line it finds that text on).
For example (in the file filename.text):
This is the first line
This is the second line
This is the third line containing "BLAH DE BLAH"
This is the fourth line
This is the fifth line
Required output (in the file filename.text):
This is the first line
This is the second line
I've tried to investigate awk and sed related posts, but I'm finding it all so confusing as I can't find anything that does exactly what I need (some split at certain line numbers, some from the command line not a bash script, some before and after certain strings)... and I'm stuck. As you can see, I can't even work out how to format this post properly (my head hurts so much)!
Any help appreciated - thanks!
Looks like
sed '/BLAH DE BLAH/Q'
would do the job in GNU sed.
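The command above only prints the truncated text, while the question asks for the file itself to be changed. A portable sketch that avoids GNU-only options writes to a temporary file and moves it over the original. The scratch directory and the .tmp suffix are choices made for this example only.

```shell
# Work in a scratch directory so nothing real is overwritten.
dir=$(mktemp -d)
file="$dir/filename.text"
printf '%s\n' 'This is the first line' 'This is the second line' \
  'This is the third line containing "BLAH DE BLAH"' \
  'This is the fourth line' > "$file"

# Print lines until the first match, then stop reading.
# The original is replaced only if awk succeeded.
awk '/BLAH DE BLAH/ { exit } { print }' "$file" > "$file.tmp" &&
  mv "$file.tmp" "$file"

cat "$file"
```

If the string never occurs, every line is printed and the file ends up unchanged.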

Output of for loop in bash

I am running a for loop inside a while loop.
File passed as parameter has the contents:
peter
roger
casie
I am trying to create a path to test existence of files a,b,c,d,e
I am expecting the output to be
/peter/a
/peter/b
and so on.
Instead I am getting
/aeter
/beter
etc.
What do I need to understand here? Please find the code below -
CODE:
while read fileLine; do
x=$fileLine
for i in a b c d e
do
echo /$x/$i
done
done < $1
Apparently your input file uses the Windows end-of-line format of \r\n. The read removes the \n but leaves the \r. When the string /$x/$i is printed, the \r at the end of the x string returns the "carriage" to the beginning of the line, so the following slash is printed over the slash before x and the letter from i is printed over the first letter of x.
You may be able to fix it by replacing your x=$fileLine line with
x=${fileLine%?}
which should remove the last character.
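Note that ${fileLine%?} removes the last character whatever it is, so it would also shorten lines that have no \r. A slightly more careful sketch strips only a trailing carriage return. The variable name cr and the shortened input are assumptions made for this example.

```shell
# A literal carriage return, built without relying on bash-only syntax.
cr=$(printf '\r')

result=$(printf 'peter\r\nroger\r\n' |
  while IFS= read -r fileLine; do
    x=${fileLine%"$cr"}   # strip one trailing CR if present, otherwise keep the line as is
    for i in a b; do
      printf '/%s/%s\n' "$x" "$i"
    done
  done)

printf '%s\n' "$result"
```

Using IFS= and read -r keeps leading whitespace and backslashes in the names intact.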

Going to a specific line number using Less in Unix

I have a file that has around a million lines. I need to go to line number 320123 to check the data. How do I do that?
With n being the line number:
ng: Jump to line number n. Default is the start of the file.
nG: Jump to line number n. Default is the end of the file.
So to go to line number 320123, you would type 320123g.
Copy-pasted straight from Wikipedia.
To open at a specific line straight from the command line, use:
less +320123 filename
If you want to see the line numbers too:
less +320123 -N filename
You can also choose to display a specific line of the file at a specific line of the terminal, for when you need a few lines of context. For example, this will open the file with line 320123 on the 10th line of the terminal:
less +320123 -j 10 filename
You can use sed for this too -
sed -n '320123'p filename
This will print line number 320123.
If you want a range then you can do -
sed -n '320123,320150'p filename
If you want from a particular line to the very end then -
sed -n '320123,$'p filename
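On a very large file, the single-line and range forms above keep reading to the end of the file even after the wanted lines have been printed. Adding q makes sed stop early. The sketch below uses a five-line stand-in for the big file, and the variable names are made up for the example.

```shell
# Tiny stand-in for the large file.
lines=$(printf '%s\n' one two three four five)

# Print line 3 and quit immediately, so the remaining input is never read.
single=$(printf '%s\n' "$lines" | sed -n '3{p;q;}')

# Print lines 2 to 4, then quit once line 4 has been printed.
range=$(printf '%s\n' "$lines" | sed -n '2,4p;4q')

printf '%s\n' "$single" "$range"
```

For the file in the question, the equivalent would be sed -n '320123{p;q;}' filename.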
From within less (in Linux):
the line number followed by g to go to that line
the line number followed by G to do the same
Used alone, g and G will take you to the first and last line in a file respectively; used with a number they are both equivalent.
An example; you want to go to line 320123 of a file,
type in the number 320123 (it appears after the colon) and then press 'g'
Additionally you can type '-N' inside less to activate / deactivate the line numbers. You can as a matter of fact pass any command line switches from inside the program, such as -j or -N.
NOTE: You can provide the line number in the command line to start less (less +number -N) which will be much faster than doing it from inside the program:
less +12345 -N /var/log/hugelogfile
This will open a file displaying the line numbers and starting at line 12345
Source: man 1 less and built-in help in less (less 418)
For editing this is possible in nano via +n from command line, e.g.,
nano +16 file.txt
To open file.txt to line 16.
