Remove first character of a text file from shell - bash

I have a text file and I would like to delete only its first character. Is there a way to do this in a shell script?
I'm new to writing scripts so I really don't know where to start. I understand that the main command most people use is "sed", but I can only find how to use that as a find-and-replace tool.
All help is appreciated.

You can use the tail command, telling it to start from character 2:
tail -c +2 infile > outfile
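To check that tail -c +2 really counts bytes from 1, you can try it on a throwaway file. This is only a minimal sketch for a POSIX shell; the helper name drop_first_byte is made up for illustration.

```shell
# Hypothetical helper: print a file without its first byte.
# tail -c +N starts output at byte N (1-based), so +2 skips exactly one byte.
drop_first_byte() {
    tail -c +2 "$1"
}

tmp=$(mktemp)
printf 'Xhello\nworld\n' > "$tmp"
drop_first_byte "$tmp"      # prints "hello" and "world" on separate lines
rm -f "$tmp"
```

Since tail -c counts bytes rather than characters, a multi-byte first character (for example in UTF-8) would be cut partway through.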

You can use sed
sed '1s/^.//' startfile > endfile
1s means match line 1, in substitution mode (s)
^. means at the beginning of the line (^), match any character (.)
There's nothing between the last slashes, which means substitute with nothing (remove)

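I used to use the cut command for this.
For example:
cut -c2-80 file
shows only columns 2 to 80 of each line.
In your case you can use:
cut -c2- file > newfile
Note that cut works line by line, so this removes the first character of every line, not just the first character of the file.
I hope this helps.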
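The difference between working per line and working on the first line only is easy to see on a two-line file. A small sketch, with function names invented for the comparison:

```shell
# Hypothetical helpers for comparing the two behaviours.
cut_each_line()   { cut -c2- "$1"; }        # drops column 1 of EVERY line
sed_first_line()  { sed '1s/^.//' "$1"; }   # drops column 1 of line 1 only

tmp=$(mktemp)
printf 'abc\ndef\n' > "$tmp"
cut_each_line "$tmp"     # bc, ef
sed_first_line "$tmp"    # bc, def
rm -f "$tmp"
```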

With GNU sed you can also use the 0,addr2 address range to limit the substitution to the first match, e.g.
sed '0,/./s/^.//' file
The range ends at the first line that contains a character, so the substitution removes that line's first character and nothing later in the file is touched.
To edit the file in place, use the -i option, e.g.
sed -i '0,/./s/^.//' file
or simply redirect the output to a new file:
sed '0,/./s/^.//' file > newfile

A few other ideas:
awk '{print (NR == 1 ? substr($0,2) : $0)}' file
perl -0777 -pe 's/.//' file
perl -pe 's/.// unless $done; $done = 1' file
ed file <<END
1s/.//
w
q
END

dd allows you to specify an offset at which to start reading:
dd ibs=1 skip=1 if="$input" of="$output"
(where the variables are set to point to your input and output files, respectively)
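With dd, skip= is the operand that skips blocks of the input, while seek= positions the output. A minimal sketch of the input-skipping form, wrapped in a helper whose name is made up here:

```shell
# Hypothetical helper: copy file $1 to file $2 without the first byte.
# ibs=1 sets the input block size to one byte; skip=1 skips one input block.
# dd writes its transfer statistics to stderr, so they are discarded here.
dd_drop_first_byte() {
    dd ibs=1 skip=1 if="$1" of="$2" 2>/dev/null
}

in=$(mktemp)
out=$(mktemp)
printf 'Xhello\n' > "$in"
dd_drop_first_byte "$in" "$out"
cat "$out"                   # prints "hello"
rm -f "$in" "$out"
```

Reading with a one-byte block size is slow on large files, so this is probably best kept for small inputs.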

Related

Update a csv file using bash

I have a csv file, with student name and marks. I want to update "marks" of a student with name "jack"(the only person in the csv). the data in csv file looks as below.
student,marks
jack,10
peter,20
rick,10
I found this awk '$1 == "Audrey" {print $2}' numbers.txt, but I am not sure how to modify the file.
awk 'BEGIN{FS=OFS=","} $1=="jack"{$2=27} 1' foo.csv > tmp && mv tmp foo.csv
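If the name and the new mark come from shell variables, awk's -v option passes them in without any quoting tricks. A sketch under that assumption; the function name set_mark and its argument order are invented for this example:

```shell
# Hypothetical helper: set the mark for one student in a two-column CSV.
# Usage: set_mark FILE NAME MARK
set_mark() {
    file=$1 name=$2 mark=$3
    tmp=$(mktemp) || return 1
    # -v copies shell values into awk variables before the program runs.
    # The trailing 1 is an always-true pattern, so every line is printed.
    awk -v name="$name" -v mark="$mark" \
        'BEGIN { FS = OFS = "," } $1 == name { $2 = mark } 1' "$file" > "$tmp" &&
        mv "$tmp" "$file"
}
```

For example, set_mark foo.csv jack 27 would rewrite only the jack row, because $1 == name is an exact comparison and does not match jackson.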
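It worked for me with
sed -ir "s/^\(jack\),.*/\1,$new_grade/" input.csv
with the "r" after -i; without it I get the error sed: 1: "input.csv": command i expects \ followed by text. (Note that sed parses -ir as -i with the backup suffix r, so the original file is kept as input.csvr.)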
ed is usually better for in-place editing of files than sed:
printf "%s\n" "/^jack,/c" "jack,${new_grade}" "." w | ed -s input.csv
or using a heredoc to make it easier to read:
ed -s input.csv <<EOF
/^jack,/c
jack,${new_grade}
.
w
EOF
At the first line starting with jack,, change it to jack,XX where XX is the value of the new_grade variable, and write the new contents of the file.
You could use sed:
new_grade=9
sed -i'' "s/^\(jack\),.*/\1,$new_grade/" grades.txt
The pattern ^\(jack\),.* matches the beginning of the line ^, followed by jack, a comma, and the rest of the line .*. The replacement string \1,$new_grade contains the first captured group \1 (in this case jack) followed by a comma and the new grade.
Alternatively you could loop over the file and use a pattern substitution:
new_grade=9
while IFS= read -r line; do
echo "${line/jack,*/jack,$new_grade}"
done < grades.txt > grades2.txt
Another approach with sed is to anchor the replacement to the digits at the end of the line with:
sed '/^jack,/s/[0-9][0-9]*$/12/' file
This uses the form sed '/find/s/match/replace' where find locates at the beginning of the line '^' the word "jack," eliminating all ambiguity with, e.g. jackson,33. Then the normal substitution form of 's/match/replace/' where match locates at least one digit at the end of the line (anchored by '$') and replaces it with the 12 (or whatever you choose).
Example Use/Output
With your example file in file, you would have:
$ sed '/^jack,/s/[0-9][0-9]*$/12/' file
student,marks
jack,12
peter,20
rick,10
(note: the POSIX character class of [[:digit:]] is equivalent to [0-9] which is another alternative)
The equivalent expression using the POSIX character class would be:
sed '/^jack,/s/[[:digit:]][[:digit:]]*$/12/' file
You can also use Extended Regular Expression which provides the '+' repetition operator to indicate one-or-more compared to the basic repetition designator of '*' to indicate zero-or-more. The equivalent ERE would be sed -E '/^jack,/s/[0-9]+$/12/' file
You can add the -i option to edit in-place and/or using it as -i.bak to create a backup of the original with the .bak extension before modifying the original.

Bash script delete a line in the file

I have a file, which has multiple lines.
For example:
a
ab#
ad.
a12fs
b
c
...
I want to use sed or awk to delete a line if it includes symbols or numbers. (For example, I want to delete the ab#, ad., a12fs ... lines.)
In other words, I just want to keep the lines that contain only letters ([a-z] or [A-Z]).
I know how to delete lines with numbers,
sed '/[0-9]/d' file.txt
but I do not know how to delete lines with symbols.
Is there an easy way to do that?
To keep blank lines:
grep '^[[:alpha:]]*$' file
sed '/[^[:alpha:]]/d' file
awk '/^[[:alpha:]]*$/' file
To remove blank lines:
grep -E '^[[:alpha:]]+$' file
sed -E -n '/^[[:alpha:]]+$/p' file
awk '/^[[:alpha:]]+$/' file
grep works well too and is even simpler. Just do the reverse: keep the lines that interest you, which are much easier to define.
grep -i '^[a-z]*$' file.txt
(This matches lines containing only letters, plus empty lines; the -i option makes grep case-insensitive.)
To remove empty lines as well:
grep -iE '^[a-z]+$' file.txt
Be careful with Windows text files: each line ends with a carriage return, so depending on the grep version nothing may match (I tested on Windows and it works there).
But just in case:
grep -iP '^[a-z]*\r?$' file.txt
(Note the P option to enable Perl expressions; without it \r is not recognized.)
You can use this sed (letters only, since lines with digits should be deleted too):
sed '/^[A-Za-z]\+$/!d' file
(OR)
sed '/[^A-Za-z]/d' file
$ awk '!/[^[:alpha:]]/' file.txt
a
b
c
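Putting the sample data from the question through the extended-regex form makes the result easy to check. A small sketch; the function name letters_only is made up:

```shell
# Hypothetical helper: print only lines made up entirely of letters.
# -E enables extended regular expressions, where + means "one or more".
# In a basic regular expression a bare + is just a literal plus sign.
letters_only() {
    grep -E '^[[:alpha:]]+$' "$@"
}

printf 'a\nab#\nad.\na12fs\nb\nc\n' | letters_only    # prints a, b, c
```

With no file arguments the helper reads standard input, so it can sit in a pipeline.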

How to replace an entire sentence with a space in shell script

I am new to this platform. Just had a requirement which I have been working over sometime but not able to find it.
What if this pattern occurs in the middle of a line? Suppose the line is aaaa ---- bbbb. How do I erase the ---- bbbb part but keep the aaaa part as it is in the file?
Thanks
You can do it easily with sed:
sed -r 's/^--.*//' inputfile > outputfile
Or in place:
sed -r -i.bak 's/^--.*//' inputfile
This will create an inputfile.bak as a backup before modifying the file
Here is a good old bash solution:
while read -r line; do
echo "${line/#--*/}"
done < inputFile > outputFile
One way using awk:
awk '/^--/{$0=" ";}1' file
This will replace the line with a space when it begins with --
It's not clear from your problem statement what the criteria (limitations) of the solution are.
What you are looking for is something that will support regular expressions. There are a lot of UNIX/Linux tools that can be used to solve this problem.
One simple solution is:
# cat file.txt | sed -e "{s/^--.*/ /}"
The regular expression "^--.*" matches any line that begins ("^") with "--", followed
by any number of characters (".*"). "s" is the sed substitution command,
so "s/^--.*/ /" means: replace every line that starts with -- and is followed by any
number of characters with a single space.
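For the case raised in the question, where the dashes appear in the middle of a line such as aaaa ---- bbbb, dropping the ^ anchor lets the match start anywhere. A possible sketch, assuming everything from the first -- onward (plus any whitespace just before it) should go; the function name is invented:

```shell
# Hypothetical helper: cut each line at the first "--", also removing
# any whitespace immediately before the dashes.
strip_from_dashes() {
    sed 's/[[:space:]]*--.*$//' "$@"
}

printf 'aaaa ---- bbbb\n' | strip_from_dashes    # prints "aaaa"
```

Lines without -- pass through unchanged, and a line that starts with -- becomes empty.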

Delete all lines beginning with a # from a file

All of the lines with comments in a file begin with #. How can I delete all of the lines (and only those lines) which begin with #? Other lines containing #, but not at the beginning of the line should be ignored.
This can be done with a sed one-liner:
sed '/^#/d'
This says, "find all lines that start with # and delete them, leaving everything else."
I'm a little surprised nobody has suggested the most obvious solution:
grep -v '^#' filename
This solves the problem as stated.
But note that a common convention is for everything from a # to the end of a line to be treated as a comment:
sed 's/#.*$//' filename
though that treats, for example, a # character within a string literal as the beginning of a comment (which may or may not be relevant for your case) (and it leaves empty lines).
A line starting with arbitrary whitespace followed by # might also be treated as a comment:
grep -v '^ *#' filename
if whitespace is only spaces, or
grep -v '^[  ]*#' filename
where the two spaces are actually a space followed by a literal tab character (type "control-v tab").
For all these commands, omit the filename argument to read from standard input (e.g., as part of a pipe).
The opposite of Raymond's solution:
sed -n '/^#/!p'
"don't print anything, except for lines that DON'T start with #"
You can edit your file directly with
sed -i '/^#/ d' filename
If you also want to delete comment lines that start with some whitespace, use
sed -i '/^\s*#/ d' filename
Usually you want to keep the first line of your script if it is a shebang, so sed should not delete lines starting with #!. It should also delete lines that contain only a hash and no text. Putting it all together:
sed -i '/^\s*\(#[^!].*\|#$\)/d' filename
To be compatible with all sed variants that support -i, add a backup extension to the -i option:
sed -i.bak '/^\s*#/ d' "$file"
rm -f "$file.bak"
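If you would rather not depend on GNU extensions such as \s and \|, the same idea can be sketched in awk. This is an alternative technique, not the sed command above: it keeps a #! line only when it is the very first line, and the function name strip_comments is made up.

```shell
# Hypothetical helper: drop comment-only lines but keep a leading shebang.
strip_comments() {
    awk 'NR == 1 && /^#!/ { print; next }   # keep an interpreter line
         /^[ \t]*#/       { next }          # drop lines that are only a comment
         { print }' "$@"
}

printf '#!/bin/sh\n# a comment\necho hi # trailing\n' | strip_comments
```

Trailing comments after code are left alone, since only lines whose first non-blank character is # are removed.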
You can use the following for an awk solution:
awk '!/^#/' inputfile
This answer builds upon the earlier answer by Keith.
grep -E -v "^[[:blank:]]*#" should filter out comment lines.
grep -E -v "^[[:blank:]]*(#|$)" should filter out both comments and empty lines, as is frequently useful.
For information about [:blank:] and other character classes, refer to https://en.wikipedia.org/wiki/Regular_expression#Character_classes.
If you want to delete from the file starting with a specific word, then do this:
grep -v '^pattern' currentFileName > newFileName && mv newFileName currentFileName
This removes all the lines starting with the pattern by writing the remaining content into a new file and then moving that file over the source/current file.
You also might want to remove empty lines as well
sed -E '/(^$|^#)/d' inputfile
Delete all empty lines and also all lines starting with a # after any spaces:
sed -E '/^$|^\s*#/d' inputfile
For example, see the following 3 deleted lines (including just line numbers!):
1. # first comment
2.
3. # second comment
After testing the command above, you can use option -i to edit the input file in place.
Just this!
Here it is with a loop over all files with a given extension:
for i in *.filename_extension # let the shell expand the glob instead of parsing ls output
do
echo "$i"
sed -i '/^#/d' "$i"
done

How to get the part of a file after the first line that matches a regular expression

I have a file with about 1000 lines. I want the part of my file after the line which matches my grep statement.
That is:
cat file | grep 'TERMINATE' # It is found on line 534
So, I want the file from line 535 to line 1000 for further processing.
How can I do that?
The following will print the line matching TERMINATE till the end of the file:
sed -n -e '/TERMINATE/,$p'
Explained: -n disables sed's default behavior of printing each line after executing its script on it, -e indicates a script to sed, /TERMINATE/,$ is an address (line) range selection meaning from the first line matching the TERMINATE regular expression (like grep) to the end of the file ($), and p is the print command, which prints the current line.
This will print from the line that follows the line matching TERMINATE till the end of the file:
(from AFTER the matching line to EOF, NOT including the matching line)
sed -e '1,/TERMINATE/d'
Explained: 1,/TERMINATE/ is an address (line) range selection meaning from the first line of the input to the first line matching the TERMINATE regular expression, and d is the delete command, which deletes the current line and skips to the next line. Since sed's default behavior is to print lines, it will print the lines after TERMINATE to the end of the input.
If you want the lines before TERMINATE:
sed -e '/TERMINATE/,$d'
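The three range forms can be tried side by side on a tiny input. A sketch with helper names invented for the comparison:

```shell
# Hypothetical helpers wrapping the three sed range forms.
from_match()   { sed -n '/TERMINATE/,$p' "$@"; }   # matching line through EOF
after_match()  { sed '1,/TERMINATE/d' "$@"; }      # lines after the first match
before_match() { sed '/TERMINATE/,$d' "$@"; }      # lines before the first match

printf 'a\nb\nTERMINATE\nc\nd\n' | from_match      # TERMINATE, c, d
printf 'a\nb\nTERMINATE\nc\nd\n' | after_match     # c, d
printf 'a\nb\nTERMINATE\nc\nd\n' | before_match    # a, b
```

One caveat with the 1,/TERMINATE/ form: sed only starts looking for the end of the range on line 2, so if TERMINATE is on the very first line the deletion runs on to the next match, or to the end of the input if there is none.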
And if you want both lines before and after TERMINATE in two different files in a single pass:
sed -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file
The before and after files will contain the line with terminate, so to process each you need to use:
head -n -1 before
tail -n +2 after
If you do not want to hard-code the filenames in the sed script, you can:
before=before.txt
after=after.txt
sed -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" file
But then you have to escape the $ meaning the last line so the shell will not try to expand the $w variable (note that we now use double quotes around the script instead of single quotes).
Note that the newline after each filename in the script is important, so that sed knows where the filename ends.
How would you replace the hardcoded TERMINATE by a variable?
You would make a variable for the matching text and then do it the same way as the previous example:
matchtext=TERMINATE
before=before.txt
after=after.txt
sed -e "1,/$matchtext/w $before
/$matchtext/,\$w $after" file
To use a variable for the matching text with the previous examples:
## Print the line containing the matching text, till the end of the file:
## (from the matching line to EOF, including the matching line)
matchtext=TERMINATE
sed -n -e "/$matchtext/,\$p"
## Print from the line that follows the line containing the
## matching text, till the end of the file:
## (from AFTER the matching line to EOF, NOT including the matching line)
matchtext=TERMINATE
sed -e "1,/$matchtext/d"
## Print all the lines before the line containing the matching text:
## (from line-1 to BEFORE the matching line, NOT including the matching line)
matchtext=TERMINATE
sed -e "/$matchtext/,\$d"
The important points about replacing text with variables in these cases are:
Variables ($variablename) enclosed in single quotes ['] won't "expand" but variables inside double quotes ["] will. So, you have to change all the single quotes to double quotes if they contain text you want to replace with a variable.
The sed ranges also contain a $ and are immediately followed by a letter like: $p, $d, $w. They will also look like variables to be expanded, so you have to escape those $ characters with a backslash [\] like: \$p, \$d, \$w.
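Another way to handle the quoting, shown here only as a sketch, is to double-quote just the variable and leave the rest of the script single-quoted, so the $ in the range needs no backslash. The function name from_pattern is made up, and the pattern is assumed not to contain a / character.

```shell
# Hypothetical helper: print from the first line matching REGEX to EOF.
# Usage: from_pattern REGEX [FILE...]
from_pattern() {
    pattern=$1
    shift
    # Three adjacent pieces: '/'  "$pattern"  '/,$p'
    # Only the middle piece is expanded by the shell.
    sed -n '/'"$pattern"'/,$p' "$@"
}

printf 'a\nSTOP\nb\n' | from_pattern STOP    # prints STOP, b
```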
As a simple approximation you could use
grep -A100000 TERMINATE file
which greps for TERMINATE and outputs up to 100,000 lines following that line.
From the man page:
-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines.
Places a line containing a group separator (--) between
contiguous groups of matches. With the -o or --only-matching
option, this has no effect and a warning is given.
A tool to use here is AWK:
cat file | awk 'BEGIN{ found=0} /TERMINATE/{found=1} {if (found) print }'
How does this work:
We set the variable 'found' to zero, evaluating false
if a match for 'TERMINATE' is found with the regular expression, we set it to one.
If our 'found' variable evaluates to True, print :)
The other solutions might consume a lot of memory if you use them on very large files.
If I understand your question correctly you do want the lines after TERMINATE, not including the TERMINATE-line. AWK can do this in a simple way:
awk '{if(found) print} /TERMINATE/{found=1}' your_file
Explanation:
Although it is not best practice, you can rely on the fact that all variables default to 0 or the empty string if not defined. So the first expression (if(found) print) will not print anything to start with.
After the printing is done, we check if this is the starter-line (that should not be included).
This will print all lines after the TERMINATE-line.
Generalization:
You have a file with start- and end-lines and you want the lines between those lines excluding the start- and end-lines.
start- and end-lines could be defined by a regular expression matching the line.
Example:
$ cat ex_file.txt
not this line
second line
START
A good line to include
And this line
Yep
END
Nope more
...
never ever
$ awk '/END/{found=0} {if(found) print} /START/{found=1}' ex_file.txt
A good line to include
And this line
Yep
$
Explanation:
If the end-line is found no printing should be done. Note that this check is done before the actual printing to exclude the end-line from the result.
Print the current line if found is set.
If the start-line is found then set found=1 so that the following lines are printed. Note that this check is done after the actual printing to exclude the start-line from the result.
Notes:
The code relies on the fact that all AWK variables default to 0 or the empty string if not defined. This is valid, but it may not be best practice, so you could add a BEGIN{found=0} to the start of the AWK expression.
If multiple start-end-blocks are found, they are all printed.
grep -A 10000000 'TERMINATE' file
is much, much faster than sed, especially on a really big file. It works up to 10M lines (or whatever number you put in), so there isn't any harm in making this big enough to handle about anything you hit.
Use Bash parameter expansion like the following:
content=$(cat file)
echo "${content#*TERMINATE}"
There are many ways to do it with sed or awk:
sed -n '/TERMINATE/,$p' file
This looks for TERMINATE in your file and prints from that line up to the end of the file.
awk '/TERMINATE/,0' file
This is exactly the same behaviour as sed.
In case you know the number of the line from which you want to start printing, you can specify it together with NR (number of record, which eventually indicates the number of the line):
awk 'NR>=535' file
Example
$ seq 10 > a #generate a file with one number per line, from 1 to 10
$ sed -n '/7/,$p' a
7
8
9
10
$ awk '/7/,0' a
7
8
9
10
$ awk 'NR>=7' a
7
8
9
10
If for any reason, you want to avoid using sed, the following will print the line matching TERMINATE till the end of the file:
tail -n "+$(grep -n 'TERMINATE' file | head -n 1 | cut -d ":" -f 1)" file
And the following will print from the line after the one matching TERMINATE till the end of the file:
tail -n "+$(($(grep -n 'TERMINATE' file | head -n 1 | cut -d ":" -f 1)+1))" file
It takes two processes to do what sed can do in one process, and if the file changes between the execution of grep and tail, the result can be incoherent, so I recommend using sed. Moreover, if the file does not contain TERMINATE, the first command fails.
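The no-match failure can be guarded against by checking the line number before calling tail. A minimal sketch, with the function name and its return convention chosen for this example:

```shell
# Hypothetical helper: print FILE from the first line matching PATTERN.
# Prints nothing and returns 1 when PATTERN does not occur.
# Usage: tail_from_match PATTERN FILE
tail_from_match() {
    pattern=$1 file=$2
    line=$(grep -n -- "$pattern" "$file" | head -n 1 | cut -d: -f1)
    [ -n "$line" ] || return 1      # no match: do not hand tail a bare "+"
    tail -n "+$line" "$file"
}
```

The race between grep and tail described above still applies; the guard only covers the missing-match case.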
Alternatives to the excellent sed answer by jfg956, and which don't include the matching line:
awk '/TERMINATE/ {y=1;next} y' (Hai Vu's answer to 'grep +A': print everything after a match)
awk '/TERMINATE/ ? c++ : c' (Steven Penny's answer to 'grep +A': print everything after a match)
perl -ne 'print unless 1 .. /TERMINATE/' (tchrist's answer to 'grep +A': print everything after a match)
This could be one way of doing it. If you know in what line of the file you have your grep word and how many lines you have in your file:
grep -A466 'TERMINATE' file
sed is a much better tool for the job:
sed -n '/re/,$p' file
where re is a regular expression.
Another option is grep's --after-context flag. You need to pass in a number to end at; using wc -l on the file should give the right value to stop at. Combine this with -n and your match expression.
This will print all lines from the last line containing TERMINATE till the end of the file:
LINE_NUMBER=$(grep -n TERMINATE "$YOUR_FILE_NAME" | tail -n 1 | cut -d: -f1)
tail -n "+$LINE_NUMBER" "$YOUR_FILE_NAME"

Resources