sed on an ASCII file with no line feed - bash

I have an ASCII text file with only one (long) line in it. I want to run an ordinary sed command on it, but nothing works. With :set list, vi shows that the line ends with a $.
If I add a new line manually with vi, the sed command works. If I do this:
echo $(cat file) |sed 's/.*test//g'
it works too. Any idea why sed can't work with that file?

The command echo $(cat file) is not going to preserve white space very well.
You should be able to add a newline, while leaving the rest of the file untouched, quite easily with:
( cat inputFile ; echo ) | sed 'something or other'
Just keep in mind that some sed implementations don't actually like very long lines. POSIX, I believe, only mandates an 8K minimum. You should be okay if you're using the GNU variant, though.
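To check whether a missing final newline really is the problem, one option is to look at the last byte of the file. This is only a minimal sketch; the function names has_trailing_newline and with_trailing_newline are made up for illustration and are not part of any tool.

```shell
# Succeeds if the file is empty or its last byte is a newline.
# Command substitution strips a trailing newline, so an empty
# result means the last byte was a newline (or there was no byte).
has_trailing_newline() {
    [ -z "$(tail -c 1 "$1")" ]
}

# Print the file, adding a final newline only when it is missing.
with_trailing_newline() {
    cat "$1"
    has_trailing_newline "$1" || echo
}
```

It could then be used as with_trailing_newline file | sed 's/.*test//', which leaves files that already end in a newline unchanged.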

Related

Uncomment config line with sed [duplicate]

How do I remove comment lines (such as # bla bla) and empty lines (lines without characters) from a file with one sed command?
Thanks,
lidia
If you're worried about starting two sed processes in a pipeline for performance reasons, you probably shouldn't be, it's still very efficient. But based on your comment that you want to do in-place editing, you can still do that with distinct commands (sed commands rather than invocations of sed itself).
You can either use multiple -e arguments or separate commands with a semicolon, something like (just one of these, not both):
sed -i -e 's/#.*$//' -e '/^$/d' fileName
sed -i 's/#.*$//;/^$/d' fileName
The following transcript shows this in action:
pax> printf 'Line # with a comment\n\n# Line with only a comment\n' >file
pax> cat file
Line # with a comment
# Line with only a comment
pax> cp file filex ; sed -i 's/#.*$//;/^$/d' filex ; cat filex
Line
pax> cp file filex ; sed -i -e 's/#.*$//' -e '/^$/d' filex ; cat filex
Line
Note how the file is modified in-place even with two -e options. You can see that both commands are executed on each line. The comment-only line first has its comment removed, and is then deleted entirely because it has become empty.
In addition, the original empty line is also removed.
@paxdiablo has a good answer but it can be improved.
(1) The '/^$/d' clause only matches 100% blank lines.
If you want to also match lines that are entirely whitespace (spaces, tabs etc.) use this instead:
'/^\s*$/d'
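(2) The 's/#.*$//' clause removes everything from the first # to the end of the line, wherever the # appears, so it also truncates lines that carry a trailing comment.
If you only want to delete lines that are entirely comments, with optional whitespace before the #, use this instead: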
'/^\s*#.*$/d'
The above criteria may not be universal (e.g. within a HEREDOC block, or in a Python multi-line string the different approaches could be significant), but in many cases the conventional definition of "blank" lines includes whitespace-only lines, and "comment" lines include whitespace-then-# lines.
(3) Lastly, on OSX at least, the @paxdiablo solution in which the first clause turns comment lines into blank lines, and the second clause strips blank lines (including what were originally comments) doesn't work. It seems to be more portable to make both clauses /d delete actions as I've done.
The revised command incorporating the above is:
sed -e '/^\s*#.*$/d' -e '/^\s*$/d' inputFile
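Since \s is not part of POSIX regular expressions, a variant that relies only on the POSIX [[:space:]] class may travel better between GNU and BSD sed. The following is a sketch along those lines; the function name strip_comments_and_blanks is hypothetical.

```shell
# Delete comment-only lines and blank or whitespace-only lines,
# using only POSIX bracket expressions. Trailing comments on
# lines that also carry data are left alone.
strip_comments_and_blanks() {
    sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' "$@"
}
```

Called with no arguments it filters standard input; called with file names it behaves like the sed command above and prints the result.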
This tiny jewel removes all # comments, no matter where they begin in a line (see caution below):
sed -e 's/\s*#.*$//'
Example:
text="
this is a # test
#this is a test
#this is a #test
this is # another #test
"
$ echo "$text" | sed -e 's/\s*#.*$//'
this is a
this is
Next, this removes any resulting blank lines:
$ echo "$text" | sed -e 's/\s*#.*$//' | sed -e '/^\s*$/d'
Caution: Depending on the syntax and/or interpretation of the lines you're processing, this might not be an appropriate solution, as it blindly removes the rest of the line even if the '#' is part of your data or code. However, for use cases where you'll never use a hash except as an end-of-line comment, it works fine. So, as with all coding, context must be taken into consideration.
Alternative variant, using grep (note that it drops every line containing a #, including lines that only have a trailing comment):
cat file.txt | grep -Ev '(#.*$)|(^$)'
You can use awk (this strips the leading # from comment lines and drops empty lines):
awk 'NF{gsub(/^[ \t]*#/,"");print}' file
The first example (paxdiablo's) is very good, except that it does not change the file; it only prints the result. If you want to edit the file in place:
sudo sed -i 's/#.*$//;/^$/d' inputFile
On (one of) my linux boxes, sed understands extended regular expressions with the -r option, so:
sed -r '/(^\s*#)|(^\s*$)/d' squid.conf.installed
is very useful for showing all non-blank, non-comment lines.
The regex matches either start of line followed by zero or more spaces or tabs followed by either a hash or end of line, and deletes those matching lines from the input.

bash scripting: Can I get sed to output the original line, then a space, then the modified line?

I'm new to Unix in all its forms, so please go easy on me!
I have a bash script that will pipe an ls command with arbitrary filenames into sed, which will use an arbitrary replacement pattern on the files, and then this will be piped into awk for some processing. The catch is, awk needs to know both the original file name and the new one.
I've managed everything except getting the original file names into awk. For instance, let's say my files are test.* and my replacement pattern is 's:es:ar:', which would change every occurrence of "test" to "tart". For testing purposes I'm just using awk to print what it's receiving:
ls "$@" | sed "$pattern" | awk '{printf "0: %s\n1: %s\n2: %s\n", $0,$1,$2}'
where test.* is in $@ and the pattern is stored in $pattern.
Clearly, this doesn't get me to where I want to be. The output is obviously
0: tart.c
1: tart.c
2:
If I could get sed to output "test.c tart.c", then I'd have two parameters for awk. I've played around with the pattern to no avail, even hardcoding "test.c" into the replacement. But of course that just gave me amateur results like "ttest.c art.c". Is it possible for sed to remember the input, then work it into the beginning of the output? Do I even have the right ideas? Thanks in advance!
Two ways to change the first t into a b in the duplicated field.
Duplicate (& replays the matched part), change first word and swap (remember 2 strings with a space in between):
echo test.c | sed -r 's/.*/& &/;s/t/b/;s/([^ ]*) (.*)/\2 \1/'
or with more magic (copy the original value to the hold buffer, make the change, swap the two so the original comes first, append the changed value from the buffer, and replace the newline between them with a space)
echo test.c | sed 'h;s/t/b/;x;G;s/\n/ /'
Use Perl instead of sed:
echo test.c | perl -lne 'print "$_ ", s/es/ar/r'
-l removes the newline from input and adds it after each print. The /r modifier to the substitution returns the modified string instead of changing the variable (Perl 5.14+ needed).
Old answer, not working for s/t/b/2 or s/.*/replaced/2:
You can duplicate the contents of the line with s/.*/& &/, then just tell sed that it should only apply the second substitution (this works at least in GNU sed):
echo test.c | sed 's/.*/& &/; s/es/ar/2'
$ echo 'foo' | awk '{old=$0; gsub(/o/,"e"); print old, $0}'
foo fee
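Tying this back to the original pipeline, the hold-space approach can take the substitution as a parameter, so awk receives both names on one line. This is a sketch only; pair_names is a hypothetical helper, and it assumes file names without spaces or newlines, since awk splits fields on whitespace.

```shell
# Print "original modified" for each name given, applying one sed
# command (the first argument) to the second copy only.
pair_names() {
    pattern=$1
    shift
    # h: save original; apply pattern; x: swap so the original is
    # in the pattern space; G: append the modified copy; join them.
    printf '%s\n' "$@" | sed -e 'h' -e "$pattern" -e 'x' -e 'G' -e 's/\n/ /'
}

pair_names 's:es:ar:' test.c test.h |
    awk '{ printf "old: %s  new: %s\n", $1, $2 }'
```

Using printf on the argument list instead of parsing ls output avoids one more source of surprises with unusual file names.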

sed add line at the end of file

I am trying to add a line at the end of a file (/root/test.conf) with sed. I use FreeBSD, and when I try to add a simple line I always get errors like:
extra characters at the end of command
undefined label 'est.conf'
The file is like this:
#Test
firstLine
secondLine
!p.p
*.*
And I want to add something like this:
(return \n)
!word
other (5 tab between "other" and "/usr/local") /usr/local
If it's not possible with sed, are there other options?
Thank you!
It doesn't sound like you need to use sed at all, maybe just cat with a heredoc:
cat >>test.conf <<EOF
whatever you want here
more stuff
EOF
>> opens test.conf in "append" mode, so lines are added to the bottom of the file, and the <<EOF is a heredoc that allows you to pass lines to cat via standard input.
To add literal tabs in the interactive terminal, you can use Ctrl-v followed by Tab.
You don't need any special tools like sed to add some lines to the end of files.
$ echo "This is last line" >>file
#or
$ printf "This is last line\n" >>file
works just fine on almost any platform. You might need to escape special characters though, or enclose them in single/double quotes.
In the event you have other things to do in sed, and appending at the end is just one of them:
sed -e "\$a A-line-at-the-end." <infile >outfile
sed -e '$a A-Line-at-the-end.' <infile >outfile
sed -e '$a\A-line-at-the-end.' <infile >outfile
Works on linux (ubuntu), not on freebsd.
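The one-line forms above rely on GNU behaviour. POSIX specifies the a command as a\ followed by a newline and then the text, and that form is generally accepted by both GNU and BSD sed. A minimal sketch follows; append_last_line is a made-up name, and the text is assumed to contain no backslashes, newlines, or leading whitespace.

```shell
# Print the file ($2) with one extra line ($1) appended after its
# last line. The text sits on its own line after "a\", which is
# the POSIX form of the append command.
append_last_line() {
    sed -e '$a\
'"$1" "$2"
}
```

For example, append_last_line '!word' /root/test.conf > /root/test.conf.new would write the extended file under a new name. An empty input file has no last line, so nothing is appended in that case.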

sed delete not working with cat variable

I have a file named test-domain, the contents of which contain the line 100.am.
When I do this, the line with 100.am is deleted from the test-domain file, as expected:
for x in $(echo 100.am); do sed -i "/$x/d" test-domain; done
However, if instead of echo 100.am, I read each line from a file named unwanted-lines, it does NOT work.
for x in $(cat unwanted-lines); do sed -i "/$x/d" test-domain; done
This is even if the only contents of unwanted-lines is one line, with the exact contents 100.am.
Does anyone know why sed delete line works if you use echo in your variable, but not if you use cat?
fgrep -v -f unwanted-lines test-domain > /tmp/Buffer
mv /tmp/Buffer test-domain
Calling sed once per line from a shell loop is a poor fit here (inefficient and resource-hungry). You could still use sed by preloading the lines to delete and searching against that preloaded list, but that is much heavier than fgrep in this case.
Does anyone know why sed delete line works if you use echo in your
variable, but not if you use cat?
I believe that your file containing unwanted lines contains CR+LF line endings due to which it doesn't work when you use the file. You could strip the CR in your loop:
for x in $(cat unwanted-lines); do x="${x//$'\r'}"; sed -i "/$x/d" test-domain; done
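The CR stripping can also be combined with a single fixed-string grep, which avoids one sed call per unwanted line and treats the dot in 100.am literally. This is a sketch under the assumption that substring matches should be deleted, as in the loop above; remove_unwanted is a hypothetical helper name.

```shell
# Delete from the target file ($2) every line containing any of the
# fixed strings listed in the list file ($1). Carriage returns and
# blank lines in the list are ignored.
remove_unwanted() {
    patterns=$(mktemp) || return 1
    out=$(mktemp) || return 1
    tr -d '\r' < "$1" | grep . > "$patterns"
    grep -vF -f "$patterns" "$2" > "$out"
    # grep exits 1 when nothing is left to print; only >1 is an error
    if [ $? -le 1 ]; then
        cat "$out" > "$2"
    fi
    rm -f "$patterns" "$out"
}
```

It would be called as remove_unwanted unwanted-lines test-domain. Writing the result back with cat rather than mv keeps the target file's permissions and ownership.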
One better strategy than yours would be to use a genuine editor, e.g., ed, as so:
ed -s test-domain < <(
shopt -s extglob
while IFS= read -r l; do
[[ $l = *([[:space:]]) ]] && continue
l=${l//./\\.}
echo "g/$l/d"
done < unwanted-lines
echo "wq"
)
Caveat. You must make sure that the file unwanted-lines doesn't contain any character that could clash with ed's regexps and commands. I have already included a match for a period (i.e., replace . with \.).
This method is quite efficient, as you're not forking so many times on sed, writing temp files, renaming them, etc.
Another possibility would be to use grep, but then you won't have the editing option ed offers.
Remark. ed is the standard editor.
Why not just apply the sed command to your file?
sed -i '/.*100\.am/d' your_file

How can I remove the first line of a text file using bash/sed script?

I need to repeatedly remove the first line from a huge text file using a bash script.
Right now I am using sed -i -e "1d" $FILE - but it takes around a minute to do the deletion.
Is there a more efficient way to accomplish this?
Try tail:
tail -n +2 "$FILE"
-n x: Just print the last x lines. tail -n 5 would give you the last 5 lines of the input. The + sign inverts the argument and makes tail print everything but the first x-1 lines. tail -n +1 would print the whole file, tail -n +2 everything but the first line, etc.
GNU tail is much faster than sed. tail is also available on BSD and the -n +2 flag is consistent across both tools. Check the FreeBSD or OS X man pages for more.
The BSD version can be much slower than sed, though. I wonder how they managed that; tail should just read a file line by line while sed does pretty complex operations involving interpreting a script, applying regular expressions and the like.
Note: You may be tempted to use
# THIS WILL GIVE YOU AN EMPTY FILE!
tail -n +2 "$FILE" > "$FILE"
but this will give you an empty file. The reason is that the redirection (>) happens before tail is invoked by the shell:
Shell truncates file $FILE
Shell creates a new process for tail
Shell redirects stdout of the tail process to $FILE
tail reads from the now empty $FILE
If you want to remove the first line inside the file, you should use:
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"
The && will make sure that the file doesn't get overwritten when there is a problem.
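The temp-file pattern can be wrapped in a small function so the cleanup on failure is handled as well. This is a sketch rather than a hardened tool; remove_first_line is a made-up name, and it assumes mktemp is available and the directory of the file is writable.

```shell
# Remove the first line of a file via a temporary file created next
# to it, replacing the original only if tail succeeded.
remove_first_line() {
    tmp=$(mktemp "$1.XXXXXX") || return 1
    if tail -n +2 "$1" > "$tmp"; then
        mv "$tmp" "$1"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Creating the temp file beside the original keeps the mv on the same filesystem. Note that the replaced file takes on the temp file's permissions, which may be more restrictive than the original's.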
You can use -i to update the file without using '>' operator. The following command will delete the first line from the file and save it to the file (uses a temp file behind the scenes).
sed -i '1d' filename
For those who are on SunOS which is non-GNU, the following code will help:
sed '1d' test.dat > tmp.dat
You can easily do this with:
cat filename | sed 1d > filename_without_first_line
on the command line; or to remove the first line of a file permanently, use the in-place mode of sed with the -i flag:
sed -i 1d filename
No, that's about as efficient as you're going to get. You could write a C program which could do the job a little faster (less startup time and processing arguments) but it will probably tend towards the same speed as sed as files get large (and I assume they're large if it's taking a minute).
But your question suffers from the same problem as so many others in that it pre-supposes the solution. If you were to tell us in detail what you're trying to do rather than how, we may be able to suggest a better option.
For example, if this is a file A that some other program B processes, one solution would be to not strip off the first line, but modify program B to process it differently.
Let's say all your programs append to this file A and program B currently reads and processes the first line before deleting it.
You could re-engineer program B so that it didn't try to delete the first line but maintains a persistent (probably file-based) offset into the file A so that, next time it runs, it could seek to that offset, process the line there, and update the offset.
Then, at a quiet time (midnight?), it could do special processing of file A to delete all lines currently processed and set the offset back to 0.
It will certainly be faster for a program to open and seek a file rather than open and rewrite. This discussion assumes you have control over program B, of course. I don't know if that's the case but there may be other possible solutions if you provide further information.
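As a rough illustration of the offset idea in shell, the byte offset can live in a small side file and tail -c can start reading from there. Everything here is an assumption for the sake of the sketch: the helper name next_line, the side file holding the offset, and the simplification that an empty line is treated the same as end of file.

```shell
# Print the next unprocessed line of data file $1 and advance the
# byte offset stored in side file $2. Fails when nothing is left.
next_line() {
    offset=$(cat "$2" 2>/dev/null) || offset=0
    # tail -c +N starts at byte N (1-based), so this skips "offset" bytes
    line=$(tail -c +"$((offset + 1))" "$1" | head -n 1)
    [ -n "$line" ] || return 1          # end of file (or an empty line)
    printf '%s\n' "$line"
    # count bytes, not characters, and add 1 for the newline
    bytes=$(printf '%s' "$line" | wc -c)
    echo "$((offset + bytes + 1))" > "$2"
}
```

Each call such as next_line A A.offset returns one line without rewriting A. Resetting is a matter of trimming the processed part of A and writing 0 to the offset file during a quiet period.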
The sponge util avoids the need for juggling a temp file:
tail -n +2 "$FILE" | sponge "$FILE"
If you want to modify the file in place, you could always use the original ed instead of its streaming successor sed:
ed "$FILE" <<<$'1d\nwq\n'
The ed command was the original UNIX text editor, before there were even full-screen terminals, much less graphical workstations. The ex editor, best known as what you're using when typing at the colon prompt in vi, is an extended version of ed, so many of the same commands work. While ed is meant to be used interactively, it can also be used in batch mode by sending a string of commands to it, which is what this solution does.
The sequence <<<$'1d\nwq\n' takes advantage of modern shells' support for here-strings (<<<) and ANSI quotes ($'...') to feed input to the ed command consisting of two lines: 1d, which deletes line 1, and then wq, which writes the file back out to disk and then quits the editing session.
As Pax said, you probably aren't going to get any faster than this. The reason is that there are almost no filesystems that support truncating from the beginning of the file so this is going to be an O(n) operation where n is the size of the file. What you can do much faster though is overwrite the first line with the same number of bytes (maybe with spaces or a comment) which might work for you depending on exactly what you are trying to do (what is that by the way?).
You can edit the files in place: Just use perl's -i flag, like this:
perl -ni -e 'print unless $. == 1' filename.txt
This makes the first line disappear, as you ask. Perl will need to read and copy the entire file, but it arranges for the output to be saved under the name of the original file.
This should show all lines except the first:
cat textfile.txt | tail -n +2
Could use vim to do this:
vim -u NONE +'1d' +'wq!' /tmp/test.txt
This should be faster, since vim won't read the whole file when processing it (a claim worth measuring, because vim still has to write the whole file back out).
How about using csplit?
man csplit
csplit -k file 1 '{1}'
This one liner will do:
echo "$(tail -n +2 "$FILE")" > "$FILE"
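It works because the command substitution runs tail to completion before the shell performs the > redirection, so no temp file is needed. Note that the whole output is held in memory, which matters for a huge file.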
Since it sounds like I can't speed up the deletion, I think a good approach might be to process the file in batches like this:
while [ -s file1 ]; do
    head -n 1000 file1 > file2
    process file2    # placeholder for the real processing step
    sed -i -e "1,1000d" file1    # delete the whole batch, not just line 1000
done
The drawback of this is that if the program gets killed in the middle (or if there's some bad sql in there - causing the "process" part to die or lock-up), there will be lines that are either skipped, or processed twice.
(file1 contains lines of sql code)
tail +2 path/to/your/file
works for me without the -n flag, though this is the older syntax and not every tail implementation accepts it. For reasons, see Aaron's answer.
You can use the sed command to delete arbitrary lines by line number
# create multi line txt file
echo """1. first
2. second
3. third""" > file.txt
deleting lines and printing to stdout
$ sed '1d' file.txt
2. second
3. third
$ sed '2d' file.txt
1. first
3. third
$ sed '3d' file.txt
1. first
2. second
# delete multi lines
$ sed '1,2d' file.txt
3. third
# delete the last line
$ sed '$d' file.txt
1. first
2. second
use the -i option to edit the file in-place
$ cat file.txt
1. first
2. second
3. third
$ sed -i '1d' file.txt
$ cat file.txt
2. second
3. third
If what you are looking to do is recover after failure, you could just build up a file that has what you've done so far.
# start from a clean progress file
rm -f "$tmpf"
while IFS= read -r line ; do
    # process line
    printf '%s\n' "$line" >> "$tmpf"    # record the line as done
done < "$srcf"
Based on 3 other answers, I came up with this syntax that works perfectly in my Mac OSx bash shell:
line=$(head -n1 list.txt && echo "$(tail -n +2 list.txt)" > list.txt)
Test case:
~> printf "Line #%2d\n" {1..3} > list.txt
~> cat list.txt
Line # 1
Line # 2
Line # 3
~> line=$(head -n1 list.txt && echo "$(tail -n +2 list.txt)" > list.txt)
~> echo $line
Line # 1
~> cat list.txt
Line # 2
Line # 3
Would using tail on N-1 lines and directing that into a file, followed by removing the old file, and renaming the new file to the old name do the job?
If I were doing this programmatically, I would read through the file and remember the file offset after reading each line, so I could seek back to that position to read the file with one less line in it.

Resources