How to replace all spaces in .txt file using SED in Cygwin - bash

I have a huge .txt file from which I want all spaces, line breaks, indentation, etc. removed. It should literally be one long string.
I tried
sed -i 's/\ //g' test.txt
but nothing happens

sed -n "s/[[:blank:]]//g;H
$ {x;s/\n//g;p;}"
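The H and the $ block are needed if you also want to remove newlines, because sed by default works line by line (so a line never contains a newline). H appends each line to the hold space; on the last line ($), x swaps the hold space into the pattern space and s/\n//g deletes the newlines that were accumulated there. The -n and p are needed so that only the joined result is printed, once, instead of every line being printed as well.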
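As a minimal sketch of the command above, assuming a small sample file (the temporary file and its contents are made up for this demonstration), the hold-space approach can be tried like this:

```shell
# Hypothetical sample input, created only for this demonstration.
tmp=$(mktemp)
printf 'the quick\n\tbrown  fox\n' > "$tmp"

# Per line: strip blanks, then append the line to the hold space (H).
# On the last line ($): swap the hold space in (x), delete the newlines
# that H inserted between lines, and print the result once (p).
result=$(sed -n 's/[[:blank:]]//g;H;${x;s/\n//g;p;}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Note that this keeps the whole file in sed's hold space, which may matter for a very large input.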

Seems to work ok for me:
[~/Desktop]
==> cat test.txt
the quick brown fox
[~/Desktop]
==> sed -i "s/\ //g" test.txt
[~/Desktop]
==> cat test.txt
thequickbrownfox

Matching a literal " " directly can be awkward, especially inside double quotes (where bash interprets the string before passing it to sed).
sed -i -e 's/\s//g' file.txt
... should work (it works for me). "\s" matches any whitespace character (it is a GNU sed extension), and the single quotes '' stop bash from interpreting the expression before it is passed to sed.

Since you use Cygwin, I assume your OS is Windows, so you don't need bash to achieve your goal. Just open the .txt file in a text editor and replace the whitespace with nothing; all of the whitespace in the file will then be removed.
This method covers almost all kinds of removal, and it also works in Excel, Word, and so on.
Good luck!

Related

Changing a line of text with sed with special characters

The name in the title says it all. However, I'm absolutely the worst with the sed command. So I'm trying to edit the following file:
/var/www/html/phpMyAdmin/config.inc.php
I want to edit the line that says
$cfg['Servers'][$i]['AllowRoot'] = false;
into the following
$cfg['Servers'][$i]['AllowRoot'] = true;
It has so many special characters and whatnot and I have no prior knowledge of how sed works. So here's some commands I've tried to specifically edit that one line.
sed -i "/*.AllowRoot.*/\$cfg['Servers'][\$i]['AllowRoot'] = true;/" /var/www/html/phpMyAdmin/config.inc.php
sed -i "/*.AllowRoot.*/$cfg['Servers'][$i]['AllowRoot'] = true;/" /var/www/html/phpMyAdmin/config.inc.php
# this one finds the line successfully and prints it so I know it's got the right string:
sed -n '/AllowRoot/p' /var/www/html/phpMyAdmin/config.inc.php
sed -i "s/'AllowRoot|false'/'AllowRoot|true'/" /var/www/html/phpMyAdmin/config.inc.php
I have absolutely no idea what I'm doing and I'm not learning a whole lot besides the feeling that the last command splits up 'AllowRoot|false' makes sure that both must be present in the sentence to come back as a result. So to my logic, I thought changing the word false into true would make that happen, but nothing. The other commands return... bizarre results at best, one even emptying the file. Or that's one of the commands I had not written down here, I've lost track after 50 attempts. What is the solution here?
The [ and ] need to be escaped to match literal brackets, instead of inadvertently starting a bracket expression. This should work:
$ sed -i "/\$cfg\['Servers'\]\[\$i\]\['AllowRoot'\]/s/false/true/" /var/www/html/phpMyAdmin/config.inc.php
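A minimal sketch of the escaped-bracket command above, run against a throwaway copy instead of the real config (the temporary file and its three lines are assumptions made for this demonstration, and -i is left out so nothing is edited in place):

```shell
# Hypothetical stand-in for config.inc.php.
conf=$(mktemp)
cat > "$conf" <<'EOF'
// some text here
$cfg['Servers'][$i]['AllowRoot'] = false;
// some more text here
EOF

# Inside double quotes, \$ reaches sed as a plain $, while \[ and \] are
# passed through unchanged so sed matches literal brackets.
result=$(sed "/\$cfg\['Servers'\]\[\$i\]\['AllowRoot'\]/s/false/true/" "$conf")

printf '%s\n' "$result"
rm -f "$conf"
```

Only the line selected by the address is touched, so a "false" elsewhere in the file would be left alone.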
sed needs little escaping beyond the regex metacharacters (here the brackets, which would otherwise start a bracket expression). Another problem in your line is /, which you have chosen as the delimiter (the most common choice, but not required). I suggest you use # instead; the following will work:
sed -i "s#\$cfg\['Servers'\]\[\$i\]\['AllowRoot'\] = false;<br />#\$cfg['Servers'][\$i]['AllowRoot'] = true;<br />#g" input.txt
However, you need to think about the bash interpreter as well: inside double quotes, an unescaped $i or $cfg is expanded as a shell variable. My suggestion, when you want to match a string like this, is to put the sed expression in a text file so the shell never sees it:
cat allow_root_true.sed
s#\$cfg\['Servers'\]\[\$i\]\['AllowRoot'\] = false;<br />#$cfg['Servers'][$i]['AllowRoot'] = true;<br />#g
and run the command using sed -f like this:
sed -i -f allow_root_true.sed input.txt
Warning: -i modifies the input file in place.
sed can't do literal string matching which is why you need to escape so many characters (see Is it possible to escape regex metacharacters reliably with sed), but awk can:
$ awk -v str="\$cfg['Servers'][\$i]['AllowRoot']" 'index($0,str){sub(/false/,"true")} 1' file
//some text here
$cfg['Servers'][$i]['AllowRoot'] = true;<br />
//some more text here
In the above we only have to escape the $s to protect them from the shell since the string is enclosed in "s to allow it to include 's.

Use a shell script to replace text with pwd

file.txt
...
<LOCAL_PATH_TO_REPO>/src/java/example.java
...
^A longer file but this pretty much explains what I am trying to do.
script.sh
dir=$(pwd)
# replace <LOCAL_PATH_TO_REPO> with dir
I tried using the sed command but it did not work for some reason. Any ideas on how to do this?
Your error means the variable text contains slashes, which clash with the / used as the s command delimiter.
The simplest solution is to change the delimiter to a character that does not occur in the variable text.
If there are no commas in the path, use a comma:
sed -i "s,<LOCAL_PATH_TO_REPO>,$PWD," file.txt
The -i flag writes the changes back into the input file (this form works with GNU sed).
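The delimiter change can be sketched as follows. The directory /home/user/repo is a hypothetical stand-in for the output of pwd, and the file is a temporary copy created just for this demonstration:

```shell
# Hypothetical working directory and sample file.
work=$(mktemp -d)
printf '%s\n' '<LOCAL_PATH_TO_REPO>/src/java/example.java' > "$work/file.txt"
dir=/home/user/repo   # stand-in for: dir=$(pwd)

# A comma delimiter means the slashes in $dir need no escaping.
result=$(sed "s,<LOCAL_PATH_TO_REPO>,$dir," "$work/file.txt")

printf '%s\n' "$result"
rm -rf "$work"
```

This still breaks if the path itself contains a comma, an ampersand, or a backslash, so the delimiter should be picked with the expected paths in mind.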

Insert line after match using sed

For some reason I can't seem to find a straightforward answer to this and I'm on a bit of a time crunch at the moment. How would I go about inserting a choice line of text after the first line matching a specific string using the sed command. I have ...
CLIENTSCRIPT="foo"
CLIENTFILE="bar"
And I want insert a line after the CLIENTSCRIPT= line resulting in ...
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Try doing this using GNU sed:
sed '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
if you want to substitute in-place, use
sed -i '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
Output
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Doc
see sed doc and search \a (append)
Note the standard sed syntax (as in POSIX, so supported by all conforming sed implementations around (GNU, OS/X, BSD, Solaris...)):
sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' file
Or on one line:
sed -e '/CLIENTSCRIPT=/a\' -e 'CLIENTSCRIPT2="hello"' file
(-e expressions (and the contents of -f files) are joined with newlines to make up the script that sed interprets.)
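As a quick check of the POSIX a\ form, here is a small sketch that uses the two lines from the question in a temporary file (the file itself is an assumption for this demonstration):

```shell
# Sample input taken from the question, written to a throwaway file.
cfg=$(mktemp)
printf '%s\n' 'CLIENTSCRIPT="foo"' 'CLIENTFILE="bar"' > "$cfg"

# POSIX form of the append command: "a\", a newline, then the text to add.
result=$(sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' "$cfg")

printf '%s\n' "$result"
rm -f "$cfg"
```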
The -i option for in-place editing is also a GNU extension, some other implementations (like FreeBSD's) support -i '' for that.
Alternatively, for portability, you can use perl instead:
perl -pi -e '$_ .= qq(CLIENTSCRIPT2="hello"\n) if /CLIENTSCRIPT=/' file
Or you could use ed or ex:
printf '%s\n' /CLIENTSCRIPT=/a 'CLIENTSCRIPT2="hello"' . w q | ex -s file
Sed command that works on MacOS (at least, OS 10) and Unix alike (ie. doesn't require gnu sed like Gilles' (currently accepted) one does):
sed -e '/CLIENTSCRIPT="foo"/a\'$'\n''CLIENTSCRIPT2="hello"' file
This works in bash and maybe other shells that know the $'\n' quoting style. Everything can be on one line and still work in older/POSIX sed commands. If there might be multiple lines matching the CLIENTSCRIPT="foo" (or your equivalent) and you wish to add the extra line only the first time, you can rework it as follows:
sed -e '/^ *CLIENTSCRIPT="foo"/b ins' -e b -e ':ins' -e 'a\'$'\n''CLIENTSCRIPT2="hello"' -e ': done' -e 'n;b done' file
(this creates a loop after the line insertion code that just cycles through the rest of the file, never getting back to the first sed command again).
You might notice I added a '^ *' to the matching pattern in case that line shows up in a comment, say, or is indented. It's not 100% perfect but covers some other situations likely to be common. Adjust as required...
These two solutions also get round the problem (for the generic solution to adding a line) that if your new inserted line contains unescaped backslashes or ampersands they will be interpreted by sed and likely not come out the same, just like the \n is, e.g. \0 would be the first line matched. This is especially handy if you're adding a line that comes from a variable, where you'd otherwise have to escape everything first using ${var//}, another sed statement, etc.
This solution is a little less messy in scripts (though that quoting and \n is not easy to read) when you don't want to put the replacement text for the a command at the start of a line, say in a function with indented lines. I've taken advantage of the fact that $'\n' is evaluated to a newline by the shell; it's not in regular '\n' single-quoted values.
It's getting long enough, though, that I think perl or even awk might win due to being more readable.
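For comparison, a first-match-only insert might look like this in awk. This is a sketch rather than a drop-in replacement: the variable names new and done are made up here, and the sample file (with a second, indented match) exists only for the demonstration:

```shell
# Sample input with two matching lines; only the first should get the insert.
cfg=$(mktemp)
printf '%s\n' 'CLIENTSCRIPT="foo"' 'CLIENTFILE="bar"' '  CLIENTSCRIPT="foo"' > "$cfg"

# Print every line; after the first match, print the new line once and
# set a flag so later matches are left alone.
result=$(awk -v new='CLIENTSCRIPT2="hello"' '
  { print }
  !done && /^ *CLIENTSCRIPT="foo"/ { print new; done = 1 }
' "$cfg")

printf '%s\n' "$result"
rm -f "$cfg"
```

One caveat: awk processes backslash escapes in values passed with -v, so inserted text containing backslashes would need a different way in (for example through the environment).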
A POSIX compliant one using the s command:
sed '/CLIENTSCRIPT="foo"/s/.*/&\
CLIENTSCRIPT2="hello"/' file
Maybe a bit late to post an answer for this, but I found some of the above solutions a bit cumbersome.
I tried simple string replacement in sed and it worked:
sed 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
The & stands for the matched string; after it you add \n (which GNU sed turns into a newline in the replacement) and the text of the new line.
As mentioned, if you want to do it in-place:
sed -i 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
Another thing. You can match using an expression:
sed -i 's/CLIENTSCRIPT=.*/&\nCLIENTSCRIPT2="hello"/' file
Hope this helps someone
The awk variant :
awk '1;/CLIENTSCRIPT=/{print "CLIENTSCRIPT2=\"hello\""}' file
I had a similar task, and was not able to get the above perl solution to work.
Here is my solution:
perl -i -pe "BEGIN{undef $/;} s/^\[mysqld\]$/[mysqld]\n\ncollation-server = utf8_unicode_ci\n/sgm" /etc/mysql/my.cnf
Explanation:
Uses a regular expression to search for a line in my /etc/mysql/my.cnf file that contained only [mysqld] and replaced it with
[mysqld]
collation-server = utf8_unicode_ci
effectively adding the collation-server = utf8_unicode_ci line after the line containing [mysqld].
I had to do this recently as well, for both Mac and Linux, and after browsing through many posts and trying many things out I never got to what I wanted: a solution that is simple to understand, uses well-known standard commands with simple patterns, fits on one line, is portable, and can be expanded with more constraints. Then I looked at it from a different perspective and realized I could do without the "one-liner" requirement if a "two-liner" met the rest of my criteria. In the end I came up with this solution, which works on both Ubuntu and Mac and which I wanted to share with everyone:
insertLine=$(( $(grep -n "foo" sample.txt | cut -f1 -d: | head -1) + 1 ))
sed -i -e "$insertLine"' i\'$'\n''bar'$'\n' sample.txt
In first command, grep looks for line numbers containing "foo", cut/head selects 1st occurrence, and the arithmetic op increments that first occurrence line number by 1 since I want to insert after the occurrence.
In second command, it's an in-place file edit, "i" for inserting: an ansi-c quoting new line, "bar", then another new line. The result is adding a new line containing "bar" after the "foo" line. Each of these 2 commands can be expanded to more complex operations and matching.

Why does sed add a new line in OSX?

echo -n 'I hate cats' > cats.txt
sed -i '' 's/hate/love/' cats.txt
This changes the word in the file correctly, but also adds a newline to the end of the file. Why? This only happens in OSX, not Ubuntu etc. How can I stop it?
echo -n 'I hate cats' > cats.txt
This command will populate the contents of 'cats.txt' with the 11 characters between the single quotes. If you check the size of cats.txt at this stage it should be 11 bytes.
sed -i '' 's/hate/love/' cats.txt
This command will read the cats.txt file line by line, and replace it with a file where each line has had the first instance of 'hate' replaced by 'love' (if such an instance exists). The important part is understanding what a line is. From the sed man page:
Normally, sed cyclically copies a line of input, not including its
terminating newline character, into a pattern space, (unless there is
something left after a ``D'' function), applies all of the commands
with addresses that select that pattern space, copies the pattern
space to the standard output, appending a newline, and deletes the
pattern space.
Note the appending a newline part. On your system, sed is still interpreting your file as containing a single line, even though there is no terminating newline. So the output will be the 11 characters, plus the appended newline. On other platforms this would not necessarily be the case. Sometimes sed will completely skip the last line in a file (effectively deleting it) because it is not really a line! But in your case, sed is basically fixing the file for you (as a file with no lines in it, it is broken input to sed).
See more details here: Why should text files end with a newline?
See this question for an alternate approach to your problem: SED adds new line at the end
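The behaviour described above can be observed by counting bytes before and after sed. This is only a sketch: the temporary file is an assumption, and the second count depends on which sed is installed, so no particular value is claimed for it:

```shell
# 11 bytes, no trailing newline (printf is used instead of echo -n for portability).
f=$(mktemp)
printf 'I hate cats' > "$f"

# Arithmetic expansion strips the padding some wc implementations add.
before=$(( $(wc -c < "$f") ))
after=$(( $(sed 's/hate/love/' "$f" | wc -c) ))

echo "before: $before bytes, after sed: $after bytes"
rm -f "$f"
```

If the second number is one larger than the first, the sed in use appended a newline to the unterminated last line.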
If you need a solution which will not add the newline, you can use gsed (brew install gnu-sed)
A good way to avoid this problem is to use perl instead of sed. Perl will respect the EOF newline, or lack thereof, that is in the original file.
echo -n 'I hate cats' > cats.txt
perl -pi -e 's/hate/love/' cats.txt
Note that GNU sed does not add the newline on Mac OS.
Another thing you can do is this:
echo -n 'I hate cats' > cats.txt
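SAFE=$(cat cats.txt; echo x)   # the trailing x keeps $(...) from stripping a final newline
SAFE=$(printf '%s' "$SAFE" | sed -e 's/hate/love/')   # '%s' so the data is never used as a format string
SAFE=${SAFE%x}
printf '%s' "$SAFE" > cats.txt   # write the result back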
That way if cats.txt ends in a newline it gets preserved. If it doesn't, it doesn't get one added on.
This worked for me. I didn't have to use an intermediate file.
OUTPUT=$( echo 'I hate cats' | sed 's/hate/love/' )
echo -n "$OUTPUT"

How to append to specific lines in a flat file using shell script

I have a flat file that contains something like this:
11|30646|654387|020751520
11|23861|876521|018277154
11|30645|765418|016658304
Using shell script, I would like to append a string to certain lines in this file, if those lines contain a specific string.
For example, in the above file, for lines containing 23861, I would like to append a string "Processed" at the end, so that the file becomes:
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
I could use sed to append the string to all lines in the file, but how do I do it for specific lines ?
I'd do it this way
sed '/|23861|/{s/$/|Something/;}' file
This is similar to Marcelo's answer but doesn't require extended expressions and is, I think, a little cleaner.
First, match lines having 23861 between pipes. In a basic regular expression an unescaped | is a literal pipe (in GNU sed, \| would mean alternation instead):
/|23861|/
Then, on those lines, replace the end-of-line with the string |Something
{s/$/|Something/;}
If you want to do more than one of these you could simply list them
sed '/|23861|/{s/$/|Something/;};/|30645|/{s/$/|SomethingElse/;}' file
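Here is a small sketch of the address-plus-substitution idea on the sample data from the question, appending "Processed" as the question asks. It uses a plain | in the address, which is a literal pipe in a basic regular expression; the temporary file is an assumption for this demonstration:

```shell
# Sample rows from the question, written to a throwaway file.
flat=$(mktemp)
printf '%s\n' \
  '11|30646|654387|020751520' \
  '11|23861|876521|018277154' \
  '11|30645|765418|016658304' > "$flat"

# Address: lines containing |23861| ; action: append text at end of line.
result=$(sed '/|23861|/s/$/|Processed/' "$flat")

printf '%s\n' "$result"
rm -f "$flat"
```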
Use the following awk-script:
$ awk '/23861/ { $0=$0 "|Processed" } {print}' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
or, using sed:
$ sed 's/\(.*23861.*$\)/\1|Processed/' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
Use the substitution command:
sed -i~ -E 's/(\|23861\|.*)/\1|Processed/' flat.file
(Note: the -i~ performs the substitution in-place. Just leave it out if you don't want to modify the original file.)
You can use the shell
while IFS= read -r line
do
    case "$line" in
        *23861*) line="$line|Processed";;
    esac
    printf '%s\n' "$line"
done < file > tempo && mv tempo file
sed is just a stream version of ed, which has a similar command set but was designed to edit files in place (allegedly interactively, but you wouldn't want to use it that way unless all you had was one of these). Something like
field_2_value=23861
appended_text='|processed'
line_match_regex="^[^|]*|$field_2_value|"
ed "$file" <<EOF
g/$line_match_regex/s/$/$appended_text/
wq
EOF
should get you there.
Note that the $ in .../s/$/... is not expanded by the shell, as are $line_match_regex and $appended_text, because there's no such thing as $/ - instead it's passed through as-is to ed, which interprets it as text to substitute ($ being regex-speak for "end of line").
The syntax to do the same job in sed, should you ever want to do this to a stream rather than a file in place, is very similar except that you don't need the leading g before the regex address:
sed -e "/$line_match_regex/s/$/$appended_text/" "$input_file" >"$output_file"
You need to be sure that the values you put in field_2_value and appended_text never contain slashes, because ed's g and s commands use those for delimiters.
If they might do, and you're using bash or some other shell that allows ${name//search/replace} parameter expansion syntax, you could fix them up on the fly by substituting \/ for every / during expansion of those variables. Because bash also uses / as a substitution delimiter and also uses \ as a character escape, this ends up looking horrible:
appended_text='|n/a'
ed "$file" <<EOF
g/${line_match_regex//\//\\/}/s/$/${appended_text//\//\\/}/
wq
EOF
but it does work. Note that both ed and sed require a trailing / after the replacement text in s/search/replace/ while bash's ${name//search/replace} syntax doesn't.
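For shells without the ${name//search/replace} expansion, the same slash escaping could be sketched with sed itself. This is an alternative under that assumption, not the method shown above; the variable name escaped_text and the sample input line are made up for the demonstration:

```shell
appended_text='|n/a'

# Escape every / as \/ so the value is safe inside s/.../.../ .
# A different delimiter (:) is used here so the slashes need no escaping.
escaped_text=$(printf '%s' "$appended_text" | sed 's:/:\\/:g')

# \$ keeps the shell from treating the end-of-line anchor as an expansion.
result=$(printf '%s\n' '11|23861|876521|018277154' | sed "s/\$/$escaped_text/")

printf '%s\n' "$result"
```

This only handles slashes; a value containing & or backslashes would need those escaped as well.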
