How to make whitespaces visible in the command line - bash

I have a very large file, of which I want to inspect the first 100 lines using head:
head -n100 large.file
I'd really like to make whitespace characters like \t and \r visible. How can I do this? I did not find an option in man head.

You can do it with Perl
echo 'fooo bar' | perl -pe 's/( +)/\033[41m$1\033[00m/g'
\033[41m enables red color and \033[00m disables it. Perl with -pe works like sed and is needed only to put those special sequences around spaces.
To highlight line breaks change the first part of the regular expression to
s/([ \n]+)/...rest of the expression
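If colour is not required, a sketch using sed's standard l command may be enough: it prints a tab as \t, a carriage return as \r, and marks each line end with $. The helper name show_ws is only an illustration, not something from the question.

```shell
# show_ws is a hypothetical helper name used for illustration.
show_ws() {
    # "l" prints the pattern space unambiguously:
    # tab -> \t, carriage return -> \r, end of line -> $
    sed -n 'l'
}

# Sample input with a tab, a trailing space and a Windows line ending.
printf 'foo\tbar \r\n' | show_ws
```

For the file in the question this would be head -n100 large.file | sed -n l. Long lines are folded by the l command; with GNU coreutils, cat -A is a similar alternative that shows tabs as ^I and carriage returns as ^M.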

Related

Replacing text with shell script ending with an extension

I need some inputs on how to achieve this:
I need to replace some text in a file using a shell script. The text I need to replace ends with .ear, for example:
/home/export/files/list/aa_bb_cc.ear
The shell script should replace the aa_bb_cc.ear with say, replaced.ear.
That is the line after substitution should be:
/home/export/files/list/replaced.ear
I read about this online and came across the sed command. The problem is that I don't know beforehand what the text to be replaced will be; I only know it will be something matching *.ear (as aa_bb_cc.ear does).
Now, how can I do this? I tried to use "*" in sed, but it didn't work.
With GNU sed:
sed 's|[^/]*\.ear|replaced.ear|' file
or
sed 's|[^/]*\(\.ear\)|replaced\1|' file
If you want to edit your file "in place" use sed's option -i.
See: The Stack Overflow Regular Expressions FAQ
$ echo /home/export/files/list/aa_bb_cc.ear | sed 's/[^/]*\.ear/xxx.ear/'
/home/export/files/list/xxx.ear
Since regex matching is greedy, you need to restrict the match to non-slash characters with [^/]*. A dot matches any character, so to match a literal dot you have to escape it with a backslash.
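To see why the non-slash class matters, here is a small comparison, using the path from the question and xxx.ear as a placeholder replacement. The variable names are only for illustration.

```shell
path=/home/export/files/list/aa_bb_cc.ear

# Greedy ".*" also swallows the directory part of the path.
greedy=$(printf '%s\n' "$path" | sed 's|.*\.ear|xxx.ear|')

# "[^/]*" cannot cross a slash, so only the file name is replaced.
precise=$(printf '%s\n' "$path" | sed 's|[^/]*\.ear|xxx.ear|')

printf '%s\n' "$greedy" "$precise"
```

The first result is just xxx.ear, while the second keeps the directories and prints /home/export/files/list/xxx.ear.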

Dynamic delimiter in Unix

Input:-
echo "1234ABC89,234" # A
echo "0520001DEF78,66" # B
echo "46545455KRJ21,00"
From the above strings, I need to split the characters to get the alphabetic field and the number after that.
From "1234ABC89,234", the output should be:
ABC
89,234
From "0520001DEF78,66", the output should be:
DEF
78,66
I have many strings that I need to split like this.
Here is my script so far:
echo "1234ABC89,234" | cut -d',' -f1
but it gives me 1234ABC89 which isn't what I want.
Assuming that you want to discard leading digits only, and that the letters will be all upper case, the following should work:
echo "1234ABC89,234" | sed 's/^[0-9]*\([A-Z]*\)\([0-9].*\)/\1\n\2/'
This works fine with GNU sed (I have 4.2.2), but other sed implementations might not like the \n, in which case you'll need to substitute something else.
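For sed implementations that do not accept \n in the replacement, one common substitute is a backslash followed by a literal newline, which POSIX sed understands. A sketch, with split_fields as a made-up helper name:

```shell
# split_fields is a hypothetical helper name.
split_fields() {
    # The backslash at the end of the first line inserts a real newline
    # into the replacement, so \n is not needed.
    sed 's/^[0-9]*\([A-Z]*\)\([0-9].*\)/\1\
\2/'
}

echo "1234ABC89,234" | split_fields
```

The second line of the sed script has to start in column one, otherwise the leading spaces end up in the output.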
Depending on the version of sed you can try:
echo "0520001DEF78,66" | sed -E -e 's/[0-9]*([A-Z]*)([,0-9]*)/\1\n\2/'
or:
echo "0520001DEF78,66" | sed -E -e 's/[0-9]*([A-Z]*)([,0-9]*)/\1$\2/' | tr '$' '\n'
DEF
78,66
Explanation: the regular expression replaces the input with the expected output, except that it puts a "$" sign where the newline should go; the tr command then turns that "$" into a newline.
Where do the strings come from? Are they read from a file (or other source external to the script), or are they stored in the script? If they're in the script, you should simply reformat the data so it is easier to manage. Therefore, it is sensible to assume they come from an external data source such as a file or being piped to the script.
You could simply feed the data through sed:
sed 's/^[0-9]*\([A-Z]*\)/\1 /' |
while read alpha number
do
…process the two fields…
done
The only trick to watch there is that if you set variables in the loop, they won't necessarily be visible to the script after the done. There are ways around that problem — some of which depend on which shell you use. This much is the same in any derivative of the Bourne shell.
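One such workaround, sketched here under the assumption that the data is small enough to pass through a here-document: the loop reads from the here-document instead of a pipe, so it runs in the current shell and its variables survive. The variable names count and last_alpha are illustrative.

```shell
count=0
last_alpha=

# The sed output is fed in through a here-document, so the while loop
# is not the tail of a pipeline and is not run in a subshell.
while read -r alpha number
do
    count=$((count + 1))
    last_alpha=$alpha
done <<EOF
$(printf '%s\n' 1234ABC89,234 0520001DEF78,66 | sed 's/^[0-9]*\([A-Z]*\)/\1 /')
EOF

# Both variables are still set after the loop.
echo "$count $last_alpha"
```

In bash specifically, done < <(sed ... file) with process substitution achieves the same thing without the here-document.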
You said you have many strings like this, so I recommend if possible save them to a file such as input.txt:
1234ABC89,234
0520001DEF78,66
46545455KRJ21,00
On your command line, try this sed command reading input.txt as file argument:
$ sed -E 's/([0-9]+)([[:alpha:]]{3})(.+)/\2\t\3/g' input.txt
ABC 89,234
DEF 78,66
KRJ 21,00
How it works
uses -E for extended regular expressions to save on typing, otherwise for example for grouping we would have to escape \(
uses grouping ( and ), searches three groups:
firstly digits, where + specifies one or more of them. Oddly, using [0-9] gave me an extra blank space above the results, so you may prefer the POSIX class [[:digit:]] instead
the next is to search for POSIX alphabetical characters, regardless if lowercase or uppercase, and {3} specifies to search for 3 of them
the last group searches for . meaning any character, + for one or more times
\2\t\3 then returns group 2 and group 3, with a tab separator
Thus you are able to extract two separate fields per line, just separated by tab, for easier manipulation later.

Bash terminal output - highlight lines containing some text

When I get output in bash I get my standard 2 colour screen. Is there any way I can, by default, highlight a line if it contains some key text output?
E.g. if it contains the word "FAIL" then the line is coloured red.
I’ve read this https://unix.stackexchange.com/questions/46562/how-do-you-colorize-only-some-keywords-for-a-bash-script
but am looking for something simpler than having to write a wrapper script which I’d inevitably have to debug at some time in the future.
For a simple workaround, pipe it through grep --color to turn some words red.
Add a fallback like ^ to print lines which do not contain any matches otherwise.
grep --color -e 'FAIL' -e '^' <<<$'Foo\nBar FAIL Baz\nIck'
Grep output with multiple Colors? describes a hack for getting multiple colors if you need that.
If you're happy to install a Bash script and ack, the hhighlighter package has useful default colours and an easy interface: https://github.com/paoloantinori/hhighlighter
You can use it like so to highlight rows that start with FAIL (more precisely, everything from FAIL to the end of the line):
h -i 'FAIL.*'
or that contain FAIL:
h -i '.*FAIL.*'
or for various common log entries:
h -i '.*FAIL.*' '.*PASS.*' '.*WARN.*'
Building on tripleee's answer, the following command will highlight the matching line in red and preserve the other lines:
your_command | grep --color -e ".*FAIL.*" -e "^"
If you prefer a completely inverted line, with GNU grep:
your_command | GREP_COLORS='mt=7' grep --color -e ".*FAIL.*" -e "^"
(updated with mklement0 feedback)
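Where grep's colour options are not available, a rough equivalent can be sketched with awk and raw ANSI escape codes. The function name highlight_fail and the choice of red (code 31) are assumptions for illustration.

```shell
# highlight_fail is a hypothetical helper name.
highlight_fail() {
    # Lines containing FAIL are wrapped in "red on" / "reset" escapes;
    # all other lines pass through unchanged.
    awk '/FAIL/ { printf "\033[31m%s\033[0m\n", $0; next } { print }'
}

printf 'Foo\nBar FAIL Baz\nIck\n' | highlight_fail
```

Unlike grep --color, this always emits the escape codes, even when the output is redirected to a file.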
This will highlight not only a word, but the whole line:
echo "foo bar FAIL baz" | grep -E --color '.*FAIL.*|$'
A search phrase should be enclosed by .* on either side. It will cause highlighting before and after the search word or phrase.
References
https://unix.stackexchange.com/a/330613/341457
How .* (dot star) works?

How to replace all spaces in .txt file using SED in Cygwin

I have a huge .txt file that I want all spaces, line-breaks, indentations etc removed. It should literally be one long string.
I tried
sed -i 's/\ //g' test.txt
but nothing happens
sed -n "s/[[:blank:]]//g;H
$ {x;s/\n//g;p;}"
The H and $ parts are needed if you want to remove newlines as well, because sed processes its input line by line by default (so a line never contains a newline). The -n and p are needed to avoid printing the output twice when using H.
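Because the goal is a single string with no whitespace at all, tr is another option that avoids the line-by-line issue entirely. A minimal sketch, with strip_whitespace as an illustrative name:

```shell
# strip_whitespace is a hypothetical helper name.
strip_whitespace() {
    # [:space:] covers spaces, tabs, newlines and carriage returns.
    tr -d '[:space:]'
}

printf 'the quick\n\tbrown fox\r\n' | strip_whitespace
```

tr only reads standard input, so for the file in the question this would be something like tr -d '[:space:]' < test.txt > result.txt, where result.txt is a placeholder name for a separate output file.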
Seems to work ok for me:
[~/Desktop]
==> cat test.txt
the quick brown fox
[~/Desktop]
==> sed -i "s/\ //g" test.txt
[~/Desktop]
==> cat test.txt
thequickbrownfox
Sometimes using " " directly is hard, especially inside double quotes (which means bash interprets the string before passing it to sed).
sed -i -e 's/\s//g' file.txt
... should work (it works for me). "\s" matches any whitespace character (it is a GNU sed extension), and the single quotes '' keep bash from interpreting the expression before passing it to sed.
Since you use Cygwin, I assume your OS is Windows, so you don't need bash to achieve your goal. Just open your txt file in a text editor and replace the whitespace with nothing; all of the whitespace in your txt file will then be removed.
This method handles almost any kind of removal, and it also works in Excel, Word and so on.
Good luck!

Insert line after match using sed

For some reason I can't seem to find a straightforward answer to this and I'm on a bit of a time crunch at the moment. How would I go about inserting a chosen line of text after the first line matching a specific string using the sed command? I have ...
CLIENTSCRIPT="foo"
CLIENTFILE="bar"
And I want to insert a line after the CLIENTSCRIPT= line, resulting in ...
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Try doing this using GNU sed:
sed '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
if you want to substitute in-place, use
sed -i '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
Output
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Doc
see sed doc and search \a (append)
Note the standard sed syntax (as in POSIX, so supported by all conforming sed implementations around (GNU, OS/X, BSD, Solaris...)):
sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' file
Or on one line:
sed -e '/CLIENTSCRIPT=/a\' -e 'CLIENTSCRIPT2="hello"' file
(-e expressions (and the contents of -f files) are joined with newlines to make up the sed script that sed interprets).
The -i option for in-place editing is also a GNU extension, some other implementations (like FreeBSD's) support -i '' for that.
Alternatively, for portability, you can use perl instead:
perl -pi -e '$_ .= qq(CLIENTSCRIPT2="hello"\n) if /CLIENTSCRIPT=/' file
Or you could use ed or ex:
printf '%s\n' /CLIENTSCRIPT=/a 'CLIENTSCRIPT2="hello"' . w q | ex -s file
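Since -i differs between sed implementations, another portable route is to write to a temporary file and copy the result back. This is only a sketch: insert_in_place is a made-up helper name, and it assumes mktemp is available (it is common but not part of POSIX).

```shell
# insert_in_place is a hypothetical helper; its only argument is the file.
insert_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # Standard "a\" syntax: the appended text follows on the next line.
    if sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' "$file" > "$tmp"
    then
        # Copy back rather than mv, to keep the original permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

It would be called as insert_in_place file. If sed fails, the original file is left untouched.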
Sed command that works on MacOS (at least, OS 10) and Unix alike (ie. doesn't require gnu sed like Gilles' (currently accepted) one does):
sed -e '/CLIENTSCRIPT="foo"/a\'$'\n''CLIENTSCRIPT2="hello"' file
This works in bash and maybe other shells too that know the $'\n' evaluation quote style. Everything can be on one line and work in older/POSIX sed commands. If there might be multiple lines matching the CLIENTSCRIPT="foo" (or your equivalent) and you wish to only add the extra line the first time, you can rework it as follows:
sed -e '/^ *CLIENTSCRIPT="foo"/b ins' -e b -e ':ins' -e 'a\'$'\n''CLIENTSCRIPT2="hello"' -e ': done' -e 'n;b done' file
(this creates a loop after the line insertion code that just cycles through the rest of the file, never getting back to the first sed command again).
You might notice I added a '^ *' to the matching pattern in case that line shows up in a comment, say, or is indented. It's not 100% perfect but covers some other situations likely to be common. Adjust as required...
These two solutions also get round the problem (for the generic solution to adding a line) that if your newly inserted line contains unescaped backslashes or ampersands, they will be interpreted by sed and likely not come out the same, just like the \n is, e.g. \0 would be the first line matched. This is especially handy if you're adding a line that comes from a variable, where you'd otherwise have to escape everything first using ${var//} or another sed statement etc.
This solution is a little less messy in scripts (though that quoting and \n is not easy to read) when you don't want to put the replacement text for the a command at the start of a line, say in a function with indented lines. I've taken advantage of the fact that $'\n' is evaluated to a newline by the shell; it's not in regular '\n' single-quoted values.
It's getting long enough though that I think perl or even awk might win due to being more readable.
A POSIX compliant one using the s command:
sed '/CLIENTSCRIPT="foo"/s/.*/&\
CLIENTSCRIPT2="hello"/' file
Maybe a bit late to post an answer for this, but I found some of the above solutions a bit cumbersome.
I tried simple string replacement in sed and it worked:
sed 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
& sign reflects the matched string, and then you add \n and the new line.
As mentioned, if you want to do it in-place:
sed -i 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
Another thing. You can match using an expression:
sed -i 's/CLIENTSCRIPT=.*/&\nCLIENTSCRIPT2="hello"/' file
Hope this helps someone
The awk variant :
awk '1;/CLIENTSCRIPT=/{print "CLIENTSCRIPT2=\"hello\""}' file
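The awk command above inserts the new line after every match. Since the question asks about the first match only, a possible variation keeps a flag; the function name insert_after_first and the variable inserted are illustrative.

```shell
# insert_after_first is a hypothetical helper name.
insert_after_first() {
    awk '
        { print }                          # pass every input line through
        !inserted && /CLIENTSCRIPT=/ {     # only the first matching line
            print "CLIENTSCRIPT2=\"hello\""
            inserted = 1
        }
    ' "$@"
}

printf 'CLIENTSCRIPT="foo"\nCLIENTFILE="bar"\nCLIENTSCRIPT="baz"\n' | insert_after_first
```

It reads standard input or any file names given as arguments, so insert_after_first file works as well.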
I had a similar task, and was not able to get the above perl solution to work.
Here is my solution:
perl -i -pe "BEGIN{undef $/;} s/^\[mysqld\]$/[mysqld]\n\ncollation-server = utf8_unicode_ci\n/sgm" /etc/mysql/my.cnf
Explanation:
It uses a regular expression to search my /etc/mysql/my.cnf file for a line that contains only [mysqld] and replaces it with
[mysqld]
collation-server = utf8_unicode_ci
effectively adding the collation-server = utf8_unicode_ci line after the line containing [mysqld].
I had to do this recently as well for both Mac and Linux, and after browsing through many posts and trying many things out, I never got to where I wanted to be: a solution that is simple enough to understand, uses well-known standard commands with simple patterns, is a one-liner, is portable, and can be expanded with more constraints. Then I tried to look at it from a different perspective, and realized I could do without the "one-liner" requirement if a "2-liner" met the rest of my criteria. In the end I came up with this solution, which works on both Ubuntu and Mac and which I wanted to share with everyone:
insertLine=$(( $(grep -n "foo" sample.txt | cut -f1 -d: | head -1) + 1 ))
sed -i -e "$insertLine"' i\'$'\n''bar'$'\n' sample.txt
In the first command, grep looks for the line numbers of lines containing "foo", cut/head selects the first occurrence, and the arithmetic expansion increments that line number by 1, since I want to insert after the occurrence.
The second command is an in-place file edit: "i" for inserting, an ANSI-C quoted newline, "bar", then another newline. The result is a new line containing "bar" after the "foo" line. Each of these 2 commands can be expanded to more complex operations and matching.

Resources