A console program (translate-shell) has an output with colors and uses special decorate characters for this: ^[[22m, ^[[24m, ^[[1m... and so on.
I'd like to remove them to get a plain text.
I tried with tr -d "^[[22m" and with sed 's/[\^[[22m]//g', but only is removed the number, not the special character ^[
Thanks.
You have multiple options:
https://unix.stackexchange.com/questions/14684/removing-control-chars-including-console-codes-colours-from-script-output
http://www.commandlinefu.com/commands/view/3584/remove-color-codes-special-characters-with-sed
and as -no-ansi as pointed out by Jens in other answer
EDIT
The solution from commandlinefu does the job pretty well:
sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g"
The solution from unix.stackexchange might be better but is much longer and so you would want to create a separate script file because it is so long instead of just doing a shell one-liner.
I found this in the manual about the use of ANSI escape codes:
-no-ansi
Do not use ANSI escape codes.
So you should add this option when starting the program.
Related
A console program (translate-shell) has an output with colors and uses special decorate characters for this: ^[[22m, ^[[24m, ^[[1m... and so on.
I'd like to remove them to get a plain text.
I tried with tr -d "^[[22m" and with sed 's/[\^[[22m]//g', but only is removed the number, not the special character ^[
Thanks.
You have multiple options:
https://unix.stackexchange.com/questions/14684/removing-control-chars-including-console-codes-colours-from-script-output
http://www.commandlinefu.com/commands/view/3584/remove-color-codes-special-characters-with-sed
and as -no-ansi as pointed out by Jens in other answer
EDIT
The solution from commandlinefu does the job pretty well:
sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g"
The solution from unix.stackexchange might be better but is much longer and so you would want to create a separate script file because it is so long instead of just doing a shell one-liner.
I found this in the manual about the use of ANSI escape codes:
-no-ansi
Do not use ANSI escape codes.
So you should add this option when starting the program.
I have a large text file containing sequences such as
\u02BBUtthay\u0101n h\u01E3ng Ch\u0101t Khao Yai
However, they render exactly as above. How do I convert this so people just see UTF-8? I would prefer to process the files at the command line if possible.
use the printf command.
http://manpages.ubuntu.com/manpages/intrepid/man3/printf.3.html
you can wrap it in $() to use as a variable if needed, too.
For example,
echo $(printf '\u02BBUtthay\u0101n h\u01E3ng Ch\u0101t Khao Yai')
this outputs: ʻUtthayān hǣng Chāt Khao Yai
Hope that helps.
I have a rather tricky request...
We use a special application which is connected to a oracle database. For control reasons the application uses special characters which are defined by the application and saved in a long field of the database.
My task is to query the long field periodically and check for changes. To do that, I write the content by using a bash script in a file and compare the old and the new file with md5sum.
When there's a difference, I want to send the old file via mail. The problem is, that the old file contains these special characters and I don't know how to replace them with for example a string which describes them.
I tried to replace them on the basis of their ASCII code, but this didn't work. I've also tried to replace them by their appearance in the file. (They look like this: ^P ) This didn't work neither.
When viewing the file by text editor like nano the characters are visible like described above. But when using cat on the file, the content is only displayed until the first appearance of such a control character.
As far as I know there is know possibility to replace them while querying from the database because of the fact that the content is in a LONG field.
I hope you can help me.
Thank you in advance.
Marco
^P is the Control-P character, which is decimal 16 or hexadecimal 0x10, also known as the Data Link Escape (DLE) character in ASCII.
To replace all occurrences of 0x10 in a file with another string we can use our friend gsed:
gsed "s/\x10/Data Link Escape/g" yourfile.txt
This should replace all occurrences of characters containing the hex value 0x10 with the text string "Data Link Escape". You'll probably want to use a different string - this is just an example.
Depending on the system you're using you may be able to use the standard sed command if your version of sed recognizes the \xNN single-character escape codes. If there are multiple hex characters you need to replace you may want to create a file containing your sed commands, one for each hexadecmial character you need to replace, and tell sed or gsed to use the commands in the file - consult the sed or gsed man pages for how to do this.
Share and enjoy.
You can use xxd to change the string to its hex representation, then use xxd -r to convert back.
Or, you can use uuencode and uudecode.
One option is to run the file through cat -v. This replaces nonprinting characters with visible representations (using the ^ notation for control characters):
$ echo $'\x10\x12\x13\x14\x16' | cat -v
^P^R^S^T^V
i have a template, with a var LINK
and a data file, links.txt, with one url per line
how in bash i can substitute LINK with the content of links.txt?
if i do
#!/bin/bash
LINKS=$(cat links.txt)
sed "s/LINKS/$LINK/g" template.xml
two problem:
$LINKS has the content of links.txt without newline
sed: 1: "s/LINKS/http://test ...": bad flag in substitute command: '/'
sed is not escaping the // in the links.txt file
thanks
Use some better language instead. I'd write a solution for bash + awk... but that's simply too much effort to go into. (See http://www.gnu.org/manual/gawk/gawk.html#Getline_002fVariable_002fFile if you really want to do that)
Just use any language where you don't have to mix control and content text. For example in python:
#!/usr/bin/env python
links = open('links.txt').read()
template = open('template.xml').read()
print template.replace('LINKS', links)
Watch out if you're trying to force sed solution with some other separator - you'll get into the same problems unless you find something disallowed in urls (but are you verifying that?) If you don't, you already have another problem - links can contain < and > and break your xml.
You can do this using ed:
ed template.xml <<EOF
/LINKS/d
.r links.txt
w output.txt
EOF
The first command will go to the line
containing LINKS and delete it.
The second line will insert the
contents of links.txt on the current
line.
The third command will write the file
to output.txt (if you omit output.txt
the edits will be saved to
template.xml).
Try running sed twice. On the first run, replace / with \/. The second run will be the same as what you currently have.
The character following the 's' in the sed command ends up the separator, so you'll want to use a character that is not present in the value of $LINK. For example, you could try a comma:
sed "s,LINKS,${LINK}\n,g" template.xml
Note that I also added a \n to add an additional newline.
Another option is to escape the forward slashes in $LINK, possibly using sed. If you don't have guarantees about the characters in $LINK, this may be safer.
I am writing a shell (bash) script and I'm trying to figure out an easy way to accomplish a simple task.
I have some string in a variable.
I don't know if this is relevant, but it can contain spaces, newlines, because actually this string is the content of a whole text file.
I want to replace the last occurence of a certain substring with something else.
Perhaps I could use a regexp for that, but there are two moments that confuse me:
I need to match from the end, not from the start
the substring that I want to scan for is fixed, not variable.
for truncating at the start: ${var#pattern}
truncating at the end ${var%pattern}
${var/pattern/repl} for general replacement
the patterns are 'filename' style expansion, and the last one can be prefixed with # or % to match only at the start or end (respectively)
it's all in the (long) bash manpage. check the "Parameter Expansion" chapter.
amn expression like this
s/match string here$/new string/
should do the trick - s is for sustitute, / break up the command, and the $ is the end of line marker. You can try this in vi to see if it does what you need.
I would look up the man pages for awk or sed.
Javier's answer is shell specific and won't work in all shells.
The sed answers that MrTelly and epochwolf alluded to are incomplete and should look something like this:
MyString="stuff ttto be edittted"
NewString=`echo $MyString | sed -e 's/\(.*\)ttt\(.*\)/\1xxx\2/'`
The reason this works without having to use the $ to mark the end is that the first '.*' is greedy and will attempt to gather up as much as possible while allowing the rest of the regular expression to be true.
This sed command should work fine in any shell context used.
Usually when I get stuck with Sed I use this page,
http://sed.sourceforge.net/sed1line.txt