Cannot get rid of all ^C and ^A control characters in a file

Hi: I have a file with
^A^M (they seem to go together)
^C
^M
In vi, I used
:%s/\r//g
to get rid of most of the annoying characters, but
^C and ^A remain.
It's weird.
dos2unix did not work.
I used
perl -pi -e 's/[\cA-\cZ]*//g'
but all the spaces between the lines have been removed.
If anyone knows of a good trick, please let me know. Thanks!
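One possible explanation for the lost line breaks, offered as a sketch rather than a confirmed diagnosis: the range \cA-\cZ covers bytes 1 through 26, and that span includes tab (\cI, byte 9) and newline (\cJ, byte 10) as well as ^A, ^C and ^M. The example below uses tr to delete the control bytes while skipping tab and newline. The function name strip_controls and the sample input are made up for illustration.

```shell
# Delete ASCII control bytes 001-010 and 013-037 (octal),
# which leaves tab (011) and newline (012) untouched.
strip_controls() {
  tr -d '\001-\010\013-\037'
}

# Hypothetical sample: ^A^M at the end of line one, ^C inside line two.
printf 'first\001\r\nsec\003ond\n' | strip_controls
```

To apply it to a file, redirect input and output, for example strip_controls < input.txt > output.txt (both file names are placeholders).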

Related

sed behaving strangely when replacing line ends in WSL

I am trying to use sed to add some text, at every line end, for all .txt files in a directory. This is the exact command I use: find . -name "*.txt" -exec sed -i 's/$/:orig/' {} +
Expected:
https://pbs.twimg.com/media/EUr539_UMAAFqZM.jpg:orig
https://pbs.twimg.com/media/ENTrymcUwAAnd6_.jpg:orig
https://pbs.twimg.com/media/EIzzcrFUYAAgfUo.jpg:orig
That is also what I actually get when I run it in my laptop with Linux Mint 19.2. But when I try it in my Windows PC, running sed through Ubuntu in WSL, what I get is this:
https://pbs.twimg.com/media/EUr539_UMAAFqZM.jpg
:orig
https://pbs.twimg.com/media/ENTrymcUwAAnd6_.jpg
:orig
https://pbs.twimg.com/media/EIzzcrFUYAAgfUo.jpg:orig
If I cat the files in question while still in the Ubuntu terminal, what's displayed is more like this (there's some weird whitespace that makes it look like columns in SO, but generally they all look pretty chaotic):
:orig://pbs.twimg.com/media/EUr539_UMAAFqZM.jpg :orig://pbs.twimg.com/media/ENTrymcUwAAnd6_.jpg https://pbs.twimg.com/media/EIzzcrFUYAAgfUo.jpg:orig
I understand Windows and Linux text is formatted differently and that line ends in particular are problematic, though I am uncertain if that is of any importance here.
Can anyone shed light on this behavior? How can I get the command to behave consistently?
The problem is that the lines in your files end in CRLF, but the WSL sed treats just the LF as the line end. You can get around this with a three-step process, if you know it's a CRLF-style file:
get rid of the CR;
do your change;
put the CR back.
That would go something like: sed -i -e 's/\r$//' -e 's/$/:orig/' -e 's/$/\r/'.
However, this won't work on UNIX-style files: the first substitution will do nothing, but the third will put a CR character at the end of each line even though it wasn't there originally. If you want something that will work on both types of files, this should do it:
sed -E 's/(\r)?$/:orig\1/'
This captures the optional CR at the end of the line and puts it back in the substitution (if it's not in the original line, it won't put it back).
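A quick way to check the combined substitution on both kinds of line ending is sketched below. It assumes GNU sed (as shipped with Ubuntu under WSL), which understands \r in a pattern. The function name add_suffix and the sample lines are invented for the demonstration, and tr makes each carriage return visible as @.

```shell
# Append :orig to each line, keeping a trailing CR if one is present.
# Assumes GNU sed, which interprets \r as a carriage return.
add_suffix() {
  sed -E 's/(\r)?$/:orig\1/'
}

# One CRLF line and one LF line; tr shows each CR as @ so it can be seen.
printf 'a.jpg\r\nb.jpg\n' | add_suffix | tr '\r' '@'
```

The CRLF line should come out as a.jpg:orig@ and the LF line as b.jpg:orig, showing that the CR is put back only where it was originally present.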

In shell script, colon (:) is being treated as an operator for variable creation

I have the following snippet:
host="https://example.com"
port="80"
url="${host}:${port}"
echo $url
the output is:
:80ps://example.com
How can I escape the colon here? I also tried:
url="${host}\:${port}"
but it did not work.
Expected output is:
https://example.com:80
You've most likely run into what I call the Linefeed-Limbo.
If I copy the code you provided from StackOverflow and run it on my machine (bash version 4.4.19(1)), then it outputs correctly
user#host:~$ cat script.sh
host="https://example.com"
port="80"
url="${host}:${port}"
echo $url
user#host:~$ bash script.sh
https://example.com:80
What is Linefeed-Limbo?
Different operating systems use different ASCII symbols to represent when a new line occurs in a text, such as a script. This Wikipedia article gives a good introduction.
As you can see, Unix and Unix-like systems use the single character \n, also called a "Line Feed". Windows, as well as other systems, use \r\n, so a "carriage return" followed by a "line feed".
What happens is that when you write a script on Windows in an editor such as Notepad, what you write is host="example.com"\r\n. When you copy this file to Linux, Linux interprets the \r as if it were part of the script, since only \n is considered a new line. And indeed, when I change my newline style to DOS-style, I get the exact output you get.
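To check whether a script actually carries carriage returns, something like the following could be used. It is only a sketch: the function name count_cr_lines and the temporary sample file are assumptions for the example, not part of the original script.

```shell
# Count the lines of a file that contain a carriage return.
count_cr_lines() {
  cr=$(printf '\r')          # a literal CR character
  grep -c "$cr" "$1"
}

# Build a small DOS-style sample (hypothetical file) and inspect it.
sample=$(mktemp)
printf 'host="https://example.com"\r\nport="80"\r\n' > "$sample"
count_cr_lines "$sample"     # number of lines containing a CR
od -c "$sample" | head -n 3  # od -c displays each CR as \r
```

A non-zero count, or \r showing up before \n in the od -c dump, suggests the file was saved with DOS-style line endings.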
How can I fix this?
You have several options to fix this issue.
Converting the script (with dos2unix)
Since all you need to do is replace every instance of \r\n with \n, you could use any text-editing software you want. However, if you like simple solutions, then dos2unix (and its sister unix2dos) might be what you are looking for:
user#host:~$ dos2unix script.sh
dos2unix: converting file script.sh to Unix format...
That's it. Run your file now and you will see it behaves well.
Encoding the source-file correctly
By using a more advanced text editor such as Notepad++, you can define which style of newline you would like to use.
By changing the newline-type to whichever system you intend to run your script on, you will not run into any problems like this anymore.
Bonus round: Why does it output :80ps://example.com?
To understand why your output is like this, you have to look at what your script is doing, and what \r means.
Try thinking of your terminal as an old-fashioned typewriter. Returning the carriage means you start writing on the left again. Making a "new line" means sliding the paper. These two things are separate, and I think that's why some systems decided to use these two characters as a logical "new line".
But I digress. Let's look at the first line, host="https://example.com"\r.
What this means when printed is "Print https://example.com, then put the carriage back at the start". When you then print :80\r, it doesn't start after ".com"; it starts at the beginning of the line, because that's where you (unknowingly) told the cursor to go. It then overwrites the first few characters, so that ":80ps://example.com" is what ends up on screen. Keep in mind that after 80 you again placed a carriage return symbol, so any new text you wrote would end up overwriting the beginning again.
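The overwriting can be imitated without a terminal. The sketch below is a deliberately simplified model: it handles only carriage returns and assumes each segment after a CR is typed over the line from column one. The function name render_cr is made up for the example.

```shell
# Simplified model of how a terminal displays a line containing CRs:
# every segment after a CR overwrites the line from the first column.
render_cr() {
  awk '{
    n = split($0, seg, "\r")
    line = seg[1]
    for (i = 2; i <= n; i++) {
      s = seg[i]
      if (length(s) >= length(line))
        line = s                                # segment covers the whole line
      else
        line = s substr(line, length(s) + 1)    # overwrite only the start
    }
    print line
  }'
}

# What echo $url emits when host and port both end in a CR.
printf 'https://example.com\r:80\r\n' | render_cr
```

Under this model the three characters of :80 replace "htt", which reproduces the :80ps://example.com output from the question.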
It works for me. Try removing the carriage returns from the variables first, then build the URL:
new_host=$(echo "$host" | tr -d '\r')
new_port=$(echo "$port" | tr -d '\r')
new_url="${new_host}:${new_port}"

Git bash keeps auto scrolling to the bottom of the window

I have installed multiple versions of git for windows, but every version I have tried so far acts the same. If I have a bunch of output lines in the terminal, scroll up to see some of the earlier outputs, the window will automatically take me back to the bottom at the prompt. It seems to be happening on an interval about 5 seconds apart. I tried replicating the issue with CMD and powershell, but it only happens in git bash. Even just running bash.exe inside the bin folder doesn't produce the auto scrolling, just git-bash.exe. Any ideas why this is happening or how to stop it?
Edit1: It seems as though it is automatically executing a page down command. If I use the less command, it automatically goes page by page. I thought maybe it was a keyboard issue but this is the only application that seems to be doing this.
Edit2: I wrote a quick bash script that logs input to a file.
while true; do
read -s -n 1 input
echo $input >> file.txt
done
I printed the contents of the file using od -c file.txt. The output after a few seconds is below.
0000000 \n \n 177 \n \n 177 \n \n 177 \n \n 177 \n \n \n 177
0000020 \n \n 177 \n \n 177 \n \n
0000030
Does anyone know how to stop it? Does this look like a keyboard issue?
In Git Bash:
Options ► Mouse ► uncheck "Copy on select"
In order to make up for not having the copy on select functionality:
Options ► Keys ► Ctl+Shift+letter shortcut
Should allow you to use Ctl+Shift+C for copying.
Obviously this is just a workaround. I was in the process of searching to see if a bug has been created for this issue when I found this post.
Edit: For completeness, I'm using git version 2.10.1.windows.1
I have noted this behavior from Git Bash when the window size is smaller than some multiple of the font height. Making the window taller or the font smaller seems to fix it.

vim script leaves characters in stdin

I'm trying to use vim with -s option to run a script that replaces some lines in a file like this (text.txt):
test1
ab
ac
ae
test2
sd
Script file is like this (script):
:silent %s/test1\zs\_.\+\zetest2/\=substitute(submatch(0), '\n\(\w\)', '\n#\1', 'g')/g
:wq
It comments out lines between test1 and test2. Which is what I want. What I don't want though is output before and after prompt. I run it and get:
user#hostname: ~/vimtest$ vim -s script text.txt
^[[?1;2cuser#hostname: ~/vimtest$ 1;2c
So this ^[[?1;2c is bad news already but 1;2c is in the input as if I already typed it. If I hit enter it gives me a bash error. So I have to remove these symbols each time the script is used. Any ideas?
It seems like vim (or some vim startup script) is trying to figure out what type of terminal you are using. The ^[[?1;2c, with the last few characters left in the input buffer, is almost certainly part of your terminal emulator's response to a DA (Device Attributes) query. You can see this yourself by typing in bash:
printf '\033[c'
or, to see the complete return, pause a bit:
printf '\033[c'; sleep 0.1; echo
The response \033[?1;2c means "I'm a VT100 with Advanced Video Option.", which is what xterm and many other console programs respond. (The Linux console itself responds \033[?6c, which means "I'm a VT102.")
The reason that only 1;2c is left in the console input buffer, by the way, is that the initial escape code \033[? was ignored when it was read. The readline library will ignore it without echoing it, whereas normal console input will echo it and then ignore it; that's why the two shell commands above differ.
I can't reproduce this problem with my vim installation, so I don't really even know where to start looking. But you might try to see if disabling all startup files helps:
vim -u NONE -s script text.txt
If that helps, start disabling installed extensions one by one until you find the one which is causing the problem.
:%s/test1\zs\_.\+\ze\ntest2/\=substitute(submatch(0), '\n', '\n#', 'g')/g
:wq
I tested this here, and it changed the input file in the required way.
Changes made to your command:
add \n after \ze
in the substitute() function we can just handle the \n; we don't need to capture the word character after the \n
I noticed that you tagged the question with bash, so I thought a shell-solution should be accepted too.
awk '/test1/{p=1;print;next}/test2/{p=0;print;next}{$0=(p?"#":"")$0}7' file
This awk one-liner should do that for you. Vim is a very powerful editor, and I love it. But for automatic transformations, I prefer a script or a proper text-processing tool. On a Linux box you can always find one, and it is easier to test and debug.
Test with your input:
kent$ cat f
test1
ab
ac
ae
test2
sd
kent$ awk '/test1/{p=1;print;next}/test2/{p=0;print;next}{$0=(p?"#":"")$0}7' f
test1
#ab
#ac
#ae
test2
sd
If you want to save the text back to your file, you can:
awk '...' file > tmp.file && mv tmp.file file
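For readers who find the one-liner dense, here is the same logic spread out with comments. This is a sketch: the function name comment_between and the variable names are chosen for the example. The trailing 7 in the one-liner is simply a true condition that triggers awk's default print, which the explicit print below replaces.

```shell
# Prefix every line between the test1 and test2 markers with #.
comment_between() {
  awk '
    /test1/ { inside = 1; print; next }   # start marker: print it, begin commenting
    /test2/ { inside = 0; print; next }   # end marker: print it, stop commenting
    {
      prefix = inside ? "#" : ""          # only lines between the markers get a #
      print prefix $0
    }
  '
}

printf 'test1\nab\nac\nae\ntest2\nsd\n' | comment_between
```

With the sample input from the question this produces the same result as the one-liner: ab, ac and ae gain a leading #, while test1, test2 and sd are unchanged.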

sed can not work in script file in Windows

I once wrote a simple sed command like this:
s/==/EQU/
When I run it on the command line:
sed 's/==/EQU' filename
it works well and replaces the '==' with 'EQU'. But when I put the command in a script file named replace.sed and run it this way:
sed -f replace.sed filename
there is an error that says:
sed: file replace.sed line 1: unknown option to 's'
What I want to ask is: is there a problem with my script file replace.sed when it is run on Windows?
The unknown option is almost invariably a rogue character after the trailing / (which is missing from your command line version, by the way, so it should complain about an unterminated command).
Have a look at your replace.sed again. You may have a funny character at the end, which could include the ' if you forgot to delete it, or even a CTRL-M DOS-style line ending, though CygWin seems to handle this okay. You haven't specified which sed you're using (that may help).
Okay, based on your edit, it looks like one of my scattergun of suggestions was right :-) You had CTRL-M at the end of the line because of the CR/LF line endings:
At the end of each line in the *.sed file, there was a 'CR\LF' pair, and that was the problem, but you cannot see it by default. I used Notepad to delete them manually, which fixed the problem. But I have not found a way to delete them automatically, or to avoid the 'new-line' style when editing a new text file in Windows.
You may want to get your hands on a more powerful editor like Notepad++ or gVim (my favourite) but, in fact, you do have a tool that can get rid of those characters :-) It's called sed.
sed 's/\015//g' replace.sed >replace2.sed
should get rid of all the CR characters from your file and give you a replace2.sed that you can use for your real job.
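Escape support differs between sed implementations, so whether \015 is read as an octal CR depends on the sed in use. As an alternative that does not rely on sed escapes, the sketch below strips the CRs with tr and then runs the cleaned script. The temporary directory and the sample input line are assumptions made for the demonstration.

```shell
# Recreate a replace.sed with DOS line endings in a scratch directory.
workdir=$(mktemp -d)
printf 's/==/EQU/\r\n' > "$workdir/replace.sed"

# Strip every CR, then use the cleaned script as usual.
tr -d '\r' < "$workdir/replace.sed" > "$workdir/replace2.sed"
result=$(printf 'a == b\n' | sed -f "$workdir/replace2.sed")
echo "$result"

rm -rf "$workdir"
```

After the CRs are gone, the s command ends at the trailing / and sed has no stray character left to treat as an option.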
