I am trying to run the following command
echo `grep -o "<\/div><div class\=\".*" $1` |
grep -o "title=\\"\(.*\?\)\\" aria-describedby" -> title.txt
from script test.sh.
However, every time I check my file title.txt, it is empty.
I tested the first part of the command,
echo `grep -o "<\/div><div class\=\".*" $1`
and it works fine.
The second part is the one with the problem"
grep -o "title=\\\"\(.*\?\)\\\" aria-describedby" -> title.txt
Just to keep in mind, this is not being run from the terminal itself, but from a bash script file being called from the terminal.
I believe my problem lies in how I am quoting or escaping the quotes.
I do not know if your expressions do what you want them to do, but there is an issue with this one :
"title=\\"\(.*\?\)\\"
When the shell sees to consecutive backslashes (basically an escaped backslash), it will read them as one literal backslash. The first twin backslashes in your expression are read like this, and the double quote that follows ends the string. In other words, the following is a string :
"title=\\"
And the rest of the line :
\(.*\?\)\\"
ends with a double quote (not escaped once again due to the twin backslashes that become one literal backslash), but has no initial double quote.
Related
In Windows 10, I used following command in Git bash 1.9.5,
$ vim -c "%s/^/\=line(".").'. '/g" -c "%s/\//\\/g" -c "%y+" -c "wq" ~/Desktop/sample.txt
in order to,
-c "%s/^/\=line(".").'. '/g": add a sequence of numbers before every line.
-c "%s/\//\\/g": replace all slashes with backslashes.
-c "%y+" -c "wq": copy all text to clipboard, save & exit Vim.
Everything works well except 2, it seems that -c "%s/\//\\/g" argument cannot be dealt correctly by Vim. None of slashes was replaced, and a g was inserted after every first / of each line. For example,
Sample.txt before
A/P/A/T/H
B/P/A/T/H
Sample.txt after
A/gP/A/T/H
B/gP/A/T/H
However, if I execute :%s/\//\\/g in vim, it would works as I expected.
Besides, I've tried these,
-c "%s/\//#/g" can replace all / to # as expected.
-c "%s,/,\\,g" will replace the first / in each line to ,g.
So, I wonder if it is a limit or a known issue of -c argument, or I made mistakes somewhere?
Edit: I just found by chance that -c "%s/\//\\\/g" works as expected.
So could anyone please explain why another \ is needed to escape \?
In Vim, when using regular expressions, the backslash is used to escape characters so it can't be used on its own to represent a literal backslash. For that, you must escape the literal backslash with an escape backslash:
:%s/\//\\/g
So you start with two backslashes no matter what: the escape one and the literal one. That's what Vim expects.
In your shell, backslashes also have a special meaning. When inside double quotes, two consecutive backslashes "collapse" into a single one so, when you think you are telling Vim to do:
:%s/\//\\/g
with:
-c "%s/\//\\/g"
what it actually receives is:
%s/\//\/g
which means: "substitute each slash with an escaped slash (so a simple slash) followed by letter g". Not exactly what you had in mind.
To make sure Vim actually receives the proper command you need to add a third backslash:
-c "%s/\//\\\/g"
Two backslashes are collapsed into a single one and the other one is left intact so you end up with two backslashes, which is what Vim expects:
%s/\//\\/g
Another, better, approach would be to use single quotes, where backslashes are always literal, instead of double quotes:
-c '%s/\//\\/g'
From this page about shell quoting, emphasis mine:
The backslash retains its meaning only when followed by dollar, backtick, double quote, backslash or newline. Within double quotes, the backslashes are removed from the input stream when followed by one of these characters.
I have just recently got back into learning bash. Currently working on a project of mine and when using sed I've run into an issue, I've tried looking around the web for help but haven't had any joy. I suspect as I may not be using the correct terminology so I can't find what I'm looking for. ANYHOW.
So in my script I'm trying to assign the output of date to a variable. Here's the line from my script.
origdate=$(date)
When I call it the output looks like this:
Wed Oct 5 19:40:45 BST 2016
Part of my script then generates a file and writes information to it, part of which I am trying to use sed to find lines and replace parts of it. This is the first I've been playing around with sed, I've used it successfully so far for my needs. However I'm getting stuck when I try this:
sed -i '/origdate=empty/c\'$origdate'' $sd/pingcheck-email-$job.txt
When I run the script and it gets to this line, this is the error I'm getting:
sed: can't read Oct: No such file or directory
sed: can't read 5: No such file or directory
sed: can't read 19:52:56: No such file or directory
sed: can't read BST: No such file or directory
sed: can't read 2016: No such file or directory
I suspect it's something to do with the spaces in the date (variable), my question is: how can I work around this? Can I get sed to 'ignore' the spaces? or should I just use cut to cut the field for the date, and set that to a variable and the same thing again to set the time to another variable?
Even if someone could kindly point me in the right direction that'd be great!
Thanks in advance!
double quote the variable
sed -i '/origdate=empty/c\'"$origdate"'' $sd/pingcheck-email-$job.txt
or alternatively, the whole script
sed -i "/origdate=empty/c\$origdate" $sd/pingcheck-email-$job.txt
The problem is not with sed but rather with how bash word splits on your date given your command.
Bash
In bash, word splitting is performed on the command line so that text is broken up into a list of arguments. To illustrate, I'm going to run a simple script that outputs the first argument only.
bash -c 'echo $1' ignored_0 foo bar
Think of bash -c 'echo $1' ignored_0 as the command (sed in your case) and foo bar as the arguments. In this case, foo bar is split into two arguments, foo and bar.
To pass foo bar in as the first parameter, you need to have the text in either single or double quotes. See the GNU manual on quoting.
bash -c 'echo $1' ignored_0 'foo bar'
bash -c 'echo $1' ignored_0 "foo bar"
Parameter expansion does not occur when the variable is inside a single quote.
var="foo bar"
bash -c 'echo $1' ignored_0 '$var'
bash -c 'echo $1' ignored_0 "$var"
NOTE: In the command `bash -c 'echo $1', I do not want $1 to expand before being passed as an argument to bash because that's part of the code I want to execute.
Parameter expansion occurs when variables are outside of quotes, but word splitting will apply after the parameter is expanded. From the bash man page in the Word Splitting section:
The shell scans the results of parameter expansion, command
substitution, and arithmetic expansion that did not occur within
double quotes for word splitting.
From the GNU bash manual on Word Splitting:
The shell scans the results of parameter expansion, command
substitution, and arithmetic expansion that did not occur within
double quotes for word splitting.
var="foo bar"
bash -c 'echo $1' ignored_0 $var
The last step in Shell Expansions in Quote Removal where unquoted quote characters are removed before being passed to commands. The following command shows that ''"" has no effect on the arguments passed.
bash -c 'echo $1' ignored_0 foo''""
Application
In your example, the trailing '' after $origdate is extraneous. The important part is that $origdate is not quoted so word splitting applies to the expanded variable.
When -e is not passed to the sed command, sed expects the expression to be in one argument, or word from bash. When you run your command, your expression is /origdate=empty/c\Wed and the rest of the date is considered to be files for the expression to be applied to.
The simple fix is to put double quotes around the string for which you want to prevent word splitting. I've modified the command so that anyone can run this example without having the files on their system.
In this example, the \ must be escaped so that it is not considered an escape character for $.
echo "origdate=empty" | sed "/origdate=empty/c\\$origdate"
You can also change the type of quotes you are using without affecting word splitting like so.
echo "origdate=empty" | sed '/origdate=empty/c\'"$origdate"
You need escape by double slash
\ / \%
The POSIX shell standard at
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04
says in Section 2.6:
command substitution (...) shall be performed
(...)
Quote removal (...) shall always be performed last.
It appears to me that quote removal is not performed after command substitution:
$ echo "#"
#
$ echo '"'
"
as expected, but
$ echo $(echo '"')#"
>
What am I not understanding?
Added after reading answer/comments:
From what everybody is saying, the consideration of quotes happens at the very beginning of parsing, for example, to decide if a command is even "acceptable". Then why does the standard bother to emphasise, that the quote removal is performed late in the process??
"then the outer command becomes echo "#" and is balanced"
That is not 'balanced' because the first double-quote does not count. Quotes are only meaningful as quotes if they appear unencumbered on the command line.
To verify, let's look at this:
$ echo $(echo '"')#
"#
That is balanced because the shell does considers that " to be just another character.
By contrast, this is unbalanced because it has one and only one shell-active ":
$ echo $(echo '"')#"
>
Similar example 1
Here we show the same thing but using parameter expansion instead of command substitution:
$ q='"'; echo $q
"
Once the shell has substituted " for $q, one might think that there was an unbalanced double-quote. But, that double-quote was the results of parameter expansion and is therefore not a shell-active quote.
Similar example 2
Let's consider a directory containing file:
$ ls
file
$ ls "file"
file
As you can see above, quote removal is perfomed before ls is run.
But, consider this command:
$ echo ls $(echo '"file"')
ls "file"
As you can see ls $(echo '"file"') expands to ls "file" which is the command which ran successfully above. Now, let's try running that:
$ ls $(echo '"file"')
ls: cannot access '"file"': No such file or directory
As you can see, the shell does not treat the double-quotes that remain after command substitution. This is because those quotes are not considered to be shell-active. As a consequence, they are treated as normal characters and passed on to ls which complains that the file whose name begins and ends with " does not exist in the directory.
The same is happening here:
$ cmd='ls "file"'
$ $cmd
ls: cannot access '"file"': No such file or directory
POSIX standard
From the POSIX standard:
Enclosing characters in single-quotes ( ' ' ) shall preserve the
literal value of each character within the single-quotes
In other words, once the double-quote appears inside single quotes, it has no special powers: it is just another character.
The standard also mentions escaping and double-quotes as methods of preserving "the literal value" of a character.
Practical consequences
People new to shell often want to store a command in a variable as in the cmd='ls "file"' example above. But, because quotes and other shell-active characters cease to be shell active once they are stored in a variable, the complex cases always fail. This leads to a classic essay:
"I'm trying to put a command in a variable, but the complex cases always fail!"
I want to issue this command from the bash script
sed -e $beginning,$s/pattern/$variable/ file
but any possible combination of quotes gives me an error, only one that works:
sed -e "$beginning,$"'s/pattern/$variable/' file
also not good, because it do not dereferences the variable.
Does my approach can be implemented with sed?
Feel free to switch the quotes up. The shell can keep things straight.
sed -e "$beginning"',$s/pattern/'"$variable"'/' file
You can try this:
$ sed -e "$beginning,$ s/pattern/$variable/" file
Example
file.txt:
one
two
three
Try:
$ beginning=1
$ variable=ONE
$ sed -e "$beginning,$ s/one/$variable/" file.txt
Output:
ONE
two
three
There are two types of quotes:
Single quotes preserve their contents (> is the prompt):
> var=blah
> echo '$var'
$var
Double quotes allow for parameter expansion:
> var=blah
> echo "$var"
blah
And two types of $ sign:
One to tell the shell that what follows is the name of a parameter to be expanded
One that stands for "last line" in sed.
You have to combine these so
The shell doesn't think sed's $ has anything to do with a parameter
The shell parameters still get expanded (can't be within single quotes)
The whole sed command is quoted.
One possibility would be
sed "$beginning,\$s/pattern/$variable/" file
The whole command is in double quotes, i.e., parameters get expanded ($beginning and $variable). To make sure the shell doesn't try to expand $s, which doesn't exist, the "end of line" $ is escaped so the shell doesn't try anything funny.
Other options are
Double quoting everything but adding a space between $ and s (see Ren's answer)
Mixing quoting types as needed (see Ignacio's answer)
Methods that don't work
sed '$beginning,$s/pattern/$variable/' file
Everything in single quotes: the shell parameters are not expanded (doesn't follow rule 2 above). $beginning is not a valid address, and pattern would be literally replaced by $variable.
sed "$beginning,$s/pattern/$variable/" file
Everything in double qoutes: the parameters are expanded, including $s, which isn't supposed to (doesn't follow rule 1 above).
the following form worked for me from within script
sed $beg,$ -e s/pattern/$variable/ file
the same form will also work if executed from the shell
I'm trying to pull a list of files over ssh with rsync, but I can't get it to work with filenames that have spaces on it! One example file is this:
/home/pi/Transmission_Downloads/FUNDAMENTOS_JAVA_E_ORIENTAÇÃO_A_OBJETOS/2. Fundamentos da linguagem/estruturas-de-controle-if-else-if-e-else-v1.mp4
and I'm trying to transfer it using this shell code.
cat $file_name | while read LINE
do
echo $LINE
rsync -aP "$user#$server:$LINE" $local_folder
done
and the error I'm getting is this:
receiving incremental file list
rsync: link_stat "/home/pi/Transmission_Downloads/FUNDAMENTOS_JAVA_E_ORIENTAÇÃO_A_OBJETOS/2." failed: No such file or directory (2)
rsync: link_stat "/home/pi/Fundamentos" failed: No such file or directory (2)
rsync: link_stat "/home/pi/da" failed: No such file or directory (2)
rsync: change_dir "/home/pi//linguagem" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]
I don't get it why does it print OK on the screen, but parses the file name/path incorrectly! I know spaces are actually backslash with spaces, but don't know how to solve this. Sed (find/replace) didn't help either, and I also tried this code without success
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Text read from file: $line"
rsync -aP "$user#$server:$line" $local_folder
done < $file_name
What should I do to fix this, and why this is happening?
I read the list of files from a .txt file (each file and path on one line), and I'm using ubuntu 14.04. Thanks!
rsync does space splitting by default.
You can disable this using the -s (or --protect-args) flag, or you can escape the spaces within the filename
The shell is correctly passing the filename to rsync, but rsync interprets spaces as separating multiple paths on the same server. So in addition to double-quoting the variable expansion to make sure rsync sees the string as a single argument, you also need to quote the spaces within the filename.
If your filenames don't have apostrophes in them, you can do that with single quotes inside the double quotes:
rsync -aP "$user#$server:'$LINE'" "$local_folder"
If your filenames might have apostrophes in them, then you need to quote those (whether or not the filenames also have spaces). You can use bash's built-in parameter substitution to do that (as long as you're on bash 4; older versions, such as the /bin/bash that ships on OS X, have issues with backslashes and apostrophes in such expressions). Here's what it looks like:
rsync -aP "$user#$server:'${LINE//\'/\'\\\'\'}'" "$local_folder"
Ugly, I know, but effective. Explanation follows after the other options.
If you're using an older bash or a different shell, you can use sed instead:
rsync -aP "$user#$server:'$(sed "s/'/'\\\\''/g" <<<"$LINE")'" "$local_folder"
... or if your shell also doesn't support <<< here-strings:
rsync -aP "$user#$server:'$(echo "$LINE" | sed "s/'/'\\\\''/g")'" "$local_folder"
Explanation: we want to replace all apostrophes with.. something that becomes a literal apostrophe in the middle of a single-quoted string. Since there's no way to escape anything inside single quotes, we have to first close the quotes, then add a literal apostrophe, and then re-open the quotes for the rest of the string. Effectively, that means we want to replace all occurrences of an apostrophe (') with the sequence (apostrophe, backslash, apostrophe, apostrophe): '\''. We can do that with either bash parameter expansion or sed.
In bash, ${varname/old/new} expands to the value of the variable $varname with the first occurrence of the string old replaced by the string new. Doubling the first slash ( ${varname//old/new} ) replaces all occurrences instead of just the first one. That's what we want here. But since both apostrophe and backslash are special to the shell, we have to put a(nother) backslash in front of every one of those characters in both expressions. That turns our old value into \', and our new one into \'\\\'\'.
The sed version is a little simpler, since apostrophes aren't special. Backslashes still are, so we have to put a \\ in the string to get a \ back. Since we want apostrophes in the string, it's easier to use a double-quoted string instead of a single-quoted one, but that means we need to double all the backslashes again to make sure the shell passes them on to sed unmolested. That's why the shell command has \\\\: that gets handed to sed as \\, which it outputs as \.