curl: (3) Illegal characters found in URL : ${...%?} doesn't work [duplicate] - bash

This question already has an answer here:
Why is a shell script giving syntax errors when the same code works elsewhere? [duplicate]
(1 answer)
Closed 5 years ago.
I've been looking for a solution to my problem all morning, especially in the four Stack Overflow posts with the same error in their title, but those solutions don't work for me.
I want to put several simple cURL requests together in a Bash script. The request at the end of the file always works, whatever request it is. However, the requests before it return an error:
curl: (3) Illegal characters found in URL
I am pretty sure it has something to do with the carriage returns in my file, but I don't know how to deal with them. As shown in the screenshot below, I tried ${url1%?}; I also tried ${url1%$'\r'}, but neither changes anything.
[Screenshot of the file and the results in the terminal]
Any ideas?

If your lines end with \r, stripping the \r from $url won't work, because the line
curl -o NUL "${url1%?}"
also ends with a \r, which is appended to the URL argument again.
Comment out the \r, that is:
url1="www.domain.tld/file"
curl -o NUL "${url1%?}" #
or
url1="www.domain.tld/file" #
curl -o NUL "$url1" #
or convert the file before executing it
tr -d '\r' < test.sh > testWithoutR.sh
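To check whether the carriage returns are really there before converting, you can make the line endings visible first (a small sketch; cat -A is the GNU coreutils option, BSD/macOS cat uses -e instead):
cat -A test.sh           # CRLF line endings show up as "^M$" at the end of each line
sed -i 's/\r$//' test.sh # strip them in place (GNU sed's -i shown here)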

Decode URL Unix/Bash Command Line (without sed) [duplicate]

This question already has answers here:
Bash script to convert from HTML entities to characters
(12 answers)
Closed 4 years ago.
I am scraping a website with curl and parsing out what I need.
The URLs are returned with ASCII-encoded characters (HTML entities) like
GET v2.12/...?fields={fieldname_of_type_Tab&#125; HTTP/1.1
How can I convert this to UTF-8 characters directly from the command line (ideally something I can pipe to) so that the result is...
GET v2.12/...?fields={fieldname_of_type_Tab} HTTP/1.1
EDIT: There are a number of solutions with sed, but the regex that goes along with them is quite ugly. Since the provided answer leveraging perl is very clean, I hope we can leave this question open.
Those are HTML entities.
Decode them like this using perl:
$ echo 'http://domain.tld/?fields={fieldname_of_type_Tab&#125;' |
perl -MHTML::Entities -pe 'decode_entities($_)'
Output:
http://domain.tld/?fields={fieldname_of_type_Tab}
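For repeated use, the one-liner wraps naturally in a shell function (a sketch; htmldecode is a hypothetical name, and the HTML::Entities module must be installed):
# decode HTML entities on stdin (requires perl's HTML::Entities)
htmldecode() { perl -MHTML::Entities -pe 'decode_entities($_)'; }
echo 'fields={fieldname_of_type_Tab&#125;' | htmldecode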

Strange characters appearing in bash variable expansion

Trying to do the following on CentOS 7 works as I expect:
pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep -i '"name": "myapp-' | cut -d '"' -f 4)
echo "$pod_in_question"
curl -u uname:password -k -X DELETE "very.cluster.com/api/v1/namespaces/default/pods/${pod_in_question}"
However, trying the same thing on MacOS (10.12.1) yields:
curl: (3) [globbing] bad range in column 92
When I try to curl the last line with the -g option, it substitutes a malformed name such as: myapp-\x1b[m\x1b[Kl1eti
The echo statement always executes just fine and shows something like myapp-v7454, which I later want to put into the last curl statement. So where are these other characters coming from?
A robust solution - Basic cURL CLI debugging.
This answer was revised after it was identified that the OP's issue relates to grep applying color output.
There's a proposed answer which explains clearly what the embedded special characters mean, with instructions to override grep's behaviour so it doesn't output color. That is certainly good practise for grep when piping. There are, however, a number of best practises that can help diagnose this or a similar issue with cURL and ultimately lead to the most robust solution.
Re-creating the problem
Assuming it's a JSON Content-Type, we use echo {'"name": "myapp-7414"'} to simulate the output from cURL.
We filter the text and set a variable with it that we use in a cURL command.
We force grep to output color, since by default it doesn't colorize when its output goes to a pipe rather than a terminal.
Recreation:
myvar=$(echo {'"name": "myapp-7414"'} | grep --color=always -i '"name": "myapp-' | cut -d '"' -f 4)
curl "https://www.google.com/${myvar}"
Output:
curl: (3) [globbing] bad range in column 32
First up:
'{}' are special characters to cURL, period.
The best practise for URL syntax in cURL:
If Variable Expansion is required:
Apply the -g switch to disable potential globbing done by cURL
Otherwise:
Use $variable as part of a "quoted" url string, instead of ${variable}
Second: in addition to -g, we add --libcurl /tmp/libcurl so we can get some insight into what cURL is seeing.
Recreation with -g and --libcurl:
curl -g --libcurl /tmp/libcurl "https://www.google.com/${myvar}"
Output:
<p>Your client has issued a malformed or illegal request <ins>That’s all we know.
Perfect, at least now everything is getting to the server and back! Let's see what cURL sent out to the server:
cat /tmp/libcurl
Sure enough we find this line (note the escape sequence embedded in the URL):
curl_easy_setopt(hnd, CURLOPT_URL, "https://www.google.com/myapp-\033[m7414");
So we know that:
The shell is doing something strange with our variable.
cURL knows not to attempt globbing once we pass the -g switch. That way, if there is an error with the shell variable, we can actually see what it is. We shouldn't be debugging a globbing error if we're not trying to use URL ranges.
The special characters are colors. They come from the --color=always that we added to simulate the OP's environment.
At this point, since it looks like we're working with JSON data, why not just use a widely available, high-performance JSON parsing tool? That has a number of benefits, including:
Not relying on any environment that could affect string filtering
Can request the data we want (aka. "name")
The app name "myapp" can change and we won't have to re-write the code to retrieve it.
It's cleaner and accounts for things I haven't considered yet.
If we use jq, for example (while we're at it, we no longer need the -g switch: there are no '{}' around the variable because we're already double-quoting the URL):
myvar=$(echo {'"name": "myapp-7414"'} | jq -r .name)
curl --libcurl /tmp/libcurl "https://www.google.com/$myvar"
Now we get:
<p>The requested URL /myapp-7414 was not found on this server. That’s all we know.
Great, it's all working now. Naturally the test URL here, www.google.com, has no idea what myapp-7414 is.
So we've gone from:
globbing bad range, to:
malformed URL, to:
URL not found on server.
We could also, as suggested elsewhere, override the grep output with --color=never (as noted: if grep has to be used, --color=never is good practise when piping strings, period). However, given the portability issues already experienced because of string filtering, and the fact that we are already handed structured data that can be parsed reliably, the more robust solution is to do just that, if possible.
The substitution you showed in the last part looks like one of your calls injected ANSI escape sequences; grep probably isn't detecting that its output is not a TTY and is colorizing it anyway.
On a terminal that supports ANSI escape sequences, your particular codes might not be visible: \x1b[m resets the character attributes and \x1b[K clears from the cursor to the end of the line. That's why you thought the echo command proved your data was correct.
You can examine the raw data with:
echo "$pod_in_question" | hexdump -C
And you should see there are other characters in there which did not appear in your terminal before. When you put these "invisible" codes into the URL, curl tries to encode them and then fails when it encounters a control character (ESC).
The solution is to add the argument --color=never to your grep call, which will disable colorization.
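Putting it together, the original pipeline with that fix applied (a sketch reusing the placeholders from the question; -s merely silences curl's progress meter):
pod_in_question=$(curl -s -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ \
    | grep --color=never -i '"name": "myapp-' | cut -d '"' -f 4)
curl -u uname:password -k -X DELETE "very.cluster.com/api/v1/namespaces/default/pods/${pod_in_question}"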

Weird behavior in bashrc: concatenate, same code run/fail [duplicate]

This question already has answers here:
Bash syntax error: unexpected end of file
(21 answers)
Closed 6 years ago.
I have a VM with CentOS 6 and I load some scripts from bashrc.
Everything worked fine, but when I copy-pasted the same code and scripts into an older backup of the same VM, I got an error: "unexpected end of file". Another person I wanted to share those scripts with hit the same error (he had the same VM).
So I started to debug a little and found one row it didn't like (it was parsing an array):
COUNTER=1
while [[ ! -z ${SCRIPT[$COUNTER]} ]]; do
It didn't like this either (it's not exactly the same as the "while" logic, but it does the job):
for i in ${Script[@]}; do
So, I replaced it with:
for ((i = 0; i < ${#SCRIPT[@]}; i++)); do
Then I tried to reproduce the error with the same piece of code, and no more errors occurred.
I also see this behavior, which is the weirdest of all:
Code:
BASH_SCRIPTS_LOCATION='/mnt/hgfs/Shared-workspace/scripts/'
SCRIPT[0]='aliases.sh'
SCRIPT[1]='scripts_config.sh'
SCRIPT[2]='credentials.sh'
SCRIPT[3]='other_functions.sh'
SCRIPT[4]='ssh_functions.sh'
SCRIPT[5]='release_functions.sh'
SCRIPT[6]='test_functions.sh'
for ((i = 0; i < ${#SCRIPT[@]}; i++)); do
loadedScript=${BASH_SCRIPTS_LOCATION}${SCRIPT[$i]}
echo -e "$loadedScript"
done
Terminal output (the concatenation seems to replace characters starting from the beginning of the first string/variable):
aliases.shShared-workspace/scripts/
scripts_config.shworkspace/scripts/
credentials.shed-workspace/scripts/
other_functions.shorkspace/scripts/
ssh_functions.sh-workspace/scripts/
release_functions.shkspace/scripts/
test_functions.shworkspace/scripts/
I think I am using something very inappropriate, but I am not sure what, or what I should be looking for.
Any recommendation or advice is welcome.
Thanks!
It doesn't show here, but your script has carriage-return chars at the end of the variable definition lines. Edit them out (using Notepad++, for instance, or tr -d "\015" < yourscript.sh > newscript.sh).
If you redirect your script's output to a file, you'll see all the text in the file.
The carriage-return char (ASCII 13, \r) just resets the cursor to the start of the line without skipping to a new line; every character written after it overwrites the text already on the line. Windows uses it to complement the linefeed character; that's how Windows text mode works.
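The mangled output is easy to reproduce once you know the \r is there (a minimal sketch; "aliases.sh" and "/mnt/hgfs/" both happen to be 10 characters, which is why exactly that much of the path survives on screen):
# the \r rewinds the cursor to column 0, then "aliases.sh" overwrites "/mnt/hgfs/"
printf '/mnt/hgfs/Shared-workspace/scripts/\raliases.sh\n'
# terminal shows: aliases.shShared-workspace/scripts/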

Wget: read URL from file, add sequence of number to the URL

I am reading a file (with URLs) line by line:
#!/bin/bash
while read line
do
url=$line
wget $url
wget $url_{001..005}.jpg
done < $1
First, I want to download the primary URL, as you see with wget $url. After that I want to append a sequence of numbers to the URL (_001.jpg, _002.jpg, _003.jpg, _004.jpg, _005.jpg):
wget $url_{001..005}.jpg
...but for some reason it's not working.
Sorry, I missed one thing: the URLs are like http://xy.com/052914.jpg. Is there an easy way to add _001 before the extension, giving http://xy.com/052914_001.jpg? Or do I have to remove ".jpg" from the file containing the URLs and then simply add it back to the variable later?
Another way is escaping the underscore char:
wget $url\_{001..005}.jpg
Try encapsulating your variable name:
wget ${url}_{001..005}.jpg
Bash is trying to expand the variable $url_ in your command.
As for your jpg within the URL followup, see substring expansion in the bash manual.
wget ${url:0: -4}_{001..005}.jpg
The :0: -4 means: expand the variable from position zero (the first character), stopping 4 characters before the end, which drops the .jpg.
Or from this answer:
wget ${url%.jpg}_{001..005}.jpg
%.jpg removes .jpg specifically and will work on older versions of bash.
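Putting the answers together, the loop from the question might look like this (a sketch; read -r also keeps the shell from mangling any backslashes in the URLs):
#!/bin/bash
while read -r url
do
    wget "$url"                        # the primary image
    wget "${url%.jpg}"_{001..005}.jpg  # 052914_001.jpg .. 052914_005.jpg
done < "$1"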

Bash curl and variable in the middle of the url

I need to read certain data using curl. I'm basically reading keywords from a file:
while read line
do
curl 'https://gdata.youtube.com/feeds/api/users/'"${line}"'/subscriptions?v=2&alt=json' \
> '/home/user/archive/'"$line"
done < textfile.txt
Anyway, I haven't found a way to form the URL for curl so that it works. I've tried just about every possible single- and double-quoted version. Basically:
'...'"$line"'...'
"..."${line}"..."
'...'$line'...'
and so on. Just name it and I'm pretty sure that I've tried it.
When I print out the URL, in the best case it is formed as:
/subscriptions?v=2&alt=jsoneeds/api/users/KEYWORD FROM FILE
or something similar. If you know what could be the cause of this I would appreciate the information. Thanks!
It's not a quoting issue. The problem is that your keyword file is in DOS format -- that is, each line ends with carriage return & linefeed (\r\n) rather than just linefeed (\n). The carriage return is getting read into the line variable, and included in the URL. The giveaway is that when you echo it, it appears to print:
/subscriptions?v=2&alt=jsoneeds/api/users/KEYWORD FROM FILE
but it's really printing:
https://gdata.youtube.com/feeds/api/users/KEYWORD FROM FILE
/subscriptions?v=2&alt=json
...with just a carriage return between them, so the second overwrites the first.
So what can you do about it? Here's a fairly easy way to trim the CR at the end of the line:
cr=$'\r'
while read line
do
line="${line%$cr}"
curl "https://gdata.youtube.com/feeds/api/users/${line}/subscriptions?v=2&alt=json" \
> "/home/user/archive/$line"
done < textfile.txt
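A one-time alternative is to convert the keyword file before running the loop, using the same tr trick seen in the answers above (textfile.unix.txt is just an arbitrary name):
tr -d '\r' < textfile.txt > textfile.unix.txt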
Your current version should work, I think. More elegant is to use a single pair of double quotes around the whole URL with the variable in ${}:
"https://gdata.youtube.com/feeds/api/users/${line}/subscriptions?v=2&alt=json"
Just use it like this, should be sufficient enough:
curl "https://gdata.youtube.com/feeds/api/users/${line}/subscriptions?v=2&alt=json" > "/home/user/archive/${line}"
If your shell gives you issues with the &, just escape it as \&, but it works fine for me without that.
If the data from the file can contain spaces and you have no objection to spaces in the file name in the /home/user/archive directory, then what you've got should be OK.
Given the contents of the rest of the URL, you could even just write:
while read line
do
curl "https://gdata.youtube.com/feeds/api/users/${line}/subscriptions?v=2&alt=json" \
> "/home/user/archive/${line}"
done < textfile.txt
where strictly the ${line} could be just $line in both places. This works because the strings are fixed and don't contain shell metacharacters.
Since your code is close to this, but you claim you're seeing the keywords from the file in the wrong place, a little rewriting for ease of debugging is in order:
while read line
do
url="https://gdata.youtube.com/feeds/api/users/${line}/subscriptions?v=2&alt=json"
file="/home/user/archive/${line}"
curl "$url" > "$file"
done < textfile.txt
Since the strings may end up containing spaces (do you need to expand spaces to + in the URL?), the quotes around the variables are strongly recommended. You can now run the script with sh -x (or add a set -x line to the script) and see what the shell thinks it is doing as it does it.
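For example, tracing just the curl invocation (a sketch; set -x echoes each expanded command to stderr before running it):
set -x                  # start tracing
curl "$url" > "$file"   # the trace shows the URL exactly as curl receives it
set +x                  # stop tracing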
