How to urlencode data into a URL, with bash or curl

How to urlencode data into a URL, with bash or curl - bash

How can a string be urlencoded and embedded into the URL? Please note that I am not trying to GET or POST data, so the -G and --data and --data-urlencode options of curl don't seem to do the job.
For example, if you used
curl -G http://example.com/foo --data-urlencode "bar=spaced data"
that would be functionally equivalent to
curl http://example.com/foo?bar=spaced%20data"
which is not desired.
I have a string foo/bar which must be urlencoded foo%2fbar and embedded into the URL.
curl http://example.com/api/projects/foo%2fbar/events
One hypothetical solution (if I could find something like this) would be to preprocess the data in bash, if there exists some kind of urlencode function.
DATA=foo/bar
ENCODED=`urlencode $DATA`
curl http://example.com/api/projects/${ENCODED}/events
Another hypothetical solution (if I could find something like this) would be some switch in curl, similar to this:
curl http://example.com/api/projects/{0}/events --string-urlencode "0=foo/bar"
The specific reason I'm looking for an answer to this question is the Gitlab API. For example, gitlab get single project NAMESPACE/PROJECT_NAME is URL-encoded, eg. /api/v3/projects/diaspora%2Fdiaspora (where / is represented by %2F). Further to this, you can request individual properties in the project, so you end up with a URL such as http://example.com/projects/diaspora%2Fdiaspora/events
Although this question is gitlab-specific, I imagine it's generally applicable to REST API's in general, and I'm surprised I can't find a pre-existing answer on stackoverflow or internet search.

The urlencode function you propose is easy enough to implement:
urlencode() {
python -c 'import urllib, sys; print urllib.quote(sys.argv[1], sys.argv[2])' \
"$1" "$urlencode_safe"
}
...used as:
data=foo/bar
encoded=$(urlencode "$data")
curl "http://example.com/api/projects/${encoded}/events"
If you want to have some characters which are passed through literally -- in many use cases, this is desired for /s -- instead use:
encoded=$(urlencode_safe='/' urlencode "$data")

Related

Is it sensible to automatically URL encode query parameters in a string?

Say I want to make the following request using curl:
https://api.foobar.com/widgets?begin=2018-09-10T01:00:00+01:00&object={"name":"barry"}
The URL encoded version of that string looks like this:
https://api.foobar.com/widgets?begin=2018-09-10T01%3A00%3A00%2B01%3A00&object=%7B%22name%22%3A%22barry%22%7D
Of course, when I'm making requests at the command line I would much rather look at the nicer looking (but not URL-valid) first version. I'm considering using a bash script to split out the different parts of the nice version, encode the relevant ones, and then glue it back together so I don't have to worry about it.
For example, after a couple of rounds of simple splitting on ?, &, and = I can easily get to:
https://api.foobar.com/widgets
begin
2018-09-10T01:00:00+01:00
object
{"name":"barry"}
And after that, URL encode the query string's two values and glue it all back together. I accept that any occurences of & and = in the query string will break this approach.
Is there anything else I should worry about that might make this a particularly stupid idea?

Use --data-urlencode with --get
curl --data-urlencode 'begin=2018-09-10T01:00:00+01:00' --data-urlencode 'object={"name":"barry"}' --get 'http://api.foobar.com/widgets'
-G, --get When used, this option will make all data specified with -d, --data, --data-binary or --data-urlencode to be used in an HTTP GET request instead of the POST request that otherwise would be used. The data will be appended to the URL with a '?' separator.

This is the script I came up with in the end, curl-encoded.sh:
#!/usr/bin/env bash
#
# Make HTTP request for given URL with query string URL encoded.
#
set -e
# Name args.
URL=$1
if [[ $URL = *"?"* ]]; then
DOMAIN_PATH="${URL%%\?*}";
QUERY_STRING="${URL#*\?}"
else
DOMAIN_PATH=$URL
QUERY_STRING=''
fi
# Split query string into key/value pairs.
IFS='&' read -ra PARAMETERS <<< "$QUERY_STRING"
for PARAMETER in "${PARAMETERS[#]}"; do
URLENCODED_PARAMETERS=("${URLENCODED_PARAMETERS[#]}" "--data-urlencode" "$PARAMETER")
done
# Make request.
curl --silent "${URLENCODED_PARAMETERS[#]/#/}" "${#:2}" --get "$DOMAIN_PATH"
You call:
./curl-encoded.sh https://api.foobar.com/widgets?foo=bar&object={"name":"barry"}
And the URL that's fetched is:
https://api.foobar.com/widgets?foo=bar&object=%7B%22name%22%3A%22barry%22%7D

executing a curl request through bash script

I have to insert many data in my application and through the graphical interface it takes many time. For this reason I want to create a bash script and make the requests through curl using the REST API (I have to manually specify the id).
The problem is that i get the error: The server refused this request because the request entity is in a format not supported by the requested resource for the requested method.
Here is the code
#!/bin/bash
for i in {1..1}
do
CURL='/usr/bin/curl -X POST'
RVMHTTP="http://192.168.1.101:8080/sitewhere/api/devices
-H 'accept:application/json'
-H 'content-type:application/json'
-H 'x-sitewhere-tenant:sitewhere1234567890'
--user admin:password"
DATA=" -d '{\"hardwareId":\"$i",\"siteToken\":\"4e6913db-c8d3-4e45-9436-f0a99b502d3c\",\"specificationToken\":\"82043707-9e3d-441f-bdcc-33cf0f4f7260\"}'"
# or you can redirect it into a file:
$CURL $RVMHTTP $DATA >> /home/bluedragon/Desktop/tokens
done
The format of my request has to be json

#!/usr/bin/env bash
rvmcurl() {
local url
url="http://192.168.1.101:8080/sitewhere/${1#/}"
shift || return # function should fail if we weren't passed at least one argument
curl -XPOST "${rvm_curl_args[#]}" "$url" "$#"
}
i=1 # for testing purposes
rvm_curl_args=(
-H 'accept:application/json'
-H 'content-type:application/json'
-H 'x-sitewhere-tenant:sitewhere1234567890'
--user admin:password
)
data=$(jq -n --arg hardwareId "$i" '
{
"hardwareId": $hardwareId,
"siteToken": "4e6913db-c8d3-4e45-9436-f0a99b502d3c",
"specializationToken": "82043707-9e3d-441f-bdcc-33cf0f4f7260"
}')
rvmcurl /api/devices -d "$data"
Note:
Commands, or command fragments intended to be parsed into multiple words, should never be stored in strings. Use an array or a function instead. Quotes inside such strings are not parsed as syntax, and instead (when parsed without eval, which carries its own serious risks and caveats) become literal values. See BashFAQ #50 for a full explanation.
Use a JSON-aware tool, such as jq, to ensure that generated data is legit JSON.
Fully-qualifying paths to binaries is, in general, an antipattern. It doesn't result in a significant performance gain (the shell caches PATH lookups), but it does reduce your scripts' portability and flexibility (preventing you from installing a wrapper for curl in your PATH, in an exported shell function, or otherwise).
All-caps variable names are in a namespace used for variables with meaning to the shell and operating system. Use names with at least one lowercase character for your own variables to prevent any chance of conflict.

Curl Post with File

I have data phone in phone.txt
+6285712341234
+6285712341235
+6285712341236
+6285712341237
+6285712341238
but I don't know how to use this data to curl, here's what I tried:
curl -X POST "https://rest-api.moceansms.com/rest/1/sms" -d "mocean-api-key={api_key}&mocean-api-secret={api_secret}&mocean-from={name}&mocean-to={phone.txt}&mocean-text=Hello"
I should use phone data to send SMS to everyone; I googled for a solution, but with no luck (I don't even know whether the keywords I used to look for a solution where correct or not).

You should use --data flag:
Check:
https://curl.haxx.se/mail/archive-2007-03/0097.html
https://curl.haxx.se/docs/manpage.html#-d
Here the entire explanation from man:
-d, --data
(HTTP) Sends the specified data in a POST request to the HTTP server,
in the same way that a browser does when a user has filled in an HTML
form and presses the submit button. This will cause curl to pass the
data to the server using the content-type
application/x-www-form-urlencoded. Compare to -F, --form.
--data-raw is almost the same but does not have a special interpretation of the # character. To post data purely binary, you
should instead use the --data-binary option. To URL-encode the value
of a form field you may use --data-urlencode.
If any of these options is used more than once on the same command
line, the data pieces specified will be merged together with a
separating &-symbol. Thus, using '-d name=daniel -d skill=lousy' would
generate a post chunk that looks like 'name=daniel&skill=lousy'.
If you start the data with the letter #, the rest should be a file
name to read the data from, or - if you want curl to read the data
from stdin. Multiple files can also be specified. Posting data from a
file named 'foobar' would thus be done with -d, --data #foobar. When
--data is told to read from a file like that, carriage returns and newlines will be stripped out. If you don't want the # character to
have a special interpretation use --data-raw instead.
See also --data-binary and --data-urlencode and --data-raw. This
option overrides -F, --form and -I, --head and --upload.

Strange characters appearing in bash variable expansion

Trying to do the following on contos7 works as I expect:
pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep -i '"name": "myapp-' | cut -d '"' -f 4)
echo "$pod_in_question"
curl -u uname:password -k -X DELETE "very.cluster.com/api/v1/namespaces/default/pods/${pod_in_question}"
However, trying the same thing on MacOS (10.12.1) yields:
curl: (3) [globbing] bad range in column 92
When I try to curl the last line with a -g option it substitutes with a malformed name such as: myapp-\\x1b[m\\x1b[Kl1eti\
The echo statement would always execute just fine and show something like myapp-v7454 which I later want to put into the last curl statement. So where are these other characters coming from?

A robust solution - Basic cURL CLI debugging.
This answer is revised after it's been identified that the issue for the OP relates to curl applying color output.
There's a proposed answer which explains clearly what the embedded special characters meant, and instructions to override the grep behaviour to not output color. Certainly this is a good practise for grep use in piping. There are however a number of best practises that can help diagnose this or a similar issue with cURL and ultimately lead to the most robust solution.
Re-creating the problem
Assuming it's a JSON Content-Type, we use echo {'"name": "myapp-7414"'} to simulate the output from cURL
We filter the text and set a variable with it that we use in a cURL command
We force grep to output color, since it doesn't normally by default when outputting to a tty.
Recreation:
myvar=$(echo {'"name": "myapp-7414"'} | grep --color=always -i '"name": "myapp-' | cut -d '"' -f 4)
curl "https://www.google.com/${myvar}"
Output:
curl: (3) [globbing] bad range in column 32
First up:
'{}' are special characters to cURL, period.
The best practise for URL syntax in cURL:
If Variable Expansion is required:
Apply the -g switch to disable potential globbing done by cURL
Otherwise:
Use $variable as part of a "quoted" url string, instead of ${variable}
Second: In addition to -g, we add --libcurl /tmp/libcurl so we can get some insight into what cURL is seeing.
Recreation with -g and --libcurl:
curl -g --libcurl /tmp/libcurl "https://www.google.com/${myvar}"
Output:
<p>Your client has issued a malformed or illegal request <ins>That’s all we know.
Perfect, at least now everything is getting to the server and back! Let's see what cURL sent out to the server:
cat /tmp/libcurl
Surely enough we find this line: (note the bold part).
curl_easy_setopt(hnd, CURLOPT_URL, "https://www.google.com/myapp-\033[m7414");
So we know that:
The shell is doing something strange with our variable.
cURL knows not to try glob once we send the -g switch. That way - If there is an error with the shell variable, we can actually see what it is. We shouldn't be debugging a globbing error if we're not trying to use URL Ranges.
The special characters are colors. They represent the --color=always that we added to simulate the OPs environment.
At this point. Since it looks like we're working with JSON data, why not just use a widely available, high performance JSON parsing tool. That has a number of benefits, including:
Not relying on any environment that could affect string filtering
Can request the data we want (aka. "name")
The app name "myapp" can change and we won't have to re-write the code to retrieve it.
It's cleaner and accounts for things I haven't considered yet.
If we used jq for example (while we're at it, we don't need the -g switch because we don't need '{}' for the variable because we're already double " the URL):
myvar=$(echo {'"name": "myapp-7414"'} | jq -r .name)
curl --libcurl /tmp/libcurl "https://www.google.com/$myvar"
Now we get:
<p>The requested URL /myapp-7414 was not found on this server. That’s all we know.
Great. It's all working now. It should be obvious that the test URL here being www.google.com is obviously not going to know was myapp-7414 was.
So we've gone from :
Globbing bad range, to:
Malformed URL, to:
URL not found on server.
We could also as suggested elsewhere change the grep output and override it to --color=never (As I have noted: If grep has to be used, the --color=never is a great way to use it as a best practise when piping strings, period.). However, given the portability issues already experienced because of string filtering, and the fact that we are already handed structured data on a plate that can be parsed reliably, the more robust solution would be to do just that, if possible.

The substitution you showed at the last part looks like one of your calls injected ANSI escape sequences. It's possible that grep isn't detecting non-TTY output and is colorizing.
On a terminal that supports ANSI escape sequences, your particular codes might not be visible. The codes ^E[m^E[K set the screen mode and clear the current line. That's why you thought the echo command proved your data was correct.
You can examine the raw data with:
echo "$pod_in_question" | hexdump -C
And you should see there are other characters in there which did not appear in your terminal before. When you put these "invisible" codes into the URL, curl tries to encode them and then fails when it encounters a control character (ESC).
The solution is to add the argument --color=never to your grep call, which will disable colorization.

Github API /issues - pagination trouble

I am using curl from a bash command line to GET Github issues like this:
curl -o myoutput --user "myuser:mypasswd" -G https://api.github.com/issues?filter=all
This is working fine and returns 52 open issues.
I know there are more issues, so I am also examining the headers (using -i) which provides links to the next & last pages, https://api.github.com/issues?filter=all&page=2 & https://api.github.com/issues?filter=all&page=14 respectively
However, using curl with these link URI's produces the same 52 results as before. In fact any page number I try returns the same most recent issues. I am deleting myoutput each time.
What am I missing?
Any words of wisdom on this would be much appreciated.
Thanks

What am I missing?
Use a single quoted string for the URL to make sure the ampersand (e.g &page=2) is not interpreted as a control operator:
curl -o myoutput2 --user "user:pwd" \
'https://api.github.com/issues?filter=all&page=2'
Without doing so you systematically perform a https://api.github.com/issues?filter=all request, which is why the output is always the same.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio