Curl and wget: why isn't the GET parameter used? - shell

I am trying to fetch data from this page using wget and curl in PHP. As you can see in your browser, the default result is 20 items, but by setting the GET parameter iip to a number x, I can fetch x items, e.g. http://www.example.com/foo?a=26033&iip=100
The problem is that the iip parameter only works in browsers. If I try to fetch the last link using wget or curl, only 20 items are returned. Why? Try this at the command line:
curl -O http://www.example.com/foo?a=26033&iip=100
wget http://www.example.com/foo?a=26033&iip=100
Why can't I use the GET parameter iip?

Try adding quotes:
curl -O 'http://www.objektvision.se/annonsorer?ai=26033&iip=100'
wget 'http://www.objektvision.se/annonsorer?ai=26033&iip=100'
The & has a special meaning on the command line, which is likely what's causing the issue.

Try quoting the argument. At least in cmd, & is used to delimit two commands that are run individually.

You'll have to enclose your URL in either " or ', since the & has a special meaning in shellscript... That'll give you:
curl -O "http://www.objektvision.se/annonsorer?ai=26033&iip=100"
wget "http://www.objektvision.se/annonsorer?ai=26033&iip=100"

& is a control operator in the shell. Just escape it like this: \&
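For example, with the URL from the question:
curl -O http://www.example.com/foo?a=26033\&iip=100
wget http://www.example.com/foo?a=26033\&iip=100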

Related

How to execute a cURL request which requires uploading a file in Golang?

I have a cURL request as follows.
$(curl --request PUT --upload-file "<path to catalog file on your local machine>" "<presigned URL>")
Let's say that I have to upload a bin/test.txt file with the presigned URL being https://www.someurl.com
I execute the command in my terminal
curl --request PUT --upload-file "bin/test.txt" "https://www.someurl.com" and it works fine.
How do I write a piece of Go code which does the same? I have tried
cmd := exec.Command("curl", "--request", "PUT", "--upload-file", fmt.Sprintf("\"%s\"", catalogPath), fmt.Sprintf("\"%s\"", presignedURL))
err = cmd.Run()
but found no success.
I see one obvious problem preventing that curl call from working properly, one quite possible problem, and another possible one.
The obvious problem is that string quoting — as in curl … --upload-file "bin/test.txt" … — is interpreted by the shell which executes the command. Quoting — using either double or single quotes — inhibits the interpretation of otherwise special characters by the shell; chiefly, it prevents the shell from splitting a string into separate "words" on runs of whitespace characters.
The key takeaway is that the command run by the shell, after the shell has fully parsed the command line (and interpreted the quotes), does not "see" those quotes, because the shell removes them.
os/exec.Cmd calls the specified program directly and does not "pass it through" the shell. Hence, if you include double quotes in the command-line parameters of the program to execute, they are passed to that program unchanged. This means curl would try to find a file named test.txt" located in a directory named "bin — which is most probably not what you expected.
The same applies to the URL.
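In other words, drop the fmt.Sprintf wrappers and pass the arguments verbatim; a minimal corrected call (keeping the variable names from your snippet) would be:
cmd := exec.Command("curl", "--request", "PUT", "--upload-file", catalogPath, presignedURL)
err = cmd.Run()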
The second — possible — problem is that your call relies on the current directory of your Go program because you pass a relative path to curl.
This might or might not be a problem, but it's worth checking anyway.
The third problem is that you might want to pass your URL through the "percent escaping" algorithm before passing it to curl.
You might look at PathEscape and QueryEscape functions of the net/url package.
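For instance, a quick sketch (the literal values here are made up for illustration; both functions live in net/url):
query := url.QueryEscape("a value & more")   // yields "a+value+%26+more"
segment := url.PathEscape("bin/test.txt")    // yields "bin%2Ftest.txt"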
Two pieces of advice follow.
First, I would try very hard not to call out to curl to perform such a ridiculously simple task. Go has excellent support for making HTTP requests (and serving them, FWIW) in its standard library, and PUTting a file is really a no-brainer, with a solution googleable in, like, five minutes.
Second, if, for some reason, you intend to stick with calling curl, consider passing it some options to make it fail loudly on errors — otherwise you're doomed to be in that „but found no success” situation in your attempts. For instance, consider passing curl the -s and -S command line options (together).
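For the first piece of advice, a minimal sketch of PUTting a file with net/http might look like this (error handling kept short; the path and URL are the ones from your question):
package main

import (
	"fmt"
	"net/http"
	"os"
)

func uploadFile(path, presignedURL string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	req, err := http.NewRequest(http.MethodPut, presignedURL, f)
	if err != nil {
		return err
	}
	// Some servers reject chunked uploads, so set the length explicitly.
	if fi, err := f.Stat(); err == nil {
		req.ContentLength = fi.Size()
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("upload failed: %s", resp.Status)
	}
	return nil
}

func main() {
	if err := uploadFile("bin/test.txt", "https://www.someurl.com"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
http.DefaultClient is fine for a sketch; for production use you'd want a client with a timeout.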
That's not how you quote shell arguments; that would break if your argument starts or ends with \ or ". The proper way to quote shell arguments on Unix would be:
func quoteshellarg(str string) string {
	if strings.Contains(str, "\x00") {
		panic("argument contains null bytes, it is impossible to escape null bytes in shell arguments!")
	}
	return "'" + strings.ReplaceAll(str, "'", "'\\''") + "'"
}
and with that, run the command line through a shell, since quoting is only interpreted by a shell:
cmd := exec.Command("/bin/sh", "-c", "curl --request PUT --upload-file "+quoteshellarg(catalogPath)+" "+quoteshellarg(presignedURL))
... at least, that's how to do it on Unix systems. As for how to do it on Windows, it seems nobody knows for sure, not even Microsoft.

Unable to use Select query while using curl

Out of interest, I am trying to use the curl query below:
$ curl -o /dev/null -s -w %{time_total} http://localhost:8080/getquery?db=EmpDt&col=HRDt&query=select * from emp where id=1111
but unable to execute it:
[1] 2784
[2] 2785
0.003799invalid option or syntax: 10
[1]- Done curl -o /dev/null -s -w %{time_total}
http://localhost:8080/getquery?db=EmpDt
[2]+ Done col=HRDt
Something is not correct here, but I'm not able to work out what. Any help would be really appreciated. Thanks
In shell, an unquoted & terminates the command and runs the command to its left in the background; thus your post contains three separate commands run concurrently. Either quote each & individually with a backslash, as \&, or surround at least the &s (and usually the whole string) with either single quotes 'http://host/q?x&y&z' or double quotes "http://host/q?x&y&z".
? and * are also special in shell, although not command terminators, and in general must also be quoted, although in your case, after fixing the spaces (below), this becomes less critical.
A URL cannot contain spaces; they must be encoded as + (preferred) or %20. Other special characters (here * and =) may not work depending on how your server handles URL parsing, which in turn depends on what your server is, and you didn't give any hint; in that case they too must be percent-encoded. (If you want an actual +, which you don't here, it is encoded as %2B.)
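Putting those fixes together, the command might look like this (assuming your server decodes + as a space; * and = in the query value are percent-encoded to be safe):
curl -o /dev/null -s -w '%{time_total}' 'http://localhost:8080/getquery?db=EmpDt&col=HRDt&query=select+%2A+from+emp+where+id%3D1111'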

Strange characters appearing in bash variable expansion

Trying to do the following on CentOS 7 works as I expect:
pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep -i '"name": "myapp-' | cut -d '"' -f 4)
echo "$pod_in_question"
curl -u uname:password -k -X DELETE "very.cluster.com/api/v1/namespaces/default/pods/${pod_in_question}"
However, trying the same thing on MacOS (10.12.1) yields:
curl: (3) [globbing] bad range in column 92
When I try to curl the last line with a -g option it substitutes with a malformed name such as: myapp-\\x1b[m\\x1b[Kl1eti\
The echo statement would always execute just fine and show something like myapp-v7454 which I later want to put into the last curl statement. So where are these other characters coming from?
A robust solution - Basic cURL CLI debugging.
This answer was revised after it was identified that the OP's issue relates to grep applying color to its output.
There's a proposed answer which explains clearly what the embedded special characters mean, with instructions to override the grep behaviour so it does not output color. Certainly this is good practice for grep use in piping. There are, however, a number of best practices that can help diagnose this or a similar issue with cURL and ultimately lead to the most robust solution.
Re-creating the problem
Assuming it's a JSON Content-Type, we use echo {'"name": "myapp-7414"'} to simulate the output from cURL
We filter the text and set a variable with it that we use in a cURL command
We force grep to output color, since by default it doesn't when its output is piped rather than sent to a tty.
Recreation:
myvar=$(echo {'"name": "myapp-7414"'} | grep --color=always -i '"name": "myapp-' | cut -d '"' -f 4)
curl "https://www.google.com/${myvar}"
Output:
curl: (3) [globbing] bad range in column 32
First up:
'{}' are special characters to cURL, period.
The best practise for URL syntax in cURL:
If Variable Expansion is required:
Apply the -g switch to disable potential globbing done by cURL
Otherwise:
Use $variable as part of a "quoted" url string, instead of ${variable}
Second: In addition to -g, we add --libcurl /tmp/libcurl so we can get some insight into what cURL is seeing.
   Recreation with -g and --libcurl:
curl -g --libcurl /tmp/libcurl "https://www.google.com/${myvar}"
Output:
<p>Your client has issued a malformed or illegal request <ins>That’s all we know.
Perfect, at least now everything is getting to the server and back! Let's see what cURL sent out to the server:
cat /tmp/libcurl
Sure enough, we find this line (note the escape sequence in the middle):
curl_easy_setopt(hnd, CURLOPT_URL, "https://www.google.com/myapp-\033[m7414");
So we know that:
The shell is doing something strange with our variable.
cURL knows not to try to glob once we pass the -g switch. That way, if there is an error with the shell variable, we can actually see what it is. We shouldn't be debugging a globbing error if we're not trying to use URL ranges.
The special characters are colors. They represent the --color=always that we added to simulate the OP's environment.
At this point, since it looks like we're working with JSON data, why not just use a widely available, high-performance JSON parsing tool? That has a number of benefits, including:
Not relying on any environment that could affect string filtering
Can request the data we want (aka. "name")
The app name "myapp" can change and we won't have to re-write the code to retrieve it.
It's cleaner and accounts for things I haven't considered yet.
If we used jq, for example (while we're at it, we no longer need the -g switch, because the URL is already double-quoted and $myvar needs no '{}'):
myvar=$(echo {'"name": "myapp-7414"'} | jq -r .name)
curl --libcurl /tmp/libcurl "https://www.google.com/$myvar"
Now we get:
<p>The requested URL /myapp-7414 was not found on this server. That’s all we know.
Great, it's all working now. Naturally, the test server at www.google.com has no idea what myapp-7414 is.
So we've gone from:
globbing bad range, to:
malformed URL, to:
URL not found on server.
We could also, as suggested elsewhere, override the grep output with --color=never (as noted: if grep has to be used, --color=never is a best practice when piping strings, period). However, given the portability issues already experienced because of string filtering, and the fact that we are already handed structured data on a plate that can be parsed reliably, the more robust solution is to do just that, if possible.
The substitution you showed in the last part suggests that one of your calls injected ANSI escape sequences. It's likely that grep isn't detecting that its output is not a TTY and is colorizing it anyway.
On a terminal that supports ANSI escape sequences, your particular codes might not be visible. The codes ^[[m^[[K reset the character attributes and clear to the end of the current line. That's why you thought the echo command proved your data was correct.
You can examine the raw data with:
echo "$pod_in_question" | hexdump -C
And you should see there are other characters in there which did not appear in your terminal before. When you put these "invisible" codes into the URL, curl tries to encode them and then fails when it encounters a control character (ESC).
The solution is to add the argument --color=never to your grep call, which will disable colorization.
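With that fix applied, the pipeline from the question becomes (unchanged apart from the grep flag):
pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep --color=never -i '"name": "myapp-' | cut -d '"' -f 4)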

Wget: read URL from file, add sequence of number to the URL

I am reading a file (with URL's) line by line:
#!/bin/bash
while read line
do
    url=$line
    wget $url
    wget $url_{001..005}.jpg
done < $1
First, I want to download the primary URL, as you can see with wget $url. After that I want to append a sequence of numbers to the URL (_001.jpg, _002.jpg, _003.jpg, _004.jpg, _005.jpg):
wget $url_{001..005}.jpg
...but for some reason it's not working.
Sorry, I missed one thing: the URLs are like http://xy.com/052914.jpg. Is there any easy way to add _001 before the extension, giving http://xy.com/052914_001.jpg? Or do I have to remove ".jpg" from the file containing the URLs and then simply append it to the variable later?
Another way, escaping the underscore character:
wget $url\_{001..005}.jpg
Try encapsulating your variable name:
wget ${url}_{001..005}.jpg
Bash is trying to expand the variable $url_ in your command.
As for your jpg within the URL followup, see substring expansion in the bash manual.
wget ${url:0: -4}_{001..005}.jpg
The :0: -4 means: expand the variable from position zero (the first character), minus the last 4 characters.
Or from this answer:
wget ${url%.jpg}_{001..005}.jpg
%.jpg removes .jpg specifically and will work on older versions of bash.
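Putting the pieces together, the loop could look like this (a sketch, assuming every URL in the file ends in .jpg as in your follow-up):
#!/bin/bash
while read -r url
do
    wget "$url"
    wget "${url%.jpg}"_{001..005}.jpg
done < "$1"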

Github API /issues - pagination trouble

I am using curl from a bash command line to GET Github issues like this:
curl -o myoutput --user "myuser:mypasswd" -G https://api.github.com/issues?filter=all
This is working fine and returns 52 open issues.
I know there are more issues, so I am also examining the headers (using -i), which provide links to the next & last pages, https://api.github.com/issues?filter=all&page=2 and https://api.github.com/issues?filter=all&page=14 respectively.
However, using curl with these link URIs produces the same 52 results as before. In fact, any page number I try returns the same most recent issues. I am deleting myoutput each time.
What am I missing?
Any words of wisdom on this would be much appreciated.
Thanks
What am I missing?
Use a single-quoted string for the URL to make sure the ampersand (e.g. in &page=2) is not interpreted as a control operator:
curl -o myoutput2 --user "user:pwd" \
'https://api.github.com/issues?filter=all&page=2'
Without doing so you systematically perform a https://api.github.com/issues?filter=all request, which is why the output is always the same.
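With the URL quoted, you can then loop over the pages; a sketch (14 being the last page reported in the Link headers above):
for page in {1..14}; do
    curl -o "myoutput_$page" --user "myuser:mypasswd" "https://api.github.com/issues?filter=all&page=$page"
done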
