Run multiple curl commands in parallel - bash

I have the following shell script. The issue is that I want to run the requests in parallel/concurrently, without waiting for one request to finish before starting the next. For example, if I make 20 requests, I want them to be executed at the same time.
for ((request=1; request<=20; request++))
do
    for ((x=1; x<=20; x++))
    do
        time curl -X POST "http://localhost:5000/example"
    done
done
Any guide?

You can use xargs with the -P option to run any command in parallel:
seq 1 200 | xargs -n1 -P10 curl "http://localhost:5000/example"
This will run the curl command 200 times, with at most 10 jobs in parallel.
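If you only care about the total wall-clock time of the whole batch rather than per-request timings, you can time the pipeline as a unit. A sketch: -s -o /dev/null discards the response bodies, and -I{} keeps xargs from appending the sequence number to the curl command (more on that further down):
time (seq 1 200 | xargs -I{} -P10 curl -s -o /dev/null "http://localhost:5000/example")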

Using the xargs -P option, you can run any command in parallel:
xargs -I % -P 8 curl -X POST "http://localhost:5000/example" \
< <(printf '%s\n' {1..400})
This will run the curl command 400 times with at most 8 jobs in parallel.
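Since -I % already defines a placeholder, it can also be used to keep the responses apart, for example by numbering the output files. A sketch (out.%.txt is just an assumed naming scheme):
xargs -I % -P 8 curl -s -o out.%.txt -X POST "http://localhost:5000/example" \
    < <(printf '%s\n' {1..400})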

Update 2020:
curl (since 7.66.0) can now fetch several URLs in parallel:
curl --parallel --parallel-immediate --parallel-max 3 --config websites.txt
websites.txt file:
url = "website1.com"
url = "website2.com"
url = "website3.com"

This is an addition to @saeed's answer.
I faced an issue where it made unnecessary requests to the following hosts:
0.0.0.1, 0.0.0.2 .... 0.0.0.N
The reason was that xargs was passing the sequence numbers as extra arguments to the curl command, which curl treated as additional URLs. To prevent the arguments from being appended, we can specify which character stands in for the argument using the -I flag.
So we will use it like this:
... xargs -I '$' command ...
Now xargs will substitute the argument wherever the literal $ is found; if it is not found, the argument is not appended. Using this, the final command will be:
seq 1 200 | xargs -I $ -n1 -P10 curl "http://localhost:5000/example"
Note: If you are already using $ in your command, replace it with some other character that is not being used.

Adding to @saeed's answer, I created a generic function that uses its arguments to run a command a total of N times, with at most M jobs in parallel:
function conc(){
    cmd=("${@:3}")
    seq 1 "$1" | xargs -n1 -P"$2" "${cmd[@]}"
}
$ conc N M cmd
$ conc 10 2 curl --location --request GET 'http://google.com/'
This will fire 10 curl commands with a maximum parallelism of two.
Adding this function to your .bash_profile makes it easier to reuse.

Add “wait” at the end, and background them.
for ((request=1; request<=20; request++))
do
    for ((x=1; x<=20; x++))
    do
        time curl -X POST "http://localhost:5000/example" &
    done
done
wait
They will all output to the same stdout, but you can redirect the result of time (together with stdout and stderr) to a named file:
{ time curl -X POST "http://localhost:5000/example"; } > output.${x}.${request}.out 2>&1 &
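Putting it together, a minimal sketch of the full loop with per-request output files (here -s keeps curl's progress meter out of the files):
for ((request=1; request<=20; request++))
do
    for ((x=1; x<=20; x++))
    do
        { time curl -s -X POST "http://localhost:5000/example"; } \
            > output.${x}.${request}.out 2>&1 &
    done
done
wait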

I wanted to share an example of how I used xargs to run curl in parallel.
The advantage of using xargs is that you can specify how many processes will be used to parallelise curl, rather than backgrounding every curl with "&", which would launch all (say) 10000 curls simultaneously.
Hope it will be helpful to somebody:
#!/bin/sh
url=/any-url
currentDate=$(date +%Y-%m-%d)
payload='{"field1":"value1", "field2":{},"timestamp":"'$currentDate'"}'
threadCount=10
cat "$1" | \
    xargs -P "$threadCount" -I {} \
        curl -sw 'url = %{url_effective}, http_status_code = %{http_code}, time_total = %{time_total} seconds\n' \
             -H "Content-Type: application/json" -H "Accept: application/json" \
             -X POST "$url" --max-time 60 -d "$payload"
The .csv file (passed as $1) has one value per row, which is meant to be inserted into the JSON payload.
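Note that for the value from the .csv to actually reach the request, the {} placeholder has to appear somewhere in the curl arguments; xargs only substitutes where it sees the placeholder. A minimal sketch under that assumption (the field name is made up for illustration):
# run up to $threadCount curls at once; each line read from the file given as $1
# replaces {} below (values are substituted verbatim, so they must be JSON-safe)
threadCount=10
url=/any-url
xargs -P "$threadCount" -I {} \
    curl -s -H "Content-Type: application/json" -X POST "$url" --max-time 60 \
         -d '{"field1":"{}"}' \
    < "$1"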

Based on the solution provided by @isopropylcyanide and the comment by @Dario Seidl, I find this to be the best response as it handles both curl and httpie.
# conc N M cmd - fire (N) commands at a max parallelism of (M) each
function conc(){
    cmd=("${@:3}")
    seq 1 "$1" | xargs -I'$XARGI' -P"$2" "${cmd[@]}"
}
For example:
conc 10 3 curl -L -X POST https://httpbin.org/post -H 'Authorization: Basic dXNlcjpwYXNz' -H 'Content-Type: application/json' -d '{"url":"http://google.com/","foo":"bar"}'
conc 10 3 http --ignore-stdin -F -a user:pass httpbin.org/post url=http://google.com/ foo=bar

Related

BASH SCRIPT output doesn't export in file

I'm stuck with my self-coded application that calls the wscat command (WebSocket), and I'm trying to export its output. I also want to exit wscat once it has finished receiving input from the JSON backend API.
#!/bin/bash
while getopts a:c: flag
do
    case "${flag}" in
        a) accesskey=${OPTARG};;
        c) clientnodeid=${OPTARG};;
    esac
done
master="wscat -c ws://localhost:8091/ws/callback -H accessKey:$accesskey -H clientNodeId:$clientnodeid"
sleep 15
eval $master
final=$(eval echo "$master")
echo $final >>logfile.log
ps -ef | grep wscat | grep -v grep | awk '{print $2}' | xargs kill
#curl -X POST --data "$final" -k "https://localhost:7460/activate" -H "accept: application/json" -H "accessKey:$accesskey" -H "clientNodeId:$clientnodeid" -H "Content-Type: application/json" -H "callbackRequested:true"
exit
I want to take the output from wscat and send it via curl.
When I run the script manually it succeeds, but when I call it from another application (Java), it runs without generating the log.
In other words, I want to export $final to a text file, and then pass that text file to the --data option of the curl call.
Fixed based on @Barmar's comment:
You're overcomplicating this with all those variables. Just do
eval "$master" >> logfile.log

How to use parallel with curl?

How do I use GNU parallel to make this process faster?
#!/bin/bash
for (( c=1; c<=100; c++ ))
do
curl -sS 'https://example.com' \
--data 'value='$c'' /dev/null
echo $c
done
You can use parallel, or xargs
seq 100 | parallel curl -sS 'https://example.com' --data value='{}' /dev/null
seq 100 | xargs -I{} curl -sS 'https://example.com' --data value='{}' /dev/null
As the script stands, output will be sent to stdout. With xargs, this will result in output from different calls potentially being mixed together. Consider redirecting output to files for additional processing, if needed.
You can add options for maximum parallelism (-Pn, etc.) as needed.
I'm not sure why '/dev/null' is needed. Consider reordering:
curl -sS --data value='{}' 'https://example.com'
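If you do want to keep each response separate, one hedged option with GNU parallel is to give each job its own output file:
# 10 jobs at a time; each response lands in out.N (N = the sequence number)
seq 100 | parallel -j10 'curl -sS --data value={} https://example.com > out.{}'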

How can I loop over comma-separated lists *inside* each line of a file?

I need to write a status checker as a bash script.
I have a file with lines like this:
domain.com; 111.111.111.111,222.222.222.222; /link/to/somefile.js,/link/to/somefile2.js
domain2.com; 122.122.111.111,211.211.222.222; /link/to/somefile2.js,/link/to/somefile3.js
In total, I need to execute commands like these:
curl -s -I -H 'Host: domain.com' http://111.111.111.111/link/to/somefile.js
curl -s -I -H 'Host: domain.com' http://222.222.222.222/link/to/somefile.js
curl -s -I -H 'Host: domain.com' http://111.111.111.111/link/to/somefile2.js
curl -s -I -H 'Host: domain.com' http://222.222.222.222/link/to/somefile2.js
curl -s -I -H 'Host: domain2.com' http://122.122.111.111/link/to/somefile2.js
curl -s -I -H 'Host: domain2.com' http://211.211.222.222/link/to/somefile2.js
curl -s -I -H 'Host: domain2.com' http://122.122.111.111/link/to/somefile3.js
curl -s -I -H 'Host: domain2.com' http://211.211.222.222/link/to/somefile3.js
The question is:
What tool do I need to use to get this result?
Maybe xargs with some arguments/flags can do that, or GNU parallel?
Can you please show examples?
I can split the lines and assign the results to different variables; that isn't the problem at all:
domain=$(cut -d';' -f1 file| xargs -I N -d "," echo curl -H) \'N\'
ip=$(cut -d';' -f2 file| xargs -I N -d "," echo curl -H) \'N\'
and so on.
But the question is about something else :) :
After delimiting and splitting the strings into variables, how can I execute curl with those variables when the number of items per variable differs?
The answer Barmar gave doesn't cover the problem at all, because there are more than two lists. The problem is not ignorance of bash, but finding a way to solve this.
#!/usr/bin/env bash
#      ^^^^- IMPORTANT: not /bin/sh
# print command instead of running it, so people can test their answers without real URLs
log_command() { printf '%q ' "$@"; printf '\n'; }
while IFS='; ' read -r domain addrs_str files_str; do
    IFS=, read -r -a addrs <<<"$addrs_str"
    IFS=, read -r -a files <<<"$files_str"
    for file in "${files[@]}"; do
        for addr in "${addrs[@]}"; do
            log_command curl -s -I -H "Host: $domain" "http://$addr/$file"
        done
    done
done
...emits as output (the list of commands it would run if the log_command prefix were removed):
curl -s -I -H Host:\ domain.com http://111.111.111.111//link/to/somefile.js
curl -s -I -H Host:\ domain.com http://222.222.222.222//link/to/somefile.js
curl -s -I -H Host:\ domain.com http://111.111.111.111//link/to/somefile2.js
curl -s -I -H Host:\ domain.com http://222.222.222.222//link/to/somefile2.js
curl -s -I -H Host:\ domain2.com http://122.122.111.111//link/to/somefile2.js
curl -s -I -H Host:\ domain2.com http://211.211.222.222//link/to/somefile2.js
curl -s -I -H Host:\ domain2.com http://122.122.111.111//link/to/somefile3.js
curl -s -I -H Host:\ domain2.com http://211.211.222.222//link/to/somefile3.js
...as you can see at https://ideone.com/dTC8q8
Now how does this work?
Step 1: Read each line into domain, addrs_str and files_str, split on semicolons and spaces.
That's what's done by the line IFS='; ' read -r domain addrs_str files_str, which operates as described in BashFAQ #1, and in How to read variables from file, with multiple variables per line?
Step 2: For addrs_str and files_str, split them on commas into separate arrays. This is described in How do I split a string on a delimiter in Bash?
Step 3: Iterate over those arrays, and call curl for each combination. If you wanted to call the first IP with only the first file, and the second IP with the second file, you could use Iterate over two arrays simultaneously in bash; otherwise, it's a plain nested loop.
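If you also want the requests to actually run in parallel, a minimal variation is to drop the log_command prefix, background each curl, and wait at the end. A sketch, assuming the input lines live in input.file as in the GNU Parallel example below; note this puts no cap on how many requests run at once:
while IFS='; ' read -r domain addrs_str files_str; do
    IFS=, read -r -a addrs <<<"$addrs_str"
    IFS=, read -r -a files <<<"$files_str"
    for file in "${files[@]}"; do
        for addr in "${addrs[@]}"; do
            curl -s -I -H "Host: $domain" "http://$addr/$file" &
        done
    done
done < input.file
wait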
With GNU Parallel it would look like this
doit() {
    domain="$1"
    ips="$2"
    paths="$3"
    parallel --dry-run -d ',' -q curl -s -I -H Host:\ "$domain" http://{1}/{2} ::: "$ips" ::: "$paths"
}
export -f doit
parallel --colsep ';' doit :::: input.file
Remove --dry-run when you are convinced it works.

Start multiple background jobs and print results sequentially

I want to start a number of parallel jobs and I want the result outputs in sequential order. The jobs in my case are HTTP requests sent with curl and I'm interested in the response code only. Here is what I have so far:
for i in {1..6}
do
curl -H "Content-Type: application/json" -X POST \
-d 'some data' \
-s -o /dev/null -w "%{http_code}\n" \
<url of service> &
done
wait
This prints the result code of each request, but not in the correct order. Any way I can correct the order of the output?
It is necessary that the requests are actually sent in parallel.
Store the results to files, then print them out once everything is complete:
for i in {1..6}
do
curl -H "Content-Type: application/json" -X POST \
-d 'some data' \
-s -o /dev/null -w "%{http_code}\n" \
<url of service> > result_$i &
done
wait
for i in {1..6}
do
cat result_$i
rm result_$i
done
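If GNU parallel is available, its -k (keep order) flag gives an alternative sketch of the same idea: -N0 reads one input per job but appends nothing to the command, and <url of service> is the same placeholder as above:
seq 1 6 | parallel -k -j6 -N0 \
    curl -H 'Content-Type: application/json' -X POST \
         -d 'some data' -s -o /dev/null -w '%{http_code}\n' \
         '<url of service>'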

Bashscript with curl operation in parallel

I have a list of URLs which I would like to fetch with curl and then do some operations on the result with a bash script.
Since it is almost 100k requests, I would like to run them in parallel.
I have already looked into GNU parallel, but how do I glue it all together? Thanks!
The bashscript:
while read URL; do
    curl -L -H "Accept: application/unixref+xml" $URL > temp.xml;
    YEAR=$(xmllint --xpath '//year' temp.xml);
    MONTH=$(xmllint --xpath '(//date/month)[1]' temp.xml);
    echo "$URL;$YEAR;$MONTH" >> results.csv;
    sed -i '1d' urls.txt;
done < urls.txt;
You shouldn't be modifying the input list of URLs as you make each HTTP request. And having multiple appenders writing to the same output file from different processes will likely end in tears.
Put most of your commands in a separate script (named, say, geturl.sh) that could be invoked with the URL as a parameter, and writes its line of output to standard out:
#!/usr/bin/env bash
URL="${1}"
curl -L -H "Accept: application/unixref+xml" "${URL}" > /tmp/$$.xml
YEAR="$(xmllint --xpath '//year' /tmp/.xml)"
MONTH="$(xmllint --xpath '(//date/month)[1]' /tmp/$$.xml)"
rm -f /tmp/$$.xml
echo "${URL};${YEAR};${MONTH}"
Then invoke as follows (here we let parallel merge the outputs from the various threads line by line):
parallel --line-buffer geturl.sh < urls.txt > results.csv
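Depending on your setup, you may also want to cap the number of simultaneous jobs and make sure the helper script is executable, e.g. (a sketch; 16 is an arbitrary choice):
chmod +x geturl.sh
parallel -j 16 --line-buffer ./geturl.sh < urls.txt > results.csv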
