grepping the output of a curl command within bash script - bash

I am trying to write a script that, when I enter the name of a vulnerability, returns the CVSS3 scores from Tenable.
So far my plan is:
Curl the page
Grep the content i want
output the grepped CVSS3 score
When I run my script, however, grep throws the following error:
~/Documents/Tools/Scripts ❯ ./CVSS3-Grabber.sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 30964 0 30964 0 0 28355 0 --:--:-- 0:00:01 --:--:-- 28355
grep: unrecognized option '-->Nessus<!--'
Usage: grep [OPTION]... PATTERNS [FILE]...
Try 'grep --help' for more information.
This has me very confused, because when I run this on the command line, curling the content to sample.txt and then using the exact same grep syntax:
grep $pagetext -e CVSS:3.0/E:./RL:./RC:.
it returns the content I need. However, when I run it via my script below...
#! /bin/bash
pagetext=$(curl https://www.tenable.com/plugins/nessus/64784)
cvss3_temporal=$(grep $pagetext -e CVSS:3.0/E:./RL:./RC:.)
echo $cvss3_temporal
I receive the errors above!
I believe this is because the '--' makes grep treat the text from the page as an option, which grep doesn't recognize, hence the error. I have tried copying the output of curl to a text file and grepping that rather than the curl output directly, but still no joy. Does anyone know a way to get grep to ignore '--' or any flags when reading text? Alternatively, can I configure curl so that it only brings back text and no symbols?
Thanks in advance!

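You don't need to store the curl response in a variable; just pipe curl into grep like this:
cvss3_temporal=$(curl -s https://www.tenable.com/plugins/nessus/64784 |
grep -e 'CVSS:3.0/E:./RL:./RC:.')
Note the use of -s in curl to suppress the progress meter. The pattern is quoted and the page arrives on grep's standard input, so nothing from the page can be mistaken for an option. Avoid -F with this pattern: it makes grep search for a fixed string, so the dots would match only literal dots instead of any single character.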

Grep filters a given file, or standard input if no file was given. In bash, you can use the <<< here-string syntax to send the variable's content to grep's input:
grep -e 'CVSS:3.0/E:./RL:./RC:.' <<< "$pagetext"
Or, if you don't need the page anywhere else, you can pipe the output from curl directly to grep:
curl https://www.tenable.com/plugins/nessus/64784 | grep -e 'CVSS:3.0/E:./RL:./RC:.'
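To see why feeding the text on standard input avoids the error, here is a minimal sketch. The pagetext value is a made-up two-line stand-in for the curl response (not the real Tenable page), and -o is added only so that the matching part is printed on its own.

```shell
#!/bin/bash
# Made-up stand-in for: pagetext=$(curl -s https://www.tenable.com/plugins/nessus/64784)
pagetext='<!-- -->Nessus<!-- -->
<span>CVSS:3.0/E:U/RL:O/RC:C</span>'

# Unquoted, $pagetext would be split into words and passed as arguments,
# so a word such as "-->Nessus<!--" would look like an option to grep.
# Sending the text on standard input with a here-string avoids that.
cvss3_temporal=$(grep -o -e 'CVSS:3.0/E:./RL:./RC:.' <<< "$pagetext")
echo "$cvss3_temporal"
```

With the real page, the first line would be replaced by the curl call, or the here-string by a pipe from curl -s.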

Related

Unable to capture cURL output in a text file?

I am using this command to capture time taken by cURL command:
time curl -v -k http://10.164.128.232:8011/oam/server/HeartBeat >> abc.txt
This leaves abc.txt blank. I further tried this:
time curl -v -k http://10.164.128.232:8011/oam/server/HeartBeat 2>> bcde.txt
I was expecting this command to write the complete console output to my text file, but it didn't capture the time in bcde.txt.
I am unable to find a way to capture cURL's output alongside the time it takes.
Please assist me on this.
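In bash, time is a shell keyword, so the redirection applies only to the curl command being timed, while the timing report goes to the shell's own standard error. You can get past this with grouping. The first line below captures standard output; the second captures standard error, which includes the timing report and the -v trace: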
(time curl -v -k http://10.164.128.232:8011/oam/server/HeartBeat) >> abc.txt
(time curl -v -k http://10.164.128.232:8011/oam/server/HeartBeat) 2>> abc.txt
This worked for me!
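If both the response and the timing should end up in the same file, a brace group with 2>&1 is one way to do it. This is only a sketch assuming bash: echo stands in for the curl call, and the log file is a temporary file created for the example.

```shell
#!/bin/bash
# Temporary file used only for this example
log=$(mktemp)

# 'echo' stands in for: curl -v -k http://10.164.128.232:8011/oam/server/HeartBeat
# The group's stdout and stderr (including the report from 'time') are appended to the log.
{ time echo "response body"; } >> "$log" 2>&1

cat "$log"
```

Braces run the group in the current shell, whereas the parentheses used above start a subshell; for a simple capture like this either form works.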

How to get the highest numbered link from curl result?

I have created a small program consisting of a couple of shell scripts that work together. It is almost finished and everything seems to work fine, except for one thing that I need in order to finish the project and am not sure how to do.
There seem to be many routes that could be taken, but I just can't get there.
I have some curl results with lots of unused data, including different links, and among that data there is a bunch of similar links.
I only need to get (into a variable) the link with the highest number (without the always-same text).
The links are all similar and have this structure:
always same text
always same text
always same text
I was thinking about something like:
content="$(curl -s "$url/$param")"
linksArray= get from $content all links that are in the href section of the links
that contain "always same text"
declare highestnumber;
for file in $linksArray
do
href=${1##*/}
fullname=${href%.html}
OIFS="$IFS"
IFS='_'
read -a nameparts <<< "${fullname}"
IFS="$OIFS"
if ${nameparts[1]} > $highestnumber;
then
highestnumber=${nameparts[1]}
fi
done
echo ${nameparts[1]}_${highestnumber}.html
result:
https://always/same/link/unique-name_19.html
This was just my guess; any working code that can be run from a bash script is OK.
Thanks.
Update
I found this nice program; it is easily installed by:
# 64bit version
wget -O xidel/xidel_0.9-1_amd64.deb https://sourceforge.net/projects/videlibri/files/Xidel/Xidel%200.9/xidel_0.9-1_amd64.deb/download
apt-get -y install libopenssl
apt-get -y install libssl-dev
apt-get -y install libcrypto++9
dpkg -i xidel/xidel_0.9-1_amd64.deb
It looks awesome, but I'm not really sure how to tweak it to my needs.
Based on that link and the answer below, I guess a possible solution would be:
use xidel, or use "$ sed -n 's/.*href="\([^"]*\).*/\1/p' file" as suggested in this link, but then tweak it to get the link with HTML tags, like:
<a href="https://always/same/link/same-name_17.html">always same text</a>
then filter out everything that doesn't end with ( ">always same text</a> )
and then use grep and sort as mentioned below.
Continuing from the comment, you can use grep, sort and tail to isolate the highest number in your list of similar links without too much trouble. For example, if your list of links is as you have described (I've saved them in a file dat/links.txt for the purpose of the example), you can easily isolate the highest number in a variable:
Example List
$ cat dat/links.txt
always same text
always same text
always same text
Parsing the Highest Numbered Link
$ myvar=$(grep -o 'https:.*[.]html' dat/links.txt | sort | tail -n1); \
echo "myvar : '$myvar'"
myvar : 'https://always/same/link/same-name_19.html'
(note: the command above is all one line, separated by the line-continuation '\')
Applying Directly to Results of curl
Whether your list is in a file, or returned by curl -s, you can apply the same approach to isolate the highest number link in the returned list. You can use process substitution with the curl command alone, or you can pipe the results to grep. E.g. as noted in my original comment,
$ myvar=$(grep -o 'https:.*[.]html' < <(curl -s "$url/$param") | sort | tail -n1); \
echo "myvar : '$myvar'"
or pipe the result of curl to grep,
$ myvar=$(curl -s "$url/$param" | grep -o 'https:.*[.]html' | sort | tail -n1); \
echo "myvar : '$myvar'"
(same line continuation note.)
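One caveat with a plain sort: it compares text, so a link ending in _9.html would sort after one ending in _19.html. A sketch of a numeric-aware variant follows, assuming GNU sort (for -V, version sort) and using three made-up sample links in place of the curl result.

```shell
#!/bin/bash
# Made-up sample standing in for: curl -s "$url/$param"
links='<a href="https://always/same/link/same-name_9.html">always same text</a>
<a href="https://always/same/link/same-name_19.html">always same text</a>
<a href="https://always/same/link/same-name_10.html">always same text</a>'

# Extract each URL up to the closing quote, sort so that 9 < 10 < 19,
# and keep the last (highest) one.
highest=$(grep -o 'https:[^"]*[.]html' <<< "$links" | sort -V | tail -n1)
echo "$highest"
```

The pattern stops at the first double quote so that it cannot run past the end of the href value when several links share a line.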
Why not use Xidel with XQuery to sort the links and return the last?
xidel -q links.txt --xquery '(for $i in //@href order by $i return $i)[last()]' --input-format xml
The query is in single quotes so that bash does not expand $i. The input-format parameter makes sure you don't need any HTML tags at the start and end of your txt file.
If I'm not mistaken, in the latest Xidel the -q (quiet) param is replaced by -s (silent).

Piping curl output into grep

Just a little disclaimer, I am not very familiar with programming so please excuse me if I'm using any terms incorrectly/in a confusing way.
I want to be able to extract specific information from a webpage and tried doing this by piping the output of a curl function into grep. Oh and this is in cygwin if that matters.
When just typing in
$ curl www.ncbi.nlm.nih.gov/gene/823951
The terminal prints the whole webpage in what I believe to be HTML. From here I thought I could just pipe this output into grep with whatever search term I want:
$ curl www.ncbi.nlm.nih.gov/gene/823951 | grep "Gene Symbol"
But instead of printing the webpage at all, the terminal gives me:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 142k 0 142k 0 0 41857 0 --:--:-- 0:00:03 --:--:-- 42083
Can anyone explain why it does this/how I can search for specific lines of text in a webpage? I eventually want to compile information like gene names, types, and descriptions into a database, so I was hoping to export the results from the grep function into a text file after that.
Any help is extremely appreciated, thanks in advance!
Curl detects that it is not outputting to a terminal, and shows you the Progress Meter. You can suppress the progress meter with -s.
The HTML data is indeed being sent to grep. However that page does not contain the text "Gene Symbol". Grep is case-sensitive (unless invoked with -i) and you are looking for "Gene symbol".
$ curl -s www.ncbi.nlm.nih.gov/gene/823951 | grep "Gene symbol"
<dt class="noline"> Gene symbol </dt>
You probably also want the next line of HTML, which you can make grep output with the -A option:
$ curl -s www.ncbi.nlm.nih.gov/gene/823951 | grep -A1 "Gene symbol"
<dt class="noline"> Gene symbol </dt>
<dd class="noline">AT3G47960</dd>
See man curl and man grep for more information about these and other options.
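To get just the value for a text file or database, the two lines printed by grep -A1 can be reduced further with sed. This is a sketch only: the html variable holds the two lines shown above rather than the live page, and the sed expression assumes the value sits in a dd element on one line.

```shell
#!/bin/bash
# The two lines from the output above, standing in for:
# curl -s www.ncbi.nlm.nih.gov/gene/823951
html='<dt class="noline"> Gene symbol </dt>
<dd class="noline">AT3G47960</dd>'

# -i ignores case, -A1 also prints the line after each match;
# sed then keeps only the text inside the <dd> element.
symbol=$(grep -i -A1 "gene symbol" <<< "$html" |
  sed -n 's/.*<dd[^>]*>\([^<]*\)<\/dd>.*/\1/p')
echo "$symbol"
```

Appending the result to a file is then a matter of echo "$symbol" >> genes.txt, where genes.txt is a hypothetical file name.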

using curl to call data, and grep to scrub output

I am attempting to call an API for a series of ID's, and then leverage those ID's in a bash script using curl, to query a machine for some information, and then scrub the data for only a select few things before it outputs this.
#!/bin/bash
url="http://<myserver:myport>/ws/v1/history/mapreduce/jobs"
for a in $(cat jobs.txt); do
content="$(curl "$url/$a/counters" "| grep -oP '(FILE_BYTES_READ[^:]+:\d+)|FILE_BYTES_WRITTEN[^:]+:\d+|GC_TIME_MILLIS[^:]+:\d+|CPU_MILLISECONDS[^:]+:\d+|PHYSICAL_MEMORY_BYTES[^:]+:\d+|COMMITTED_HEAP_BYTES[^:]+:\d+'" )"
echo "$content" >> output.txt
done
This is for a MapR project I am currently working on to peel some fields out of the API.
In the example above, I only care about 6 fields, though the output that comes from the curl command gives me about 30 fields and their values, many of which are irrelevant.
If I use the curl command in a standard prompt, I get the fields I am looking for, but when I add it to the script I get nothing.
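Please move the quotes so that they close right after $url/$a/counters. With the pipe and the grep command inside quotes, they are passed to curl as a literal argument instead of forming a pipeline. Like the following:
content="$(curl "$url/$a/counters" | grep -oP '(FILE_BYTES_READ[^:]+:\d+)|FILE_BYTES_WRITTEN[^:]+:\d+|GC_TIME_MILLIS[^:]+:\d+|CPU_MILLISECONDS[^:]+:\d+|PHYSICAL_MEMORY_BYTES[^:]+:\d+|COMMITTED_HEAP_BYTES[^:]+:\d+')"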
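A small sketch of the pipeline working on sample data may help when testing without the server. The fetch_counters function and its JSON-like line are hypothetical stand-ins for the curl call and its response, only two of the six counters are listed to keep it short, and -E with [0-9] is used in place of -P with \d in case grep was built without PCRE support.

```shell
#!/bin/bash
# Hypothetical stand-in for: curl -s "$url/$a/counters"
fetch_counters() {
  printf '%s\n' '{"name":"FILE_BYTES_READ","totalCounterValue":2264},{"name":"HDFS_BYTES_READ","totalCounterValue":7},{"name":"GC_TIME_MILLIS","totalCounterValue":15}'
}

# Counter name, then everything up to the next colon, then the digits.
pattern='(FILE_BYTES_READ|GC_TIME_MILLIS)[^:]+:[0-9]+'

# The pipe sits outside any quotes, so grep really receives the output.
content=$(fetch_counters | grep -oE "$pattern")
echo "$content"
```

Each match is printed on its own line, so the unwanted counter in the sample is dropped.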

is it possible to prepend a string to output of cURL?

Looking at the man page for cURL:
-w, --write-out <format>
Make curl display information on stdout after a completed transfer.
This flag can be used to append a string to the output of cURL. However, I can only get it to append to the end of the output because, as the man page says, -w writes after a completed transfer.
So doing:
curl -sS "http:/somewebsite" -w "hello_world"
will produce:
$
contentfromcurl
hello_world
So how do you get the output to be
$
hello_worldcontentfromcurl
i.e. is it possible to get -w to prepend rather than append?
Thanks to @Adrian, this is the final answer:
curl -sS "http:/somewebsite" | xargs echo "mystring"
cheers!
If you're really desperate you can make a code block and include an echo. The following will have the output you're looking for:
{ echo -n "hello_world"; curl -sS "http:/somewebsite"; }
As for getting the -w option to prepend, the answer is no:
-w, --write-out
Make curl display information on stdout after a completed transfer. The format is a string ...
Is this what you are after?
$ printf "bar\nquux\n"
bar
quux
$ printf "bar\nquux\n" | sed 's#^#foo#g'
foobar
fooquux
Obviously, you would replace printf with your curl invocation.
But this seems a bit like an XY-problem - what are you trying to accomplish?
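For capturing the prefixed output in a variable, the grouping approach can be wrapped in a command substitution. In this sketch the fetch function is a hypothetical stand-in for the curl call, and printf is used instead of echo -n because its behavior is more consistent across shells.

```shell
#!/bin/bash
# Hypothetical stand-in for: curl -sS "$url"
fetch() { printf 'contentfromcurl\n'; }

# The group writes the prefix first, then the response, to the same stdout.
result=$( { printf '%s' "hello_world"; fetch; } )
echo "$result"
```

This prefixes the whole response once; to prefix every line instead, the sed approach shown above is the better fit.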

Resources