I want to parse data out of a log file that consists of JSON strings, and I wonder whether I can use a bash function to perform custom parsing instead of overloading the jq command.
Command:
tail errors.log --follow | jq --raw-output '. | [.server_name, .server_port, .request_file] | @tsv'
Outputs:
8.8.8.8 80 /var/www/domain.com/www/public
I want to trim the third column to drop the /var/www/domain.com prefix, where /var/www/domain.com is the document root and /var/www/domain.com/subdomain/public is the public HTML section of the site. The output should therefore be /subdomain/public (or, for the example above, /www/public).
I wonder if I can somehow inject a bash function to parse .request_file column? Or how would I do that using jq?
I'm having issues piping out the output of any part of this command that would allow me to do any sort of string manipulation.
Use a BashFAQ #1 while read loop to iterate over the lines, and a BashFAQ #100 parameter expansion to perform the desired modifications:
tail -f -- errors.log \
| jq --raw-output --unbuffered \
'[.server_name, .server_port, .request_file] | @tsv' \
| while IFS=$'\t' read -r server_name server_port request_file; do
printf '%s\t%s\t%s\n' "$server_name" "$server_port" "/${request_file#/var/www/*/}"
done
Note the use of --unbuffered, to force jq to flush its output lines immediately rather than buffering them. This has a performance penalty (so it's not default), but it ensures that you get output immediately when reading from a potentially-slow input source.
That said, it's also easy to remove a prefix in jq, so there's no particular reason to do the above:
tail -f -- errors.log | jq -r '
def withoutPrefix: sub("^([/][^/]+){3}"; "");
[.server_name, .server_port, (.request_file | withoutPrefix)] | @tsv'
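The parameter expansion from the while read loop above can be tried in isolation, without tail or jq. The following is a minimal sketch; the function names strip_docroot and rewrite_tsv are hypothetical helpers introduced here for illustration, and the sample line is the one from the question.

```shell
# Remove the shortest leading match of /var/www/<anything>/ and put back one
# leading slash. Caveat: a path that does not start with /var/www/<dir>/ is
# left unchanged apart from the extra slash that gets prepended.
strip_docroot() {
  printf '/%s\n' "${1#/var/www/*/}"
}

# Read tab-separated lines on stdin and rewrite only the third column.
rewrite_tsv() {
  while IFS=$'\t' read -r server_name server_port request_file; do
    printf '%s\t%s\t%s\n' "$server_name" "$server_port" \
      "$(strip_docroot "$request_file")"
  done
}

# Sample line standing in for the jq output.
printf '8.8.8.8\t80\t/var/www/domain.com/www/public\n' | rewrite_tsv
```

Because the pattern uses a single # (shortest match), only the first directory below /var/www is removed; a double ## would strip up to the last slash instead.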
The below code is producing the expected results with username.
es_eh="$(aws cloudtrail --region us-east-1 lookup-events --lookup-attributes AttributeKey=EventSource,AttributeValue=route53.amazonaws.com --max-items 50 --start-time "${start_date}" --end-time "${end_date}" --output json)"
for row in $(echo "${es_eh}" | jq -r '.Events[] | @base64'); do
echo "${row}" | base64 --decode | jq -r '.Username'
done
I don't understand the purpose of base64-encoding each event and then decoding the same string inside the loop to retrieve the username.
It stops working when I remove the base64 encode and decode steps:
for row in $(echo "${es_eh}" | jq -r '.Events[]'); do
echo "${row}"| jq -r '.Username'
done
Without the encoding, the output of the first jq spans more than one line per event. The for loop then iterates over fragments of that output and fails, as none of them is a valid JSON document. With the | @base64, each subobject is encoded to a single row, which base64 --decode inflates back to a full JSON object.
To see the rows, try outputting $row before processing it.
When you use $( ) without quotes around it, the result gets split into "words", but the shell's definition of a "word" is almost never what you want (and certainly has nothing to do with the json entries you want it split into). This sort of thing is why you should almost never use unquoted expansions.
Converting the output entries to base64 makes them wordlike enough that shell word splitting actually does the right thing. But note: some base64 encoders split their output into lines, which would make each line be treated as a separate "word". If jq's base64 encoding did this, this code would fail catastrophically on large events.
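The effect of word splitting can be seen without calling AWS or jq at all. The sketch below uses a hypothetical event, laid out the way jq pretty-prints objects by default, and a hypothetical helper count_words; it assumes a base64 command that accepts --decode, as in the question.

```shell
# Hypothetical event, pretty-printed across several lines.
event='{
  "Username": "alice smith"
}'

# Count the words the shell produces from an unquoted expansion.
count_words() {
  set -f        # disable globbing so that only word splitting is at play
  set -- $1     # deliberately unquoted
  set +f
  echo "$#"
}

# Encode to one whitespace-free token (tr removes any line wrapping).
encoded=$(printf '%s' "$event" | base64 | tr -d '\n')

count_words "$event"      # prints 5: the JSON is cut into fragments
count_words "$encoded"    # prints 1: the encoded form survives as one word
printf '%s' "$encoded" | base64 --decode   # prints the original event
```

Each of the five fragments from the raw event is what the for loop would hand to the inner jq, which is why that variant fails.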
Transforming the for loop into a while loop should fix the problem:
while read -r row; do
echo "${row}" | jq -r '.Username'
done < <(echo "${es_eh}" | jq -c -r '.Events[]')
Note that in the outer jq, I used the -c option to put each object on a single line.
When I run the query in jq play against the provided JSON, the output looks as expected. But when I run the same query in a shell script and iterate over the result, extra rows appear wherever a value contains whitespace between words.
Query:
query=$(cat $basename/test.json | jq -r '.DesignCode | to_entries[] | "\(.key):\(.value)"')
for i in $query
do
printf "$i"
done
The snippet above is what I use in the shell script.
What is the correct way to write the query?
I'm not sure about the output of your command but, in my experience, shell is a bit confusing when it comes to creating arrays from strings.
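A useful workaround is to read the output into an array one line at a time (for example with mapfile, available in bash 4 and later), so that whitespace inside an entry does not split it, and then to iterate over the quoted array expansion:
mapfile -t query < <(jq -r '.DesignCode | to_entries[] | "\(.key):\(.value)"' "$basename/test.json")
for i in "${query[@]}"
do
printf '%s\n' "$i"
done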
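To compare splitting on whitespace with splitting on lines, here is a small sketch that needs neither jq nor test.json. The function sample_entries and its two entries are hypothetical stand-ins for the jq output, and mapfile assumes bash 4 or later.

```shell
# Hypothetical stand-in for the jq output: one "key:value" entry per line.
sample_entries() {
  printf '%s\n' 'Title:Design code one' 'Owner:Jane Doe'
}

# Unquoted command substitution: split on every run of whitespace.
count_split_words() {
  local words=( $(sample_entries) )
  echo "${#words[@]}"
}

# mapfile -t: one array element per input line, trailing newline removed.
count_lines() {
  local lines
  mapfile -t lines < <(sample_entries)
  echo "${#lines[@]}"
}

count_split_words   # prints 5
count_lines         # prints 2
```

Iterating with "${lines[@]}" (quoted, with [@]) then visits each entry exactly once; a bare $lines would expand to the first element only.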
Currently I call
cat my_file.json | jq
to pretty-print JSON data. I would like to avoid the extra cat, and I am a bit surprised that I can't simply do:
jq my_file.json
Can I specify a file name?
You need to specify the jq program to run:
jq . my_file.json
jq -h
The usage line produced by jq -h:
Usage: jq [options] <jq filter> [file...]
Note that the summary produced by invoking jq with the -h option does not (currently) provide a complete listing of the options. For the supported options, see the jq manual: https://stedolan.github.io/jq/manual/
Two undocumented options of note are:
--debug-dump-disasm
--debug-trace
jq .
Under certain circumstances, jq . can be abbreviated to jq but it's always safe to use the full form; a good rule of thumb is: if in doubt, do so.
I'm hoping to walk through some Kinesis data using bash, with a command like:
aws kinesis get-records --shard-iterator <long shard info> | jq '[.|.Records[].Data]' | grep \"ey | sed -e 's/^[ \t]*\"//;s/[ \t]*\",$//'
I can get the base64 data from the stream. What I'm having trouble with is piping this through base64 so I can see the actual data.
If I send it through using a combination of head -n and tail, I can see individual values, but any attempt to pass more than 2-3 lines through fails. The output is typically one set of JSON values followed by garbage data, and the whole thing is typically preceded by
Invalid character in input stream.
To see the json values I use <long bash command from above> | xargs base64 -D
-- Caveat: Using bash on OSX
This works (assuming you've copied the base64 data to a file):
while IFS= read -r line; do printf '%s\n' "$line" | base64 -D && printf '\n'; done < <infile>
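The same loop can be wrapped in a function that reads from stdin, so it can sit at the end of a pipeline. This is a sketch under assumptions: decode_lines is a hypothetical name, the two sample records are made up, and the decoder is spelled base64 --decode (GNU coreutils); on the macOS version used above, substitute -D.

```shell
# Decode base64 input one line at a time, printing each decoded record
# on its own line. Blank lines are skipped.
decode_lines() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    printf '%s' "$line" | base64 --decode
    printf '\n'
  done
}

# Hypothetical sample records, one base64 string per line.
printf '%s\n' 'eyJpZCI6MX0=' 'eyJpZCI6Mn0=' | decode_lines
# prints:
# {"id":1}
# {"id":2}
```

Decoding each line in its own base64 invocation keeps the padding of one record from being mixed into the next.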
I have developed Kines, a friendly CLI for Amazon Kinesis Data Streams. It can be useful for debugging.
You can install it using pip.
pip install kines
Then you can run kines walk command on stream and shard to view decoded data.
kines walk <stream-name> <shard-id>
I'm using the jq statement below with grep in my code to check for a value.
jq '.Subnets[0].Tags' subnet.txt | grep -q "${add}usea1 internal us-east"
This works fine for some values. However, a few values need the grep pattern to be "${add}use* internal us-east". Can I use an asterisk so that all my values match?
I get an error when I include the asterisk. Any suggestions?
You have not followed the MCVE guidelines, but as @shellter pointed out, the problem description suggests you just have to use the proper (grep) regex:
grep -q "${add}use.* internal us-east"
However, since you are using jq in any case, it would probably be better to perform the filtering by extending the jq filter, for example as follows:
jq --arg add "$add" '
.Subnets[0].Tags
| select(test("\($add)use.* internal us-east"))
' subnet.txt
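For the grep variant, the difference between the glob-style use* and the regex use.* can be checked on a single string. In a grep regular expression, e* means "zero or more e", so use* does not stand for "use followed by anything". The helper matches, the prefix value, and the sample tag text below are hypothetical.

```shell
# Hypothetical prefix and tag text.
add='prod-'
tag='prod-usea1 internal us-east'

# Succeeds when the text in $2 matches the grep regex in $1.
matches() {
  printf '%s\n' "$2" | grep -q "$1"
}

matches "${add}use.* internal us-east" "$tag" && echo "regex use.* matches"
matches "${add}use* internal us-east" "$tag" || echo "glob-style use* does not match"
```

Both messages are printed: the first pattern lets .* absorb the "a1", while the second requires " internal" to follow directly after "us" plus any number of "e" characters.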