Read YAML metadata from a Pandoc markdown file - yaml

Is it possible to extract Pandoc's metadata (title, date, et al.) from a markdown file without a Haskell filter, or parsing the --to=json output?
The JSON output is particularly inconvenient for this, since a two-word title looks like:
$ pandoc -t json posts/test.md | jq '.meta | .title'
{
  "t": "MetaInlines",
  "c": [
    {
      "t": "Str",
      "c": "Test"
    },
    {
      "t": "Space"
    },
    {
      "t": "Str",
      "c": "post"
    }
  ]
}
So even after jq has read the title, we still need to reconstruct the words, and any emphasis, code, or other markup will only make it more complicated.

We can use the template variable $meta-json$ for this.
Put the variable in a file (with an extension, to stop Pandoc from looking in its own template directories) and then use it with pandoc --template=file.ext.
Pandoc's output is a JSON object with keys "title", "date", "tags", etc. and their respective values from the markdown document, which we can easily parse, filter, and manipulate with jq.
$ echo '$meta-json$' > /tmp/metadata.pandoc-tpl
$ pandoc --template=/tmp/metadata.pandoc-tpl | jq '.title,.tags'
"The Title"
[
  "a tag",
  "another tag"
]
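Applied to the posts/test.md from the question, jq's -r flag should then print the two-word title reconstructed as a single plain string:
$ pandoc --template=/tmp/metadata.pandoc-tpl posts/test.md | jq -r '.title'
Test post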

Related

How to extract data from a JSON file into a variable

I have the following JSON format; basically it is a huge file with several such entries.
[
  {
    "id": "kslhe6em",
    "version": "R7.8.0.00_BNK",
    "hostname": "abacus-ap-hf-test-001:8080",
    "status": "RUNNING",
  },
  {
    "id": "2bkaiupm",
    "version": "R7.8.0.00_BNK",
    "hostname": "abacus-ap-hotfix-001:8080",
    "status": "RUNNING",
  },
  {
    "id": "rz5savbi",
    "version": "R7.8.0.00_BNK",
    "hostname": "abacus-ap-hf-test-005:8080",
    "status": "RUNNING",
  },
]
I want to fetch all the hostname values that start with "abacus-ap-hf-test", without the ":8080", into a variable and then use those values for further commands in a for loop, something like below. But I am a bit confused about how to extract such information.
HOSTNAMES="abacus-ap-hf-test-001 abacus-ap-hf-test-005"
for HOSTNAME in $HOSTNAMES
do
sh ./trigger.sh
done
Update the first line to this:
HOSTNAMES=$(grep -oP 'hostname": "\K(abacus-ap-hf-test[\w\d-]+)' json.file)
or, if you are sure that the hostname ends with :8080", try this:
HOSTNAMES=$(grep -oP '(?<="hostname": ")abacus-ap-hf-test[\w\d-]+(?=:8080")' json.file)
Here abacus-ap-hf-test[\w\d-]+ is the part of the regex that matches the hostname itself; the surrounding pieces anchor it to the "hostname" key so that only the intended values are matched.
Assuming you have valid JSON, you can get the hostname values using jq:
while read -r hname ; do printf "%s\n" "$hname" ; done < <(jq -r '.[].hostname' j.json)
Output:
abacus-ap-hf-test-001:8080
abacus-ap-hotfix-001:8080
abacus-ap-hf-test-005:8080
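If the sample JSON is made valid (the trailing commas removed), the prefix filtering and the :8080 stripping can also be done entirely in jq; a sketch, reusing the j.json name from above:
HOSTNAMES=$(jq -r '.[].hostname | select(startswith("abacus-ap-hf-test")) | split(":")[0]' j.json)
This leaves abacus-ap-hf-test-001 and abacus-ap-hf-test-005 in $HOSTNAMES, ready for the for loop from the question.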

Replace string with Bash variable in jq command

I realize this is a simple question but I haven't been able to find the answer. Thank you to anyone who may be able to help me understand what I am doing wrong.
Goal: Search and replace a string in a specific key in a JSON file with a string in a Bash variable using jq.
For example, in the following JSON file:
"name": "Welcome - https://docs.mysite.com/",
would become
"name": "Welcome",
Input (file.json)
[
  {
    "url": "https://docs.mysite.com",
    "name": "Welcome - https://docs.mysite.com/",
    "Ocurrences": "679"
  },
  {
    "url": "https://docs.mysite.com",
    "name": "Welcome",
    "Ocurrences": "382"
  }
]
Failing script (using variable)
siteUrl="docs.mysite.com"
jq --arg siteUrl "$siteUrl" '.[].name|= (gsub(" - https://$siteUrl/"; ""))' file.json > file1.json
Desired output (file1.json)
[
  {
    "url": "https://docs.mysite.com",
    "name": "Welcome",
    "Ocurrences": "679"
  },
  {
    "url": "https://docs.mysite.com",
    "name": "Welcome",
    "Ocurrences": "382"
  }
]
I've tried various iterations on removing quotes, changing between ' and ", and adding and removing backslashes.
Successful script (not using variable)
jq '.[].name|= (gsub(" - https://docs.mysite.com/"; ""))' file.json > file1.json
More specifically, if it matters, I am processing an export of a website's usage data from Azure App Insights. Unfortunately, the same page may be assigned different names. I sum the Ocurrences of the two objects with the newly identical url later. If it is better to fix this in App Insights I am grateful for that insight, although I know Bash better than Kusto queries. I am grateful for any help or direction.
Almost. Variables are not automatically expanded within a string. You must interpolate them explicitly with \(…):
jq --arg siteUrl 'docs.mysite.com' '.[].name |= (gsub(" - https://\($siteUrl)/"; ""))' file.json
Alternatively, detect a suffix match and extract the prefix by slicing:
jq --arg siteUrl 'docs.mysite.com' '" - https://\($siteUrl)/" as $suffix | (.[].name | select(endswith($suffix))) |= .[:$suffix|-length]' file.json

jq does not show null output

I have the following code in the command line script:
output_json=$(jq -n \
    --argjson ID "${id}" \
    --arg Title "${title}" \
    --argjson like "\"${like}\"" \
    '$ARGS.named')
I pass the id, title and like variables into jq. I get the following output:
[
  {
    "ID": 6,
    "Title": "ABC",
    "like": ""
  },
  {
    "ID": 22,
    "Title": "ABC",
    "like": "Yes"
  }
]
But, I am trying to get the output in the following format, i.e. with null:
[
  {
    "ID": 6,
    "Title": "ABC",
    "like": null
  },
  {
    "ID": 22,
    "Title": "ABC",
    "like": "Yes"
  }
]
I don't quite get it: is it possible to do this in general, or is it a problem with my jq command?
And as far as I understand, "like": "" is not the same as "like": null. I am also a little confused now and do not really understand which of the two is the correct choice to use.
With --argjson you need to provide a valid JSON-encoded argument, so if you want to receive null the value needs to be literally null. Your solution, however, adds quotes around it, so it can never evaluate to null. (Also, it will only be a valid JSON string if it follows the JSON encoding for special characters such as the quote character itself.)
If you want a JSON string in the regular case and null in the case where it is empty, import the content of ${like} as a string using --arg and without the extra quotes (just as you do with ${title}), then use some jq logic to turn the empty string into null. An if statement would do, for example:
like=
jq -n --arg like "${like}" '{like: (if $like == "" then null else $like end)}'
{
  "like": null
}
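Folded back into the original command, that could look something like this (a sketch, assuming the same id, title and like variables; only the handling of like changes):
output_json=$(jq -n \
    --argjson ID "${id}" \
    --arg Title "${title}" \
    --arg like "${like}" \
    '$ARGS.named | .like |= (if . == "" then null else . end)')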

JQ query on JSON file

I have the JSON below in a file.
{
  "comment": {
    "vm-updates": [],
    "site-ops-updates": [
      {
        "comment": {
          "message": "You can start maintenance on this resource"
        },
        "hw-name": "Machine has got missing disks. "
      }
    ]
  },
  "object_name": "4QXH862",
  "has_problems": "yes",
  "tags": ""
}
I want to extract "hw-name" from this JSON file using jq. I've tried the combinations below, but nothing worked.
cat jsonfile | jq -r '.comment[].hw-name'
cat json_file.json | jq -r '.comment[].site-ops-updates[].hw-name'
Any help from Stack Overflow is appreciated!
It should be:
▶ cat jsonfile | jq -r '.comment."site-ops-updates"[]."hw-name"'
Machine has got missing disks.
Or better still:
▶ jq -r '.comment."site-ops-updates"[]."hw-name"' jsonfile
Machine has got missing disks.
From the docs:
If the key contains special characters, you need to surround it with double quotes like this: ."foo$", or else .["foo$"].
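For completeness, the bracket form from that passage works here as well:
▶ jq -r '.comment["site-ops-updates"][]["hw-name"]' jsonfile
Machine has got missing disks.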

json processing in .sh files

[
  {
    "id": "1636ea48-28b7-783a-48dd-5e041f10d9e6",
    "name": "Test_Component1",
    "desiredVersions": [],
    "children": false
  },
  {
    "id": "1636f939-136f-4609-ab93-238b1af193fe",
    "name": "Test_Component2",
    "desiredVersions": [],
    "children": false
  }
]
I am writing a command in the Execute Shell window in Jenkins. I have this JSON in a variable. I want to extract both id values so that further processing can be done in the next set of commands.
Using jq:
$ echo "$var" | jq '.[].id'
"1636ea48-28b7-783a-48dd-5e041f10d9e6"
"1636f939-136f-4609-ab93-238b1af193fe"
Is it a string? If so, you can use a regular expression to extract the id values, like:
(\"id\"\:\W)\"(.+)(\"\,)
