Is it possible for Pandoc to take text as an input argument rather than an input file? - pandoc

I can't seem to figure out if this is possible. I'm still learning the tool - I've figured out how to run it on an input file and generate output, but could it, for example, take text as the input and generate an output file?
For example, instead of
pandoc somefile.md -f markdown -t docx -o output.docx
could I do
pandoc "# hello there! \n\nI went to the market" -f markdown -t docx -o output.docx
Am I missing some option in the docs?

You can pass text input to pandoc by piping it in. Note that bash's builtin echo prints \n literally, so use printf (or echo -e) to produce the blank line Markdown needs between blocks:
printf '# hello there!\n\nI went to the market\n' | pandoc -f markdown -t docx -o foo.docx
This works with other text markups too:
printf '* hello there!\n\nI went to the market\n' | pandoc -f org -t docx -o foo.docx
If you are targeting another text-based markup, you can even have it print the converted text to standard output:
printf '# hello there!\n\nI went to the market\n' | pandoc -f markdown -t org
* hello there!
:PROPERTIES:
:CUSTOM_ID: hello-there
:END:
I went to the market

How to get the right filename and extension for an image downloaded with curl

My links look like the one below:
https://cdn.sspai.com/2022/06/22/article/a88df95f3401d5b6c9d716bf31eeef33?imageView2/2/w/1120/q/90/interlace/1/ignore-error/1
If I open this link in Chrome and press Cmd+S, I get the right filename and the right extension, png.
But if I use the bash command below, the saved file has no extension (note the URL has to be quoted so the shell doesn't interpret the ?):
curl -J -O "https://cdn.sspai.com/2022/06/22/article/a88df95f3401d5b6c9d716bf31eeef33?imageView2/2/w/1120/q/90/interlace/1/ignore-error/1"
I just want to download the image with the right filename and extension:
a88df95f3401d5b6c9d716bf31eeef33.png
The same problem occurs with other image links, e.g.:
https://cdn.sspai.com/article/fa848601-4cdf-38b0-b020-7afd6efc4a7e.jpg?imageMogr2/auto-orient/quality/95/thumbnail/!800x400r/gravity/Center/crop/800x400/interlace/1
You can derive the name from the URL itself:
url="YOUR-URL"
file="$(basename "${url%%\?*}")"   # strip the query string, keep the last path segment
case "$file" in
  *.*) ;;                          # already has an extension
  *)   file="${file}.png" ;;       # no extension in the URL path
esac
curl -o "${file}.tmp" "${url}"
mv "${file}.tmp" "${file}"
Hope it helps.
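If the URL path carries no extension at all (as in the first link), another approach is to ask the server for the file's MIME type and map that to an extension. This is only a sketch: the ctype value is hard-coded to what the server would presumably report for this URL, and the curl -sI line that would fetch it for real is left as a comment.

```shell
name_from_url() {            # strip the query string, keep the last path segment
  basename "${1%%\?*}"
}
ext_for_ctype() {            # map a MIME type to a file extension
  case "$1" in
    image/png*)  echo png ;;
    image/jpeg*) echo jpg ;;
    image/gif*)  echo gif ;;
    *)           echo bin ;;
  esac
}

url='https://cdn.sspai.com/2022/06/22/article/a88df95f3401d5b6c9d716bf31eeef33?imageView2/2/w/1120/q/90/interlace/1/ignore-error/1'
# For real use, read the type from the response headers:
# ctype=$(curl -sI "$url" | tr -d '\r' | awk -F': ' 'tolower($1)=="content-type"{print $2; exit}')
ctype='image/png'            # assumed here for the sketch
echo "$(name_from_url "$url").$(ext_for_ctype "$ctype")"
# → a88df95f3401d5b6c9d716bf31eeef33.png
```

Then curl -o "$file" "$url" downloads to that name. This only works when the server reports an accurate Content-Type.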

Why does curl -o output contain sequences like "^[[38;5;250m", when "surf" output looks fine?

I want to save wttr.in output to a file with curl. The problem is that the output isn't what I see when I just surf to wttr.in.
What I did:
curl wttr.in -o ~/wt.tex and curl wttr.in -o ~/wt
The output is like: <output>
It should look like https://wttr.in does in the browser.
I solved it myself:
less -r -f -L wt.tex
-r displays the raw control characters (here, the ANSI color escapes)
-f forces less to open the file without asking.
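Those ^[[38;5;250m sequences are ANSI color escapes that wttr.in emits so terminals can render colors; less -r interprets them. If you want a plain-text file instead, you can strip the escapes, for example with this sed idiom (a sketch covering the CSI color codes; other escape types would need a wider pattern):

```shell
strip_ansi() {               # remove ANSI color escapes like ESC[38;5;250m
  sed $'s/\x1b\\[[0-9;]*m//g'
}
printf '\033[38;5;250mLight rain\033[0m, +12C\n' | strip_ansi
# → Light rain, +12C
```

wttr.in also appears to accept a T option (curl 'wttr.in?T') that turns the color sequences off at the source; see its /:help page.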

Pass stdin to plistbuddy

I have a script to show the content of the Info.plist of .ipa files:
myTmpDir=`mktemp -d 2>/dev/null || mktemp -d -t 'myTmpDir'`
unzip -q "$1" -d "${myTmpDir}";
pathToFile=${myTmpDir}/Payload/*.app/Info.plist
/usr/libexec/PlistBuddy -c "Print" ${pathToFile}
With large files this can take a while, since the whole archive is extracted to a temp folder just to read one small Info.plist (XML) file.
I wondered if I can just extract Info.plist file and pass that to plistbuddy? I've tried:
/usr/libexec/PlistBuddy -c "Print" /dev/stdin <<< \
$(unzip -qp test.ipa Payload/*.app/Info.plist)
but this yields
Unexpected character b at line 1
Error Reading File: /dev/stdin
The extraction is working fine. When running unzip -qp test.ipa Payload/*.app/Info.plist I get the output of the Info.plist file to the terminal:
$ unzip -qp test.ipa Payload/*.app/Info.plist
bplist00?&
!"#$%&'()*+5:;*<=>?ABCDECFGHIJKXYjmwxyIN}~N_BuildMachineOSBuild_CFBundleDevelopm...
How can I pass the content of the Info.plist to plistbuddy?
Usually commands support "-" as a synonym for stdin, but this PlistBuddy tool doesn't. (The "b" it complains about is the start of the binary-plist magic bplist00; the binary data apparently doesn't survive the command-substitution step intact, and PlistBuddy fails on what's left.)
But you can still extract just one file from your ipa, save it as a temporary file, and then run PlistBuddy on that file:
tempPlist="$(mktemp)"
unzip -qp test.ipa "Payload/*.app/Info.plist" > "$tempPlist"
/usr/libexec/PlistBuddy -c Print "$tempPlist"
rm "$tempPlist"
I ended up using plutil, as chepner suggested:
unzip -qp test.ipa Payload/*.app/Info.plist | plutil -convert xml1 -r -o - -- -

wget to parse a webpage in shell

I am trying to extract URLs from a webpage using wget. I tried this:
wget -r -l2 --reject=gif -O out.html www.google.com | sed -n 's/.*href="\([^"]*\).*/\1/p'
It displays FINISHED
Downloaded: 18,472 bytes in 1 files
but does not display the links. If I run the two steps separately:
wget -r -l2 --reject=gif -O out.html www.google.com
sed -n 's/.*href="\([^"]*\).*/\1/p' < out.html
Output
http://www.google.com/intl/en/options/
/intl/en/policies/terms/
It is not displaying all the links, such as:
http://www.google.com
http://maps.google.com
https://play.google.com
http://www.youtube.com
http://news.google.com
https://mail.google.com
https://drive.google.com
http://www.google.com
http://www.google.com
http://www.google.com
https://www.google.com
https://plus.google.com
Moreover, I want to get links from the 2nd level and deeper. Can anyone suggest a solution?
Thanks in advance.
The -O file option captures the output of wget and writes it to the specified file, so there is no output going through the pipe to sed.
You can say -O - to direct wget output to standard output.
If some links use single quotes, you can try a pattern that accepts either quote style:
sed -n "/href/ s/.*href=['\"]\([^'\"]*\)['\"].*/\1/gp"
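For instance, a small pipeline that prints every double-quoted href value on its own line (a sketch; it assumes double-quoted attributes and operates on HTML read from stdin):

```shell
extract_hrefs() {            # read HTML on stdin, print each href="..." value
  grep -oE 'href="[^"]*"' | sed 's/^href="//; s/"$//'
}
# usage: wget -qO- www.google.com | extract_hrefs
printf '<a href="http://maps.google.com">Maps</a> <a href="/intl/en/policies/">Policies</a>\n' | extract_hrefs
# → http://maps.google.com
# → /intl/en/policies/
```

For the deeper levels, it is more predictable to fetch each level with its own wget run and filter that level's HTML, rather than combining recursive download (-r -l2) and link printing in one command.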

get veehd url in bash/python?

Can anybody figure out how to get the .avi URL of a veehd[dot]com video, given the video's page, in a script? It can be bash, Python, or common programs in Ubuntu.
They make you install an extension, and I've tried looking at the page source but I can't figure it out.
This worked for me:
#!/bin/bash
URL=$1  # page with the video
FRAME=$(wget -q -O - "$URL" | sed -n -e '/playeriframe.*do=d/{s/.*src : "//;s/".*//p;q}')
STREAM=$(wget -q -O - "http://veehd.com$FRAME" | sed -n -e '/<a/{s/.*href="//;s/".*//p;q}')
echo "$STREAM"
