youtube_dl: Download files synchronously via terminal

I'm using the Linux terminal to download files with youtube-dl:
youtube-dl <link>
However, the command appears to run asynchronously.
Is there a simple way to make the above command block until the download is complete? (The alternative would be to poll for the downloaded file's creation.)

My crystal ball tells me that your URL contains ampersands, like https://www.youtube.com/watch?v=BaW_jenozKc&t=1s&end=9. In a shell, the ampersand makes the program run in the background (asynchronously).
Escape ampersands in URLs, by putting the whole URL in quotes:
youtube-dl 'https://www.youtube.com/watch?v=BaW_jenozKc&t=1s&end=9'
On Windows cmd, use double quotes instead:
youtube-dl "https://www.youtube.com/watch?v=BaW_jenozKc&t=1s&end=9"
Alternatively, escape all problematic characters. Consult your shell's manual for which characters have special meanings and how to escape them. Often a backslash will work:
youtube-dl https://www.youtube.com/watch?v=BaW_jenozKc\&t=1s\&end=9
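You can see the splitting without youtube-dl at all; a sketch with echo shows what the shell does with an unquoted ampersand:

```shell
# Unquoted: '&' terminates the command and runs it in the background,
# so the program only ever sees the part before the first '&'.
echo https://www.youtube.com/watch?v=BaW_jenozKc &
wait

# Quoted: the whole URL is a single argument and the command runs
# in the foreground, blocking until it finishes.
echo 'https://www.youtube.com/watch?v=BaW_jenozKc&t=1s&end=9'
```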

Related

Concatenate multiple Markdown files using Pandoc on Windows

I have several Markdown (.md) files in a folder and I want to concatenate them and get a final Markdown file using Pandoc. I wrote a bash file like this:
#!/bin/bash
pandoc *.md > final.md
But I am getting the following error when I double-click on it:
pandoc: *.md: openBinaryFile: invalid argument (Invalid argument)
and the final.md file is empty.
If I try this:
pandoc file1.md file2.md .... final.md
I am getting the results I expect: a final.md file with the contents of all the other Markdown files.
On macOS it works fine. Why doesn't this work on Windows?
On Unix-like shells (like bash, for which your script is written) glob expansion (e.g. turning *.md into file1.md file2.md file3.md) is performed by the shell, not the application you're running. Your application sees the final list of files, not the wildcard.
However, glob expansion in cmd.exe is performed by the application:
The Windows command interpreter cmd.exe relies on a runtime function in applications to perform globbing.
As a result, Pandoc is being passed a literal *.md when it expects to see a list of files like file1.md file2.md file3.md. It doesn't know how to expand the glob itself and tries to open a file whose name is *.md.
You should be able to run your bash script in a Unix-like shell such as Cygwin or Bash on Windows. It may also work in PowerShell, though I don't have a machine handy to test. As a last resort, you could jump through some hoops to write a batch file that expands the glob and passes the file names to Pandoc.
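To see who expands the glob, here is a small demonstration in a Unix shell; the quoted form reproduces what Pandoc receives from cmd.exe:

```shell
# Set up a scratch directory with two Markdown files.
cd "$(mktemp -d)"
touch file1.md file2.md

# Unquoted: the shell expands the glob before pandoc would run.
echo pandoc *.md     # prints: pandoc file1.md file2.md

# Quoted: no expansion; this is the literal argument cmd.exe passes.
echo pandoc '*.md'   # prints: pandoc *.md
```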

Is it possible to wrap a Doxygen filter command in quotes?

I am trying to write a Doxygen file filter that lives in a subdirectory and works from the Doxyfile on both Windows and Linux.
Doxyfile
/scripts
myfilter
I seem to be unable to specify the path using a forward-slash on Windows unless it is quoted:
"scripts/myfilter"
However, trying to quote the command in Doxyfile does not work.
FILTER_PATTERNS = *.glsl=""scripts/runpython" scripts/doxygen-glslfilter.py"
On Windows, you get an error that implies the quotes don't exist.
'scripts' is not recognized as an internal or external command, operable program or batch file.
Doxygen uses popen() to run these commands and will remove the wrapping quotes around the command, but it does not seem to remove all quotes.
popen() call:
https://github.com/doxygen/doxygen/blob/master/src/definition.cpp#L745
filter name quote strip:
https://github.com/doxygen/doxygen/blob/master/src/util.cpp#L2458
However, the result is the same as if there were no quotes.
Update
I was able to enable command logging in Doxygen, and it appears the extra quotes are being stripped in an odd way; note the space in front of the command.
Executing popen(` scripts/runpython scripts/doxygen-glslfilter.py "C:/dev/g3d/G3D10/data-files/shader/AlphaFilter.glsl"`)
Update
I submitted a bug report/feature request but I doubt it will be read.
Doxygen Bug Report
The issue was reported to the doxygen project, and they have provided a solution where any '/' in the command is replaced by '\' on Windows.
https://bugzilla.gnome.org/show_bug.cgi?id=792846
This was done to resolve a similar issue here:
What is the QHG_LOCATION path relative to for doxygen?
The pull request for the project on github here: https://github.com/doxygen/doxygen/pull/703
When you nest double quotes inside double quotes in a single string, the parser treats the first double quote as the start of a string and the next double quote as its end.
So in your example:
""scripts/runpython" scripts/doxygen-glslfilter.py"
the first two quotes are read as an empty quoted string, then scripts/runpython is treated as the next command, and so on.
I do not have the same tool to test with, but one of these two examples will probably sort out your issue.
This example wraps each set in double quotes, and the entire set in single quotes.
FILTER_PATTERNS = *.glsl='"scripts/runpython" "scripts/doxygen-glslfilter.py"'
And this example wraps only the first element in double quotes, with the entire set in single quotes.
FILTER_PATTERNS = *.glsl='"scripts/runpython" scripts/doxygen-glslfilter.py'
NOTE: I am unable to test this, as I do not have the same environment as you, so I am not sure whether the second option will work; it might also need scripts/doxygen-glslfilter.py in double quotes. I am adding it to the answer regardless.

How can you automatically escape special characters in a string that is pasted into terminal?

I have the CLI utility youtube-dl installed. It takes URLs as arguments, and it is most natural to paste them from the system clipboard. In zsh, however, this returns the error "no matches found" because the special characters in YouTube URLs are not escaped.
I need to go from this:
https://www.youtube.com/watch?v=ShxHGFs2IKE
to this:
https\:\/\/www\.youtube\.com\/watch\?v=ShxHGFs2IKE
It is quite a pain to escape all the characters manually every time, so my question is: how can I make this work without editing URLs by hand each time?
As said in the comments, try using quotes:
youtube-dl 'https://www.youtube.com/watch?v=ShxHGFs2IKE'
Or you can load zsh's url-quote-magic widget, which automatically quotes special shell characters in URLs as you type or paste them:
autoload -Uz url-quote-magic
zle -N self-insert url-quote-magic

wildcard in scripts

It is fine to run
evince ./result/demo_1000000_10000*.ps
in a shell window. But when I put the command into a script file and run it, it cannot find the files matching ./result/demo_1000000_10000*.ps (here * is meant as a wildcard). This is the line from the script:
evince ./result/demo_1000000_10000"*.ps"
So are there any changes that should be made when putting commands into a script?
It should work the same way in a script or on the command line. The quotation marks prevent the wildcard from being expanded. Just remove them from the script. (Why did you add them in the first place?)
If the command runs from the prompt as shown, then it will also run from a shell script, with exactly the same notation, provided the invoking process has the same current directory. There is no reason to add quotes in the scripted version if you want it to behave like the unscripted one; if you ran the quoted version at the command line, it would fail just as it does in the script.
However, in a script, you do have to worry about whether the Postscript files you plan to work on are in the correct location. Sometimes, the script uses an absolute pathname, sometimes the script uses cd to change directory to the correct place, sometimes there's an argument or environment variable that locates the files.
So, if used carefully, you don't have to change anything for the script to work - but there are many ways you can prevent the script from working. One of those is by adding quotes around wildcard characters.
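A minimal demonstration of both behaviors, using scratch files in place of the asker's data:

```shell
#!/bin/sh
# Globs expand the same way inside a script as at the prompt.
cd "$(mktemp -d)"
mkdir -p result
touch result/demo_1000000_100001.ps result/demo_1000000_100002.ps

# Unquoted: the shell expands the pattern to both files.
ls ./result/demo_1000000_10000*.ps

# Quoted: the pattern is passed literally, so ls looks for a file
# whose name actually contains the '*' character.
ls "./result/demo_1000000_10000*.ps" || echo "no literal file by that name"
```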

wget errors breaks shell script - how to prevent that?

I have a huge file with lots of links to files of various types to download. Each line is one download command like:
wget 'URL1'
wget 'URL2'
...
and there are thousands of those.
Unfortunately some URLs look really ugly, like for example:
http://www.cepa.org.gh/archives/research-working-papers/WTO4%20(1)-charles.doc
It opens OK in a browser, but confuses wget.
I'm getting an error:
./tasks001.sh: line 35: syntax error near unexpected token `1'
./tasks001.sh: line 35: `wget 'http://www.cepa.org.gh/archives/research-working-papers/WTO4%20(1)-charles.doc''
I've tried both URL and 'URL' ways of specifying what to download.
Is there a way to make a script like that running unattended?
I'm OK if it'll just skip the file it couldn't download.
Do not (ab)use the shell.
Save your URLs to some file (let's say my_urls.lst) and do:
wget -i my_urls.lst
Wget will handle the quoting etc. on its own.
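If you do want to keep per-URL commands, here is a sketch of a loop that skips failures; the file name my_urls.lst and the helper function are assumptions for illustration:

```shell
#!/bin/sh
# download_all FILE: run wget on each URL listed in FILE (one per
# line), continuing past any download that fails.
download_all() {
    while IFS= read -r url; do
        # Quoting "$url" keeps '(', '&' and '%' away from the shell parser.
        wget -- "$url" || echo "skipped: $url" >&2
    done < "$1"
}

# Usage:
# download_all my_urls.lst
```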
I think you need to use double quotes (") and not single quotes (') around the URL.
If that still doesn't work, try escaping the paren characters ( and ) with a backslash: \( and \)
Which shell are you using? Bash? zsh?
This doesn't exactly answer your question but:
Both of the following commands work directly in a bash shell:
wget "http://www.cepa.org.gh/archives/research-working-papers/WTO4%20(1)-charles.doc"
and
wget 'http://www.cepa.org.gh/archives/research-working-papers/WTO4%20(1)-charles.doc'
Can you check to see if either of those work for you?
What seems to be happening is that your shell is doing something with the ( characters. I would try using double quotes " instead of single quotes ' around your URL.
If you wish to suppress errors, on Unix you can redirect standard output with >/dev/null or standard error with 2>/dev/null. On other operating systems the mechanism may differ.
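For example, in a Unix shell:

```shell
# Discard only the error stream; stdout still comes through.
ls /no/such/path 2>/dev/null

# '|| true' additionally keeps a failing command from aborting
# a script that runs under 'set -e'.
ls /no/such/path 2>/dev/null || true
```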