Output template error when downloading via youtube-dl

I admit, I'm new to Linux, but I pieced together the following in the Ubuntu terminal to download all of the videos from a YouTube channel:
youtube-dl -o "/media/ubuntu/3A3A9F353A9EED5F/%(uploader)s/%(autonumber)s.%(title)s.%(ext)s" --download-archive ~/.mydownloads -citw ytuser:DirectFix
However, I keep getting this error:
youtube-dl: error: using output template conflicts with using title, video ID or auto number
What do I need to do so I can download the files straight to a separate internal drive, rename the files, and keep track of the videos I've already downloaded?

Your options -citw do not make sense together. Simply remove them (you may want to keep -i) and the download will work; see the corrected command after the breakdown below.
In detail:
-c forces youtube-dl to always resume downloads. By default, youtube-dl already resumes downloads, so at best this option is superfluous. At worst, you may be forcing youtube-dl to resume a download in another quality, which will result in a broken video file.
-i makes youtube-dl continue if it cannot download a video from the playlist. Unlike the other options, it is regularly useful. Be aware that you may miss errors though, if you need a complete download.
-t is the equivalent of -o "%(title)s-%(id)s.%(ext)s". As such, it is causing the immediate error at hand, since you are passing in two different output templates and youtube-dl doesn't know which one to pick.
-w forces youtube-dl to never overwrite an existing file. This is useful for metadata files, which you don't use in the first place. Even then, most users will want the updated information.
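Putting that together, your original command with -c, -t and -w dropped (keeping -i) should work, for example:
youtube-dl -o "/media/ubuntu/3A3A9F353A9EED5F/%(uploader)s/%(autonumber)s.%(title)s.%(ext)s" --download-archive ~/.mydownloads -i ytuser:DirectFix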

Related

How to change Media Created Date in Exiftool?

I have some MP4 videos that don't have a Media Created date and time because they were recorded with the Instagram camera. I want to set the date and time, and I found out that I can do that with the Exiftool software. I know that the software works by typing command lines, but I don't know where I should type what. I searched on Google and didn't find any helpful results.
I downloaded the software and got "exiftool(-k).exe" file. What should I do now? I have no idea how it works. I hope somebody can write simple steps just to set that Media Created date.
First, you will probably want to rename exiftool(-k).exe to just exiftool.exe and place it somewhere in your PATH (see install exiftool-Windows). There is also Oliver Betz's Alternative Exiftool build for Windows, which includes an installer and is a bit more security-friendly. See this post on the exiftool forums. For those that use Chocolatey, the Chocolatey exiftool package will add exiftool to the PATH and is a well-maintained package.
Then you will want to use one of these commands. In the case of your example filename, the file appears to have been named for the time it was taken, i.e. YearMonthDay_HourMinutesSeconds. In that case you can simply use this command (see exiftool FAQ #5):
exiftool -api QuickTimeUTC "-CreateDate<Filename" 20181223_000542.mp4
This will work correctly as long as the video was taken in the same time zone as the computer you are currently using. If not, you will have to add the time zone like this:
exiftool -api QuickTimeUTC "-CreateDate<${Filename}-04:00" 20181223_000542.mp4
This is because the CreateDate tag for MP4 files is supposed to be UTC and Windows properties will read it as such. Mac Finder will also correctly adjust from UTC. With the -api QuickTimeUTC option, exiftool will automatically adjust the time to UTC.
If you need to set the time to something different than what the filename is, then you would use this, adding the time zone if needed:
exiftool "-CreateDate=2018:12:23 00:05:42" 20181223_000542.mp4
These commands create backup files. Add the -overwrite_original option to suppress the creation of backup files. If on a Mac, the slower -overwrite_original_in_place option could be used to preserve any MDItem/XAttr data. Add the -r (-recurse) option to recurse into subdirectories. If this command is run under Unix/Mac/PowerShell, reverse any double/single quotes to avoid shell variable interpretation.
You can process as many files and/or directories as you can put on the command line, so if you wanted to process all the files in c:\Dir1 and C:\Dir2, you would just list both of them at the end of the command: c:\Dir1 C:\Dir2
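For example, combining the options above into one command (an untested sketch using only the flags already mentioned) might look like:
exiftool -api QuickTimeUTC -r -overwrite_original "-CreateDate<Filename" C:\Dir1 C:\Dir2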

Retrieving latest file in a directory from a remote server

I was hoping to crack this myself, but it seems I have fallen at the first hurdle because I can't make head nor tail of the other options I've read about.
I wish to access a database file hosted as follows (i.e. the hhsuite_dbs is a folder containing several databases)
http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/pdb70_08Oct15.tgz
Periodically, they update these databases, and so I want to download the latest version. My plan is to run a bash script via cron, most likely monthly (though I've yet to even tackle the scheduling aspect of the task).
I believe the database is refreshed fortnightly, so if my script runs monthly I can expect there to be a new version. I'll then be running downstream programs that require the database.
My question is then, how do I go about retrieving this (and for a little more finesse I'd perhaps like to be able to check whether the remote file has changed in name or content to avoid a large download if unnecessary)? Is the best approach to query the name of the file, or the file property of date last modified (given that they may change the naming syntax of the file too?). To my naive brain, some kind of globbing of the pdb70 (something I think I can rely on to be in the filename) then pulled down with wget was all I had come up with so far.
EDIT: Another confounding issue that has just occurred to me is that the file I want won't necessarily be the newest in the folder (as there are other types of databases there too); rather, I need the newest version of, in this case, the pdb70 database.
Solutions I've looked at so far have mentioned weex, lftp and curlftpfs, but all of these seem to call for logins/passwords for the server, which I don't have/need if I were just downloading it via the web. I've also seen mention of rsync, but on a cursory read it seems like people are steering clear of it for FTP uses.
Quite a few barriers in your way for this.
My first suggestion is that rather than getting the filename itself, you simply mirror the directory using wget, which should already be installed on your Ubuntu system, and let wget figure out what to download.
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
And new files will be created in the "safe" directory.
But that just gets you your mirror. What you're still after is the "newest" file.
Luckily, wget sets the datestamp of files it downloads, if it can. So after mirroring, you might be able to do something like:
newestfile=$(ls -t /some/place/safe/pdb70*gz | head -1)
Note that this fails if ever there are newlines in the filename.
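If that is a concern, a newline-safe variant (a sketch assuming GNU find, sort, tail and cut, which all support NUL-delimited records) would be:
newestfile=$(find /some/place/safe/ -maxdepth 1 -name 'pdb70*gz' -printf '%T@ %p\0' | sort -zn | tail -zn 1 | cut -zd' ' -f2- | tr -d '\0')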
Another possibility might be to check the difference between the current file list and the last one. Something like this:
#!/bin/bash
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
rm index.html* *.gif # remove debris from mirroring an index
ls > /tmp/filelist.txt.$$
if [ -f /tmp/filelist.txt ]; then
    echo "Difference since last check:"
    diff /tmp/filelist.txt /tmp/filelist.txt.$$
fi
mv /tmp/filelist.txt.$$ /tmp/filelist.txt
You can parse the diff output (man diff for more options) to determine what file has been added.
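For instance, lines starting with "> " in the diff output are the newly added files, so something along these lines (a sketch, run inside the if block before the mv) would pick out a newly appeared pdb70 archive:
newfile=$(diff /tmp/filelist.txt /tmp/filelist.txt.$$ | sed -n 's/^> //p' | grep '^pdb70')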
Of course, with a solution like this, you could run your script every day and hopefully download a new update within a day of it being ready, rather than a fortnight later. The nice thing about --mirror is that it won't download files that are already on hand.
Oh, and I haven't tested what I've written here. That's one monstrously large file.

wget: delete incomplete files

I'm currently using a bash script to download several images using wget.
Unfortunately the server I am downloading from is less than reliable and therefore sometimes when I'm downloading a file, the server will disconnect and the script will move onto the next file, leaving the previous one incomplete.
In order to remedy this, I've tried to add a second line after the script that fetches all the incomplete files using:
wget -c myurl.com/image{1..3}.png
This seems to work, as wget goes back and completes the download of the files, but the problem then comes from this: ImageMagick, which I use to stitch the images into a PDF, claims there are errors with the headers of the images.
My thought of what to do about deleting the incomplete files is:
wget myurl.com/image{1..3}.png
wget -rmincompletefiles
wget -N myurl.com/image{1..3}.png
convert *.png mypdf.pdf
So the question is, what can I use in place of -rmincompletefiles that actually exists, or is there a better way I should be approaching this issue?
I made a surprising discovery when attempting to implement tvm's suggestion.
It turns out, and this is something I didn't realize, that when you run wget -N, wget actually checks file sizes and verifies they are the same. If they are not, the files are deleted and then downloaded again.
So cool tip if you're having the same issue I am!
I've found this solution to work for my use case.
From the answer:
wget http://www.example.com/mysql.zip -O mysql.zip || rm -f mysql.zip
This way, the file will only be deleted if an error or cancellation occurred.
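Applied to the example URLs from the question, that approach might look something like this sketch:
for i in {1..3}; do
    wget "myurl.com/image${i}.png" -O "image${i}.png" || rm -f "image${i}.png"
done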
Well, I would try hard to download the files with wget (you can specify extra parameters like a larger --timeout to give the server some extra time). wget assumes certain things about partial downloads, and even with proper resume they can sometimes end up mangled (unless you check, e.g., their MD5 sums by other means).
Since you are using convert and bash, there will most likely be another tool available from the ImageMagick package, namely identify.
While certain features are surely poorly documented, it has one awesome piece of functionality: it can identify broken (or partially downloaded) images.
➜ ~ identify b.jpg; echo $?
identify.im6: Invalid JPEG file structure: ...
1
It will return exit status 1 if you call it on an inconsistent image. You can remove these inconsistent images using a simple loop such as:
for i in *.png; do
    identify "$i" || rm -f "$i"
done
Then I would try to download again the files that are broken.
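A minimal sketch of that last step, reusing the example URLs from the question and only re-fetching files that identify rejects (or that the loop above removed):
for i in image{1..3}.png; do
    identify "$i" >/dev/null 2>&1 || wget -c "myurl.com/$i"
done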

How to stream all videos in a folder?

Hi, I want to stream videos over the web using ffserver. I got this link as a reference.
Now what I am not able to figure out is how to pass a folder (which contains all the videos I want to stream) as input, to stream all the videos. I also want to add more videos to this folder dynamically from time to time, and streaming should still happen (like how it works in Darwin). I can't use Darwin because it doesn't support iOS.
Please give me a suggestion.
Is there any other open source tool with which I can do this?
I wrote a bash script for this; it's working on Ubuntu 16.
Hopefully someone else can write it up in a less terrible language.
Here's the script:
# emit the global ffserver settings plus a status page
echo -e "HTTPPort 8090\nHTTPBindAddress 0.0.0.0\nMaxHTTPConnections 2000\nMaxClients 1000\nMaxBandwidth 1000\nCustomLog -\n<Stream stat.html>\nFormat status\n</Stream>"
# emit one <Stream> block per mp3 file in the current directory
for i in *.mp3; do
    echo -e "<Stream \"$(urlencode "$i")\">\nFile \"$(pwd)/$i\"\nFormat mp2\nAudioCodec libmp3lame\nAudioBitRate 64\nAudioChannels 1\nAudioSampleRate 44100\nNoVideo\n</Stream>"
done
Save this as a bash script in the folder you want to serve; I'll refer to it as:
./gen_ffserver_conf.sh
It's hard-coded for mp3; you'd have to sort through my echos to get it to do another format (see the sketch below for one way that might look for video).
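For what it's worth, a per-file entry for video files might look something like the following untested sketch; the directives are standard ffserver.conf options, but the webm/libvpx/libvorbis combination is an assumption and would likely need tuning:
for i in *.mp4; do
    echo -e "<Stream \"$(urlencode "$i")\">\nFile \"$(pwd)/$i\"\nFormat webm\nVideoCodec libvpx\nVideoBitRate 1000\nVideoFrameRate 25\nAudioCodec libvorbis\nAudioBitRate 128\n</Stream>"
done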
run the server with:
ffserver -f <(bash -e ./gen_ffserver_conf.sh)
I had to install a package for the url encoding:
sudo apt install gridsite-clients
(and of course you need ffserver as well, in the ffmpeg package)
I stream the files by going to:
http://<ip address of streaming server>:8090/stat.html
and clicking on the urlencoded values (using Chromium). This will open the stream and start playing.
Explanation:
ffserver doesn't like wildcards, or at least I never figured that out, so I'm just creating an entry for each file in the server. The urlencoding is annoying but necessary for the stat page links to work properly.

How to resume an ftp download at any point? (shell script, wget option)?

I want to download a huge file from an ftp server in chunks of 50-100MB each. At each point, I want to be able to set the "starting" point and the length of the chunk I want. I won't have the "previous" chunks saved locally (i.e. I can't ask the program to "resume" the download).
What is the best way of going about that? I use wget mostly, but would something else be better?
I'm really interested in a pre-built/in-built function rather than using a library for this purpose... Since wget/ftp (also, I think) allow resumption of downloads, I don't see why that would be a problem... (I can't figure it out from all the options though!)
I don't want to keep the entire huge file at my end, just process it in chunks... FYI all: I'm having a look at continue FTP download after reconnect, which seems interesting.
Use wget with the -c option. Extracted from the man pages:
-c / --continue
Continue getting a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of Wget, or by another program. For instance:
wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z
If there is a file named ls-lR.Z in the current directory, Wget will assume that it is the first portion of the remote file, and will ask the server to continue the retrieval from an offset equal to the length of the local file.
For those who'd like to use command-line curl, here goes:
curl -u user:passwd -C - -o <partial_downloaded_file> ftp://<ftp_path>
(leave out -u user:passwd for anonymous access)
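Since you want to set an explicit starting point and chunk length, curl's -r/--range option is also worth knowing about; it requests a specific byte range instead of resuming from a local file. A sketch with 50 MB chunks (the offsets and output names are just placeholders):
curl -u user:passwd --range 0-52428799 -o chunk1 ftp://<ftp_path>
curl -u user:passwd --range 52428800-104857599 -o chunk2 ftp://<ftp_path>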
I'd recommend interfacing with libcurl from the language of your choice.

Resources