Why can I sometimes download pictures with curl and sometimes not? - shell

So I have gotten a link to an image from google
https://media.istockphoto.com/photos/pile-of-euro-notes-picture-id471843075?k=20&m=471843075&s=612x612&w=0&h=aEFb1spFMtvSnsNvkpgA2tULw-cmcBC4nwbCvDFYN9c=
I got this by right clicking on the image and getting the URL address of the image
This is another Image url I got in the same way
https://m.media-amazon.com/images/I/61RzcieEZpL._AC_SX522_.jpg
Both images show up fine when I paste the links in the browser
When I use curl to download the second image it does so without issues
curl -O 'https://m.media-amazon.com/images/I/61RzcieEZpL._AC_SX522_.jpg'
However, for the first...
curl -O 'https://media.istockphoto.com/photos/pile-of-euro-notes-picture-id471843075?k=20&m=471843075&s=612x612&w=0&h=aEFb1spFMtvSnsNvkpgA2tULw-cmcBC4nwbCvDFYN9c='
The downloaded file is just this strange looking text file
ˇÿˇ‡JFIF,,ˇ·òExifII*[&òÇÅPile of euro notes. This notes are miniatures, made by myself. More money? In my portfolio.Kerstin Waurickˇ·îhttp://ns.adobe.com/xap/1.0/<?xpacket begin="Ôªø" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/" xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/" xmlns:GettyImagesGIFT="http://xmp.gettyimages.com/gift/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:plus="http://ns.useplus.org/ldf/xmp/1.0/" xmlns:iptcExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/" xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/" dc:Rights="Kerstin Waurick" photoshop:Credit="Getty Images/iStockphoto" GettyImagesGIFT:AssetID="471843075" xmpRights:WebStatement="https://www.istockphoto.com/legal/license-agreement?utm_medium=organic&utm_source=google&utm_campaign=iptcurl" >
<dc:creator><rdf:Seq><rdf:li>Kerrick</rdf:li></rdf:Seq></dc:creator><dc:description><rdf:Alt><rdf:li xml:lang="x-default">Pile of euro notes. This notes are miniatures, made by myself. More money? In my portfolio.</rdf:li></rdf:Alt></dc:description>
<plus:Licensor><rdf:Seq><rdf:li rdf:parseType='Resource'><plus:LicensorURL>https://www.istockphoto.com/photo/license-gm471843075-?utm_medium=organic&utm_source=google&utm_campaign=iptcurl</plus:LicensorURL></rdf:li></rdf:Seq></plus:Licensor>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
ˇÌ∫Photoshop 3.08BIMùPKerrickx[Pile of euro notes. This notes are miniatures, made by myself. More money? In my portfolio.tKerstin WauricknGetty Images/iStockphotoˇ€C
#%$""!&+7/&)4)!"0A149;>>>%.DIC<H7=>;ˇ€C
Can anyone tell me why this is happening?

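The download itself worked: the text you pasted starts with the JPEG/JFIF header, so the file is a valid JPEG. Your viewer just failed to work out that it is an image, because curl -O names the file after the last part of the URL, which here has no .jpg extension. Drop -O (with it, nothing is written to stdout) and redirect the output to a name with an extension:
curl 'https://media.istockphoto.com/photos/pile-of-euro-notes-picture-id471843075?k=20&m=471843075&s=612x612&w=0&h=aEFb1spFMtvSnsNvkpgA2tULw-cmcBC4nwbCvDFYN9c=' > image.jpg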
If you might be downloading PNGs and GIFs and stuff other than JPEG, you can use file to get a sensible extension:
curl ... > UnknownThing
Then:
file -b --extension UnknownThing
jpeg/jpg/jpe/jfif
So maybe something along the lines of:
curl ... > UnknownThing
ext=$(file -b --extension UnknownThing | sed 's|/.*||')
mv UnknownThing "image.${ext}"
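The pasted text begins with the bytes FF D8 FF followed by JFIF, which is the JPEG signature, so the type can also be read straight from the first bytes of the file. A minimal sketch of that check follows; the function name sniff_ext and the short list of formats are assumptions for illustration, and file knows far more formats than this.

```shell
#!/bin/sh
# sniff_ext FILE: print an extension guessed from the leading "magic" bytes.
# Hypothetical helper; it only knows three formats.
sniff_ext() {
    # First four bytes as lowercase hex with no spaces, e.g. "ffd8ffe0".
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        ffd8ff*)  echo jpg ;;      # JPEG: FF D8 FF
        89504e47) echo png ;;      # PNG: 89 'P' 'N' 'G'
        47494638) echo gif ;;      # GIF: 'G' 'I' 'F' '8'
        *)        echo unknown ;;
    esac
}
```

It could be used the same way as the file version, for example: mv UnknownThing "image.$(sniff_ext UnknownThing)"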

Related

Image downloaded with wget has size of 4 bytes

I have a problem downloading a certain image.
I'm trying to download an image and save it to disk.
Here is the wget command I'm using. It works fine with almost every image, including this URL:
wget -O test.gif http://www.fmwconcepts.com/misc_tests/animation_example/lena_anim2.gif
Almost, because when I try to download the image from this URL: http://sklepymuzyczne24.pl/_data/ranking/686/e3991/ranking.gif
it fails. The downloaded file is only 4 bytes. I tried curl instead of wget, but the result is the same.
I think the second image (the one that does not work) might be generated dynamically, since it changes automatically depending on store reviews. I believe that has something to do with this issue.
Looks like some kind of misconfiguration on the server side. It won't return the image unless you say that you accept gzip-compressed content. Most web browsers do this by default, so the image works fine in a browser, but with wget or curl you need to add the Accept-Encoding header manually. You will then get a gzip-compressed image, which you can pipe through gunzip to get a normal, uncompressed image.
You could save the image using:
wget --header='Accept-Encoding: gzip' -O- http://sklepymuzyczne24.pl/_data/ranking/686/e3991/ranking.gif | gunzip - > ranking.gif
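If the server is ever fixed, or answers some requests uncompressed, piping straight into gunzip would fail. A hedged sketch that decompresses only when the data starts with the gzip signature (bytes 1F 8B); the function name maybe_gunzip and the intermediate file name ranking.raw are made up for this example:

```shell
#!/bin/sh
# maybe_gunzip IN OUT: copy IN to OUT, decompressing only if IN is gzip data.
# Hypothetical helper, not part of wget or curl.
maybe_gunzip() {
    # gzip streams start with the two bytes 1F 8B.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    if [ "$magic" = 1f8b ]; then
        gunzip -c "$1" > "$2"
    else
        cp "$1" "$2"
    fi
}

# Possible use (needs network access, so it is left commented out):
#   wget --header='Accept-Encoding: gzip' -O ranking.raw http://sklepymuzyczne24.pl/_data/ranking/686/e3991/ranking.gif
#   maybe_gunzip ranking.raw ranking.gif
```

With curl, the --compressed option sends the Accept-Encoding header and decompresses the response in one step.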

How to get page size(Trim/Bleed/Art/Media....) in PDF with GhostScript 9.19 in windows 10?

I want to get the page sizes (Trim/Bleed/Art/Media...) of a PDF with GS 9.19 on Windows 10.
I tried this command:
gswin64c -dNODISPLAY -q -dDumpMediaSizes "../lib/PDFA_def.ps" test.pdf
but I got the error below:
Error: /undefinedfilename in --file--
Thank you for your information.
Why would you think that PDFA_def.ps, which is used to create a PDF/A-compliant PDF file, would give you any information?
I suspect you actually mean pdf_info.ps, which is located in the toolbin folder, not the lib folder.

download all images on the page with WGET

I'm trying to download all the images that appear on the page with wget. Everything seems fine, but the command actually downloads only the first 6 images and no more. I can't figure out why.
The command I used:
wget -nd -r -P . -A jpeg,jpg http://www.edpeers.com/2013/weddings/umbria-wedding-photographer/
It downloads only the first 6 relevant images on the page, plus other stuff that I don't need. Look at the page; any idea why it only gets the first 6 relevant images?
Thanks in advance.
I think the main problem is that there are only 6 JPEGs in the page's src attributes. For all the other images the src is a placeholder GIF, for example:
<img src="http://www.edpeers.com/wp-content/themes/prophoto5/images/blank.gif"
data-lazyload-src="http://www.edpeers.com/wp-content/uploads/2013/11/aa_umbria-italy-wedding_075.jpg"
class="alignnone size-full wp-image-12934 aligncenter" width="666" height="444"
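alt="Umbria wedding photographer" title="Umbria wedding photographer" />
data-lazyload-src is an attribute used by a jQuery lazy-load plugin, which swaps the real JPEG in with JavaScript. wget does not run JavaScript, so it never sees those JPEGs; see http://www.appelsiini.net/projects/lazyload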
Try -p instead of -r
wget -nd -p -P . -A jpeg,jpg http://www.edpeers.com/2013/weddings/umbria-wedding-photographer/
see http://explainshell.com:
-p
--page-requisites
This option causes Wget to download all the files that are necessary to properly display a given HTML
page. This includes such things as inlined images, sounds, and referenced stylesheets.
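Since wget does not run JavaScript and only follows standard attributes such as src, it will probably still skip the URLs stored in data-lazyload-src. One workaround is to pull those URLs out of the HTML and hand them to wget. A rough sketch, assuming the attribute is always written as data-lazyload-src="..." with double quotes; lazy_urls is a made-up name:

```shell
#!/bin/sh
# lazy_urls: read HTML on stdin, print each data-lazyload-src URL on its own line.
# Hypothetical helper; a plain text match, not a real HTML parser.
lazy_urls() {
    grep -o 'data-lazyload-src="[^"]*"' |
        sed -e 's/^data-lazyload-src="//' -e 's/"$//'
}

# Possible use (needs network access, so it is left commented out):
#   wget -qO- http://www.edpeers.com/2013/weddings/umbria-wedding-photographer/ |
#       lazy_urls | wget -nd -P . -i -
```

Here wget -i - reads the list of URLs to download from standard input.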

jpg won't optimize (jpegtran, jpegoptim)

I have an image and it's a jpg.
I tried running through jpegtran with the following command:
$ jpegtran -copy none -optimize image.jpg > out.jpg
The file outputs, but the image seems un-modified (no size change)
I tried jpegoptim:
$ jpegoptim image.jpg
image.jpg 4475x2984 24bit P JFIF [OK] 1679488 --> 1679488 bytes (0.00%), skipped.
I get the same results when I use --force with jpegoptim except it reports that it's optimized but there is no change in file size
Here is the image in question: http://i.imgur.com/NAuigj0.jpg
But I can't seem to get it to work with any other jpegs I have either (only tried a couple though).
Am I doing something wrong?
I downloaded your image from imgur, but the size is 189,056 bytes. Is it possible that imgur did something to your image?
Anyway, I managed to optimize it to 165,920 bytes using Leanify (I'm the author) and it's lossless.

Download GD-JPEG image with correct dimensions from cURL CLI

I need help in downloading a series of GD generated images with their correct dimensions.
I'm using this command in the cURL CLI to download a range of items:
curl "http://apl-moe-eng-www.ai-mi.jp/img/php/fitsample.php?&i_id=4[0001-9999]" -o "Clothes\4#1.jpg" --create-dirs
But the downloaded image's dimensions are smaller than the one shown on the website. The website's image's dimensions are 640*882, but cURL's output image's dimensions are 232*320.
Original Image
cURL Output Image
Why is this, and can anything be added to the command to fix this?
I figured out it was because I left out a user agent:
curl "http://apl-moe-eng-www.ai-mi.jp/img/php/fitsample.php?&i_id=4[0001-9999]" -o "Clothes\4#1.jpg" --create-dirs --user-agent "Android"
Odd, huh?

Resources