How to detect webp file behind .png link? - image

I have a tricky link:
https://www.pwc.com.tr/tr/sektorler/Perakende-T%C3%BCketici/kuresel-tuketicileri-tanima-arastirmasi/kuresel-tuketici-gorusleri-arastirmasi-info-5en.png
The .png suffix of the link implies that we will get a PNG image, and even an HTTP GET request to that link returns the Content-Type 'image/png'.
But if you try to save the image from a browser, you end up with a WebP file.
So the question is: how can a program that can only use the HTTP protocol detect that a WebP image is really 'hidden' behind a link that looks like, and acts like (remember the headers!), a PNG file?
Update: I want to point out that I made the HTTP GET request from different environments and got 'image/png' in the Content-Type header, for example using Node.js and axios:
https://youtu.be/KiRrAVl67uQ

Update: The server detects the client type from the request's User-Agent header and returns a different Content-Type accordingly. That makes sense, because not every client supports WebP.
Thus, to get the image/webp resource, you can send a custom User-Agent header that imitates Chrome or a similar browser. For example, in Node.js with axios:
const axios = require('axios');

axios.request({
  url: 'https://www.pwc.com.tr/tr/sektorler/Perakende-T%C3%BCketici/kuresel-tuketicileri-tanima-arastirmasi/kuresel-tuketici-gorusleri-arastirmasi-info-5en.png',
  method: 'get',
  headers: {
    // Pretend to be Chrome so the server picks the WebP variant.
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
  }
}).then(function(res) {
  console.log(res.headers); // content-type header is image/webp now.
}).catch(function(err) {
  console.log(err);
});
The browser tries to save this picture as .webp because the Content-Type header in the HTTP response it receives is image/webp.
how one can detect that it really webp image 'hidden' behind the link that looks like and act like png file...?
You can check the HTTP response headers and see which Content-Type the server actually returned.
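Because the header depends on what the client claims to be, a complementary check (not part of the answer above) is to look at the first bytes of the body itself. The sketch below assumes the response body is available as a Node.js Buffer; the function name sniffImageType is made up for illustration. It relies only on the published file signatures: PNG starts with a fixed 8-byte sequence, and WebP is a RIFF container whose bytes 8 to 11 spell "WEBP".

```javascript
// Identify an image by its leading "magic bytes" instead of the URL suffix or headers.
// sniffImageType is a hypothetical helper name, not part of axios or Node.js.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

function sniffImageType(buf) {
  // Every PNG file starts with the same 8-byte signature.
  if (buf.length >= 8 && buf.subarray(0, 8).equals(PNG_SIGNATURE)) {
    return 'image/png';
  }
  // WebP layout: "RIFF", a 4-byte little-endian size, then "WEBP".
  if (
    buf.length >= 12 &&
    buf.toString('latin1', 0, 4) === 'RIFF' &&
    buf.toString('latin1', 8, 12) === 'WEBP'
  ) {
    return 'image/webp';
  }
  return 'unknown';
}
```

With axios, requesting the body with responseType: 'arraybuffer' and passing Buffer.from(res.data) to this function should work, though that wiring is untested here. Only the first 12 bytes are needed, so a partial download would be enough.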

Related

HTML5 Progressive Streaming -- no follow-up range requests

I'm working on an embedded device that is recording video on the fly. I'd like to stream that to an HTML5 video element, using our own custom server. I have this almost working and would like some help.
So far as I can tell, I've got libav / ffmpeg doing their job right. I encoded an mp4 in RAM with the moov atom at the start of the file. I've written this file to disk and it plays everywhere it should.
The problem, I think, lies with how I'm responding to HTTP range requests. When I try to do a live stream, I get an initial range request from the browser / player (so far I've tried Chrome, Firefox, and VLC) for bytes=0-. I respond with some initial bytes. The browser / player actually plays this fine, but never asks again. So the live stream doesn't work, just the first 3 seconds or whatever.
I've looked at the RFC spec of partial content, and my understanding is I'm doing what I should be... Clearly I'm not though. Here is an example of a request / response with Chrome as the requester:
get /live.mp4 HTTP/1.1
host: localhost:1235
connection: keep-alive
accept-encoding: identity;q=1, *;q=0
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36
accept: */*
dnt: 1
accept-language: en-GB,en-US;q=0.9,en;q=0.8
range: bytes=0-
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-Type: video/mp4
Content-Length: 182400
Content-Range: bytes 0-182399/*
Again, with that request / response pair, Chrome plays the first 182400 bytes but never makes a second request. I thought having the '*' in Content-Range would make this happen...
Progressive download doesn't work that way. The browser assumes the file will never change. To play a live stream you need to use fragmented MP4 and Media Source Extensions.
This was the problem. I needed Media Source Extensions; it wasn't clear to me that progressive streaming isn't meant for live feeds of unknown length. Media Source Extensions and a WebSocket solved the issue. I also needed to use -dash for libav to make it work in Chrome.
I came across a similar issue where it was working perfectly in Firefox, but Chrome only played the first fragment of the video and did not request any more. My case was solved simply by making the very first response return nothing, with a 200 status code and an Accept-Ranges: bytes header.
Looks like it is common, check the update part for the following question: Content-Range working in Safari but not in Chrome
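To illustrate why progressive download presumes a file of fixed size, here is a rough sketch of how a server might answer a single byte range against a known total length. The helper name contentRangeFor is invented for this example, and it handles only the bytes=start-end and bytes=start- forms; suffix ranges and multiple ranges are deliberately left out.

```javascript
// Resolve a simple Range header against a file whose total size is known.
// contentRangeFor is a hypothetical helper, not an API of any library.
function contentRangeFor(rangeHeader, totalSize) {
  const match = /^bytes=(\d+)-(\d*)$/.exec(rangeHeader);
  if (!match) return null; // unsupported or malformed range

  const start = Number(match[1]);
  // An open-ended range ("bytes=0-") runs to the last byte of the file.
  const end = match[2] === ''
    ? totalSize - 1
    : Math.min(Number(match[2]), totalSize - 1);
  if (start > end) return null; // not satisfiable; a server would answer 416

  return {
    start,
    end,
    length: end - start + 1, // value for Content-Length
    header: `bytes ${start}-${end}/${totalSize}` // value for Content-Range
  };
}
```

For the request shown above, contentRangeFor('bytes=0-', 182400) gives a Content-Range of bytes 0-182399/182400. Once the player has received that range it has no reason to ask for more, which matches the behaviour described in the question.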

Invalid-Content-Type error when trying to send GIF through Twilio SMS

I have this bit of code to send an MMS message with a GIF (using Ruby with Sinatra, hosted on Heroku).
client.messages.create(
  to: to,
  from: phone,
  body: message,
  media_url: 'http://media.giphy.com/media/zl170rmVMCpEY/giphy.gif'
)
It fails, and Twilio's debug console shows a 12300 invalid content-type error. I'm certain I'm missing something simple here, but I cannot figure out what.
The URL you are using returns a different type of content depending on the Accept header of the request.
In Chrome I get a response with a Content-Type header of text/html, which is surprising given the .gif suffix on the URL.
Chrome's Accept header looks like: Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
However, if I use curl -I http://media.giphy.com/media/zl170rmVMCpEY/giphy.gif I get Content-Type: image/gif
If you look at the image URL on the HTML page, in Chrome, it is actually: https://i.giphy.com/zl170rmVMCpEY.webp
WebP is an alternative format to GIF; I suspect it is served instead of GIF when the browser supports it.
If Twilio supports WebP images, you could use that URL instead.
Giphy also seems to use MP4. It looks like they brand everything as GIF but don't actually serve GIFs to clients that can accept HTML or WebP content.
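The behaviour observed above can be pictured with a toy model of content negotiation. This is only a guess at the kind of rule a server might apply, written to match the responses seen from Chrome and curl; it is not Giphy's actual logic, and pickRepresentation is a made-up name.

```javascript
// Toy model: choose a response type from the request's Accept header.
// Hypothetical rule set, inferred from the observations in the answer above.
function pickRepresentation(acceptHeader) {
  // Split "type/subtype;q=0.9, ..." into bare, lower-cased media types.
  const accepted = acceptHeader
    .split(',')
    .map((part) => part.split(';')[0].trim().toLowerCase());

  if (accepted.includes('text/html')) return 'text/html';   // browser navigation
  if (accepted.includes('image/webp')) return 'image/webp'; // client advertises WebP
  return 'image/gif';                                       // plain clients such as curl
}
```

Under this model, Chrome's navigation Accept header yields text/html, while curl's default Accept: */* yields image/gif.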

How to use Nokogiri in Ruby to parse a link containing a # character

I use Nokogiri in Ruby to parse a link like this:
require 'nokogiri'
require 'open-uri'
link = 'http://vnreview.vn/danh-gia-di-dong#cur=2'
doc = Nokogiri::HTML(URI.open(link, 'User-Agent' => 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31').read, nil, 'UTF-8')
but the doc Nokogiri returns is the source of link='http://vnreview.vn/danh-gia-di-dong'
How can I parse the link with #cur=1, #cur=2...?
The fragment is not sent to the server with the HTTP request. If you open http://www.example.com/#fragment in a browser, the following request is made:
GET / HTTP/1.1
Host: www.example.com
After receiving the response, the browser applies the fragment on its own side and performs some action (for example, scrolling to the element with id="fragment", or executing JavaScript callbacks).
If the page content differs based on the fragment, that is done via JavaScript. Nokogiri is not capable of running JavaScript, so you need some other tool, like selenium-webdriver or capybara-webkit.
Another option is to inspect the AJAX requests made by the page you are trying to parse; you will probably find JSON with the data you need, and can then download that JSON directly. It is also possible that the content is already on the page and is just hidden via CSS (like tabs in Twitter Bootstrap).
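The split between what goes over the wire and what stays in the client can be seen with a standard URL parser. The sketch below uses the WHATWG URL class built into Node.js; requestParts is an illustrative helper name, not an existing function.

```javascript
// Separate the part of a URL that is sent in the HTTP request
// from the fragment, which never leaves the client.
function requestParts(link) {
  const url = new URL(link);
  return {
    host: url.host,                     // goes into the Host header
    target: url.pathname + url.search,  // goes into the request line
    fragment: url.hash                  // kept by the client only
  };
}

console.log(requestParts('http://vnreview.vn/danh-gia-di-dong#cur=2'));
```

For the link in the question, the request target is /danh-gia-di-dong both with #cur=1 and with #cur=2, which is why the server returns the same document each time.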

Correct HTTP Headers for Images?

I'm writing a web server in C#, just for the fun of it, and I am able to serve basic text files to my browser. However, when serving up an image (say, image.png), all browsers that I test my server on (IE, Firefox, and Chrome) show some kind of placeholder thumbnail for the image, as if the image is corrupted or invalid.
The response that I am sending to the browser looks like
HTTP/1.0 200 Ok
Content-Type: image/png
Content-Length: 14580053
{image data here}
Am I using the correct HTTP headers? Or, if I am, why else would browsers not accept the image?
Ah, figured it out... my code forgot to add an extra \n before the response body. It wasn't a problem with the headers at all, just incorrect response syntax.
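For reference, here is a minimal sketch of how such a response could be assembled. It is written in Node.js rather than the C# of the question, and buildResponse is a hypothetical name. HTTP terminates each header line with CRLF and ends the header section with an empty line, so the body begins right after the first \r\n\r\n.

```javascript
// Assemble a raw HTTP response: status line, headers, blank line, then the body bytes.
// buildResponse is an illustrative helper, not part of any framework.
function buildResponse(body, contentType) {
  const head =
    'HTTP/1.0 200 OK\r\n' +
    `Content-Type: ${contentType}\r\n` +
    `Content-Length: ${body.length}\r\n` +
    '\r\n'; // the empty line that separates headers from the body
  // Keep the body as raw bytes; converting image data to a string could corrupt it.
  return Buffer.concat([Buffer.from(head, 'ascii'), body]);
}
```

The body is passed as a Buffer so that Content-Length counts bytes, not characters.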

HTTPS not returning images from public folder while HTTP does

My app works perfectly fine over HTTP, but since I moved to HTTPS I am not able to render images. I have the following folder structure:
-public
----images
--------12243456.jpg
I am requesting this image from JavaScript by setting the src of an HTML img tag:
img.src = "<path/to/image>";
The request comes back with a 404 response. When I switch to HTTP, the same code works.
This is the standard line in my conf file for static files:
GET /assets/*file controllers.Assets.at(path="/public", file)
CSS and JS files are served fine over HTTPS; only the images are causing a problem.
Here is a sample request:
Request URL:https://hay.com:9443/assets/images/1378237450056.jpg?1378237450057
Request Method:GET
Status Code:404 Not Found
Request Headers
Accept:image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8,ca;q=0.6
Connection:keep-alive
Cookie:PLAY_SESSION="0a28e06a3342d31a45af7182fc4598c202d11890-email=guillaume%40sample.com"
Host:localhost:9443
Referer:https://hay.com:9443/launchCamera/1
User-Agent:Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.57 Safari/537.36
Query String Parameters
Response Headers
Content-Length:0
