Getting the dimensions of the image in ruby - ruby

To get the image dimensions in ruby, I tried to use identify to get image dimensions. I wanted to retrieve the output of this system call and get the output as a string
str = system('identify -format "%[fx:w]x%[fx:h]" image.png')
output = `ls`
print output
But, I'm getting the last lines of output and not the output to this particular system call.
Also, if there is a simpler way to get the image dimensions without external gems or libraries, please suggest as it would be great !

Since you already use an external library (ImageMagick), you could use its Ruby wrapper RMagick:
require 'RMagick'
img = Magick::Image::read('image.png').first
arr = [img.columns, img.rows]
Here's an example of a very simple PNG parser:
data = File.binread('image.png', 100) # read first 100 bytes
if data[0, 8] == [137, 80, 78, 71, 13, 10, 26, 10].pack("C*")
# file has a PNG file signature, let's get the image header chunk
length, chunk_type = data[8, 8].unpack("l>a4")
raise "unknown format, expecting image header" unless chunk_type == "IHDR"
chunk_data = data[16, length].unpack("l>l>CCCCC")
width = chunk_data[0]
height = chunk_data[1]
bit_depth = chunk_data[2]
color_type = chunk_data[3]
compression_method = chunk_data[4]
filter_method = chunk_data[5]
interlace_method = chunk_data[6]
puts "image size: #{width}x#{height}"
else
# handle other formats
end

Okay, I finally found a solution after some experiments.
str = `identify -format "%[fx:w]x%[fx:h]" image.png`
arr = str.split('x')
The array arr now contains dimensions in it [width,height] .
This worked for me ! Please suggest other approaches that might be more easier or simpler.

Related

Ruby Zlib compression gives different outputs for the same input

I have this ruby method for compressing a string -
def compress_data(data)
output = StringIO.new
gz = Zlib::GzipWriter.new(output)
gz.write(data)
gz.close
compressed_data = output.string
compressed_data
end
When I call this method with the same input, I get different outputs at different times. I am trying to get the byte array for the compressed outputs and compare them.
The output is Different when I run the below -
input = "hello world"
output1 = (compress_data input).bytes.to_a
sleep 1
output2 = (compress_data input).bytes.to_a
if output1 == output2
puts 'Same'
else
puts 'Different'
end
The output is Same when I remove the sleep. Does the compression algorithm have something to do with the current time?
Option 1 - fixed mtime:
Yes. The compression time is stored in the header. You can use the mtime method to set the time to a fixed value, which will resolve your problem:
gz = Zlib::GzipWriter.new(output)
gz.mtime = 1
gz.write(data)
gz.close
Note that the Ruby documentation says that setting mtime to zero will disable the timestamp. I tried it, and it does not work. I also looked at the source code, and it appears this functionality is missing. Seems like a bug. So you have to set it to something else than 0 (but see comments below - it will be fixed in future releases).
Option 2 - skip the header:
Another option is to just skip the header when checking for similar data. The header is 10 bytes long, so to only check the data:
data = compress_data(input).bytes[10..-1]
Note that you do not need to call to_a on bytes. It is already an Array:
String.bytes -> an_array
Returns an array of bytes in str. This is a shorthand for str.each_byte.to_a.

Combine remote images to one image using ruby-vips

I had a template image and need to append on that a specific images on X , Y positions. Is there any equivalent to that function in rmagick
ImageList.new("https://365psd.com/images/istock/previews/8479/84796157-football-field-template-with-goal-on-top-view.jpg")
and draw on that other images and generate one image.
You can read and write URIs in ruby-vips like this:
#!/usr/bin/ruby
require "vips"
require "down"
def new_from_uri(uri)
byte_source = Down.open uri
source = Vips::SourceCustom.new
source.on_read do |length|
puts "reading #{length} bytes from #{uri} ..."
byte_source.read length
end
source.on_seek do |offset, whence|
puts "seeking to #{offset}, #{whence} in #{uri}"
byte_source.seek(offset, whence)
end
return Vips::Image.new_from_source source, "", access: :sequential
end
a = new_from_uri "https://upload.wikimedia.org/wikipedia/commons/a/a6/Big_Ben_Clock_Face.jpg"
b = new_from_uri "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"
out = a.composite b, "over", x: 100, y: 100
out.write_to_file "x.jpg"
If you watch the console output you can see it loading the two source images and interleaving the pixels. It makes this output:
The docs on Vips::Source have more details.

Open and read a .RAW file

I need to writing a MATLAB program that will read this image file Download here.
This picture shows the instruction and the expected result I have tried the following code, but I couldn't get the expected results.
row=256; col=256;
f=fopen('e6712s4i50.raw','r');
a=fread(f, [col row],'*int16');
Z=a;
imshow(Z)
How do I read this picture correctly?
check below code:
fid = fopen('raw.raw')
rawdata = fread(fid, inf, '*uint8')
fclose(fid);
B = reshape(rawdata(2:2:131072),256,256)';
Y = circshift(B,[240 140]);
imshow(Y)
whats about this?

Detect duplicate videos from YouTube

In consideration to my M.tech Project
I want to know if there is any algorithm to detect duplicate videos from youtube.
For example (here are links of two videos):
random user upload
upload by official channel
Amongst these second is official video and T-series has it's copyright.
Is youtube officially doing something to remove duplicate videos from youtube?
Not only videos, there exists duplicate youtube channels also.
Sometimes the original video has less number of views than that of pirated version.
So, while searching found this
(see page number [49] of pdf)
What I learnt from the given link
Original vs copyright infringed video detection Classifier is used.
Given a query, firstly top k search results are being retrieved.Thereafter three parameters are used to classify the videos
Number of subscribers
user profile
username popularity
and on the basis of these parameters, original video is identified as described in the link.
EDIT 1:
There are basically two different objectives
To identify original video with the above method
To eliminate the duplicate videos
obviously identifying original video is easier than finding out all the duplicate videos.
So i preferred to first find out the original video.
Approach which i can think till now
to improve the accuracy:
We can first find out the original videos with above method
And then use the most popular publicized frames(may be multiple) of that video to search on google image. This method therefore retrieves the list of duplicate videos in google image search results.
After getting these duplicate videos, we can once again check frame by frame and reach a level of satisfaction(yes retrieved videos were "exact or "almost" duplicate copy of original video)
Will this approach work?
if not, is there any better algorithm, to improve upon the given method?
Please write in the comments section if i am unable to explain my approach clearly.
I will soon add some more details.
I've recently hacked together a small tool for that purpose. It's still work in progress but usually pretty accurate. The idea is to simply compare time between brightness maxima in the center of the video. Therefore it should work with different resolutions, frame rates and rotation of the video.
ffmpeg is used for decoding, imageio as bridge to python, numpy/scipy for maxima computation and some k-nearest-neighbor library (annoy, cyflann, hnsw) for comparison.
At the moment it's not polished at all so you should know a little python to run it or simply copy the idea.
Me too had the same problem.. So wrote a program myself..
Problem is I had videos of various formats and resolution.. So needed to take hash of each video frame and compare.
https://github.com/gklc811/duplicate_video_finder
you can just change the directories at top and you are good to go..
from os import path, walk, makedirs, rename
from time import clock
from imagehash import average_hash
from PIL import Image
from cv2 import VideoCapture, CAP_PROP_FRAME_COUNT, CAP_PROP_FRAME_WIDTH, CAP_PROP_FRAME_HEIGHT, CAP_PROP_FPS
from json import dump, load
from multiprocessing import Pool, cpu_count
input_vid_dir = r'C:\Users\gokul\Documents\data\\'
json_dir = r'C:\Users\gokul\Documents\db\\'
analyzed_dir = r'C:\Users\gokul\Documents\analyzed\\'
duplicate_dir = r'C:\Users\gokul\Documents\duplicate\\'
if not path.exists(json_dir):
makedirs(json_dir)
if not path.exists(analyzed_dir):
makedirs(analyzed_dir)
if not path.exists(duplicate_dir):
makedirs(duplicate_dir)
def write_to_json(filename, data):
file_full_path = json_dir + filename + ".json"
with open(file_full_path, 'w') as file_pointer:
dump(data, file_pointer)
return
def video_to_json(filename):
file_full_path = input_vid_dir + filename
start = clock()
size = round(path.getsize(file_full_path) / 1024 / 1024, 2)
video_pointer = VideoCapture(file_full_path)
frame_count = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_COUNT)))
width = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_WIDTH)))
height = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_HEIGHT)))
fps = int(VideoCapture.get(video_pointer, int(CAP_PROP_FPS)))
success, image = video_pointer.read()
video_hash = {}
while success:
frame_hash = average_hash(Image.fromarray(image))
video_hash[str(frame_hash)] = filename
success, image = video_pointer.read()
stop = clock()
time_taken = stop - start
print("Time taken for ", file_full_path, " is : ", time_taken)
data_dict = dict()
data_dict['size'] = size
data_dict['time_taken'] = time_taken
data_dict['fps'] = fps
data_dict['height'] = height
data_dict['width'] = width
data_dict['frame_count'] = frame_count
data_dict['filename'] = filename
data_dict['video_hash'] = video_hash
write_to_json(filename, data_dict)
return
def multiprocess_video_to_json():
files = next(walk(input_vid_dir))[2]
processes = cpu_count()
print(processes)
pool = Pool(processes)
start = clock()
pool.starmap_async(video_to_json, zip(files))
pool.close()
pool.join()
stop = clock()
print("Time Taken : ", stop - start)
def key_with_max_val(d):
max_value = 0
required_key = ""
for k in d:
if d[k] > max_value:
max_value = d[k]
required_key = k
return required_key
def duplicate_analyzer():
files = next(walk(json_dir))[2]
data_dict = {}
for file in files:
filename = json_dir + file
with open(filename) as f:
data = load(f)
video_hash = data['video_hash']
count = 0
duplicate_file_dict = dict()
for key in video_hash:
count += 1
if key in data_dict:
if data_dict[key] in duplicate_file_dict:
duplicate_file_dict[data_dict[key]] = duplicate_file_dict[data_dict[key]] + 1
else:
duplicate_file_dict[data_dict[key]] = 1
else:
data_dict[key] = video_hash[key]
if duplicate_file_dict:
duplicate_file = key_with_max_val(duplicate_file_dict)
duplicate_percentage = ((duplicate_file_dict[duplicate_file] / count) * 100)
if duplicate_percentage > 50:
file = file[:-5]
print(file, " is dup of ", duplicate_file)
src = analyzed_dir + file
tgt = duplicate_dir + file
if path.exists(src):
rename(src, tgt)
# else:
# print("File already moved")
def mv_analyzed_file():
files = next(walk(json_dir))[2]
for filename in files:
filename = filename[:-5]
src = input_vid_dir + filename
tgt = analyzed_dir + filename
if path.exists(src):
rename(src, tgt)
# else:
# print("File already moved")
if __name__ == '__main__':
mv_analyzed_file()
multiprocess_video_to_json()
mv_analyzed_file()
duplicate_analyzer()

Error in reading video file in Matlab

Hi I am getting a strange error while trying to read a video, frame wise in matlab. I am doing the following:
xyloObj = VideoReader(vid_name);
fps = xyloObj.FrameRate;
nFrames = xyloObj.NumberOfFrames;
vidHeight = xyloObj.Height;
vidWidth = xyloObj.Width;
% Preallocate movie structure.
mov(1:nFrames) = ...
struct('cdata', zeros(vidHeight, vidWidth, 3, 'uint8'),...
'colormap', []);
index =1;
for k = 1:nFrames
mov(index).cdata = read(xyloObj, k);
index = index+1;
end
I get the following error:
Error using VideoReader/read (line 80)
The file could not be read.
Haven't found solution to this error anywhere else.
EDIT: File format is avi. something like: D:\videos\drunk.avi.
How about using a mmread? I used VideoReader on Linux, but in my case, the length of frames is not correct.
Moreover, since I need the timestamp of videos, I have changed from VideoReader mmread.

Resources