Viterbi Block Decoding - algorithm

I don't know if this fits here but here goes:
I have some noisy data encoded using a Hamming Block Code and I'd like to decode them using a Viterbi Decoder.
I made my homework thus I know how the Viterbi Block Decoder works but I'd like to avoid implementing it all by myself as it would take quite some time and might be suboptimal.
My question is the following: do you know of any matlab function for Viterbi Block Decoder?
I found comm.ViterbiDecoder but it is only for convolutional encoding.
In the mean time I tried to retrieve as many information as I could from the encoder in case it might be needed in the future (see code below).
% Parameters
codewordLength = 31;
messageLength = 26;
data = randi([0, 1], 1, messageLength);
% Encoder 1
encoder1 = comm.BCHEncoder(...
'CodewordLength', codewordLength, ...
'MessageLength', messageLength);
% Encoder 2
M = ceil(log2(codewordLength+1));
primitivePolynomialDe = primpoly(M, 'nodisplay');
primitivePolynomialBi = fliplr(de2bi(primitivePolynomialDe));
SL = (2^M-1) - codewordLength;
generatorPolynomial = bchgenpoly(codewordLength + SL, ...
messageLength + SL, primitivePolynomialDe);
encoder2 = comm.BCHEncoder(...
'CodewordLength', codewordLength, ...
'MessageLength', messageLength, ...
'PrimitivePolynomialSource', 'Property', ...
'PrimitivePolynomial', primitivePolynomialBi, ...
'GeneratorPolynomialSource', 'Property', ...
'GeneratorPolynomial', generatorPolynomial);
dataEncoded1 = step(encoder1, data.').';
dataEncoded2 = step(encoder2, data.').';
sum(dataEncoded1~=dataEncoded2)

Related

Detectron2- How to log validation loss during training?

I copied the idea from mnslarcher and wrote the following two functions for my keypoint detector (resnet50 backbone) algorithm.
def build_valid_loader(cfg):
_cfg = cfg.clone()
_cfg.defrost() # make this cfg mutable.
_cfg.DATASETS.TRAIN = cfg.DATASETS.TEST
return build_detection_train_loader(_cfg)
def store_valid_loss(model, data, storage):
training_mode = model.training
with torch.no_grad():
loss_dict = model(data)
losses = sum(loss_dict.values())
assert torch.isfinite(losses).all(), loss_dict
loss_dict_reduced = {k: v.item()
for k, v in comm.reduce_dict(loss_dict).items()}
losses_reduced = sum(loss for loss in loss_dict_reduced.values())
if comm.is_main_process():
storage.put_scalars(val_loss=losses_reduced, **loss_dict_reduced)
model.train(training_mode)
then in plain_train_net.py I am calling them as bellow.
val_data_loader = build_valid_loader(cfg)
logger.info("Starting training from iteration {}".format(start_iter))
with EventStorage(start_iter) as storage:
for data, val_data, iteration in zip(data_loader, val_data_loader, range(start_iter, max_iter)):
iteration = iteration + 1
..
..
#At the end of the for loop.
# Calculate and log validation loss.
store_valid_loss(model, val_data, storage)
after 1k iteration, loss_keypoint is increasing, but total_loss is same compared to without store_valid_loss call. What am I missing? Can anyone please help to understand?
I am using 4 GeForce RTX 2080 Ti.

Pyaudio : how to check volume

I'm currently developping a VOIP tool in python working as a client-server. My problem is that i'm currently sending the Pyaudio input stream as follows even when there is no sound (well, when nobody talks or there is no noise, data is sent as well) :
CHUNK = 1024
p = pyaudio.PyAudio()
stream = p.open(format = pyaudio.paInt16,
channels = 1,
rate = 44100,
input = True,
frames_per_buffer = CHUNK)
while 1:
self.conn.sendVoice(stream.read(CHUNK))
I would like to check volume to get something like this :
data = stream.read(CHUNK)
if data.volume > 20%:
self.conn.sendVoice(data)
This way I could avoid sending useless data and spare connection/ increase performance. (Also, I'm looking for some kind of compression but I think I will have to ask it in another topic).
Its can be done using root mean square (RMS).
One way to build your own rms function using python is:
def rms( data ):
count = len(data)/2
format = "%dh"%(count)
shorts = struct.unpack( format, data )
sum_squares = 0.0
for sample in shorts:
n = sample * (1.0/32768)
sum_squares += n*n
return math.sqrt( sum_squares / count )
Another choice is use audioop to find rms:
data = stream.read(CHUNK)
rms = audioop.rms(data,2)
Now if do you want you can convert rms to decibel scale decibel = 20 * log10(rms)

Dealing with under flow while calculating GMM parameters using EM

I am currently runnuing training in matlab on a matrix of logspecrum samples I am constantly dealing with underflow problems.I understood that I need to work with log's in order to deal with underflowing.
I am still strugling with uderflow though , when i calculate the mean (mue) bucause it is negetive i cant work with logs so i need the real values that underflow.
These are equasions i am working with:
In MATLAB code i calulate log_tau in oreder avoid underflow but when calulating mue i need exp(log(tau)) which goes to zero.
I am attaching relevent MATLAB code
**in the code i called the variable alpha is tau ...
for i = 1 : 50
log_c = Logsum(log_alpha,1) - log(N);
c = exp(log_c);
mue = DataMat*alpha./(repmat(exp(Logsum(log_alpha,1)),FrameSize,1));
log_abs_mue = log(abs(mue));
log_SigmaSqr = log((DataMat.^2)*alpha) - repmat(Logsum(log_alpha,1),FrameSize,1) - 2*log_abs_mue;
SigmaSqr = exp(log_SigmaSqr);
for j=1:N
rep_DataMat(:,:,j) = repmat(DataMat(:,j),1,M);
log_gamma(j,:) = log_c - 0.5*(FrameSize*log(2*pi)+sum(log_SigmaSqr)) + sum((rep_DataMat(:,:,j) - mue).^2./(2*SigmaSqr));
end
log_alpha = log_gamma - repmat(Logsum(log_gamma,2),1,M);
alpha = exp(log_alpha);
end
c = exp(log_c);
SigmaSqr = exp(log_SigmaSqr);
does any one see how i can avoid this? or what needs to be fixed in code?
What i did was add this line to the MATLAB code:
mue(isnan(mue))=0; %fix 0/0 problem
and this one:
SigmaSqr(SigmaSqr==0)=1;%fix if mue_k = x_k
not sure if this is the best solution but is seems to work...
any have a better idea?

Detect duplicate videos from YouTube

In consideration to my M.tech Project
I want to know if there is any algorithm to detect duplicate videos from youtube.
For example (here are links of two videos):
random user upload
upload by official channel
Amongst these second is official video and T-series has it's copyright.
Is youtube officially doing something to remove duplicate videos from youtube?
Not only videos, there exists duplicate youtube channels also.
Sometimes the original video has less number of views than that of pirated version.
So, while searching found this
(see page number [49] of pdf)
What I learnt from the given link
Original vs copyright infringed video detection Classifier is used.
Given a query, firstly top k search results are being retrieved.Thereafter three parameters are used to classify the videos
Number of subscribers
user profile
username popularity
and on the basis of these parameters, original video is identified as described in the link.
EDIT 1:
There are basically two different objectives
To identify original video with the above method
To eliminate the duplicate videos
obviously identifying original video is easier than finding out all the duplicate videos.
So i preferred to first find out the original video.
Approach which i can think till now
to improve the accuracy:
We can first find out the original videos with above method
And then use the most popular publicized frames(may be multiple) of that video to search on google image. This method therefore retrieves the list of duplicate videos in google image search results.
After getting these duplicate videos, we can once again check frame by frame and reach a level of satisfaction(yes retrieved videos were "exact or "almost" duplicate copy of original video)
Will this approach work?
if not, is there any better algorithm, to improve upon the given method?
Please write in the comments section if i am unable to explain my approach clearly.
I will soon add some more details.
I've recently hacked together a small tool for that purpose. It's still work in progress but usually pretty accurate. The idea is to simply compare time between brightness maxima in the center of the video. Therefore it should work with different resolutions, frame rates and rotation of the video.
ffmpeg is used for decoding, imageio as bridge to python, numpy/scipy for maxima computation and some k-nearest-neighbor library (annoy, cyflann, hnsw) for comparison.
At the moment it's not polished at all so you should know a little python to run it or simply copy the idea.
Me too had the same problem.. So wrote a program myself..
Problem is I had videos of various formats and resolution.. So needed to take hash of each video frame and compare.
https://github.com/gklc811/duplicate_video_finder
you can just change the directories at top and you are good to go..
from os import path, walk, makedirs, rename
from time import clock
from imagehash import average_hash
from PIL import Image
from cv2 import VideoCapture, CAP_PROP_FRAME_COUNT, CAP_PROP_FRAME_WIDTH, CAP_PROP_FRAME_HEIGHT, CAP_PROP_FPS
from json import dump, load
from multiprocessing import Pool, cpu_count
input_vid_dir = r'C:\Users\gokul\Documents\data\\'
json_dir = r'C:\Users\gokul\Documents\db\\'
analyzed_dir = r'C:\Users\gokul\Documents\analyzed\\'
duplicate_dir = r'C:\Users\gokul\Documents\duplicate\\'
if not path.exists(json_dir):
makedirs(json_dir)
if not path.exists(analyzed_dir):
makedirs(analyzed_dir)
if not path.exists(duplicate_dir):
makedirs(duplicate_dir)
def write_to_json(filename, data):
file_full_path = json_dir + filename + ".json"
with open(file_full_path, 'w') as file_pointer:
dump(data, file_pointer)
return
def video_to_json(filename):
file_full_path = input_vid_dir + filename
start = clock()
size = round(path.getsize(file_full_path) / 1024 / 1024, 2)
video_pointer = VideoCapture(file_full_path)
frame_count = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_COUNT)))
width = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_WIDTH)))
height = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_HEIGHT)))
fps = int(VideoCapture.get(video_pointer, int(CAP_PROP_FPS)))
success, image = video_pointer.read()
video_hash = {}
while success:
frame_hash = average_hash(Image.fromarray(image))
video_hash[str(frame_hash)] = filename
success, image = video_pointer.read()
stop = clock()
time_taken = stop - start
print("Time taken for ", file_full_path, " is : ", time_taken)
data_dict = dict()
data_dict['size'] = size
data_dict['time_taken'] = time_taken
data_dict['fps'] = fps
data_dict['height'] = height
data_dict['width'] = width
data_dict['frame_count'] = frame_count
data_dict['filename'] = filename
data_dict['video_hash'] = video_hash
write_to_json(filename, data_dict)
return
def multiprocess_video_to_json():
files = next(walk(input_vid_dir))[2]
processes = cpu_count()
print(processes)
pool = Pool(processes)
start = clock()
pool.starmap_async(video_to_json, zip(files))
pool.close()
pool.join()
stop = clock()
print("Time Taken : ", stop - start)
def key_with_max_val(d):
max_value = 0
required_key = ""
for k in d:
if d[k] > max_value:
max_value = d[k]
required_key = k
return required_key
def duplicate_analyzer():
files = next(walk(json_dir))[2]
data_dict = {}
for file in files:
filename = json_dir + file
with open(filename) as f:
data = load(f)
video_hash = data['video_hash']
count = 0
duplicate_file_dict = dict()
for key in video_hash:
count += 1
if key in data_dict:
if data_dict[key] in duplicate_file_dict:
duplicate_file_dict[data_dict[key]] = duplicate_file_dict[data_dict[key]] + 1
else:
duplicate_file_dict[data_dict[key]] = 1
else:
data_dict[key] = video_hash[key]
if duplicate_file_dict:
duplicate_file = key_with_max_val(duplicate_file_dict)
duplicate_percentage = ((duplicate_file_dict[duplicate_file] / count) * 100)
if duplicate_percentage > 50:
file = file[:-5]
print(file, " is dup of ", duplicate_file)
src = analyzed_dir + file
tgt = duplicate_dir + file
if path.exists(src):
rename(src, tgt)
# else:
# print("File already moved")
def mv_analyzed_file():
files = next(walk(json_dir))[2]
for filename in files:
filename = filename[:-5]
src = input_vid_dir + filename
tgt = analyzed_dir + filename
if path.exists(src):
rename(src, tgt)
# else:
# print("File already moved")
if __name__ == '__main__':
mv_analyzed_file()
multiprocess_video_to_json()
mv_analyzed_file()
duplicate_analyzer()

Multiple linear optimizations

I'm interested in solving a few hundred linear systems in MATLAB. At the moment this is done by a for-loop with linprog
The vectors used have identical dimensions and are lines of one matrix.
for combination_id = 1:1000
[tempOperatingPointsVectors,tempTargetValue, exitflag] = ...
linprog( lo_c(combination_id,:), ...
[], [], ...
lo_G(:,:,combination_id), lo_d(:,combination_id), ...
lo_u(:,combination_id), lo_v(:,combination_id), ...
x0_in, options);
end
Is there a way of using linprog with the whole vectors instead of picking each line?
I also tried a parfor loop but since the operations in each loop are very small there is no speed improvement.
Why can't you set up one big linear program and then solve all of it at once?
Since I don't have your data, I cannot test the following code, but the basic idea should work.
xVar = 1:size(lo_c,2);
uBound = lo_u(:, 1);
vBound = lo_v(:, 1);
dMat = lo_d(:, 1);
gMat = lo_G(:,:, 1);
objMat = lo_c(1,:);
x0_inMat = x0_in;
for combination_id = 2:1000
xVar = [xVar, xVar(end)+1:xVar(end)+size(lo_c,2)];
uBound = [uBound; lo_u(:, combination_id);
vBound = [vBound; lo_v(:, combination_id);
dMat = [dMat; lo_d(:, combination_id);
gMat = [gMat; lo_G(:,:, combination_id)];
objMat = [objMat; lo_c(combination_id,:)];
x0_inMat = [xo_inMat; x0_in];
end
[tempOperatingPointsVectors,tempTargetValue, exitflag] = ...
linprog( objMat, ...
[], [], ...
gMat, dMat, ...
uBound, vBound, ...
x0_in, options);
Should do the trick.

Resources