Bert Config: Num attention heads - huggingface-transformers

I am using BertConfig() to create an encoder-decoder model in the following way:
encoder = BertConfig()
decoder = BertConfig()
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder, decoder)
bert2bert = EncoderDecoderModel(config=config)
bert2bert.config.decoder.is_decoder = True
bert2bert.config.decoder.add_cross_attention = True
bert2bert.config.encoder.num_attention_heads = 12
print(bert2bert.encoder.num_parameters(only_trainable=True),bert2bert.encoder.config.num_attention_heads)
With the default of 12 attention heads, the trainable parameter count and the number of attention heads at the encoder are as below:
86742528, 12
However, when I try to change the number of attention heads to 4, the number of trainable parameters does not change, while the reported number of attention heads does (as below). Can anyone help me out?
bert2bert.config.decoder.add_cross_attention = True
bert2bert.config.encoder.num_attention_heads = 4
print(bert2bert.encoder.num_parameters(only_trainable=True), bert2bert.encoder.config.num_attention_heads)
(86742528, 4)
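For reference, a minimal sketch (assuming the goal is for the encoder to actually be built with 4 heads): the head count needs to be set on the BertConfig before the model is constructed, since the layers are created from the config at construction time. Note also that for BERT-style attention the parameter count is not expected to change with the head count, because each head has size hidden_size / num_attention_heads and the combined query/key/value projections stay hidden_size x hidden_size.
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Sketch: pass the head count to the configs before building the model.
encoder = BertConfig(num_attention_heads=4)
decoder = BertConfig(num_attention_heads=4, is_decoder=True, add_cross_attention=True)
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder, decoder)
bert2bert = EncoderDecoderModel(config=config)

# Each of the 4 heads has size 768 / 4 = 192, so the projections are still
# 768 x 768 and the trainable-parameter count matches the 12-head case.
print(bert2bert.encoder.num_parameters(only_trainable=True),
      bert2bert.encoder.config.num_attention_heads)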

Related

Fine-tune a pre-trained model

I am new to transformer-based models. I am trying to fine-tune the following model (https://huggingface.co/Chramer/remote-sensing-distilbert-cased) on my dataset, and I got an error (both the code and the error message were posted as screenshots, which are not reproduced here).
I would be thankful if anyone could help.
The preprocessing steps I followed:
input_ids_t = []
attention_masks_t = []

for sent in df_train['text_a']:
    encoded_dict = tokenizer.encode_plus(
        sent,
        add_special_tokens=True,
        max_length=128,
        pad_to_max_length=True,
        return_attention_mask=True,
        return_tensors='tf',
    )
    input_ids_t.append(encoded_dict['input_ids'])
    attention_masks_t.append(encoded_dict['attention_mask'])

# Convert the lists into tensors.
input_ids_t = tf.concat(input_ids_t, axis=0)
attention_masks_t = tf.concat(attention_masks_t, axis=0)
labels_t = np.asarray(df_train['label'])
and I did the same for the testing data. Then:
train_data = tf.data.Dataset.from_tensor_slices((input_ids_t,attention_masks_t,labels_t))
and the same for testing data
It sounds like you are feeding the transformer_model 1 input instead of 3. Try removing the square brackets around transformer_model([input_ids, input_mask, segment_ids])[0] so that it reads transformer_model(input_ids, input_mask, segment_ids)[0]. That way, the function will have 3 arguments and not just 1.
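For illustration, a hedged sketch of feeding the model the tensors built in the preprocessing step above (whether this particular checkpoint ships TensorFlow weights is an assumption; from_pt=True converts from the PyTorch weights if it does not). DistilBERT takes only input_ids and attention_mask, so passing them as keyword arguments avoids the single-list pitfall described above:
from transformers import TFAutoModel

# Hypothetical loading call for the checkpoint named in the question.
transformer_model = TFAutoModel.from_pretrained(
    "Chramer/remote-sensing-distilbert-cased", from_pt=True)

# Keyword arguments instead of one list of inputs.
sequence_output = transformer_model(
    input_ids=input_ids_t, attention_mask=attention_masks_t)[0]
print(sequence_output.shape)  # (num_examples, 128, hidden_size)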

Keras Graph disconnected cannot obtain value for tensor KerasTensor

Tensorflow: 2.4.0
This is the Full Error Message:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 64, 64, 3), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'") at layer "flatten". The following previous layers were accessed without issue: []
I have been trying to make a controllable autoencoder where I have 10 features I can vary to get an image (64x64 RGB), and I have been having trouble getting it to work. I want to separate the neural network into a full model which I can fit and a decoder into which, after training, I can feed values to generate images.
By the way, I know this is not the perfect way to do an autoencoder; it's just the simplest I can think of.
def Create_Generator(Image_Shape):
    Input_Layer = Input(shape=Image_Shape)
    Flatten_Layer1 = Flatten()(Input_Layer)
    Dense_Layer1 = Dense(12288, activation="relu")(Flatten_Layer1)
    Dense_Layer2 = Dense(6144, activation="relu")(Dense_Layer1)
    Dense_Layer3 = Dense(1024, activation="relu")(Dense_Layer2)
    Dense_Layer4 = Dense(10, activation="relu")(Dense_Layer3)
    Dense_Layer5 = Dense(1024, activation="relu")(Dense_Layer4)
    Dense_Layer6 = Dense(6144, activation="relu")(Dense_Layer5)
    Dense_Layer7 = Dense(12288, activation="relu")(Dense_Layer6)
    Reshape_Layer = Reshape(Image_Shape)(Dense_Layer7)

    AutoEncoder = Model(Input_Layer, Reshape_Layer)
    AutoEncoder.compile(optimizer='adam', loss='binary_crossentropy')

    encoded_input = Input(shape=(10,))
    Decoder = Model([encoded_input, Dense_Layer5, Dense_Layer6, Dense_Layer7], Reshape_Layer)

    return AutoEncoder, Decoder
data = np.load("data.npz")
X_train = data['X']
AutoEncoder,Decoder = Create_Generator((64,64,3))
#Just for testing if it works
print(AutoEncoder.predict([X_train[0]]))
print(Decoder([[1,1,1,1,1,1,1,1,1,1]]))
I think you have an error here:
Decoder = Model([encoded_input,Dense_Layer5,Dense_Layer6,Dense_Layer7],Reshape_Layer)
Dense_Layer5, Dense_Layer6, Dense_Layer7 are not tf.keras.layers.Input. You cannot create Decoder this way.
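For what it's worth, a minimal sketch of one way around this (assuming the same layer sizes as above): keep references to the decoder-side Dense layers so the same weights can be reused on a separate Input tensor for the standalone decoder.
from tensorflow.keras.layers import Input, Dense, Flatten, Reshape
from tensorflow.keras.models import Model

def create_autoencoder(image_shape):
    inp = Input(shape=image_shape)
    x = Flatten()(inp)
    x = Dense(12288, activation="relu")(x)
    x = Dense(6144, activation="relu")(x)
    x = Dense(1024, activation="relu")(x)
    code = Dense(10, activation="relu")(x)

    # Decoder layers are created once so their weights are shared between
    # the full autoencoder and the standalone decoder.
    dec1 = Dense(1024, activation="relu")
    dec2 = Dense(6144, activation="relu")
    dec3 = Dense(12288, activation="relu")
    reshape = Reshape(image_shape)

    autoencoder = Model(inp, reshape(dec3(dec2(dec1(code)))))
    autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

    encoded_input = Input(shape=(10,))
    decoder = Model(encoded_input, reshape(dec3(dec2(dec1(encoded_input)))))
    return autoencoder, decoder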

Pyaudio : how to check volume

I'm currently developing a VoIP tool in Python working as a client-server application. My problem is that I'm currently sending the PyAudio input stream as follows, even when there is no sound (i.e. when nobody talks and there is no noise, data is sent anyway):
CHUNK = 1024

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                input=True,
                frames_per_buffer=CHUNK)

while 1:
    self.conn.sendVoice(stream.read(CHUNK))
I would like to check volume to get something like this :
data = stream.read(CHUNK)
if data.volume > 20%:
self.conn.sendVoice(data)
This way I could avoid sending useless data and save bandwidth / increase performance. (Also, I'm looking for some kind of compression, but I think I will have to ask about that in another topic.)
It can be done using the root mean square (RMS).
One way to build your own RMS function in Python is:
import math
import struct

def rms(data):
    # data is a byte string of 16-bit (2-byte) signed samples
    count = len(data) // 2
    format = "%dh" % (count)
    shorts = struct.unpack(format, data)

    sum_squares = 0.0
    for sample in shorts:
        n = sample * (1.0 / 32768)
        sum_squares += n * n

    return math.sqrt(sum_squares / count)
Another choice is to use audioop to find the RMS:
data = stream.read(CHUNK)
rms = audioop.rms(data,2)
Now, if you want, you can convert the RMS to the decibel scale: decibel = 20 * log10(rms).
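Putting the two together, a minimal sketch of gating the send loop on loudness (the 20% figure and the conn.sendVoice() call come from the question; the exact threshold is an assumption and will usually need tuning):
import audioop
import pyaudio

CHUNK = 1024
THRESHOLD = 0.20 * 32767  # 20% of full scale for 16-bit samples

def stream_and_send(conn):
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=44100,
                    input=True, frames_per_buffer=CHUNK)
    while True:
        data = stream.read(CHUNK)
        if audioop.rms(data, 2) > THRESHOLD:  # 2 = sample width in bytes
            conn.sendVoice(data)  # only send when loud enough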

Detect duplicate videos from YouTube

As part of my M.Tech project, I want to know if there is any algorithm to detect duplicate videos on YouTube.
For example (here are links of two videos):
random user upload
upload by official channel
Among these, the second is the official video, and T-Series holds its copyright.
Is YouTube officially doing anything to remove duplicate videos?
Not only videos; duplicate YouTube channels exist as well.
Sometimes the original video has fewer views than the pirated version.
So, while searching, I found this
(see page 49 of the PDF).
What I learnt from the given link:
A classifier is used to detect the original vs. copyright-infringing videos.
Given a query, the top k search results are first retrieved. Thereafter, three parameters are used to classify the videos:
Number of subscribers
User profile
Username popularity
and on the basis of these parameters, the original video is identified as described in the link.
EDIT 1:
There are basically two different objectives:
To identify the original video with the above method
To eliminate the duplicate videos
Obviously, identifying the original video is easier than finding all the duplicate videos.
So I preferred to first find the original video.
The approach I can think of so far to improve the accuracy:
We can first find the original video with the above method.
Then use the most widely publicized frames (maybe multiple) of that video to search on Google Images. This method therefore retrieves a list of duplicate videos from the Google Images search results.
After getting these duplicate videos, we can once again check frame by frame and reach a level of satisfaction (yes, the retrieved videos are "exact" or "almost" duplicate copies of the original video).
Will this approach work?
If not, is there a better algorithm to improve upon the given method?
Please write in the comments section if I have not explained my approach clearly.
I will soon add some more details.
I've recently hacked together a small tool for that purpose. It's still work in progress but usually pretty accurate. The idea is to simply compare time between brightness maxima in the center of the video. Therefore it should work with different resolutions, frame rates and rotation of the video.
ffmpeg is used for decoding, imageio as bridge to python, numpy/scipy for maxima computation and some k-nearest-neighbor library (annoy, cyflann, hnsw) for comparison.
At the moment it's not polished at all, so you should know a little Python to run it, or simply copy the idea.
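For the record, a rough sketch of the idea (the centre-crop size and peak-detection settings are assumptions; it needs imageio with the ffmpeg plugin, numpy and scipy installed): extract the times between brightness maxima in the centre of each video and use those gaps as a resolution- and rotation-tolerant signature that can then be compared with a nearest-neighbour search.
import imageio.v2 as imageio  # decoding via the ffmpeg plugin
import numpy as np
from scipy.signal import find_peaks

def brightness_signature(path):
    reader = imageio.get_reader(path)
    fps = reader.get_meta_data()["fps"]
    brightness = []
    for frame in reader:
        h, w = frame.shape[:2]
        centre = frame[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
        brightness.append(centre.mean())  # average brightness of the centre crop
    # Times between brightness maxima, at most one peak per second.
    peaks, _ = find_peaks(np.asarray(brightness), distance=max(1, int(fps)))
    return np.diff(peaks) / fps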
I had the same problem too, so I wrote a program myself.
The problem is that I had videos of various formats and resolutions, so I needed to take a hash of each video frame and compare them.
https://github.com/gklc811/duplicate_video_finder
You can just change the directories at the top and you are good to go.
from os import path, walk, makedirs, rename
from time import perf_counter as clock  # time.clock was removed in Python 3.8
from imagehash import average_hash
from PIL import Image
from cv2 import VideoCapture, CAP_PROP_FRAME_COUNT, CAP_PROP_FRAME_WIDTH, CAP_PROP_FRAME_HEIGHT, CAP_PROP_FPS
from json import dump, load
from multiprocessing import Pool, cpu_count
input_vid_dir = r'C:\Users\gokul\Documents\data\\'
json_dir = r'C:\Users\gokul\Documents\db\\'
analyzed_dir = r'C:\Users\gokul\Documents\analyzed\\'
duplicate_dir = r'C:\Users\gokul\Documents\duplicate\\'

if not path.exists(json_dir):
    makedirs(json_dir)

if not path.exists(analyzed_dir):
    makedirs(analyzed_dir)

if not path.exists(duplicate_dir):
    makedirs(duplicate_dir)


def write_to_json(filename, data):
    file_full_path = json_dir + filename + ".json"
    with open(file_full_path, 'w') as file_pointer:
        dump(data, file_pointer)
    return
def video_to_json(filename):
    file_full_path = input_vid_dir + filename
    start = clock()
    size = round(path.getsize(file_full_path) / 1024 / 1024, 2)
    video_pointer = VideoCapture(file_full_path)
    frame_count = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_COUNT)))
    width = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_WIDTH)))
    height = int(VideoCapture.get(video_pointer, int(CAP_PROP_FRAME_HEIGHT)))
    fps = int(VideoCapture.get(video_pointer, int(CAP_PROP_FPS)))

    success, image = video_pointer.read()

    video_hash = {}
    while success:
        frame_hash = average_hash(Image.fromarray(image))
        video_hash[str(frame_hash)] = filename
        success, image = video_pointer.read()

    stop = clock()
    time_taken = stop - start

    print("Time taken for ", file_full_path, " is : ", time_taken)

    data_dict = dict()
    data_dict['size'] = size
    data_dict['time_taken'] = time_taken
    data_dict['fps'] = fps
    data_dict['height'] = height
    data_dict['width'] = width
    data_dict['frame_count'] = frame_count
    data_dict['filename'] = filename
    data_dict['video_hash'] = video_hash

    write_to_json(filename, data_dict)
    return
def multiprocess_video_to_json():
    files = next(walk(input_vid_dir))[2]
    processes = cpu_count()
    print(processes)
    pool = Pool(processes)
    start = clock()
    pool.starmap_async(video_to_json, zip(files))
    pool.close()
    pool.join()
    stop = clock()
    print("Time Taken : ", stop - start)


def key_with_max_val(d):
    max_value = 0
    required_key = ""
    for k in d:
        if d[k] > max_value:
            max_value = d[k]
            required_key = k
    return required_key
def duplicate_analyzer():
    files = next(walk(json_dir))[2]
    data_dict = {}
    for file in files:
        filename = json_dir + file
        with open(filename) as f:
            data = load(f)
        video_hash = data['video_hash']
        count = 0
        duplicate_file_dict = dict()
        for key in video_hash:
            count += 1
            if key in data_dict:
                if data_dict[key] in duplicate_file_dict:
                    duplicate_file_dict[data_dict[key]] = duplicate_file_dict[data_dict[key]] + 1
                else:
                    duplicate_file_dict[data_dict[key]] = 1
            else:
                data_dict[key] = video_hash[key]
        if duplicate_file_dict:
            duplicate_file = key_with_max_val(duplicate_file_dict)
            duplicate_percentage = (duplicate_file_dict[duplicate_file] / count) * 100
            if duplicate_percentage > 50:
                file = file[:-5]
                print(file, " is dup of ", duplicate_file)
                src = analyzed_dir + file
                tgt = duplicate_dir + file
                if path.exists(src):
                    rename(src, tgt)
                # else:
                #     print("File already moved")
def mv_analyzed_file():
    files = next(walk(json_dir))[2]
    for filename in files:
        filename = filename[:-5]
        src = input_vid_dir + filename
        tgt = analyzed_dir + filename
        if path.exists(src):
            rename(src, tgt)
        # else:
        #     print("File already moved")


if __name__ == '__main__':
    mv_analyzed_file()
    multiprocess_video_to_json()
    mv_analyzed_file()
    duplicate_analyzer()

Viterbi Block Decoding

I don't know if this fits here but here goes:
I have some noisy data encoded using a Hamming Block Code and I'd like to decode them using a Viterbi Decoder.
I did my homework, so I know how the Viterbi block decoder works, but I'd like to avoid implementing it all by myself as it would take quite some time and might be suboptimal.
My question is the following: do you know of any matlab function for Viterbi Block Decoder?
I found comm.ViterbiDecoder but it is only for convolutional encoding.
In the meantime, I tried to retrieve as much information as I could from the encoder, in case it is needed in the future (see the code below).
% Parameters
codewordLength = 31;
messageLength = 26;
data = randi([0, 1], 1, messageLength);

% Encoder 1
encoder1 = comm.BCHEncoder(...
    'CodewordLength', codewordLength, ...
    'MessageLength', messageLength);

% Encoder 2
M = ceil(log2(codewordLength+1));
primitivePolynomialDe = primpoly(M, 'nodisplay');
primitivePolynomialBi = fliplr(de2bi(primitivePolynomialDe));
SL = (2^M-1) - codewordLength;
generatorPolynomial = bchgenpoly(codewordLength + SL, ...
    messageLength + SL, primitivePolynomialDe);
encoder2 = comm.BCHEncoder(...
    'CodewordLength', codewordLength, ...
    'MessageLength', messageLength, ...
    'PrimitivePolynomialSource', 'Property', ...
    'PrimitivePolynomial', primitivePolynomialBi, ...
    'GeneratorPolynomialSource', 'Property', ...
    'GeneratorPolynomial', generatorPolynomial);

dataEncoded1 = step(encoder1, data.').';
dataEncoded2 = step(encoder2, data.').';
sum(dataEncoded1~=dataEncoded2)
