I'm trying to fit a model using fitDataset(). I can train using the "normal" approach, with a for loop and getting random batches of data (20000 data points).
I'd like to use the fitDataset() and be able to use the entire dataset and not rely on "randomness" of my getBatch function.
I'm getting closer, using the API docs and the example on tfjs-data but, i'm stuck on a probably dumb data manipulation...
So here's how i'm doing it:
const [trainX, trainY] = await bigData
const model = await cnnLSTM // gru performing well
const BATCH_SIZE = 32
const dataSet = flattenDataset(trainX.slice(200), trainY.slice(200))
model.compile({
loss: 'categoricalCrossentropy',
optimizer: tf.train.adam(0.001),
metrics: ['accuracy']
})
await model.fitDataset(dataSet.train.batch(32), {
epochs: C.trainSteps,
validationData: dataSet.validation,
callbacks: {
onBatchEnd: async (batch, logs) => (await tf.nextFrame()),
onEpochEnd: (epoch, logs) => {
let i = epoch + 1
lossValues.push({'epoch': i, 'loss': logs.loss, 'val_loss': logs.val_loss, 'set': 'train'})
accuracyValues.push({'epoch': i, 'accuracy': logs.acc, 'val_accuracy': logs.val_acc, 'set': 'train'})
// await md `${await plotLosses(train.lossValues)} ${await plotAccuracy(train.accuracyValues)}`
}
}
})
here's my interpretation of the dataset creation:
flattenDataset = (features, labels, split = 0.35) => {
return tf.tidy(() => {
let slice =features.length - Math.floor(features.length * split)
const featuresTrain = features.slice(0, slice)
const featuresVal = features.slice(slice)
const labelsTrain = labels.slice(0, slice)
const labelsVal = labels.slice(slice)
const data = {
train: tf.data.array(featuresTrain, labelsTrain),
validation: tf.data.array(featuresVal, labelsVal)
}
return data
})
}
I'm getting an error:
Error: Dataset iterator for fitDataset() is expected to generate an Array of length 2: `[xs, ys]`, but instead generates Tensor
[[0.4106583, 0.5408, 0.4885066, 0.9021732, 0.1278526],
[0.3711334, 0.5141, 0.4848816, 0.9021571, 0.2688071],
[0.4336613, 0.5747, 0.4822159, 0.9021728, 0.3694479],
...,
[0.4123166, 0.4553, 0.478438 , 0.9020132, 0.8797594],
[0.3963479, 0.3714, 0.4871198, 0.901996 , 0.7170534],
[0.4832076, 0.3557, 0.4892016, 0.9019232, 0.9999322]],Tensor
[[0.3711334, 0.5141, 0.4848816, 0.9021571, 0.2688071],
[0.4336613, 0.5747, 0.4822159, 0.9021728, 0.3694479],
[0.4140858, 0.5985, 0.4789927, 0.9022084, 0.1912155],
...,
The input data is 6 timesteps with 5 dimensions and the labels are just one-hot encoded classes [0,0,1], [0,1,0] and [1, 0, 0]. I guess the flattenDataset() is not sending the data in the correct way.
Does data.train needs to output for each data point [6 timesteps with 5 dims, label] ? I get this error when i tried that:
Error: The feature data generated by the dataset lacks the required input key 'conv1d_Conv1D5_input'.
Could really use some pro insight...
--------------------
Edit #1:
I feel i'm close to an answer.
const X = tf.data.array(trainX.slice(0, 100))//.map(x => x)
const Y = tf.data.array(trainY.slice(0, 100))//.map(x => x)
const zip = tf.data.zip([X, Y])
const dataSet = {
train: zip
}
dataSet.train.forEach(x => console.log(x))
With this i get on the console:
[Array(6), Array(3)]
[Array(6), Array(3)]
[Array(6), Array(3)]
...
[Array(6), Array(3)]
[Array(6), Array(3)]
but the fitDataset is giving me: Error: The feature data generated by the dataset lacks the required input key 'conv1d_Conv1D5_input'.
my model look like this:
const model = tf.sequential()
model.add(tf.layers.conv1d({
inputShape: [6, 5],
kernelSize: (3),
filters: 64,
strides: 1,
padding: 'same',
activation: 'elu',
kernelInitializer: 'varianceScaling',
}))
model.add(tf.layers.maxPooling1d({poolSize: (2)}))
model.add(tf.layers.conv1d({
kernelSize: (1),
filters: 64,
strides: 1,
padding: 'same',
activation: 'elu'
}))
model.add(tf.layers.maxPooling1d({poolSize: (2)}))
model.add(tf.layers.lstm({
units: 18,
activation: 'elu'
}))
model.add(tf.layers.dense({units: 3, activation: 'softmax'}))
model.compile({
loss: 'categoricalCrossentropy',
optimizer: tf.train.adam(0.001),
metrics: ['accuracy']
})
return model
What is wrong here?
What model.fitDataset expects are a Dataset, each element inside this dataset is a tuple of two items, [feature, label].
So in your case, you need to create featureDataset and labelDataset, then merge then with tf.data.zip to create trainDataset. Same for validation dataset.
Solved it
so after a lot of trial an error i found a way to make it work.
So, i had an input shape of [6, 5], meaning an array with 6 arrays of 5 floats each.
[[[0.3467378, 0.3737, 0.4781905, 0.90665, 0.68142351],
[0.44003019602788285, 0.3106, 0.4864576, 0.90193448, 0.5841830879700972],
[0.30672944860847245, 0.3404, 0.490295674, 0.90720676, 0.8331748581920732],
[0.37475716007758336, 0.265, 0.4847249, 0.902056932, 0.6611207914113887],
[0.5639427928616854, 0.2423002, 0.483168235, 0.9020202294447865, 0.82823],
[0.41581425627336555, 0.4086, 0.4721923, 0.902094287, 0.914699]], ... 20k more]
What i did was to flatten the array becoming an array of 5 dimensions arrays. Then applied the .batch(6) to it.
const BATCH_SIZE = 20 //batch size fed to the NN
const X = tf.data.array([].concat(...trainX)).batch(6).batch(BATCH_SIZE)
const Y = tf.data.array(trainY).batch(BATCH_SIZE)
const zip = tf.data.zip([X, Y])
const dataSet = {
train: zip
}
Hope it can help others on complex data!!
Related
I am new to transformer based models. I am trying to fine-tune the following model (https://huggingface.co/Chramer/remote-sensing-distilbert-cased) on my dataset. The code:
enter image description here
and I got the following error:
enter image description here
I will be thankful if anyone could help.
The preprocessing steps I followed:
input_ids_t = []
attention_masks_t = []
for sent in df_train['text_a']:
encoded_dict = tokenizer.encode_plus(
sent,
add_special_tokens = True,
max_length = 128,
pad_to_max_length = True,
return_attention_mask = True,
return_tensors = 'tf',
)
input_ids_t.append(encoded_dict['input_ids'])
attention_masks_t.append(encoded_dict['attention_mask'])
# Convert the lists into tensors.
input_ids_t = tf.concat(input_ids_t, axis=0)
attention_masks_t = tf.concat(attention_masks_t, axis=0)
labels_t = np.asarray(df_train['label'])
and i did the same for testing data. Then:
train_data = tf.data.Dataset.from_tensor_slices((input_ids_t,attention_masks_t,labels_t))
and the same for testing data
It sounds like you are feeding the transformer_model 1 input instead of 3. Try removing the square brackets around transformer_model([input_ids, input_mask, segment_ids])[0] so that it reads transformer_model(input_ids, input_mask, segment_ids)[0]. That way, the function will have 3 arguments and not just 1.
Error message
Traceback (most recent call last):
File "pred.py", line 134, in
output = model(data)
Runtime Error: Expected 4-dimensional input for 4-dimensional weight [16, 3, 3, 3], but got 3-dimensional input of size [1, 32, 32] instead.
Prediction code
normalize = transforms.Normalize(mean=[0.4914, 0.4824, 0.4467],
std=[0.2471, 0.2435, 0.2616])
train_set = transforms.Compose([
transforms.RandomCrop(32, padding=4),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
normalize,
])
model = models.condensenet(args)
model = nn.DataParallel(model)
PATH = "results/savedir/save_models/checkpoint_001.pth.tar"
model.load_state_dict(torch.load(PATH)['state_dict'])
device = torch.device("cpu")
model.eval()
image = Image.open("horse.jpg")
input = train_set(image)
train_loader = torch.utils.data.DataLoader(
input,
batch_size=1,shuffle=True, num_workers=1)
for i, data in enumerate(train_loader):
#input_var = torch.autograd.Variable(data, volatile=True)
#input_var = input_var.view(1, 3, 32,32)
**output = model(data)
topk=(1,5)
maxk = max(topk)
_, pred = output.topk(maxk, 1, True, True)
Am getting this error when am trying to predict on a single image
Image shape/size error message
Link to saved model
Training code repository
Plz uncomment this line #input_var = input_var.view(1, 3, 32,32) so that your input dimension is 4.
I assume that your no. of input channels are 3 if its one then use input_var = input_var.view(1, 1, 32,32) if gray scale
Instead of doing the for loop and train_loader, solved this by just passing the input directly into the model. like this
input = train_set(image)
input = input.unsqueeze(0)
model.eval()
output = model(input)
More details can be found here link
I got a simple code form tutorial and output it to .pb file as below:
mnist_softmax_train.py
x = tf.placeholder("float", shape=[None, 784], name='input_x')
y_ = tf.placeholder("float", shape=[None, 10], name='input_y')
W = tf.Variable(tf.zeros([784, 10]), name='W')
b = tf.Variable(tf.zeros([10]), name='b')
tf.initialize_all_variables().run()
y = tf.nn.softmax(tf.matmul(x,W)+b, name='softmax')
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy, name='train_step')
train_step.run(feed_dict={x:input_x, y_:input_y})
In C++, I load the same graph, and feed in fake data for testing:
Tensor input_x(DT_FLOAT, TensorShape({10,784}));
Tensor input_y(DT_FLOAT, TensorShape({10,10}));
Tensor W(DT_FLOAT, TensorShape({784,10}));
Tensor b(DT_FLOAT, TensorShape({10,10}));
Tensor input_test_x(DT_FLOAT, TensorShape({1,784}));
for(int i=0;i<10;i++){
for(int j=0;j<10;j++)
input_x.matrix<float>()(i,i+j) = 1.0;
input_y.matrix<float>()(i,i) = 1.0;
input_test_x.matrix<float>()(0,i) = 1.0;
}
std::vector<std::pair<string, tensorflow::Tensor>> inputs = {
{ "input_x", input_x },
{ "input_y", input_y },
{ "W", W },
{ "b", b },
{ "input_test_x", input_test_x },
};
std::vector<tensorflow::Tensor> outputs;
status = session->Run(inputs, {}, {"train_step"}, &outputs);
std::cout << outputs[0].DebugString() << "\n";
However, this fails with the error:
Invalid argument: Input 0 of node train_step/update_W/ApplyGradientDescent was passed float from _recv_W_0:0 incompatible with expected float_ref.
The graph runs correctly in Python. How can I run it correctly in C++?
The issue here is that you are running the "train_step" target, which performs much more work than just inference. In particular, it attempts to update the variables W and b with the result of the gradient descent step. The error message
Invalid argument: Input 0 of node train_step/update_W/ApplyGradientDescent was passed float from _recv_W_0:0 incompatible with expected float_ref.
...means that one of the nodes you attempted to run ("train_step/update_W/ApplyGradientDescent") expected a mutable input (with type float_ref) but it got an immutable input (with type float) because the value was fed in.
There are (at least) two possible solutions:
If you only want to see predictions for a given input and given weights, fetch "softmax:0" instead of "train_step" in the call to Session::Run().
If you want to perform training in C++, do not feed W and b, but instead assign values to those variables, then continue to execute "train_step". You may find it easier to create a tf.train.Saver when you build the graph in Python, and then invoke the operations that it produces to save and restore values from a checkpoint.
I am trying to periodically calculate complex top score for all items in post table.
const {log10, max, abs, round} = Math;
const topScore = post => { // from Reddit
const {score, createdAt} = post;
const order = log10(max(abs(score), 1));
const sign = score > 0 ? 1 : (score < 0 ? -1 : 0);
const seconds = Date.now() - createdAt;
return sign * order + seconds / 45000;
};
With the above function, I want to perform something like this:
// Update topScore every 60 seconds.
setInterval(() =>
r.table('post').update(post => post.topScore = topScore(post)).run();
, 60000);
How do I do this with RethinkDB javascript driver?
You can write r.table('post').update(r.js('(function(post) { ... })'), {nonAtomic: true}) where ... is arbitrary JS code. Otherwise you'd either have to translate that code into ReQL or pull down the documents to your client, update them, and then write them back to the server.
I haven't found a solution with data set up quite like mine...
var marketshare = [
{"store": "store1", "share": "5.3%", "q1count": 2, "q2count": 4, "q3count": 0},
{"store": "store2","share": "1.9%", "q1count": 5, "q2count": 10, "q3count": 0},
{"store": "store3", "share": "2.5%", "q1count": 3, "q2count": 6, "q3count": 0}
];
Code so far, returning undefined...
var minDataPoint = d3.min( d3.values(marketshare.q1count) ); //Expecting 2 from store 1
var maxDataPoint = d3.max( d3.values(marketshare.q2count) ); //Expecting 10 from store 2
I'm a little overwhelmed by d3.keys, d3.values, d3.maps, converting to array, etc. Any explanations or nudges would be appreciated.
I think you're looking for something like this instead:
d3.min(marketshare, function(d){ return d.q1count; }) // => 2.
You can pass an accessor function as the second argument to d3.min/d3.max.