Training logs are not being printed in LightGBM in Jupyter - lightgbm

I am trying to train a simple LightGBM model on a MacBook, but it is not printing any logs even when the verbose parameter is set to 1 (or even greater than 1):
import lightgbm as lgb

param = {'num_leaves': 50, 'num_trees': 500, 'learning_rate': 0.01,
         'feature_fraction': 1.0, 'tree_learner': 'serial',
         'objective': 'cross_entropy', 'verbose': 1,
         'metric': 'kullback_leibler', 'is_training_metric': True}
model = lgb.train(param, train_data_lgbm)
I also set is_training_metric to True, as per another suggestion on GitHub, but that didn't fix it either. Can someone help with what I might be missing?
EDIT: I was running this code in a Jupyter notebook. When I tried the same thing in a terminal, it worked.
Can someone help with why I am not seeing logs in the Jupyter notebook?

I don't know what kind of log you want, but in my case (lightgbm 2.2.3, on Colab rather than a Jupyter notebook, though), adding the valid_sets parameter to the train method produced a log loss as shown below.
model = lgb.train(param,
                  train_data_lgbm,
                  valid_sets=[train_data_lgbm])
[1] training's xentropy: 0.606795.
[2] training's xentropy: 0.579697.
[3] training's xentropy: 0.513748.
[4] training's xentropy: 0.494762.
....
If you want to produce a log loss for an evaluation set as well, you can display it in the following way:
eval_data_lgb = lgb.Dataset(X_test, y_test, reference=train_data_lgbm)
model = lgb.train(param,
                  train_data_lgbm,
                  valid_sets=[train_data_lgbm,
                              eval_data_lgb])
[1] training's xentropy: 0.606795 valid_1's xentropy: 0.60837.
[2] training's xentropy: 0.579697 valid_1's xentropy: 0.582659.
[3] training's xentropy: 0.513748 valid_1's xentropy: 0.517523.
[4] training's xentropy: 0.494762 valid_1's xentropy: 0.499277.
....
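A possible explanation for the notebook/terminal difference (my assumption, not verified on the asker's setup): LightGBM's iteration logs are written by its C++ core directly to the process's standard output, which a Jupyter kernel does not capture, so they appear in the terminal where the kernel was started rather than in the notebook. Newer LightGBM versions route logging through Python and request per-iteration output with a callback; a minimal sketch, assuming a LightGBM version (>= 3.3) where lgb.log_evaluation exists:

import lightgbm as lgb

model = lgb.train(param,
                  train_data_lgbm,
                  valid_sets=[train_data_lgbm],
                  # print the training metric every iteration via Python's
                  # stdout, which the notebook does capture
                  callbacks=[lgb.log_evaluation(period=1)])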

Related

train_step in Custom training class does not work when upgrading my MacBook from AMD to M1

I am trying to do a simple custom training loop.
For some reason, the train_step() function gets ignored and a normal training loop is carried out instead; I noticed this because the word "Hello" is not printed when I run the script on my MacBook Pro with the M1 chip. My old MacBook Pro (AMD) works perfectly, and the code also worked perfectly on Google Colab.
The TensorFlow version is 2.0.0 and Keras is 2.3.1. Thanks in advance for your help.
My code is:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Model

# tf.config.run_functions_eagerly(True)

class CustomModel(Model):
    # @tf.function
    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x, y = data
        print('Hello')
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)
        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)
        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}

# Construct and compile an instance of CustomModel
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = CustomModel(inputs, outputs)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Just use `fit` as usual
x = np.random.random((1000, 32))
y = np.random.random((1000, 1))
model.fit(x, y, epochs=3)
On the M1 MacBook Pro, "Hello" is not printed; on the AMD machine (and on Colab), it is printed during training.
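Two things worth checking (assumptions on my part, not from the original post): overriding Model.train_step is only honored from TensorFlow 2.2 onward, so an environment that really runs 2.0.x silently falls back to the built-in loop; and once train_step is compiled into a graph, a plain Python print only fires while the function is traced, whereas tf.print fires on every step. A minimal sketch:

import tensorflow as tf

print(tf.__version__)  # Model.train_step overrides require TF >= 2.2

class CustomModel(tf.keras.Model):
    def train_step(self, data):
        # tf.print executes at graph run time, unlike a Python print,
        # which only runs while the function is being traced
        tf.print("Hello")
        return super().train_step(data)  # reuse the stock training step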

How to save an Array{UInt16,2} to an image in Julia

I have an Array{UInt16,2} in Julia of size 5328×3040. I want to save it to a PNG image.
I tried the following:
save("gray.png", colorview(Gray, img))
But got the following error:
ERROR: TypeError: in Gray, in T, expected T<:Union{Bool, AbstractFloat, FixedPoint}, got Type{UInt16}
Stacktrace:
[1] ccolor_number at C:\Users\ankushar\.julia\packages\ImageCore\KbJyT\src\convert_reinterpret.jl:60 [inlined]
[2] ccolor_number at C:\Users\ankushar\.julia\packages\ImageCore\KbJyT\src\convert_reinterpret.jl:57 [inlined]
[3] colorview(::Type{Gray}, ::Array{UInt16,2}) at C:\Users\ankushar\.julia\packages\ImageCore\KbJyT\src\colorchannels.jl:104
[4] top-level scope at REPL[16]:1
caused by [exception 3]
IOError: symlink: operation not permitted (EPERM)
I am using Julia 1.4.2.
Can you suggest a good way to store these arrays as images in Julia?
TIA!
You can normalize the pixel values before saving.
using Images

img = rand(UInt16, 10, 20)
img[1:3]
# => 3-element Array{UInt16,1}:
#     0x7fc2
#     0x057e
#     0xae79

gimg = colorview(Gray, img ./ typemax(UInt16))
gimg[1:3] |> channelview
# => 3-element reinterpret(Float64, ::Array{Gray{Float64},1}):
#     0.4990615701533532
#     0.02145418478675517
#     0.6815442130159457

save("gray.png", gimg)
A faster and more accurate solution is to reinterpret your array as an array of N0f16, a type from FixedPointNumbers that is essentially a UInt16 scaled to the range 0–1. This both avoids rounding errors and avoids making a copy.
using Images, FixedPointNumbers

img = rand(UInt16, 10, 20)
save("gray.png", colorview(Gray, reinterpret(N0f16, img)))

Nokogiri - parsing multi-line `<link>` tag as link and text

I am using Nokogiri to parse the XML response of a podcast's RSS feed, and I am trying to grab a particular piece of data: the link to each episode.
The relevant bit is below:
<item>
  <title>An awesome title!</title>
  ...
  <link>
    http://www.foobar.com/episodes/1
  </link>
</item>
Nokogiri appears to be having a hard time grabbing the <link> tag, though: I am able to get the <item> tag as a Nokogiri::Node object, and I can grab the title just fine with node.css('title').text, but when I try the same with node.css('link').text, I get a blank string.
I called node.children.to_a to examine all of the children of this node, and I noticed something odd: the text inside the <link> tag is being parsed as a separate sibling node:
[0] = {Nokogiri::XML::Element} <title>An awesome title!</title>\n
[1] = {Nokogiri::XML::Element} <link>
[2] = {Nokogiri::XML::Text} http://www.foobar.com/episodes/1\n
Is there a way I can help Nokogiri properly parse this multi-line tag so that I can grab the text inside?
UPDATE: Here is the exact code I'm executing when I run into the issue.
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(open('https://rss.acast.com/abroadinjapan')) # Returns Nokogiri::HTML::Document
node = doc.css('//item').first # Returns Nokogiri::XML::Element
node.css('title').text # Returns "Abroad in Japan: Two weeks more in Japan!"
node.css('link').text # Returns ""
node.css('link').inner_text # Also returns "" - saw this elsewhere and thought I'd try it
node.children.to_a # Result, parsed by RubyMine for readability:
result = Array (14 elements)
[0] = {Nokogiri::XML::Element} <title>Abroad in Japan: Two weeks more in Japan!</title>\n
[1] = {Nokogiri::XML::Element} <subtitle>Chris and Pete return and they've planned out a very different route through Northern Japan.&nbsp;\n\n\nOur Google Map can be found here:&nbsp;\ngoo.gl/3t4t3q&nbsp;\n\n\nGet in touch:&nbsp;abroadinjapanpodcast@gmail.com&nbsp;\nMore Abr...</subtitle>
[2] = {Nokogiri::XML::Element} <summary></summary>
[3] = {Nokogiri::XML::Element} <guid ispermalink="false"></guid>
[4] = {Nokogiri::XML::Element} <pubdate>Wed, 16 May 2018 21:00:00 GMT</pubdate>
[5] = {Nokogiri::XML::Element} <duration>01:00:00</duration>
[6] = {Nokogiri::XML::Element} <keywords></keywords>
[7] = {Nokogiri::XML::Element} <explicit>no</explicit>
[8] = {Nokogiri::XML::Element} <episodetype>full</episodetype>
[9] = {Nokogiri::XML::Element} <image href="https://imagecdn.acast.com/image?h=1500&w=1500&source=https%3A%2F%2Fmediacdn.acast.com%2Fassets%2Fcb30d29f-7342-46f0-a649-12f1b4e601f7%2Fcover-image-jgyt2ecc-japan.jpg"></image>
[10] = {Nokogiri::XML::Element} <description>Chris and Pete return and they've planned out a very different route through Northern Japan. <p><br></p>\n<p>Our Google Map can be found here: </p>\n<p>goo.gl/3t4t3q </p>\n<p><br></p>\n<p>Get in touch: abroadinjapanpodcast@gmail.com </p>\n<p>More Abroad In Japan shows available below, do subscribe, rate and review us on iTunes, and please tell your friends! </p>\n<p><br></p>\n<p>http://www.radiostakhanov.com/abroadinjapan/</p>]]></description>
[11] = {Nokogiri::XML::Element} <link>
[12] = {Nokogiri::XML::Text} https://www.acast.com/abroadinjapan/abroadinjapan-twoweeksmoreinjapan-\n
[13] = {Nokogiri::XML::Element} <enclosure url="https://media.acast.com/abroadinjapan/abroadinjapan-twoweeksmoreinjapan-/media.mp3" length="28806528" type="audio/mpeg"></enclosure>
NOTE: One of the URLs above uses a URL shortener, which SO doesn't like, so I replaced it with foobar.com.
The fix is a lot simpler than you would think. An RSS feed is not valid HTML: in HTML, <link> is a void element, so Nokogiri's HTML parser closes it immediately and leaves the URL behind as a sibling text node, which is exactly what your node.children dump shows. Parse the document as XML instead:
doc = Nokogiri::XML(open('...'))
Ruby also has a module named RSS, which might be better suited for something like this:
require 'rss'
doc = RSS::Parser.parse(open('...'))
doc.items.first.link
=> "https://...."

Uuencode vs Base64 encode: why do pack('m') and pack('u') in Ruby return strings of different lengths?

According to the specs they should be the same length: a string of length 36 should translate to a string of length 48 in both cases. For example:
bin = "123456789012345678901234567890123456"
[49] pry(main)> [bin].pack("m").length
=> 49
[50] pry(main)> [bin].pack("u").length
=> 50
[54] pry(main)> [bin].pack("m")
=> "MTIzNDU2Nzg5MDEyMzQ1Njc4OTAxMjM0NTY3ODkwMTIzNDU2\n"
[55] pry(main)> [bin].pack("u")
=> "D,3(S-#4V-S#Y,\#$R,S0U-C<X.3`Q,C,T-38W.#DP,3(S-#4V\n"
Compensating for the "funny newline", we get the proper length from the Base64 encoding (the pack('m') variant), but I don't know how to get the line length right in the uuencoding (the pack('u') variant).
I really need that uuencoded string to be 48 chars long :) What's the issue here?
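For reference, the two extra characters are line framing: the Base64 result is 48 data characters plus a trailing newline, and a uuencode line additionally starts with a length character, chr(32 + 36) = 'D'. A quick cross-check with Python's binascii (an illustration added here, not part of the original question):

import binascii

data = b"123456789012345678901234567890123456"  # 36 bytes
len(binascii.b2a_base64(data))  # => 49: 48 data chars + "\n"
len(binascii.b2a_uu(data))      # => 50: length char "D" + 48 data chars + "\n"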
Update
I did my own uuencode implementation: I created a method that generates a bitmap, then split the bitmap into 6-bit groups and mapped them, as the provider of the specification helpfully explained in the spec.
def to_bitmap bytes
  bytes.scan(/./).map { |b| b.ord.to_s(2).rjust(8, "0") }.join
end

# from_bitmap is not shown in the original post; presumably the inverse:
def from_bitmap bits
  bits.to_i(2).chr
end

[5] pry(main)> to_bitmap(bin).scan(/.{6}/).map{ |b| (from_bitmap("00" + b).ord + 0x20).chr }.join
=> ",3(S-#4V-S@Y,\#$R,S0U-C<X.3 Q,C,T-38W.#DP,3(S-#4V"
[6] pry(main)> to_bitmap(bin).scan(/.{6}/).map{ |b| (from_bitmap("00" + b).ord + 0x20).chr }.join.length
=> 48
This looks right: it's almost exactly uuencode, but the two strings differ in a couple of places:

mine:      ,3(S-#4V-S@Y,\#$R,S0U-C<X.3 Q,C,T-38W.#DP,3(S-#4V
pack('u'): D,3(S-#4V-S@Y,\#$R,S0U-C<X.3`Q,C,T-38W.#DP,3(S-#4V\n
Weird. I guess the specification I'm implementing uses a "uuencode" that is not quite uuencode, even though they claim generic software libraries support this format. The visible differences all look like uuencode's line framing: pack('u') prepends the length character 'D' (= (32 + 36).chr), appends a newline, and encodes the value 0 as a backtick instead of a space. Am I missing something, or is this a workaround for somebody's nonstandard implementation of uuencode?

LGOCV caret train

Why does the initial part of the code below run, while the later part gives an error? I am learning data mining from a tutorial page and trying to understand how to perform cross-validation using the LGOCV option.
library(mlbench)
data(Sonar)
str(Sonar)
library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = 0.75, list = FALSE)
training <- Sonar[inTraining, ]
testing <- Sonar[-inTraining, ]
fitControl <- trainControl(## 10-fold CV
                           method = "repeatedcv",
                           number = 10,
                           ## repeated ten times
                           repeats = 10)
gbmGrid <- expand.grid(.interaction.depth = c(1, 5, 9),
                       .n.trees = (1:15) * 100,
                       .shrinkage = 0.1)
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 10,
                           ## Estimate class probabilities
                           classProbs = TRUE,
                           ## Evaluate performance using
                           ## the following function
                           summaryFunction = twoClassSummary)
set.seed(825)
gbmFit3 <- train(Class ~ ., data = training,
                 method = "gbm",
                 trControl = fitControl,
                 verbose = FALSE,
                 tuneGrid = gbmGrid,
                 ## Specify which metric to optimize
                 metric = "ROC")
gbmFit3
But when I run the later part below, I get an error:
datarow <- 1:nrow(training)
fitControl <- trainControl(method = "LGOCV",
                           summaryFunction = twoClassSummary,
                           classProbs = TRUE,
                           index = list(TrainSet = datarow),
                           savePredictions = TRUE)
gbmFit4 <- train(Class ~ ., data = training,
                 method = "gbm",
                 trControl = fitControl,
                 verbose = FALSE,
                 tuneGrid = gbmGrid,
                 ## Specify which metric to optimize
                 metric = "ROC")
My error is as follows:
Error in { :
task 1 failed - "arguments imply differing number of rows: 0, 1"
In addition: Warning messages:
1: In eval(expr, envir, enclos) :
predictions failed for TrainSet: interaction.depth=1, shrinkage=0.1, n.trees=1500 Error in 1:ncol(tmp) : argument of length 0
2: In eval(expr, envir, enclos) :
predictions failed for TrainSet: interaction.depth=5, shrinkage=0.1, n.trees=1500 Error in 1:ncol(tmp) : argument of length 0
3: In eval(expr, envir, enclos) :
predictions failed for TrainSet: interaction.depth=9, shrinkage=0.1, n.trees=150
session info:
sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel splines stats graphics grDevices utils datasets methods base
other attached packages:
[1] gbm_2.1 survival_2.37-4 mlbench_2.1-1 pROC_1.5.4 caret_5.17-7 reshape2_1.2.2
[7] plyr_1.8 lattice_0.20-15 foreach_1.4.1 cluster_1.14.4
loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.0.1 grid_3.0.1 iterators_1.0.6 stringr_0.6.2 tools_3.0.1
You also posted the same question on CrossValidated. We normally say to make very sure that you are not in error before looking for help, and only then contact the package author.
The problem is your use of datarow <- 1:nrow(training). With method = "LGOCV", the rows listed in index are used for model fitting and the remaining rows form the hold-out set, so by putting every row in index you are tuning the model on all of the instances and leaving nothing with which to compute the hold-out estimates.
I'm not really sure what you are trying to do.
Max
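To make the failure mode concrete, here is a minimal sketch of the split logic in Python (lgocv_holdout is a hypothetical helper for illustration, not part of caret):

def lgocv_holdout(n_rows, train_idx):
    # caret evaluates on the rows *not* listed in `index`; if the
    # training indices cover every row, the hold-out set is empty
    train = set(train_idx)
    return [i for i in range(n_rows) if i not in train]

print(lgocv_holdout(5, [0, 1, 2]))  # [3, 4] -> rows left to predict on
print(lgocv_holdout(5, range(5)))   # []     -> nothing to evaluate, as above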
