IndexError: invalid index to scalar variable. Sentiment Analysis

Cleaning price

# Convert the '$'-prefixed price strings in column 1 to floats, one cell at a time
pr = df.iloc[0, 1]
df.iloc[0, 1] = float(pr[1:])
pr2 = df.iloc[1, 1]
df.iloc[1, 1] = 0.0  # row 1 is set to 0.0 directly instead of being parsed
pr3 = df.iloc[2, 1]
df.iloc[2, 1] = float(pr3[1:])
pr4 = df.iloc[3, 1]
df.iloc[3, 1] = float(pr4[1:])
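This cell-by-cell slicing raises exactly the IndexError in the title whenever a cell is already numeric: slicing a NumPy scalar such as np.float64 is invalid. A minimal whole-column sketch, assuming column 1 mixes strings like '$12.34' with already-numeric cells (the column position and format are assumptions, with df as in the snippet above):

import numpy as np
import pandas as pd

def to_price(value):
    # Strings like '$12.34' lose the '$'; numeric cells pass through.
    # Slicing a numeric cell (e.g. np.float64) is what raises
    # "IndexError: invalid index to scalar variable."
    if isinstance(value, str):
        return float(value.lstrip('$').replace(',', ''))
    return float(value)

df.iloc[:, 1] = df.iloc[:, 1].map(to_price)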

Related

How to speed up resample with a for loop?

I want to get the H1 RSI value for each M15 candle; this is how I do it. However, with more than 500,000 rows this is very time-consuming. Is there a better way? Note that resampling at each row is mandatory to get the correct result.
import talib
import pandas as pd
import numpy as np

def Data(df):
    df['RSI1'] = talib.RSI(df['close'], timeperiod=13)
    df['RSI2'] = talib.RSI(df['close'], timeperiod=21)
    return df

# len(df) > 555555
df = pd.read_csv('m15_candle.csv', parse_dates=['time'])  # parse dates so t.hour works below
for i in range(0, len(df)):
    t = df.at[i, 'time']
    if t.hour == 0 and t.minute == 0:
        df = df[i:]
        break
df = df.set_index('time')
ohlc = {
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last'
}
rsi1 = [0] * len(df)
rsi2 = [0] * len(df)
for i in range(100000, len(df)):
    h1 = Data(df[:i].resample("1h", offset=0).apply(ohlc).dropna())
    rsi1[i] = h1.iloc[-1]['RSI1']
    rsi2[i] = h1.iloc[-1]['RSI2']
df['RSI1_h1'] = rsi1
df['RSI2_h1'] = rsi2
df = df.reset_index()
df.to_csv("data.csv", index=False)
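One possible speed-up, sketched under assumptions rather than offered as a drop-in answer: instead of re-resampling the whole frame for every M15 row, maintain the hourly bars incrementally and recompute RSI only on the (roughly 4x shorter) hourly close series. Column names and the 'time' parsing follow the snippet above; note this version includes the current candle in the partial hourly bar, whereas df[:i] above excludes row i:

import numpy as np
import pandas as pd
import talib

df = pd.read_csv('m15_candle.csv', parse_dates=['time']).set_index('time')
hours = df.index.floor('h')

opens, highs, lows, closes = [], [], [], []
rsi1 = np.full(len(df), np.nan)
rsi2 = np.full(len(df), np.nan)
last_hour = None

for i, (hr, row) in enumerate(zip(hours, df.itertuples())):
    if hr != last_hour:
        # A new hourly bar opens with this M15 candle.
        opens.append(row.open)
        highs.append(row.high)
        lows.append(row.low)
        closes.append(row.close)
        last_hour = hr
    else:
        # Update the partial hourly bar in place.
        highs[-1] = max(highs[-1], row.high)
        lows[-1] = min(lows[-1], row.low)
        closes[-1] = row.close
    # RSI only needs the hourly closes. Trimming to e.g. closes[-500:]
    # is faster still, at the cost of a tiny difference, since Wilder
    # smoothing carries history.
    c = np.asarray(closes, dtype=float)
    rsi1[i] = talib.RSI(c, timeperiod=13)[-1]
    rsi2[i] = talib.RSI(c, timeperiod=21)[-1]

df['RSI1_h1'] = rsi1
df['RSI2_h1'] = rsi2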

Problem resetting a QTreeWidget

I am developing an application for placing orders with Python and Qt Designer. I can't manage to place two orders in a row: the first order goes through without any problem, but when I try to place another order without closing the application, this error is displayed:
"self.ui.treeWidgetcommand.topLevelItem(self.Line ).setText(0, str(Id))
AttributeError: 'NoneType' object has no attribute 'setText'"
def AddCommande(self):
    QtWidgets.QTreeWidgetItem(self.ui.treeWidgetcommande)
    Libelle = self.ui.comboBoxproduit.currentText()
    Qte = int(self.ui.lineEditQteproduit.text())
    Info = self.stock.GetProductName(Libelle)[0]
    Id = str(int(Info[0]))
    Pu = Info[1]
    Total = int(Qte) * int(Pu)
    data = (Libelle, Qte, Id, Pu, Total)
    # print(data)
    self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(0, str(Id))
    self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(1, str(Libelle))
    self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(2, str(Qte))
    self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(3, str(Pu))
    self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(4, str(Total))
    self.Ligne += 1

def ValiderCommande(self):
    Client = self.ui.comboBoxclient.currentText()
    IdClient = self.stock.GetClientIdByName(Client.split(" ")[0])
    PrixTotal = 0
    UniqueId = random.random()
    Date = date.today()
    Data = (IdClient, PrixTotal, Date, UniqueId)
    if self.stock.AddCommande(Data) == 0:
        for i in range(self.Ligne):
            IdCommande = self.stock.GetClientIdByUniqueId(UniqueId)
            Libelle = self.ui.treeWidgetcommande.topLevelItem(i).text(1)
            IdProduit = self.ui.treeWidgetcommande.topLevelItem(i).text(0)
            Pu = self.ui.treeWidgetcommande.topLevelItem(i).text(3)
            Qte = self.ui.treeWidgetcommande.topLevelItem(i).text(2)
            Total = int(self.ui.treeWidgetcommande.topLevelItem(i).text(4))
            InfoData = (IdCommande, Libelle, Qte, Pu, Total)
            data = (Qte, IdProduit)
            if self.stock.AjoutInfoCommande(InfoData) == 0:
                PrixTotal += Total
                self.stock.UpdateQteStock(data)
        if self.stock.UpdateCommande(PrixTotal, IdCommande) == 0:
            self.ui.treeWidgetcommande.clear()
            # self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setHidden(True)
            self.ui.lineEditQteproduit.setText(" ")
I would like, after placing an order, to reset my treeWidget and be able to place other orders.
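The error quoted above is consistent with clear() removing every top-level item while self.Ligne keeps its old value, so topLevelItem(self.Ligne) returns None on the next order. A likely fix, assuming self.Ligne is the row counter used in AddCommande, is to reset it right after clearing the widget:

if self.stock.UpdateCommande(PrixTotal, IdCommande) == 0:
    self.ui.treeWidgetcommande.clear()
    self.Ligne = 0  # reset the counter so the next AddCommande targets item 0
    self.ui.lineEditQteproduit.setText("")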

Is "insample" in mlr3tuning resampling can be used when we want to do hyperparameter tuning with the full dataset?

I've been trying to tune hyperparameters for a survival SVM model. I used the AutoTuner function from the mlr3tuning package. I want to tune on the whole dataset (no train/test split). I found the resampling class "insample"; the mlr3 dictionary says it "Uses all observations as training and as test set."
My question is: can "insample" resampling in mlr3tuning be used for hyperparameter tuning with the full dataset, and if so, why does passing the tuned hyperparameters to the survivalsvm function from the survivalsvm package give a different concordance index?
This is the code I used for hyperparameter tuning
veteran <- veteran
set.seed(1)
task = as_task_surv(x = veteran, time = 'time', event = 'status')
learner = lrn("surv.svm", type = "hybrid", diff.meth = "makediff3",
              gamma.mu = c(0.1, 0.1), kernel = 'rbf_kernel')
search_space = ps(gamma = p_dbl(2^-5, 2^5), mu = p_dbl(2^-5, 2^5))
search_space$trafo = function(x, param_set) {
  x$gamma.mu = c(x$gamma, x$mu)
  x$gamma = x$mu = NULL
  x
}
ssvm_at = AutoTuner$new(
  learner = learner,
  resampling = rsmp("insample"),
  search_space = search_space,
  measure = msr('surv.cindex'),
  terminator = trm('evals', n_evals = 5),
  tuner = tnr('grid_search'))
ssvm_at$train(task)
And this is the code I've been trying with the survivalsvm function from the survivalsvm package:
survsvm.reg <- survivalsvm(Surv(veteran$time, veteran$status) ~ .,
                           data = veteran,
                           type = "hybrid", gamma.mu = c(32, 32), diff.meth = "makediff3",
                           opt.meth = "quadprog", kernel = "rbf_kernel")
pred.survsvm.reg <- predict(survsvm.reg, veteran)
conindex(pred.survsvm.reg, veteran$time)

Understanding the distance metric in company name matching using KNN

I am trying to understand the following code, which I found for matching a messy list of company names against a clean list of company names. My question is what the 'Ratio' metric is calculated from. It appears that the ratio comes from scorer = fuzz.token_sort_ratio, which I understand is part of the fuzzywuzzy package and is therefore a Levenshtein-based calculation, correct? I'm trying to understand why the author uses this as the scorer rather than the distance output from the KNN. When I try changing the metric inside NearestNeighbors, it doesn't appear to change the results. Does the metric in NearestNeighbors matter, then?
Original article:
https://audhiaprilliant.medium.com/fuzzy-string-matching-optimization-using-tf-idf-and-knn-b07fce69b58f
import itertools
from typing import Tuple

import numpy as np
import pandas as pd
from fuzzywuzzy import fuzz, process
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

def build_vectorizer(
    clean: pd.Series,
    analyzer: str = 'char',
    ngram_range: Tuple[int, int] = (1, 4),
    n_neighbors: int = 1,
    **kwargs
) -> Tuple:
    # Create vectorizer
    vectorizer = TfidfVectorizer(analyzer=analyzer, ngram_range=ngram_range, **kwargs)
    X = vectorizer.fit_transform(clean.values.astype('U'))
    # Fit nearest neighbors corpus
    nbrs = NearestNeighbors(n_neighbors=n_neighbors, metric='cosine').fit(X)
    return vectorizer, nbrs

# String matching - KNN
def tfidf_nn(
    messy,
    clean,
    n_neighbors=1,
    **kwargs
):
    # Fit clean data and transform messy data
    vectorizer, nbrs = build_vectorizer(clean, n_neighbors=n_neighbors, **kwargs)
    input_vec = vectorizer.transform(messy)
    # Determine best possible matches
    distances, indices = nbrs.kneighbors(input_vec, n_neighbors=n_neighbors)
    nearest_values = np.array(clean)[indices]
    return nearest_values, distances

# String matching - match fuzzy
def find_matches_fuzzy(
    row,
    match_candidates,
    limit=5
):
    row_matches = process.extract(
        row, dict(enumerate(match_candidates)),
        scorer=fuzz.token_sort_ratio,
        limit=limit
    )
    result = [(row, match[0], match[1]) for match in row_matches]
    return result

# String matching - TF-IDF
def fuzzy_nn_match(
    messy,
    clean,
    column,
    col,
    n_neighbors=100,
    limit=5, **kwargs):
    nearest_values, _ = tfidf_nn(messy, clean, n_neighbors, **kwargs)
    results = [find_matches_fuzzy(row, nearest_values[i], limit) for i, row in enumerate(messy)]
    df = pd.DataFrame(itertools.chain.from_iterable(results),
                      columns=[column, col, 'Ratio'])
    return df

# String matching - Fuzzy
def fuzzy_tf_idf(
    df: pd.DataFrame,
    column: str,
    clean: pd.Series,
    mapping_df: pd.DataFrame,
    col: str,
    analyzer: str = 'char',
    ngram_range: Tuple[int, int] = (1, 3)
) -> pd.Series:
    # Create vectorizer
    clean = clean.drop_duplicates().reset_index(drop=True)
    messy_prep = df[column].drop_duplicates().dropna().reset_index(drop=True).astype(str)
    # preprocess_string is defined elsewhere in the article
    messy = messy_prep.apply(preprocess_string)
    result = fuzzy_nn_match(messy=messy, clean=clean, column=column, col=col, n_neighbors=1)
    # Map value from messy to clean
    return result
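To see what each score measures, a toy comparison may help: kneighbors returns the cosine distance between TF-IDF n-gram vectors (0 = identical n-gram profile), while 'Ratio' is fuzzywuzzy's token_sort_ratio, a Levenshtein-based 0-100 similarity computed only on the shortlisted candidates. The company names below are made up; the functions are the ones defined above:

import pandas as pd
from fuzzywuzzy import fuzz

clean = pd.Series(['Microsoft Corporation', 'International Business Machines'])
messy = pd.Series(['Microsfot Corp', 'IBM'])

vectorizer, nbrs = build_vectorizer(clean)
distances, indices = nbrs.kneighbors(vectorizer.transform(messy), n_neighbors=1)
print(distances)  # cosine distances in [0, 1] for non-negative TF-IDF vectors
print(fuzz.token_sort_ratio('Microsfot Corp', 'Microsoft Corporation'))  # 0-100 similarity

Since the KNN metric only decides which candidates reach fuzzywuzzy, changing it rarely alters the final Ratio column when the nearest neighbour stays the same.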

Odoo - one2many sum

I'm working on a simple project and I've got a problem. I want to sum one column in my one2many field; how can I do this?
from openerp import models, fields, api, _

class Fam(models.Model):
    _name = 'fam'
    # Note: the inverse of Car2.mile must point back to 'car2',
    # not 'fam', for the One2many below to work.
    fm_id = fields.Many2one('car2')
    mileage = fields.Float(string="Mileage", required=True)
    fueled = fields.Float(string="Fueled", required=True)
    perliter = fields.Float(string='Price per liter', required=True)

class Car2(models.Model):
    _name = 'car2'
    _description = 'Car record'
    _log_access = True
    name = fields.Char(
        string='Name',
        required=True
    )
    mile = fields.One2many(
        "fam",
        "fm_id",
        string='Mileage, Fuel and cost per liter',
        required=True
    )
    average = fields.Float(
        string='Average'
    )
    combustion = fields.Float(
        string='Combustion'
    )
You can achieve this with the following example:
total = 0.0
for line in self.one2many_field_name:
    total += line.field_name_in_one2many_table

# in your case
total_mileage = 0.0
total_fueled = 0.0
total_perliter = 0.0
for line in self.mile:
    total_mileage += line.mileage
    total_fueled += line.fueled
    total_perliter += line.perliter
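If the totals should live on car2 as fields, the loop fits naturally into a compute method. Below is a minimal sketch in the old-API (openerp) style the question already uses; the field names total_mileage, total_fueled and total_perliter are hypothetical, not part of the original models:

from openerp import api, fields, models

class Car2(models.Model):
    _inherit = 'car2'

    # Hypothetical computed totals over the 'mile' one2many lines
    total_mileage = fields.Float(compute='_compute_totals')
    total_fueled = fields.Float(compute='_compute_totals')
    total_perliter = fields.Float(compute='_compute_totals')

    @api.multi
    @api.depends('mile.mileage', 'mile.fueled', 'mile.perliter')
    def _compute_totals(self):
        for rec in self:
            rec.total_mileage = sum(line.mileage for line in rec.mile)
            rec.total_fueled = sum(line.fueled for line in rec.mile)
            rec.total_perliter = sum(line.perliter for line in rec.mile)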
