Is there a way to get a list of all my categorical and numerical features in Pycaret?

I am using Pycaret for a classification problem and I want to get a list of all the categorical and numerical variables inferred by setup() for EDA. Is there a way to do this?
I have tried looking at any function in the documentation but couldn't find anything.

Currently, the only way I have found to do this in PyCaret 3.x is to access the private _fxs attribute of the Experiment object. Since it is private, it may change between releases.
from pycaret.datasets import get_data
from pycaret.classification import *
data = get_data('bank', verbose=False)
exp = setup(data = data, target = 'deposit', session_id=123, verbose=False);
print(f'Ordinal features: {exp._fxs["Ordinal"]}')
print(f'Numeric features: {exp._fxs["Numeric"]}')
print(f'Date features: {exp._fxs["Date"]}')
print(f'Text features: {exp._fxs["Text"]}')
print(f'Categorical features: {exp._fxs["Categorical"]}')

Related

How can I extract, edit and replot a data matrix in Abaqus?

Good afternoon,
We've been working on an animal model (skull), applying a series of forces and evaluating the resultant stresses in Abaqus. We got some of those beautiful and colourful (blue-to-red) contour plots. Now, we'd like to obtain a similar image but coloured by a new matrix, which will be the result of some mathematical transformations.
So, how can I extract the data matrix used to set those colour patterns (I guess with X-, Y-, Z-, and von Mises-values or so), apply my transformation, and replot the data to get a new (comparable) figure with the new values?
Thanks a lot and have a great day!
I've never done it myself but I know that this is possible. You can start with the documentation (e.g. here and here).
After experimenting in the GUI, you can check the corresponding Python code, which should be recorded automatically in the abaqus.rpy file in your working directory (or in C:\temp). Working through it, you could end up with something like:
myodb = session.openOdb('my_fem.odb') # or alternatively `session.odbs['my_fem.odb']` if it is already loaded into the session
# Define a temporary step for accessing your transformed output
tempStep = myodb.Step(name='TempStep', description='', domain=TIME, timePeriod=1.0)
# Define a temporary frame to store your transformed output
tempFrame = tempStep.Frame(frameId=0, frameValue=0.0, description='TempFrame')
# Define a new field output
s1f2_S = myodb.steps['Step-1'].frames[2].fieldOutputs['S'] # Stress tensor at the second frame of the 'Step-1' step
s1f1_S = myodb.steps['Step-1'].frames[1].fieldOutputs['S'] # Stress tensor at the first frame of the 'Step-1' step
tmpField = s1f2_S - s1f1_S
userField = tempFrame.FieldOutput(
name='Field-1', description='s1f2_S - s1f1_S', field=tmpField
)
Now, to display your new Field Output using python you can do the following:
session.viewports['Viewport: 1'].odbDisplay.setFrame(
step='TempStep', frame=0
)
For more information on the methods and objects used here, consult the "Abaqus Scripting Reference Guide":
Step(): Odb commands -> OdbStep object -> Step();
Frame(): Odb commands -> OdbFrame object -> Frame();
FieldOutput object: Odb commands -> FieldOutput object;

STM8A CAN Filtering in Standard Peripheral Library

I am working with the STM8AF5286UDY and am trying to set up a CAN interface.
For programming, I use the standard peripheral library. At the moment, my CAN interface works fine. The only thing that does not work is filtering.
I use extended IDs and want to get all IDs from 0x18FEC100 to 0x18FEC999.
My code looks as follows:
/* CAN filter init */
CAN_FilterNumber = CAN_FilterNumber_0;
CAN_FilterActivation = ENABLE;
CAN_FilterMode = CAN_FilterMode_IdMask;
CAN_FilterScale = CAN_FilterScale_32Bit;
CAN_FilterID1=0x18FEC101;
CAN_FilterID2=0;
CAN_FilterID3=0;
CAN_FilterID4=0;
CAN_FilterIDMask1=0x1FFFF000;
CAN_FilterIDMask2=0;
CAN_FilterIDMask3=0;
CAN_FilterIDMask4=0;
CAN_FilterInit(CAN_FilterNumber, CAN_FilterActivation, CAN_FilterMode,
CAN_FilterScale,CAN_FilterID1, CAN_FilterID2, CAN_FilterID3,
CAN_FilterID4,CAN_FilterIDMask1, CAN_FilterIDMask2,
CAN_FilterIDMask3, CAN_FilterIDMask4);
I would appreciate any help! Thank you!
EDIT: In my initial code, I forgot to account for the IDE and RTR bits in the addressing. Also, in the library, each address and mask is an 8-bit value. Therefore, I have changed my code to the following:
/* CAN filter init */
CAN_FilterNumber = CAN_FilterNumber_2;
CAN_FilterActivation = ENABLE;
CAN_FilterMode = CAN_FilterMode_IdMask;
CAN_FilterScale = CAN_FilterScale_32Bit;
CAN_FilterID1=0xc7;
CAN_FilterID2=0xed;
CAN_FilterID3=0x02;
CAN_FilterID4=0x02;
CAN_FilterIDMask1=0xFF;
CAN_FilterIDMask2=0xE7;
CAN_FilterIDMask3=0xE0;
CAN_FilterIDMask4=0x00;
CAN_FilterInit(CAN_FilterNumber, CAN_FilterActivation, CAN_FilterMode,
CAN_FilterScale,CAN_FilterID1, CAN_FilterID2, CAN_FilterID3,
CAN_FilterID4,CAN_FilterIDMask1, CAN_FilterIDMask2,
CAN_FilterIDMask3, CAN_FilterIDMask4);
This filter works for the upper 16 bits: with 0x18FEC101, it correctly filters on 0x18FE. Somehow, it does not work for the lower 16 bits.
In the library, the following code writes the addresses and masks to the filter bank in 32-bit scale:
else if (CAN_FilterScale == CAN_FilterScale_32Bit)
{
CAN->Page.Filter.FR01 = CAN_FilterID1;
CAN->Page.Filter.FR02 = CAN_FilterID2;
CAN->Page.Filter.FR03 = CAN_FilterID3;
CAN->Page.Filter.FR04 = CAN_FilterID4;
CAN->Page.Filter.FR05 = CAN_FilterIDMask1;
CAN->Page.Filter.FR06 = CAN_FilterIDMask2;
CAN->Page.Filter.FR07 = CAN_FilterIDMask3;
CAN->Page.Filter.FR08 = CAN_FilterIDMask4;
}
Any ideas what my mistake might be?
Thanks!
Mask filtering works bitwise. So you can't create a filter to accept values between 0x18FEC100 - 0x18FEC999. You need to think binary.
In the filter mask registers, 1 means "must match" and 0 means "don't care".
ID = 0x18FEC101 and Mask = 0x1FFFF000 means that the filter will accept values between 0x18FEC000 and 0x18FECFFF, because it ignores the least significant 12 bits.
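To make the bitwise rule concrete, here is a minimal Python sketch of the comparison a mask filter performs. It only models the logic on plain 29-bit identifiers; it does not reproduce the STM8 register layout, and the helper names accepts and accepted_range are made up for illustration.

```python
FILTER_ID = 0x18FEC101
FILTER_MASK = 0x1FFFF000  # 1 = bit must match, 0 = don't care


def accepts(can_id, filter_id=FILTER_ID, mask=FILTER_MASK):
    """Return True if can_id agrees with filter_id on every masked bit."""
    return (can_id & mask) == (filter_id & mask)


def accepted_range(filter_id=FILTER_ID, mask=FILTER_MASK, width=29):
    """Lowest and highest IDs the filter accepts. Every ID in between is
    accepted only when the don't-care bits are all at the low end, as here."""
    all_bits = (1 << width) - 1
    lowest = filter_id & mask
    highest = lowest | (~mask & all_bits)
    return lowest, highest


low, high = accepted_range()
print(hex(low), hex(high))  # 0x18fec000 0x18fecfff
```

With these values, accepts(0x18FEC999) is true, but so is accepts(0x18FECFFF), which is why the exact range 0x18FEC100 to 0x18FEC999 cannot be expressed with a single mask.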
However, the process is further complicated by the bit arrangement of the hardware registers. Be aware that RTR & IDE bits are also included in the filter registers. I don't know if the standard peripheral library handles this but probably not. You probably need to manually arrange the bits to determine the correct register values. In the reference manual (RM0016), refer to Figure 148.
The code I posted (edited version) works now.
Turns out I had a problem calculating the addresses by hand.
Thank you @Tagli.

Any ar js multimarkers learning tutorial?

I have been searching for ar.js multimarkers tutorial or anything that explains about it. But all I can find is 2 examples, but no tutorials or explanations.
So far, I understand that it needs to learn the pattern or order of the markers and then stores the result in localStorage. This data is used later to display the image.
What I don't understand, is how this "learner" is implemented. Also, the learning process is only used once by the "creator", right? The output file should be stored and then served later when needed, not created from scratch at each person's phone or computer.
Any help is appreciated.
Since the question is mostly about the learner page, I'll try to break it down as much as I can:
1) You need to have an array of {type, URL} objects.
A sample of creating the default array is shown below (source code):
var markersControlsParameters = [
{
type : 'pattern',
patternUrl : 'examples/marker-training/examples/pattern-files/pattern-hiro.patt',
},
{
type : 'pattern',
patternUrl : 'examples/marker-training/examples/pattern-files/pattern-kanji.patt',
}]
2) You need to feed this to the 'learner' object.
By default, the above array is encoded into the URL (source) and then decoded by the learner site. The important part happens on the site:
for each object in the array, an ArMarkerControls object is created and stored:
// array.forEach(function(markerParams){
var markerRoot = new THREE.Group()
scene.add(markerRoot)
// create markerControls for our markerRoot
var markerControls = new THREEx.ArMarkerControls(arToolkitContext, markerRoot, markerParams)
subMarkersControls.push(markerControls)
The subMarkersControls is used to create the object used to do the learning. At long last:
var multiMarkerLearning = new THREEx.ArMultiMakersLearning(arToolkitContext, subMarkersControls)
The example learner site has multiple utility functions, but as far as I know, the most important here are the ArMultiMakersLearning members, which can be used in the following order (or any other):
// this method resets previously collected statistics
multiMarkerLearning.resetStats()
// this member flag enables data collection
multiMarkerLearning.enabled = true
// this member flag stops data collection
multiMarkerLearning.enabled = false
// To obtain the 'learned' data, simply call .toJSON()
var jsonString = multiMarkerLearning.toJSON()
That's all. If you store the jsonString as
localStorage.setItem('ARjsMultiMarkerFile', jsonString);
then it will be used as the default multimarker file later on. If you want a custom name or more areas - then you'll have to modify the name in the source code.
3) 2.1.4 debugUI
It seems that the debug UI is broken: the UI buttons do exist but are nowhere to be seen. A hot fix would be to use the 'markersAreaEnabled' span style for the div containing the buttons (see this source bit).
It's all in this glitch, you can find it under the phrase 'CHANGES HERE' in the arjs code.

Issue with topic word distributions after malletmodel2ldamodel in gensim

After training an LDA model with the gensim Mallet wrapper, I converted it to a gensim LDA model via the malletmodel2ldamodel function provided with the wrapper. The topic-word distributions before and after the conversion are quite different. After conversion, the model returns very unusual topic-word distributions.
ldamallet = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus, num_topics=13, id2word=dictionary)
model = gensim.models.wrappers.ldamallet.malletmodel2ldamodel(ldamallet)
model.save('ldamallet.gensim')
dictionary = gensim.corpora.Dictionary.load('dictionary.gensim')
corpus = pickle.load(open('corpus.pkl', 'rb'))
lda_mallet = gensim.models.wrappers.LdaMallet.load('ldamallet.gensim')
import pyLDAvis.gensim
lda_display = pyLDAvis.gensim.prepare(lda_mallet, corpus, dictionary, sort_topics=False)
pyLDAvis.display(lda_display)
Here is the output from gensim original implementation:
I can see there was a bug related to this issue that was fixed in earlier versions of gensim. I am using gensim 3.7.1.
Here is an alternative function to use instead of malletmodel2ldamodel (which is reported to have bugs):
from gensim.models.ldamodel import LdaModel
import numpy

def ldaMalletConvertToldaGen(mallet_model):
    # Build an untrained gensim LdaModel with the same vocabulary and topic count
    model_gensim = LdaModel(
        id2word=mallet_model.id2word, num_topics=mallet_model.num_topics,
        alpha=mallet_model.alpha, eta=0, iterations=1000,
        gamma_threshold=0.001, dtype=numpy.float32)
    # Copy Mallet's topic-word counts into the gensim state, then sync the model
    model_gensim.state.sstats[...] = mallet_model.wordtopics
    model_gensim.sync_state()
    return model_gensim

# mallet_model is the trained LdaMallet instance
converted_model = ldaMalletConvertToldaGen(mallet_model)
I used it and it worked perfectly.

How do I create a compound multi-index in rethinkdb?

I am using Rethinkdb 1.10.1 with the official python driver. I have a table of tagged things which are associated to one user:
{
"id": "PK",
"user_id": "USER_PK",
"tags": ["list", "of", "strings"],
// Other fields...
}
I want to query by user_id and tag (say, to find all the things by user "tawmas" with tag "tag"). Starting with Rethinkdb 1.10 I can create a multi-index like this:
r.table('things').index_create('tags', multi=True).run(conn)
My query would then be:
res = (r.table('things')
.get_all('TAG', index='tags')
.filter(r.row['user_id'] == 'USER_PK').run(conn))
However, this query still needs to scan all the documents with the given tag, so I would like to create a compound index based on the user_id and tags fields. Such an index would allow me to query with:
res = r.table('things').get_all(['USER_PK', 'TAG'], index='user_tags').run(conn)
There is nothing in the documentation about compound multi-indexes. However, I
tried to use a custom index function combining the requirements for compound
indexes and multi-indexes by returning a list of ["USER_PK", "tag"] pairs.
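To make the intended index concrete, here is a plain-Python sketch of what such a compound multi-index would contain. It works on in-memory dicts rather than ReQL, and the sample documents and the helper names index_keys and build_index are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample documents shaped like the rows in the 'things' table
things = [
    {"id": "a", "user_id": "tawmas", "tags": ["tag", "other"]},
    {"id": "b", "user_id": "tawmas", "tags": ["other"]},
    {"id": "c", "user_id": "someone", "tags": ["tag"]},
]


def index_keys(doc):
    """One (user_id, tag) key per tag, as a multi-index function would emit."""
    return [(doc["user_id"], tag) for tag in doc["tags"]]


def build_index(docs):
    """Map every emitted key to the ids of the documents that produced it."""
    index = defaultdict(list)
    for doc in docs:
        for key in index_keys(doc):
            index[key].append(doc["id"])
    return index


user_tags = build_index(things)
print(user_tags[("tawmas", "tag")])  # ['a']
```

A lookup by the pair goes straight to the matching documents, without scanning everything that carries the tag.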
My first attempt was in python:
r.table('things').index_create(
'user_tags',
lambda each: [[each['user_id'], tag] for tag in each['tags']],
multi=True).run(conn)
This makes the python driver choke with a MemoryError trying to parse the index function (I guess list comprehensions aren't really supported by the driver).
So, I turned to my (admittedly, rusty) javascript and came up with this:
r.table('things').index_create(
'user_tags',
r.js(
"""(function (each) {
var result = [];
var user_id = each["user_id"];
var tags = each["tags"];
for (var i = 0; i < tags.length; i++) {
result.push([user_id, tags[i]]);
}
return result;
})
"""),
multi=True).run(conn)
This is rejected by the server with a curious exception: rethinkdb.errors.RqlRuntimeError: Could not prove function deterministic. Index functions must be deterministic.
So, what is the correct way to define a compound multi-index? Or is it something
which is not supported at this time?
Short answer:
List comprehensions don't work in ReQL functions. You need to use map instead like so:
r.table('things').index_create(
'user_tags',
lambda each: each["tags"].map(lambda tag: [each['user_id'], tag]),
multi=True).run(conn)
Long answer:
This is actually a somewhat subtle aspect of how RethinkDB drivers work. The reason this doesn't work is that your Python code never sees real copies of the each document. So in the expression:
lambda each: [[each['user_id'], tag] for tag in each['tags']]
each isn't ever bound to an actual document from your database; it's bound to a special Python variable which represents the document. I'd actually try running the following just to demonstrate it:
q = r.table('things').index_create(
'user_tags',
lambda each: print(each)) #only works in python 3
And it will print out something like:
<RqlQuery instance: var_1 >
The driver only knows that this is a variable from the function; in particular, it has no idea whether each["tags"] is an array or not (it's actually just another very similar abstract object). So Python doesn't know how to iterate over that field. Essentially the same problem exists in JavaScript.
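The effect can be imitated without a database. The sketch below uses a made-up Expr class as a stand-in for the driver's query objects (it is not the real RqlQuery, and describe is a hypothetical helper). It shows one plausible reason a list comprehension misbehaves: an object that answers every index lookup never raises the IndexError that Python's fallback iteration waits for, so the loop would never end. The map method, in contrast, records a single expression.

```python
from itertools import islice


class Expr:
    """Hypothetical stand-in for a driver-side query term (not the real RqlQuery)."""

    def __init__(self, text):
        self.text = text

    def __getitem__(self, key):
        # Indexing never touches data; it only records the operation.
        return Expr(f"{self.text}[{key!r}]")

    def map(self, func):
        # The function is called once with a placeholder variable; the
        # resulting expression is what would be sent to the server.
        body = func(Expr("var_2"))
        return Expr(f"{self.text}.map(var_2 -> {describe(body)})")

    def __repr__(self):
        return f"<Expr {self.text}>"


def describe(value):
    """Render nested lists of Expr objects as plain text."""
    if isinstance(value, list):
        return "[" + ", ".join(describe(v) for v in value) + "]"
    return value.text


each = Expr("var_1")

# Python falls back to each['tags'][0], each['tags'][1], ... and waits for an
# IndexError that never comes, so islice is used here to stop after three.
first_three = list(islice(each["tags"], 3))
print(first_three)

# map() builds one expression instead of iterating locally.
index_function = each["tags"].map(lambda tag: [each["user_id"], tag])
print(index_function)
```

Without the islice limit, the loop over each["tags"] would keep producing new placeholder objects until memory ran out, which would be consistent with the MemoryError in the question.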
