How is the TOPS metric calculated for any AI model?

According to https://developer.nvidia.com/embedded/jetson-modules, the Jetson Xavier NX delivers 21 TOPS of AI performance.
For example, if the use case is people detection, tracking, and counting, how many TOPS will be utilized?
Is the calculation based on the AI model architecture? If so, please share the calculation steps.
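There is no single official formula, but one common back-of-envelope approach (my own illustration, not something taken from the NVIDIA page) is to count the multiply-accumulate operations (MACs) the model performs per frame, count each MAC as 2 operations, and multiply by the target frame rate. The MAC counts and frame rates below are made-up placeholders for a people detection + tracking pipeline:

def required_tops(macs_per_inference: float, inferences_per_second: float) -> float:
    """Back-of-envelope demand: 1 multiply-accumulate (MAC) counted as 2 operations."""
    return 2 * macs_per_inference * inferences_per_second / 1e12  # tera-ops per second

# Hypothetical people detector: ~5 GMACs per frame at 30 FPS.
detector_tops = required_tops(5e9, 30)

# Hypothetical re-identification model for tracking: ~0.5 GMACs per person crop,
# roughly 10 people per frame at 30 FPS -> 300 inferences per second.
tracker_tops = required_tops(0.5e9, 10 * 30)

total = detector_tops + tracker_tops
print(f"Estimated demand: {total:.2f} TOPS of the module's rated 21 TOPS")
# The rated 21 TOPS is a peak INT8 figure; real utilization also depends on
# precision, memory bandwidth, and how well the layers map to the hardware.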

Related

Can we measure bulk density from wetland soil samples (of unknown volume)?

I am part of an undergrad pilot study sampling wetlands for carbon stock. Field samples have been collected. Now it is winter, the ground is frozen, and we realize we should have collected bulk density samples. Dried, ground, and sieved samples have been sent to our lab partners for carbon analysis, per the lab's instructions.
Can we still measure bulk density, from our samples?
We still have composite cores with marble- to golf-ball-sized hardened chunks of clay-heavy soil, and we also have the remaining ground samples.
Could water displacement work to determine this soil's volume, with samples in a plastic bag and as much air as possible removed?
One prof suggested that measuring the chunks may work this way.
Or, chunks could be tied to a 'sinker', sans plastic bag - assuming such short-term water immersion would have minimal effect on clay-heavy soil volume.
Is there any way to determine bulk density from our samples?
please and thank you.
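For what it's worth, the arithmetic the displacement idea feeds into is straightforward; this is my own summary of the standard definitions, not lab-specific guidance:

$$\rho_b = \frac{M_\text{oven-dry}}{V_\text{chunk}}, \qquad V_\text{chunk} = \frac{M_\text{water displaced}}{\rho_\text{water}} \approx M_\text{water displaced}\,[\text{g}] \times 1\ \text{cm}^3/\text{g}$$

One caveat to keep in mind: a chunk (clod) volume excludes the pore space between aggregates, so it can overestimate the bulk density of the intact core.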

How does one arrive at "fair" priors for spatial and non-spatial effects

A basic BYM model may be written as
$$y_i \sim \operatorname{Poisson}(E_i \theta_i), \qquad \log \theta_i = \alpha + s_i + u_i,$$
sometimes with covariates, but that doesn't matter much here. Here $s_i$ are the spatially structured effects and $u_i$ the unstructured effects over units.
In Congdon (2019) they refer to the fair prior on these as one in which
$$\operatorname{sd}(u_i) \approx \frac{\operatorname{sd}(s_i \mid s_{j \ne i})}{0.7\sqrt{\bar m}},$$
where $\bar m$ is the average number of neighbors in the adjacency matrix.
It is defined similarly (in terms of precision, I think) in Bernardinelli et al. (1995).
However, for the gamma distribution, scaling appears to only impact the scale term: if $\tau \sim \operatorname{Gamma}(a, b)$ with shape $a$ and rate $b$, then $c\tau \sim \operatorname{Gamma}(a, b/c)$, leaving the shape unchanged.
I haven't been able to find a worked example of this, and I don't understand how the priors are arrived at, for example, in the well-known lip cancer data.
I am hoping someone could help me understand how these are reached in this setting, even in the simple case of two gamma hyperpriors.
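To make the scaling question concrete, here is a small sketch of the two pieces involved: the average neighbour count from an adjacency matrix, and the fact that scaling a gamma-distributed precision only moves its rate/scale, never its shape. The 0.7-based factor is my assumption about the intended fair-prior rule, so treat the numbers as illustration only.

import numpy as np
from scipy import stats

# Toy adjacency matrix for 5 areal units (symmetric 0/1, zero diagonal).
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0],
])
m_bar = A.sum(axis=1).mean()  # average number of neighbours

# Suppose the unstructured precision has prior tau_u ~ Gamma(shape=a, rate=b).
a, b = 0.5, 0.0005

# One reading of the "fair" prior (Bernardinelli et al., 1995): the marginal sd of
# the spatial effect is roughly its conditional sd divided by 0.7 * sqrt(m_bar),
# which suggests rescaling the precision by a factor proportional to m_bar.
k = 0.7**2 * m_bar  # assumed scale factor; check against your source

# Scaling fact behind the question: if tau ~ Gamma(shape=a, rate=b),
# then k * tau ~ Gamma(shape=a, rate=b / k) -- only the rate moves.
tau = stats.gamma(a, scale=1 / b).rvs(size=200_000, random_state=0)
shape_fit, _, scale_fit = stats.gamma.fit(k * tau, floc=0)
print(f"average neighbours m_bar = {m_bar:.2f}, scale factor k = {k:.2f}")
print(f"fitted shape ~ {shape_fit:.3f} (unchanged from {a})")
print(f"fitted rate  ~ {1 / scale_fit:.6f} vs b/k = {b / k:.6f}")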
References
Congdon, P. D. (2019). Bayesian Hierarchical Models: With Applications Using R (2nd ed.). Chapman and Hall/CRC.
Bernardinelli, L., Clayton, D., and Montomoli, C. (1995). Bayesian estimates of disease maps: How important are priors? Statistics in Medicine, 14, 2411–2431.

How is IBM Watson Tradeoff Analytics any different from simple constrained decision making?

I am continuously astounded by the technological genius of the IBM Watson package. The tools do everything from recognizing the subjects in images to extracting the emotion in a letter, and they're amazing. And then there's Tradeoff Analytics. In their Nests demo, you select a state and then a series of constraints (price must be between W and X, square footage must be between Y and Z, there must be Insured Escrow financing available, etc.) and they rank the houses based on how well they fit your constraints.
It would seem that all Tradeoff Analytics does is run a simple query on the order of:
SELECT * FROM House WHERE price >= W AND price <= X AND square_footage >= Y
AND square_footage <= Z AND ...
Am I not understanding Tradeoff Analytics correctly? I have tremendous respect for the people over at IBM that built all of these amazing tools, but Tradeoff Analytics seems like simple constrained decision making, which appears in any Intro to Programming course as you're learning if statements. What am I missing?
As @GuyGreer pointed out, the service indeed uses Pareto Optimization, which is quite different from simple constraints.
For example:
Say you have three houses:

            Sqr Footage    Price
HouseA      6000           1000K
HouseB      9000            750K
HouseC      8000            800K

Now say your constraints are Sqr Footage > 5000 and Price < 900K. That leaves House B and House C, but Tradeoff Analytics will return only House B: according to Pareto, given your objectives of price and footage, House B dominates House C, as it has larger footage and is cheaper.
Obviously, this is a made-up example, and in real life there are more objectives (attributes) that you take into account when buying a house.
The idea with Pareto, is to find the Pareto Frontier.
Tradeoff Analytics adds to Pareto Optimization additional home-grown algorithms to give you more insight into the tradeoffs.
Finally, the service is accompanied by a client-side widget that uses a novel method for visualizing Pareto frontiers. That is a hard problem in its own right, given that such a frontier is multi-dimensional.
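To make the dominance idea concrete, here is a small sketch of Pareto filtering on the house example above (my own illustration, not the service's actual implementation):

from typing import Dict, List

# Objectives: maximize square footage, minimize price.
houses: Dict[str, Dict[str, float]] = {
    "HouseA": {"sqft": 6000, "price": 1_000_000},
    "HouseB": {"sqft": 9000, "price": 750_000},
    "HouseC": {"sqft": 8000, "price": 800_000},
}

def dominates(x: Dict[str, float], y: Dict[str, float]) -> bool:
    """x dominates y if it is at least as good on every objective and
    strictly better on at least one (bigger sqft, smaller price)."""
    at_least_as_good = x["sqft"] >= y["sqft"] and x["price"] <= y["price"]
    strictly_better = x["sqft"] > y["sqft"] or x["price"] < y["price"]
    return at_least_as_good and strictly_better

def pareto_frontier(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    return [
        name for name, attrs in candidates.items()
        if not any(dominates(other, attrs)
                   for other_name, other in candidates.items()
                   if other_name != name)
    ]

# Apply the hard constraints first (sqft > 5000, price < 900K), then Pareto-filter.
feasible = {n: h for n, h in houses.items()
            if h["sqft"] > 5000 and h["price"] < 900_000}
print(pareto_frontier(feasible))  # -> ['HouseB']: HouseB dominates HouseC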
The page you link to says they use Pareto Optimisation, which tries to optimise all the parameters to come to a Pareto-optimal solution: a solution, or set of solutions, for when you can't optimise every individual parameter and so have to settle for some sub-optimal ones.
Rather than just finding anything that matches the criteria, they are trying to find some sort of optimal solution(s) given the constraints. That's how it differs from simple constrained decision-making.
Note I'm basing this answer completely off of their statement:
The service uses a mathematical filtering technique called “Pareto Optimization,”...
and what I've read about Pareto problems. I have no experience with this technology or Pareto problems myself.

Steps for age classification

I am working on age (or gender) classification using images of human faces. I have decided to use the LBP (Local Binary Patterns) approach for feature extraction and Support Vector Machines (SVM) for feature classification. The whole process is shown in Fig. 1 below.
As I understand it, the procedure is as follows:
Start with a training set that includes 3 groups: Children, Young Adult, Senior. Each group has 50 images (150 images total). Use LBP to prepare the 150 images for classification.
Train a SVM on 150 LBP images with labels:
0: Child
1: Young Adult
2: Senior
Test the system using a set of new images. If all goes according to plan, the system should properly classify images based on the groups defined in step 2.
The algorithm:
for i = 1 to N                      // N is the number of training images
    LBP_feature[i] = LBP_extract(image[i])
end

// Training stage
SVM.train(LBP_feature, label)

// Test stage
face = getFromCamera()
face_LBP = LBP_extract(face)        // extract LBP features from the face
label = SVM.predict(face_LBP)
if label == 0 then Children
if label == 1 then Young
if label == 2 then Senior
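For reference, a minimal runnable sketch of the same pipeline, assuming scikit-image's local_binary_pattern for the features and scikit-learn's SVC for the classifier; the image size, LBP parameters, and random placeholder images stand in for the real training data:

import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

LABELS = {0: "Child", 1: "Young Adult", 2: "Senior"}
P, R = 8, 1                     # LBP neighbours and radius (placeholder values)
N_BINS = P + 2                  # uniform LBP yields P + 2 distinct codes

def lbp_histogram(gray_image: np.ndarray) -> np.ndarray:
    """Turn one grayscale face image into a normalized LBP histogram."""
    codes = local_binary_pattern(gray_image, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=N_BINS, range=(0, N_BINS))
    return hist / hist.sum()

# Placeholder training data: in practice, load the 150 face images and their labels.
rng = np.random.default_rng(0)
train_images = [rng.integers(0, 256, size=(64, 64)).astype(np.uint8) for _ in range(150)]
train_labels = np.repeat([0, 1, 2], 50)

X_train = np.array([lbp_histogram(img) for img in train_images])
clf = SVC(kernel="rbf")          # multi-class handled internally by scikit-learn
clf.fit(X_train, train_labels)

# Test stage: a new face (here another random placeholder image).
test_face = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
pred = clf.predict([lbp_histogram(test_face)])[0]
print("Predicted group:", LABELS[pred])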
Does the proposed system make sense for this task?
If you want to use support vector machines, and you also want to consider an image to be a "sample" of subregions, then so-called "support distribution machines" developed by Jeff Schneider and Barnabas Poczos might be best suited for your problem (paper and documentation available online). They actually showed that, with some tweaks, support distribution machines outperformed all state-of-the-art methods on a certain popular image classification data set. They used SIFT features, so each image was a collection of samples (subregion patches) from the feature space, and "support distribution machines" are kernel-based SVMs that estimate a divergence kernel between two distributions using a sample-based estimator.
If you want to use SVMs like support distribution machines, there is one final point to consider. SVMs are two-class classifiers. To extend to more than 2 classes, you can train an SVM that classifies one class versus the union of the rest of the classes, for each choice of class (so N SVMs if you have N classes), and then run each SVM and choose the class with the highest classification score. Another method is to train an SVM for each pair of classes (so N(N-1)/2 SVMs for N classes) and then choose the best class by taking a "consensus" of all the pairwise comparisons. You can read about all this online and choose whichever method you think is best, or whichever gives the best leave-one-out cross-validation performance on the training data (which should be easy to calculate because you only have 150 training points).
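A small sketch of the two multi-class strategies and the leave-one-out check mentioned above, using scikit-learn's wrappers on placeholder data:

from sklearn.datasets import make_classification
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

# Placeholder stand-in for the 150 LBP feature vectors with 3 age-group labels.
X, y = make_classification(n_samples=150, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)

# One-vs-rest: N binary SVMs, pick the class with the highest score.
ovr = OneVsRestClassifier(SVC(kernel="rbf"))
# One-vs-one: N(N-1)/2 pairwise SVMs, pick the class by voting/consensus.
ovo = OneVsOneClassifier(SVC(kernel="rbf"))

loo = LeaveOneOut()   # feasible here because there are only 150 training points
for name, model in [("one-vs-rest", ovr), ("one-vs-one", ovo)]:
    acc = cross_val_score(model, X, y, cv=loo).mean()
    print(f"{name}: leave-one-out accuracy = {acc:.3f}")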
On paper, the approach makes sense. The most important question is whether LBP is the right feature for this task. You can first extract LBP features using different parameters (image size, bin count if you are using an LBP histogram, etc.) and inspect the data in a tool like Weka or R to see whether your samples from the different classes exhibit different distributions.
You can also refer to a few research papers on age estimation to see what other features are suitable. I have tried the Radon transform with some success for seniors; wrinkles in faces are well represented in the Radon transform.

Anomaly Detection Algorithms

I am tasked with detecting anomalies (known or unknown) using machine-learning algorithms from data in various formats - e.g. emails, IMs etc.
What are your favorite and most effective anomaly detection algorithms?
What are their limitations and sweet-spots?
How would you recommend those limitations be addressed?
All suggestions very much appreciated.
Statistical filters like Bayesian filters, or some bastardised version employed by some spam filters, are easy to implement, and there is plenty of online documentation about them.
The big downside is that they cannot really detect unknown things. You train them with a large sample of known data so that they can categorize new incoming data. But you can turn the traditional spam filter upside down: train it to recognize legitimate data instead of illegitimate data, so that anything it doesn't recognize is an anomaly.
There are various types of anomaly detection algorithms, depending on the type of data and the problem you are trying to solve:
Anomalies in time series signals:
Time series signals are anything you can draw as a line graph over time (e.g., CPU utilization, temperature, emails per minute, visitors per minute on a webpage, etc.). Example algorithms are Holt-Winters, ARIMA models, Markov models, and more. I gave a talk on this subject a few months ago; it might give you more ideas about algorithms and their limitations.
The video is at: https://www.youtube.com/watch?v=SrOM2z6h_RQ
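As a concrete illustration of the time-series case, here is a sketch using Holt-Winters (one of the algorithms named above) via statsmodels on synthetic data with an injected spike; the data and the 4-sigma threshold are arbitrary choices for the example:

import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Toy hourly series with daily seasonality plus one injected spike (placeholder data).
rng = np.random.default_rng(1)
hours = pd.date_range("2024-01-01", periods=24 * 14, freq="h")
values = 100 + 20 * np.sin(2 * np.pi * np.arange(len(hours)) / 24) + rng.normal(0, 3, len(hours))
values[200] += 60  # injected anomaly
series = pd.Series(values, index=hours)

# Model level + daily seasonality with Holt-Winters, then flag points whose
# residual falls far outside the usual spread of the residuals.
model = ExponentialSmoothing(series, trend="add", seasonal="add", seasonal_periods=24).fit()
residuals = series - model.fittedvalues
threshold = 4 * residuals.std()
anomalies = series[residuals.abs() > threshold]
print(anomalies)   # should contain the injected spike around position 200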
Anomalies in tabular data: These are cases where you have feature vectors that describe something (e.g., transforming an email into a feature vector that describes it: number of recipients, number of words, number of capitalized words, counts of keywords, etc.). Given a large set of such feature vectors, you want to detect the ones that are anomalous compared to the rest (sometimes called "outlier detection"). Almost any clustering algorithm is suitable in these cases, but which one is most suitable depends on the type of features and their behavior: real-valued, ordinal, nominal, or otherwise. The type of features determines whether certain distance functions are suitable (the basic requirement for most clustering algorithms), and some algorithms handle certain types of features better than others.
The simplest algorithm to try is k-means clustering, where an anomalous sample would either belong to a very small cluster or lie far from all cluster centers. One-class SVM can also detect outliers, and has the flexibility of choosing different kernels (and effectively different distance functions). Another popular algorithm is DBSCAN.
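A small sketch of the two tabular approaches just mentioned, scoring outliers via k-means (small clusters or large distance to the nearest center) and, separately, with a one-class SVM, on synthetic placeholder data:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

# Placeholder feature vectors: a dense blob of "normal" emails plus a few outliers.
rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(500, 5))
outliers = rng.normal(6, 1, size=(5, 5))
X = np.vstack([normal, outliers])

# k-means view: flag members of very small clusters, or points far from every center.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
dist_to_center = np.min(km.transform(X), axis=1)
cluster_sizes = np.bincount(km.labels_, minlength=5)
small_cluster = cluster_sizes[km.labels_] < 0.02 * len(X)
far_away = dist_to_center > dist_to_center.mean() + 3 * dist_to_center.std()
kmeans_flagged = np.where(small_cluster | far_away)[0]

# One-class SVM view: learn a boundary around the bulk of the data;
# predictions of -1 mark points outside that boundary.
ocsvm = OneClassSVM(kernel="rbf", nu=0.01, gamma="scale").fit(X)
svm_flagged = np.where(ocsvm.predict(X) == -1)[0]

print("k-means candidates:", kmeans_flagged.tolist())
print("one-class SVM candidates:", svm_flagged.tolist())
# The injected outliers live at indices 500-504.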
When the anomalies are known, the problem becomes a supervised learning problem, so you can use classification algorithms and train them on the known anomaly examples. However, as mentioned, that will only detect those known anomalies, and if the number of training samples for anomalies is very small, the trained classifiers may not be accurate. Also, because the number of anomalies is typically very small compared to non-anomalies, when training the classifiers you might want to use techniques like boosting/bagging with oversampling of the anomaly class(es), while optimizing for a very low false-positive rate. There are various techniques for this in the literature; one idea that I have found to work well many times is what Viola-Jones used for face detection, a cascade of classifiers. See: http://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/viola04ijcv.pdf
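A sketch of that supervised, imbalanced setup, using boosting with up-weighted anomaly samples as a simple stand-in for the oversampling idea and a high decision threshold to keep false positives low (the data is a placeholder):

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Placeholder data: 2000 normal samples, 40 known anomalies (about 2%).
rng = np.random.default_rng(0)
X_normal = rng.normal(0, 1, size=(2000, 8))
X_anom = rng.normal(2.5, 1, size=(40, 8))
X = np.vstack([X_normal, X_anom])
y = np.array([0] * 2000 + [1] * 40)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Boosting with sample weights that up-weight the rare anomaly class,
# a simple stand-in for oversampling the minority class.
weights = np.where(y_tr == 1, 25.0, 1.0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr, sample_weight=weights)

# Raise the decision threshold to keep the false-positive rate low,
# which is usually what matters most in anomaly detection.
scores = clf.predict_proba(X_te)[:, 1]
pred = (scores > 0.9).astype(int)
print("precision:", precision_score(y_te, pred, zero_division=0))
print("recall:   ", recall_score(y_te, pred))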
(DISCLAIMER: I am the chief data scientist for Anodot, a commercial company doing real time anomaly detection for time series data).
