GALSIM: Using interpolated image class without specifying flux normalisation - galsim

For a project I am undertaking, I will need to calculate the derivative of a given surface brightness profile which is convolved with a pixel response function (as well as a PSF, etc.).
For various reasons, but mainly for consistency, I wish to do this using the guts of the GalSim code. However, since in this case the 'flux' defined as the sum of the non-parametric model no longer has a physical meaning in terms of the image itself (the image will always be considered noise-free in this case), there are certain situations where I would like to be able to define the interpolated image without a flux normalisation.
The code does not seem to care if the 'flux' is negative, but I am coming across certain situations where the 'flux' is within machine precision of zero, and thus the assertion 'dabs(flux-flux_tot) <= dabs(flux_tot)' fails.
My question is therefore: Can one specify a non-parametric model to interpolate over without specifying a flux normalisation value?

There is currently no way to do this using the galsim.InterpolatedImage() class; you could open an issue to make this feature request at the GalSim repository on GitHub.
There is a way to do this using the guts of GalSim; an example is illustrated in the lensing power spectrum functionality if you are willing to dig into the source code (lensing_ps.py -- just do a search for SBInterpolatedImage to find the relevant code bits). The basic idea is that instead of using galsim.InterpolatedImage() you use the associated C++ class, galsim._galsim.SBInterpolatedImage(), which is accessible in Python. The SBInterpolatedImage can be initialized with an image and choices of interpolants in real and Fourier space, as shown in the examples in lensing_ps.py, and then queried using the xValue() method to get the value interpolated to some position.
This trick was necessary in lensing_ps.py because we were interpolating shear fields, which tend to have a mean of zero, so we ran into the same problem you did. Use of the SBInterpolatedImage class is not generally recommended for GalSim users (we recommend using only the Python classes), but it is definitely a way around your problem for now.
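To give a rough idea of what that looks like, here is a minimal sketch of the approach (not copied from lensing_ps.py). The exact constructor arguments of the C++-level class, whether the interpolants need wrapping in galsim.InterpolantXY, and whether you pass the Python image wrapper or its underlying C++ image object all depend on your GalSim version, so treat the SBInterpolatedImage call below as an assumption and copy the authoritative usage from lensing_ps.py.

import numpy as np
import galsim

# A small image whose pixel sum is ~0, i.e. the case where the normal
# galsim.InterpolatedImage() flux normalisation breaks down.
array = np.random.randn(32, 32)
array -= array.mean()
image = galsim.ImageD(array)

# Real-space and Fourier-space interpolants (older GalSim versions may need
# these wrapped in galsim.InterpolantXY before passing them to the C++ layer).
x_interp = galsim.Lanczos(5)
k_interp = galsim.Quintic()

# ASSUMPTION: argument names/order vary between versions, and you may need to
# pass the underlying C++ image object rather than the Python wrapper -- see
# lensing_ps.py for the call that matches your installation.
sbii = galsim._galsim.SBInterpolatedImage(image, x_interp, k_interp)

# Interpolated value at an arbitrary position, with no flux normalisation
# ever having been applied.
value = sbii.xValue(galsim.PositionD(3.7, -1.2))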

Related

How to store an equation in EEPROM?

I'm working with embedded systems. For the sake of explanation, I'm working with a dsPIC33EP and a simple serial EEPROM.
Suppose I'm building a controller that uses a linear control scheme (y=mx+b). If the controller needs many different settings, it's easy: store the m and the b in EEPROM and retrieve them for each setting.
Now suppose I want to have different equations for different settings. I would have to pre-program all the equations and then have a method for selecting that equation and pulling the settings from the EEPROM. It's harder because you need to know the equations ahead of time, but still doable.
Now suppose that you don't know the equations ahead of time. Maybe you have to do a piecewise approximation, for example. How could you store something like that in memory, so that all the controller has to do is feed it a sensor reading and get back a control variable? Kind of like passing a variable to a function and getting the answer passed back.
How could you store a function like that in memory if only the current state is important?
How could you store a function like that if past states are important (if the control equation is second, third or fourth order for example)?
The dsPICs have limited RAM but quite a bit of flash, enough for a small but effective text parser. Have you thought of using some form of text-based script? These can be translated to a more efficient data format at run time.
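One concrete form that "more efficient data format" could take is a flat breakpoint table for a piecewise-linear approximation. The sketch below is in Python purely to illustrate the layout and lookup logic; on the dsPIC the table would live in EEPROM/flash and the evaluator would be a few lines of C. The breakpoint values and the byte layout are assumptions made up for illustration.

import struct
from bisect import bisect_right

# (x, y) breakpoints of the approximated control curve -- purely illustrative.
breakpoints = [(0.0, 0.0), (10.0, 2.5), (20.0, 7.0), (40.0, 8.0)]

# Serialise as: uint16 count, then count pairs of float32 -- one possible
# EEPROM image layout.
blob = struct.pack("<H", len(breakpoints))
for x, y in breakpoints:
    blob += struct.pack("<ff", x, y)

def eval_piecewise(blob, x):
    """Deserialise the table and linearly interpolate at x (clamped at the ends)."""
    (count,) = struct.unpack_from("<H", blob, 0)
    pts = [struct.unpack_from("<ff", blob, 2 + 8 * i) for i in range(count)]
    xs = [p[0] for p in pts]
    if x <= xs[0]:
        return pts[0][1]
    if x >= xs[-1]:
        return pts[-1][1]
    i = bisect_right(xs, x) - 1
    (x0, y0), (x1, y1) = pts[i], pts[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

print(eval_piecewise(blob, 15.0))  # -> 4.75

Past states (for second- or higher-order control) would just mean storing more coefficients per entry, or keeping a small ring buffer of recent readings alongside the table.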

Binary classification of sensor data

My problem is the following: I need to classify a data stream coming from a sensor. I have managed to get a baseline using the median of a window, and I subtract the values from that baseline (I want to avoid negative peaks, so I only use the absolute value of the difference).
Now I need to distinguish an event (= something triggered the sensor) from the noise near the baseline:
The problem is that I don't know which method to use.
There are several approaches of which I thought of:
sum up the values in a window; if the sum is above a threshold, the class should be EVENT ('integrate and dump')
sum up the differences of the values in a window and take the mean (which gives something like the first derivative); if the value is positive and above a threshold, set class EVENT, otherwise set class NO-EVENT
a combination of both
(unfortunately these approaches have the drawback that I need to guess the threshold values and set the window size)
use an SVM that learns from manually classified data (but I don't know how to set this algorithm up properly: which features should I look at - median/mean of a window? integral? first derivative? ...)
What would you suggest? Are there better/simpler methods to get this task done?
I know there exist a lot of sophisticated algorithms, but I'm confused about what could be the best way - please have a little patience with a newbie who has no machine learning/DSP background :)
Thank you a lot and best regards.
The key to evaluating your heuristic is to develop a model of the behaviour of the system.
For example, what is the model of the physical process you are monitoring? Do you expect your samples, for example, to be correlated in time?
What is the model for the sensor output? Can it be modelled as, for example, a discretized linear function of the voltage? Is there a noise component? Is the magnitude of the noise known or unknown but constant?
Once you've listed your knowledge of the system that you're monitoring, you can then use that to evaluate and decide upon a good classification system. You may then also get an estimate of its accuracy, which is useful for consumers of the output of your classifier.
Edit:
Given the more detailed description, I'd suggest trying some simple models of behaviour that can be tackled using classical techniques before moving to a generic supervised learning heuristic.
For example, suppose:
The baseline, event threshold and noise magnitude are all known a priori.
The underlying process can be modelled as a Markov chain: it has two states (off and on) and the transition times between them are exponentially distributed.
You could then use a hidden Markov Model approach to determine the most likely underlying state at any given time. Even when the noise parameters and thresholds are unknown, you can use the HMM forward-backward training method to train the parameters (e.g. mean, variance of a Gaussian) associated with the output for each state.
If you know even more about the events, you can get by with simpler approaches: for example, if you knew that the event signal always reached a level above the baseline + noise, and that events were always separated in time by an interval larger than the width of the event itself, you could just do a simple threshold test.
Edit:
The classic intro to HMMs is Rabiner's tutorial (a copy can be found here). Relevant also are these errata.
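To give a flavour of what the two-state HMM approach looks like in practice, here is a minimal sketch using the hmmlearn Python package (not something from the original discussion; any HMM library, or a hand-rolled forward-backward routine, would do). The fake data, the Gaussian emission model and all parameter values are placeholders.

import numpy as np
from hmmlearn import hmm

# x: 1-D array of baseline-subtracted sensor samples.
x = np.abs(np.random.randn(5000))          # placeholder for your real stream
x[1000:1200] += 5.0                        # fake "event" bump for illustration
X = x.reshape(-1, 1)                       # hmmlearn expects (n_samples, n_features)

# Two hidden states (NO-EVENT and EVENT), each emitting Gaussian-distributed
# samples.  fit() runs Baum-Welch (the forward-backward training mentioned
# above) to learn the state means/variances and transition probabilities.
model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=100)
model.fit(X)

# Most likely state sequence (Viterbi); the EVENT state is the one with the
# larger learned mean.
states = model.predict(X)
event_state = int(np.argmax(model.means_.ravel()))
is_event = states == event_state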
From your description, a correctly parameterized moving average might be sufficient.
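For instance, something along these lines; the window size and threshold are exactly the parameters you would have to tune, and the values below are placeholders.

import numpy as np

def classify_moving_average(x, window=50, threshold=1.0):
    """Return a boolean array: True where the smoothed signal exceeds the threshold."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(x, kernel, mode="same")
    return smoothed > threshold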
Try to understand the sensor and its output. Make a model and build a simulator that provides mock data covering the expected data, noise and all that stuff
Get lots of real sensor data recorded
visualize the data and verify your assumptions and model
annotate your sensor data, i.e. generate ground truth (your simulator should do that for the mock data)
from what you have learned so far, propose one or more algorithms
make a test system that can verify your algorithms against ground truth and do regression against previous runs
implement your proposed algorithms and run them against ground truth
try to understand the false positives and false negatives from the recorded data (and try to adapt your simulator to reproduce them)
adapt your algorithm(s)
Some other tips:
you may implement hysteresis on thresholds to avoid bouncing (see the sketch after this list)
you may implement delays to avoid bouncing
beware of delays if implementing debouncers or low-pass filters
you may implement multiple algorithms and voting
for testing relative improvements you may run regression tests on large amounts of non-annotated data; then you only check the detections that flipped to find performance increases/decreases
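As a sketch of the hysteresis idea from the list above (the two threshold values t_low and t_high are placeholders you would tune): the signal must rise above t_high to enter the EVENT state and fall below t_low to leave it, so noise near a single threshold does not cause rapid toggling.

def hysteresis_classify(samples, t_low, t_high):
    """Yield True (EVENT) / False (NO-EVENT) per sample, with hysteresis."""
    in_event = False
    for s in samples:
        if in_event:
            if s < t_low:
                in_event = False
        else:
            if s > t_high:
                in_event = True
        yield in_event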

Odd correlated posterior traceplots in multilevel model

I'm trying out PyMC3 with a simple multilevel model. When using both fake and real data the traces of the random effect distributions move with each other (see plot below) and appear to be offsets of the same trace. Is this an expected artifact of NUTS or an indication of a problem with my model?
Here is a traceplot on real data:
Here is an IPython notebook of the model and the functions used to create the fake data. Here is the corresponding gist.
I would expect this to happen in accordance with the group mean distribution on alpha. If you think about it, if the group mean shifts around it will influence all alphas to the same degree. You could confirm this by doing a scatter plot of the group mean trace against some of the alphas. Hierarchical models are in general difficult for most samplers because of these complex interdependencies between group mean and variance and the individual RVs. See http://arxiv.org/abs/1312.0906 for more information on this.
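As a sketch of that check, assuming your NUTS trace is in a variable called trace and the group mean and individual effects are named mu_alpha and alpha (substitute whatever names your model actually uses):

import matplotlib.pyplot as plt

mu_trace = trace['mu_alpha']      # group mean samples, shape (n_samples,)
alpha_trace = trace['alpha']      # individual effects, shape (n_samples, n_groups)

# Scatter the group-mean trace against the first few alphas; a strong linear
# band confirms that the alphas are shifting along with the group mean.
fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharey=True)
for i, ax in enumerate(axes):
    ax.scatter(mu_trace, alpha_trace[:, i], s=2, alpha=0.3)
    ax.set_xlabel('group mean')
    ax.set_ylabel('alpha[%d]' % i)
plt.tight_layout()
plt.show()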
In your specific case, the trace doesn't look too worrisome to me, especially after iteration 1000. So you could probably just discard those as burn-in and keep in mind that you have some sampling noise but probably got the right posterior overall. In addition, you might want to perform a posterior predictive check to see if the model can reproduce the patterns in your data you are interested in.
Alternatively, you could try to estimate a better Hessian using pm.find_hessian(), e.g. https://github.com/pymc-devs/pymc/blob/3eb2237a8005286fee32776c304409ed9943cfb3/pymc/examples/hierarchical.py#L51
I also found this paper which looks interesting (haven't read it yet but might be cool to implement in PyMC3): arxiv-web3.library.cornell.edu/pdf/1406.3843v1.pdf

Best way to define / declare structured data in Mathematica? (for NAO robot link properties)

I am simulating a NAO robot that has published physical properties for its links and joints (such as dimensions, link mass, center of mass, mass moment of inertia about that COM, etc).
The upper torso will be static, and I would like to get the lumped physical properties for the static upper torso. I have the math lined out (inertia tensors with rotation and parallel axis theorem), but I am wondering what the best method is to structure the data.
Currently, I am just defining everything as rules, a method I got from looking at data Import[]'d from structs in a MAT file. I refer to attributes/properties with strings so that I don't have to worry about symbols being defined. Additionally, it makes it easier to generate the names for the different degrees of freedom.
Here is an example of how I am defining this:
http://pastebin.com/VNBwAVbX
I am also considering using some OOP package for Mathematica, but I am unsure of how to easily define it.
For this task, I had taken a look at Lichtblau's slides but could not figure out a simple way to apply them towards defining my data structure. I ended up using MathOO, which got the job accomplished, since efficiency was not too much of a concern and it was more or less a one-time deal.

Implementing a model written in predicate calculus into Prolog, how do I start?

I have four sets of algorithms that I want to set up as modules, but I need all algorithms executed at the same time within each module. I'm a complete noob and have no programming experience. I do, however, know how to prove my models are decidable and have already done so (I know applied logic).
The models are sensory parsers. I know how to create the state spaces for the modules, but I don't know how to program driver access into Prolog for my webcam (I have a Toshiba Satellite laptop with a built-in webcam). I also don't know how to link the input from the webcam to the variables in the algorithms I've written. The variables I use, when combined and identified with functions, are set to identify unknown input using a probabilistic database search for the best match after a breadth-first search. The parsers aren't holistic, which is why I want to run them either in parallel or as needed.
How should I go about this?
I also don't know how to link the input from the webcam to the variables in the algorithms I've written.
I think the most common way to do this is the machine learning approach: first calculate features from your video stream (like the position of color blobs, optical flow, the amount of green in the image, whatever you like). Then you use supervised learning on labeled data to train models like HMMs, SVMs or ANNs to recognize the labels from the features. The labels are usually higher-level things like faces, a smile or waving hands.
Depending on the nature of your "variables", they may already be covered at the feature level, i.e. they can be computed from the data in a known way. If this is the case you can get away without training/learning.
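For example, here is a sketch of that pipeline using OpenCV for the webcam and scikit-learn for the classifier. Neither library is prescribed by the answer above; the coarse colour-histogram feature and the commented-out training step are placeholders for whatever features and labels suit your parsers.

import cv2
import numpy as np
from sklearn.svm import SVC

def frame_features(frame, bins=8):
    """Coarse HSV colour histogram as a fixed-length feature vector."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [bins, bins], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

# Training phase: features computed from frames you have labelled by hand.
# X_train: list of feature vectors, y_train: list of label strings.
# clf = SVC(kernel='rbf').fit(X_train, y_train)

# Online phase: grab frames from the built-in webcam and classify them.
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    features = frame_features(frame)
    # label = clf.predict([features])[0]   # feed the feature vector to the trained model
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()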
