Bonding/Merging of Bodies after collision in Abaqus - collision

Is it possible to model the merging or bonding of bodies or mesh elements after a collision in Abaqus? From what I have seen so far, cohesive elements and cohesive contact can be used to model the adhesives between two components or interfacial debonding, as well as fracture propagation. From a high level, I think that this, or similar, criteria could be used to describe the joining, rather than separation, of bodies.
If the theory does not apply, does anyone know if it is possible to develop a criteria for joining of mesh elements (maybe via some subroutine) based on the kinetic energy from the collision or something similar. When it comes to describing the onset of adhesion between the two layers, there could be four different criteria. When a certain contact pressure is exceeded, when the two boundaries come within a certain distance of one another, from the very beginning of the analysis, and when a user-defined Boolean expression is fulfilled. This, as well as modeling adhesion via contact, was discussed on this COMSOL Multiphysics blog post (https://www.comsol.com/blogs/how-to-model-adhesion-and-decohesion-in-comsol-multiphysics/).
Any information related to this idea would be greatly appreciated.

if your collision surfaces are simple enough and should not undergo high unpredictable changes before the collision, then you can change interaction property starting from a certain step:
# Create the contact property with separation
contact_with_separation = m.ContactProperty('Contact_with_separation')
contact_with_separation.NormalBehavior(
pressureOverclosure=HARD, allowSeparation=ON, constraintEnforcementMethod=DEFAULT
)
contact_with_separation.TangentialBehavior(
formulation=PENALTY, table=((0.1, ), ), fraction=0.005, maximumElasticSlip=FRACTION
)
# Create the contact property without separation
contact_no_separation = m.ContactProperty('Contact_no_separation')
contact_no_separation.NormalBehavior(
pressureOverclosure=HARD, allowSeparation=OFF, constraintEnforcementMethod=DEFAULT
)
contact_no_separation.TangentialBehavior(formulation=ROUGH)
# Define the contact
m.SurfaceToSurfaceContactStd(
name='interaction_name', createStepName='Initial',
master=master_surf, slave=slave_surf, interactionProperty=contact_with_separation,
)
# Change contact property
m.interactions['interaction_name'].setValuesInStep(
interactionProperty=contact_no_separation, stepName=change_property_step_name
)

Related

cb_explore input format : Use of providing probability value in training

The cb_explore input format requires specifying action:cost:action_probability for each example.
However the cb algorithms within are already trying to learn the optimal policy i.e. probability for each action from the data. Then, why does it need the probability of each action in the input? Is it just for initialization?
If I understand correctly, you are asking why the label associated with cb_explore is a set of action/probability pairs.
The probability of the label action is used as an importance weight for training. This has the effect of amplifying the updates for actions that are played less frequently, making them less likely to be drowned out by actions played more frequently.
As well, this type of label is very useful during predict-time, because it generates a log that can be used to perform unbiased counterfactual analyses. In other words, by logging the probability of playing each of the actions before sampling (see cb_sample - this implements how a single action/probability vector is sampled, as for example in the ccb reduction: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/cb_sample.cc#L37), we can then use the log to train another policy, and compare how it performs against the original.
See the "A Multi-World Testing Decision Service" paper to describe the mechanism to do unbiased offline experimentation with logged data: https://arxiv.org/pdf/1606.03966v1.pdf

Logoot CRDT: interleaving of data on concurrent edits to the same spot?

I want to implement Logoot for eventually-convergent P2P text editing and I've run into a bit of a problem.
My understanding of Logoot is that the intervals between objects (lines of text in the original paper, but could be characters or words) can be divided infinitely on account of an unbounded identifier. This means that the position of an object is determined not by its neighbors as in WOOT (which would require tombstones) but by a fixed numerical point along the length of the string. Combined with a unique site identifier, this also gives us a total order and enables eventual convergence.
However... doesn't this cause a problem when concurrent edits are made to the same spot? If two disconnected clients start writing new sentences at the same cursor position and then merge, their sentences have a good chance of interleaving.
Below is a whiteboard example of what I'm talking about:
As you can see, both site B and site C divide the interval between "I" and "conquered" according to the rules of Logoot, giving us random points between the positions of (20,A) and (25,A). But nothing orders these points relative to each other, causing them to mix when merged. Meanwhile, neighbor-based algorithms can account for this issue since the causality chain of each object is preserved.
The above is a baby example, but in the more general case, imagine if two users wanted to insert a different sentence between two existing sentences. If one of the users happened to be offline, they shouldn't come back to a garbled mess! Clearly, to preserve intent, one sentence should follow the other.
Am I missing something in my reading of the paper, or is this an inherent downside to Logoot?
(Also, why is there a recorded clock value that's seemingly unused in the algorithm? The paper even points out that each object's identifier is necessarily unique without the clock.)
You're correct, this a real anomaly in Logoot and LSEQ. Whether or not it constitutes a intention violation depends on what your definition of intention is. An extension to the definition requiring that contiguous sequences remain contiguous unless they are split by a casually subsequent operation would make intuitive sense.
The clock is unnecessary. Most likely the authors used the (site, clock) pair or Lamport timestamp as their UUIDs out of convention. One site can never create two identical positions, so clocks will never need to be compared. (Assuming messages are received from a site in order, which is required for other aspects of Logoot/LSEQ too.)

Algorithms and the Strategy Design Pattern? Breaking a process into codependent steps

I am designing a factory that creates objects of Maze type. The factory requires three distinct steps to create the mazes and one optional step.
partition (split) the 2d or 3d closed space into Partitions (rooms). This step is called Partitioning.
Check which Partitions' surfaces overlap sufficiently (which rooms share a wall) so that it's possible to (add a door between them and) connect them. This step I call Graphing.
Run a Spanning Tree or a similar algorithm that connects all Partitions as a Tree or a similar connected Graph structure. I call this Linkning.
Optional*: Create the actual doors between Partitions.
I have a class called MazeGenerator that accepts a Partitioning(1), Graphing(2), Linking(3) and possibly a DoorGeneration(4) method.
Alright, the issue is with flexibility and usability:
Some methods of spliting the space, are very rigid (for instance spliting it into a grid) and allow us to use a much faster method in step [2], we may not have to do anything actually cause we know exactly which walls are shared by which rooms and can continue to step [3].
On the other hand, some methods for partitioning are very random and require a lot more work on step [2]. I want to allow users (other devs) to apply the fast method when it is viable but to constraint and convey that they need to use another method when they split the space with a more random method.
The issue is, how do I convey or constraint people that use this API?
So far I thought about adding a protected const array to each method step class called viableNextStep that lists all the classes that describe possible methods that can be used in conjunction with this method.
So for instance, PartitionIntoRigidGrid2D would have
Array viableMethods = [SimpleGridGraph2D, findOverlap2D] // Hope that makes sense

An understandable clusterization

I have a dataset. Each element of this set consists of numerical and categorical variables. Categorical variables are nominal and ordinal.
There is some natural structure in this dataset. Commonly, experts clusterize datasets such as mine using their 'expert knowledge', but I want to automate this process of clusterization.
Most algorithms for clusterization use distance (Euclidean, Mahalanobdis and so on) between objects to group them in clusters. But it is hard to find some reasonable metrics for mixed data types, i.e. we can't find a distance between 'glass' and 'steel'. So I came to the conclusion that I have to use conditional probabilities P(feature = 'something' | Class) and some utility function that depends on them. It is reasonable for categorical variables, and it works fine with numeric variables assuming they are distributed normally.
So it became clear to me that algorithms like K-means will not produce good results.
At this time I try to work with COBWEB algorithm, that fully matches my ideas of using conditional probabilities. But I faced another obsacles: results of clusterization are really hard to interpret, if not impossible. As a result I wanted to get something like a set of rules that describes each cluster (e.g. if feature1 = 'a' and feature2 in [30, 60], it is cluster1), like descision trees for classification.
So, my question is:
Is there any existing clusterization algorithm that works with mixed data type and produces an understandable (and reasonable for humans) description of clusters.
Additional info:
As I understand my task is in the field of conceptual clustering. I can't define a similarity function as it was suggested (it as an ultimate goal of the whoal project), because of the field of study - it is very complicated and mercyless in terms of formalization. As far as I understand the most reasonable approach is the one used in COBWEB, but I'm not sure how to adapt it, so I can get an undestandable description of clusters.
Decision Tree
As it was suggested, I tried to train a decision tree on the clustering output, thus getting a description of clusters as a set of rules. But unfortunately interpretation of this rules is almost as hard as with the raw clustering output. First of only a few first levels of rules from the root node do make any sense: closer to the leaf - less sense we have. Secondly, these rules doesn't match any expert knowledge.
So, I came to the conclusion that clustering is a black-box, and it worth not trying to interpret its results.
Also
I had an interesting idea to modify a 'decision tree for regression' algorithm in a certain way: istead of calculating an intra-group variance calcualte a category utility function and use it as a split criterion. As a result we should have a decision tree with leafs-clusters and clusters description out of the box. But I haven't tried to do so, and I am not sure about accuracy and everything else.
For most algorithms, you will need to define similarity. It doesn't need to be a proper distance function (e.g. satisfy triangle inequality).
K-means is particularly bad, because it also needs to compute means. So it's better to stay away from it if you cannot compute means, or are using a different distance function than Euclidean.
However, consider defining a distance function that captures your domain knowledge of similarity. It can be composed of other distance functions, say you use the harmonic mean of the Euclidean distance (maybe weighted with some scaling factor) and a categorial similarity function.
Once you have a decent similarity function, a whole bunch of algorithms will become available to you. e.g. DBSCAN (Wikipedia) or OPTICS (Wikipedia). ELKI may be of interest to you, they have a Tutorial on writing custom distance functions.
Interpretation is a separate thing. Unfortunately, few clustering algorithms will give you a human-readable interpretation of what they found. They may give you things such as a representative (e.g. the mean of a cluster in k-means), but little more. But of course you could next train a decision tree on the clustering output and try to interpret the decision tree learned from the clustering. Because the one really nice feature about decision trees, is that they are somewhat human understandable. But just like a Support Vector Machine will not give you an explanation, most (if not all) clustering algorithms will not do that either, sorry, unless you do this kind of post-processing. Plus, it will actually work with any clustering algorithm, which is a nice property if you want to compare multiple algorithms.
There was a related publication last year. It is a bit obscure and experimental (on a workshop at ECML-PKDD), and requires the data set to have a quite extensive ground truth in form of rankings. In the example, they used color similarity rankings and some labels. The key idea is to analyze the cluster and find the best explanation using the given ground truth(s). They were trying to use it to e.g. say "this cluster found is largely based on this particular shade of green, so it is not very interesting, but the other cluster cannot be explained very well, you need to investigate it closer - maybe the algorithm discovered something new here". But it was very experimental (Workshops are for work-in-progress type of research). You might be able to use this, by just using your features as ground truth. It should then detect if a cluster can be easily explained by things such as "attribute5 is approx. 0.4 with low variance". But it will not forcibly create such an explanation!
H.-P. Kriegel, E. Schubert, A. Zimek
Evaluation of Multiple Clustering Solutions
In 2nd MultiClust Workshop: Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with ECML PKDD 2011. http://dme.rwth-aachen.de/en/MultiClust2011
A common approach to solve this type of clustering problem is to define a statistical model that captures relevant characteristics of your data. Cluster assignments can be derived by using a mixture model (as in the Gaussian Mixture Model) then finding the mixture component with the highest probability for a particular data point.
In your case, each example is a vector has both real and categorical components. A simple approach is to model each component of the vector separately.
I generated a small example dataset where each example is a vector of two dimensions. The first dimension is a normally distributed variable and the second is a choice of five categories (see graph):
There are a number of frameworks that are available to run monte carlo inference for statistical models. BUGS is probably the most popular (http://www.mrc-bsu.cam.ac.uk/bugs/). I created this model in Stan (http://mc-stan.org/), which uses a different sampling technique than BUGs and is more efficient for many problems:
data {
int<lower=0> N; //number of data points
int<lower=0> C; //number of categories
real x[N]; // normally distributed component data
int y[N]; // categorical component data
}
parameters {
real<lower=0,upper=1> theta; // mixture probability
real mu[2]; // means for the normal component
simplex[C] phi[2]; // categorical distributions for the categorical component
}
transformed parameters {
real log_theta;
real log_one_minus_theta;
vector[C] log_phi[2];
vector[C] alpha;
log_theta <- log(theta);
log_one_minus_theta <- log(1.0 - theta);
for( c in 1:C)
alpha[c] <- .5;
for( k in 1:2)
for( c in 1:C)
log_phi[k,c] <- log(phi[k,c]);
}
model {
theta ~ uniform(0,1); // equivalently, ~ beta(1,1);
for (k in 1:2){
mu[k] ~ normal(0,10);
phi[k] ~ dirichlet(alpha);
}
for (n in 1:N) {
lp__ <- lp__ + log_sum_exp(log_theta + normal_log(x[n],mu[1],1) + log_phi[1,y[n]],
log_one_minus_theta + normal_log(x[n],mu[2],1) + log_phi[2,y[n]]);
}
}
I compiled and ran the Stan model and used the parameters from the final sample to compute the probability of each datapoint under each mixture component. I then assigned each datapoint to the mixture component (cluster) with higher probability to recover the cluster assignments below:
Basically, the parameters for each mixture component will give you the core characteristics of each cluster if you have created a model appropriate for your dataset.
For heterogenous, non-Euclidean data vectors as you describe, hierarchical clustering algorithms often work best. The conditional probability condition you describe can be incorporated as an ordering of attributes used to perform cluster agglomeration or division. The semantics of the resulting clusters are easy to describe.

Cellular Automata update rules Mathematica

I am trying to built some rules about cellular automata. In every cell I don't have one element (predator/prey) I have a number of population. To achieve movement between my population can I compare every cell with one of its neighbours every time? or do I have to compare the cell with all of its neighbours and add some conditions.
I am using Moore neighbourhood with the following update function
update[site,_,_,_,_,_,_,_,_]
I tried to make them move according to all of their neighbours but it's very complicated and I am wondering if by simplify it and check it with all of its neighbours individually it will be wrong.
Thanks
As a general advice I would recommend against going down the pattern recognition technique in Mathematica for specifying the rule table in CA, they tend to get out of hand very quickly.
Doing a predator-prey kind of simulation with CA is a little tricky since in each step, (unlike in traditional CA) the value of the center cell changes ALONG WITH THE VALUE OF THE NEIGHBOR CELL!
This will lead to issues since when the transition function is applied on the neighbor cell it will again compute a new value for itself, but it also needs to "remember" the changes done to it previously when it was a neighbor.
In fluid dynamics simulations using CA, they encounter problems like this and they use a different neighborhood called Margolus neighborhood. In a Margolus neighborhood, the CA lattice is broken into distinct blocks and the update rule applied on each block. In the next step the block boundaries are changed and the transition rules applied on the new boundary, hence information transfer occurs across block boundaries.
Before I give my best attempted answer, you have to realize that your question is oddly phrased. The update rules of your cellular automaton are specified by you, so I wouldn't know whether you have additional conditions you need to implement.
I think you're asking what is the best way to select a neighborhood, and you can do this with Part:
(* We abbreviate 'nbhd' for neighborhood *)
getNbhd[A_, i_Integer?Positive, j_Integer?Positive] :=
A[[i - 1 ;; i + 1, j - 1 ;; j + 1]];
This will select the appropriate Moore neighborhood, including the additional central cell, which you can filter out when you call your update function.
Specifically, to perform the update step of a cellular automaton, one must update all cells simultaneously. In the real world, this means creating a separate array, and placing the updated values there, scrapping the original array afterward.
For more details, see my blog post on Cellular Automata, which includes an implementation of Conway's Game of Life in Mathematica.

Resources