vertex video action recognition - predict timeframes instead of frames - google-cloud-vertex-ai

I'm trying to train Vertex to identify actions. My problem is that whenever I run a prediction, I only get a single timeframe in response, e.g.
{
"id":"4286686324175405056",
"displayName":"SITTING",
"timeSegmentStart":"288s",
"timeSegmentEnd":"288s",
"confidence":0.99240303
}
while I was trying to train the model that SITTING starts when the person is standing near a chair and ends when they are seated. It looks like Vertex only knows how to identify the final "seated" state and not the duration of the action itself (standing, bending, sitting). I would expect timeSegmentStart to differ from timeSegmentEnd, but they are always identical.

Related

Godot : How to instantiate a scene from a list of possible scenes

I am trying to create a game in which procedural generation will be like The Binding of Isaac one : successive rooms selected from a list. I think I will be able to link them together and all, but I have a problem : How do I choose a room from a list ?
My first thought is to create folders containing scenes, something like
zone_1
    basic_rooms
        room_1.tscn
        room_2.tscn
    special_rooms
        ...
zone_2
    ...
and to select a random scene from the folder I need, for example a random basic room from the first zone would be a random scene from zone_1/basic_rooms.
The problem is that I have no idea if this is a good solution, as it will create lots of scenes, and I don't know how to do this properly. Do I simply use a string containing the folder path, or are there better ways? Then I suppose I get all the files in the folder, choose one randomly, load it and instantiate it, but again, I'm not sure.
I think I got a little lost in my explanations, but to summarize, I am searching for a way to select a room layout from a list, and I don't know how to do it.
What you suggest would work.
You can instance scene by this pattern:
var room_scene = load("res://zone/room_type/room_1.tscn")
var room_instance = room_scene.instance()
parent.add_child(room_instance)
I'll also remind you to give a position to the room_instance.
So, as you said, you can build the string you pass to load.
I suggest putting that logic in an autoload and calling it where you need it.
However, the above code will stop the game while it is loading the scene. Instead do Background Loading with ResourceLoader.
First you need to call load_interactive which will give you a ResourceInteractiveLoader object:
loader = ResourceLoader.load_interactive(path)
Then you need to call poll on the loader until it returns ERR_FILE_EOF, at which point you can get the scene with get_resource:
if loader.poll() == ERR_FILE_EOF:
    scene = loader.get_resource()
Otherwise, it means that call to poll wasn't enough to finish loading.
The idea is to spread the calls to poll across multiple frames (e.g. by calling it from _process).
You can call get_stage_count to get the number of times you need to call poll, and get_stage will tell you how many times you have called it so far.
Thus, you can use them to compute the progress:
var progress = float(loader.get_stage()) / loader.get_stage_count()
That gives you a value from 0 to 1. Where 0 is not loaded at all, and 1 is done. Multiply by 100 to get a percentage to display. You may also use it for a progress bar.
"The problem is that I have no idea if this is a good solution as it will create lots of scenes"
This is not a problem.
"Do I simply use a string containing the folder path"
Yes.
"Then I suppose I get all the files in the folder, choose one randomly"
Not necessarily.
You can make sure that all the scenes in the folder have the same name, except for a number, then you only need to know how many scenes are in the folder, and pick a number.
However, you may not want full randomness. Depending on your approach to generate the rooms, you may want to:
Pick the room based on the connections it has, to make sure it connects to adjacent rooms.
Have weights for how common or rare a room should be.
Thus, it would be useful to have a file with that information (e.g. a json or a csv file). Then your autoload code responsible for loading scenes would load that file into a data structure (e.g. a dictionary or an array), from which it can pick what scene to load, considering any weights or constraints specified there.
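As a rough illustration of such a file (written in Python rather than GDScript so that it runs on its own), the sketch below parses a small JSON table and groups the rooms by zone and type. The field names path, zone, type and weight, and the special room entry, are assumptions made up for this example; nothing in Godot prescribes them.

```python
import json
from collections import defaultdict

# Hypothetical room table; the field names and entries are invented for this sketch.
ROOMS_JSON = """
[
  {"path": "res://zone_1/basic_rooms/room_1.tscn", "zone": 1, "type": "basic", "weight": 3},
  {"path": "res://zone_1/basic_rooms/room_2.tscn", "zone": 1, "type": "basic", "weight": 1},
  {"path": "res://zone_1/special_rooms/room_1.tscn", "zone": 1, "type": "special", "weight": 1}
]
"""

def load_room_table(text):
    """Group room entries by (zone, type) so a picker can look them up quickly."""
    table = defaultdict(list)
    for entry in json.loads(text):
        table[(entry["zone"], entry["type"])].append(entry)
    return dict(table)

room_table = load_room_table(ROOMS_JSON)
basic_rooms = room_table[(1, "basic")]  # candidates for a basic room in zone 1
```

Keeping the table in a data file means rooms can be added or reweighted without touching the loading code.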
I will assume that your rooms exist on a grid, and can have doors for NORTH, SOUTH, EAST, WEST. I will also assume that the player can backtrack, so the layout must be persistent.
I don't know how far ahead you will generate. You can choose to generate all the map at once, or generate rooms as the player attempt to enter, or generate a few rooms ahead.
If you are going to generate as the player attempts to enter, you will want a room transition animation during which you can hide the scene loading (with the Background Loading approach).
However, you should not generate a room that has already been generated. Thus, keep a literal grid (an array) where you can store if a room has been generated. You would first check the grid (the array), and if it has been generated, there is nothing to do. But if it hasn't, then you need to pick a room at random.
But wait! If you are entering - for example - from the south, the room you pick must have a south door to go back. If you organize the rooms by the doors they have, then you can pick from the rooms that have south doors - in this example.
In fact, you need to consider the doors of any neighbor rooms you have already generated. Thus, store in the grid (the array) what doors each generated room has, so you can later read from the array to see what doors the new room needs. If there is no room, decide at random if you want a door there. Then pick a room at random from the set that has those doors.
Your sets of rooms would be, the combinations of NORTH, SOUTH, EAST, WEST. A way to generate the list, is to give each direction a power of two. For example:
NORTH = 1
SOUTH = 2
EAST = 4
WEST = 8
Then to figure out the sets, you can count, and the binary representation gives the doors. For example 10 = 8 + 2 -> WEST and SOUTH.
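To make the counting concrete, here is a small Python sketch (Python is used only so the example runs on its own) that decodes a door bitmask back into door names. The helper name doors_in is made up for this illustration.

```python
NORTH, SOUTH, EAST, WEST = 1, 2, 4, 8
DOOR_NAMES = {NORTH: "NORTH", SOUTH: "SOUTH", EAST: "EAST", WEST: "WEST"}

def doors_in(mask):
    """Return the names of the doors encoded in a bitmask, in N, S, E, W order."""
    return [name for flag, name in DOOR_NAMES.items() if mask & flag]

# Counting from 0 to 15 enumerates all 16 door combinations.
all_sets = {mask: doors_in(mask) for mask in range(16)}
```

With this, doors_in(10) yields SOUTH and WEST, matching the example above.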
Those are your sets of rooms. To reiterate, look at the already generated neighbors for doors going into the room you are going to generate. If there is no room, decide at random if you want a door there. That should tell you from what set of rooms you need to pick to generate.
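The neighbor check could look roughly like the following Python sketch. It assumes the grid is a dictionary from (x, y) to the door bitmask of an already generated room, that y grows towards the south, and that an unexplored side gets a door with a 50% chance; all three are choices made for this illustration.

```python
import random

NORTH, SOUTH, EAST, WEST = 1, 2, 4, 8

# For each side of the new room: the grid offset of the neighbor on that side,
# and the door that neighbor needs for the two rooms to connect.
SIDES = {
    NORTH: ((0, -1), SOUTH),
    SOUTH: ((0, 1), NORTH),
    EAST: ((1, 0), WEST),
    WEST: ((-1, 0), EAST),
}

def required_doors(grid, x, y, rng=random):
    """Door bitmask for a new room at (x, y), given the rooms generated so far."""
    mask = 0
    for side, ((dx, dy), opposite) in SIDES.items():
        neighbor = grid.get((x + dx, y + dy))
        if neighbor is None:
            # No room there yet: decide at random whether to put a door.
            if rng.random() < 0.5:
                mask |= side
        elif neighbor & opposite:
            # The neighbor has a door facing this room, so it must be matched.
            mask |= side
    return mask
```

The returned mask is the key of the room set to pick from. After picking, the same mask would be stored in the grid at (x, y).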
This is similar to the approach auto-tile solutions use. You may want to read how that works.
Now assuming the rooms in the set have weights (so some rooms are more common and others are rarer), and you need to pick at random.
This is the general algorithm:
Sum the weights.
Normalize the weights (Divide the weights by the sum, so they add up to 1).
Accumulate the normalized weights.
Generate a random number from 0 to 1, and find the first accumulated normalized weight that is greater than that random number. The option at that position is the one you pick.
Since, presumably, you will be picking rooms from the same set multiple times, you can calculate and store the accumulated normalized weights (let us call them final weights), so you don't compute them every time.
You can compute them like this:
var total_weight:float = 0.0
for option in options:
    total_weight = total_weight + option.weight

var final_weight:float = 0.0
var final_weights:Array = []
for option in options:
    var normalized_weight = option.weight / total_weight
    final_weight = final_weight + normalized_weight
    final_weights.append(final_weight)
Then you can pick like this:
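var randomic:float = randf()
for index in final_weights.size():
    if final_weights[index] > randomic:
        return options[index]
# Rounding can leave the last final weight slightly below 1.0, so fall back to the last option.
return options[options.size() - 1]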
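For comparison, the same precompute-then-pick idea can be sketched in Python, where a binary search replaces the linear scan. The function names and the example weights are made up for this illustration.

```python
import bisect
import itertools
import random

def build_final_weights(weights):
    """Accumulate the normalized weights; the last entry is (about) 1.0."""
    total = sum(weights)
    return list(itertools.accumulate(w / total for w in weights))

def pick_index(final_weights, r):
    """Index of the first final weight greater than r, for r in [0, 1)."""
    index = bisect.bisect_right(final_weights, r)
    # Guard against rounding leaving the last final weight slightly below 1.
    return min(index, len(final_weights) - 1)

weights = [1, 1, 2]  # hypothetical weights for three rooms
final_weights = build_final_weights(weights)
choice = pick_index(final_weights, random.random())
```

In plain Python, random.choices with its weights argument does the same job; the manual version is shown because it mirrors the steps listed above.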
Once you have picked what room to generate, you can load it (e.g. with the Background Loading approach), instance it, and add it to the scene tree. Remember to give a position in the world.
Also remember to update the grid (the array) information. You picked a room from a set that has certain doors. You want to store that so you can take it into account when generating the adjacent rooms.
And, by the way, if you need large scale path-finding (for something going from a room to another), you can use that grid too.

What algorithm and data structure would fit the use case of overlapping traffic on a roadway

I have a problem where I have a road that has multiple entry points and exits. I am trying to model it so that traffic can flow into an entry and go out the exit. The entry points also act as exits. All the entrypoints are labelled 1 to 10 (i.e. we have 10 entry and exits).
A car is allowed to enter and exit at any point; however, the entry always has a lower number than the exit. For example, a car that enters at 3 can go to 8, but it cannot go from 3 to 3 or from 8 to 3.
After every second the car moves one unit on the road. So from above example the car goes from 3 to 4 after one second. I want to continuously accept cars at different entrypoints and update their positions after each second. However I cannot accept a car at an entry if there is already one present at that location.
All cars travel at the same speed of 1 unit per second, are all the same size, and occupy just the space at the point they are in. Once a car reaches its destination, it is removed from the road.
For all new cars that come into the entrypoint and are waiting, we need to assign a waiting time. How would that work? For example it needs to account for when it is able to find a slot where it can be put on the road.
Is there an algorithm that this problem fits into?
What data structure would I model this in - for example for each entrypoints, I was thinking something like a queue or like an ordered map and for the road, maybe a linkedlist?
Besides a top-down master algorithm that decides what each car does and when, there is another approach that uses agents that interact with their environment and with each other through a limited set of simple rules. This often gives rise to complex behaviors. You could code simple rules into car objects to define these interactions.
Maybe something like this:
Emergent behavior algorithm:
a car moves forward if there is no car directly in front of it.
a car merges into a lane if there is no car right beside it (and maybe none behind that slot either).
a car progresses towards its destination, and removes itself when the destination is reached.
Proposed data structure:
The data structure could be an indexed collection of "slots" along which a car moves towards a destination.
Two data structures could intersect at a tuple of index values for each.
Roads with 2 or more lanes could be modeled with coupled data structures...
Optimal numbers:
Determining the max road use and the min time to destination would require running the simulation several times, with varying parameters for the number of cars, and maybe variations of the rules.
A more elaborate approach would use continuous space on the road instead of discrete slots.
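A minimal Python sketch of the discrete-slot idea, applied to the single road from the question, might look like this. It assumes one FIFO queue per entry point, that existing cars move before waiting cars are admitted, and that a car's waiting time is the number of ticks it spent queued; the class and method names are invented for the example.

```python
from collections import deque

class Road:
    """Discrete road with numbered points; every car moves one point per tick."""

    def __init__(self, points=10):
        self.points = points
        self.slots = {}  # position -> destination of the car occupying it
        self.waiting = {p: deque() for p in range(1, points + 1)}
        self.clock = 0

    def arrive(self, entry, exit_point):
        """Queue a car at `entry`; it joins the road once that slot is free."""
        if not 1 <= entry < exit_point <= self.points:
            raise ValueError("entry must be lower than exit")
        self.waiting[entry].append((exit_point, self.clock))

    def tick(self):
        """Advance one second; return the waiting times of cars admitted now."""
        moved = {}
        for position, destination in self.slots.items():
            if position + 1 < destination:
                moved[position + 1] = destination
            # Otherwise the car has reached its destination and leaves the road.
        self.slots = moved
        admitted = []
        for entry, queue in self.waiting.items():
            if queue and entry not in self.slots:
                destination, arrived_at = queue.popleft()
                self.slots[entry] = destination
                admitted.append(self.clock - arrived_at)
        self.clock += 1
        return admitted
```

Because all cars move at the same speed, cars already on the road never collide, so the only conflict to resolve is at the entry slots.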
I can suggest a Directed Acyclic Graph (DAG) which will store each entry point as a node.
The problem of moving from one point to another can be thought of as a graph-flow problem, which has a number of algorithms for determining movement in a graph.

using roiRoads function in Veins

I have a mobility model created with SUMO that covers an area of about 2 km x 2 km of a real map.
I want to compute the results for only part of this model. I read that I can use roiRoads or roiRects.
roiRects takes (x1,y1-x2,y2) as TraCI coordinates; however, I want to use roiRoads to select exactly the cars on a specific road.
My question is: if roiRoads takes a string with the road name, where in SUMO can I get this value?
Should I build the map again with netconvert, using --output-street-names?
Edges in SUMO always have an ID. It is stored in the id="..." attribute of the <edge> tag. If you convert a network from some other data format (say, OpenStreetMap) to SUMO's XML representation, you have the option to try and use an ID that closely resembles the road name the edge represents (this is the option you mentioned). The default is to allocate a numeric ID.
Other than by opening the road network XML file in a text editor, you can also find the edge ID by opening the network in the SUMO GUI and right clicking on the edge (or by enabling the rendering of edge IDs in the GUI).
Note that, depending on the application you simulate, you will need to make sure that you have no "gaps" in the Regions Of Interest (ROIs) you specify. When a vehicle is no longer in the ROI its corresponding node is removed from the network simulation. Even if the same vehicle later enters another (or the same) ROI, a brand new node will be created. This is particularly important when specifying edges as ROI (via the roiRoads parameter). Keep in mind that SUMO uses edges not just to represent streets, but also to represent lanes crossing intersections. If you do not specify these internal edges, your ROIs will have small gaps at every intersection.
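If the network is large, the edge IDs can also be listed with a short script instead of reading the XML by hand. The Python sketch below is only an illustration: it assumes that internal (intersection) edges carry function="internal" in the network file, and it uses a tiny inline stand-in for a real .net.xml file, with made-up edge IDs and from/to values.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a SUMO network file. The IDs and junction names are invented
# for this sketch; a real file would be read with ET.parse(path).getroot() instead.
SAMPLE_NET = """
<net>
  <edge id=":252726232_7" function="internal"/>
  <edge id="-5445204#1" from="a" to="b"/>
  <edge id="-5445204#2" from="b" to="c"/>
</net>
"""

def edge_ids(net_xml):
    """Split edge IDs into regular edges and internal (intersection) edges."""
    regular, internal = [], []
    for edge in ET.fromstring(net_xml).iter("edge"):
        target = internal if edge.get("function") == "internal" else regular
        target.append(edge.get("id"))
    return regular, internal

regular, internal = edge_ids(SAMPLE_NET)
```

Listing the internal edges separately makes it easier to check that a region of interest has no gaps at intersections.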
Note also that up until OMNeT++ 5.0, syntax highlighting of the .ini file in the IDE will (mistakenly) display a string containing a # character as if it were a comment. This is just a problem with the syntax highlighting. The simulation will behave as expected. For example, setting the roiRoads parameter to "-5445204#1 :252726232_7 -5445204#2" in the Veins 4.4 example as follows...
...will result in a Veins simulation where only cars on one of the following three edges are simulated:
on the edge leading to the below intersection; or
on the edge crossing the below intersection; or
on the edge leaving the below intersection.

Algorithm for animating elements running across a scene

I'm not sure if the title is right but...
I want to animate (with html + canvas + javascript) a section of a road with a given density/flow/speed configuration. For that, I need to have a "source" of vehicles at one end and a "sink" at the other end. Then, a certain parameter would determine how many vehicles per time unit are created, and their (constant) speed. Then, I guess I should have a "clock" loop to increment the position of the vehicles at a given frame rate. Preferably, a user could change some values in a form, and the running animation would update accordingly.
The end result should be a (much more sophisticated, hopefully) variation of this (sorry for the blinking):
Actually this is a very common problem: there are thousands of screen-savers that use this effect, most notably the "star field", which has parameters for star generation and star movement. So I believe there must be some "design pattern", or a more widespread form (maybe even a name), for this algorithm. What would solve my problem is an example or tutorial on how to achieve this with common control flows (loops, counters, ifs).
Any idea is much appreciated!
I'm not sure about your question; this doesn't seem like an algorithm question, more like a request for programming advice. I have a game which needs exactly this (for monsters, not cars), and this is what I did. It is in a sort of .Net pseudocode, but similar stuff exists in other environments.
If you are running an animation by hand, you essentially need a "game-loop".
timeprevious = getsystemtime();
noinput = true;
while (noinput):
    timenow = getsystemtime();
    timedelta = timenow - timeprevious;
    update_object_positions(timedelta);
    draw_stuff_to_screen();
    timeprevious = timenow;
    noinput = check_for_input();
The update_object_positions(timedelta) call moves everything along by timedelta, which is how long it has been since this loop last executed. It will run flat-out, redrawing every timedelta. If you want it to run at a constant rate, say once every 20 ms, you can stick in a thread.sleep(20-timedelta) to pad out the time to 20 ms.
Returning to your question. I had a car class that included its speed, lane, type etc. as well as the time it appears. I had a finite number of "cars", so these were pre-generated. I held these in a list which I sorted by the time they appeared. Then in the update_object_positions(timedelta) routine, I checked whether the next car had a start time before the current time, and if so I popped cars off the list until the first (next) car had a start time in the future.
You want (I guess) an infinite number of cars. This requires only a slight variation. Generate the first car for each lane and record its start time. When you call update_object_positions(), if you start a car, generate the next car for that lane along with its start time, and make that the next car. If you have patterns that you want to repeat, generate the whole pattern in one go into a list, and then generate a new pattern when that list is emptied. This would also work well in terms of letting users specify variable pattern flows.
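The "generate the next car when the current one starts" idea can be sketched as follows. This is a Python illustration rather than canvas code, and it assumes a single lane, constant speed, and exponentially distributed gaps between cars; the class, its parameters and the simulate helper are invented for the example.

```python
import random

class Lane:
    """Cars enter at position 0 (the source) and are dropped past `length` (the sink)."""

    def __init__(self, length, speed, mean_gap, rng):
        self.length, self.speed, self.mean_gap = length, speed, mean_gap
        self.rng = rng
        self.cars = []                 # positions of the cars on the lane
        self.next_start = self._gap()  # start time of the next car

    def _gap(self):
        # Exponential gaps give a Poisson-like flow; any distribution would do.
        return self.rng.expovariate(1.0 / self.mean_gap)

    def update(self, now, dt):
        # Move the existing cars, then remove those that reached the sink.
        self.cars = [x + self.speed * dt for x in self.cars]
        self.cars = [x for x in self.cars if x <= self.length]
        # Start every car whose start time has passed and schedule the next one.
        while self.next_start <= now:
            self.cars.append(self.speed * (now - self.next_start))
            self.next_start += self._gap()

def simulate(lane, duration, dt):
    """Fixed-step stand-in for the game loop; returns the car count per frame."""
    counts = []
    for step in range(1, int(round(duration / dt)) + 1):
        lane.update(step * dt, dt)
        counts.append(len(lane.cars))
    return counts
```

Since speed and mean_gap are plain attributes read on every update, values changed from a form would take effect on the next frame.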
Finally, have you looked at what happens in real traffic flows as the volume mounts? Random small braking activities cause the cars behind to slightly over-react, and as the slight over-reactions accumulate, it turns into cars completely stopping a kilometre back up the road. It's quite strange, and so it might be a great effect in your wallpaper/screensaver or whatever, as well as being a proper simulation.

Confusion with neural networks in MATLAB

I'm working on character recognition (and later fingerprint recognition) using neural networks. I'm getting confused with the sequence of events. I'm training the net with 26 letters. Later I will increase this to include 26 clean letters and 26 noisy letters. If I want to recognize one letter say "A", what is the right way to do this? Here is what I'm doing now.
1) Train network with a 26x100 matrix; each row contains a letter from segmentation of the bmp (10x10).
2) However, for the test targets I use my input matrix for "A". I had 25 rows of zeros after the first row so that my input matrix is the same size as my target matrix.
3) I run perform(net, testTargets,outputs) where outputs are the outputs from the net trained with the 26x100 matrix. testTargets is the matrix for "A".
This doesn't seem right though. Is training supposed to be separate from recognizing any character? What I want to happen is as follows.
1) Training the network for an image file that I select (after processing the image into logical arrays).
2) Use this trained network to recognize letter in a different image file.
So train the network to recognize A through Z. Then pick an image, run the network to see what letters are recognized from the picked image.
Okay, so the question here seems to be more along the lines of "How do I neural networks". I can outline the basic procedure here to try to solidify the idea in your mind, but as far as actually implementing it goes you're on your own. Personally I believe that proprietary languages (MATLAB) are an abomination, but I always appreciate intellectual zeal.
The basic concept of a neural net is that you have a series of nodes in layers, with weights that connect them (depending on what you want to do you can either just connect each node to the layers above and beneath, or connect every node, or anywhere in between). Each node has a "work function", a probabilistic function that represents the chance that the given node, or neuron, will evaluate to "on" or 1.
The general workflow starts from whatever top layer neurons/nodes you've got, initializing them to the values of your data (in your case, you would probably start each of these off as the pixel values in your image; normalizing them to be binary would be simplest). Each of those node values would then be multiplied by a weight and fed down to your second layer, which would be considered a "hidden layer". The sum of those weighted values (either a geometric or an arithmetic sum, depending on your implementation) would then be used with the work function to determine the state of each node in your hidden layer.
That last point was a little theoretical and hard to follow, so here's an example. Imagine your first row has three nodes ([1,0,1]), and the weights connecting the three of those nodes to the first node in your second layer are something like ([0.5, 2.0, 0.6]). If you're doing an arithmetic sum that means that the weighting on the first node in your "hidden layer" would be
1*0.5 + 0*2.0 + 1*0.6 = 1.1
If you're using a logistic function as your work function (a very common choice, though tanh is also common) this would make the chance of that node evaluating to 1 approximately 75%.
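The arithmetic can be checked with a few lines of Python (used here only as a neutral calculator; the variable names are made up):

```python
import math

def logistic(x):
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

inputs = [1, 0, 1]
weights = [0.5, 2.0, 0.6]

# Arithmetic weighted sum feeding the first hidden node: about 1.1
weighted_sum = sum(i * w for i, w in zip(inputs, weights))

# Chance of that node evaluating to 1: about 0.75
activation = logistic(weighted_sum)
```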
You would probably want your final layer to have 26 nodes, one for each letter, but you could add in more hidden layers to improve your model. You would assume that the letter your model predicted would be the final node with the largest weighting heading in.
After you have that up and running you want to train it though, because you probably just randomly seeded your weights, which makes sense. There are a lot of different methods for this, but I'll generally outline back-propagation, which is a very common method of training neural nets. The idea is essentially this: since you know which character the image should have been recognized as, you compare that to the one that your model actually predicted. If your model accurately predicted the character you're fine; you can leave the model as is, since it worked. If it predicted an incorrect character, you want to go back through your neural net and increment the weights that lead from the pixel nodes you fed in to the ending node for the character that should have been predicted. You should also decrement the weights that led to the character it incorrectly returned.
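The increment/decrement rule described here can be sketched in Python as below. Note the simplification: this version has no hidden layer and no gradients, so it is closer to a multiclass perceptron update than to full back-propagation; the function names and the learning rate are assumptions for this illustration.

```python
def predict(weights, pixels):
    """Index of the output node with the largest weighted sum."""
    scores = [sum(w * p for w, p in zip(row, pixels)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)

def train_step(weights, pixels, target, rate=1.0):
    """Adjust the weights in place if the prediction is wrong; return the prediction."""
    guess = predict(weights, pixels)
    if guess != target:
        for i, p in enumerate(pixels):
            weights[target][i] += rate * p  # strengthen the correct letter
            weights[guess][i] -= rate * p   # weaken the wrongly chosen letter
    return guess
```

For the letters task, weights would hold 26 rows of 100 values (one row per letter, one value per pixel), and train_step would be called repeatedly over the training images until the predictions stop changing.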
Hope that helps, let me know if you have any more questions.

Resources