How to create an artificial network of 3 blocks for analyzing a stochastic blockmodel?

I want to create three graphs using the tie probabilities. In those graphs, Block 1 has 20 nodes, Block 2 has 20 nodes, and Block 3 has 10 nodes. I want to plot each network, with nodes colored by block membership. I tried the code below but was unsuccessful; I suspect the problem is in pref.matrix=kbm. The error says:

Error in structure(as.double(pref.matrix), dim = dim(pref.matrix)) :
  argument "pref.matrix" is missing, with no default
art_sbm <- sample_sbm(50, pref.matrix = kbm, block.sizes = c(20, 20, 10), directed = FALSE)
plot(art_sbm, vertex.color = c(rep(1, 20), rep(2, 20), rep(1, 10)), main = "Artificial Network")
assortativity.nominal(art_sbm, c(rep(1, 20), rep(2, 20), rep(1, 10)))
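The error means kbm is never defined before it is passed as pref.matrix, so sample_sbm receives no preference matrix at all. Below is a minimal sketch of one way to set it up; the 0.30/0.05 tie probabilities are placeholder values chosen for illustration, not taken from the question. Note too that the last block of the color vector should presumably be rep(3, 10), otherwise Block 3 is drawn in Block 1's color.

library(igraph)

# 3x3 tie-probability matrix: entry [i, j] is the probability of an edge
# between a node in block i and a node in block j (illustrative values)
kbm <- matrix(c(0.30, 0.05, 0.05,
                0.05, 0.30, 0.05,
                0.05, 0.05, 0.30),
              nrow = 3, byrow = TRUE)

membership <- c(rep(1, 20), rep(2, 20), rep(3, 10))

art_sbm <- sample_sbm(50, pref.matrix = kbm, block.sizes = c(20, 20, 10), directed = FALSE)
plot(art_sbm, vertex.color = membership, main = "Artificial Network")
assortativity_nominal(art_sbm, membership)  # assortativity.nominal() in older igraph versions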

Related

Efficient flow graph algorithm that routes its flow more evenly

This is a problem I recently came back to after putting it on the back burner for a while: writing a program to calculate the flow rates of some resource through a network of pipes, from resource sources to resource sinks, where each pipe can only pass so much resource per unit of time. This is, of course, the classic network flow problem, and the typical aim is to find a flow pattern that maximizes the flow from the sources to the sinks. I wrote a program that uses the well-known Ford-Fulkerson max-flow method to do this, and while that algorithm certainly does a nice job of finding a flow solution, it doesn't necessarily do a good job of producing one that is particularly "natural" in its flow pattern.
That is to say, consider a graph like the one below.
                 +------------ SINK 1
       0 / 8     |   0 / 5
SOURCE --------- X
                 |   0 / 5
                 +------------ SINK 2
where each pair of numbers represents the current flow rate on that particular edge or "pipe" (here in "units" per second) versus the maximum flow the pipe can support, the "X" is a junction node, and the other labels should be self-explanatory.
When we solve this using F-F (which requires us to temporarily add an "aggregate sink" node that ties the two sinks on the right together), we find the max flow rate is indeed 8 U/s, which is obvious by simple inspection for such a simple graph. However, the flow pattern it gives may look like either
                 +------------ SINK 1
       8 / 8     |   5 / 5
SOURCE --------- X
                 |   3 / 5
                 +------------ SINK 2
or
                 +------------ SINK 1
       8 / 8     |   3 / 5
SOURCE --------- X
                 |   5 / 5
                 +------------ SINK 2
depending on the order in which it encounters the edges during the depth-first walk used in the calculation. The trouble is that, besides being nondeterministic, that flow doesn't "feel natural" in a certain sense: intuitively, if we imagine pushing a fluid, we'd expect 4 U/s of flow to go to sink 1 and another 4 U/s to go to sink 2, by symmetry. Indeed, if we shrink the capacity of the edge leading out of the source to 5, the Ford-Fulkerson algorithm will starve one sink entirely, and that is also behavior I'd like to avoid: if there's not enough flow to keep everybody as happy as they'd like to be, then at least try to distribute it as evenly as possible. In this case, that means that if the max flow is, say, 80% of the flow needed to fully satiate all the sinks, then each sink should get 80% of its demand, unless there's a constriction somewhere in the graph that prevents sending even that much to some sink, in which case the excess flow should back up and go to the other sinks while that one still gets the maximum it can get.
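As a point of reference, the max-flow value itself is easy to verify with an off-the-shelf routine. A minimal sketch in R with the igraph package (the aggregate-sink trick is done by hand; the node names and capacities follow the example above):

library(igraph)

# Example network, with a temporary aggregate sink AGG tying the sinks together
edges <- data.frame(
  from     = c("SOURCE", "X",     "X",     "SINK1", "SINK2"),
  to       = c("X",      "SINK1", "SINK2", "AGG",   "AGG"),
  capacity = c(8, 5, 5, 5, 5)
)
g <- graph_from_data_frame(edges, directed = TRUE)

mf <- max_flow(g, source = "SOURCE", target = "AGG", capacity = E(g)$capacity)
mf$value  # 8, as expected; mf$flow is one (arbitrary) flow pattern per edge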
So my question is: what sort of algorithm has either this behavior or something similar to it? Put another way, if F-F is a good tool for just finding a maximum flow, what is a good tool for shaping that maximum flow into some "desirable" pattern like this?
One simple solution I thought of is to apply F-F repeatedly: instead of routing from the source to the fictitious aggregate sink, route from the source to each individual sink in turn, giving the max flow that is capable of making it through the constrictions to that sink, and then work out from that how much each sink can actually be fed, based on its demand and the whole-graph max flow. The trouble is, that means running the algorithm as many times as there are sinks, so the time complexity goes up, perhaps too much. Is there a more efficient way to achieve this?
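For what it's worth, that per-sink idea is only a few lines on the example network. A sketch reusing g and max_flow from the snippet above; the fair-share rule at the end is one possible allocation, and the redistribution of blocked excess described earlier would still need an iterative pass on top of it:

# Upper bound on what each sink can individually receive
sinks <- c("SINK1", "SINK2")
per_sink_cap <- sapply(sinks, function(s)
  max_flow(g, source = "SOURCE", target = s, capacity = E(g)$capacity)$value)

total  <- max_flow(g, source = "SOURCE", target = "AGG", capacity = E(g)$capacity)$value
demand <- c(SINK1 = 5, SINK2 = 5)

# Give every sink the same fraction of its demand, capped by its own bottleneck
frac  <- total / sum(demand)                    # 8/10 = 80% here
alloc <- pmin(demand * frac, per_sink_cap[sinks])
alloc  # 4 U/s to each sink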

Given some input, find the optimal weightings for edges of a graph to maximize some output

I have a directed graph that describes the relationship between items (nodes) via recipes (edges).
Simple example of a recipe: 2 Iron Ore => 1 Iron Ingot.
I want to find a weighting for each recipe (that is, the number of times each recipe should be applied) such that, given some starting number of items, it produces the maximum amount of a specified item.
How can I go about finding this weighting for each recipe?
Note: all weightings must be non-negative (they can be fractional), and no weighting can require more input than the amount available.
That's the main problem I'm trying to solve. Following that, the next thing I want to solve is taking energy usage into account: each recipe will either use some amount of energy or produce some amount of energy.
How can I ensure that when finding the weightings, the energy production minus the energy consumption is non-negative?
Thanks in advance for any advice :)
I've managed to solve this by abandoning the graph structure and treating the problem as a linear programming (LP) problem.
The graph structure might still be usable by treating the problem as a max-flow problem, but I haven't looked into that.
Here's a small example of the LP to solve for max iron ingots:
Maximize
iron_ingot: 30 recipe_iron_ingot + 50 recipe_iron_alloy_ingot + 65 recipe_pure_iron_ingot
Subject To
iron_ore: 30 recipe_iron_ingot + 20 recipe_iron_alloy_ingot + 35 recipe_pure_iron_ingot <= 70380
copper_ore: 20 recipe_iron_alloy_ingot <= 28860
Bounds
0 <= recipe_iron_ingot
0 <= recipe_iron_alloy_ingot
0 <= recipe_pure_iron_ingot
End
This represents an input of 70380 iron ore per minute, 28860 copper ore per minute and unlimited water. It analyses 3 different recipes that make iron ingots.
The result being:
0 × iron ingot recipe
1443 × iron alloy ingot recipe
1186.29 × pure iron ingot recipe
which equates to 149258.85 iron ingots per minute.
This isn't yet taking into account power production/consumption, but that should be easy to add by treating power as an input/output, just as items are treated.
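For reference, the same model is straightforward to reproduce in R with the lpSolve package. A sketch of the LP above; the energy note at the end uses made-up net-energy coefficients purely to illustrate the non-negativity constraint mentioned here:

library(lpSolve)

# Objective: iron ingots produced per application of each recipe
# (iron_ingot, iron_alloy_ingot, pure_iron_ingot)
obj <- c(30, 50, 65)

# Resource usage per application (rows) vs. available input per minute
con <- rbind(
  iron_ore   = c(30, 20, 35),
  copper_ore = c( 0, 20,  0)
)
rhs <- c(70380, 28860)

sol <- lp("max", obj, con, c("<=", "<="), rhs)
sol$solution  # 0, 1443, ~1186.29 applications of each recipe
sol$objval    # ~149258.57 iron ingots per minute

# Energy would be one more row, e.g. hypothetical net-energy coefficients
# c(-4, 2, -10) with direction ">=" and right-hand side 0, so that total
# production minus consumption stays non-negative.

(The exact optimum is 149258.57; the 149258.85 quoted above comes from rounding 1186.2857 to 1186.29 before multiplying.)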

Specifying starting point for graph traversal algorithm in neo4j

I'm trying to write an algorithm which will propagate values from a starting node to the entire connected component. Basically, if A receives 5 requests, and A sends 5 requests to B for each request A receives, B will receive 25 requests.
So basically, I'm trying to go from this (before image) to this (after image).
I've written the following snippet in neo4j:
// For every road, add the upstream node's volume scaled by the edge cost
MATCH (a:Loc)-[r:ROAD]->(b:Loc)
SET b.volume = b.volume + a.volume * r.cost
RETURN a, r, b
But what I don't know is how I am supposed to specify a starting point for this algorithm. It appears as if neo4j is updating the values correctly in this case, but I don't think this will work for a larger graph. I want to explicitly make the algorithm start propagating values from the START node.
Thanks.
I'm sure there will be a better answer, and this approach has some limitations since it makes some assumptions about the graph, but it works for your example.
Note that I added an id property to the :Loc nodes, but I only used it to select the start node (and to print the node id at the end).
MATCH p=(n:Loc)<-[:ROAD*]-(:Loc {id: 0})  // every path from the start node to n
WITH DISTINCT n, max(length(p)) as maxLp
ORDER BY maxLp // order the nodes by their maximum distance from start
MATCH (n)<-[r:ROAD]-(p:Loc)               // each node's direct predecessors
SET n.volume = n.volume + r.cost * p.volume
RETURN DISTINCT n.id, n.volume
And here's the result:
n.id n.volume
1 4000
2 200000
3 200000
4 16400000
5 508000000
6 21632000000
The idea here was to get the longest path from the starting node to each node. The nodes are ordered by this "closeness", and the volumes are then updated in order of closeness.
In this case the planner will use the labels to find starting points for the query (you can run EXPLAIN on the query to see the query plan), so it will match all :Loc nodes, expand the pattern, and modify the properties accordingly.
This applies to all :Loc nodes. Is that what you want, or do you only want it to apply to some smaller portion of your graph reachable from some starting node?

Kibana - display statistics (percentage) relative to other graph results

I'm having difficulty displaying the statistics of one graph (in percentage mode) relative to another graph's results.
For example, the following graph displays the total number of write failures:
And the other graph displays the number of write failures of a specific group:
What I want to do is display the second graph's Y-axis in percentages, relative to the results of the first graph.
For example, the number of total write failures in the first graph at "2018-08-09 03:00" is 92,000 (green line).
And the number of write failures, of this specific group in the second graph at "2018-08-09 03:00" is 15,000.
So instead of displaying 15,000 in the second graph, I want to display 16.30 percent:
(15000/92000.0) * 100 = 16.30%
Your help is appreciated.
Thanks.

Find minimum number of moves for Tower of London task

I am looking for a solution to a task similar to the Tower of Hanoi task, though it differs from Hanoi in that the disks are not constrained by size. The Tower of London task I am creating has 8 disks, instead of the traditional 3 or 5 (as shown in the Wikipedia link). I am using PEBL software, which is "programmed primarily in C++ (although you do not need to know C++ to use PEBL), but also uses flex and bison (GNU versions of lex and yacc) to handle parsing."
Here is a video of what the task looks like in action: http://www.youtube.com/watch?v=IiBJ94HRpeM&noredirect=1
Each disk is a number, e.g., blue disk = 1, red disk = 2, etc.
1      \
2  -----\
3  -----/      3   1
4 5    /   2   4   5
=========  =========
The left side consists of the disks you have to move to match the right side. There are 3 columns.
So if I am making it with 8 disks, I would create a trial to look like this:
1          \
2      -----\          7   8
6 3 8  -----/      3   6   1
7 4 5      /       2   4   5
=========          =========
How do I figure out the minimum number of moves needed to make the left side look like the right side? I don't need to use PEBL to code this, but I need to know the minimum since I am calculating how close to it a person gets on each trial.
The principle is simple, and it's called breadth-first search:
Each state has a certain number of successor states (defined by the moves possible).
1. Start out with a set of states that contains only the initial state, and a step number of 0.
2. If the end state is in the set of states, return the step number.
3. Increment the step number.
4. Rebuild the set of states by replacing the current states with each of their successor states.
5. Go to 2.
So, in each step, compute the successor states of your currently available states and check whether you have reached the target state.
BUT, be warned, this can take a while and eat up a lot of memory!
You can optimize a bit in your case, since you can leave out each state's predecessor when generating successors.
Still, you will have 5 possible moves in most states, which means about 5^N states to consider after N steps.
For example, your second example needs 10 moves, if I'm not mistaken, which gives about 10 million states. Most contemporary computers will not be able to search beyond depth 15.
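To make the steps concrete, here is a minimal sketch of the search in R, under simplifying assumptions: three columns of unlimited height, and a move takes the top disk of one column to the top of another (real Tower of London pegs often have height limits, which would only shrink the successor set). The start/goal encodings follow my reading of the first diagram above, bottom of each column first:

# Encode a state (a list of 3 columns) as a string so it can key the visited set
encode <- function(state) paste(sapply(state, paste, collapse = ","), collapse = "|")

# All states reachable in one move: top disk of any column onto any other column
successors <- function(state) {
  out <- list()
  for (from in 1:3) {
    n <- length(state[[from]])
    if (n == 0) next
    for (to in setdiff(1:3, from)) {
      s <- state
      s[[from]] <- s[[from]][-n]
      s[[to]]   <- c(s[[to]], state[[from]][n])
      out[[length(out) + 1]] <- s
    }
  }
  out
}

# Breadth-first search: expand level by level until the goal appears
min_moves <- function(start, goal) {
  frontier <- list(start)
  visited  <- new.env()
  assign(encode(start), TRUE, envir = visited)
  steps <- 0
  repeat {
    if (encode(goal) %in% sapply(frontier, encode)) return(steps)
    steps <- steps + 1
    nxt <- list()
    for (st in frontier) for (s in successors(st)) {
      key <- encode(s)
      if (!exists(key, envir = visited, inherits = FALSE)) {
        assign(key, TRUE, envir = visited)
        nxt[[length(nxt) + 1]] <- s
      }
    }
    frontier <- nxt
  }
}

start <- list(c(4, 3, 2, 1), c(5), integer(0))  # left board of the first example
goal  <- list(c(2), c(4, 3), c(5, 1))           # right board of the first example
min_moves(start, goal)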
I think that an algorithm to find some solution would be easy and fast, but there would be no proof that that solution is the shortest one.
