OCaml: parse JSON into cyclic type - data-structures

This question is related to another question I asked before.
I am reading data from a JSON file and trying to parse it into a datatype I made.
{
"rooms":
[
{
"id": "room1",
"description": "This is Room 1. There is an exit to the north.\nYou should drop the white hat here.",
"items": ["black hat"],
"points": 10,
"exits": [
{
"direction": "north",
"room": "room2"
}
],
"treasure": ["white hat"]
},
{
"id": "room2",
"description": "This is Room 2. There is an exit to the south.\nYou should drop the black hat here.",
"items": [],
"points": 10,
"exits": [
{
"direction": "south",
"room": "room1"
}
],
"treasure": ["black hat"]
}
]
}
My user-defined type for room is:
type room = {
room_id : int ;
room_description : string ;
room_items : item list ;
room_points : int ;
room_exits : exit list ;
room_treasure : item list ;
}
and exit = direction * room
However, room has an "exits" field, whose entries themselves contain values of the "room" type. So when I try to create the record for room1, I first need to define room2, but in order to define room2, I need to know room1. This seems like a cyclic type.
Can anyone help me with this?

If you stick to the immutable and eager subset of OCaml, there's no real way to build up arbitrary cyclic structures. The problem is exactly as you state it.
It's possible to build specific examples of cyclic structures using let rec, but I don't believe this can be extended to building arbitrary structures while (for example) parsing JSON.
You can solve the problem by dropping the requirement for immutable data. If you make the links to other rooms into OCaml references (mutable fields), you can build cyclic structures pretty much as you'd do it in the imperative part of JavaScript.
One way to make this work might be to use an array for room_exits rather than a list. OCaml arrays are mutable.
Here's some code that creates the complete graph over 3 nodes (for a trivial node type that contains only the neighbor nodes):
# type node = { nabes: node array };;
type node = { nabes : node array; }
# type graph = node list;;
type graph = node list
# let z = { nabes = [||] };;
val z : node = {nabes = [||]}
# let temp = Array.init 3 (fun _ -> { nabes = Array.make 2 z});;
val temp : node array =
[|{nabes = [|{nabes = [||]}; {nabes = [||]}|]};
{nabes = [|{nabes = [||]}; {nabes = [||]}|]};
{nabes = [|{nabes = [||]}; {nabes = [||]}|]}|]
# temp.(0).nabes.(0) <- temp.(1);;
- : unit = ()
# temp.(0).nabes.(1) <- temp.(2);;
- : unit = ()
# temp.(1).nabes.(0) <- temp.(0);;
- : unit = ()
# temp.(1).nabes.(1) <- temp.(2);;
- : unit = ()
# temp.(2).nabes.(0) <- temp.(0);;
- : unit = ()
# temp.(2).nabes.(1) <- temp.(1);;
- : unit = ()
# let k3 : graph = Array.to_list temp;;
val k3 : graph =
[{nabes =
[|{nabes = [|<cycle>; {nabes = [|<cycle>; <cycle>|]}|]};
{nabes = [|<cycle>; {nabes = [|<cycle>; <cycle>|]}|]}|]};
{nabes =
[|{nabes = [|<cycle>; {nabes = [|<cycle>; <cycle>|]}|]};
{nabes = [|{nabes = [|<cycle>; <cycle>|]}; <cycle>|]}|]};
{nabes =
[|{nabes = [|{nabes = [|<cycle>; <cycle>|]}; <cycle>|]};
{nabes = [|{nabes = [|<cycle>; <cycle>|]}; <cycle>|]}|]}]
You can also solve the problem by linking through an intermediate structure. For example, you can have a dictionary that maps room names to rooms. Then your links to other rooms can just use the name (rather than a direct link to the OCaml value). I've used this method in the past, and it works pretty well. (This is, in fact, how your JSON is working implicitly.)
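The name-based linking described above can be sketched in a few lines. This is a language-neutral illustration written in Python rather than OCaml; the helper names load_rooms and move are hypothetical, and the input is a trimmed copy of the JSON from the question.

```python
import json

# Hypothetical input mirroring the JSON from the question (trimmed).
RAW = """
{"rooms": [
  {"id": "room1", "items": ["black hat"], "points": 10,
   "exits": [{"direction": "north", "room": "room2"}]},
  {"id": "room2", "items": [], "points": 10,
   "exits": [{"direction": "south", "room": "room1"}]}
]}
"""

def load_rooms(text):
    """Build a dictionary from room id to room record.

    Exits keep the *name* of the target room, so no record has to
    reference another record directly and nothing is cyclic.
    """
    rooms = {}
    for r in json.loads(text)["rooms"]:
        rooms[r["id"]] = {
            "items": list(r["items"]),
            "points": r["points"],
            "exits": {e["direction"]: e["room"] for e in r["exits"]},
        }
    return rooms

def move(rooms, room_id, direction):
    """Follow an exit by looking the target name up in the dictionary."""
    target = rooms[room_id]["exits"].get(direction)
    return target if target in rooms else None

rooms = load_rooms(RAW)
print(move(rooms, "room1", "north"))  # room2
```

Every record can be built in a single pass because an exit only needs the target's name, not the finished target record. The cost is one extra dictionary lookup per move.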

That's why, in the previous answer, I put the function room_exits into the Game interface, not into Room. The intuition is that a room's exits, i.e., other rooms, are not part of the room. If you define a room as "walls, treasure, and other rooms", then you are defining something more than a room: you are essentially defining the whole maze. So a room is just a room, i.e., its contents. The way the rooms are connected is the Maze. (I used Game for this in the previous answer, but maybe Maze is a better name.)
To summarize, in your particular case, you need just to remove references to other rooms from the room data representation, and store the maze information as an associative container inside the maze (or game) data structure:
type exits = (dir * room) list
type maze = {
...
entry : room;
rooms : exits Room.Map.t
}
Or, to be more precise, you can use Dir.Map as the associative container instead of an association list:
type exits = room Dir.Map.t
The latter representation guarantees that there is no more than one room per direction.
Note: the above definitions assume that Room implements the Comparable interface, and that you're using the Core library. (I think you are, since I remember there was a link to RWO from the course page.) To implement the Comparable interface you need to implement the compare function and the Sexpable interface. This is easy with type generators; basically it looks like this:
module Room = struct
type t = {
name : string;
treasures : treasure list;
...
} with compare, sexp
include Comparable.Make(struct
type nonrec t = t with compare, sexp
end)
end
The with compare, sexp annotation automatically generates the compare function and the pair of functions sexp_of_t and t_of_sexp, which the Comparable.Make functor needs to realize the Comparable interface.
Note: if this is too much at this point in the course, you can just use the String.Map.t data structure and perform the lookup by room name. This wouldn't be a bad idea.

Related

Kotlin. ArrayList, how to move element to first position

I have a list of Lessons. Here is my Lessons class:
data class Lessons(
val id: Long,
val name: String,
val time: Long,
val key: String
)
I need to move the element whose key field has the value "priority" to the beginning of the list.
Here is my code:
val priorityLesson = lessons.find { it.key == "priority" }
if (priorityLesson != null) {
lessons.remove(priorityLesson)
lessons.add(0, priorityLesson)
}
Everything works, but I do not like this solution; perhaps there is a more efficient way to do this. In addition, I have to convert the list to a mutable one, and I would like to keep it immutable.
Please help me.
One way is to call partition() to split the list into a list of priority lesson(s), and a list of non-priority lessons; you can then rejoin them:
val sorted = lessons.partition{ it.key == "priority" }
.let{ it.first + it.second }
As well as handling the case of exactly one priority lesson, that will cope if there are none or several. And it preserves the order of priority lessons, and the order of non-priority lessons.
(That will take a little more memory than modifying the list in-place; but it scales the same — both are 𝒪(n). It's also easier to understand and harder to get wrong!)
First, I would call your class Lesson rather than Lessons as it represents a single lesson. Your choice of the variable name lessons is good for your list of lessons.
You can use a mutable list and swap the item into the first position (this exchanges it with the current first element rather than shifting the others down):
val priorityLessonIndex = lessons.indexOfFirst { it.key == "priority" }
if (priorityLessonIndex != -1)
lessons[0] = lessons[priorityLessonIndex]
.also { lessons[priorityLessonIndex] = lessons[0] }
Or you can use an immutable list:
val priorityLesson = lessons.firstOrNull { it.key == "priority" }
val newList =
if (priorityLesson != null)
listOf(priorityLesson) + (lessons - priorityLesson)
else
lessons
A possibly more efficient way, which avoids creation of intermediate lists:
val newList = buildList(lessons.size) {
lessons.filterTo(this) { it.key == "priority" }
lessons.filterTo(this) { it.key != "priority" }
}

Find cities reachable in given time

I am trying to solve a problem where I have 3 columns in a CSV file, like below:
connection Distance Duration
Prague<>Berlin 400 4
Warsaw<>Berlin 600 6
Berlin<>Munich 800 8
Munich<>Vienna 400 3.5
Munich<>Stuttgart 800 8
Stuttgart<>Freiburg 150 2
I need to find out which cities I can reach in a given time from the origin city.
For example, given the following input:
Input: Berlin, 10
Output: ["Prague","Munich","Warsaw"]
Input : Berlin, 30
Output : ["Prague","Munich","Warsaw", "Vienna", "Stuttgart",
"Freiburg"]
This is essentially a graph problem.
I am trying this with Scala; can someone help, please?
Below is what I tried:
I got it working partially.
import scalax.collection.Graph // or scalax.collection.mutable.Graph
import scalax.collection.GraphPredef._, scalax.collection.GraphEdge._
import scalax.collection.edge.WDiEdge
import scalax.collection.edge.Implicits._
val rows = """Prague<>Berlin,400,4
Warsaw<>Berlin,600,6
Berlin<>Munich,800,8
Munich<>Vienna,400,3.5
Munich<>Stuttgart,800,8
Stuttgart<>Freiburg,150,2""".split("\n").toList
Here I prepare the input for my application.
Below is a list of the cities present in the given file.
NOTE: We could collect it from the file itself while reading. Here I keep everything lowercase.
val cityList = List("warsaw","berlin","prague","munich","vienna","stuttgart","freiburg")
Now creating a case class:
case class Bus(connection: String, distance: Int, duration: Float)
val buses: List[Bus] = rows.map(row => {
val r =
row.split("\\,")
Bus(r(0).toLowerCase, r(1).toInt, r(2).toFloat)
})
case class City(name: String)
// case class BusMeta(distance: Int, duration: Float)
val t = buses.map(bus => {
val s = bus.connection.split("<>")
City(s.head) ~ City(s.last) % bus.duration
})
val busGraph = Graph(t:_*)
From the above we create the graph as required from the input file ("busGraph" in my case).
import scala.collection.mutable
val travelFrom = ("BERLIN").toLowerCase
val travelDuration = 16F
val possibleCities: mutable.Set[String] = mutable.Set()
if (cityList.contains(travelFrom)){
busGraph.nodes.get(City(travelFrom)).edges.filter(_.weight <= travelDuration).map(edge => edge.map(_.name)).flatten.toSet.filterNot(_ == travelFrom).foreach(possibleCities.add)
println("City PRESENT in File")
}
else
{
println("City Not Present in File")
}
I am getting this output:
possibleCities: scala.collection.mutable.Set[String] = Set(munich, warsaw, prague)
Expected Output : Set(munich, warsaw, prague, stuttgart, Vienna)
Your solution only finds direct routes (that's why your output is shorter than expected). To get the complete answer, you also need to consider connections, by recursively traversing the graph from each of the direct destinations.
Also, do not use mutable collections, they are evil.
Here is a possible solution for you:
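import java.util.Scanner
import scala.io.Source
// First, create the graph structure: each city maps to its (neighbour, duration) pairs
def routes: Map[String, Seq[(String, Double)]] = Source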
.fromInputStream(System.in)
.getLines
.takeWhile(_.nonEmpty)
.map(_.split("\\s+"))
.flatMap { case Array(from_to, dist, time) =>
val Array(from,to) = from_to.split("<>")
Seq(from -> (to, time.toDouble), to -> (from, time.toDouble))
}.toSeq
.groupMap(_._1)(_._2)
// Now search for suitable routes
def reachable(
routes: Map[String, Seq[(String, Double)]],
from: String,
limit: Double,
cut: Set[String] = Set.empty
): Set[String] = routes
.getOrElse(from, Nil)
.filter(_._2 <= limit)
.filterNot { case (n, _) => cut(n) }
.flatMap { case(name, time) =>
reachable(routes, name, limit - time, cut + from) + name
}.toSet
// And here is how you use it
def main(argv: Array[String]): Unit = {
val Array(from, limit) = new Scanner(System.in).nextLine().split("\\s")
val reach = reachable(routes, from, limit.toDouble)
println(reach)
}
Do a breadth first search from the origin city, stopping going deeper when you reach the time limit. Output the stops reached by the search.
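A minimal sketch of that search, written in Python as a neutral illustration rather than in Scala. Because the edges carry different durations, this version swaps the plain breadth-first queue for a priority queue (a Dijkstra-style search), so that each city is judged by the cheapest way of reaching it. The function name reachable and the CONNECTIONS list are assumptions; the durations are the ones from the question.

```python
import heapq

# Connections from the question: (city A, city B, duration in hours).
CONNECTIONS = [
    ("Prague", "Berlin", 4), ("Warsaw", "Berlin", 6),
    ("Berlin", "Munich", 8), ("Munich", "Vienna", 3.5),
    ("Munich", "Stuttgart", 8), ("Stuttgart", "Freiburg", 2),
]

def reachable(connections, origin, limit):
    """Return every city whose cheapest travel time from origin is <= limit."""
    # Build an undirected adjacency list.
    graph = {}
    for a, b, t in connections:
        graph.setdefault(a, []).append((b, t))
        graph.setdefault(b, []).append((a, t))

    best = {origin: 0}       # cheapest known time to each city
    queue = [(0, origin)]    # (time spent so far, city)
    while queue:
        spent, city = heapq.heappop(queue)
        if spent > best.get(city, float("inf")):
            continue         # stale entry, a cheaper path was found already
        for nxt, t in graph.get(city, []):
            total = spent + t
            # Stop going deeper once the time limit would be exceeded.
            if total <= limit and total < best.get(nxt, float("inf")):
                best[nxt] = total
                heapq.heappush(queue, (total, nxt))
    return set(best) - {origin}

print(sorted(reachable(CONNECTIONS, "Berlin", 10)))  # ['Munich', 'Prague', 'Warsaw']
```

With a limit of 16 the same call also returns Vienna (11.5 hours via Munich) and Stuttgart (exactly 16 hours), which matches the expected output in the question.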
To the best of my knowledge, this task can be solved with a graph adjacency matrix, which first needs to be built from the input data.
In your particular case the adjacency matrix would be 2D, with cities on the rows and columns and the weight of each connection as the value.
[Screenshot from Excel showing an example adjacency matrix]
In the first iteration you search for possible routes from the starting city and store each city name (row/column id) and weight.
In each following iteration you try to extend a route and compare the total with the limit (to decide whether you can add it), making sure you are not adding the same city twice.
To store the results you will again need a 2D array, where the first element is a possible route and the next element is a tuple of the visited city and the value spent.
After a few iterations you should have all possible options and can simply summarize what was found.
TL;DR: Many graph algorithms use or depend (to varying extents) on a graph adjacency matrix.

Objects arranged as a graph

I want to arrange many objects of a certain class as a graph in Matlab. The goal is that when I create a new object, it is automatically added to the graph. However, as far as I can see, graphs only accept numbers when I add a new node. How is this typically handled? Should I have a GroupClass that holds all the objects and a graph with the relations? What I would like to have is something like
G = graph()
O1 = createObject(G)
O2 = createObject(G)
and in createObject something like
...
G.addnode(O1)
G.addedge(O1,O2)
...
Afterwards I want to be able to plot the relations, print out groups or all nodes, etc.
You can do this by adding nodes as a "node properties" table. Here's a very simple example:
G = graph();
for idx = 1:10
% make a single-row table containing the name and data
% associated with this node
nodeProps = table({['Idx ', num2str(idx)]}, ...
MException('msg:id', sprintf('Message %d', idx)), ...
'VariableNames', {'Name', 'Data'});
G = addnode(G, nodeProps);
end
for idx = 2:10
% add edges based on the node names
G = addedge(G, 'Idx 1', sprintf('Idx %d', idx));
end
plot(G)

how to convert forEach to lambda

Iterator<Rate> rateIt = rates.iterator();
int lastRateOBP = 0;
while (rateIt.hasNext())
{
Rate rate = rateIt.next();
int currentOBP = rate.getPersonCount();
if (currentOBP == lastRateOBP)
{
rateIt.remove();
continue;
}
lastRateOBP = currentOBP;
}
How can I convert the code above to a lambda using Java 8 streams, something like list.stream().filter()...? I need to operate on the list itself.
The simplest solution is
Set<Integer> seen = new HashSet<>();
rates.removeIf(rate -> !seen.add(rate.getPersonCount()));
It utilizes the fact that Set.add returns false if the value is already in the Set, i.e. has already been encountered. Since these are the elements you want to remove, all you have to do is negate it.
If keeping an arbitrary Rate instance for each group with the same person count is sufficient, there is no sorting needed for this solution.
Like with your original Iterator-based solution, it relies on the mutability of your original Collection.
If you really want distinct and sorted, as you say in your comments, then it is as simple as:
TreeSet<Rate> sorted = rates.stream()
.collect(Collectors.toCollection(() ->
new TreeSet<>(Comparator.comparing(Rate::getPersonCount))));
But notice that in your example with an iterator you are not removing all duplicates, only duplicates that are consecutive (I've exemplified that in the comment to your question).
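The difference between the two behaviours is easy to see on a small input. The sketch below is in Python rather than Java and works on plain person counts instead of Rate objects; both function names and the sample list are hypothetical.

```python
from itertools import groupby

def drop_consecutive(counts):
    """Keep one value per run of equal neighbours (what the iterator loop does)."""
    return [key for key, _ in groupby(counts)]

def drop_all_duplicates(counts):
    """Keep only the first occurrence of each value (what the Set-based removeIf does)."""
    seen = set()
    result = []
    for c in counts:
        if c not in seen:
            seen.add(c)
            result.append(c)
    return result

counts = [1, 1, 2, 1, 3, 3]
print(drop_consecutive(counts))     # [1, 2, 1, 3]
print(drop_all_duplicates(counts))  # [1, 2, 3]
```

The two results only agree when equal values are always adjacent, for example when the list has been sorted by person count first.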
EDIT
It seems that you want distinct by a Function; or in simpler words you want distinct elements by personCount, but in case of a clash you want to take the max pos.
Such a thing is not yet available in jdk. But it might be, see this.
Since you want them sorted and distinct by key, we can emulate that with:
Collection<Rate> sorted = rates.stream()
.collect(Collectors.toMap(Rate::getPersonCount,
Function.identity(),
(left, right) -> {
return left.getLos() > right.getLos() ? left : right;
},
TreeMap::new))
.values();
System.out.println(sorted);
On the other hand, if you absolutely need to return a TreeSet to actually denote that these are unique and sorted elements:
TreeSet<Rate> sorted = rates.stream()
.collect(Collectors.collectingAndThen(
Collectors.toMap(Rate::getPersonCount,
Function.identity(),
(left, right) -> {
return left.getLos() > right.getLos() ? left : right;
},
TreeMap::new),
map -> {
TreeSet<Rate> set = new TreeSet<>(Comparator.comparing(Rate::getPersonCount));
set.addAll(map.values());
return set;
}));
This should work if your Rate type has a natural ordering (i.e. implements Comparable), which sorted() requires, and implements equals and hashCode, which distinct() relies on:
List<Rate> l = rates.stream()
.distinct()
.sorted()
.collect(Collectors.toList());
If not, use a lambda as a custom comparator:
List<Rate> l = rates.stream()
.distinct()
.sorted( (r1,r2) -> ...some code to compare two rates... )
.collect(Collectors.toList());
It may be possible to remove the call to sorted if you just need to remove duplicates.

How to evaluate a complex expression tree against incremental data?

I have a collection of data and a collection of search filters I want to run against that data. The filters follow the LDAP search filter format and are parsed into an expression tree. The data is read one item at a time and processed through all the filters. Intermediate match results are stored in each leaf node of the tree until all the data has been processed. Then the final results are obtained by traversing the tree and applying the logical operators to each leaf node's intermediate result. For example, if I have the filter (&(a=b)(c=d)) then my tree will look like this:
root = "&"
left = "a=b"
right = "c=d"
So if a=b and c=d then both the left and right child nodes are a match and thus the filter is a match.
The data is a collection of different types of objects, each with their own fields. For example, assume the collection represents a class at a school:
class { name = "math" room = "12A" }
teacher { name = "John" age = "35" }
student { name = "Billy" age = "6" grade = "A" }
student { name = "Jane" age = "7" grade = "B" }
So a filter might look like (&(teacher.name=John)(student.age>6)(student.grade=A)) and be parsed like so:
root = "&"
left = "teacher.name=John"
right = "&"
left = "student.age>6"
right = "student.grade=A"
I run the class object against it; no matches. I run the teacher object against it; root.left is a match. I run the first student node against it; root.right.right is a match. I run the second student node against it; root.right.left is a match. Then I traverse the tree and determine that all nodes matched and thus the final result is a match.
The problem is the intermediate matches need to be constrained based upon commonality: the student.age and student.grade filters need to somehow be tied together in order to store an intermediate match only if they match for the same object. I can't for the life of me figure out how to do this.
My filter node abstract base class:
class FilterNode
{
public:
virtual void Evaluate(string ObjectName, map<string, string> Attributes) = 0;
virtual bool IsMatch() = 0;
};
I have a LogicalFilterNode class that handles logical AND, OR, and NOT operations; its implementation is pretty straightforward:
void LogicalFilterNode::Evaluate(string ObjectName, map<string, string> Attributes)
{
m_Left->Evaluate(ObjectName, Attributes);
m_Right->Evaluate(ObjectName, Attributes);
}
bool LogicalFilterNode::IsMatch()
{
switch(m_Operator)
{
case AND:
return m_Left->IsMatch() && m_Right->IsMatch();
case OR:
return m_Left->IsMatch() || m_Right->IsMatch();
case NOT:
return !m_Left->IsMatch();
}
return false;
}
Then I have a ComparisonFilterNode class that handles the leaf nodes:
void ComparisonFilterNode::Evaluate(string ObjectName, map<string, string> Attributes)
{
if(ObjectName == m_ObjectName) // e.g. "teacher", "student", etc.
{
for(const auto& Attribute : Attributes) // each entry is a (name, value) pair
{
Evaluate(Attribute.first, Attribute.second);
}
}
}
void ComparisonFilterNode::Evaluate(string AttributeName, string AttributeValue)
{
if(AttributeName == m_AttributeName) // e.g. "age", "grade", etc.
{
if(Compare(AttributeValue, m_AttributeValue)) // e.g. "6", "A", etc.
{
m_IsMatch = true;
}
}
}
bool ComparisonFilterNode::IsMatch() { return m_IsMatch; }
How it's used:
FilterNode* Root = Parse(...);
for(const auto& item : Data)
{
Root->Evaluate(item.Name, item.Attributes);
}
bool Match = Root->IsMatch();
Essentially what I need is for AND statements where the children have the same object name, the AND statement should only match if the children match for the same object.
Create a new unary "operator", let's call it thereExists, which:
Does have state, and
Declares that its child subexpression must be satisfied by a single input record.
Specifically, for each instance of a thereExists operator in an expression tree you should store a single bit indicating whether or not the subexpression below this tree node has been satisfied by any of the input records seen so far. These flags will initially be set to false.
To continue processing your dataset efficiently (i.e. input record by input record, without having to load the entire dataset into memory), you should first preprocess the query expression tree to pull out a list of all instances of the thereExists operator. Then as you read in each input record, test it against the child subexpression of each of these operators that still has its satisfied flag set to false. Any subexpression that is now satisfied should toggle its parent thereExists node's satisfied flag to true -- and it would be a good idea to also attach a copy of the satisfying record to the newly-satisfied thereExists node, if you want to actually see more than a "yes" or "no" answer to the overall query.
You only need to evaluate tree nodes above a thereExists node once, after all input records have been processed as described above. Notice that anything referring to properties of an individual record must appear somewhere beneath a thereExists node in the tree. Everything above a thereExists node in the tree is only allowed to test "global" properties of the collection, or combine the results of thereExists nodes using logical operators (AND, OR, XOR, NOT, etc.). Logical operators themselves can appear anywhere in the tree.
Using this, you can now evaluate expressions like
root = "&"
left = thereExists
child = "teacher.name=John"
right = "|"
left = thereExists
child = "&"
left = "student.age>6"
right = "student.grade=A"
right = thereExists
child = "student.name = Billy"
This will report "yes" if the collection of records contains both a teacher whose name is "John" and either a student named "Billy" or an A student aged over 6, or "no" otherwise. If you track satisfying records as I suggested, you'll also be able to dump these out in the case of a "yes" answer.
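The example tree above can be sketched as follows. This is an illustration in Python rather than C++, under the assumption that each record is a dictionary with a "type" key; the helpers attr and all_of and the class ThereExists are hypothetical names, and the records are the ones from the question.

```python
def attr(obj, name, test):
    """Leaf predicate: record is of type obj and its attribute passes test."""
    return lambda rec: rec.get("type") == obj and name in rec and test(rec[name])

def all_of(*preds):
    """Logical AND evaluated against one single record."""
    return lambda rec: all(p(rec) for p in preds)

class ThereExists:
    """Stateful node: becomes satisfied once any one record matches its child."""
    def __init__(self, predicate):
        self.predicate = predicate
        self.satisfied = False
        self.witness = None  # copy of the record that satisfied the node

    def feed(self, record):
        # Only nodes that are still unsatisfied need to be tested.
        if not self.satisfied and self.predicate(record):
            self.satisfied = True
            self.witness = record

john = ThereExists(attr("teacher", "name", lambda v: v == "John"))
good_student = ThereExists(all_of(
    attr("student", "age", lambda v: int(v) > 6),
    attr("student", "grade", lambda v: v == "A"),
))
billy = ThereExists(attr("student", "name", lambda v: v == "Billy"))

records = [
    {"type": "class", "name": "math", "room": "12A"},
    {"type": "teacher", "name": "John", "age": "35"},
    {"type": "student", "name": "Billy", "age": "6", "grade": "A"},
    {"type": "student", "name": "Jane", "age": "7", "grade": "B"},
]

# Stream the records one at a time; no record is kept unless it is a witness.
for rec in records:
    for node in (john, good_student, billy):
        node.feed(rec)

# Evaluate the operators above the thereExists nodes once, at the end.
result = john.satisfied and (good_student.satisfied or billy.satisfied)
print(result)                  # True
print(good_student.satisfied)  # False
```

The second line of output is the point of the exercise: Billy has grade A and Jane is older than 6, but no single student satisfies both conditions, so the combined node stays unsatisfied.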
You could also add a second operator type, forAll, which checks that its subexpression is true for every input record. But this is probably not as useful, and in any case you can simulate forAll(expr) with not(thereExists(not(expr))).
