I am currently working on my project for my Programming classes. The assignment is to solve a Skyline problem. A city's skyline is the outer contour of the silhouette formed by all the buildings in that city when viewed from a distance. So, basically, you take a list of buildings with 3 parameters each (initial position, final position, height) and you have to return the coordinates of the Skyline.
I have two base cases: the first one is used when the list is empty, and the second one when there is only one building. The last (recursive) case is used when there are two or more buildings in the list.
The function 'divide' receives a list of buildings and returns two lists of buildings.
My problem is:
divide([],[],[]).
divide([C|[]],[C|ed(X1,X2,H1)]):-
    divide([],ed(X1,X2,H1),[]).
divide([ed(X1,X2,H1),ed(Y1,Y2,H2)|L],L1,L2):-
    L1 = [ed(X1,X2,H1)|L1],
    L2 = [ed(Y1,Y2,H2)|L2],
    divide(L,L1,L2).
When I run 'divide' in the console, it returns false as the answer instead of returning a list. I just can't figure out what is wrong or where the problem might be. It should return two lists, not 'false'.
An example:
?- divide([(1,2,3),(2,3,4),(1,4,5),(6,2,4)],X,Y).
false.
Any ideas?
Sorry for the bad English, and thanks.
Always look at the warnings your Prolog system produces. If you ignore them, don't be surprised when your program fails. Here is another error:
divide([C|[]],[C|ed(X1,X2,H1)]):-
^^^^^^^^^^^^
This is not a well-formed list: the tail after | must itself be a list, but ed(X1,X2,H1) is a plain compound term, so [C|ed(X1,X2,H1)] is not a proper list.
I'm a newbie here.
I am currently trying to solve a problem involving a sorting algorithm.
I will outline the situation:
We have 60 items. Variables of type A and B are written to these items, and they are stored in random order. Variables A and B have another parameter, X, which indicates their material (the material may change during storage). Items are then taken one by one to another unit with 10 elements, where we try to store 2 or 3 variables of the same type (A or B) and the same material on one element. After the required number of variables with the same properties has been collected, they are removed from that element.
I tried to describe it as simply as possible, but maybe I should have described it with a real example.
It can be imagined as a warehouse that has 10 elements and takes goods from a conveyor that has a capacity of 60 elements. As soon as the warehouse has the same type of goods of the same material on one element, it dispatches the goods and frees up that position.
So I want to remove the elements from the conveyor as efficiently as possible and sort them in the warehouse according to the requirements.
It occurred to me to handle it case by case, covering all the options.
Thank you for all your ideas and comments. If it's not very clear, I apologize and will try to explain it differently. :)
I am trying to use VW to perform ranking using the contextual bandit framework, specifically using --cb_explore_adf --softmax --lambda X. The choice of softmax is because, according to VW's docs: "This is a different explorer, which uses the policy not only to predict an action but also predict a score indicating the quality of each action." This quality-related score is what I would like to use for ranking.
The scenario is this: I have a list of items [A, B, C, D], and I would like to sort it in an order that maximizes a pre-defined metric (e.g., CTR). One of the problems, as I see it, is that we cannot evaluate the items individually, because we can't know for sure which item made the user click or not.
To test some approaches, I've created a dummy dataset. As a way to try and solve the above problem, I am using the entire ordered list as a way to evaluate whether a click happens or not (e.g., given the context for user X, he will click if the items are [C, A, B, D]). Then, I reward the items individually according to their position P on the list, i.e., reward = 1/2^P for 0 <= P < len(list). Here, the rewards for C, A, B, D are 1, 0.5, 0.25, and 0.125, respectively. If there's no click, the reward is zero for all items. The reasoning behind this is that more important items will stabilize on top and less important ones on the bottom.
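As a sanity check, the reward scheme from the example can be written as a tiny Python sketch (positions are assumed to be 0-indexed; the names are just for illustration):

# hypothetical example: position-based rewards for the ranking [C, A, B, D]
ranked = ["C", "A", "B", "D"]
clicked = True
rewards = {item: (1 / 2**pos if clicked else 0.0)
           for pos, item in enumerate(ranked)}
# -> {'C': 1.0, 'A': 0.5, 'B': 0.25, 'D': 0.125}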
Also, one of the difficulties I found was defining a sampling function for this approach. Typically, we're interested in selecting only one option, but here I have to sample multiple times (4 in the example). Because of that, it's not very clear how I should incorporate exploration when sampling items. I have a few ideas:
Copy the probability mass function and assign it to copy_pmf. Draw a random number between 0 and max(copy_pmf), and for each probability value in copy_pmf, increment the sum_prob variable (very similar to the tutorial here: https://vowpalwabbit.org/tutorials/cb_simulation.html). When sum_prob > draw, we add the current item/prob to a list. Then, we remove this probability from copy_pmf, set sum_prob = 0, and draw a new number again between 0 and max(copy_pmf) (which might change or not). A sketch of this idea is shown after the second option below.
Another option is drawing a random number and, if the maximum probability, i.e., max(pmf), is greater than this number, we exploit. If it isn't, we shuffle the list and return this (explore). This approach requires tuning the lambda parameter, which controls the output pmf (I have seen cases where the max prob is > 0.99, which would mean around a 1% chance of exploring; I have also seen instances where the max prob is ~0.5, which is around 50% exploration).
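Here is a minimal Python sketch of the first idea, assuming pmf is a list of (item, probability) pairs such as the one returned by predict with --cb_explore_adf. One deviation: instead of drawing between 0 and max(copy_pmf), it renormalizes over the remaining probability mass, so each draw is made from a proper distribution:

import random

def sample_ranking(pmf):
    # sample items one by one, without replacement, proportionally to
    # their (renormalized) probabilities; pmf is a list of (item, prob) pairs
    remaining = list(pmf)
    ranking = []
    while remaining:
        total = sum(p for _, p in remaining)
        draw = random.uniform(0, total)
        cumulative = 0.0
        for i, (item, p) in enumerate(remaining):
            cumulative += p
            if cumulative >= draw:
                ranking.append(item)
                del remaining[i]
                break
        else:
            # floating-point edge case: fall back to the last remaining item
            ranking.append(remaining.pop()[0])
    return ranking

# usage: sample_ranking([("A", 0.1), ("B", 0.5), ("C", 0.3), ("D", 0.1)])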
I would like to know if there are any suggestions regarding this problem, specifically sampling and the reward function. Also, if there are any things I might be missing here.
Thank you!
That sounds like something that can be solved with conditional contextual bandits (CCB).
For the demo scenario you are describing, each example would have 4 slots.
You can use any exploration algorithm in this case, and exploration is done independently per slot. The learning objective is the average loss over all slots, but decisions are made sequentially from the first slot to the last, so you'll effectively learn the ranking even with a binary reward here.
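To make that concrete, here is a rough sketch of what a single CCB example with four actions and four slots might look like in VW's text format (trained with --ccb_explore_adf). The feature names and costs are made up, and the slot label follows the chosen_action:cost:probability convention as I understand it, so please check the conditional contextual bandit docs before relying on the exact syntax:

ccb shared |User user_id=42
ccb action |Item id=A
ccb action |Item id=B
ccb action |Item id=C
ccb action |Item id=D
ccb slot 2:-1.0:0.6 |Slot position=0
ccb slot 0:-0.5:0.3 |Slot position=1
ccb slot 1:-0.25:0.4 |Slot position=2
ccb slot 3:-0.125:0.9 |Slot position=3

Rewards map to negative costs here, since VW minimizes cost.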
I think I need a method called “Touching” (as in contiguous, not emotional.)
I need to identify those elements of a matrix that are next to an individual element or set of elements. At least that’s the way I’ve thought of to solve the problem at hand.
The matrix State in the program below represents, let's say, some underwater topography. As I lower the water, eventually the highest point will stick out and become an "island". When the "water level" is at 34, the element State[2,3] is the single point of the island. The array atlantis holds the coordinates of that single point.
As we lower the water level further, additional points will be “above water.” Additional contiguous points will become part of the island and their coordinates would be added to the array atlantis. (For example, the next piece of land to be part of atlantis would be State[3,4] at 31.)
My thought about how to do this is to identify all the matrix elements that touch/are next to the elements in atlantis, find the one with the highest elevation, and then add it to the array atlantis. Looking for the elements next to a single element is a challenge in itself, but we could write some code to examine the set [i,j-1], [i,j+1], [i-1,j-1], [i-1,j], [i-1,j+1], [i+1,j-1], [i+1,j], [i+1,j+1]. (I think I got that right.)
But as we add additional points, the task of determining which points surround the points in atlantis becomes increasingly difficult. So that's my question: can anyone think of a mechanism for doing this? Any kind of simplified algorithm using capabilities of Ruby of which I am unaware (which includes all but the most basic)? If such a method could be written, then I could write atlantis.touching and get an array, for example, containing the coordinates of all the points presently contiguous to atlantis. (A rough sketch of what I mean appears after the code below.)
At least that’s how I’m thinking this could be done. Any other ideas would be welcome. And if anyone knows any kind of partnering site where I could seek others who might be interested in working with me on this, that would be great.
# create State database using matrix
require 'matrix'
State = Matrix[ [3,1,4,4,6,2,8,12,8,2],
                [6,2,4,13,25,21,11,22,9,3],
                [6,20,27,34,22,14,12,11,2,5],
                [6,28,17,23,31,18,11,9,18,12],
                [9,18,11,13,8,9,10,14,24,11],
                [3,9,7,16,9,12,28,24,29,21],
                [5,8,4,7,17,14,19,30,33,4],
                [7,17,23,9,5,9,22,21,12,21],
                [7,14,25,22,16,10,19,15,12,11],
                [5,16,7,3,6,3,9,8,1,5] ]

# find State elements contiguous to the island
atlantis = [[2,3]]
# goal: find all State[i,j] "touching" atlantis
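Here is a rough sketch of the "touching" idea, just to make the goal concrete. It's written in Python with the grid as a list of lists (a Ruby version would be analogous), and the helper name is made up:

def touching(atlantis, n_rows, n_cols):
    # all in-bounds neighbours of any point in atlantis that are not already land
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    land = {tuple(p) for p in atlantis}
    border = set()
    for i, j in atlantis:
        for di, dj in offsets:
            r, c = i + di, j + dj
            if 0 <= r < n_rows and 0 <= c < n_cols and (r, c) not in land:
                border.add((r, c))
    return border

# usage: touching([[2, 3]], 10, 10) returns the 8 cells around [2, 3]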
Only checking the points around the currently exposed area doesn't sound like it could cover every case - what if the next point to be exposed was the beginning of a new island?
I'd go about it like this: have another array - let's call it sorted - which contains your points sorted by height. Every time you lower the water level, pop all the elements higher than the new water level off sorted and onto atlantis.
In fact, there's no need for separate sorted and atlantis arrays if you do it this way. Just store the index of the highest point not above water, and you've essentially got two arrays in one - everything above water on one side, and everything below water on the other.
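Here's a quick sketch of that idea (in Python rather than Ruby, but translating it with sort_by is straightforward; State is treated as a list of lists):

def exposed_points(state, water_level):
    # flatten the grid into (height, row, col) triples, sorted by height, highest first
    cells = sorted(
        ((state[r][c], r, c) for r in range(len(state)) for c in range(len(state[0]))),
        reverse=True,
    )
    # everything at or above the water level is land; the rest is still submerged
    return [(r, c) for h, r, c in cells if h >= water_level]

# usage: exposed_points(state, 34) -> [(2, 3)]
# lowering the level to 31 also exposes (6, 8) at height 33 and (3, 4) at height 31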
Hope that helps!
Given a list of URLs known to be somewhat "RESTful", what would be a decent algorithm for grouping them so that URLs mapping to the same "controller/action/view" are likely to be grouped together?
For example, given the following list:
http://www.example.com/foo
http://www.example.com/foo/1
http://www.example.com/foo/2
http://www.example.com/foo/3
http://www.example.com/foo/1/edit
http://www.example.com/foo/2/edit
http://www.example.com/foo/3/edit
It would group them as follows:
http://www.example.com/foo

http://www.example.com/foo/1
http://www.example.com/foo/2
http://www.example.com/foo/3

http://www.example.com/foo/1/edit
http://www.example.com/foo/2/edit
http://www.example.com/foo/3/edit
Nothing is known about the order or structure of the URLs ahead of time. In my example, it would be somewhat easy since the IDs are obviously numeric. Ideally, I'd like an algorithm that does a good job even if IDs are non-numeric (as in http://www.example.com/products/rocket and http://www.example.com/products/ufo).
It's really just an effort to say, "Given these URLs, I've grouped them by removing what I think is the 'variable' ID part of the URL."
Aliza has the right idea: you want to look for the 'articulation points' (in REST terms, basically where a parameter is being passed). Looking only for a single point of change gets tricky, though.
Example
http://www.example.com/foo/1/new
http://www.example.com/foo/1/edit
http://www.example.com/foo/2/edit
http://www.example.com/bar/1/new
These can be grouped in several equally good ways, since we have no idea of the URL semantics. It really boils down to one question: is this piece of the URL part of the REST descriptor, or is it a parameter? If we know what all the descriptors are, the rest are parameters, and we are done.
Given a sufficiently large dataset, we'd want to look at the statistics of all URLs at each depth, e.g., /x/y/z/t/. We would count the number of occurrences of each value in each slot and generate a large joint probability distribution table.
We can now look at the distribution of symbols. A slot with many distinct values, each occurring only a few times, is likely a parameter. We would start from the bottom and look at conditional probabilities, i.e., what is the probability of x being foo, then what is the probability of y being something given x, and so on. I'd have to think more about a systematic way of extracting these, but it seems like a promising start.
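A rough Python sketch of the counting part, using the simple heuristic that a slot with many distinct values relative to the number of URLs is a parameter (the 0.5 threshold is arbitrary):

from collections import defaultdict
from urllib.parse import urlparse

def likely_parameter_slots(urls, threshold=0.5):
    # count the distinct values seen at each path depth
    values_by_slot = defaultdict(set)
    for url in urls:
        parts = urlparse(url).path.strip("/").split("/")
        for depth, part in enumerate(parts):
            values_by_slot[depth].add(part)
    # a slot whose distinct-value count is large relative to the number of URLs
    # is probably a parameter rather than a fixed REST descriptor
    return [depth for depth, values in values_by_slot.items()
            if len(values) / len(urls) > threshold]

# usage: likely_parameter_slots(["http://www.example.com/foo/1/edit",
#                                "http://www.example.com/foo/2/edit",
#                                "http://www.example.com/foo/3/edit"]) -> [1]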
Split each URL into an array of strings, with '/' as the delimiter.
e.g. http://www.example.com/foo/1/edit will give the array [http:,www.example.com,foo,1,edit]
If two arrays (URLs) share the same value at all indices except for one, they belong to the same group.
e.g. http://www.example.com/foo/1/edit = [http:,www.example.com,foo,1,edit] and
http://www.example.com/foo/2/edit = [http:,www.example.com,foo,2,edit]. The arrays match in all indices except for #3 which is 1 in the first array and 2 in the second array. Therefore, the urls belong to the same group.
It is easy to see that urls like http://www.example.com/foo/3 and http://www.example.com/foo/1/edit will not belong to the same group according to this algorithm.
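A small Python sketch of that rule (the helper names are made up; it does a pairwise O(n^2) comparison, so it's only meant to illustrate the idea):

def differ_in_one_index(a, b):
    # same length and exactly one differing position
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def group_urls(urls):
    parts = [u.split("/") for u in urls]
    groups = []  # each group is a list of indices into urls
    for i, p in enumerate(parts):
        for group in groups:
            if any(differ_in_one_index(p, parts[j]) for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return [[urls[i] for i in group] for group in groups]

# usage:
# group_urls(["http://www.example.com/foo/1/edit",
#             "http://www.example.com/foo/2/edit",
#             "http://www.example.com/foo/3"])
# -> [['http://www.example.com/foo/1/edit', 'http://www.example.com/foo/2/edit'],
#     ['http://www.example.com/foo/3']]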
I'm studying recommendation engines, and I went through the paper that describes how Google News generates recommendations to users for news items which might be of interest to them, based on collaborative filtering.
One interesting technique that they mention is minhashing. I went through what it does, but I'm pretty sure that what I have is a fuzzy idea and there is a strong chance that I'm wrong. The following is what I could make out of it:
Collect a set of all news items.
Define a hash function. For a user, this hash function returns the index, in the list of all news items, of the first item that this user viewed.
Collect, say, "n" such values and represent a user with this list of values.
Based on these lists, we can calculate the similarity between users as the number of common values. This reduces the number of comparisons a lot.
Based on these similarity measures, group users into different clusters.
This is just what I think it might be. In Step 2, instead of defining a constant hash function, it might be possible to vary the hash function so that it returns the index of a different element. So one hash function could return the index of the first element of the user's list, another hash function could return the index of the second element of the user's list, and so on. Given that the hash functions have to satisfy the min-wise independent permutations condition, this does sound like a possible approach.
Could anyone please confirm whether what I think is correct, or does the minhashing portion of the Google News recommendations work in some other way? I'm new to the internal implementations of recommendation systems. Any help is appreciated a lot.
Thanks!
I think you're close.
First of all, the hash function first randomly permutes all the news items, and then, for any given person, looks at the first item (in that permuted order) that the person has viewed. Since everyone uses the same permutation, two people with similar viewing histories have a decent chance of getting the same first item.
Then, to get a new hash function, rather than choosing the second element (which would have some confusing dependencies on the first element), they choose a whole new permutation and take the first element again.
People who happen to have the same hash value 2-4 times (that is, the same first element in 2-4 permutations) are put together in a cluster. This algorithm is repeated 10-20 times, so that each person gets put into 10-20 clusters. Finally, recommendations are given based on (the small number of) other people in those 10-20 clusters. Since all this work is done by hashing, people are put directly into buckets for their clusters, and large numbers of comparisons aren't needed.
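A toy Python sketch of that scheme (permutation-based minhash plus banding; the band and row counts and the function name are made up for illustration):

import random
from collections import defaultdict

def minhash_clusters(user_items, all_items, bands=10, rows_per_band=3, seed=0):
    # user_items: dict mapping user -> set of viewed item ids
    # each band uses rows_per_band fresh random permutations of all_items;
    # users with identical signatures within a band land in the same cluster
    rng = random.Random(seed)
    clusters = defaultdict(set)
    for band in range(bands):
        perms = []
        for _ in range(rows_per_band):
            order = list(all_items)
            rng.shuffle(order)
            perms.append({item: pos for pos, item in enumerate(order)})
        for user, items in user_items.items():
            if not items:
                continue
            # minhash value = the viewed item that comes first in each permutation
            signature = tuple(min(items, key=rank.get) for rank in perms)
            clusters[(band, signature)].add(user)
    return clusters

# usage:
# minhash_clusters({"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"d"}},
#                  all_items=["a", "b", "c", "d"])
# users sharing a (band, signature) bucket are candidate neighbours for recommendations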