Getting all zip codes within an n mile radius - location

What's the best way to get a function like the following to work:
def getNearest(zipCode, miles):
That is, given a zipcode (07024) and a radius, return all zipcodes which are within that radius?

There is a project on SourceForge that could assist with this:
http://sourceforge.net/projects/zips/
It gives you a database with zip codes and their latitude / longitude, as well as coding examples of how to calculate the distance between two sets of coordinates. There is probably a better way to do it, but you could have your function retrieve the zipcode and its coordinates, and then step through each zipcode in the list and add the zipcode to a list if it falls within the number of miles specified.

If you want this to be accurate, you must start with polygon data that includes the location and shape of every zipcode. I have a database like this (used to be published by the US census, but they no longer do that) and have built similar things atop it, but not that exact request.
If you don't care about being exact (which I'm guessing you don't), you can get a table of center points of zipcodes and query points ordered by great circle distance. PostGIS provides great tools for doing this, although you may construct a query against other databases that will perform similar tasks.
An alternate approach I've used is to construct a box that encompasses the circle you want, querying with a between clause on lon/lat and then doing the great-circle in app code.
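A minimal Python sketch of the bounding-box approach, assuming a hypothetical in-memory table `centroids` that maps each zip code to its center point in degrees (in practice this would come from a database like the ones mentioned above). The function names and the sample data are illustrative, and the box does not handle wrap-around at the 180° meridian.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def get_nearest(zip_code, miles, centroids):
    """Return zip codes whose center lies within `miles` of `zip_code`.

    `centroids` is a hypothetical dict: zip code -> (lat, lon) in degrees.
    The starting zip code itself is included in the result.
    """
    lat0, lon0 = centroids[zip_code]
    r = miles / EARTH_RADIUS_MILES  # angular radius of the circle
    dlat = math.degrees(r)
    cos_lat = math.cos(math.radians(lat0))
    if r >= math.pi / 2 or cos_lat <= math.sin(r):
        dlon = 180.0  # circle reaches a pole: no longitude restriction
    else:
        dlon = math.degrees(math.asin(math.sin(r) / cos_lat))

    result = []
    for code, (lat, lon) in centroids.items():
        # Bounding box test; in SQL this would be the BETWEEN clause.
        if abs(lat - lat0) > dlat or abs(lon - lon0) > dlon:
            continue
        # Exact great-circle test on the few candidates that survive.
        if haversine_miles(lat0, lon0, lat, lon) <= miles:
            result.append(code)
    return result


# Made-up coordinates for illustration, not real centroid data.
sample = {
    "07024": (40.85, -73.97),
    "near": (40.86, -73.97),
    "far": (41.85, -73.97),
}
nearby = get_nearest("07024", 5, sample)
```

The box only exists to discard most rows cheaply; the great-circle test still decides membership, so the result matches a pure step-through of every zip code.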

Maybe this can help. The project is configured in kilometers, though; you can change the units in CityDAO.java:
public List<City> findCityInRange(GeoPoint geoPoint, double distance) {
    List<City> cities = new ArrayList<City>();
    QueryBuilder queryBuilder = geoDistanceQuery("geoPoint")
            .point(geoPoint.getLat(), geoPoint.getLon())
            //.distance(distance, DistanceUnit.KILOMETERS) original
            .distance(distance, DistanceUnit.MILES)
            .optimizeBbox("memory")
            .geoDistance(GeoDistance.ARC);
    SearchRequestBuilder builder = esClient.getClient()
            .prepareSearch(INDEX)
            .setTypes("city")
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setScroll(new TimeValue(60000))
            .setSize(100).setExplain(true)
            .setPostFilter(queryBuilder)
            .addSort(SortBuilders.geoDistanceSort("geoPoint")
                    .order(SortOrder.ASC)
                    .point(geoPoint.getLat(), geoPoint.getLon())
                    //.unit(DistanceUnit.KILOMETERS)); Original
                    .unit(DistanceUnit.MILES));
    SearchResponse response = builder
            .execute()
            .actionGet();
    SearchHit[] hits = response.getHits().getHits();
    // Scroll until a page comes back empty, collecting the hits of each page.
    while (hits.length > 0) {
        for (SearchHit hit : hits) {
            Map<String, Object> result = hit.getSource();
            cities.add(mapper.convertValue(result, City.class));
        }
        response = esClient.getClient()
                .prepareSearchScroll(response.getScrollId())
                .setScroll(new TimeValue(60000))
                .execute()
                .actionGet();
        // Refresh the page; otherwise the first page would be added again on every pass.
        hits = response.getHits().getHits();
    }
    return cities;
}
The "LocationFinder\src\main\resources\json\cities.json" file contains all cities in Belgium. You can delete or create entries if you want to. As long as you don't change the names and/or structure, no code changes are required.
Make sure to read the README: https://github.com/GlennVanSchil/LocationFinder

Related

how to get Unique records from elastic search engine based on a field

I have an Elasticsearch index that stores the list of restaurants in an area. I'm using Spring's Elasticsearch support to query restaurants within 10 miles of a given geo-location (lat/long). I have a requirement to show each restaurant chain only once, but I'm seeing multiple records in my search results for restaurant chains because their branches have the same name but different addresses. I only need to show the nearest restaurant of each chain along with the other unique restaurants. Is there a single query that can do that? Below is my code (I removed some parts for brevity):
public SearchHits<Results> search(List<String> items){
    final NativeSearchQueryBuilder searchQuery = new NativeSearchQueryBuilder();
    BoolQueryBuilder termsQuery = boolQuery();
    termsQuery.should(termsQuery(entry.getKey(), items));
    boolQuery.must(termsQuery);
    // ...I do additional logic here
    searchQuery.withQuery(boolQuery);
    // apply the terms aggregation
    searchQuery.addAggregation(terms(CATEGORIES_KEY).field(CATEGORY).size(BUCKET_SIZE));
    Query query = searchQuery.build();
    SearchHits<Results> searchHits = elasticsearcTemplate.search(query, Results.class);
    return searchHits;
}
I was going through the Elasticsearch documentation, and it turns out there is a simple fix for this :) I can use collapse. The collapse feature removes duplicates based on a field, so I only needed to add this line:
searchQuery.withCollapseField("restaurant_name");
// restaurant_name is what I want unique values on
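Collapse keeps the top-sorted hit for each distinct value of the field, so to get the nearest branch of each chain the results also need to be sorted by distance. Below is a rough Python sketch of what the underlying request body could look like. The geo field name `location` is an assumption (it is not given in the question), and the collapse field generally has to be a keyword or numeric field.

```python
def nearest_per_chain_body(lat, lon, distance="10mi",
                           geo_field="location",
                           collapse_field="restaurant_name"):
    """Build a search body that keeps only the closest hit per collapse_field value.

    geo_field is a hypothetical name for the geo_point field in the index.
    """
    origin = {"lat": lat, "lon": lon}
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {"distance": distance, geo_field: origin}
                }
            }
        },
        # Nearest first, so the hit kept for each chain is its closest branch.
        "sort": [
            {"_geo_distance": {geo_field: origin, "order": "asc", "unit": "mi"}}
        ],
        "collapse": {"field": collapse_field},
    }


body = nearest_per_chain_body(40.0, -74.0)
```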

MongoTemplate, Criteria and Hashmap

Good Morning.
I'm just starting to learn MongoDB.
I'm facing the problem below, and I'm starting to wonder whether this is the best approach to the task or whether I should step back and solve it another way.
My goal is to iterate over a simple map whose keys are single values and whose values are arrays.
My test map will be received through a REST layer:
{
"1":["1","2","3"]
}
After some logic, I need to use the DAO to look things up in the database.
The key is the "realm"; the values in the array are "castles".
Every realm has some castles, and every castle has some "rules".
I need to find every rule for each available realm-castle combination.
AccessLevel is a POJO annotated with @Document; it has various fields, such as castle and realm (both simple ints).
So the idea is to iterate over the map and build one long query covering every key-value combination.
public AccessLevel searchAccessLevel(Map<String,Integer[]> request){
    Query q = new Query();
    Criteria c = new Criteria();
    request.forEach((k,v)-> {
        for (int i: Arrays.asList(v)) {
            q.addCriteria(c.andOperator(
                Criteria.where("realm").is(k),
                Criteria.where("castle").is(v))
            );
        }
    });
    List<AccessLevel> response=db.find(q,AccessLevel.class);
    for (AccessLevel x: response) {
        System.out.println(x.toString());
    }
As you can see, I'm getting an error concerning $and:
Due to limitations of the org.bson.Document, you can't add a second '$and' expression specified as [...]
It seems Mongo can't handle multiple $and expressions, something I'm used to abusing in SQL:
select * from a where id =1 and id=2 and id=3 and id=4
(not the best, since I could use IN(), but SQL allows it)
So the question is: can Mongo actually work this way, meaning I need to dig deeper into the problem, or do I need another approach, such as using Criteria.where(...).in(...) and running N queries through MongoTemplate, one for every key in my map?
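One possible shape for a single query, sketched in Python as a plain filter document. This assumes the intent is "any of these realm-castle pairs" rather than all of them at once; since castle is a simple int, a document cannot equal several castle values simultaneously, so AND-ing the pairs would match nothing. The function name is hypothetical. In Spring Data the same structure would be built with `Criteria.where("castle").in(...)` and `new Criteria().orOperator(...)`.

```python
def build_access_filter(request):
    """Turn {realm: [castle, ...]} into one MongoDB filter document.

    Each realm becomes one branch that matches any of its castles via $in;
    several realms are combined with a single $or.
    """
    branches = [
        {"realm": int(realm), "castle": {"$in": [int(c) for c in castles]}}
        for realm, castles in request.items()
    ]
    if not branches:
        raise ValueError("request must contain at least one realm")
    if len(branches) == 1:
        return branches[0]  # $or is unnecessary for a single realm
    return {"$or": branches}


single = build_access_filter({"1": ["1", "2", "3"]})
several = build_access_filter({"1": ["1", "2"], "2": ["7"]})
```

Building one filter this way avoids both the repeated $and and the N separate round trips.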

Lucene scoring: get cosine similarity as scores

I'm trying to solve a nearest-neighbor search problem.
Here is my code:
// Indexing
val analyzer = new StandardAnalyzer()
val directory = new RAMDirectory()
val config = new IndexWriterConfig(analyzer)
val iwriter = new IndexWriter(directory, config)
val queryField = "fieldname"
stringData.foreach { str =>
  val doc = new Document()
  doc.add(new TextField(queryField, str, Field.Store.YES))
  iwriter.addDocument(doc)
}
iwriter.close()
// Searching
val ireader = DirectoryReader.open(directory)
val isearcher = new IndexSearcher(ireader)
val parser = new QueryParser(queryField, analyzer)
val query = parser.parse("Some text for testing")
val hits = isearcher.search(query, 10).scoreDocs
When I look at hits, I see scores greater than 1.
As far as I know, Lucene's scoring formula is:
score(q,d) = coord-factor(q,d) · query-boost(q) · cosSim(q,d) · doc-len-norm(d) · doc-boost(d)
But I want only the cosine similarity between query and document, in the range [0,1], without coord-factor, doc-len-norm and so on.
What is a possible way to achieve this?
If you have gone through the official documentation, you will have seen that the other terms in the scoring expression are important and make the scoring process more logical and coherent.
Still, if you want scoring based only on cosine similarity, you can write a custom similarity class. I used several different similarity methods for document retrieval in a class assignment. In short, you can write your own similarity method and assign it to Lucene's index searcher. Here is an example that you can modify to accomplish what you want.
Write your custom class (the only scoring logic you need to override is the score() method).
import org.apache.lucene.search.similarities.BasicStats;
import org.apache.lucene.search.similarities.SimilarityBase;
public class MySimilarity extends SimilarityBase {
    @Override
    protected float score(BasicStats stats, float termFreq, float docLength) {
        // Both tf and idf use base-2 logarithms.
        double tf = 1 + (Math.log(termFreq) / Math.log(2));
        // Cast to double so the ratio is not truncated by integer division.
        double idf = Math.log((stats.getNumberOfDocuments() + 1) / (double) stats.getDocFreq()) / Math.log(2);
        return (float) (tf * idf);
    }
    // SimilarityBase declares toString() abstract, so it has to be implemented as well.
    @Override
    public String toString() {
        return "MySimilarity";
    }
}
Then assign your implemented method to index searcher for relevance calculation as below.
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(indexPath)));
IndexSearcher indexSearcher = new IndexSearcher(reader);
indexSearcher.setSimilarity(new MySimilarity());
Here I am using a tf-idf dot product to compute the similarity between the query and the documents. As implemented above, each matched term contributes
(1 + log2(tf)) · log2((N + 1) / df)
where tf is the term's frequency in the document.
Two things need to be mentioned here:
stats.getNumberOfDocuments() returns the total number of documents in the index (N).
stats.getDocFreq() returns the document frequency (df) of a term that appears in both the query and the document.
Lucene will now call the score() method you implemented to compute a relevance score for each matched term, that is, each term that appears in both the query and the document.
I know this is not a straightforward answer to your question, but you can adapt the approach above however you want. I implemented six different scoring techniques in my homework assignment. I hope it helps you too.
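For comparison, here is a small Python sketch (outside Lucene) of what a score bounded to [0,1] would look like: the tf-idf dot product divided by the lengths of both vectors. It reuses the term weights of the score() method above. The function names and the toy document frequencies are made up for illustration, and this is not how Lucene computes its scores internally.

```python
import math
from collections import Counter


def tfidf_vector(tokens, doc_freq, num_docs):
    """Sparse tf-idf vector: (1 + log2(tf)) * log2((N + 1) / df) per term."""
    vec = {}
    for term, tf in Counter(tokens).items():
        df = doc_freq.get(term, 0)
        if df == 0:
            continue  # term unknown to the index, contributes nothing
        vec[term] = (1 + math.log2(tf)) * math.log2((num_docs + 1) / df)
    return vec


def cosine_similarity(a, b):
    """Dot product of two sparse vectors divided by the product of their norms."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # Clamp to guard against floating-point results slightly above 1.
    return min(1.0, dot / (norm_a * norm_b))


# Toy statistics: 3 documents, document frequency per term.
doc_freq = {"zip": 2, "code": 1, "radius": 1}
query = tfidf_vector(["zip", "code"], doc_freq, 3)
doc = tfidf_vector(["zip", "radius"], doc_freq, 3)
similarity = cosine_similarity(query, doc)
```

Because every weight is non-negative, the result cannot drop below 0, and the normalization keeps it from exceeding 1.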

Merging a dynamic number of collections together

I'm working on my first Laravel project: a family tree. I have 4 branches of the family, each with people/families/images/stories/etc. A given user on the website will have access to everything for 1, 2, or 4 of these branches of the family (I don't want to show a cousin stuff for people they're not related to).
So on various pages I want the collections from the controller to contain stuff based on the given user's permissions. Merge seems like the right way to do this.
I have scopes to get people from each branch of the family, and in the following example I also have a scope for people with a birthday this month. In order to show the right set of birthdays for this user, I can get this by merging each group individually if they have access.
Here's what my function would look like if I showed everyone in all 4 family branches:
public function get_birthday_people()
{
    $user = \Auth::user();
    $jones_birthdays = Person::birthdays()->jones()->get();
    $smith_birthdays = Person::birthdays()->smith()->get();
    $lee_birthdays = Person::birthdays()->lee()->get();
    $brandt_birthdays = Person::birthdays()->brandt()->get();
    $birthday_people = $jones_birthdays
        ->merge($smith_birthdays)
        ->merge($lee_birthdays)
        ->merge($brandt_birthdays);
    return $birthday_people;
}
My challenge: I'd like to modify it so that I check the user's access and only add each group of people accordingly. I'm imagining something where it's all the same as above except I add conditionals like this:
if($user->jones_access) {
    $jones_birthdays = Person::birthdays()->jones()->get();
}
else{
    $jones_birthdays = NULL;
}
But that throws an error for users without access because I can't call merge on NULL (or an empty array, or the other versions of 'nothing' that I tried).
What's a good way to do something like this?
Use an empty Collection instead of NULL:
if($user->jones_access) {
    $jones_birthdays = Person::birthdays()->jones()->get();
}
else{
    $jones_birthdays = new Collection;
}
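Better yet, do the merge inside the condition, so no else is required. Note that merge() returns a new collection rather than modifying the original, so assign the result back.
$birthday_people = new Collection;
if($user->jones_access) {
    $birthday_people = $birthday_people->merge(Person::birthdays()->jones()->get());
}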
You are going to want your Eloquent query to only return the relevant data for the user requesting it. It doesn't make sense to query Lee birthdays when a Jones person is accessing that page.
So what you will wind up doing is something like
$birthdays = App\Person::where('family', $user->family)->get();
This pulls in Persons where their family property is equal to the family of the current user.
This probably does not match the way you have your relationships right now, but hopefully it will get you on the right track to getting them sorted out.
If you really want to go ahead with a bunch of queries and authorization checks, read up on the authorization features of Laravel. They let you assign abilities to users and check them easily.

Is $maxDistanceInKilometers broken?

I'm using the Parse REST API to run location-based queries. I'm using the following location:
latitude = "-33.90483";
longitude = "151.2243";
With that, I get objects back. But with the following location, I get zero objects:
latitude = "-33.89";
longitude = "151.14";
I also set this option: "$maxDistanceInKilometers" = 20;
I used http://andrew.hedges.name/experiments/haversine/ to find the distance between the two coordinates and found that they are only 7 km apart, so I should get at least some objects back from the second query.
Please let me know if I'm doing something wrong.
Thanks
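For reference, the distance between the two points can be checked with a few lines of Python using the haversine formula. This is only a sanity check of the arithmetic, assuming a spherical Earth with a mean radius of 6371 km; the function name is illustrative and nothing here calls the Parse API. It should come out at roughly 8 km, the same ballpark as the figure quoted above and well inside the 20 km limit.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


first = (-33.90483, 151.2243)   # query that returns objects
second = (-33.89, 151.14)       # query that returns nothing
distance_km = haversine_km(*first, *second)
within_limit = distance_km <= 20
print(f"{distance_km:.2f} km apart; within 20 km: {within_limit}")
```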
