Sharing an internet connection between two places - wireless

I need to share my internet connection between my two homes.
The first home is on the 12th floor of an apartment building, and the second place is 800-900 m from my flat as the crow flies. I don't have a direct line of sight from home to home, so I suppose that might be a problem; there are a few buildings between the locations.
I was thinking about high-gain antennas at both locations, pointing at some third location that both of my homes can see. Is that possible, using a Nanostation or similar?
What can I do to solve this problem?

Which country do you live in?
Wi-Fi using the IEEE 802.11a standard can transmit up to 5,000 m on the 3.7 GHz band, but that band is only allowed in the USA.
Look here: https://en.wikipedia.org/wiki/IEEE_802.11#endnote_80211ns_37ghzA1
And two notes:
1) You should be careful about using external antennas, as you can violate transmission regulations.
2) This is probably a question you should ask on Network Engineering Stack Exchange:
http://networkengineering.stackexchange.com

Related

Finding POIs that are near or contain a certain location

I have an application that does the following:
Receives a device's location
Fetches a route (collection of POIs, or Points of Interest) assigned to that device
Determines if the device is near any of the POIs in the route
The route's POIs can be either a point with a radius, in which case it should detect if the device is within the radius of the point; or a polygon, where it should detect if the device is inside of it.
Here is a sample of a route with 3 POIs, two of them are points with different radii, and the other one is a polygon:
https://jsonblob.com/285c86cd-61d5-11e7-ae4c-fd99f61d20b8
My current algorithm is programmed in PHP with a MySQL database. When a device sends a new location, the script loads all the POIs for its route from the database into memory, and then iterates through them. For POIs that are points, it uses the Haversine formula to find if the device is within the POI's radius, and for POIs that are polygons it uses a "point in polygon" algorithm to find if the device is inside of it or not.
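For reference, a minimal sketch of those two per-POI checks, written in Python rather than the PHP the app actually uses; the POI dict layout is an assumption loosely based on the sample route:

    from math import radians, sin, cos, asin, sqrt

    EARTH_RADIUS_M = 6371000

    def haversine_m(lat1, lon1, lat2, lon2):
        # great-circle distance between two lat/lon points, in metres
        dlat = radians(lat2 - lat1)
        dlon = radians(lon2 - lon1)
        a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
        return 2 * EARTH_RADIUS_M * asin(sqrt(a))

    def point_in_polygon(lat, lon, vertices):
        # ray-casting test; vertices is a list of (lat, lon) tuples
        inside = False
        j = len(vertices) - 1
        for i in range(len(vertices)):
            (lat_i, lon_i), (lat_j, lon_j) = vertices[i], vertices[j]
            if (lon_i > lon) != (lon_j > lon) and \
               lat < (lat_j - lat_i) * (lon - lon_i) / (lon_j - lon_i) + lat_i:
                inside = not inside
            j = i
        return inside

    def device_near_poi(device_lat, device_lon, poi):
        # poi is a dict with either "center" + "radius_m" or "polygon"
        if "radius_m" in poi:
            lat, lon = poi["center"]
            return haversine_m(device_lat, device_lon, lat, lon) <= poi["radius_m"]
        return point_in_polygon(device_lat, device_lon, poi["polygon"])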
I would like to rewrite the algorithm with the goal of using less computing resources than the current one. We receive about 100 locations per second and they each have to be checked against routes that have about 40 POIs on average.
I can use any language and database to do so, which ones would you recommend for the best possible performance?
I'd use a database (e.g., Postgresql) that supports spatial queries.
That will let you create a spatial index that puts a bounding box around each POI. You can use this to do an initial check to (typically) eliminate the vast majority of POIs that aren't even close to the current position (i.e., where the current position isn't inside their bounding box).
Then when you've narrowed it down to a few POIs, you can test the few that are left using roughly the algorithm you use now, but instead of testing 40 POIs per point, you might be testing only 2 or 3.
Exactly how well this will work will depend heavily upon how close to rectangular your POIs are. Circular is close enough that it tends to give pretty good results.
Other shapes vary: for example, a river that runs nearly north-south may work quite well, but if you have a river that runs mostly diagonally, it may be worthwhile to break it up into a number of square/rectangular segments instead of treating the whole thing as a single feature, since the latter creates a bounding box with a lot of space that is quite far from the river.
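To make the suggestion concrete, here is a rough sketch of what such an indexed query might look like with PostGIS. The table and column names (pois, geom, radius_m, route_id) are hypothetical; ST_DWithin and ST_Contains both use a GiST index on geom for the bounding-box pre-filter, so only the few nearby candidates get the exact test:

    # a sketch, assuming a hypothetical "pois" table with a geometry column
    # "geom" (SRID 4326), a nullable "radius_m" for point POIs, and a GiST index
    import psycopg2

    SQL = """
    SELECT id
    FROM pois
    WHERE route_id = %(route_id)s
      AND (
            -- point POIs: device within the radius (index-assisted)
            (radius_m IS NOT NULL AND ST_DWithin(
                geom::geography,
                ST_SetSRID(ST_MakePoint(%(lon)s, %(lat)s), 4326)::geography,
                radius_m))
            -- polygon POIs: containment test (index-assisted)
            OR (radius_m IS NULL AND ST_Contains(
                geom,
                ST_SetSRID(ST_MakePoint(%(lon)s, %(lat)s), 4326)))
          )
    """

    def pois_hit(conn, route_id, lat, lon):
        # return ids of the route's POIs that the device position falls inside
        with conn.cursor() as cur:
            cur.execute(SQL, {"route_id": route_id, "lat": lat, "lon": lon})
            return [row[0] for row in cur.fetchall()]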

Designing an algorithm that keeps traffic equally distributed over a network

My question isn't about how to code a particular thing or any specific technical programming problem. I need help developing the logic of an algorithm I am working on, which I will explain in a bit. I am here because I couldn't think of a better place than Stack Overflow to help me out, as it has proved the best option in the past. I would like to thank you for helping me out and giving me your precious time.
So here is my brain-teasing problem:
Aim:
I am developing an algorithm that aims to distribute traffic (cars) equally over a network. I will simulate the algorithm in a traffic simulator to check how it behaves.
To show visually what I am planning, I have included this image:
Plot of the traffic network, showing that some roads are more populated than others. Thicker lines indicate roads that carry more traffic.
Here are some points to note:
The image represents an 8×8 grid of junctions (intersections of roads) and the roads themselves. I'm not great with graphics, so to be clear: four roads meet to form each junction (not clearly shown in the diagram, but you get what I mean).
Notice the directed arrows; they indicate the direction in which the traffic moves. For example, the first row indicates that traffic moves from right to left, and the second row indicates that traffic moves from left to right. Similarly, the first column indicates that traffic moves from bottom to top, and so on. If you look closely, you will notice that I have kept an alternating pattern, i.e. right to left, then left to right, then right to left again; the same goes for the columns.
No turning of vehicles is allowed. Since this is an early stage of my algorithm, I have added the restriction that vehicles cannot turn and must continue in the direction of the arrow.
This is a wrap-around network: cars that reach the edge of the network do not vanish, they loop around and are fed back into the network, still travelling in the direction of the arrow. For example, in the first row, after cars leave the left-most junction they rejoin at the right-most junction. It is essentially a mesh/torus.
The traffic is created randomly, and as the image suggests, the thicker the line, the more traffic on that road, and vice versa.
Important: at a junction, if we allow cars to pass from left to right (horizontally), all the cars moving vertically have to be stopped (obviously, so that they don't crash into each other), and vice versa.
What I need help in:
Pseudocode/logic for the algorithm that lets me distribute the traffic equally across the network (graphically, lines of almost the same thickness as in my image). Mathematically, the thickness of each line should converge to the average number of cars per road, so that the whole network is balanced. Roads should be equally populated; no road should carry more traffic than another (fairness policy).
I wanted to solve the problem with a divide-and-conquer approach, because in the real world the network will be very big, so it is practically impossible to have information about the whole network at any given moment.
I want to solve the problems locally and round-wise.
For example, Round 1: Solve 4 junctions at a time and try to equalize the traffic at them.
Round 2: Some other 4 junctions
and so on...
Tip: the effect should propagate as the rounds advance, approaching a near-optimal state in which the traffic is equally distributed, at which point we stop the algorithm. (A rough sketch of this round structure follows below.)
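Purely as a starting point, here is a rough Python sketch of that round structure on the wrap-around grid. Everything in it (the per-road car counts, the "move one car from the heaviest to the lightest incident road" rule, the stopping test) is an assumption for illustration; it ignores the no-turning constraint and real signal timing:

    import random

    N = 8  # 8x8 grid of junctions, wrapped around (torus)

    def make_network():
        # load[(kind, index, segment)] = number of cars on that road segment;
        # each row/column is split into N segments between junctions
        load = {}
        for kind in ("row", "col"):
            for index in range(N):
                for seg in range(N):
                    load[(kind, index, seg)] = random.randint(0, 20)
        return load

    def roads_at(r, c):
        # the four road segments meeting at junction (r, c) on the torus
        return [("row", r, c), ("row", r, (c - 1) % N),
                ("col", c, r), ("col", c, (r - 1) % N)]

    def balance_round(load, junctions, step=1):
        # purely local rule: at each junction in this round, shift `step` cars
        # from its most loaded incident segment to its least loaded one
        for (r, c) in junctions:
            roads = roads_at(r, c)
            heavy = max(roads, key=lambda k: load[k])
            light = min(roads, key=lambda k: load[k])
            if load[heavy] - load[light] > step:
                load[heavy] -= step
                load[light] += step

    def spread(load):
        return max(load.values()) - min(load.values())

    load = make_network()
    for round_no in range(1000):
        group = [(random.randrange(N), random.randrange(N)) for _ in range(4)]
        balance_round(load, group)
        if spread(load) <= 2:  # arbitrary "balanced enough" criterion
            break

Each round only looks at four junctions, so the balancing effect spreads gradually across the network, which is the propagation behaviour described in the tip above.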
I know this is a long post, and I really appreciate anyone who even tries to understand my objective and gives it some thought.
This idea popped into my head because of growing traffic problems and the observation that some roads are less used than others; distributing the cars equally would take better advantage of the available infrastructure.

Cache Geospatial calculations or calculate on the fly?

I'm a developer on a service vehicle dispatching web app. It's written in .Net 4+, MVC4, using SQL server.
There are 2000+ locations stored in the database as geography data types. Assuming we send resources from location A to location B, the drive time, distance, etc. needs to be displayed at some point. If I calculate the distance with SQL Server's STDistance, it will only give me the "as the crow flies" distance, so the system will need to hit a geospatial service like Bing, Google, or ESRI to get the actual drive time or suggested routes. The problem is that this is a core function and will happen a lot.
Should I pre-populate a lookup table with pre-calculated distances or average drive times? The downside is that, even without adding more locations, that's roughly 4 million records (2000 × 2000 pairs) to search every time the information is needed.
On top of this, most of the time the destination is not one of our stored geospatial coordinates; it can instead be an address or a long/lat point anywhere on the continent, which makes pre-calculating impossible.
I'm trying to avoid the performance issues of having to hit some geoservices endpoint constantly.
Any suggestions on how best to approach this?
-thanks!
Having looked at these problems before, I'd say you are unlikely to be able to store them all.
It is usually against almost all of the routing providers' TOS for you to cache the results. You can sometimes negotiate that ability, but it costs a lot.
Given that there is not a fixed set of points you are searching against, doing one calculation gives you little information for the next calculation.
I would say you could store the route for a pair once it has been selected, so you can show that route again if needed. Once the transaction is done, I would remove the route from your DB.
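As a rough illustration of that "keep the route only for the life of the dispatch" idea, here is a hedged Python sketch; the fetch callback stands in for whatever Bing/Google/ESRI call you make, and the key rounding and TTL are arbitrary choices:

    import time

    class TransientRouteCache:
        def __init__(self, ttl_seconds=3600):
            self.ttl = ttl_seconds
            self._routes = {}  # key -> (expires_at, route)

        @staticmethod
        def _key(origin, dest, precision=4):
            # ~11 m precision at 4 decimal places, so nearby requests share an entry
            return (round(origin[0], precision), round(origin[1], precision),
                    round(dest[0], precision), round(dest[1], precision))

        def get(self, origin, dest, fetch):
            # origin/dest are (lat, lon); fetch(origin, dest) hits the routing service
            key = self._key(origin, dest)
            hit = self._routes.get(key)
            now = time.time()
            if hit and hit[0] > now:
                return hit[1]
            route = fetch(origin, dest)
            self._routes[key] = (now + self.ttl, route)
            return route

        def forget(self, origin, dest):
            # call this when the dispatch transaction completes
            self._routes.pop(self._key(origin, dest), None)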
If you really want to cache all this, or want more control over it, you can use PGRouting (with Postgresql) and obtain your own street data, though I doubt it is worth the effort.

Regional Proximity UI

I'm developing a UI (AJAX-enabled; LAMP server) which will allow a user to designate regions in which a company operates. A "region" in this case may be a state (if dealing with the US), a province (Canada), or an entire country (everyone else).
As there are 195 countries in the world, I would like to avoid a multi-select box or list of checkboxes. In the workflow leading to this particular screen, the user will have already entered the full address of the company, so I have a starting region to work from.
Since the majority of companies only operate out of their own region, and those covering multiple regions tend not to branch out too far, I am considering displaying the list of regions gradually based on proximity. I realize at some point (I'm using 3 passes for now) the full list will need to be displayed; I'm just trying to delay the user from reaching that point as it's a definite edge case.
Here is a PNG mockup that explains this concept a bit more clearly. (196kb)
Questions:
What suggestions do you have for the actual form interaction? This has not been presented to representative end users yet, but I'm open to all suggestions during the prototyping stage.
Do you think 'rolling up' US states and/or Canadian provinces between transitions will negatively affect the user's spatial memory?
More clearly: after the 3rd pass, the company will operate in every US state - so convert those 50 inputs into one.
Are there any existing applications that have utilized this approach to use as a baseline or demo?
And, since I know my developer will want to know - what would be the easiest way to store each region's proximity? Lat/long of the center? Lat/long of each corner of a 'bounding box' (more accurate)? I'm assuming we will end up writing some proximity calculations based on the lat/long of the company's actual address.
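To give the developer something concrete, here is a small Python sketch of the two storage options mentioned (centroid vs. bounding box) and a crude proximity score from the company's address; the coordinates are rough illustrative values, not authoritative region data:

    from math import radians, sin, cos, asin, sqrt

    def haversine_km(p, q):
        # great-circle distance in km between two (lat, lon) points
        (lat1, lon1), (lat2, lon2) = p, q
        dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
        a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
        return 2 * 6371 * asin(sqrt(a))

    REGIONS = {
        # name: (centroid, bounding box as (min_lat, min_lon, max_lat, max_lon))
        "Pennsylvania": ((40.9, -77.6), (39.7, -80.5, 42.3, -74.7)),
        "Ohio":         ((40.3, -82.8), (38.4, -84.8, 42.0, -80.5)),
    }

    def distance_to_region(point, centroid, bbox):
        # 0 km if the point is inside the box, otherwise distance to the centroid;
        # crude, but enough to sort regions into proximity "passes"
        min_lat, min_lon, max_lat, max_lon = bbox
        if min_lat <= point[0] <= max_lat and min_lon <= point[1] <= max_lon:
            return 0.0
        return haversine_km(point, centroid)

    company = (40.44, -79.99)  # e.g. a Pittsburgh address
    ranked = sorted(REGIONS, key=lambda r: distance_to_region(company, *REGIONS[r]))

The bounding box gives a quick inside/outside answer; the centroid distance is then enough to order the remaining regions for the second and third passes.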
Are you expecting users to read the map in order to know which list of checkboxes to go to? If your users have that level of geographic ability, then it's less work for them to select the regions directly from the map, rather than making a map-to-proximity-level cognitive transfer followed by a proximity-level-to-region transfer.
If some users do not have that level of geographic expertise (you may be surprised how many Americans cannot find their own state on a US map), then I'd try, perhaps in addition to the map, no more than two lists: one proximal (the default) with regions close to the home address, and one exhaustive. I can't see users with weak geographic abilities being able to handle multiple arbitrary levels of proximity. People who can't read maps well are not going to be able to estimate the proximity level of one region to another. So the idea is to try a proximal list, and if that doesn't work, forget about proximity and go exhaustive; don't send your users wandering among proximity levels looking for Idaho ("I swear it's near Indiana").
By default, show the proximal list with regions likely to satisfy most of your users, based on research of your likely clients. A "more" button displays the exhaustive list. Both lists should be sorted alphabetically, except first subdivide the exhaustive list into States of the US, Provinces & Territories of Canada, and Countries (which includes the US (all) and Canada (all)).
You can provide some command buttons to select multiple regions (e.g., "All 48 contiguous US states", "All of South America"), allowing users to de-select some regions afterward. For this reason, I wouldn't roll anything up until the user commits the input.
As an example of someone using a map plus list (all in HTML, no less), see http://justaddwater.dk/2007/12/21/map-with-positions-in-css/
I am not really clear on what you are trying to achieve with the current UI (are you looking for branch offices? other companies? etc.).
I am not a big fan of using pure geographical proximity to define regions. For example, if a company operates in NYC, it could have an office in NJ, which might as well be on the moon; on the other hand, for a company in Anchorage, an office in Vancouver could still be within the region. Unfortunately, state boundaries are fairly meaningless too. For example, I live in western PA, and I can tell you that while Pittsburgh and Philly are in the same state, they could be different countries for all that matters, and most companies have offices in each.
If your project is LAMP-based, why not just let the user click a point on the map and, based on that, ask what they mean (nearest city, entire county, entire state, entire country)? If you then need to define the entire region, you could perhaps use some sort of grab tool to click or delineate all the other regions that should be part of it.
Either way, present your offices as pushpins on the map, and then maybe have a list on the side the way that standard google maps handles searches.
It may be a lot of work, but if it's an important form, users may prefer that over manual text entry or selections from a list.

Algorithm to classify a list of products? [closed]

I have a list representing products which are more or less the same. For instance, in the list below, they are all Seagate hard drives.
Seagate Hard Drive 500Go
Seagate Hard Drive 120Go for laptop
Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive
New and shinny 500Go hard drive from Seagate
Seagate Barracuda 7200.12
Seagate FreeAgent Desk 500GB External Hard Drive Silver 7200RPM USB2.0 Retail
To a human being, hard drives 3 and 5 are the same. We could go a little further and suppose that products 1, 3, 4 and 5 are the same, and put products 2 and 6 in other categories.
We have a huge list of products that I would like to classify. Does anybody have an idea of what would be the best algorithm to do such a thing? Any suggestions?
I thought of a Bayesian classifier, but I am not sure if it is the best choice. Any help would be appreciated!
Thanks.
You need at least two components:
First, you need something that does "feature" extraction, i.e. that takes your items and extracts the relevant information. For example, "new and shinny" is not as relevant as "500Go hard drive" and "seagate". A (very) simple approach would be a heuristic that extracts manufacturers, technology names like "USB2.0", and patterns like "GB" and "RPM" from each item.
You then end up with a set of features for each item. Some machine-learning people like to put this into a "feature vector", i.e. a vector with one entry per feature, set to 0 or 1 depending on whether the feature exists. This is your data representation; on these vectors you can then do a distance comparison.
Note that you might end up with a vector of thousands of entries. Even then, you still have to cluster your results.
Possibly useful Wikipedia articles:
Feature Extraction
Nearest Neighbour Search
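A small, hedged Python sketch of the extraction-plus-feature-vector idea described above; the regex patterns and feature names are assumptions, not a complete extractor:

    import re

    ITEMS = [
        "Seagate Hard Drive 500Go",
        "Seagate Hard Drive 120Go for laptop",
        "Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive",
    ]

    PATTERNS = {
        "manufacturer": r"\b(seagate|maxtor|quantum|western digital)\b",
        "capacity":     r"\b\d+\s*(gb|go|mb|tb)\b",
        "rpm":          r"\b\d{4,5}\s*rpm\b",
        "interface":    r"\b(sata|usb2\.0|ide|scsi)\b",
    }

    def extract_features(text):
        # returns a set of (feature_name, matched_text) pairs
        found = set()
        lower = text.lower()
        for name, pattern in PATTERNS.items():
            m = re.search(pattern, lower)
            if m:
                found.add((name, m.group(0).strip()))
        return found

    # one 0/1 entry per feature seen anywhere in the data
    all_features = sorted({f for item in ITEMS for f in extract_features(item)})

    def to_vector(text):
        feats = extract_features(text)
        return [1 if f in feats else 0 for f in all_features]

    vectors = [to_vector(item) for item in ITEMS]
    # distance between two items can now be a Hamming or Jaccard comparison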
One of the problems you will encounter is deciding nearest neighbours for non-linear or non-ordered attributes. I'm building on Manuel's entry here.
One problem you will have is to decide on proximity of (1) Seagate 500Go, (2) Seagate Hard Drive 120Go for laptop, and (3) Seagate FreeAgent Desk 500GB External Hard Drive Silver 7200RPM USB2.0 Retail:
Is 1 closer to 2 or to 3? Do the differences justify different categories?
A human would say that 3 is between 1 and 2, since an external HD can be used with both kinds of machines. That means that if somebody searches for an HD for their desktop and broadens the scope of the selection to include alternatives, external HDs will be shown too, but not laptop HDs. SSDs, USB memory sticks, and CD/DVD drives will probably show up before laptop drives as the scope is enlarged.
Possible solution:
Present users with pairs of attributes and let them weight proximity. Give them a scale to tell you how close together certain attributes are. Broadening the scope of a selection will then use this scale as a distance function on this attribute.
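For example, the user-supplied closeness scores could be stored per attribute and turned into a distance function roughly like this (a sketch; the values are invented):

    # pairwise closeness of form-factor values, e.g. gathered from user ratings
    # (1.0 = effectively the same, 0.0 = completely unrelated; symmetric)
    FORM_FACTOR_CLOSENESS = {
        ("desktop hdd", "external hdd"): 0.8,
        ("laptop hdd", "external hdd"): 0.7,
        ("desktop hdd", "laptop hdd"): 0.4,
    }

    def attribute_distance(a, b, closeness, default=1.0):
        # distance on a single non-ordered attribute, derived from user weights
        if a == b:
            return 0.0
        score = closeness.get((a, b), closeness.get((b, a)))
        return default if score is None else 1.0 - score

    # broadening a search from "desktop hdd" then reaches "external hdd"
    # (distance 0.2) before "laptop hdd" (distance 0.6), matching the intuition above
    print(attribute_distance("desktop hdd", "external hdd", FORM_FACTOR_CLOSENESS))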
To actually classify a product, you could use something like an "enhanced neural network" with a blackboard. (This is just a metaphor to get you thinking in the right direction, not a strict use of the terms.)
Imagine a set of objects that are connected through listeners or events (just like neurons and synapses). Each object has a set of patterns and tests the input against these patterns.
An example:
One object tests for ("seagate"|"connor"|"maxtor"|"quantum"| ...)
Another object tests for [:digit:]*(" ")?("gb"|"mb")
Another object tests for [:digit:]*(" ")?"rpm"
All these objects connect to another object that, if certain combinations of them fire, categorizes the input as a hard drive. The individual objects themselves would enter certain characterizations into the blackboard (a common writing area for statements about the input), such as manufacturer, capacity, or speed.
So the neurons do not fire based on a threshold, but on recognition of a pattern. Many of these neurons can work in parallel on the blackboard and even correct categorizations made by other neurons (maybe by introducing certainty scores?).
I used something like this in a prototype for a product used to classify products according to UNSPSC and was able to get 97% correct classification on car parts.
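A toy sketch of that recognizer-plus-blackboard idea in Python; the patterns and the final decision rule are illustrative assumptions, not the original prototype:

    import re

    # each "neuron" recognizes one kind of pattern and writes what it found
    RECOGNIZERS = [
        ("manufacturer", re.compile(r"\b(seagate|connor|maxtor|quantum)\b", re.I)),
        ("capacity",     re.compile(r"\b\d+\s*(gb|go|mb|tb)\b", re.I)),
        ("speed",        re.compile(r"\b\d{4,5}\s*rpm\b", re.I)),
    ]

    def classify(text):
        blackboard = {}  # shared writing area for everything recognized so far
        for slot, pattern in RECOGNIZERS:
            m = pattern.search(text)
            if m:
                blackboard[slot] = m.group(0)
        # decision rule: manufacturer plus capacity or speed looks like a hard drive
        if "manufacturer" in blackboard and ("capacity" in blackboard or "speed" in blackboard):
            blackboard["category"] = "hard drive"
        return blackboard

    print(classify("Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive"))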
There's no easy solution for this kind of problem, especially if your list is really large (millions of items). Maybe these two papers can point you in the right direction:
http://www.cs.utexas.edu/users/ml/papers/normalization-icdm-05.pdf
http://www.ismll.uni-hildesheim.de/pub/pdfs/Rendle_SchmidtThieme2006-Object_Identification_with_Constraints.pdf
MALLET has implementations of CRFs and MaxEnt that can probably do the job well. As someone said earlier, you'll need to extract the features first and then feed them into your classifier.
To be honest, this seems more like a Record Linkage problem than a classification problem. You don't know ahead of time what all of the classes are, right? But you do want to figure out which product names refer to the same products, and which refer to different ones?
First I'd use a CountVectorizer to look at the vocabulary generated. There will be words like 'from', 'laptop', 'fast', 'silver', etc. You can use stop words to discard words that give us no information. I'd also go ahead and discard 'hard', 'drive', 'hard drive', etc., because I know this is a list of hard drives, so they provide no information. Then we'd have a list of entries like
Seagate 500Go
Seagate 120Go
Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s
500Go Seagate etc.
You can use a list of features: things ending in 'RPM' are likely to give RPM information, and the same goes for tokens ending in 'MB/s' or 'Gb/s'. Then I'd discard alphanumeric tokens like '1234FBA5235', which are most likely model numbers and won't give us much information. If you are already aware of the hard-drive brands appearing in your list, like 'Seagate' or 'Kingston', you can use string similarity or simply check whether they are present in the given sentence. Once that's done, you can use clustering to group similar objects together: objects with similar RPM, capacity, transfer speed, and brand name will be clustered together. If you use something like KMeans, you'll have to figure out the best value of K, which takes some manual work; you could use a scatter plot and eyeball which value of K classifies the data best.
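A rough sketch of that CountVectorizer-plus-KMeans pipeline with scikit-learn; the extra stop words and K=3 are guesses that you would tune as described above:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.cluster import KMeans

    items = [
        "Seagate Hard Drive 500Go",
        "Seagate Hard Drive 120Go for laptop",
        "Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive",
        "New and shinny 500Go hard drive from Seagate",
        "Seagate Barracuda 7200.12",
        "Seagate FreeAgent Desk 500GB External Hard Drive Silver 7200RPM USB2.0 Retail",
    ]

    # words that carry no signal for this particular list
    domain_stop_words = ["hard", "drive", "new", "and", "from", "for", "retail"]

    vectorizer = CountVectorizer(stop_words=domain_stop_words, lowercase=True)
    X = vectorizer.fit_transform(items)

    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
    labels = kmeans.fit_predict(X)

    for item, label in zip(items, labels):
        print(label, item)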
But the problem with the above approach is that if you don't know the list of brands beforehand, you'd be in trouble. In that case I'd use a Bayesian classifier on each sentence to get the probability that it refers to a hard-drive brand. I'd look at two things:
Look at the data: most of the time the sentence will explicitly mention the words 'hard drive', in which case I know it's definitely talking about a hard drive. The chances of something like 'Mercedes Benz hard drive' are slim.
This is a bit laborious, but I'd write a Python web scraper for Amazon (or, if you can't write one, just Google the most-used hard drive brands and create a list). That gives me entries like 'Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s'; then, for every sentence, I'd use something like Naive Bayes to give me the probability that it mentions a brand. sklearn comes in pretty handy for this.
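And a very small sketch of that Naive Bayes step with scikit-learn; the tiny labelled set here is made up purely for illustration (in practice it would come from the scraped brand/product list):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train_sentences = [
        "Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s",
        "Western Digital Caviar Blue 1TB 7200 RPM hard drive",
        "Kingston 500GB internal hard drive for desktop",
        "Mercedes Benz floor mats, set of four",
        "Stainless steel kitchen knife with wooden handle",
    ]
    labels = [1, 1, 1, 0, 0]  # 1 = mentions a hard-drive brand/product, 0 = does not

    model = make_pipeline(CountVectorizer(lowercase=True), MultinomialNB())
    model.fit(train_sentences, labels)

    new = ["New and shinny 500Go hard drive from Seagate"]
    print(model.predict_proba(new))  # probability of each class for the new sentence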

Resources