How to show nearest store based on ZIP code - Xcode

I am creating an app that is for a business that has several stores around the state.
How can I show the information for the nearest stores based on the ZIP code?
Thanks for any help.

The basic idea is:
Convert the ZIP code to geographical coordinates (longitude and latitude).
Compute the distance of each store to this coordinate.
Order the results by distance, ascending.
Step 2 can be optimized a bit -- for example, you might limit the search to those stores in the same state. You may also want to limit the number of stores returned if you are only going to display 10, for example.
This is about all the detail I can provide since your question is quite general.
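To make those three steps concrete, here is a minimal Python sketch of the idea; the `zip_to_coords` table and the store list are made-up placeholders, and in a real app the ZIP lookup would come from a geocoding table or service:

```python
import math

# Hypothetical data: in a real app these would come from your store database
# and from a ZIP-code geocoding table or service.
zip_to_coords = {
    "30301": (33.7490, -84.3880),   # example: Atlanta, GA
}

stores = [
    {"name": "Store A", "lat": 33.95, "lon": -84.55},
    {"name": "Store B", "lat": 32.08, "lon": -81.09},
    {"name": "Store C", "lat": 34.00, "lon": -84.15},
]

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    r = 3958.8  # Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_stores(zip_code, limit=10):
    lat, lon = zip_to_coords[zip_code]          # step 1: ZIP code -> coordinates
    ranked = sorted(                            # steps 2 and 3: distance, then sort ascending
        stores,
        key=lambda s: haversine_miles(lat, lon, s["lat"], s["lon"]),
    )
    return ranked[:limit]                       # cap how many stores you return

print(nearest_stores("30301", limit=2))
```

On iOS the structure is the same: CLGeocoder can turn the ZIP code into coordinates, and CLLocation's distance(from:) can take the place of the haversine function.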

Related

Algorithm to find areas of support in a candlestick chart

I am in the process of designing an algorithm that will calculate regions in a candlestick chart where strong areas of support exist. An "area of support" in this case is defined as an area in the chart where the price of a stock rises by a large amount in a short period of time. (Please see the diagram below, the blue dots represent these strong areas of support)
The data I am working with is a list of over 6000 TOHLC (timestamp, open price, high price, low price, close price) values. For example, the first entry in this list of data is:
[1555286400, 83.7, 84.63, 83.7, 84.27]
The way I have structured the algorithm to work is as follows:
1.) The list of 6000+ TOHLC values is split into sub-lists of 30 TOHLC values (30 is a number that I arbitrarily chose). The lowest low price (LLP) is then obtained from each of these sub-lists. The purpose behind using this method is to find areas in the chart where prices dip.
2.) The next step is to determine how high the price rose from each of these lows. For this, I take the next 30 candlestick values from the low and determine what the highest high price (HHP) is. Then, if HHP / LLP >= 1.03, the low price is accepted, otherwise it is discarded. Again, 1.03 is a value that I arbitrarily chose, by analysing the stock chart manually and determining how much the price rose on average from these lows.
The blue dots in the chart above represent the areas of support accepted by the algorithm. It appears to be working well, in terms of what I am trying to achieve.
So the question I have is: does anyone have any improvements they can suggest for this algorithm, or point out any faults in it?
Thanks!
I may have misunderstood, but from your explanation it seems like you are doing your calculation on separate sub-lists of roughly 30 elements and then combining the results.
So, what if the LLP is the 30th element of sublist N and the HHP is the 1st element of sublist N+1? If you have taken that into account, then it's fine.
If you haven't taken that into account, I would suggest a moving-window approach to reading the data: start from the 0th element of the 6000+ TOHLC values with a window size of 30 and slide it one element at a time. This way, you won't miss any values.
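A rough sketch of that moving-window idea, reusing the question's window size of 30 and threshold of 1.03 (both still arbitrary) and assuming each row is [timestamp, open, high, low, close]:

```python
def find_support_areas(tohlc, window=30, rise_threshold=1.03):
    """Slide a window of `window` candles one element at a time.

    In each window, take the lowest low (LLP); then look at the `window`
    candles after that low and take the highest high (HHP). Keep the low
    if HHP / LLP >= rise_threshold.
    """
    supports = []
    for start in range(len(tohlc) - 2 * window):
        chunk = tohlc[start:start + window]
        low_idx, low_candle = min(enumerate(chunk), key=lambda ic: ic[1][3])  # index 3 = low
        llp = low_candle[3]
        abs_low_idx = start + low_idx
        following = tohlc[abs_low_idx + 1:abs_low_idx + 1 + window]
        if not following:
            continue
        hhp = max(candle[2] for candle in following)  # index 2 = high
        if hhp / llp >= rise_threshold:
            supports.append((low_candle[0], llp, hhp / llp))  # (timestamp, LLP, rise ratio)
    # Sliding one step at a time finds the same low many times, so deduplicate.
    return sorted(set(supports))
```

Because the rise ratio is kept alongside each accepted low, separating the lows by how strong the bounce was (as suggested below) becomes straightforward.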
Some of the selected blue dots have a deeper dip than others. Why is that? I would separate them with another classifier; if you store them in an object, store the dip rate as well.
Floating-point numbers are not recommended in finance. If possible, I'd use a different approach, and perhaps a different classifier, based solely on integers. It may not bother you or your project as of now, but it will surely begin to create false results as the numbers add up in the future.
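As a small illustration of the integer idea, prices with two decimal places can be stored as integer cents, and even the 1.03 test can stay in integer arithmetic:

```python
# 84.27 stored as a float can drift once you start accumulating sums;
# 8427 stored as integer cents cannot.
def to_cents(price_str):
    dollars, cents = price_str.split(".")
    return int(dollars) * 100 + int(cents.ljust(2, "0"))

llp = to_cents("83.70")   # 8370
hhp = to_cents("84.63")   # 8463

# hhp / llp >= 1.03 is equivalent to 100 * hhp >= 103 * llp, with no floats involved.
accepted = 100 * hhp >= 103 * llp
print(accepted)   # False here, since 84.63 / 83.70 is only about 1.011
```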

Statistics/Algorithm: How do I compare a weekly graph with its own history to see when in the past it was almost the same?

I've got a statistical/mathematical problem I'm stumped on and I was really hoping to get some help. I'm working on a research project where I need to compare a weekly graph with its own history to see when in the past it was almost the same. Think of this as "finding the closest match". The information is displayed as a line graph, but it's readily available as raw data:
Date...................Result
08/10/18......52.5
08/07/18......60.2
08/06/18......58.5
08/05/18......55.4
08/04/18......55.2
and so on...
What I really want is the output to be a form of correlation between the current data points and other sets of 5 consecutive data points in history. So, something like:
Date range.....................Correlation
07/10/18-07/15/18....0.98
We'll be getting code written in Python for the software to do this automatically (so that as new data is added, it automatically runs and finds the closest set of numbers matching the current one).
Here's where the difficulty sets in: since the numbers are on a general upward trend over time, we don't want it to compare absolute values (since the numbers might never really match). One suggestion has been to compare the deltas (rate of change as a percentage over the previous day), or to use a log scale.
I'm wondering: how do I go about this? What kind of calculation can I use to get the desired results? I've looked at the different kinds of correlation equations, but they don't account for the "shape" of the data, and they generally just average it out. The shape of the line chart is the important thing.
Thanks very much in advance!
I would simply divide the data of each week by its average (i.e., normalize each week to an average of 1), then sum the squares of the day-by-day differences for each pair of weeks. This sum is what you want to minimize.
If you don't care about how much a graph oscillates relative to its mean, you can also normalize the variance: for each week, calculate the mean and variance, then subtract the mean and divide by the square root of the variance. Each week will then have mean 0 and variance 1. Then minimize the sum of squared differences as before.
If the normalization of data is all you can change in your workflow, just leave out the sum of squares of differences minimization part.
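A sketch of the full approach in Python (the language mentioned above), assuming the results are in one chronological list and a "week" is 5 consecutive points; the numbers below are made up:

```python
import numpy as np

def normalize(window):
    """Mean 0, variance 1, so only the shape of the week matters."""
    w = np.asarray(window, dtype=float)
    return (w - w.mean()) / w.std()

def closest_week(values, week_len=5):
    """Compare the latest `week_len` points against every earlier window of the
    same length; return (start_index, sum_of_squared_differences) of the best match."""
    current = normalize(values[-week_len:])
    best = None
    for start in range(len(values) - 2 * week_len + 1):   # exclude overlap with the current week
        candidate = normalize(values[start:start + week_len])
        score = float(np.sum((current - candidate) ** 2))
        if best is None or score < best[1]:
            best = (start, score)
    return best

# Toy usage with made-up numbers; in practice these come from your data source.
values = [50.1, 51.0, 52.3, 51.8, 52.9, 54.0, 55.2, 55.4, 55.2, 58.5, 60.2, 52.5]
print(closest_week(values))
```

If you only want to remove the upward trend rather than the size of the oscillation, divide each window by its mean instead of z-scoring it, as in the first paragraph above.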

How does "Addressing missing data" help KNN function better?

Source: https://machinelearningmastery.com/k-nearest-neighbors-for-machine-learning/
This page has a section with the following passage:
Best Prepare Data for KNN
Rescale Data: KNN performs much better if all of the data has the same scale. Normalizing your data to the range [0, 1] is a good idea. It may also be a good idea to standardize your data if it has a Gaussian distribution.
Address Missing Data: Missing data will mean that the distance between samples cannot be calculated. These samples could be excluded or the missing values could be imputed.
Lower Dimensionality: KNN is suited for lower dimensional data. You can try it on high dimensional data (hundreds or thousands of input variables) but be aware that it may not perform as well as other techniques. KNN can benefit from feature selection that reduces the dimensionality of the input feature space.
Please can someone explain the second point, i.e. Address Missing Data, in detail?
Missing data in this context means that some samples do not have values for all of the features.
For example:
Suppose you have a database with age and height for a group of individuals.
This would mean that for some persons either the height or the age is missing.
Now, why does this affect KNN?
Given a test sample, KNN finds the samples that are closest to it (i.e., the individuals with similar age and height).
KNN does this to make some inference about the test sample based on its nearest neighbors.
If you want to find these neighbors, you must be able to compute the distance between samples, and to compute the distance between two samples you must have all the features for both of them.
If some features are missing, you won't be able to compute the distance.
So implicitly you would be losing the samples with missing data.
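A tiny numeric illustration of both options (excluding vs. imputing), with made-up age/height values:

```python
import numpy as np

# Feature columns: [age, height_cm]; np.nan marks a missing height.
samples = np.array([
    [25.0, 180.0],
    [30.0, np.nan],   # missing height
    [22.0, 165.0],
])
query = np.array([24.0, 170.0])

# Euclidean distance to a sample with a missing feature is undefined:
print(np.linalg.norm(query - samples[1]))        # -> nan

# Option 1: exclude samples with missing data.
complete = samples[~np.isnan(samples).any(axis=1)]
print(complete.shape)                            # (2, 2): one sample dropped

# Option 2: impute, e.g. with the column mean, so the sample can still be used.
col_means = np.nanmean(samples, axis=0)
imputed = np.where(np.isnan(samples), col_means, samples)

distances = np.linalg.norm(imputed - query, axis=1)
print(distances.argsort()[:2])                   # indices of the 2 nearest neighbors
```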

Find the State given Latitude and Longitude Coordinates

I have a set of 900 latitude and longitude coordinates -- I need a relatively simple method for finding the 'State' referred to by these coordinates. If it helps, the data is in Excel.
Google provides a Geocoding service. Part of this is reverse geocoding, which converts geographic coordinates into a human-readable address, including the state. This demo illustrates what can be done. There are limits to what you can do with this service.
Try to use the average values as provided here. With a bit of luck, most of your 900 coordinate pairs belong to the state with the nearest center. Calculation of distances between longitude/latitude locations is explained here.
An alternative would be to use a ZIP table with US postcodes as provided here. Once you know the postcode, you know the state, don't you? I'm not sure, but each state has an interval of ZIP codes. Once you know the ZIP code of a location, you can find the interval and the state it belongs to.
A list of coordinates of US locations could help to get a more exact allocation: http://www.bcca.org/bahaivision/fast/latlong_us.html
Find the nearest location in the list and take its state as result.
Google requires that geocoding / reverse geocoding be used with maps that users can see, so if that isn't an option for you, I think the best way is to use a database with spatial functions. First, you'll need the state boundaries, available for free from NationalAtlas.gov. I use SQL Server (you need the 2008 or 2012 version), where you can use the STContains() method to find which state a point belongs to.
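If a spatial database is more than you need, the same point-in-polygon idea can be sketched in Python with the shapely package; the rectangles below are rough stand-ins for real state boundaries, which you would instead load from a shapefile (e.g. the NationalAtlas data):

```python
from shapely.geometry import Point, Polygon

# Very rough bounding boxes as stand-in "state boundaries", in (lon, lat) order.
# Real boundaries would come from a shapefile of state polygons.
states = {
    "Colorado": Polygon([(-109.05, 37.0), (-102.05, 37.0), (-102.05, 41.0), (-109.05, 41.0)]),
    "Wyoming":  Polygon([(-111.05, 41.0), (-104.05, 41.0), (-104.05, 45.0), (-111.05, 45.0)]),
}

def state_of(lat, lon):
    point = Point(lon, lat)          # shapely expects (x, y) = (lon, lat)
    for name, boundary in states.items():
        if boundary.contains(point):
            return name
    return None

print(state_of(39.7392, -104.9903))  # Denver -> "Colorado"
```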
A simpler solution would be to just use the ezcmd.com REST API services.
They provide two APIs:
http://ezcmd.com/apps/app_geo_postal_codes#geo_postal_codes_api
1) All you have to do is give it a ZIP code and a country code (for the USA you can use either US or USA), optionally along with a distance radius and units (miles or km), and it'll return all other ZIP codes, with state and province, that are within the given distance.
2) Free search, where you give it any fuzzy search phrase that includes any one of ZIP / city / state / province and country, and it returns the best matches for that search phrase.
Hint: you can use #2 to find the ZIP code for a fuzzy (human-readable) address and pass that ZIP code to #1 to find the nearest places to it.
Also they have another API that returns zip code along with full geo location information for a given IP address here:
http://ezcmd.com/apps/app_ezip_locator#ezip_locator_api
Enjoy! I hope this helps.

Graph plotting: only keeping most relevant data

In order to save bandwidth, and so as not to have to generate pictures/graphs ourselves, I plan on using Google's charting API:
http://code.google.com/apis/chart/
which works by simply issuing a (potentially long) GET (or a POST); Google then generates and serves the graph itself.
As of now I've got graphs made of about two thousand entries, and I'd like to trim this down to some arbitrary number of entries (e.g. by keeping only 50% of the original entries, or 10% of the original entries).
How can I decide which entries I should keep so that my new graph stays as close as possible to the original graph?
Is this some kind of curve-fitting problem?
Note that I know I can POST to Google's chart API with up to 16K of data, and this may be enough for my needs, but I'm still curious.
The flot-downsample plugin for the Flot JavaScript graphing library could do what you are looking for, up to a point.
The purpose is to try to retain the visual characteristics of the original line using considerably fewer data points.
The research behind this algorithm is documented in the author's thesis.
Note that it doesn't work for every kind of series and, in my experience, won't give meaningful results when you want a downsampling factor beyond 10.
The problem is that it cuts the series into windows of equal size and then keeps one point per window. Since you may have denser data in some windows than in others, the result is not necessarily optimal. But it's efficient (it runs in linear time).
What you are looking to do is known as downsampling or decimation. Essentially you filter the data and then drop N - 1 out of every N samples (decimation or down-sampling by factor of N). A crude filter is just taking a local moving average. E.g. if you want to decimate by a factor of N = 10 then replace every 10 points by the average of those 10 points.
Note that with the above scheme you may lose some high frequency data from your plot (since you are effectively low pass filtering the data) - if it's important to see short term variability then an alternative approach is to plot every N points as a single vertical bar which represents the range (i.e. min..max) of those N points.
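A sketch of both variants (block averages and min/max bars), assuming the data is just a list of y-values and the factor N is arbitrary:

```python
def decimate_mean(values, n=10):
    """Keep one point per block of n: the block average (crude low-pass filter + decimate)."""
    return [sum(values[i:i + n]) / len(values[i:i + n])
            for i in range(0, len(values), n)]

def decimate_min_max(values, n=10):
    """Keep the range of each block of n, so short-term spikes stay visible."""
    return [(min(values[i:i + n]), max(values[i:i + n]))
            for i in range(0, len(values), n)]

data = [float(i % 37) for i in range(2000)]   # stand-in for the ~2000 real entries
print(len(decimate_mean(data, n=10)))         # 200 points: 10% of the original
print(decimate_min_max(data, n=10)[:3])       # first three (min, max) bars
```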
Graph (time series data) summarization is a very hard problem. It's like deciding, in a text, which parts are "relevant" enough to keep in an automatic summary. I suggest you use one of the most respected libraries for finding "patterns of interest" in time series data, by Eamonn Keogh.
