DataStructure to cache a large set of Chart DataPoints - algorithm

I have a scenario where I am plotting a chart using an API such as JFreeChart, SWT Chart, or BIRT; any of them is fine.
The data to plot is large, around 10 GB. The chart therefore keeps only the latest data points (X, Y) and discards the rest to use memory efficiently.
Once that is done, a user may try to zoom the chart or want to see certain specific data points. To handle this I would need to cache all the data points in the chart, but keeping every data point would again consume a huge amount of memory.
So what is the most efficient algorithm, or more precisely which data structure, solves this problem?
This is not specific to Java, but I am programming in Java, so I mention it here.
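For reference, the "keep only the latest points" behaviour described in the question can be sketched as a bounded buffer. This is only an illustration of that behaviour, not how JFreeChart, SWT Chart, or BIRT implement it; the class name LatestPointsWindow and its methods are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: a fixed-capacity window that keeps the newest points
// and drops the oldest one whenever it is full.
final class LatestPointsWindow {
    private final int capacity;
    private final Deque<double[]> points = new ArrayDeque<>();

    LatestPointsWindow(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds one (x, y) point, discarding the oldest point if the window is full.
    void add(double x, double y) {
        if (points.size() == capacity) {
            points.removeFirst();
        }
        points.addLast(new double[] {x, y});
    }

    int size() {
        return points.size();
    }

    // Oldest retained point as {x, y}, or null if the window is empty.
    double[] oldest() {
        return points.peekFirst();
    }

    // Newest retained point as {x, y}, or null if the window is empty.
    double[] newest() {
        return points.peekLast();
    }
}
```

Anything that falls out of this window is gone, which is exactly why zooming back into older data needs some other store.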

Related

vector tiles map viewer for own data and with interaction

There are some solutions for rendering vector tiles client-side in the web browser, but I can't find one that meets my expectations.
I want to display a huge amount of data (points, polygons) in a map viewer. I need vector data because of dynamic styling and interaction with the features. It's too much to load everything into Google Maps, and from my perspective vector tiles are the right approach, because only the necessary, aggregated data for the current view is loaded.
So I don't need to style the basemap, as in the thousands of examples I found. I only want to load my data as a vector tile layer on top of a raster (Google satellite). But my features should be stylable, support normal events like click or mouseover, and store properties. And last but not least it should be really fast ;-)
Which viewer do I need? And what is the workflow to create and serve the data as vector tiles?
I have been working on a similar problem, strech. Technologies are evolving, but mapbox-gl.js is one viewer you can use. You might be able to use Mapzen's system as well, but I haven't tried it with large numbers of features, whereas I know Mapbox works better than Leaflet and OpenLayers for your scenario.

DC.js Crossfilter on "nested" dimensions

I'm quite confused and might need help just formulating the question, so please give good comments...
I'm trying to crossfilter some data where each data point has its own sub-dataset that I want to chart and filter on as well. Each point represents a geographic region, and associated with each point is a time series which measures a certain metric over time.
Here's what I've got so far: http://michaeldougherty.info/dcjs/
The top bar chart shows a particular value for 10 regions, and the choropleth is linked with the same data. Now, below that are two composite line charts. Each line corresponds to a region -- there are 10 lines in each graph, and each graph is measuring a different metric over time. I would like the lines to be filtered as well, so if one bar is selected, only one line will show on the line chart.
Moreover, I want to be able to filter by time on the line charts (through brushing) in addition to some other filter, so I can make queries like "filter out all regions whose line value between 9 AM and 5 PM is less than 20,000", which would also update the bar and choropleth charts.
This is where I'm lost. I'm considering scrapping DC.js for this and using crossfilter and d3.js directly because it seems so complicated, but I would love it if I'm missing something and DC.js can actually handle this. I'd also love some ideas on where to start implementing this in straight crossfilter, because I haven't fully wrapped my head around that yet either.
How does one deal with datasets within datasets?
Screenshot of the link above included for convenience:

"Live" graph d3.js with simulated data

I have created a simple line graph with data from a MySQL database, using PHP to return the data in JSON format.
https://gist.github.com/5fc4cd5f41a6ddf2df23
I would like to simulate "live" updating something similar to this example, but less complicated:
http://bl.ocks.org/2657838
I've been searching for examples of how to achieve this simply, as I am new to D3, to no avail.
I've looked at Mike Bostock's path transitions (http://bost.ocks.org/mike/path/), but I'm not sure how to implement this using JSON data.
Can anyone help with either an example or some direction on how I could accomplish this?
That kind of line transformation is tricky in SVG, because moving a large number of points just a little and re-rendering the complete line can hurt performance.
If interactivity with each data point is not paramount and the time series can grow to an arbitrary number of points, consider using Cubism. It is a library based on d3 but meant specifically for visualizing time-series data efficiently. To avoid re-rendering SVG, it draws the points on a canvas, which allows cheap pixel-by-pixel transitions as new data arrives.

Millions of Google Map Marker using MarkerClusterer, JSON/AJAX

I am developing a large geolocation website. There are over 2.5 million places to show on a Google Map, with markers and an info window (when a marker is clicked).
I am using MarkerClusterer to reduce the load of individual markers.
But I am afraid that so much data in the browser (JSON etc.) would really kill the page.
Any suggestions for loading JSON on demand, based on the map bounds, whenever the map is panned?
Any recommended resources would also be appreciated.
Have a look at Cluster; I think it may do what you want:
Only the markers currently visible actually get created.
If too many markers would be visible, they are grouped together into cluster markers.
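The two rules above can be sketched roughly as follows. This is not the Cluster library's code or API, just a simple grid-based illustration; GridClusterSketch, Marker, maxMarkers and gridSize are names invented for the example, and the viewport is assumed not to cross the antimeridian.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of viewport-limited grid clustering; not a real library API.
final class GridClusterSketch {
    // One output marker: a single point (count == 1) or a cluster centroid (count > 1).
    static final class Marker {
        final double lat;
        final double lng;
        final int count;

        Marker(double lat, double lng, int count) {
            this.lat = lat;
            this.lng = lng;
            this.count = count;
        }
    }

    // points: each entry is {lat, lng}. The viewport is [south, north] x [west, east].
    static List<Marker> visibleMarkers(List<double[]> points,
                                       double south, double west, double north, double east,
                                       int maxMarkers, int gridSize) {
        // Rule 1: only points inside the viewport are considered at all.
        List<double[]> visible = new ArrayList<>();
        for (double[] p : points) {
            if (p[0] >= south && p[0] <= north && p[1] >= west && p[1] <= east) {
                visible.add(p);
            }
        }
        List<Marker> out = new ArrayList<>();
        if (visible.size() <= maxMarkers) {
            for (double[] p : visible) {
                out.add(new Marker(p[0], p[1], 1));
            }
            return out;
        }
        // Rule 2: too many visible points, so bucket them into gridSize x gridSize cells.
        // Each cell accumulates {sum of lat, sum of lng, number of points}.
        Map<Integer, double[]> cells = new LinkedHashMap<>();
        for (double[] p : visible) {
            int row = Math.min(gridSize - 1, (int) ((p[0] - south) / (north - south) * gridSize));
            int col = Math.min(gridSize - 1, (int) ((p[1] - west) / (east - west) * gridSize));
            double[] acc = cells.computeIfAbsent(row * gridSize + col, k -> new double[3]);
            acc[0] += p[0];
            acc[1] += p[1];
            acc[2] += 1;
        }
        // One marker per non-empty cell, placed at the centroid of its points.
        for (double[] acc : cells.values()) {
            out.add(new Marker(acc[0] / acc[2], acc[1] / acc[2], (int) acc[2]));
        }
        return out;
    }
}
```

With 2.5 million places this would run on the server, so that only the resulting markers are sent to the browser.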
You can look into quadkeys. A quadkey is well suited to reducing the dimensionality of the problem and building clusters of points of interest. There are many different methods, such as the Z-order curve, the Hilbert curve, and the Peano curve. To narrow things down further, you can tie the clustering to the bounding box and zoom level of the Google map.
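As a rough sketch of the idea, here is a quadkey computed the way the Bing Maps tile system does it (a Z-order interleaving of the tile X and Y bits in Web Mercator). The class and method names are made up for the example; the answer above does not say which curve or scheme it has in mind, so treat this as one possible choice.

```java
// Hypothetical sketch: quadkeys following the Bing Maps tile system scheme.
final class QuadKeySketch {
    // Web Mercator is only defined up to about +/-85.05 degrees latitude.
    private static final double MAX_LAT = 85.05112878;

    // Interleaves the bits of tileX and tileY into a base-4 string of length zoom.
    static String tileToQuadKey(int tileX, int tileY, int zoom) {
        StringBuilder key = new StringBuilder();
        for (int i = zoom; i > 0; i--) {
            int digit = 0;
            int mask = 1 << (i - 1);
            if ((tileX & mask) != 0) {
                digit += 1;
            }
            if ((tileY & mask) != 0) {
                digit += 2;
            }
            key.append(digit);
        }
        return key.toString();
    }

    // Maps a latitude/longitude to the quadkey of the tile containing it at the given zoom.
    static String quadKey(double lat, double lng, int zoom) {
        double clampedLat = Math.max(-MAX_LAT, Math.min(MAX_LAT, lat));
        double x = (lng + 180.0) / 360.0;
        double sinLat = Math.sin(Math.toRadians(clampedLat));
        double y = 0.5 - Math.log((1 + sinLat) / (1 - sinLat)) / (4 * Math.PI);
        int n = 1 << zoom;
        int tileX = (int) Math.max(0, Math.min(n - 1, Math.floor(x * n)));
        int tileY = (int) Math.max(0, Math.min(n - 1, Math.floor(y * n)));
        return tileToQuadKey(tileX, tileY, zoom);
    }
}
```

The useful property is that the quadkey of a point at a low zoom is a prefix of its quadkey at a higher zoom, so grouping points by a key prefix gives clusters per tile, and a prefix search on an indexed string column finds everything inside a tile.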
There is a version of MarkerClusterer that works with v3 of the Google Maps API, but that isn't the issue here. The issue is that you'd still be handling the underlying data in the browser with JS (2.5 million places retrieved through JSON/AJAX). That is most likely too much, unless you're on a fast connection using the fastest computers with a lot of RAM.
For those contemplating this issue on their own sites, keep in mind that more and more mobile devices are accessing these sites, and the JavaScript on such devices just can't handle nearly as many points. My own site broke with the latest release of iOS6, and now I have to accommodate that by changing my JS to put less load on the system.
But to get back to the answer at hand, what you'll have to do is make a new AJAX call whenever the map bounds change, and if the zoom goes too far out, you'll have to limit the number retrieved and implement some way to show the user that not all results are shown. My site uses a limit of 250, if I recall correctly, and shows a bounding rectangle around the locations (along with MarkerClusterer to cluster them). Before populating with real data, I tested with a database of thousands and thousands of points, and this number seemed to be the best tradeoff between performance and information. (But that was before I went mobile and before v3 of the API.) v3 is supposed to be more streamlined, but mobile devices are limited, so you'll have to test.
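The server side of that AJAX call might look something like the sketch below. It is not my site's code: BoundsQuerySketch, Result and the parameter names are invented, and the linear scan stands in for what would really be a database query with a spatial index. The point is the shape of the answer: at most limit points, plus a flag telling the client that results were cut off.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a bounds-limited, capped query; a real backend would
// push the bounding-box filter and the limit into the database.
final class BoundsQuerySketch {
    static final class Result {
        final List<double[]> points;   // each entry is {lat, lng}
        final boolean truncated;       // true if more points matched than were returned

        Result(List<double[]> points, boolean truncated) {
            this.points = points;
            this.truncated = truncated;
        }
    }

    // Returns at most limit points inside [south, north] x [west, east].
    // Assumes the bounds do not cross the antimeridian.
    static Result query(List<double[]> all,
                        double south, double west, double north, double east,
                        int limit) {
        List<double[]> hits = new ArrayList<>();
        boolean truncated = false;
        for (double[] p : all) {
            boolean inside = p[0] >= south && p[0] <= north && p[1] >= west && p[1] <= east;
            if (!inside) {
                continue;
            }
            if (hits.size() == limit) {
                // One more match than we are allowed to return: stop and flag it.
                truncated = true;
                break;
            }
            hits.add(p);
        }
        return new Result(hits, truncated);
    }
}
```

The client can use the truncated flag to show a "not all results are shown, zoom in" hint.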
I am using the MarkerClustererPlus library with a marker cap of 200 and a default zoom level of 8. On zoom change or drag, another 200 markers are loaded onto the map.
If you zoom out, the markers are clustered, and vice versa.

Google Maps API v3, lots of markers, clustering and performance

I have about 5000 markers I need to render on a Google Map. I'm currently using the API (v3) and there are performance issues on slower machines, especially in IE. I have done the following already to help speed things up:
Used a simple marker class that extends OverlayView and renders a single DIV element per marker
Implemented the MarkerClusterer library to cluster the markers at different levels
Render GIFs for IE, instead of alpha PNGs
Are there faster clustering classes? Any other tips? I'm trying to avoid server-side clustering unless this is the only option left to squeeze performance out of the system.
Thanks
I used a method that loads all the markers onto the page, and then listens for the map to finish panning.
When the map has finished panning, I first check the zoom level - if it's too high I don't display anything. If it's at an acceptable level, I then loop through the markers I have stored and see if they fall into the bounding box of the map. If they do, they get added. A second loop then removes any that have moved out of the view.
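The logic is roughly the following, written here as a Java sketch rather than the JavaScript I actually run in the page. The names are invented for the example, and I am assuming the Google Maps convention that a larger zoom number means zoomed in further, so nothing is shown below minZoom. In the browser, adding and removing would be marker.setMap(map) and marker.setMap(null).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the "add what is in view, remove what left the view" loops.
final class ViewportDiffSketch {
    // Indices (into the stored marker list) of the markers currently on the map.
    private final Set<Integer> shown = new HashSet<>();

    Set<Integer> shown() {
        return shown;
    }

    private static boolean inside(double[] m, double south, double west, double north, double east) {
        return m[0] >= south && m[0] <= north && m[1] >= west && m[1] <= east;
    }

    // markers: all stored markers, each {lat, lng}. Call when panning has finished.
    void onPanFinished(List<double[]> markers,
                       double south, double west, double north, double east,
                       int zoom, int minZoom) {
        // Zoomed out too far: display nothing.
        if (zoom < minZoom) {
            shown.clear();
            return;
        }
        // First loop: add every stored marker that falls inside the bounding box.
        for (int i = 0; i < markers.size(); i++) {
            if (inside(markers.get(i), south, west, north, east)) {
                shown.add(i);
            }
        }
        // Second loop: remove markers that have moved out of the view.
        shown.removeIf(i -> !inside(markers.get(i), south, west, north, east));
    }
}
```

Both loops are linear in the number of stored markers, which is why it slows down in dense areas.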
The highest number I've used is about 30,000 markers with this method, although I have it set up so you must be zoomed in quite far to see them. In areas with a higher concentration of markers it's obviously a little slower, but it's usable.
The solution mentioned above works for a much higher number of markers. We use it for millions of GPS points (including polygons etc.) on the backend. The only difficulty is the logic behind it, such as proper caching of spatial queries, or fetching new results only if the user moves the map by more than X meters. It takes a lot of work to get right, but for viewing a really high number of points there is nothing better.
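The "only fetch if the map moved more than X meters" part can be sketched like this. It is an illustration, not our production code: RefetchGateSketch and its methods are invented names, the distance is measured between map centers with the haversine formula, and the Earth is treated as a sphere of radius 6,371 km.

```java
// Hypothetical sketch: decide whether a map move is large enough to justify a new query.
final class RefetchGateSketch {
    private static final double EARTH_RADIUS_M = 6_371_000.0;

    private final double thresholdMeters;
    private double lastLat;
    private double lastLng;
    private boolean hasLast = false;

    RefetchGateSketch(double thresholdMeters) {
        this.thresholdMeters = thresholdMeters;
    }

    // Great-circle distance in meters between two lat/lng points (haversine formula).
    static double haversineMeters(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double sinLat = Math.sin(dLat / 2);
        double sinLng = Math.sin(dLng / 2);
        double a = sinLat * sinLat
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * sinLng * sinLng;
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // Returns true if the caller should fetch new results for this map center.
    // The reference point only moves when a fetch is actually triggered.
    boolean shouldFetch(double centerLat, double centerLng) {
        if (hasLast
                && haversineMeters(lastLat, lastLng, centerLat, centerLng) <= thresholdMeters) {
            return false;
        }
        lastLat = centerLat;
        lastLng = centerLng;
        hasLast = true;
        return true;
    }
}
```

A zoom change would normally force a fetch regardless of distance; that check is left out here to keep the sketch short.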
Marker clusterers usually work on the browser side, so there is still a need to load all points at once, and this makes that method unusable for large numbers.
You can see it live at http://www.tixik.com/london-2354567.htm (just click "plan a trip" and start planning). Try moving the map or zooming in or out, and the points will show or hide as you zoom or drag.

Resources