Performance Issue with Doctrine, PostGIS and MapFish - performance

I am developing a WebGIS application using Symfony with the MapFish plugin http://www.symfony-project.org/plugins/sfMapFishPlugin
I use the GeoJSON produced by MapFish to render layers through OpenLayers, in a vector layer of course.
When I show layers with up to 3k features, everything works fine. When I try layers with 10k features or more, the application crashes. I don't know the exact threshold, because my layers have either 2-3k features or 10-13k features.
I think the problem is related to Doctrine, because the last entry in the log is something like:
Sep 02 13:22:40 symfony [info] {Doctrine_Connection_Statement} execute :
and then the query to fetch the geographical records.
I said I think the problem is the number of features, so I used OpenLayers.Strategy.BBOX() to decrease the number of features fetched and shown. The result is the same: the app seems stuck while executing the query.
If I add a LIMIT to the query used to fetch the features' GeoJSON, the application works. So I don't think this is related to the MapFish plugin, but rather to Doctrine.
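For reference, a BBOX-strategy vector layer with a server-side limit is wired up roughly like this in OpenLayers 2.x. This is a minimal TypeScript-flavoured sketch, not the actual application code; the endpoint URL and the name of the limit parameter are placeholders (use whatever your MapFish controller expects):

```typescript
declare const OpenLayers: any; // OpenLayers 2.x loaded globally via a <script> tag
declare const map: any;        // an existing OpenLayers.Map instance

const vectors = new OpenLayers.Layer.Vector("features", {
  strategies: [new OpenLayers.Strategy.BBOX()], // only fetch what the viewport needs
  protocol: new OpenLayers.Protocol.HTTP({
    url: "/index.php/features",                 // placeholder MapFish GeoJSON endpoint
    params: { limit: 1000 },                    // cap the result set on the server side
    format: new OpenLayers.Format.GeoJSON(),
  }),
});
map.addLayer(vectors);
```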
Can anyone shed some light on this?
Thanks!

Even if it's theoretically possible, it's a bad idea to try to show so many vector features on a map.
You'd better change the way features are displayed (e.g. raster at low zoom levels, fetch feature details on click…).
Even if your service answers in a reasonable time, your browser will be stuck, or at least perform very badly…
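A hedged sketch of that "raster plus query on click" approach with OpenLayers 2.x follows; the WMS URL and layer names are placeholders, and the point is simply that the map server draws the features while the client only fetches attributes for the clicked point:

```typescript
declare const OpenLayers: any; // OpenLayers 2.x loaded globally
declare const map: any;        // existing OpenLayers.Map

// Raster rendering: let the map server (GeoServer, MapServer, ...) draw the features.
const raster = new OpenLayers.Layer.WMS("features (raster)", "/geoserver/wms", {
  layers: "myws:features", // placeholder layer name
  transparent: true,
});
map.addLayer(raster);

// Fetch attributes only for the feature the user clicks on.
const info = new OpenLayers.Control.WMSGetFeatureInfo({
  url: "/geoserver/wms",
  layers: [raster],
  eventListeners: {
    getfeatureinfo: (e: any) => {
      // e.text holds the GetFeatureInfo response for the clicked location
      console.log(e.text);
    },
  },
});
map.addControl(info);
info.activate();
```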
I'm the author of sfMapFishPlugin and I have never tried to query so many features, let alone show them on an OL map.
Check out the OpenLayers FAQ on this subject: http://trac.osgeo.org/openlayers/wiki/FrequentlyAskedQuestions#WhatisthemaximumnumberofCoordinatesFeaturesIcandrawwithaVectorlayer . It's a bit outdated given recent browser improvements, but 10k vector features on a map is still not reasonable.
HTH,

Related

Apollo InMemoryCache Performance Strategies for Large Data Set (React)

The initial data set received from an Apollo Client GraphQL query in an application I am trying to tune is currently very large. By "large" I mean that the data normalizes to about 7,000 entries under the "data" key in the cache. The payload is about 1.6 MB; if I save the cache's data entry, the normalized form is about 3 MB. I'm not a fan of how the initial query works, and I am redesigning the application to use cursors and filtering on the graph rather than having the client fetch such a large amount of data and filter it itself. The current implementation cannot scale, because larger data sets will be returned when this software is installed in other locations. But I am looking for a short-term solution to make this cache build faster while I undertake a very large redesign task.
UPDATE July 25, 2018: The cursor approach doesn't work, as cache write performance degrades as more entries are added with each page/cursor of data fetched.
The real issue is that IE 11, which I have to support due to the industry's (healthcare) reliance on this browser, is extremely slow. It's very difficult to measure, but it's about 8-10x slower than Chrome in the area of the Apollo cache and React integration code. Chrome can take 1-2 seconds to build the cache on these slower virtual desktops, while IE will take 10-20 seconds.
So, my question is: are there any performance tweaks to help the cache build faster? I've attached a screenshot to show where the bottleneck lies. It's the same in Chrome as in IE, just about an order of magnitude slower in IE. I'm not sure if it's an IE shortcoming or some awful polyfill issue. The screenshot shows the hot spots that show up in the performance results. Yes, this screenshot is of the development version of React, but we aren't seeing any real noticeable performance increase in production. The screenshot is really just a call to the graph and the simplest HTML table being rendered, with about 260 rows; the render phase is negligible. It seems there is an awful lot of queued-up 'work' during this phase. Perhaps there is a way to suspend this? Chrome's profiler shows the same hot spot, it's just not as slow.
Anyway, any advice is greatly appreciated.
The screenshot columns are: function | invocation count | time (seconds)
Our team is facing similar issues. Our current approach is to "denormalize" part of the server schema into a String type that holds a JSON string. On the client side, we parse the JSON string returned by the Apollo client.
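A minimal sketch of that "JSON blob" workaround, using current @apollo/client imports; the report and payload names are hypothetical. Because the server returns one String field, the cache stores a single scalar instead of normalizing thousands of objects:

```typescript
import { gql, useQuery } from "@apollo/client";

const REPORT = gql`
  query Report {
    report {
      id
      payload # JSON-encoded string holding the large collection
    }
  }
`;

export function useReportRows() {
  const { data, loading } = useQuery(REPORT);
  // Parse the blob on the client; the cache only ever sees one string.
  const rows: unknown[] = data ? JSON.parse(data.report.payload) : [];
  return { rows, loading };
}
```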
Apollo 3.0 will have an option to disable cache normalization for a type:
https://www.apollographql.com/docs/react/v3.0-beta/caching/cache-configuration/#disabling-normalization
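Per those docs, disabling normalization is done through type policies; a minimal sketch, where LargeItem is a hypothetical type name from your schema:

```typescript
import { InMemoryCache } from "@apollo/client";

// Objects of this type are embedded in their parent instead of becoming
// thousands of separate normalized cache entries.
const cache = new InMemoryCache({
  typePolicies: {
    LargeItem: {
      keyFields: false, // disables normalization for this type
    },
  },
});
```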
I ran into a similar problem (on Apollo Client 3.3.6). Eventually, it became clear that a collection this large was not suitable for the Apollo client cache. Rather than saddle yourself with seconds-long processing (probably right when you load your app), I'd strongly recommend managing your larger datasets yourself outside the cache: native JS map/filter/etc. is much faster and probably a better fit, depending on why you need the data. Just pass the option fetchPolicy: 'no-cache' and watch your app speed up significantly. Any large fetch (thousands of typed results) is probably better off with this treatment.
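For illustration, a sketch of that option on a query hook; the query itself is hypothetical, the relevant part is fetchPolicy:

```typescript
import { gql, useQuery } from "@apollo/client";

const BIG_LIST = gql`
  query BigList {
    items {
      id
      name
    }
  }
`;

export function useBigList() {
  // Bypass the normalized cache entirely for this large result set.
  const { data, loading, error } = useQuery(BIG_LIST, { fetchPolicy: "no-cache" });
  // Work on the result with plain array methods instead of cache reads.
  const names: string[] = data?.items.map((i: { name: string }) => i.name) ?? [];
  return { names, loading, error };
}
```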

Dynamically loaded Markers: DDOS prevention

My app shows a map where locations (or markers) are dynamically loaded via an Ajax (and database) request after every change of the map bounds.
I'm convinced that this solution is not scalable: at the moment, the Europe area shows a total of 10 markers.
If the database grows and I display, for instance, 1,000 locations, that means 1,000 rows would be returned to the user.
This is not a JS / UI issue, since I use the MarkerCluster plugin and I avoid redrawing already-loaded locations' markers.
I made some tweaks :
- Delay the Ajax request using the gmaps idle event (see the sketch below)
- Increase the minimal zoom level, so the entire world can't be displayed.
But this is not enough.
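A rough TypeScript sketch of this kind of bounds-driven loading with the Google Maps JS API; the endpoint, parameter names, and limit are placeholders, not the poster's actual backend:

```typescript
declare const map: any; // an existing google.maps.Map instance

let lastBbox = "";

map.addListener("idle", async () => {
  const bounds = map.getBounds();
  if (!bounds) return;

  const bbox = bounds.toUrlValue(); // "swLat,swLng,neLat,neLng"
  if (bbox === lastBbox) return;    // viewport unchanged since the last fetch
  lastBbox = bbox;

  // Ask the server only for markers inside the viewport, capped to a sane number.
  const res = await fetch(`/markers?bbox=${encodeURIComponent(bbox)}&limit=500`);
  const markers: Array<{ id: number; lat: number; lng: number }> = await res.json();

  // Add only markers that are not already on the map; MarkerCluster handles the rest.
});
```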
There are lots of ways to approach this, but I will just mention the two I think are most appropriate given your question.
First, really control from your web app what information is asked for and when. You could write all of this yourself in JavaScript and implement caching techniques, etc. There are a number of libraries out there that do most of this work for you, though.
I would recommend one of the following:
OpenGeo SDK
OpenLayers
GeoExt
Leaflet
All of these have ways of controlling local caching, when to get the data, and what data is gathered from the server. Most of them can also be extended to add any functionality that is missing. The top two also support Google Maps (as well as a number of other providers).
If you need even more control over your data locally, you could look at implementing something like PouchDB. I think this is more suited to mobile applications or cases where the network connection is either really slow or intermittent.
This sort of solution should easily handle thousands to tens of thousands of features with hundreds of users.
If you are really going to scale up to hundreds of thousands or millions of features with hundreds to thousands of users, then I would suggest adding a tile server to the solution above. The tile server sits between your web application and your database. Most of them have lots of caching settings and optimizations for dealing with large datasets and pushing them out to a client. Because they push out tiles rather than features, the data output remains reasonably constant even as the number of features grows. The OpenGeo SDK and OpenLayers libraries I mentioned above can work really well with any of the following tile servers:
GeoServer
Mapserver
MapGuide
Quantum GIS Server
If you are reluctant to do any coding, there are some offerings that work out of the box for enterprise environments. They are all expensive, and from your question I think they are probably not what you are looking for.

What's the optimal amount of queries an ExpressionEngine page should load?

I saw #parscale tweet: How many queries are you happy with for a home page? When do you say this is Optimized?
I saw responses that < 50 is good, 30 or less is best, and 100+ is the danger zone. Is there really any proper number? And if you do have > 50 queries running on your pages, what are some ways to bring that down?
I generally have sites that run the gamut, some under 50 queries and some more. Though the "more" don't seem to be too slow, I'm always interested in making them faster. How?
How to reduce queries will vary from site to site and template to template, but there have been a few articles on EE optimisation and performance:
http://expressionengine.com/wiki/Reduce_Queries/
http://expressionengine.com/blog/entry/troubleshooting_site_performance_issues/
http://www.netmagazine.com/tutorials/optimise-your-expressionengine-site
http://www.leezilla.net/post/12377053779/ab-seeing-your-sites-performance
http://eeinsider.com/articles/using-cache-wisely-with-expressionengine/
But if you've done all that and still need to speed things up, then your next step is to look at add-ons like CE Cache.
The thing to remember is that not all queries are created equal. You can have 1,000 queries that do very little to impact performance, or a single query that slows everything way down.
In EE it's actually better to look at the template debug output and identify the key slowdown spots in the template build than to focus only on the query count.
As others have pointed out, products like CE Cache, Solspace's Template Morsels, or even adding a Varnish caching server in front of an intensive EE website can do wonders. Given the added work required to get a Varnish setup fully working in front of EE, though, I would currently stick to the other solutions first.
There is no magic query number. In my opinion, your server environment dictates what can be supported. The more resources you have, the more complex your code can be.
With that said, there are lots of options you can use if issues do arise on an EE website. The links in the answer above give you a solid list, but here are some first things to check:
Remove search:field_name="" parameters
Reduce use of channel tags, combine if you can
Add disable="" parameter to channel tabs to disable what you don't need
Reduce use of embeds
Turn off all EE tracking code
Stop using advanced conditionals if you have a channel tag inside
Following on from Nevin's point, I find that JB Graphite is a huge help; it turns the debug output into a pretty graph, so you can easily spot bottleneck queries.
http://devot-ee.com/add-ons/jb-graphite
I'll expand on MediaGirl's point number 6: you can often greatly simplify conditionals by using Croxton's Ifelse and/or Switchee add-ons. Definitely worth a look.
I used CE Cache on a really intensive build and it reduced page load from 6 seconds to 0.7 seconds. Awesome add-on, with incredible documentation and the best support you can get anywhere.

What makes a JavaFx 1.2 Scene Graph Refresh?

My first question =). I'm writing a video game with a user interface written in JavaFX. The behavior is correct, but I'm having performance problems, and I'm trying to figure out what is queuing up the refreshes that are slowing down the app.
I've got a relatively complex scene graph that represents a hexagonal map. It scales so that you could have 100 or 1,000 hexagons in the map. As the number of hexagons grows, the responsiveness of the GUI decreases. I've used YourKit (a Java profiler) to trace these delays to major redraw operations.
I've spent most of the night trying to figure out how to do two things and understand one thing:
1) Cause a CustomNode to print something to the console whenever it is painted. This would help me identify exactly when these paints are being queued.
2) Identify when a CustomNode is put on the repaint queue.
If I can answer 1 and 2, I might be able to figure out what is binding all these different nodes together. Is it possible that JavaFX only works through global refreshes (doubtful)?
JavaFX Script is a powerful UI language, but certain practices will kill performance. Best performance generally boils down to:
keeping the Scene Graph small
keeping use of bind to a minimum (you can look at using triggers instead, which are more performant)
This blog post by Jim Weaver expands these points.
I'm not sure of the specific answers to your questions. If you examine the 1.2.1 docs, you might be able to find a point in the Node documentation that you can override and add println statements to, but I'm not sure it can be done. You could try posting on forums.sun.com.
This is a partial post. I expect to expand it after I've done some more work; I wanted to put in what I've done to date so I don't forget.
I realized that I'd need to get my IDE running with a full complement of the JavaFX 1.2 source. This would allow me to put breakpoints into the core code to figure out what is going on. I decided to do this configuration in Eclipse for remote debugging. I'm developing my FX in NetBeans but am more comfortable with Eclipse, so that's what I want to debug in if I can.
To get this info into Eclipse, I first made a project with the Java source that my code uses. I then added external Jars to the project. On my Mac, the Jars I linked to were in /Library/Frameworks/JavaFX.framework/Versions/1.2
Then I went searching for the source to link to these JARs. Unfortunately, it's not all available; I could find some of it in /Library/Frameworks/JavaFX.framework/Versions/1.2/src.zip.
I did some research and found that the only option left was to install a Java decompiler. I used this one because it was easy to install into Eclipse 3.4: http://java.decompiler.free.fr/
This is where I am now. I can navigate into the Core FX classes and believe I'll be able to set break points and begin real analysis. I'll update this post as I progress.
I found a helpful benchmarking tool:
If you run with the JVM arg:
-Djava.util.logging.config.file=/path/to/logging/file/logging.properties
And you put the following properties into the file referenced by that argument:
handlers = java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level = ALL
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
com.sun.scenario.animation.fps.level = ALL
You'll get console output that includes your frame count per second. For FX 1.2 it wasn't working for me, but it appears to work for 1.2.1 (which was released Sept. 9, 2009). I don't have a NetBeans that runs 1.2.1 yet.
You may want to read this article.
http://fxexperience.com/2009/09/performance-improving-insertion-times/
Basically, insertions into the scene graph are slow, and you can benefit by batching up inserts.

Generating UI from DB - the good, the bad and the ugly?

I've read a statement somewhere that generating a UI automatically from the DB layout (or business objects, or whatever other business layer) is a bad idea. I can also imagine a few good challenges that one would have to face in order to make something like this work.
However, I have not seen (nor could I find) any examples of people attempting it. So I'm wondering: is it really that bad? It's definitely not easy, but can it be done with any measure of success? What are the major obstacles? It would be great to see some examples of successes and failures.
To clarify, by "generating UI automatically" I mean that all forms with all their controls are generated completely automatically (at runtime or compile time), perhaps based on some hints in metadata about how the data should be represented. This is in contrast to designing forms by hand (as most people do).
Added: Found this somewhat related question
Added 2: OK, it seems that one way this can get pretty fair results is if enough presentation-related metadata is available. For this approach, how much would be "enough", and would it be any less work than designing the forms manually? Does it also provide greater flexibility for future changes?
We had a project which generated the database tables/stored procedures as well as the UI from business classes. It was done in .NET, and we used a lot of custom attributes on the classes and properties to make it behave how we wanted. It worked great, and if you manage to follow your design you can create customizations of your software really easily. We also had a way of putting in "custom" user controls for some very exceptional cases.
All in all it worked out well for us. Unfortunately it is a commercial banking product and there is no publicly available source.
It's OK for something tiny where all you need is a utilitarian way to get the data in.
For anything resembling a real application, though, it's a terrible idea. What makes for a good UI is the humanisation factor, the bits you tweak to ensure that the machine reacts well to a person's touch.
You just can't get that when your interface is generated mechanically… well, maybe with something approaching AI. :)
Edit, to clarify: a UI generated from code/DB is fine as a starting point; it's just a rubbish end point.
Hey, this is not difficult to achieve at all, and it's not a bad idea at all. It all depends on your project's needs. A lot of software products (mind you, not projects but products) depend on this model, so they don't have to rewrite their code / UI logic for different client needs. Clients can customize their UI the way they want using a designer form in the admin system.
I have used XML for preserving metadata for this sort of thing. Some of the attributes I saved for every field were:
friendlyname (label caption)
haspredefinedvalues (yes for drop-down list / multi check box list)
multiselect (if yes then check box list, if no then drop-down list)
datatype
maxlength
required
minvalue
maxvalue
regularexpression
enabled (to show or not to show)
sortkey (order on the web form)
Regarding positioning, I did not care much and simply generated table/tr/td tags one below the other. However, if you want to implement this as well, you can have one more attribute called CssClass where you define UI-specific properties (look and feel, positioning, etc.). A minimal rendering sketch follows at the end of this answer.
UPDATE: also note that a lot of e-commerce products follow this kind of dynamic UI for entering product information, as their clients can be selling everything under the sun from furniture to sex toys ;-) So instead of rewriting their code for every different industry, they simply let their clients enter metadata for product attributes via an admin form :-)
I would also recommend you look at the Entity-attribute-value model; it has its own pros and cons, but I feel it can work quite well with your requirements.
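A minimal TypeScript sketch of rendering a form from field metadata like the list above; the metadata shape and field names are illustrative, not taken from the original system, and the output is just plain HTML strings:

```typescript
interface FieldMeta {
  name: string;
  friendlyName: string;            // label caption
  dataType: "text" | "number" | "date";
  predefinedValues?: string[];     // present => render a list control
  multiSelect?: boolean;           // multi-select list vs. single drop-down
  required?: boolean;
  maxLength?: number;
  enabled?: boolean;               // to show or not to show
  sortKey: number;                 // order on the web form
  cssClass?: string;               // look-and-feel / positioning hook
}

function renderField(f: FieldMeta): string {
  const label = `<label>${f.friendlyName}</label>`;
  if (f.predefinedValues) {
    const options = f.predefinedValues.map(v => `<option>${v}</option>`).join("");
    return `${label}<select name="${f.name}"${f.multiSelect ? " multiple" : ""}>${options}</select>`;
  }
  const type = f.dataType === "number" ? "number" : f.dataType === "date" ? "date" : "text";
  return (
    `${label}<input name="${f.name}" type="${type}"` +
    `${f.required ? " required" : ""}` +
    `${f.maxLength ? ` maxlength="${f.maxLength}"` : ""}` +
    ` class="${f.cssClass ?? ""}">`
  );
}

export function renderForm(fields: FieldMeta[]): string {
  return fields
    .filter(f => f.enabled !== false)          // skip disabled fields
    .sort((a, b) => a.sortKey - b.sortKey)     // sortkey drives the field order
    .map(renderField)
    .join("\n");
}
```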
In my opinion, there are some things you should think about:
Does the customer need a function to customize his UI?
Are there a lot of different attributes or elements?
Is the effort of creating such a "rendering engine" worth it?
Okay, I think it's pretty obvious why you should think about these. It really depends on your project whether that kind of model makes sense…
If you want to create a lot of forms that can be customized at runtime, then this model could be pretty useful. Also, if you need to build a lot of smaller tools and you use this as some kind of "engine", then the effort could be worth it because you can save a lot of time.
With that kind of "rendering engine" you could automatically add error reporting, check the values, or add other things that always follow the same pattern. But if you have too many of these things, elements, or attributes, then performance can go down rapidly.
Another thing that becomes interesting in bigger projects is that changes that have to occur in every form only have to be made in the engine, not in each form. This could save A LOT of time if there is a bug in the finished application.
In our company we use a similar model for an interface generator between cash software (right now I can't remember the right word for it…) and our application, except that it doesn't create a UI but an output file for one of the applications.
We use XML to define the structure and how the values need to be converted, and so on.
I would say that in most cases the data is not suitable for UI generation. That's why you almost always put a layer of logic in between to interpret the DB information for the user. Another thing is that when you generate the UI from the DB, you end up exposing the inner workings of the system, something that you normally don't want to do.
But it depends on where the DB came from. If it was created to exactly reflect the users' goals for the system, if the users' mental model of what the application should help them with is stored in the DB, then it might just work. But then you have to start at the users' end. If not, I suggest you don't go that way.
Can you look at your problem from an application architecture perspective? I see you as another database terrorist: trying to solve everything by writing stored procedures. Why have a UI at all? Try doing it in a DB script. With such an approach, what kind of composite system will you end up with? When a system serves different businesses, try modularization, selectively discovered components, and restricted sharing of references. The UI should be replaceable and independent from the business layer. When you store so much in the DB, there is a hard dependency on the UI and the system becomes a monolith. How do you implement the MVVM pattern in a scenario where the UI is generated? Designers like Blend contain lots of features that cannot be replaced by even the most futuristic UI generator, unless your development platform is Notepad only.
There is a hybrid approach where forms and the like are described in a database to ensure consistency server side, and then compiled on deploy to ensure efficiency client side.
A real-life example is the enterprise software MS Dynamics AX.
It has a 'Data' database and a 'Model' database.
The 'Model' database stores forms, classes, jobs and every artefact the application needs to run.
Deploying a new software structure used to mean dumping the model database and initiating a CIL compile (CIL stands for Common Intermediate Language, used by Microsoft in .NET).
This approach is suitable for enterprise-wide software and can handle large customizations. But keep in mind that it sets a framework that should be well understood by whoever is going to maintain and customize the application later.
I did this (in PHP / MySQL) to automatically generate sections of a CMS that I was building for a client. It worked OK, but my main problem was that the code that generated the forms became very opaque and difficult to understand, and therefore difficult to reuse and modify, so I did not reuse it.
Note that the tables followed strict conventions, such as naming, which made it possible for the UI to expect particular columns and infer information from the naming of the columns and tables. There is a need for meta information to help the UI display the data.
Generally it can work; however, if your UI just mirrors the database, there is probably lots of room to improve. A good UI should do much more than mirror a database; it should be built around human interaction patterns and preferences, not around the database structure.
So basically, if you want to be cheap and do a quick-and-dirty interface which mirrors your DB, then go for it. The main challenge will be finding good-quality code that can do this, or writing it yourself.
From my perspective, it was always a problem to change edit forms when a very simple change was needed in a table's structure.
I always had the feeling we had to spend too much time rewriting the CRUD forms instead of developing the useful stuff, like processing / reporting / analyzing data, raising alerts for decisions, etc.
For this reason, I made a code generator a long time ago. It became easier to re-generate the forms, with one simple restriction: keep the CSS class names. As simple as that!
The UI was always based on very "standard" code, controlled by custom CSS.
Whenever I needed to change the database structure, and so update an edit form, I had to re-generate the code and redeploy.
One disadvantage I noticed concerned changes (customizations, improvements, etc.) made to previously generated code, which are lost when you re-generate it.
But anyway, the advantage of having a lot of the work done by the code generator was great!
I initially did it for early-2000s Microsoft ASP (Active Server Pages) & Microsoft SQL Server, so when that technology was replaced by .NET, my code generator became obsolete.
I made something similar for PHP but I never finished it...
Anyway, from small experiments I found that generating code ON THE FLY can be way more helpful (and this approach does not exclude SAVING the generated code): no worries about changing the database, etc.
So, the next step was to create something that I am very proud to show here, and I think it is one nice resolution for the issue raised in this thread.
I would start with applicable use cases: https://data-seed.tech/usecases.php.
I have worked to add details on how to use it, but if something is still missing please let me know here!
You can change the database structure, and with no line of code you can start editing data; on top of that, an API for CRUD operations is available.
I am still a fan of the "code-generator" approach, and I think it is just a flavor of the XML/XSLT approach I used for DATA-SEED. I plan to add code-generator functionalities.
