I was happily using neo4j 1.8.1 community edition for a while on my system with the following configuration.
System Specs:
OS: 32-bit Ubuntu 12.04.3 LTS. Kernel version 3.2.0-52-generic-pae #78-Ubuntu
Memory: 4GB
Swap: 8GB (swapfile - not a partition)
Processor: Intel® Core™ i5-2430M CPU @ 2.40GHz - Quad Core
Harddisk: 500GB Seagate ATA ST9500420AS. Dual boot - Ubuntu uses 100GB and the rest by the almighty Windows 7.
When I switched to neo4j 2.0.1 enterprise edition, my application's response time became 4x slower. So, as advised in http://docs.neo4j.org/chunked/stable/embedded-configuration.html, I started tuning my filesystem, virtual memory, I/O scheduler, and JVM configuration.
Performance Tuning
Started the Neo4j server with the highest scheduling priority (nice value = -20)
Set vm.dirty_background_ratio=50 and vm.dirty_ratio=80 in /etc/sysctl.conf to reduce frequent flushing of dirty memory pages to disk.
Increased the maximum number of open files from 1024 to 40,000, as suggested by the Neo4j startup warning.
Set noatime,nodiratime for the neo4j ext4 partition in /etc/fstab so that inodes don't get updated every time there is a file/directory access.
Changed the I/O scheduler from "cfq" to "noop", as described in
http://www.cyberciti.biz/faq/linux-change-io-scheduler-for-harddisk/
JVM parameters: in short, the max heap size is 1GB and the neostore memory-mapped file sizes total 425 MB.
Xms and Xmx to 1GB.
GC to Concurrent-Mark-Sweep.
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M
Sadly, this didn't make any difference. To get a better picture, I wrote a simple script that creates N nodes and M random relationships among those nodes.
Neo4j 1.8.1 community edition with Oracle Java version "1.6.0_45":
new-sys-admin@ThinkPad:~/temp$ php perftest.php
Creating 1000 Nodes with index
Time taken : 67.02s
Creating 4000 relationships
Time taken : 201.27s
Neo4j 2.0.1 enterprise edition with Oracle Java version "1.7.0_51":
new-sys-admin@ThinkPad:~/temp$ php perftest.php
Creating 1000 Nodes with index
Time taken : 75.14s
Creating 4000 relationships
Time taken : 206.52s
The above results are after 2 warm-up runs; the 2.0.1 numbers still seem slower than 1.8.1's. Any suggestions on adjusting the relevant configuration to boost Neo4j 2.0.1 performance would be highly appreciated.
EDIT 1
All queries are issued using Gremlin via the Everyman Neo4j wrapper.
http://grokbase.com/p/gg/neo4j/143w1fen8c/gremlin-plugin-extremely-slow-on-neo4j-2-0-1
In the meantime, I moved to neo4j-enterprise-edition-1.9.6 (the most recent stable release before 2.0.1) and things were back to normal.
From the fact that you're using PHP, and seeing that creating just 1,000 nodes takes 67 seconds, I assume you're using the regular REST API (e.g. POST /db/data/node). If this is correct, you may be right that 2.0.1 is a few percent slower than 1.8 for these CRUD operations. In 2.0 we focused on optimizing Cypher and the new transactional endpoint.
As such, for best performance, I'd suggest these things:
Use the new transactional endpoint, /db/data/transaction
Use Cypher, and use it to send as much work as possible in "one go" over to the server
When possible, send multiple Cypher queries in the same HTTP request; the transactional endpoint supports this as well (see the sketch below).
Make sure you re-use TCP connections if you can. I'm not sure exactly how this works in PHP, but sending a "Connection: Keep-alive" header and re-using the same TCP connection saves significant overhead, since you don't have to re-establish connections over and over.
Creating a thousand nodes in one Cypher query shouldn't take more than a few milliseconds. In terms of how many Cypher statements you can send per second: on my laptop, from Python (using https://github.com/jakewins/neo4jdb-python), I get about 10,000 Cypher statements per second in a concurrent setup (10 clients).
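To make the batching advice concrete, here is a minimal sketch (the props parameter name is mine, and this is untested): POST a single statement to the transactional endpoint at /db/data/transaction/commit and bind props to a list of maps, one map per node. In 2.0-era Cypher, a parameterized CREATE like this makes one node per map in a single round trip:

// One statement POSTed to /db/data/transaction/commit.
// {props} is bound to a list of maps, e.g. [{"name": "a"}, {"name": "b"}];
// one node is created per map.
CREATE (n {props})

Batching, say, 500-1000 nodes per request this way turns thousands of HTTP round trips into a handful, which is usually where the time goes.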
Related
I am building an OSM tile server with mod_tile/renderd, and osm2pgsql, as per instructions here: https://switch2osm.org/manually-building-a-tile-server-16-04-2-lts/
With my current-spec EC2 server (t2.xlarge, Ubuntu 16.04), I can just about work with a country-sized map, although rendering tiles on the fly is still slow, so render_list is needed. I have tried all the performance tweaks I could find to speed up rendering, but what I really think I need is a more powerful server, particularly as the eventual aim is a planet-sized import. Most server specs I can find for this are very outdated.
Would anybody have recommendations for an EC2 instance (or general cloud server specs) for building a planet-sized OSM tile server in 2018?
I found that upgrading to an m5.2xlarge server was sufficient to work with the planet database - on my previous 16 GB server, I was running out of RAM for many DB tasks. Other important issues to resolve were:
Build spatial indexes on the entire table geometries; this was not done by osm2pgsql in my case. I already had partial indexes from running openstreetmap-carto/scripts/indexes.py, but these were not suitable for my style and were not being used, so I needed to create these indexes:
CREATE INDEX planet_osm_polygon_index ON planet_osm_polygon USING GIST(way);
CREATE INDEX planet_osm_line_index ON planet_osm_line USING GIST(way);
Manually set a layer extent in the style XML file (I just used the map extent) - I had omitted it, which meant it had to be calculated by a time-consuming PostGIS query; see: https://github.com/mapnik/mapnik/wiki/OptimizeRenderingWithPostGIS
Run a VACUUM and ANALYZE
I have now been able to run render_list on zooms 0-11, and the server can generate further zoom levels on demand without issue.
We need to seed an application with 3 million entities before running performance tests.
The 3 million entities should be loaded through the application to simulate 3 years of real data.
We are inserting 1-5000 entities at a time. In the beginning, response times are very good, but after a while they decay exponentially.
We use a Groovy script to hit a URL to start each round of insertions.
Restarting the application resets the response time - i.e. it fixes the problem temporarily.
Re-running the script without restarting the app has no effect.
We use the following to enhance performance:
1) Clean up GORM after every 100 insertions:
def session = sessionFactory.currentSession
session.flush()   // push pending inserts to the database
session.clear()   // evict persisted entities from the Hibernate session
DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()   // drop Grails' per-thread map of domain instance properties
(old Ted Naleid trick: http://naleid.com/blog/2009/10/01/batch-import-performance-with-grails-and-mysql)
2) We use GPars for parallel insertions:
GParsPool.withPool {
    (0..<1000).eachParallel {   // each batch of 1000 runs on the pool's worker threads
        def entity = new Entity(...)
        insertionService.insert(entity)
    }
}
Notes
When looking at the log output, I've noticed that the processing time for each entity is the same, but the system seems to pause longer and longer between each iteration.
The exact number of entities inserted is not important, just around 3 million, so if some fail we can ignore it.
Tuning the number of entities inserted at a time has little or no effect.
Help
I'm really hoping somebody has a good idea on how to fix the problem.
Environment
Grails: 2.4.2 (GRAILS_OPTS=-Xmx2G -Xms512m -XX:MaxPermSize=512m)
Java: 1.7.0_55
MBP: OS X 10.9.5 (2.6 GHz Intel Core i7, 16 GB 1600 MHz DDR3)
The pausing makes me think it's the JVM doing garbage collection. Have you used a profiler such as VisualVM to see where the time is being spent during garbage collection? That is typically the best way to understand what is happening with your application inside the JVM.
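As a quick first check (flags appended to the GRAILS_OPTS you already use; placement is illustrative), you can have the JVM log its collections and see whether the pauses line up with GC activity:

GRAILS_OPTS="-Xmx2G -Xms512m -XX:MaxPermSize=512m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

If the log shows back-to-back full collections around each pause, that would confirm the heap is slowly filling despite the cleanup code above.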
Also, if you are trying to "seed" the application, it's far better, performance-wise, to load the data directly into the database rather than going through the application.
(Added as answer per comment)
I have Neo4j 1.9.4 installed on a 24-core, 24 GB RAM (CentOS) machine, and for most queries CPU usage spikes to 200% with only a few concurrent requests.
Domain:
Some sort of social application: a few types of nodes (profiles) with 3-30 text/array properties, and 36 relationship types with at least 3 properties each. Most nodes currently have ~300-500 relationships.
Current data set footprint (from the console):
LogicalLogSize=4294907 (32MB)
ArrayStoreSize=1675520 (12MB)
NodeStoreSize=1342170 (10MB)
PropertyStoreSize=1739548 (13MB)
RelationshipStoreSize=6395202 (48MB)
StringStoreSize=1478400 (11MB)
which is, IMHO, really small.
Most queries look like this one (with more or fewer WITH .. MATCH .. clauses; a few queries use variable-length relationships, but those are usually fast):
START
targetUser=node({id}),
currentUser=node({current})
MATCH
targetUser-[contact:InContactsRelation]->n,
n-[:InLocationRelation]->l,
n-[:InCategoryRelation]->c
WITH
currentUser, targetUser,n, l,c, contact.fav is not null as inFavorites
MATCH
n<-[followers?:InContactsRelation]-()
WITH
currentUser, targetUser,n, l,c,inFavorites, COUNT(followers) as numFollowers
RETURN
id(n) as id,
n.name? as name,
n.title? as title,
n._class as _class,
n.avatar? as avatar,
n.avatar_type? as avatar_type,
l.name as location__name,
c.name as category__name,
true as isInContacts,
inFavorites as isInFavorites,
numFollowers
It runs in ~1-3 s on the first run and ~70 ms-1 s on consecutive runs (depending on the query), and about 5-10 queries run for each impression. Another interesting behavior: when I run a query from the Neo4j console on my local machine many times in a row (just pressing Ctrl+Enter for a few seconds), execution time is almost constant, but when I do the same on the server it gets exponentially slower, and I guess this is somehow related to my problem.
Problem:
So my problem is that Neo4j is very CPU-greedy (on a 24-core machine this may not be an issue, but it's obviously overkill for a small project). At first I used an AWS EC2 m1.large instance, but overall performance was bad; during testing, CPU was always over 100%.
Some relevant parts of configuration:
neostore.nodestore.db.mapped_memory=1280M
wrapper.java.maxmemory=8192
Note: I already tried a configuration where all memory-related parameters were high, and it didn't work (no change at all).
Question:
Where should I dig? Configuration? Schema? Queries? What am I doing wrong?
If you need more info (logs, configs), just ask ;)
The reason subsequent invocations of the same query are much faster is easily explained by caching. A common strategy is to run a cache warm-up query upon startup, e.g.
start n=node(*) match n--m return count(n)
200% CPU usage on a 24-core machine means the machine is pretty lazy, as only 2 cores are busy. While a query is in progress, it's normal for the CPU it runs on to go to 100%.
The Cypher statement above uses an optional match (in the 2nd MATCH clause). These optional matches are known to be potentially slow. Check whether the runtime changes if you make it a non-optional match.
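For instance, as a quick diagnostic against the query above, replace the optional pattern

n<-[followers?:InContactsRelation]-()

with the non-optional form

n<-[followers:InContactsRelation]-()

and compare timings. (Note this changes the semantics - nodes with zero followers drop out of the result - so it only serves to isolate the cost of the optional match.)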
When returning a larger result set, consider that transferring the response is bound by network speed. Consider using streaming in that case; see http://docs.neo4j.org/chunked/milestone/rest-api-streaming.html.
You should also set wrapper.java.minmemory to the same value as wrapper.java.maxmemory.
Another approach for your rather small graph is to switch off MMIO caching and use cache_type=strong to keep the full dataset in the object cache. In this case you might need to increase wrapper.java.minmemory and wrapper.java.maxmemory.
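Putting those last two suggestions together, the relevant settings would look something like this (standard 1.9-era file locations; the 8192 values are just the ones from your current config):

# conf/neo4j.properties
cache_type=strong

# conf/neo4j-wrapper.conf
wrapper.java.minmemory=8192
wrapper.java.maxmemory=8192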
I have the following graph structure
4m nodes
23m properties
13m relationships
Java version
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
Neo4j version
neo4j-community-2.0.0-M03
Machine
Mac OS X 10.8.4
2.5 GHz Intel Core i5
8 GB 1600 MHz DDR3
Problem
I am doing some experiments with three queries. #1 takes 16 seconds, #2 takes 8 minutes, and #3 "crashes". Both #2 and #3 put all the available CPU cores at ~90% usage. I am using the web interface to evaluate these queries (and I will be using the REST API to integrate the app with Neo4j).
I would like to know what is wrong with these queries and how I could optimise them. I am currently using the default settings.
Cypher Queries
Query #1 (Currently taking 16 seconds (after warm-up))
START root=node:source(id="2")
MATCH root-[]->movies<-[]-others
WITH COUNT(movies) as movie_count, others as others
RETURN others.id, movie_count
ORDER BY movie_count DESC
LIMIT 10
Query #2 (8 minutes)
START root=node:source(id="2")
MATCH
root-[]->stuff<-[]-others
WITH DISTINCT(others) as dothers
MATCH dothers-[]->different
RETURN different.id, COUNT(different) as count
ORDER BY count DESC
LIMIT 10
Query #3 (OutOfMemoryError - GC overhead limit exceeded)
START root=node:source(id="2")
MATCH root-[*1..1]->stuff<-[*1..1]-other-[*1..1]->different
WHERE stuff.id <> different.id
WITH COUNT(different) as different_count, different as different
RETURN different.id, different_count
ORDER BY different_count DESC
LIMIT 10
Disclaimer: This advice is for 1.8 and 1.9. If you're using 2.0 or 2.1, these comments may no longer be valid.
Query 1: Make your WITH your RETURN, and skip that extra step.
Query 2: Don't do DISTINCT in the WITH as you are now. Go as far as you can without doing DISTINCT. This looks like a premature optimization that prevents the query from being lazy and forces it to store many more intermediate results to compute the WITH results.
Query 3: Don't do -[*1..1]->; that's the same as -[]-> or -->, but it uses a slower matcher meant for variable-length paths when it really just needs adjacent nodes and could use a fast matcher. Also make the WITH your RETURN and take out the extra pipe it needs to go through, so it can be lazier (although the ORDER BY makes it hard to be fully lazy). See if you can get it to complete without the ORDER BY.
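As a sketch of the above (untested, using the same 1.9-era syntax as the question), Queries #1 and #3 would become:

// Query #1 with the WITH folded into the RETURN
START root=node:source(id="2")
MATCH root-[]->movies<-[]-others
RETURN others.id, COUNT(movies) as movie_count
ORDER BY movie_count DESC
LIMIT 10

// Query #3 with --> instead of -[*1..1]-> and the WITH folded in
START root=node:source(id="2")
MATCH root-->stuff<--other-->different
WHERE stuff.id <> different.id
RETURN different.id, COUNT(different) as different_count
ORDER BY different_count DESC
LIMIT 10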
If you need faster responses and can't squeeze them out of your queries with these recommendations, you may need to turn to the Java API until the Cypher performance improvements arrive in 2.x. The unmanaged extension method makes these easy to call from the REST interface.
When looking for performance, please go with the latest stable version of Neo4j (1.9.x at the time of writing this answer).
2.0.0-M03 is a milestone build and not yet optimized. So far the focus has been on feature completeness with regard to the new concept of labels and label-based indexing.
One of our customers has a 35 GB database with an average active connection count of about 70-80. Some tables in the database have more than 10M records each.
Now they have bought a new server: 4 × 6 cores = 24 CPU cores, 48 GB RAM, and 2 RAID controllers (256 MB cache each), with 8 SAS 15K HDDs on each.
64-bit OS.
I'm wondering which would be the fastest configuration:
1) FB 2.5 SuperServer with a huge buffer: 8192 bytes × 3,500,000 pages ≈ 29 GB
or
2) FB 2.5 Classic with a small buffer of 1000 pages.
Maybe someone has tested such a case before and will save me days of work :)
Thanks in advance.
Because there are many processors, I would start with Classic.
But try all of them.
Perhaps 2.5 with SuperClassic will soon be great for you.
Just to dig up this old thread for anyone who may need it:
We use Firebird Classic 2.5 on a 75 GB database, on a machine almost the same as the one described.
SuperServer was inefficient during our tests; buffer and page size changes only made performance a little bit less miserable.
Currently we use Classic with xinetd, page size = 16384, page buffers = 5000.
SuperServer will use ONLY ONE processor.
Since you have 24 cores, your best option is to use Classic.
SuperClassic is not yet ready to scale well in a multi-processor environment.
Definitely go with one of the 'classic' architectures.
If you're using Firebird 2.5, check out SuperClassic.
I currently have a client with similar requirements.
The best solution in that case was to install Firebird 2.5 SuperClassic and leave the default small cache settings, because if you have free memory (RAM), both Windows and Linux cache the database better than Firebird does. Firebird's own caching is not particularly fast, so let the OS do it.
Also, depending on what backup software you use: if it creates frequent full backups of the Firebird database, you can deactivate forced writes on the databases (only do this if you know what you are doing and understand what can happen when forced writes are disabled).