Huge page buffer vs. multiple simultaneous processes - performance

One of our customers has a 35 GB database with an average of about 70-80 active connections. Some tables in the database have more than 10M records per table.
They have now bought a new server: 4 × 6 cores = 24 CPU cores, 48 GB RAM, and 2 RAID controllers with 256 MB cache and 8 SAS 15K HDDs each.
64-bit OS.
I'm wondering which would be the fastest configuration:
1) FB 2.5 SuperServer with a huge buffer: 3,500,000 pages × 8192 bytes ≈ 29 GB
or
2) FB 2.5 Classic with a small buffer of 1000 pages.
Maybe someone has tested such a case before and will save me days of work :)
Thanks in advance.

Because there are many processors, I would start with Classic.
But try all of them.
Perhaps 2.5 with SuperClassic will soon be great for you.

Just digging up this old thread for anyone who may need it.
We use Firebird Classic 2.5 on a 75 GB database, on a machine almost the same as the one described.
SuperServer was inefficient during our tests; buffer and page size changes only made performance a little less miserable.
Currently we use Classic with xinetd, page size = 16384, page buffers = 5000.

SuperServer will use ONLY ONE processor.
Since you have 24 cores, your best option is to use Classic.
SuperClassic is not yet ready to scale well in a multi-processor environment.

Definitely go with one of the 'classic' architectures.
If you're using Firebird 2.5, check out SuperClassic.

I currently have a client with similar requirements.
The best solution in that case was to install Firebird 2.5 SuperClassic and leave the default small cache settings, because if you have free memory (RAM), both Windows and Linux cache the database better than Firebird does. Firebird's own caching is not particularly fast, so let the OS do it.
Also, depending on what backup software you use: if it creates full backups of the Firebird database often, you can deactivate forced writes on the databases (only do this if you know what you are doing and understand what can happen when forced writes are disabled).

Related

When 777ms are still not good enough (MariaDB tuning)

Over the past couple of months, I've been on a rampage optimising a Joomla website that I'm managing. When I first started, the homepage used to open in around 30-40 seconds, in spite of repeatedly upgrading my dedicated server, as suggested by the hosting firm.
I was able to bring the page speed down to around 800ms by religiously following all the recommendations from the likes of GTmetrix and Pingdom Tools (such as using JCH Optimize, .htaccess caching and compression settings, and MaxCDN), but now I'm stuck optimising my my.cnf settings, trying the various values suggested in a number of related articles. The fastest the homepage opens with the current settings is 777ms after a refresh, which might not sound too bad, but look at the configuration of my dedicated server:
2 quad-core CPUs, 128 GB RAM, 2 × 480 GB SSD in RAID
CloudLinux/Cpanel/WHM
Apache/suEXEC/PHP5/FastCGI
MariaDB 10.0.17 (all tables converted to XtraDB/InnoDB)
The site traffic is moderate: between 10,000 and 20,000 visitors per day, with around 200,000 pageviews.
These are the current my.cnf settings. My goal is to bring the page speed down to under 600ms, which should be possible with this kind of hardware, provided it is tuned the right way.
[mysqld]
local-infile=0
max_connections=10000
max_user_connections=1000
max_connect_errors=20
key_buffer_size=1G
join_buffer_size=1G
bulk_insert_buffer_size=1G
max_allowed_packet=1G
slow_query_log=1
slow_query_log_file="diskar/mysql-slow.log"
long_query_time=40
connect_timeout=120
wait_timeout=20
interfactive_timeout=25
back_log=500
query_cache_type=1
query_cache_size=512M
query_cache_limit=512K
query_cache_min_res_unit=2K
sort_buffer_size=1G
thread_cache_size=16
open_files_limit=10000
tmp_table_size=8G
thread_handling=pool-of-threads
thread_stack=512M
thread_pool_size=12
thread_pool_idle_timeout=500
thread_cache_size=1000
table_open_cache=52428
table_definition_cache=8192
default-storage-engine=InnoDB
[innodb]
memlock
innodb_buffer_pool_size=96G
innodb_buffer_pool_instances=12
innodb_additional_mem_pool_size=4G
innodb_log_bugger_size=1G
innodb_open_files=300
innodb_data_file_path=ibdata1:400M:autoextend
innodb_use_native_aio=1
innodb_doublewrite=0
innodb_user_atomic_writes=1
innodb_flus_log_at_trx_commit=2
innodb_compression_level=6
innodb_compression_algorithm=2
innodb_flus_method=O_DIRECT
innodb_log_file_size=4G
innodb_log_files_in_group=3
innodb_buffer_pool_instances=16
innodb_adaptive_hash_index_partitions=16
innodb_thread_concurrency
innodb_thread_concurrency=24
innodb_write_io_threads=24
innodb_read_io_threads=32
innodb_adaptive_flushing=1
innodb_flush_neighbors=0
innodb_io_capacity=20000
innodb_io_capacity_max=40000
innodb_lru_scan_depth=20000
innodb_purge_threads=1
innodb_randmon_read_ahead=1
innodb_read_io_threads=64
innodb_write_io_threads=64
innodb_use_fallocate=1
innodb_use_atomic_writes=1
inndb_use_trim=1
innodb_mtflush_threads=16
innodb_use_mfflush=1
innodb_file_per_table=1
innodb_file_format=Barracuda
innodb_fast_shutdown=1
I tried Memcached and APCu, but they didn't help; the site actually runs 2-3 times faster with 'Files' as the caching handler in Joomla's Global Configuration. And yes, I ran mysqltuner, but that was of no help.
I am a newbie as far as Linux is concerned and suspect the above settings could be improved. Any comments and/or suggestions?
long_query_time=40
Set that to 1 so you can find out what the slow queries are.
max_connections=10000
That is unreasonably high. If you come anywhere near it, you will have more problems than failure to connect. Let's say only 3000.
query_cache_type=1
query_cache_size=512M
The Query cache is hurting performance by being so large. This is because any write causes all QC entries for the table to be purged. Recommend no more than 50M. If you have heavy writes, it might be better to change the type to DEMAND and pepper your SELECTs with SQL_CACHE (for relatively static tables) or SQL_NO_CACHE (for busy tables).
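For illustration only, here is a minimal sketch of what DEMAND-mode caching looks like from client code. It is shown in Python with the pymysql driver purely for brevity (the site itself is PHP/Joomla), and the credentials and table names are placeholders, not taken from your setup:

import pymysql

# Sketch only: with query_cache_type=DEMAND, only statements marked SQL_CACHE are cached.
# Driver, credentials, and table names below are placeholders.
conn = pymysql.connect(host="localhost", user="joomla", password="secret", db="joomla")
try:
    with conn.cursor() as cur:
        # Relatively static lookup table: ask the query cache to keep this result.
        cur.execute("SELECT SQL_CACHE id, title FROM menu_items")
        menu = cur.fetchall()
        # Heavily written table: explicitly bypass the cache so writes don't keep purging entries.
        cur.execute("SELECT SQL_NO_CACHE data FROM session_store WHERE id = %s", (42,))
        row = cur.fetchone()
finally:
    conn.close()

The point of DEMAND mode is exactly this split: static tables opt in with SQL_CACHE, busy tables stop churning the cache.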
What OS?
Are the entries in [innodb] making it into the system? I thought these needed to be in [mysqld]. Check by doing SHOW VARIABLES LIKE 'innodb%';.
Ah, buggers; a spelling error:
innodb_log_bugger_size=1G
innodb_flus_log_at_trx_commit=2
inndb_use_trim=1
and more??
After you get some data in the slowlog, run pt-query-digest, and let's discuss the top couple of queries.

Performance decays exponentially when inserting bulk data into a Grails app

We need to seed an application with 3 million entities before running performance tests.
The 3 million entities should be loaded through the application to simulate 3 years of real data.
We are inserting 1-5000 entities at a time. In the beginning, response times are very good, but after a while they decay exponentially.
We use a Groovy script to hit a URL that starts each round of insertions.
Restarting the application resets the response time, i.e. fixes the problem temporarily.
Rerunning the script without restarting the app has no effect.
We use the following to enhance performance:
1) Clean up GORM after every 100 insertions:
def session = sessionFactory.currentSession
session.flush()  // push pending inserts to the database
session.clear()  // clear Hibernate's first-level cache
DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()  // clear Grails' per-thread map of domain instance properties
(old Ted Naleid trick: http://naleid.com/blog/2009/10/01/batch-import-performance-with-grails-and-mysql)
2) We use GPars for parallel insertions:
GParsPool.withPool {
    (0..<1000).eachParallel {
        def entity = new Entity(...)
        insertionService.insert(entity)
    }
}
Notes
When looking at the log output, I've noticed that the processing time for each entity is the same, but the system seems to pause longer and longer between iterations.
The exact number of entities inserted is not important, just around 3 million, so if some fail we can ignore them.
Tuning the number of entities inserted per round has little or no effect.
Help
I'm really hoping somebody has a good idea of how to fix this problem.
Environment
Grails: 2.4.2 (GRAILS_OPTS=-Xmx2G -Xms512m -XX:MaxPermSize=512m)
Java: 1.7.0_55
MBP: OS X 10.9.5 (2.6 GHz Intel Core i7, 16 GB 1600 MHz DDR3)
The pausing makes me think it's the JVM doing garbage collection. Have you used a profiler such as VisualVM to see how much time is being spent on garbage collection? That is typically the best approach to understanding what is happening with your application inside the JVM.
Also, if you are just trying to "seed" the application, it's far better, performance-wise, to load the data directly into the database rather than going through the application.
(Added as answer per comment)
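To illustrate that second point: if the backing database happens to be MySQL (as in the Naleid post linked in the question), direct seeding could look roughly like the sketch below. The table name, columns, credentials and the pymysql driver are assumptions made purely for illustration.

import pymysql

BATCH = 5000
TOTAL = 3000000

# Sketch only: seed rows straight into the database instead of going through
# the Grails application. Table, columns, and credentials are placeholders.
conn = pymysql.connect(host="localhost", user="app", password="secret", db="appdb")
try:
    with conn.cursor() as cur:
        for start in range(0, TOTAL, BATCH):
            rows = [("name-%d" % i, i % 100) for i in range(start, start + BATCH)]
            # One multi-row INSERT per batch keeps round trips and transaction size small.
            cur.executemany("INSERT INTO entity (name, category) VALUES (%s, %s)", rows)
            conn.commit()
finally:
    conn.close()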

APC with TYPO3: high fragmentation over time

Using APCu extensively with TYPO3 6.2, I always end up with high fragmentation of the cache over time. I have already seen values of 99% with a smaller shm_size.
In case you are a TYPO3 admin: I also switched the caches cache_pagesection, cache_hash, cache_pages (currently moved back to the DB for testing purposes), cache_rootline, extbase_reflection, extbase_object, as well as some other extension caches, to the APC backend. Mainly, switching cache_hash away from the DB sped up menu rendering times dramatically (https://forge.typo3.org/issues/57953)
1) Does APC fragmentation matter at all, or should I simply make sure it never runs out of memory?
2) To TYPO3 admins: do you happen to know which caches cause the most fragmentation, and which parts of the apcu.ini configuration are relevant for use with TYPO3?
I already tried apc.stat = 0, apc.user_ttl = 0 and apc.ttl = 0 (as in the TYPO3 caching guide http://docs.typo3.org/typo3cms/CoreApiReference/CachingFramework/FrontendsBackends/Index.html#caching-backend-apc) and increasing the shm_size (currently 512M, where normally around 100M would be used). A larger shm_size does a good job of reducing fragmentation, but I'd rather have a smaller, full cache than a large, mostly unused one.
3) To APC(u) admins: could it be that frequently updated cache entries that also change in size cause most of the fragmentation? Or is there some other misconfiguration I'm unaware of?
I know there are a lot of entries in the cache (mainly JSON data from remote servers), some of which update every 5 minutes and are normally a different size each time. If that is indeed a cause, how can I avoid it? By the way, the APCu info page shows a lot of entries taking up only 2 kB, but each with fragmented spacing of about 200 bytes.
4) To TYPO3 and APC admins: APC has great integration with TYPO3, but for many small, frequently updated entries, would you advise a different cache backend than APC?
This is no longer relevant for us; I found a different solution by reverting to the MySQL cache. But if anyone comes here via search, this is how we did it in the end:
Leave the APC cache alone and only use it for the preconfigured extbase_object cache. That cache is less than 1 MB, has only a few inserts at the beginning, and yields a very high hit/miss ratio afterwards. As stated in the Install Tool under "Configuration Presets", this is what the APC cache backend was designed for.
I discovered the bug https://forge.typo3.org/issues/59587 in the process and reviewed our cache usage again. It turned out we had huge cache entries used only for tag-to-ident mappings. My conclusion, even after trying the fixed cache, is that APCu is great for storing frequently accessed key-value mappings, but falls short when there are many frequently inserted or tagged entries (such as cache_hash or cache_pages).
Right now, the MySQL cache tables perform better, making extended use of the MySQL server's memory cache (but, in contrast to APCu, with disk backing). This was the magic setup in our my.cnf (found here: http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/):
innodb_buffer_pool_size = 512M
innodb_log_file_size = 256M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 8
innodb_flush_method=O_DIRECT
innodb_file_per_table
With this additional MySQL server setup, the default TYPO3 cache tables do their job best.

Neo4j 2.0.1 enterprise edition: Performance issue

I was happily using neo4j 1.8.1 community edition for a while on my system with the following configuration.
System Specs:
OS: 32-bit Ubuntu 12.04.3 LTS. Kernel version 3.2.0-52-generic-pae #78-Ubuntu
Memory: 4GB
Swap: 8GB (swapfile - not a partition)
Processor: Intel® Core™ i5-2430M CPU @ 2.40GHz - Quad Core
Harddisk: 500GB Seagate ATA ST9500420AS. Dual boot - Ubuntu uses 100GB and the rest by the almighty Windows 7.
When I switched to Neo4j 2.0.1 enterprise edition, my application's response time became 4x slower. So, as advised in http://docs.neo4j.org/chunked/stable/embedded-configuration.html, I started tuning my filesystem, virtual memory, I/O scheduler and JVM configuration.
Performance Tuning
Started Neo4j as a server with highest scheduling priority (nice value = -20)
Set vm.dirty_background_ratio=50 and vm.dirty_ratio=80 in /etc/sysctl.conf to reduce frequent flushing of dirty memory pages to disk.
Increased the maximum number of open files from 1024 to 40,000, as suggested by Neo4j at startup.
Set noatime,nodiratime for the neo4j ext4 partition in /etc/fstab so that inodes don't get updated every time there is a file/directory access.
Changed the I/O scheduler from "cfq" to "noop" as mentioned in
http://www.cyberciti.biz/faq/linux-change-io-scheduler-for-harddisk/
JVM parameters: in short, the max heap size is 1 GB and the neostore memory-mapped files total 425 MB.
Xms and Xmx set to 1GB.
GC set to Concurrent-Mark-Sweep.
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M
Sadly, this didn't make any difference. I wrote a simple script that creates N nodes and M random relationships among these nodes to get a better picture.
Neo4j 1.8.1 community edition with oracle java version "1.6.0_45":
new-sys-admin@ThinkPad:~/temp$ php perftest.php
Creating 1000 Nodes with index
Time taken : 67.02s
Creating 4000 relationships
Time taken : 201.27s
Neo4j 2.0.1 enterprise edition with oracle java version "1.7.0_51":
new-sys-admin@ThinkPad:~/temp$ php perftest.php
Creating 1000 Nodes with index
Time taken : 75.14s
Creating 4000 relationships
Time taken : 206.52s
The above results are after 2 warm-up runs. The 2.0.1 results seem slower than 1.8.1. Any suggestions on adjusting the relevant configuration to boost Neo4j 2.0.1 performance would be highly appreciated.
EDIT 1
All queries are issued using Gremlin via the Everyman Neo4j wrapper.
http://grokbase.com/p/gg/neo4j/143w1fen8c/gremlin-plugin-extremely-slow-on-neo4j-2-0-1
In the meantime, I moved to neo4j-enterprise-edition-1.9.6 (the most recent stable release before 2.0.1) and things were back to normal.
From the fact that you're using PHP, and seeing that creating just 1000 nodes takes 67 seconds, I assume you're using the regular REST API (e.g. POST /db/data/node). If this is correct, you may be right that 2.0.1 is some percentage slower than 1.8 for these CRUD operations. In 2.0 we focused on optimizing Cypher and the new transactional endpoint.
As such, for best performance, I'd suggest these things:
Use the new transactional endpoint, /db/data/transaction
Use Cypher, and use it to send as much work as possible over to the server in "one go"
When possible, send multiple Cypher queries in the same HTTP request; you can do this through the transactional endpoint as well.
Make sure you re-use TCP connections if you can. I'm not sure exactly how this works in PHP, but sending a "Connection: Keep-Alive" header and ensuring you re-use the same TCP connection saves significant overhead, since you don't have to re-establish connections over and over.
Creating a thousand nodes in one Cypher query shouldn't take more than a few milliseconds. In terms of how many Cypher statements you can send per second: on my laptop, from Python (using https://github.com/jakewins/neo4jdb-python), I get about 10,000 Cypher statements per second in a concurrent setup (10 clients).
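As a rough sketch of those suggestions (written in Python with the requests library rather than PHP; the :Item label and name property are placeholders), batching many CREATE statements into a single request to the transactional endpoint over a reused keep-alive connection looks like this:

import json
import requests

# Sketch only: send many Cypher statements in one HTTP request to the 2.0
# transactional endpoint, reusing the TCP connection via a Session.
# The :Item label and "name" property are placeholders.
session = requests.Session()
url = "http://localhost:7474/db/data/transaction/commit"

payload = {
    "statements": [
        {"statement": "CREATE (n:Item) SET n = {props}",
         "parameters": {"props": {"name": "node-%d" % i}}}
        for i in range(1000)
    ]
}
resp = session.post(url, data=json.dumps(payload),
                    headers={"Content-Type": "application/json"})
resp.raise_for_status()
print("errors:", resp.json().get("errors", []))

The same pattern applies from PHP: one JSON body carrying many statements, one HTTP round trip.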

How to estimate how large a redis database will be?

I'm trying to decide which size Redis To Go plan to go for on Heroku.
Say I want to keep about a million records in Redis for easy access.
If each record is about 1-10 kB in size, does this mean the entire database will be 1,000,000 × 1-10 kB, or is there some hidden overhead I don't know about?
Can you start with the smallest option and gain access to the CLI interface? If so, you can do the following:
redis-cli
> info
View the following and note the values:
used_memory:931104
used_memory_human:909.27K
used_memory_rss:1052672
used_memory_peak:931008
used_memory_peak_human:909.19K
Then load 100k records, do the same thing again, and compare the difference in memory. It's really hard to tell just by guessing at the size, and note that there is overhead (the figures above are from a vanilla install with no data, on my 2011 MacBook Pro running OS X Lion).
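If you'd rather script that measurement than eyeball the INFO output, a small sketch along the same lines (assuming the redis-py client, a local test instance, and a throwaway key prefix) could be:

import redis

# Sketch only: measure used_memory before/after loading a sample, then extrapolate.
# Key prefix and the ~5 kB payload are placeholders for your real records.
r = redis.Redis(host="localhost", port=6379)

before = r.info()["used_memory"]

SAMPLE = 100000
payload = "x" * 5000  # roughly mid-range of the 1-10 kB estimate
pipe = r.pipeline(transaction=False)
for i in range(SAMPLE):
    pipe.set("estimate:%d" % i, payload)
    if i % 1000 == 0:
        pipe.execute()
pipe.execute()

after = r.info()["used_memory"]
per_record = (after - before) / float(SAMPLE)
print("bytes per record including overhead: %.0f" % per_record)
print("projected size for 1,000,000 records: %.1f MB" % (per_record * 1000000 / 1024 / 1024))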
