I have a large dataset in MySQL (around 2.2 million rows) and my import to Elasticsearch via Logstash works, but it is now going incredibly slowly.
On my local machine, in Vagrant instances with 4GB RAM each, it went relatively quickly (it took about 3 days), compared to an estimated 80+ days for the server-to-server transfer.
The query is quite complex (it uses a subquery, etc.).
I switched the MySQL server from using the /tmp directory to using /data/tmp_mysql for its temporary files, but even then I was occasionally running out of temporary space.
For example, I was getting this error:
message=>"Exception when executing JDBC query,
exception Sequel::DatabaseError: Java::JavaSql::SQLException
Error writing file '/data/tmp_mysql/MYHPf8X5' (Errcode: 28)
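For reference, a minimal sketch (assuming the mysql command-line client is available on the database host) of checking which tmpdir the server actually uses and how much space is left on it:

# Confirm which directory MySQL uses for temporary tables and sort files.
mysql -u xxx -p -e "SHOW VARIABLES LIKE 'tmpdir';"

# Check free space on the filesystem backing it (path taken from the error above).
df -h /data/tmp_mysql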
I updated my query to have this limit (200):
UPDATE p_results set computed_at="0000-00-00 00:00:00" WHERE computed_at IS NULL LIMIT 200;
My configuration file looks like this (notice that I'm using paging with a page size of 10000):
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://xxx.xxx.xxx.xxx:3306/xxx_production"
    jdbc_user => "xxx"
    jdbc_password => "xxx"
    jdbc_driver_library => "/usr/share/java/mysql.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    statement_filepath => "./sql/req.sql"
    jdbc_paging_enabled => "true"
    jdbc_page_size => 10000
  }
}

output {
  elasticsearch {
    index => "xxx_resultats_preprod2"
    document_type => "resultats"
    hosts => ["localhost:9200"]
    codec => "plain"
    template => "./resultats_template.json"
    template_name => "xxx_resultats"
    template_overwrite => true
    document_id => "%{result_id}"
  }
}
I've looked at some of the documentation here.
Running free -m on my Logstash/Elasticsearch server, I see this:
             total       used       free     shared    buffers     cached
Mem:          3951       2507       1444          0        148        724
-/+ buffers/cache:        1634       2316
Swap:         4093        173       3920
So total RAM = 4GB, and 2.5GB (about 63%) of it is used. So RAM on the Elasticsearch server doesn't seem to be the issue.
Running free -m on my MySQL server, I see this:
             total       used       free     shared    buffers     cached
Mem:          3951       3836        115          0          2       1154
-/+ buffers/cache:        2679       1271
Swap:         4093        813       3280
So total RAM = 4GB and ~3.8GB, or 97%, is used. This looks like a problem.
My theories are that I'm occasionally swapping to disk, and that is part of the reason it's slow; or maybe using BOTH paging and a LIMIT is slowing things down?
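To test the swapping theory, here is a minimal sketch (assuming the standard procps tools are installed on the MySQL host) of watching for active swap traffic while the import runs:

# Sustained non-zero values in the si/so columns while the import runs would point to swapping.
vmstat 5

# Snapshot of current memory and swap usage.
free -m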
The load average on the MySQL server is relatively low right now.
top
load average: 1.00, 1.00, 1.00
Under /data I see:
sudo du -h -d 1
13G     ./tmp_mysql
4.5G    ./production
Using df -h I see:
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1        32G  6.2G   24G  21% /data
If someone can help me make my queries execute much faster I'd very much appreciate it!
Edit:
Thank you all for your helpful feedback. It turns out my Logstash import had crashed (because MySQL ran out of /tmp space for the subquery), and I had assumed I could just keep re-running the same import job. I could run it, and it did load into Elasticsearch, but very, very slowly. When I re-implemented the loading of the index from scratch and ran it against a new index, the load time became pretty much on par with what it was in the past. I estimate it will take 55 hours to load the data, which is a long time, but at least it's working reasonably now.
I also ran an EXPLAIN on my MySQL subquery and found some indexing issues I could address.
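For reference, a minimal sketch of that check; the SELECT below is a placeholder (the real query lives in req.sql and isn't shown here), and the index name is hypothetical:

# Ask MySQL how it would execute the filter used by the import.
mysql -u xxx -p xxx_production -e "EXPLAIN SELECT result_id FROM p_results WHERE computed_at IS NULL;"

# If EXPLAIN shows a full table scan on that column, an index may help (hypothetical name).
mysql -u xxx -p xxx_production -e "CREATE INDEX idx_p_results_computed_at ON p_results (computed_at);"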
You indicate 2 potential problems here:
slow mysql read
slow elasticsearch write
You need to eliminate one of them. Try outputting to stdout to see whether Elasticsearch is the bottleneck or not.
If it is, you can play with some ES settings to improve ingestion (see the sketch after this list):
refresh_interval => -1 (disable refresh)
remove replicas while doing the import (number_of_replicas: 0)
Use more shards and more nodes
(more at https://www.elastic.co/guide/en/elasticsearch/reference/master/tune-for-indexing-speed.html)
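If Elasticsearch does turn out to be the bottleneck, here is a hedged sketch of applying the first two settings to the index from the question (the index name comes from the Logstash config above; the exact Elasticsearch version isn't stated, so treat the request format as an assumption):

# Disable refresh and drop replicas for the duration of the bulk load.
curl -XPUT 'http://localhost:9200/xxx_resultats_preprod2/_settings' -H 'Content-Type: application/json' -d '{"index": {"refresh_interval": "-1", "number_of_replicas": 0}}'

# Restore sensible values once the import has finished.
curl -XPUT 'http://localhost:9200/xxx_resultats_preprod2/_settings' -H 'Content-Type: application/json' -d '{"index": {"refresh_interval": "1s", "number_of_replicas": 1}}'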
I've seen a good bit about Docker setups and the like running ES in unprivileged containers. Basically, I want to set up a simple "prod cluster" with a total of two nodes: one physical (for data) and one for ingest/master (an LXD container).
The issue I've run into is with using bootstrap.memory_lock: true as a config option to lock memory (and avoid swapping) on my containerized master/ingest node.
[2018-02-07T23:28:51,623][WARN ][o.e.b.JNANatives ] Unable to lock JVM Memory: error=12, reason=Cannot allocate memory
[2018-02-07T23:28:51,624][WARN ][o.e.b.JNANatives ] This can result in part of the JVM being swapped out.
[2018-02-07T23:28:51,625][WARN ][o.e.b.JNANatives ] Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: 65536
[2018-02-07T23:28:51,625][WARN ][o.e.b.JNANatives ] These can be adjusted by modifying /etc/security/limits.conf, for example:
# allow user 'elasticsearch' mlockall
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
...
[1]: memory locking requested for elasticsearch process but memory is not locked
Now, this makes sense given that the ES user can't adjust ulimits on the host. Knowing only enough about this to be dangerous: how do I ensure that my unprivileged container can lock the memory it needs, given that there is no ES user on the host?
I'll just call this resolved: I set swapoff on the parent (host) and left that setting at its default in the container. Not what I would call "the right way" as asked in my question, but good/close enough.
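For anyone landing here, a minimal sketch of that workaround (host-side commands only; adjust to taste):

# On the LXD host: disable swap so nothing can be swapped out.
sudo swapoff -a
# Also comment out any swap entries in /etc/fstab so the change survives a reboot.

# In the container's elasticsearch.yml, leave memory locking at its default (disabled),
# since the unprivileged container cannot raise RLIMIT_MEMLOCK on its own:
# bootstrap.memory_lock: false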
We have a small Greenplum (gpdb) cluster, and a few queries are failing on it.
System-related information:
TOTAL RAM = 30G
SWAP = 15G
gp_vmem_protect_limit = 2700MB
TOTAL segments = 8 primary + 8 mirror = 16
SEGMENT HOSTS = 2
vm.overcommit_ratio = 72
We used this calculator: http://greenplum.org/calc/#
SYMPTOM
The query failed with the error message shown below:
ERROR: XX000: Canceling query because of high VMEM usage. Used: 2433MB, available 266MB, red zone: 2430MB (runaway_cleaner.c:135) (seg2 slice74 DATANODE01:40002 pid=11294) (cdbdisp.c:1320)
We tried changing the following parameters:
statement_mem from 125MB to 8GB
max_statement_mem from 200MB to 16GB
Not sure what exactly needs to change here; we're still trying to understand the root cause of the error.
Any help would be much appreciated.
gp_vmem_protect_limit is per segment. You have 16 segments, so based on your segment count and that setting you need 2700MB x 16 of memory in total.
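A quick back-of-the-envelope check using only the numbers from the question (a sketch; it assumes the 16 segments are spread evenly, 8 per host):

# Worst case the segments are allowed to reserve, per host and cluster-wide.
echo "per host:      $((8 * 2700)) MB"    # 21600 MB, ~21 GB against 30 GB RAM + 15 GB swap
echo "cluster-wide:  $((16 * 2700)) MB"   # 43200 MB, ~42 GB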
I have a Redis instance hosted by Heroku (https://elements.heroku.com/addons/heroku-redis), using the "Premium 1" plan.
This Redis is used only to host a small queue system called Bull (https://www.npmjs.com/package/bull).
The memory usage is now almost at 100% (of the 100 MB allowed) even though there are barely any jobs stored in Redis.
I ran an INFO command on this instance and here are the important parts (I can post more if needed):
# Server
redis_version:3.2.4
# Memory
used_memory:98123632
used_memory_human:93.58M
used_memory_rss:470360064
used_memory_rss_human:448.57M
used_memory_peak:105616528
used_memory_peak_human:100.72M
total_system_memory:16040415232
total_system_memory_human:14.94G
used_memory_lua:280863744
used_memory_lua_human:267.85M
maxmemory:104857600
maxmemory_human:100.00M
maxmemory_policy:noeviction
mem_fragmentation_ratio:4.79
mem_allocator:jemalloc-4.0.3
# Keyspace
db0:keys=45,expires=0,avg_ttl=0
# Replication
role:master
connected_slaves:1
master_repl_offset:25687582196
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:25686533621
repl_backlog_histlen:1048576
I have a really hard time figuring out how I can be using 95 MB with barely 50 objects stored. These objects are really small, usually JSON with 2-3 fields containing small strings and ids.
I've tried https://github.com/gamenet/redis-memory-analyzer, but it crashes on me when I try to run it.
I can't get a dump because Heroku does not allow it.
I'm a bit lost here; there might be something obvious I've missed, but I'm reaching the limit of my understanding of Redis.
Thanks in advance for any tips / pointers.
EDIT
We had to upgrade our Redis instance to keep everything running, but it seems the issue is still here. It is currently sitting at 34 keys / 34 MB.
I've tried redis-cli --bigkeys:
Sampled 34 keys in the keyspace!
Total key length in bytes is 743 (avg len 21.85)
9 strings with 43 bytes (26.47% of keys, avg size 4.78)
0 lists with 0 items (00.00% of keys, avg size 0.00)
0 sets with 0 members (00.00% of keys, avg size 0.00)
24 hashs with 227 fields (70.59% of keys, avg size 9.46)
1 zsets with 23 members (02.94% of keys, avg size 23.00)
I'm pretty sure there is some overhead building up somewhere but I can't find what.
EDIT 2
I'm actually blind: it was used_memory_lua_human:267.85M in the INFO output when I first created this post, and it is now used_memory_lua_human:89.25M on the new instance.
This seems super high and might explain the memory usage.
You have just 45 keys in the database, so what you can do is:
List all keys with the KEYS * command.
Run the DEBUG OBJECT <key> command for each key (or several of them); it returns the serialized length, so you will get a better idea of which keys consume a lot of space.
An alternative is to run redis-cli --bigkeys, which will show the biggest keys. You can see the content of a key with the command specific to its data type: for strings it's GET, for hashes it's HGETALL, and so on.
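A minimal sketch of those steps from the shell (assuming a redis-cli recent enough to accept a connection URI via -u; the key name used with DEBUG OBJECT is purely hypothetical):

# List all keys (fine here with ~45 keys; avoid KEYS * on large production databases).
redis-cli -u "$REDIS_URL" KEYS '*'

# Inspect one key's serialized length (hypothetical key name).
redis-cli -u "$REDIS_URL" DEBUG OBJECT "bull:myqueue:1"

# Or let redis-cli sample the keyspace and report the biggest key per type.
redis-cli -u "$REDIS_URL" --bigkeys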
After a lot of digging, the issue is not coming from Redis or Heroku in any way.
The queue system we use has a somewhat recent bug where Redis ends up caching a Lua script repeatedly, eating up memory as time goes on.
More info here: https://github.com/OptimalBits/bull/issues/426
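As a hedged aside (not something the linked issue necessarily prescribes): Redis can drop its cached Lua scripts on demand, which is one way to reclaim that memory while waiting for a library fix. Clients simply re-send their scripts via EVAL/SCRIPT LOAD afterwards.

# Clear the server-side Lua script cache.
redis-cli -u "$REDIS_URL" SCRIPT FLUSH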
Thanks to those who took the time to reply.
I am using Sphinx 2.2.2-id64-beta with MySQL 5.5.31. The system is running CentOS 6, 32-bit. The server has 8GB RAM, of which 7.8GB are utilized.
For a full-text search on archived data tables, which contain 10,000,000 rows, the indexer creates a 3.0G archive.sps file.
When I search through it, the following error is thrown:
preload: failed to map file '/var/lib/sphinx/archive.sps': Cannot allocate memory (length=3127114752); NOT
I tried using docinfo = extern, ondisk_attrs = 1 for sql_field_string = details, without any luck.
Do I need to free 3GB of RAM for Sphinx, or is there another way I can work this out?
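For context, a hedged sketch of checks that would show whether the 32-bit address space, rather than free RAM, is what the 3GB mmap is running into (standard CentOS tools assumed):

getconf LONG_BIT    # 32 here, so a single process has roughly 3GB of usable address space
ulimit -v           # any per-process virtual memory limit that may also apply
free -m             # actual free memory, for comparison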
I have a single-node Elasticsearch instance (version 0.90) running on a single machine (8GB RAM, dual-core CPU) with RHEL 5.6.
After indexing close to 2 million documents, it runs fine for a few hours and then restarts on its own, wiping out the index in the process. I then need to reindex all the documents again.
Any ideas on why this happens? The maximum number of file descriptors is set to 32k and the number of open file descriptors at any time doesn't even come close, so it can't be that.
Here are the modifications I made to the default elasticsearch.yml file:
index.number_of_shards: 5
index.cache.field.type: soft
index.fielddata.cache: soft
index.cache.field.expire: 5m
indices.fielddata.cache.size: 10%
indices.fielddata.cache.expire: 5m
index.store.type: mmapfs
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled: false
action.disable_delete_all_indices: true
script.disable_dynamic: true
I use the elasticsearch service wrapper to start and stop the instance. In the elasticsearch.conf file, I have set the heap size to 2GB:
set.default.ES_HEAP_SIZE=2048
Any help in diagnosing the problem will be appreciated.
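For what it's worth, a hedged sketch of a first diagnostic pass (generic Linux checks, not something confirmed by the post): a process that restarts on its own is often being killed by the kernel OOM killer, which leaves traces in the system logs:

# Look for evidence that the kernel killed the Elasticsearch JVM.
dmesg | grep -iE 'killed process|out of memory'
sudo grep -i oom /var/log/messages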
Thanks guys!