SuperCSV is working too slowly on a Linux machine

I am using SuperCSV for downloading data as CSV. It is working too slowly for me: it takes 20-30 minutes to download the data on a Linux machine, and CPU usage reaches 109%. Is there any solution to increase performance? The code is given below; I am using CsvBeanWriter.
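The poster's code isn't shown here, but for reference, a minimal CsvBeanWriter export usually looks roughly like the sketch below. The bean class, header mapping and output path are placeholders rather than the poster's actual code; the main performance-related detail is wrapping the destination in a BufferedWriter so each cell write does not hit the underlying stream directly.

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.util.List;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.ICsvBeanWriter;
import org.supercsv.prefs.CsvPreference;

public class CsvExport {

    // Placeholder bean; stands in for whatever the poster's row objects look like.
    public static class MyRow {
        private int id;
        private String name;
        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    static void export(List<MyRow> rows) throws Exception {
        // Field names map to the bean's getters (id -> getId(), name -> getName()).
        String[] header = {"id", "name"};
        // Buffer the writer so SuperCSV flushes in large chunks instead of per value.
        try (ICsvBeanWriter writer = new CsvBeanWriter(
                new BufferedWriter(new FileWriter("export.csv"), 1 << 16),
                CsvPreference.STANDARD_PREFERENCE)) {
            writer.writeHeader(header);
            for (MyRow row : rows) {
                writer.write(row, header);
            }
        }
    }
}

If the data is streamed to an HTTP response rather than a file, the same idea applies: buffer the response writer, and fetch the rows in batches rather than one query per row, since that can be the real bottleneck rather than SuperCSV itself.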

Related

How to monitor very slow data loading in BigQuery

I'm loading uncompressed JSON files into BigQuery in C#, using the Google API method BigQueryClient.UploadJsonAsync. Uploaded files range from 1MB to 400MB. I've been uploading many TB of data with no issues over the last months. But for the past two days, uploading to BigQuery has become very slow.
I was able to upload at 600MB/s, but now I'm at most at 15MB/s.
I have checked my connection and I'm still able to go over 600MB/s in connection tests like Speedtest.
Also strangely, BigQuery load throughput seems to depend on the time of day. Around 3 PM PST my throughput falls to near 5-10MB/s.
I have no idea how to investigate this.
Is there a way to monitor BigQuery data loading?
It's unclear if you're measuring the time from when you start sending bytes until the load job is inserted, versus the time from when you start sending until the load job is completed. The first is primarily a question of throughput at the network level, whereas the second also includes ingestion time on the BigQuery side. You can examine the load job metadata to help figure this out.
If you're trying to suss out network issues with sites like speedtest, make sure you're choosing a suitable remote node to test against; by default, they favor something with close network locality relative to the client you are testing.
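To separate queue time from ingestion time, the load job's creation, start and end timestamps are exposed in the job statistics. The question uses the C# client, but as an illustration only, a rough sketch with the Java BigQuery client (the job id is a placeholder for whatever job your upload actually created) would be:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobId;
import com.google.cloud.bigquery.JobStatistics.LoadStatistics;

public class LoadJobTiming {
    public static void main(String[] args) {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
        // Placeholder job id; use the id of the load job that was actually submitted.
        Job job = bigquery.getJob(JobId.of("my-load-job-id"));
        LoadStatistics stats = job.getStatistics();
        // creationTime is when the job was inserted; start -> end covers BigQuery's
        // own processing, so a growing start-to-end gap points at ingestion
        // rather than at your network path.
        System.out.println("created: " + stats.getCreationTime());
        System.out.println("started: " + stats.getStartTime());
        System.out.println("ended:   " + stats.getEndTime());
        System.out.println("input bytes: " + stats.getInputBytes());
    }
}

If instead the time until the job is even inserted is what grew, the slowdown is in the upload path rather than in BigQuery ingestion.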

Elasticsearch speed vs. Cloud (localhost to production)

I have got a single ELK stack with a single node running in a Vagrant VirtualBox VM on my machine. It has 3 indexes which are 90MB, 3.6GB, and 38GB.
At the same time, I also have a JavaScript application running on the host machine, consuming data from Elasticsearch; it runs with no problems, and speed and everything else are perfect (locally).
The issue comes when I put my JavaScript application in production, as the Elasticsearch endpoint in the application has to go from localhost:9200 to MyDomainName.com:9200. The application runs fine within the company, but when I access it from home, the speed drastically decreases and it often crashes. However, when I go to Kibana from home, running queries there is fine.
The company is using BT broadband with a download speed of 60Mb and 20Mb upload. It doesn't use a fixed IP, so the A record has to be updated manually whenever the IP changes, but I don't think that is relevant to the problem.
Is the internet speed the main issue affecting the loading speed outside of the company? How do I improve this? Is the cloud (CDN?) the only option that would make things run faster? If so, how much would it cost to host it in the cloud, assuming I would index a lot of documents the first time but do a daily maximum of 10MB of indexing after that?
UPDATE1: Metrics from sending a request from Home using Chrome > Network
Queued at 32.77s
Started at 32.77s
Resource Scheduling
- Queueing 0.37 ms
Connection Start
- Stalled 38.32s
- DNS Lookup 0.22ms
- Initial Connection
Request/Response
- Request sent 48 μs
- Waiting (TTFB) 436.61 ms
- Content Download 0.58 ms
UPDATE2:
The stalling period seems to be much shorter when I use a VPN?

10k Concurrent connections with jmeter

I have a 32GB, i7-processor machine running Windows 10 and I am trying to generate a 10k VU concurrent load via JMeter. For some reason I am unable to go beyond 1k concurrent users, and I start getting a BindException or socket connection errors. Can someone help me with the settings to achieve that kind of load? Also, if someone is up for freelancing, I am happy to consider that as well. Any help would be great, as I am nearing production and am unable to load test this use case. If you have any other tools that I can use effectively, that would also help.
You have reached the limit of one computer, so you must execute the test in a distributed environment across multiple computers.
You can set up JMeter's distributed testing in your own environment, or use BlazeMeter or another cloud-based load testing tool.
We can use BlazeMeter, which provides an easy way to handle our load tests. All we need to do is upload our JMX file to BlazeMeter. We can also upload a consolidated CSV file with all the necessary data and BlazeMeter will take care of splitting it, depending on the number of engines we have set.
On BlazeMeter we can set the number of users, or the combination of engines (slave systems) and threads that we want to apply to our tests. We can also configure additional values like multiple locations.
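If you go with JMeter's own distributed mode instead, the controller is started in non-GUI mode and pointed at the remote engines, each of which must already be running jmeter-server. A rough example (host addresses and file names are placeholders):

jmeter -n -t testplan.jmx -R 192.168.0.101,192.168.0.102 -l results.jtl

Each remote engine drives its own copy of the thread groups, so the 10k VUs can be split across, say, ten machines at 1k each.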
1k concurrent sounds low enough that it's probably something else ... it's also roughly the default open file descriptor limit (1024) on a lot of Linux distributions, so maybe try raising the limit.
ulimit -Sn
will show you your current limit and
ulimit -Hn
will show you the hard limit you can reach before you have to touch configuration files. Edit /etc/security/limits.conf as root and set something like
yourusername soft nofile 50000
yourusername hard nofile 50000
yourusername has to be the username of the user you run JMeter as.
After this you will probably have to log out and back in (or restart) for the changes to take effect. If you are not on Linux, I don't know how to do this off-hand; you will have to google it :D
Recommendation:
As a k6 developer I can propose it as an alternative tool, but running 10k VUs on a single machine will be hard with it as well. Every VU takes some memory, at least 1-3 MB, and this goes up the larger your script is. But with 32 GB you could still run up to 1-2k VUs and use http.batch to make concurrent requests, which might simulate the 10k VUs depending on what your actual workflow is like.
I managed to run the stages sample with 300 VUs on a single 3770 i7 CPU and 4 GB of RAM in a virtual machine and got 6.5k+ rps to another virtual machine on a neighboring physical machine (the latency is very low), so maybe 1.5-2k VUs with a somewhat more interesting script and some higher latency, as this will give the Go runtime time to actually run GC while waiting for TCP packets. I highly recommend using discardResponseBodies if you don't need the response bodies, and if you only need some of them, enable the response only for those requests. This helps a lot with the memory consumption of each VU.

OutOfMemory Exception in JMeter

I am running JMeter test on machine having 12 GB RAM but I am getting 'OutOfMemory: Heap Space' exception while running the test for 100 users only.
My script has around 300 HTTP requests and is a "sync with server" scenario.
I have already increased the max heap to 8 GB and tried running the test in Non-GUI mode after disabling all the listeners.
Moreover, I have other scripts for the same application and they work fine with a 200-user load.
Can anyone suggest the changes that I need to make in JMeter to run the sync script successfully for 200 users?
As you do not provide many details, it is very hard to answer, but check:
that you're not running in GUI mode with a View Results Tree listener, as that is an anti-practice; if that is the case, use non-GUI mode:
https://www.ubik-ingenierie.com/blog/jmeter_performance_tuning_tips/
http://jmeter.apache.org/usermanual/best-practices
in jmeter.log, that the allocated heap is really what you think it is (see the example after this list)
that you have not allocated as much heap as you have RAM; you need to leave some memory for the OS (at least 300 MB)
that you're running the latest JMeter version and not an old one
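On the heap points above: the allocated heap is normally controlled by the HEAP variable that the JMeter startup scripts read (in recent versions via bin/setenv.sh or setenv.bat; in older versions by editing the HEAP line inside jmeter/jmeter.bat directly). A rough example for setenv.sh, with illustrative values that leave room for the OS, is:

export HEAP="-Xms1g -Xmx8g"

jmeter.log then shows the heap the JVM actually got, which is the check suggested above.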

Remote API server slowness

On our server we reach api.twitter.com and use the Twitter REST API. Until 3 days ago we had no problems, but since then we have had a slowness problem.
According to the Twitter API status page there is no problem, but we see very big delays.
We make 350-400 requests per minute.
Before, we had a performance of 600-700 ms. per request. (Snapshot image)
But now it became 3600-4000 ms per request. (Snapshot image)
It doesn't look like temporary slowness because it has persisted for nearly 3 days.
What did I check:
- I didn't make any big code changes in our repo. Also, when we make minimal requests with just a single request call, we still see the slowness.
- I checked server speed with Ookla's Speedtest. It looks good: 800 Mb/s download, 250 Mb/s upload.
- We don't have any CPU, RAM or disk problem. CPU average is 30%, RAM is 50% loaded, disk IO is nearly 4-5%.
So what would be the probable causes?
I can check them and update the question.
(Centos 6.5, PHP 5.4.36, Nginx 1.6, Apache/2.2.15, Apache run PHP as PHP module, XCache 3.2.0)
