I'm loading uncompressed JSON files into BigQuery in C#, using the Google API method BigQueryClient.UploadJsonAsync. Uploaded files range from 1 MB to 400 MB. I've been uploading many TB of data with no issues over the last few months, but for the past two days uploading to BigQuery has become very slow.
I used to upload at 600 MB/s, but now I get at most 15 MB/s.
I've checked my connection, and I can still exceed 600 MB/s in speed tests such as Speedtest.
Also, strangely, BigQuery load throughput seems to depend on the time of day: around 3 PM PST my throughput falls to roughly 5-10 MB/s.
I have no idea how to investigate this.
Is there a way to monitor BigQuery data loading?
It's unclear whether you're measuring the time from when you start sending bytes until the load job is inserted, or the time from when you start sending until the load job is completed. The first is primarily a question of network-level throughput, whereas the second also includes ingestion time on the BigQuery side. You can examine the load job metadata to help figure this out.
If you're trying to suss out network issues with sites like Speedtest, make sure you choose a suitable remote node to test against; by default, these sites favor a node with close network locality relative to the client being tested.
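A minimal sketch of the metadata check described above (using Python here rather than C#, and assuming you can fetch the job's creation and end timestamps from its statistics, as the Google client libraries expose them): note your local start time when you begin sending bytes, then split the total elapsed time into "client upload" and "server-side ingestion".

```python
# Sketch: split total load time into upload time vs. BigQuery ingestion time.
# Assumes you record upload_started locally and read job_created/job_ended
# from the load job's statistics (all timezone-aware datetimes).
from datetime import datetime, timezone

def split_load_timings(upload_started, job_created, job_ended):
    """Return seconds spent uploading vs. seconds spent ingesting."""
    return {
        "upload_s": (job_created - upload_started).total_seconds(),
        "ingest_s": (job_ended - job_created).total_seconds(),
    }

# With a client library (environment/credentials assumed, illustrative only):
#   job = client.get_job(job_id)   # job metadata carries the timestamps
#   print(split_load_timings(t0, job.created, job.ended))
```

If "upload_s" dominates, the slowdown is on the network path; if "ingest_s" dominates, it's on the BigQuery side.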
Related
In my Laravel application, I am making an HTTP request to a remote server using the Guzzle library. However, the response takes a long time to reach my local host.
Here is the response I get in the browser:
However, if I run `ping server_IP` I get roughly 175 ms as the average round-trip time.
I also monitored my CPU usage while making requests in an infinite loop, but usage stayed low.
I also tried hosting my Laravel application on an nginx server, but I still observe around 1-1.1 seconds of overhead.
What could be causing this delay, and how can I reduce it?
There are a few potential reasons.
Laravel is not the fastest framework. A few hundred files need to be loaded on every single request. If your server doesn't have an SSD, performance will be terrible. I'd recommend creating a RAM disk and serving the files from there.
Network latency. Open up Wireshark and look at all the requests that need to be made. Each of them hurts performance, and some you cannot get around (DNS, ...). Try to combine the CSS and JS files so that the number of requests is minimised.
The server-side database connection takes a long time to set up, retrieves lots of data, etc.
Bear in mind that in most situations the latency is due to I/O constraints rather than CPU usage, especially in test/preproduction environments, where the number of requests per second the server gets rounds to zero.
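To see where the ~1 second goes, you can split a single request into connection setup versus server processing + transfer. A small stdlib sketch of that measurement (host, port, and path are whatever your setup uses):

```python
# Sketch: break one HTTP request into "connect" time vs. "server + transfer"
# time. If connect_s is close to your ping RTT but server_s is ~1 s, the
# overhead is in the application, not the network.
import http.client
import time

def time_request(host, port, path="/"):
    t0 = time.perf_counter()
    conn = http.client.HTTPConnection(host, port, timeout=10)
    conn.connect()                        # TCP handshake
    t_connect = time.perf_counter()
    conn.request("GET", path)
    resp = conn.getresponse()
    body = resp.read()                    # time to first byte + transfer
    t_done = time.perf_counter()
    conn.close()
    return {
        "connect_s": t_connect - t0,      # roughly one round trip
        "server_s": t_done - t_connect,   # server processing + transfer
        "status": resp.status,
        "bytes": len(body),
    }
```

Running this against the remote server and comparing `connect_s` with the 175 ms ping figure tells you whether the delay is network setup or the PHP side.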
Loading images from the blob is very slow, something like 3-4 seconds. What can I do to make it faster?
If this is slow from multiple sources, the issue is likely in the client. You can turn Azure Storage Analytics on and then compare the server time to the e2e time. Server time is the time the request took to process on the server; e2e time is the time the request took end-to-end, including network transmission. The difference between the two tells you how much time you are spending on the network.
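The comparison above is simple arithmetic over the analytics log. A small sketch, assuming you've already extracted the end-to-end and server latency columns (in milliseconds) for each logged request:

```python
# Sketch: per-request network time from Storage Analytics log latencies.
# rows holds (end_to_end_ms, server_ms) pairs pulled from the log.
from statistics import median

def network_times(rows):
    """Return each request's network time and the median across requests."""
    deltas = [e2e - srv for e2e, srv in rows]
    return deltas, median(deltas)

# Example with three logged requests:
deltas, med = network_times([(100, 20), (300, 50), (80, 30)])
print(deltas, med)   # [80, 250, 50] 80
```

A high median network time points at the client's connection or location; a high server time points at the storage account itself.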
While our website is not yet complete graphically and design-wise, most of the backend operations are near completion.
However, after optimising the MySQL database, we still see a significant initial wait period when testing on pingdom.com:
http://tools.pingdom.com/fpt/#!/IuoBna86v/http://foscam-uk.com
According to Pingdom:
The yellow part is the time it takes to resolve the hostname and similar (before the connection is initiated to the web server), the green part is connecting to the web server, and the blue part is the time it takes to retrieve the content from the webserver.
Upon asking our managed VPS support team we got the response : 'Have you tried optimizing your script? I believe that the high wait time on there indicates actual website loading time (meaning for your script to load); not actual connection to the website/server.'
Now, Pingdom shows the JS/CSS loading relatively quickly, and the MySQL side of things doesn't seem to be slowing anything down either. Does anyone have any suggestions as to what could be causing this?
Thank you very much for your time and help.
89 requests are too many.
Reduce the number of image requests by creating sprites. This is pretty important given what Pingdom shows.
Keep-Alive should be set to On, and the keep-alive timeout should be a bit higher (15 seconds or so).
Merging and minifying the JS/CSS files (e.g. with a compiler/minifier) is also recommended.
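If the VPS runs Apache (an assumption; nginx has equivalent directives under different names), the keep-alive settings above look roughly like this in the server config:

```apacheconf
# Reuse each TCP connection for multiple requests instead of
# paying the connection setup cost 89 times.
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
```

With 89 requests per page, keeping connections open saves dozens of TCP handshakes per visit.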
Change the hosting provider. 8-second loading is very, very slow. It means the real figure is around 15-17 seconds for a first-time visitor who hasn't cached parts of your site. My site www.bebepunk.ro loads in 2.5 seconds according to Pingdom, and users still complain about the slowness of the site. Check with http://www.webpagetest.org as well for both values.
We have a JPA -> Hibernate -> Oracle setup, where we are only able to crank up to 22 transactions per second (two reads and one write per transaction). The CPU, disk, and network are not bottlenecked.
Is there something I am missing? I wonder if there could be some Oracle-imposed limit that the DBAs have applied.
Network is not the problem: when I do raw reads on the table, I can do 2000 reads per second. The problem is clearly the writes.
CPU is not the problem on the app server; the CPU is basically idling.
Disk is not the problem on the app server; the data is completely loaded into memory before the processing starts.
Might be worth comparing performance with a different client technology (or even just a simple test using SQL*Plus) to see if you can beat this performance; it may simply be an under-resourced or misconfigured database.
I'd also compare the results for SQL*Plus running directly on the d/b server with it running locally on whatever machine your Java code runs on (where it communicates over SQL*Net). That would confirm whether the problem is below your Java tier.
To be honest, there are so many layers between your JPA code and the database itself that diagnosing the cause is going to be fun... I recall one mysterious d/b performance problem that turned out to be a misconfigured network card; the DBAs were rightly insistent that the database wasn't showing any bottlenecks.
It sounds like the application is doing a transaction in a bit less than 0.05 seconds. If the SELECT and UPDATE statements are extracted from the app and run by themselves, using SQL*Plus or some other tool, how long do they take, and do the statement times add up to nearly 0.05? Where does the data used in the queries (and eventually in the UPDATE) come from? It's entirely possible that the slowdown is not the database but somewhere else in the app, such as the data acquisition phase. Perhaps something like a profiler could be used to find out where the app is spending its time.
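The per-statement measurement suggested above can be sketched quickly; here sqlite3 stands in for Oracle (table and column names are made up), but the idea is identical: time each statement and the commit separately, then see which term dominates the ~0.05 s budget.

```python
# Sketch: time each statement on its own; the commit is often the
# dominant cost in a write-heavy transaction.
import sqlite3
import time

def timed(cur, sql, params=()):
    t0 = time.perf_counter()
    cur.execute(sql, params)
    cur.fetchall()
    return time.perf_counter() - t0

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
cur.executemany("INSERT INTO accounts VALUES (?, ?)",
                [(i, 100.0) for i in range(1000)])
conn.commit()

t_sel1 = timed(cur, "SELECT balance FROM accounts WHERE id = ?", (1,))
t_sel2 = timed(cur, "SELECT balance FROM accounts WHERE id = ?", (2,))

t0 = time.perf_counter()
cur.execute("UPDATE accounts SET balance = balance + 1 WHERE id = ?", (1,))
conn.commit()
t_upd = time.perf_counter() - t0

total = t_sel1 + t_sel2 + t_upd
print(f"selects: {t_sel1 + t_sel2:.6f}s  update+commit: {t_upd:.6f}s  total: {total:.6f}s")
```

If the standalone total is far below 0.05 s, the missing time is being spent above the database: in the ORM layers or in the data acquisition phase.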
Share and enjoy.
I am creating a service which receives some data from mobile phones and saves it to the database.
The phone sends the data every 250 ms. Since I noticed that the delay before data is stored was increasing, I ran Wireshark and wrote a log as well.
I noticed that the web requests from the mobile phone are being made without delay (checked with Wireshark), but in the service log the requests are received only every 1.5 to 2 seconds.
Does anyone know where the problem could be, or a way to test and determine the cause of such a delay?
I am creating a service with WCF (webHttpBinding) and the database is MS SQL.
By the way, the log stores the time of the HTTP request and also the time the data is written to the database. As mentioned above, a request is received every 1.5-2 seconds, and after that it takes 50 ms to store the data in the database.
Thanks!
My first guess after reading the question was that maybe you are submitting data so fast that the database server is hitting a write-contention lock (e.g. on AutoNumber fields?).
If your database platform is SQL Server, take a look at http://www.sql-server-performance.com/articles/per/lock_contention_nolock_rowlock_p1.aspx
Anyway, please post more information about the overall architecture of the system: what software/platforms are used in which parts, etc.
Maybe there is some limitation on the connection imposed by the service provider?
What happens if you (for testing) don't write to the database and just log the page hits in the server log with timestamp?
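Comparing inter-arrival gaps at the two measurement points you already have (the Wireshark capture and the service log) would show exactly where the 250 ms stream stretches to 1.5-2 s. A sketch with illustrative timestamps:

```python
# Sketch: compute gaps between consecutive arrivals at each layer.
# 'wire' times come from the packet capture, 'log' times from the
# service log; the numbers below are illustrative.
def gaps_ms(timestamps):
    """timestamps: sorted arrival times in seconds -> gaps in ms."""
    return [round((b - a) * 1000) for a, b in zip(timestamps, timestamps[1:])]

wire = [0.00, 0.25, 0.50, 0.75, 1.00]   # packets on the wire: every 250 ms
log  = [0.00, 1.60, 3.30, 5.10]         # requests seen by the service

print("wire gaps:", gaps_ms(wire))   # ~250 ms each
print("log gaps:", gaps_ms(log))     # ~1600-1800 ms each
```

If the wire gaps are steady at 250 ms but the log gaps are not, the delay sits between the network stack and your service code (e.g. WCF throttling or connection limits), not in the 50 ms database write.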
Check that you do not have any tracing running on the web services; this can really kill performance.