Dynamic Table Update Procedure in HTTP/2

We have a question about our understanding of the HTTP/2 protocol stack with regard to dynamic table updates. We want a clear understanding of
the procedure for updating the dynamic table when the remote end sends
header frames in a sequence.
Scenario:
Suppose the remote end 'R' wants to send two requests, R1 and R2, but due to a
window limit R1 is split into two frames, R1-F1 and R1-F2.
Remote end 'R' then sends the frames in this sequence:
*) R1-F1
*) R2
*) R1-F2
I want to know, in this case, when the dynamic table should be updated at the remote end
as well as at the local end for request R1. If we first update the dynamic table
at the remote end for the whole of R1 and then for R2, how do we keep the
dynamic table at the local end in sync, given that R1 is fully received only after R2?
Or do we have to update the dynamic table for each frame, irrespective of
the total size of the request's headers?

Remote end 'R' then sends the frames in this sequence:
R1-F1
R2
R1-F2
This sequence is invalid. The HTTP/2 specification for CONTINUATION frames (https://datatracker.ietf.org/doc/html/rfc7540#section-6.10) states:
Any number of CONTINUATION frames can be sent, as long as the preceding frame is on the same stream and is a HEADERS, PUSH_PROMISE, or CONTINUATION frame without the END_HEADERS flag set.
This means that in your case, the valid sequence of frames must be:
R1-F1
R1-F2
R2
Once the sequence is correct, updating the HPACK dynamic table is not a problem, because each request's header block is received in full before the next one starts, so both endpoints apply the dynamic table updates in the same order.
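As a rough sketch of what a decoder does (plain Python pseudocode, not tied to any particular HTTP/2 library; hpack_decoder stands in for whatever HPACK decoder is in use), header block fragments are buffered per stream and handed to HPACK only once the END_HEADERS flag is seen:

header_fragments = {}  # stream_id -> list of header block fragments

def on_headers_or_continuation(stream_id, fragment, end_headers, hpack_decoder):
    # Buffer this fragment; any CONTINUATION frames for this stream must follow
    # immediately, with no other frames interleaved (RFC 7540, Section 6.10).
    header_fragments.setdefault(stream_id, []).append(fragment)
    if not end_headers:
        return None
    # Decode the complete header block in one go: the dynamic table is updated
    # exactly once per block, in connection order, so both ends stay in sync.
    block = b"".join(header_fragments.pop(stream_id))
    return hpack_decoder.decode(block)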

Related

Loading a large amount of data - Allowed memory size exhausted

I have a Java Spring Boot application (I'll call it A1) that is connected to Rabbit. A1 receives data and saves it to a MySQL database (I'll call the database DB1). You can imagine this data as football matches with the appropriate markets and outcomes. We are receiving data for the next 10 days via the A1 app, and that data is stored in the database.
One more thing worth emphasizing: every football match has 4 markets and every market has 7 outcomes.
I will explain what DB1 looks like. There are 3 tables worth mentioning (matches, markets, outcomes) besides the other tables. They are related as follows: matches 1..* markets, markets 1..* outcomes.
The data received over A1 is constantly updating (every second some update is received for the football events from the current moment plus 2 hours).
There is another PHP Symfony application (I'll call it S1). This application serves as a REST API.
There is one more frontend application (I'll call it F1) that communicates with S1 over HTTP in order to retrieve data from the database.
The F1 application sends an HTTP request to S1 in order to retrieve this data (matches with markets and outcomes), but the date-time frame is from the current moment plus 7 days (a business requirement).
When an HTTP request is sent to S1, an error occurs because there are over 10,000 football matches plus bets and outcomes:
PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes) in.
I am considering two options to solve this issue; if neither of my options is good enough, I would appreciate it if you could suggest a solution to my problem.
Option 1 - F1 iterates per day, sending 7 HTTP requests asynchronously to S1 in order to retrieve the data for all 7 days.
Option 2 - F1 sends an HTTP request to S1. S1 returns the data only for today, and the next 6 days are sent over a socket, iterating per day, using https://pusher.com/ or something similar.
One more thing to emphasize: the number of these HTTP requests is currently about two per second, and it tends to grow.
10K matches turning into 134 MB of data? You are using more than 13 KB for each record... Likely you are making your data structure too flat, duplicating match/bet/etc. metadata into a single flat record. Try making your objects hierarchical, having a match contain its bets, instead of single-row objects.
If not, then you have an inefficiency in your processing of the data that we cannot diagnose remotely.
You would do even better if you did more processing server-side instead of sending the raw data over the wire. The more you can answer questions inside of S1 instead of sending the data to the client, the less data you will have to send.
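As a rough illustration of the hierarchical shape suggested above (plain Python rather than your actual PHP/Java models; all the field names are assumptions), grouping the flat rows by match stores the match metadata once per match instead of once per outcome (4 markets x 7 outcomes = 28 copies):

def to_hierarchy(flat_rows):
    # flat_rows: one dict per outcome, with the match columns repeated on every row
    matches = {}
    for r in flat_rows:
        m = matches.setdefault(r["match_id"], {
            "home": r["home"],
            "away": r["away"],
            "kickoff": r["kickoff"],
            "markets": {},
        })
        m["markets"].setdefault(r["market"], []).append(
            {"outcome": r["outcome"], "odds": r["odds"]}
        )
    return matches  # the match metadata appears once per match, not 28 times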

SSIS Lookup transformation not finding matches

I have a Lookup transformation that does not seem to be finding obvious matches. I have an input file with 43 records that all contain the same CustomerID, which is defined as an eight-byte signed integer. I am using the Lookup to see whether the CustomerID already exists in my destination table. In the destination table, the CustomerID is defined as BIGINT.
For testing, I truncated the Lookup (destination) table. I have tried all three Cache settings with the same results.
When I run the SSIS package, all 43 records are sent through the No Match Output side. I would think that only the 1st record should go that direction and all the others would be considered matches, since they have the same CustomerID. Additionally, if I run the job a second time (without truncating the destination), they are all flagged as Matched.
It seems as if the cache is not being consulted by the Lookup. Ultimately, I want the No Match records to be written to the destination table and the Matched records to have further processing.
Any ideas?
The Lookup transformation is working as expected. I am not sure what your understanding of Lookup is, so I'll go point by point.
For testing, I truncated the Lookup (destination) table. I have tried all three Cache settings with the same results. When I run the SSIS package, all 43 records are sent through the No Match Output side.
The above behavior is expected. After the truncate, the Lookup is essentially trying to find those 43 records within your truncated destination table. Since it can't find any, it flags them as new records, i.e., the No Match output side.
If I run the job a second time (without truncating the destination) then they are all flagged as Matched.
In this case, all those 43 records from the file are found in the destination table, hence the Lookup treats them as duplicates and they are flagged as Matched output.
I am using the Lookup to see if the CustomerID already exists in my destination table.
To achieve this, all you need to do is send the Matched output to some staging table, which can be periodically truncated (as those rows are duplicates), and all the No Match output can be sent to your destination table.
You can post a screenshot of your Lookup as well in case you want further help.
The Lookup can't be used this way. SSIS dataflows execute in a transaction, so while the package is running, no rows have been written to the destination until the entire dataflow completes. So regardless of the Cache setting, the new rows being sent to your destination table are not going to be considered by the Lookup while it's running. Then, when you run it again, those rows will be considered. This is expected behavior.
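Here is a plain-Python illustration (not SSIS itself; the numbers and names are made up) of why every row goes to No Match on the first run: the set of keys the Lookup compares against is fixed before any new rows are written, so duplicates within the same run never count as matches:

existing_ids = set()                       # what the destination holds: empty right after the truncate
incoming_rows = [{"CustomerID": 42} for _ in range(43)]   # 43 file rows sharing one CustomerID

matched, no_match = [], []
for row in incoming_rows:
    if row["CustomerID"] in existing_ids:
        matched.append(row)
    else:
        no_match.append(row)               # the lookup set is never updated mid-run,
                                           # so all 43 rows land here on the first run

print(len(matched), len(no_match))         # prints: 0 43
# On a second run the destination already holds CustomerID 42,
# so all 43 rows would come out of the Matched output instead.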

Doing a string length check on a SQL*Loader input field

I'm reading data from a fixed-length text file and loading it into a table with fixed-length processing.
I want to check the input line length so that I can discard records that do not match the fixed length and log them into an error table.
Example:
Load into the Input_Log table if the line meets the specified length.
Load into the Input_Error_Log table if the input line length is less than or greater than the fixed line length.
I believe you would be better served by bulk loading your data into a staging table, then loading into the production table from there via a stored procedure, where you can apply rules via normal PL/SQL and DML to your heart's content. This is a typical best practice anyway.
sqlldr isn't really the tool to get too complicated with, even if you could do what you want. Maintainability and restartability become harder when you add complexity to a tool that's really designed for bulk loading. Add the complexity to a proper program.
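If you do go the "proper program" route for the length check itself, a minimal pre-check sketch (Python purely for illustration; the file names and the 100-character record length are assumptions) could split the input into a clean file for the Input_Log load and an error file for the Input_Error_Log load before sqlldr ever runs:

EXPECTED_LEN = 100   # assumed fixed record length; adjust to your layout

# Split the fixed-length input into a clean file for sqlldr and an error
# file that can be loaded separately into Input_Error_Log.
with open("input.dat") as src, \
     open("input_ok.dat", "w") as ok, \
     open("input_err.dat", "w") as err:
    for line in src:
        record = line.rstrip("\n")
        (ok if len(record) == EXPECTED_LEN else err).write(record + "\n")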
Let us know what you come up with.

Questions on SQL*Loader commit range and display

As per my observation, when we load data into Oracle tables using SQL*Loader, 64 records are committed at a time by default.
Could you please let me know if we can change this default insert/commit limit to a number other than 64?
Also, can we suppress the loading progress messages, shown below, on the console?
Commit point reached - logical record count 64
Commit point reached - logical record count 128
Commit point reached - logical record count 192
From the reference:
rows -- number of rows in conventional path bind array or between direct path data saves
(Default: Conventional path 64, Direct path all)
So you'd specify rows=1024 or whatever on the command line or in the parameter file.
As for avoiding the display, I don't think there's an option to suppress only the progress information. There is the silent option, but that might be more than you want. Filter those messages out with grep if you don't want to see them.

Oracle Text URL indexing performance

I have an Oracle table that has two columns: id and url.
The URL is simply http://somemachinename/mypage.php?id=
I then create an Oracle Text index using the URL datastore on the url column.
If I then do:
BEGIN
  ctx_ddl.sync_index(idx_name        => 'MY_INDEX',
                     memory          => '50M',
                     parallel_degree => 4);
END;
/
Then, if I look at the Apache logs on somemachinename, I can see Oracle requesting all of the URLs in turn.
The problem is that Oracle requests about 60 URLs in a row and then stops for 15 seconds before requesting another 60-ish URLs.
The amount of data in each HTML page is small (less than 3 KB), so 60 pages shouldn't be filling any buffers; and even if they were, it shouldn't take 15 seconds to clear them.
Running Wireshark shows that the delay is definitely in the requests arriving (rather than a problem with the web server), so I don't know what Oracle is doing in those 15 seconds.
The indexing is a big job (the table has about 2 million rows), and it currently takes a week, whereas without the delays I think it would be more like hours.
Any thoughts?
"so I dont know what Oracle is doing in those 15s."
You could look at the Wait / Event on v$session. Could be chewing CPU parsing the HTML, or it maybe that it needs to write the data somewhere (temp tablespace) first.
Don't suppose there is anything on the network side treating this as some sort of attack ?
