SQLAlchemy sending a request for each entity in list - performance

I turned on 'echo' on the Engine and I see one DB query sent when I call query.all(), and then an additional query sent for each Report once report.as_dict() accesses a field.
query = db_session.query(Report)
query = query.filter(or_(Report.network_id == network_id, Report.network_id == None))
reports = query.all()
db_session.commit()
resp = [report.as_dict() for report in reports]
The query sent on query.all() -
2017-09-19 16:02:28,504 INFO sqlalchemy.engine.base.Engine SELECT report.id AS report_id, report.network_id AS report_network_id, report.account_id AS report_account_id, report.name AS report_name, report.notes AS report_notes, report.structure AS report_structure, report.type AS report_type, report.version AS report_version
FROM report WHERE report.network_id = %(network_id_1)s OR report.network_id IS NULL ORDER BY report.id
2017-09-19 16:02:28,504 INFO sqlalchemy.engine.base.Engine {'network_id_1': '5850'}
And for each report accessed on report.as_dict() (param_1 = report id) -
2017-09-19 16:04:15,100 INFO sqlalchemy.engine.base.Engine SELECT report.id AS report_id, report.network_id AS report_network_id, report.account_id AS report_account_id, report.name AS report_name, report.notes AS report_notes, report.structure AS report_structure, report.type AS report_type, report.version AS report_version
FROM report WHERE report.id = %(param_1)s
2017-09-19 16:04:15,100 INFO sqlalchemy.engine.base.Engine {'param_1': 1}
It looks like the whole list of reports is retrieved by the initial query, but I still see a query sent for each of them. How can I change this behavior?
My environment: Windows 10, Python 3.5.0, SQLAlchemy 1.2.0b2

This is because expire_on_commit is set to True (the default), so as soon as you call session.commit() SQLAlchemy expires all the data you've just queried, and every attribute access after that triggers a fresh SELECT. You need to turn it off on the session.
db_session = Session(expire_on_commit=False)
Be mindful, though, and check that disabling expiration does not violate any assumptions your code makes about attribute freshness after a commit.
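As a minimal sketch (assuming a toy Report model with only id and name, and an in-memory SQLite database instead of the original schema), disabling expire_on_commit keeps loaded attributes readable after commit without a per-row re-load:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

# Toy model standing in for the Report table from the question.
class Report(Base):
    __tablename__ = "report"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite://")  # in-memory database for the sketch
Base.metadata.create_all(engine)

# expire_on_commit=False: commit() no longer expires loaded instances,
# so reading their attributes afterwards does not emit extra SELECTs.
Session = sessionmaker(bind=engine, expire_on_commit=False)
session = Session()
session.add(Report(name="weekly"))
session.commit()

reports = session.query(Report).all()
session.commit()
names = [r.name for r in reports]  # served from memory, no per-row queries
```

Alternatively, moving the db_session.commit() call to after the resp list comprehension also avoids the per-row SELECTs without changing the session's expiration settings, since the attributes are read before they get expired.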

Related

Ruby GA4 Data API not returning empty rows in RunReportRequest

After running my code to retrieve a report with the Google Analytics Data gem, the response does not return all the rows I would expect. Instead, it leaves out the empty rows and only returns the ones with data. This is the code I am running; the property and analytics account are models that store access to the account info to pass to the API.
property = my_analytics_property
analytics_account = property.google_analytics_account
analytics_data_service = Google::Analytics::Data.analytics_data do |config|
config.credentials = analytics_account.google_account_oauth.credentials
end
date_dimension = ::Google::Analytics::Data::V1beta::Dimension.new(name: "date")
bounce_rate = ::Google::Analytics::Data::V1beta::Metric.new(name: "bounceRate")
page_views = ::Google::Analytics::Data::V1beta::Metric.new(name: "screenPageViews")
avg_session_duration = ::Google::Analytics::Data::V1beta::Metric.new(name: "averageSessionDuration")
date_range = ::Google::Analytics::Data::V1beta::DateRange.new(start_date: 1.month.ago.to_date.to_s, end_date: Date.current.to_s)
request = ::Google::Analytics::Data::V1beta::RunReportRequest.new(
property: "properties/#{property.remote_property_id}",
metrics: [bounce_rate, page_views, avg_session_duration],
dimensions: [date_dimension],
date_ranges: [date_range],
keep_empty_rows: true
)
response = analytics_data_service.run_report request

How to access a single field of the Logstash @metadata event?

I am using Logstash 7.6 with the output-jdbc plugin, but I get an error, and I understand that it is because the event sends all the fields to be indexed, including those that are part of @metadata.
I tried just using the field name without the @, and it works for me.
How can I get a single field within the @metadata set?
ERROR:
ERROR logstash.outputs.jdbc - JDBC - Exception. Not retrying {:exception=>#, :statement=>"UPDATE table SET estate = 'P' WHERE codigo = ? ", :event=>"{\"properties\":{\"rangoAltura1\":null,\"rangoAltura2\":null,\"codigo\":\"DB_001\",\"rangoAltura3\":null,\"descrip\":\"CARLOS PEREZ\",\"codigo\":\"106\",\"rangoAltura5\":null,\"active\":true},\"id\":\"DB_001_555\"}"}
My .conf:
statement => ["UPDATE table SET estate = 'A' WHERE entidad = ? ","%{[@metadata][miEntidad]}"]
%{[@metadata][miEntidad]} -----> map['entidad_temp'] = event.get('entidad')
According to the output-jdbc plugin README, you have it set correctly.
Maybe try the following as a work-around:
statement => ["UPDATE table SET estate = 'A' WHERE entidad = ? ","[@metadata][miEntidad]"]
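As a sketch of the overall pipeline (the ruby filter is an assumption, and the entidad/miEntidad field names are taken from the question's mapping), the @metadata field can be populated in a filter and then referenced in the jdbc output:

```
filter {
  ruby {
    # copy the event field into @metadata so it is not indexed with the event
    code => "event.set('[@metadata][miEntidad]', event.get('entidad'))"
  }
}
output {
  jdbc {
    statement => ["UPDATE table SET estate = 'A' WHERE entidad = ?", "%{[@metadata][miEntidad]}"]
  }
}
```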

How to execute multiple inserts in batch in r2dbc?

I need to insert multiple rows into one table in one batch.
In DatabaseClient I found the insert() statement and the using(Publisher objectToInsert) method, which accepts multiple objects as an argument. But would it insert them in one batch or not?
Another possible solution is connection.createBatch(), but it has a drawback: I cannot pass my entity object to it, and I cannot generate the SQL query from the entity.
So, is it possible to create a batch insert in R2DBC?
There are two questions here:
Would DatabaseClient.insert() insert them in one batch or not?
No, not as a batch.
Is it possible to create a batch insert in R2DBC (other than with Connection.createBatch())?
No; Connection.createBatch() is the only way to create a Batch for now.
See also issues:
spring-data-r2dbc#259
spring-framework#27229
There is no direct support so far, but I found it is possible to use Connection to overcome this barrier; check out this issue: spring-data-r2dbc#259.
The Statement has an add() method to repeat the parameter bindings.
The complete code of my solution can be found here.
return this.databaseClient.inConnectionMany(connection -> {
    var statement = connection.createStatement("INSERT INTO posts (title, content) VALUES ($1, $2)")
            .returnGeneratedValues("id");
    for (var p : data) {
        statement.bind(0, p.getTitle()).bind(1, p.getContent()).add();
    }
    return Flux.from(statement.execute())
            .flatMap(result -> result.map((row, rowMetadata) -> row.get("id", UUID.class)));
});
A test for this method.
@Test
public void testSaveAll() {
    var data = Post.builder().title("test").content("content").build();
    var data1 = Post.builder().title("test1").content("content1").build();
    var result = posts.saveAll(List.of(data, data1)).log("[Generated result]")
            .doOnNext(id -> log.info("generated id: {}", id));
    assertThat(result).isNotNull();
    result.as(StepVerifier::create)
            .expectNextCount(2)
            .verifyComplete();
}
The generated ids are printed as expected in the console.
...
2020-10-08 11:29:19,662 INFO [reactor-tcp-nio-2] reactor.util.Loggers$Slf4JLogger:274 onNext(a3105647-a4bc-4986-9ad4-1e6de901449f)
2020-10-08 11:29:19,664 INFO [reactor-tcp-nio-2] com.example.demo.PostRepositoryTest:31 generated id: a3105647-a4bc-4986-9ad4-1e6de901449f
//.....
2020-10-08 11:29:19,671 INFO [reactor-tcp-nio-2] reactor.util.Loggers$Slf4JLogger:274 onNext(a611d766-f983-4c8e-9dc9-fc78775911e5)
2020-10-08 11:29:19,671 INFO [reactor-tcp-nio-2] com.example.demo.PostRepositoryTest:31 generated id: a611d766-f983-4c8e-9dc9-fc78775911e5
//......
Process finished with exit code 0

Is there any way to view the physical SQLs executed by Calcite JDBC?

Recently I have been studying Apache Calcite. So far I can use explain plan for via JDBC to view the logical plan, and I am wondering how I can view the physical SQL during plan execution. Since there may be bugs in the physical SQL generation, I need to verify its correctness.
val connection = DriverManager.getConnection("jdbc:calcite:")
val calciteConnection = connection.asInstanceOf[CalciteConnection]
val rootSchema = calciteConnection.getRootSchema()
val dsInsightUser = JdbcSchema.dataSource("jdbc:mysql://localhost:13306/insight?useSSL=false&serverTimezone=UTC", "com.mysql.jdbc.Driver", "insight_admin","xxxxxx")
val dsPerm = JdbcSchema.dataSource("jdbc:mysql://localhost:13307/permission?useSSL=false&serverTimezone=UTC", "com.mysql.jdbc.Driver", "perm_admin", "xxxxxx")
rootSchema.add("insight_user", JdbcSchema.create(rootSchema, "insight_user", dsInsightUser, null, null))
rootSchema.add("perm", JdbcSchema.create(rootSchema, "perm", dsPerm, null, null))
val stmt = connection.createStatement()
val rs = stmt.executeQuery("""explain plan for select "perm"."user_table".* from "perm"."user_table" join "insight_user"."user_tab" on "perm"."user_table"."id"="insight_user"."user_tab"."id" """)
val metaData = rs.getMetaData()
while (rs.next()) {
  for (i <- 1 to metaData.getColumnCount) printf("%s ", rs.getObject(i))
  println()
}
result is
EnumerableCalc(expr#0..3=[{inputs}], proj#0..2=[{exprs}])
  EnumerableHashJoin(condition=[=($0, $3)], joinType=[inner])
    JdbcToEnumerableConverter
      JdbcTableScan(table=[[perm, user_table]])
    JdbcToEnumerableConverter
      JdbcProject(id=[$0])
        JdbcTableScan(table=[[insight_user, user_tab]])
There is a Calcite Hook, Hook.QUERY_PLAN that is triggered with the JDBC query strings. From the source:
/** Called with a query that has been generated to send to a back-end system.
* The query might be a SQL string (for the JDBC adapter), a list of Mongo
* pipeline expressions (for the MongoDB adapter), et cetera. */
QUERY_PLAN;
You can register a listener to log any query strings, like this in Java:
Hook.QUERY_PLAN.add((Consumer<String>) s -> LOG.info("Query sent over JDBC:\n" + s));
It is possible to see the generated SQL query by setting the calcite.debug=true system property. The exact place where this happens is in JdbcToEnumerableConverter. Since the SQL is generated during query execution, you will have to remove the "explain plan for" prefix from the query passed to stmt.executeQuery.
Note that with debug mode enabled you will also get a lot of other messages, as well as other information such as the generated code.

JMeter - Duration Assertion on While Controller (JDBC Sampler)? - UPDATED

My current environment: JMeter v2.11, remote Oracle 12, JDK 7
There is a system (A) that sends 2000 SOAP/XML submissions per hour to a receiving system (B). System B inserts a new row into the database table for each new submission, setting the application.status column to 1. System B then processes the requests and updates application.status from 1 to 6 once processing is complete and the submission is 'approved'.
I have a requirement that states these A to B submissions need to be 'approved' within 60 seconds - I am trying to setup my thread to verify this.
My current workings (after some start up help from Dmitri T) are as follows:
Thread Group
-Beanshell Sampler (to create an XML message)
-Beanshell Sampler (to submit XML to a web service)
-While Controller-->${__javaScript("${status_1}" != "6")}
--Duration Assertion-->60000 milliseconds (Duration)
--JDBC Request-->select status from application where applicationID = (select max(application_id) from application); VarName = status
Currently, my Thread Group runs and multiple JDBC Requests are executed until either a JDBC Request takes longer than the Duration Assertion value OR the status value in the application table is updated to 6 (which equates to the 'Approved' status).
This is NOT what I need.
I don't want to verify whether a single JDBC request takes longer than the Duration value; it never will. What I need the Duration Assertion for is to fail if the change from application.status=1 to application.status=6 takes longer than 60 seconds.
I've tried the following:
Thread Group
-While Controller-->${__javaScript("${status_1}" != "6")}
--Duration Assertion-->60000 milliseconds (Duration)
--JDBC Request-->select status from application where applicationID = (select max(application_id) from application); VarName = status
Thread Group
-While Controller-->${__javaScript("${status_1}" != "6")}
--JDBC Request-->select status from application where applicationID = (select max(application_id) from application); VarName = status
--Duration Assertion-->60000 milliseconds (Duration)
Thread Group
-While Controller-->${__javaScript("${status_1}" != "6")}
--JDBC Request-->select status from application where applicationID = (select max(application_id) from application); VarName = status
---Duration Assertion-->60000 milliseconds (Duration)
I'm running out of ideas! - As with my previous requests, I appreciate any help anyone can provide.
Cheers!
Just move your Duration Assertion one level up (to the same level as the JDBC Request, not as a child of it); that way it is applied to the While Controller's duration rather than to a single request.
To learn more about assertion scope, cost and interoperability, see the How to Use JMeter Assertions in 3 Easy Steps guide.
