Unable to check the dataflow output activity metrics - ADF - expression

I am trying to check whether the dataflow has written any rows to the sink and to capture the activity output. The update statement fails if the activity doesn't write any rows to the sink, so, as per the MS docs, I am trying the expression below in a Lookup activity.
DECLARE @Date DATETIME;
SET @Date = GETDATE();
DECLARE @ROWSAFFECT INT;
SET @ROWSAFFECT = if(contains(@{activity('dataflow').output.runStatus}, 'sink'), '@{activity('dataflow').output.runStatus.metrics.sink.rowsWritten}','0');
update table [schema].[audit_table]
SET LOAD_STATUS ='Success'
,ROWS_AFFECTED = @ROWSAFFECT
select 1;
But this fails with a parse error. Can someone please help me with this?
A database operation failed with the following error: 'Parse error at line: 4, column: 217: Incorrect syntax near ']'.'

The query written with dynamic content to assign the value for @ROWSAFFECT is incorrect.
The main issue is with the contains() inside the if(). You should be searching for 'sink' in activity('dataflow').output.runStatus.metrics, i.e. contains(activity('dataflow').output.runStatus.metrics, 'sink').
I have tried the following query against my Azure SQL database table (dbo.audit) for demonstration.
DECLARE @ROWSAFFECT INT;
SET @ROWSAFFECT = @{if(contains(activity('dataflow').output.runStatus.metrics, 'sink'), activity('dataflow').output.runStatus.metrics.sink.rowsWritten, 0)};
update [dbo].[audit] SET LOAD_STATUS ='Success' ,ROWS_AFFECTED = @ROWSAFFECT
The following is the debug query that is run in each case: once when rowsWritten is 0, and once when there are n=10 records.
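The debug screenshots are not reproduced here, but the resolved query looks roughly like this in each case (a sketch only, using the same dbo.audit table as above):
-- Sketch: resolved script when the sink wrote no rows
DECLARE @ROWSAFFECT INT;
SET @ROWSAFFECT = 0;
update [dbo].[audit] SET LOAD_STATUS ='Success' ,ROWS_AFFECTED = @ROWSAFFECT
-- Sketch: resolved script when the sink wrote 10 rows
DECLARE @ROWSAFFECT INT;
SET @ROWSAFFECT = 10;
update [dbo].[audit] SET LOAD_STATUS ='Success' ,ROWS_AFFECTED = @ROWSAFFECT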

Related

How to execute Stored Procedure from Laravel - SQL Server

I have the following stored procedure:
ALTER procedure [dbo].[spx_kasir_verifikator_GetData_web]
@id_verifikator int
as
begin
SELECT * FROM tb_kasir_set_verifikator
WHERE tb_kasir_set_verifikator.id_verifikator = id_verifikator;
end
Controller:
public function show($id_verifikator)
{
    $setverifikator = DB::select("exec spx_kasir_verifikator_GetData_web ?", [$id_verifikator]);
    dd($setverifikator);
}
I'm trying to call this procedure in Laravel 8. I need to display just one record, filtered by id_verifikator, but it always shows all the data. How can I solve this?
It's returning everything because you are using select *, which tells the database that you want the entire row. If you only want a specific column, you need to change that to select id_verifikator.
A bit off-topic, but I would suggest that you just run this query in Laravel instead of having a procedure, especially as this is such a basic query. The links below can help you get started.
https://laravel.com/docs/8.x/queries
https://laravel.com/docs/8.x/eloquent

SimpleJdbcCall : No function exist error

I am getting an exception while executing a stored procedure. The exception is below:
org.springframework.jdbc.BadSqlGrammarException: CallableStatementCallback; bad SQL grammar [{call find_spot()}]; nested exception is org.postgresql.util.PSQLException: ERROR: function find_spot() does not exist
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Position: 15
It says function find_spot() does not exist, but I checked in the database and the procedure is there. I am using PostgreSQL [DBeaver].
Can anyone help me to solve this?
You should cast the JSON to perform the search, or use the JSON access operators; see the examples:
select * from table where cast(field_json as varchar(500)) !~ 'reg_ex' and id = 11
or
select field_json->>'key' from table where field_json->>'key' ilike 'value'

UPSERT in Memsql from another table

I am trying this query to insert some records from one table into another when the records do not already exist in the target table, but I am getting the following error. What is the best query to UPSERT in MemSQL from another table?
Query:
INSERT INTO ema.device_set
(segment_0, segment_1, segment_2, segment_3, segment_4, last_updated)
SELECT tmp.segment_0, tmp.segment_1, tmp.segment_2, tmp.segment_3, tmp.segment_4, tmp.last_updated
FROM ema.tmp_device_set tmp
WHERE NOT EXISTS (
SELECT *
FROM ema.device_set tab
WHERE tmp.segment_0 = tab.segment_0 and tmp.segment_1 = tab.segment_1 and tmp.segment_2 = tab.segment_2 and tmp.segment_3 = tab.segment_3 and tmp.segment_4 = tab.segment_4
);
error:
Partition has no master instance or Leaf Error: The database will be available to query in 2 seconds after recovery from disk is finished.
That error message means your nodes are down or recovering from disk. It has nothing to do with the specific UPSERT you are trying to do.
Check to make sure your query does not violate any of the MemSQL INSERT...SELECT rules shown at the following link.
https://docs.memsql.com/docs/insert
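For what it's worth, once the cluster is healthy, a true upsert (insert new rows, refresh existing ones) can usually be written with ON DUPLICATE KEY UPDATE rather than NOT EXISTS. This is only a sketch: it assumes the five segment columns form a primary or unique key on ema.device_set and that your MemSQL version allows ON DUPLICATE KEY UPDATE with INSERT ... SELECT.
-- Sketch: assumes (segment_0, ..., segment_4) is a unique/primary key on ema.device_set
INSERT INTO ema.device_set
(segment_0, segment_1, segment_2, segment_3, segment_4, last_updated)
SELECT tmp.segment_0, tmp.segment_1, tmp.segment_2, tmp.segment_3, tmp.segment_4, tmp.last_updated
FROM ema.tmp_device_set tmp
ON DUPLICATE KEY UPDATE last_updated = VALUES(last_updated);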

BigQuery - Check if table already exists

I have a dataset in BigQuery. This dataset contains multiple tables.
I am doing the following steps programmatically using the BigQuery API:
Step 1: Querying the tables in the dataset - since my response is too large, I am enabling the allowLargeResults parameter and diverting my response to a destination table.
Step 2: Exporting the data from the destination table to a GCS bucket.
Requirements:
Suppose my process fails at Step 2, I would like to re-run this step.
But before I re-run, I would like to check/verify that the specific destination table named 'xyz' already exists in the dataset.
If it exists, I would like to re-run step 2.
If it does not exist, I would like to do foo.
How can I do this?
Thanks in advance.
Alex F's solution works on v0.27, but will not work on later versions. In order to migrate to v0.28+, the below solution will work.
from google.cloud import bigquery
project_nm = 'gc_project_nm'
dataset_nm = 'ds_nm'
table_nm = 'tbl_nm'
client = bigquery.Client(project_nm)
dataset = client.dataset(dataset_nm)
table_ref = dataset.table(table_nm)
def if_tbl_exists(client, table_ref):
    from google.cloud.exceptions import NotFound
    try:
        client.get_table(table_ref)
        return True
    except NotFound:
        return False
if_tbl_exists(client, table_ref)
Here is a python snippet that will tell whether a table exists (deleting it in the process--careful!):
def doesTableExist(project_id, dataset_id, table_id):
    bq.tables().delete(
        projectId=project_id,
        datasetId=dataset_id,
        tableId=table_id).execute()
    return False
Alternatively, if you'd prefer not to delete the table in the process, you could try:
def doesTableExist(project_id, dataset_id, table_id):
    try:
        bq.tables().get(
            projectId=project_id,
            datasetId=dataset_id,
            tableId=table_id).execute()
        return True
    except HttpError, err:
        if err.resp.status <> 404:
            raise
        return False
If you want to know where bq came from, you can call build_bq_client from here: http://code.google.com/p/bigquery-e2e/source/browse/samples/ch12/auth.py
In general, if you're using this to test whether you should run a job that will modify the table, it can be a good idea to just do the job anyway, and use WRITE_TRUNCATE as a write disposition.
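If it helps to picture the idea, the effect of WRITE_TRUNCATE is to overwrite the destination table unconditionally; a rough SQL-only sketch of the same semantics (the project, dataset, table and source query are placeholders, and this bypasses the job-level write disposition rather than demonstrating the API call):
-- Sketch only: overwrite the destination table whether or not it already exists,
-- which is the effect a query job with WRITE_TRUNCATE has on its destination.
CREATE OR REPLACE TABLE `gc_project_nm.ds_nm.xyz` AS
SELECT *
FROM `gc_project_nm.ds_nm.source_table`;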
Another approach can be to create a predictable job id, and retry the job with that id. If the job already exists, the job already ran (you might want to double check to make sure the job didn't fail, however).
Enjoy:
def doesTableExist(bigquery, project_id, dataset_id, table_id):
    try:
        bigquery.tables().get(
            projectId=project_id,
            datasetId=dataset_id,
            tableId=table_id).execute()
        return True
    except Exception as err:
        if err.resp.status != 404:
            raise
        return False
There is an edit in the exception handling compared to the snippet above.
You can now use exists() to check whether a dataset exists, and the same for a table; see the BigQuery exists documentation.
Recently BigQuery introduced so-called scripting statements that can be quite a game changer for some flows.
Check them out here:
https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting
Now, for example, to check if a table exists you can use something like this:
sql = """
BEGIN
  IF EXISTS(SELECT 1 FROM `YOUR_PROJECT.YOUR_DATASET.YOUR_TABLE`) THEN
    SELECT 'table_found';
  END IF;
EXCEPTION WHEN ERROR THEN
  # you can print your own message like above or return the error message
  # however Google says not to rely on the error message structure as it may change
  SELECT @@error.message;
END;
"""
With my_bigquery being an instance of class google.cloud.bigquery.Client (already authenticated and associated with a project):
my_bigquery.dataset(dataset_name).table(table_name).exists() # returns boolean
It does an API call to test for the existence of the table via a GET request
Source: https://googlecloudplatform.github.io/google-cloud-python/0.24.0/bigquery-table.html#google.cloud.bigquery.table.Table.exists
It works for me using version 0.27 of the Google BigQuery Python module.
Inline SQL Alternative
tarheel's answer is probably the most correct at this point in time,
but I was considering the comment from Ivan above that "404 could also mean the resource is not there for a bunch of reasons", so here is a solution that should always successfully run a metadata query and return a result.
It's not the fastest approach, because it always has to run a query, and BigQuery has overhead for small queries.
A trick I've seen previously is to query information_schema for a (table) object, and union that to a fake query that ensures a record is always returned even if the object doesn't exist. There's also a LIMIT 1 and an ordering to ensure that the single record returned represents the table, if it does exist. See the SQL in the code below.
In spite of doc claims that BigQuery standard SQL is ISO compliant, it doesn't support information_schema, but it does have __TABLES_SUMMARY__.
dataset is required because you can't query __TABLES_SUMMARY__ without specifying a dataset.
dataset is not a parameter in the SQL because you can't parameterize object names without SQL injection issues (apart from with the magical _TABLE_SUFFIX; see https://cloud.google.com/bigquery/docs/querying-wildcard-tables ).
#!/usr/bin/env python
"""
Inline SQL way to check a table exists in Bigquery
e.g.
print(table_exists(dataset_name='<dataset_goes_here>', table_name='<real_table_name>'))
True
print(table_exists(dataset_name='<dataset_goes_here>', table_name='imaginary_table_name'))
False
"""
from __future__ import print_function
from google.cloud import bigquery
def table_exists(dataset_name, table_name):
    client = bigquery.Client()
    query = """
        SELECT table_exists FROM
        (
          SELECT true as table_exists, 1 as ordering
          FROM __TABLES_SUMMARY__ WHERE table_id = @table_name
          UNION ALL
          SELECT false as table_exists, 2 as ordering
        ) ORDER BY ordering LIMIT 1"""
    query_params = [bigquery.ScalarQueryParameter('table_name', 'STRING', table_name)]
    job_config = bigquery.QueryJobConfig()
    job_config.query_parameters = query_params
    if dataset_name is not None:
        dataset_ref = client.dataset(dataset_name)
        job_config.default_dataset = dataset_ref
    query_job = client.query(
        query,
        job_config=job_config
    )
    results = query_job.result()
    for row in results:
        # There is only one row because of LIMIT 1 in the SQL
        return row.table_exists
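As a side note for readers on current BigQuery: INFORMATION_SCHEMA views are now exposed, so the same existence check can be written directly in standard SQL. A minimal sketch, with YOUR_DATASET and 'xyz' as placeholders:
-- Sketch: check for a table via INFORMATION_SCHEMA (available in current BigQuery releases)
SELECT COUNT(*) > 0 AS table_exists
FROM YOUR_DATASET.INFORMATION_SCHEMA.TABLES
WHERE table_name = 'xyz';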

Why do I get "ORA-00932: inconsistent datatypes: expected - got -" when using COLLECT() in a prepared statement?

I am using this query with the Perl DBI:
SELECT c.change_id
, COLLECT(t.tag) AS the_tags
FROM changes c
LEFT JOIN tags t ON c.change_id = t.change_id
WHERE c.project = ?
GROUP BY c.change_id
The DBI uses OCI to prepare this statement, bind the value I pass, and get the results. But Oracle, for some reason, does not like it. The error output is:
ORA-00932: inconsistent datatypes: expected - got - (DBD ERROR: error possibly near <*> indicator at char 41 in '
SELECT c.change_id
, <*>COLLECT(t.tag) AS the_tags
FROM changes c
LEFT JOIN tags t ON c.change_id = t.change_id
WHERE c.project = :p1
GROUP BY c.change_id
'
Not very informative. However, I can make this error go away not only by changing the call to COLLECT(), but also by replacing the placeholder with the actual value:
SELECT c.change_id
, COLLECT(t.tag) AS the_tags
FROM changes c
LEFT JOIN tags t ON c.change_id = t.change_id
WHERE c.project = 'tryoracle'
GROUP BY c.change_id
That version works perfectly. Why doesn't Oracle like the prepared statement with the COLLECT()?
In case it's any help, here is a trace of the OCI-related calls, extracted via ora_verbose = 6 (h/t @bohica).
Finally got a solution to this issue, thanks to some digging by a user. The problem was not with the placeholder; why it worked without the placeholder on the VirtualBox image, I have no idea. No, the issue was with the COLLECT(). It seems that both the values being collected and the resulting array need to be cast: the values to a specific type, and the array to a pre-defined array data type. It just so happens that my code has a custom array type:
CREATE TYPE sqitch_array AS varray(1024) OF VARCHAR2(512);
So I'm able to get the query to work by casting the COLLECT() like so:
CAST(COLLECT(CAST(t.tag AS VARCHAR2(512))) AS sqitch_array)
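Putting that cast back into the original statement, the working query looks roughly like this (a sketch, assuming the sqitch_array type created above):
SELECT c.change_id
, CAST(COLLECT(CAST(t.tag AS VARCHAR2(512))) AS sqitch_array) AS the_tags
FROM changes c
LEFT JOIN tags t ON c.change_id = t.change_id
WHERE c.project = ?
GROUP BY c.change_id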
