I have written an SQL query to return an Interactive report in APEX, and I would like to create another (classic) report on the same page that summarizes that report. The report gives test results, and I want to summarize it by grade level (so x # of As, y Bs, etc)
It's a complicated query that takes a while to run, so I'd like to avoid just running it again and using aggregation functions.
Does APEX store that report's result set in a variable or table, as it does with page items (something like :TABLE_NAME), that I could just run a query against? I haven't been able to find anything on that in the documentation or by searching.
Thanks!
One option is to:
enter the parameters you use to filter data in that complex query
instead of running the report directly, create a push button and a process on it
have that process insert the result into a global temporary table (GTT), which you would, of course, have to create first
why a GTT? Because it can be shared by multiple users while each of them sees only their own data
have the interactive report fetch and display data from the GTT
have the classic report summarize data from the GTT (a sketch follows below)
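A minimal sketch of that setup, assuming invented table and column names (your real columns would come from the complex query):

-- created once at design time, not per session; rows survive commits within a session
CREATE GLOBAL TEMPORARY TABLE test_results_gtt (
    student_id   NUMBER,
    student_name VARCHAR2(100),
    score        NUMBER,
    grade        VARCHAR2(2)
) ON COMMIT PRESERVE ROWS;

-- the button's page process would run the expensive query once, e.g.:
-- INSERT INTO test_results_gtt (student_id, student_name, score, grade)
--   SELECT ... ;  -- your existing complex query goes here

-- interactive report region source
SELECT student_id, student_name, score, grade
FROM   test_results_gtt;

-- classic (summary) report region source
SELECT grade, COUNT(*) AS num_results
FROM   test_results_gtt
GROUP  BY grade
ORDER  BY grade;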
May I have your opinion on the queries below, please:
Option 1:
I have a SELECT script handy which fetches data by joining many source tables and performs some transformations such as aggregations (GROUP BY), data conversion, substrings, etc.
Can I invoke this script through an ODI mapping, so that the returned results (the transformed data output) can be inserted into the target of the ODI mapping?
Option 2:
Convert the SELECT script into an equivalent ODI mapping by using equivalent ODI transformations, functions, lookups, etc., and use the various tables (the tables in the join clause) as sources of the mapping.
Basically, develop an ODI mapping that is equivalent to the provided SELECT script, plus a target table to insert the records into.
I need to know the pros and cons of both options above (if option 1 is possible at all).
Is it still possible to track transformation errors, source table join errors, WHERE clause condition errors, etc. through ODI with option 1?
Will the log file for a mapping failure have details as granular as those offered by option 2?
Can I still enable Flow Control in the Knowledge Module and redirect SELECT script errors into the E$_ error tables provided by ODI?
Thanks,
Rajneesh
Option 1: ODI 12c includes that concept out of the box. On the physical tab of a mapping, click on the source node (datastore). Then, in the properties pane, there is the CUSTOM_TEMPLATE option under the "Extract Options" menu. This allows you to enter a custom SQL statement that will be used instead of the code generated by ODI.
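Purely as an illustration (the table and column names below are invented), the kind of statement you might paste into CUSTOM_TEMPLATE is essentially your existing SELECT script, e.g.:

-- invented table and column names, for illustration only
SELECT c.customer_id,
       SUBSTR(c.customer_name, 1, 30) AS short_name,
       SUM(o.amount)                  AS total_amount
FROM   customers c
JOIN   orders o ON o.customer_id = c.customer_id
GROUP  BY c.customer_id, SUBSTR(c.customer_name, 1, 30)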
However, it is probably less maintainable over time than option 2. SQL is less visual than mapping components. Also, if you need to make bulk changes, it will be trickier: changing a component in several mappings can be done with the SDK, while changing SQL code would require parsing it. You might indeed have less information in your operator logs, as the SQL would be seen as just one block of code. It also wouldn't provide any lineage.
I believe using Flow Control would work but I haven't tested it.
Option 2 would take more time to complete but with that you would benefit from all the functionalities of ODI.
My own preference would be to occasionally use option 1 for really complex SQL queries but to use option 2 for most of the normal use cases.
I wanted some advice on how to deal with table operations (renaming a column) in Google BigQuery.
Currently, I have a wrapper to do this. My tables are partitioned by date, e.g. if I have a table named fact, I will have several tables named:
fact_20160301
fact_20160302
fact_20160303... etc
My rename-column wrapper generates aliased queries, i.e. if I want to change my table schema from
['address', 'name', 'city'] -> ['location', 'firstname', 'town']
I do a batch query operation:
select address as location, name as firstname, city as town
and do a WRITE_TRUNCATE on the parent tables.
My main issue lies with the fact that BigQuery only supports 50 concurrent jobs. This means that when I submit my batch request, I can only do around 30 partitions at a time, since I'd like to reserve 20 spots for ETL jobs that are running.
Also, I haven't found a way to do a poll_job on a batch operation to see whether or not all jobs in the batch have completed.
If anyone has some tips or tricks, I'd love to hear them.
I can propose two options:
Using a View
Creating views is very simple to script out and execute - it is fast and free, compared with the cost of scanning the whole table with the select-into approach.
You can create a view using the Tables: insert API with the type property set appropriately.
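For illustration (the dataset name is invented), the view's SQL would simply alias the old columns to the new names:

-- legacy SQL view body; dataset and table names are illustrative
SELECT
  address AS location,
  name    AS firstname,
  city    AS town
FROM [mydataset.fact_20160301]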
Using Jobs: insert EXTRACT and then LOAD
Here you can extract the table to GCS and then load it back into BigQuery with the adjusted schema.
The above approach will a) eliminate the cost of querying (scanning) the tables and b) can help with the concurrent-jobs limitation. But it might not; it depends on the actual volume of the tables and other requirements you might have.
The best way to manipulate a schema is through the Google BigQuery API.
Use the tables.get API to retrieve the existing schema for your table: https://cloud.google.com/bigquery/docs/reference/v2/tables/get
Manipulate your schema file, renaming columns, etc.
Again using the API, perform an update on the schema, setting it to your newly modified version. This should all occur in one job: https://cloud.google.com/bigquery/docs/reference/v2/tables/update
Using PDI (Kettle), I am filling the entry stage of my database by utilizing a CSV Input and a Table Output step. This works great; however, I also want to make sure that the data that was just inserted fulfills certain criteria, e.g. fields not being NULL, etc.
Normally this would be a job for database constraints; however, we want to keep the data in the database even if it's faulty (for debugging purposes; it is a pain trying to debug a .csv file...). As it is just a staging table anyway, it doesn't cause any trouble for integrity, etc.
So to do just that, I wrote some SELECT Count(*) as test123 ... statements that instantly show if something is wrong or not and are easy to handle (if the value of test123 is 0 all is good, else the job needs to be aborted).
I am executing these statements using an Execute SQL Statements step within a PDI transformation. I expected the result to be automatically passed to my data stream, so I also used a Copy rows to result step to pass it up to the executing job.
This is the point where the problem is most likely located.
I think that the result of the SELECT statement was not automatically passed to my datastream, because when I do a Simple evaluation in the main job using the variable ${test123} (which I thought would be implicitly created by executing SELECT Count(*) as test123 ...) I never get the expected result.
I couldn't really find any clues to this problem in the PDI documentation so I hope that someone here has some experience with PDI and might be able to help. If something is still unclear, just hint at it and I will edit the post with more information.
best regards
Edit:
This is a simple model of my main job:
Start --> Load data (Transformation) --> Check data (Transformation) --> Simple Evaluation --> ...
You are mixing up a few concepts, if I read your post correctly.
You don't need an Execute SQL script; this is a job for the Table input step.
Just type your query in the Table input and you can preview your data and see it coming from the step into the data stream by using the preview on a subsequent step. The Execute SQL script is not an input step, which means it will not add external data to your data stream.
The output fields are not Variables. A Variable is set using the Set Variables step, which takes a single input row and maps a specific field to a variable, which can be persisted at parent job or root job levels. Fields are just that: fields. They are passed from one step to the next through hops and eventually to the parent job if you have a Copy rows to result step, but they are NOT variables.
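As a sketch (the staging table and column names are hypothetical), the check query you would type into the Table input step could look like the one below. It returns a single row with a single field, which a later Set Variables or Copy rows to result step can hand up to the job:

-- hypothetical staging table and column; 0 means the check passed
SELECT COUNT(*) AS test123
FROM   stage_orders
WHERE  customer_id IS NULL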
Is there a way in PowerCenter 9.1 to get the number of inserts, deletes and updates after an execution of a session? I can see the data in the log, but I would like to see it in a more ordered fashion in a table.
The only way I know requires building the mapping appropriately. You need to have 3 separate instances of the target and use a router to redirect the rows to either TARGET_insert or TARGET_update or TARGET_delete. Workflow Monitor will then show a separate row for the inserted, updated and deleted rows.
There are a few ways:
1. You can use $TgtSuccessRows / $TgtFailedRows and assign them to workflow variables.
2. An Expression transformation can be used with a variable port to keep track of inserts/updates/deletes.
3. You can even query OPB_SESSLOG in a second stream to get the row count inside the same session.
Not sure if PowerCenter 9.1 offers a solution to this problem.
You can design your mapping to populate an audit table to track the number of inserts/updates/deletes (see the sketch below).
You can download a sample implementation from the Informatica Marketplace block titled "PC Mapping : Custom Audit Table":
https://community.informatica.com/solutions/mapping_custom_audit_table
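As a minimal sketch (the table and column names are illustrative only, not the Marketplace solution itself), such an audit table could hold one row per session run:

-- illustrative audit table; adjust types to your target database
CREATE TABLE etl_audit (
    workflow_name  VARCHAR(240),
    session_name   VARCHAR(240),
    run_date       DATE,
    insert_count   INTEGER,
    update_count   INTEGER,
    delete_count   INTEGER,
    reject_count   INTEGER
);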
There are multiple ways. For example, you can create an Assignment task and attach it just after your session; once the session completes its run, the Assignment task passes the session statistics (such as $session.status, $session.rowcount, etc.) from the session into workflow variables defined at the workflow level. Then create a worklet that contains a mapping, pass the session stats captured at the workflow level into the worklet, and from the worklet into the mapping. Once the stats are available at the mapping level, read them there (using a SQL or Expression transformation) and write them to the AUDIT table. Attach the combination of Assignment task and worklet after each session, and it will capture the stats of each session after that session completes its run.
I have some reports written in Crystal 2008 using business views. These reports have a date parameter set up and I have a selection on the date defined in the select expert. However, when I run the report it appears to retrieve all the data from the database and only then filter out based on the date. As you can imagine this slows down the report quite a bit. I also clicked on Database-Show SQL Query and confirmed that the date parameter did not appear in the SQL Query. This behavior seems very strange to me. This did not use to happen to me when I used Crystal 8.5 with dictionaries. Is this a limitation using business views?
I did some searching and found that I can create a report using a database command. This helped improve performance on one of my reports, but when I tried to do something similar on a different report, even though I was using the database command, it still did not appear to be doing the selection on the database before retrieving the data, and the report took forever to run. I also didn't see the selection in the SQL Query.
Do I need to add the parameter to the database command? Will I be able to prompt the user to enter the value when they run the report?
I hope there is a way to do this properly using business views because otherwise I'll have to rewrite all my reports to use another method.
Any ideas or advice are welcome. Thank you very much!
I had a similar problem. I used the command, but my report was still taking longer to run than I had hoped, so I added a WHERE statement to the command to start checking dates from 2009 onward. That sped up my report a little.
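For example (the table, column, and parameter names here are hypothetical), the command can carry the date filter itself; a parameter defined inside the Command editor and referenced as {?StartDate} should still prompt the user for a value at run time:

-- hypothetical table and column names; {?StartDate} is a command parameter
SELECT order_id, order_date, amount
FROM   orders
WHERE  order_date >= {?StartDate}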
You may want to consider creating a stored procedure if you think you are pushing CR to the limit; that may also help speed up the report.
I figured out what the problem is. My business view had fields in it that were formulas. If you try to use selection criteria using a formula, it does not add the criteria to the WHERE clause in the SQL Query. Luckily, I was able to find other fields besides the formula in the business view to do the selection.