PL/SQL UTL_FILE package: read from CSV and load values into a table - Oracle

If I have a CSV file like this:
How can I read this with the UTL_FILE package and load the values into a table which has the columns ItemIdentifier, SoldOnWeb, and SoldInTheShop?

From my point of view, as CSV files can be edited with MS Excel, I'd suggest you rearrange the file and make it uniform. The way it is now, it contains different headings and - to make it worse - they don't "match" (the same column contains both itemidentifier and soldintheshop values; the same goes for the next column).
Add yet another column which would explain what the "sold..." column represents. Finally, you'd have something like this:
itemidentifier amount location
-------------- ------ -------------
1              10     soldOnWeb
2              7      soldOnWeb
3              5      soldOnWeb
1              7      soldInTheShop
2              3      soldInTheShop
Doing so, it is a simple task to insert every value where it belongs.
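For illustration, here is a minimal PL/SQL sketch of that load. It assumes the rearranged file is comma-separated, is exposed through a directory object named CSV_DIR, and the target table is named items_sold - all of these names are hypothetical placeholders, not anything from the question:

DECLARE
  l_file UTL_FILE.FILE_TYPE;
  l_line VARCHAR2(4000);
BEGIN
  l_file := UTL_FILE.FOPEN('CSV_DIR', 'items.csv', 'R');  -- hypothetical directory and file names
  LOOP
    BEGIN
      UTL_FILE.GET_LINE(l_file, l_line);
    EXCEPTION
      WHEN NO_DATA_FOUND THEN EXIT;  -- GET_LINE raises this at end of file
    END;
    -- split the comma-separated line into its three fields and insert
    INSERT INTO items_sold (itemidentifier, amount, location)
    VALUES (TO_NUMBER(REGEXP_SUBSTR(l_line, '[^,]+', 1, 1)),
            TO_NUMBER(REGEXP_SUBSTR(l_line, '[^,]+', 1, 2)),
            REGEXP_SUBSTR(l_line, '[^,]+', 1, 3));
  END LOOP;
  UTL_FILE.FCLOSE(l_file);
  COMMIT;
END;
/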
Otherwise - with the file laid out as it is now - can it be done in PL/SQL? Probably. Will it be difficult? Probably, as you have to "remember" what you're selecting in each row and - according to that - insert values into the appropriate columns of the table.
You know how it goes ... garbage in, garbage out.

Related

Power Query Replace null values with values from another column

I am working with data imported from a PDF file. There is an extra column in the Power Query import (Data.Column7), containing data that belongs in the adjacent columns on either side (Data.Column6 and Data.Column8). Columns 6 and 8 have null values in the cells where the data was pushed into Column 7. I would like to replace the null values in Columns 6 and 8 with the correct data from Column 7, leaving all other values in Columns 6 and 8 as they are.
After looking at the post here:
Power Query / Power BI - replacing null values with value from another column
and watching this video:
https://www.youtube.com/watch?v=ikzeQgdKA0Q
I tried the following formula:
= Table.ReplaceValue(#"Expanded Data",null, each _[Data.Column7] ,Replacer.ReplaceText,{"Data.Column6","Data.Column8"})
(Note, "Expanded Data" is the last step before this Replace Value step.)
I am not getting any kind of syntax error, but the Replace Value step isn't doing anything at all. My null values in Columns 6 and 8 have not been replaced with the correct data from Column 7.
Any insight into how to achieve replacement would be greatly appreciated. Thank you.
(I should mention, I am a new Power Query user, so please be detailed and assume I know nothing!)
I'm sure there must be some way to do this with the ReplaceValue function, but I think it might be easier to do the following:
1: Create a new column with definition NewData6 = if [Data.Column6] = null then [Data.Column7] else [Data.Column6]
2: Do the same thing for 8: NewData8 = if [Data.Column8] = null then [Data.Column7] else [Data.Column8]
3: Delete Data.Column6/7/8
4: Rename the newly made columns if necessary.
You can do these steps either in the advanced editor, or just use the create custom column button in the add column tab.
If the columns are of the text data type, they might contain empty strings instead of actual nulls.
Try replacing null with "" in your formula.
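A further note, hedged since it depends on your data types: Replacer.ReplaceText only operates on text values and silently skips nulls, which would explain why the original step appeared to do nothing. Swapping in Replacer.ReplaceValue (the standard M replacer for whole values) should make the original formula work as written:

= Table.ReplaceValue(#"Expanded Data", null, each _[Data.Column7], Replacer.ReplaceValue, {"Data.Column6","Data.Column8"})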

Oracle Apex: show aggregate as top row

Currently I use an interactive report in Oracle Apex:
     Column1 Column2
A1   1       2
A2   2       3
A3   3       4
A4   4       5
--------------------
     10      14
After showing the data, I do some calculations, such as summing Column1 and Column2 using the SUM aggregate function of the interactive report.
But the result shows as the last row of the report, so I have to scroll to see it.
How can I show the result as the first row of the report?
     Column1 Column2
     10      14
--------------------
A1   1       2
A2   2       3
A3   3       4
A4   4       5
This can be done using JavaScript/jQuery (of course this is only one of many ways this can be done, but it works).
The SQL statement that is used to create the report, along with all the filter and search information, is stored in the WWV_FLOW_WORKSHEET tables.
Create a database package to generate that SQL, sum the columns you want from it, and return the result as XML.
Call that database package via a jQuery AJAX call, parse the XML, and use jQuery to add a row to the top (and bottom, if you like) of the worksheet table and place the results there.
The JavaScript/jQuery function should be called from an after refresh dynamic action for the worksheet in question.
function htmldbIRTotals(worksheet, columnList) {
  var sessionId = $("#pInstance").val();
  var worksheetId = $(worksheet.triggeringElement).find(".a-IRR-table:last").attr("id");
  var baseReportId = parseInt("0" + $("#" + worksheetId + "_rpt_saved_reports").val(), 10);
  // pkg is the name of your database package from the step above
  var u = pkg + ".getTotals?p_worksheet_id=" + worksheetId +
          "&p_base_Report_Id=" + baseReportId +
          "&p_session_id=" + sessionId +
          "&p_columns=" + columnList;
  $.post(u, function(data) {
    var xml = $.parseXML(data);
    // parse the XML here and prepend a totals row to the worksheet table
  });
}
With sessionId, worksheetId, and baseReportId you can retrieve all the info you need
columnList is a colon-separated list of the columns that you want to total
You could even get fancier and add different aggregates (e.g. AVG, MIN, MAX, etc.)
Example of this at the link below.
http://apps.htmldb.com/apps/f?p=htmldb:knowledgebase:::NO:RP:P5_ID:84315
The problem is that the totals may or may not be rendered on the page depending on the page size and the number of records in the result set - so with an Interactive Report, unless you force it to always show All Records, the page might not even be able to use JS tricks to move the summary line to the top.
My preference in this sort of situation is to add a separate region to the page based on a simple SQL query. It will be shown above the interactive report. Unfortunately the columns won't be aligned, however.
I found the solution:
1. Add a subquery to calculate the sum.
2. Union it with the existing result.
3. Use the highlight function to mark up the new row.
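A minimal sketch of such a report query, using hypothetical names (base table t with columns label_col, column1, and column2); the extra sort_order column keeps the total row on top:

select 0 as sort_order, 'Total' as label_col,
       sum(column1) as column1, sum(column2) as column2
from t
union all
select 1 as sort_order, label_col, column1, column2
from t
order by sort_order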

How to update a column with the concatenation of two other columns in the same table

I have a table with 3 columns: a, b and c. I want to know how to update the value of the third column with the concatenation of the two other columns in each row.
before update
a  b  c
---------
1  4
2  5
3  6
after update
a  b  c
---------
1  4  1_4
2  5  2_5
3  6  3_6
How can I do this in Oracle?
Use the concatenation operator ||:
update mytable set
c = a || '_' || b
Or better, to avoid having to rerun this whenever rows are inserted or updated:
create view myview as
select a, b, a || '_' || b as c
from mytable
Firstly, you are violating the rules of normalization. You should rethink the design. If you have the values in the table columns, then to get a computed value all you need is a SELECT statement that fetches the result the way you want. Storing computed values is generally a bad idea and considered bad design.
Anyway,
Since you are on 11g, if you really want a computed column, then I would suggest a VIRTUAL COLUMN rather than manually updating the column. There is a lot of overhead involved in an UPDATE statement, and a virtual column would avoid it entirely. You would also get rid of the manual effort and the lines of code needed to do the update; Oracle does the job for you.
Of course, you will use the same condition of concatenation in the virtual column clause.
Something like,
Column_c varchar2(50) GENERATED ALWAYS AS (column_a||'_'||column_b) VIRTUAL
Note: there are certain restrictions on its use, so please refer to the documentation before implementing it. However, for the simple use case the OP provided, a virtual column is a straight fit.
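For the table in the question, a minimal sketch of the full DDL might look like this (note that the existing plain column c has to be dropped first, since a column cannot simply be converted in place; this assumes a and b fit into 50 characters when concatenated):

alter table mytable drop column c;
alter table mytable add (
  c varchar2(50) generated always as (a || '_' || b) virtual
);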
Update: I did a small test and there were a few observations. Please read this question for a better understanding of how to implement my suggestion.

How to repeat a table on one report page

The report works fine, but now I'm trying to repeat the table on the same page and I can't find an example of how to do this.
I also came across only one question about this problem, so maybe I'm missing something obvious about how to solve it.
Could somebody please enlighten me?
You can 'play' with (a single) Tablix details to simulate table repetition.
For example you can create 3 rows of details like this:
row 1: header
row 2: value
row 3: footer space
To obtain a result like this: [screenshot omitted]
You can also use a different layout, for example:
row 1: label/value for column 1
row 2: label/value for column 2
row 3: footer space
If you have a very small table you can also set report columns to fill horizontal space.
This is the result simulating table repetitions, but you can also use a simple Tablix with standard header/details: [screenshot omitted]
This is the result if you use columns and a simple Tablix with standard header/details: [screenshot omitted]

select only new rows in oracle

I have a table with a VARCHAR2 primary key.
It has about 1,000,000 transactions per day.
My app wakes up every 5 minutes to generate a text file by querying only the new records.
It will remember the last point and process only the new records.
Do you have an idea how to query with good performance?
I am able to add a new column if necessary.
What do you think this process should be done with? PL/SQL? Java?
Everyone here is really, really close. However:
Scott Bailey's wrong about using a bitmap index if the table's under any sort of continuous DML load. That's exactly the wrong time to use a bitmap index.
Everyone else's answer about the PROCESSED CHAR(1) CHECK IN ('Y','N') column is right, but missing how to index it; you should use a function-based index like this:
CREATE INDEX MY_UNPROCESSED_ROWS_IDX ON MY_TABLE
(CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END);
You'd then query it using the same expression:
SELECT * FROM MY_TABLE
WHERE (CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END) = 'N';
The reason to use the function-based index is that Oracle doesn't write index entries for entirely NULL values being indexed, so the function-based index above will only contain the rows with PROCESSED_FLAG = 'N'. As you update your rows to PROCESSED_FLAG = 'Y', they'll "fall out" of the index.
Well, if you can add a new column, you could create a Processed column, which will indicate processed records, and create an index on this column for performance.
Then the query should only be for those rows that have been newly added, and not processed.
This should be easily done using sql queries.
Ah, I really hate to add another answer when the others have come so close to nailing it. But
As Ponies points out, Oracle does have a hidden column (ORA_ROWSCN - System Change Number) that can pinpoint when each row was modified. Unfortunately, the default is that it gets the information from the block instead of storing it with each row and changing that behavior will require you to rebuild a really large table. So while this answer is good for quieting the SQL Server fella, I'd not recommend it.
Astander is right there but needs a few caveats. Add a new column needs_processed CHAR(1) DEFAULT 'Y' and add a BITMAP index. For low-cardinality columns ('Y'/'N') the bitmap index will be faster. Once you have that, the rest is pretty easy. But you've got to be careful to select the new rows, process them, and mark them as processed in one step. Otherwise, rows could be inserted while you are processing that would get marked processed even though they have not been.
The easiest way would be to use PL/SQL to open a cursor that selects unprocessed rows, processes them, and then updates each row as processed. If you have an aversion to walking cursors, you could collect the pk's or rowids into a nested table, process them and then update using the nested table.
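A minimal sketch of that cursor walk, using the hypothetical table name my_table and the needs_processed flag suggested above:

DECLARE
  CURSOR c_new IS
    SELECT * FROM my_table
    WHERE needs_processed = 'Y'
    FOR UPDATE;  -- lock the claimed rows so nothing slips in between select and update
BEGIN
  FOR r IN c_new LOOP
    -- ... process the row here, e.g. write it to the text file ...
    UPDATE my_table
       SET needs_processed = 'N'
     WHERE CURRENT OF c_new;
  END LOOP;
  COMMIT;
END;
/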
In the MS SQL Server world where I work, we have a 'version' column of type 'timestamp' on our tables.
So, to answer #1, I would add a new column.
To answer #2, I would do it in PL/SQL for performance.
Mark
"astander" pretty much did the work for you. You need to ALTER your table to add one more column (lets say PROCESSED)..
You can also consider creating an INDEX on the PROCESSED ( a bitmap index may be of some advantage, as the possible value can be only 'y' and 'n', but test it out ) so that when you query it will use INDEX.
Also if sure, you query only for every 5 mins, check whether you can add another column with TIMESTAMP type and partition the table with it. ( not sure, check out again ).
I would also think about writing job or some thing and write using UTL_FILE and show it front end if it can be.
If performance is really a problem and you want to create your file asynchronously, you might want to use Oracle Streams, which will actually get modification data from your redo log without affecting performance of the main database. You may not even need a separate job, as you can configure Oracle Streams to do asynchronous replication of the changes, through which you can trigger the file creation.
Why not create an extra table that holds two columns: the ID column and a processed-flag column. Have an insert trigger on the original table place its ID in this new table. Your logging process can then select records from this new table and mark them as processed. Finally, delete the processed records from this table.
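A minimal sketch of that trigger-and-queue idea, with hypothetical names (my_table with primary key id, and a queue table new_rows):

CREATE TABLE new_rows (
  id        VARCHAR2(100) NOT NULL,
  processed CHAR(1) DEFAULT 'N' NOT NULL
);

CREATE OR REPLACE TRIGGER my_table_air
AFTER INSERT ON my_table
FOR EACH ROW
BEGIN
  INSERT INTO new_rows (id) VALUES (:NEW.id);
END;
/

The 5-minute job then selects from new_rows, writes the file, and deletes (or flags) the rows it has handled.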
I'm pretty much in agreement with Adam's answer. But I'd want to do some serious testing compared to an alternative.
The issue I see is that you need to not only select the rows, but also do an update of those rows. While that should be pretty fast, I'd like to avoid the update. And avoid having any large transactions hanging around (see below).
The alternative would be to add CREATE_DATE date default sysdate. Index that. And then select records where create_date >= (start date/time of your previous select).
But I don't have enough data on the relative costs of setting a sysdate default vs. setting a value of Y, of maintaining the function-based index vs. the date index, and of doing a range select on the date vs. a specific select on the single value Y. You'll probably want to preserve stats or hint the query to use the index on the Y/N column, and definitely want to use a hint on the date column -- the stats on the date column will almost certainly be old.
If data are also being added to the table continuously, including during the period when your query is running, you need to watch out for transaction control. After all, you don't want to read 100,000 records that have the flag = Y, then do your update on 120,000, including the 20,000 that arrived while your query was running.
In the flag case, there are two easy ways: SET TRANSACTION before your select and commit after your update, or start by doing an update from Y to Q, then do your select for those that are Q, and then update to N. Oracle's read consistency is wonderful but needs to be handled with care.
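A minimal sketch of the second (Y to Q to N) variant, again using the hypothetical my_table and needs_processed flag:

UPDATE my_table SET needs_processed = 'Q' WHERE needs_processed = 'Y';
COMMIT;

-- process exactly the rows claimed above
SELECT * FROM my_table WHERE needs_processed = 'Q';

UPDATE my_table SET needs_processed = 'N' WHERE needs_processed = 'Q';
COMMIT;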
For the date column version, if you don't mind a risk of processing a few rows more than once, just update your table that has the last processed date/time immediately before you do your select.
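And a minimal sketch of the date-column version (hypothetical names again; :last_run_time is the point in time your app remembered from its previous run):

ALTER TABLE my_table ADD (create_date DATE DEFAULT SYSDATE NOT NULL);
CREATE INDEX my_table_create_date_ix ON my_table (create_date);

SELECT * FROM my_table WHERE create_date >= :last_run_time;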
If there's not much information in the table, consider making it Index Organized.
What about using materialized view logs? You have a lot of options to play with:
SQL> create table test (id_test number primary key, dummy varchar2(1000));
Table created
SQL> create materialized view log on test;
Materialized view log created
SQL> insert into test values (1, 'hello');
1 row inserted
SQL> insert into test values (2, 'bye');
1 row inserted
SQL> select * from mlog$_test;
   ID_TEST SNAPTIME$$  DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------
         1 01/01/4000  I         N         FE
         2 01/01/4000  I         N         FE
SQL> delete from mlog$_test where id_test in (1,2);
2 rows deleted
SQL> insert into test values (3, 'hello');
1 row inserted
SQL> insert into test values (4, 'bye');
1 row inserted
SQL> select * from mlog$_test;
ID_TEST SNAPTIME$$ DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------
3 01/01/4000 I N FE
4 01/01/4000 I N FE
I think this solution should work.
You need to do the following steps.
For the first run, you will have to copy all records, then execute the following query:
insert into new_table (max_rowid) select max(rowid) from yourtable;
The next time you want to get only the newly inserted values, you can do so by executing the following command:
select * from yourtable where rowid > (select max_rowid from new_table);
Once you are done processing the rows from the above query, simply truncate new_table and insert max(rowid) from yourtable again.
I think this should work and would be the fastest solution.
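That refresh step at the end would look something like this (new_table and yourtable being the hypothetical names used in this answer):

truncate table new_table;
insert into new_table (max_rowid) select max(rowid) from yourtable;
commit;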
