Oracle SQL*Loader: new line at the end of the file

I have a file in the format x,y which should be loaded using SQL*Loader:
x,y
a,b
(new line)
All data is loaded to table A(x,y) where x,y are varchar2 - this step passes successfully.
Next step is processing loaded data - i.e. transforming data to proper formats etc.
At this step I get into trouble, because column y is converted to a number (it stores numbers). Due to the new line at the end of the file, the last record gets corrupted and the to_number conversion fails.
How could this be solved?

Contact the provider of the data and have them correct the process that creates the file.
Use the LOAD= parameter of SQL*Loader to restrict how many records are loaded: LOAD=(number of lines in the file - 1).
Pre-process the file by stripping blank lines or removing special characters before loading.
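The pre-processing option can be sketched in Python. This is a minimal sketch, assuming the file is plain text and that any line containing only whitespace should be dropped; the function name `strip_blank_lines` and the file paths are hypothetical.

```python
def strip_blank_lines(src_path, dst_path):
    """Copy src_path to dst_path, dropping empty or whitespace-only lines.

    Returns the number of lines kept. newline="" leaves the original
    line endings (\n or \r\n) untouched.
    """
    kept = 0
    with open(src_path, "r", newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        for line in src:
            if line.strip():  # False for "", "\n", "\r\n", "   \n"
                dst.write(line)
                kept += 1
    return kept

# Hypothetical usage before running SQL*Loader on the cleaned file:
# strip_blank_lines("data.csv", "data_clean.csv")
```

The returned count could also be used as the value for LOAD= if you prefer to leave the original file unchanged.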

Related

Etext Oracle : How to get the count of rows on the first line of output instead of last

I am trying to create a Headcount report where I have to present a trailer record showing the count of the rows present in my main record in a separate file.
So I am simply creating an empty record with no attributes for the main record and then running a count of the rows present in that record. The problem is that my count line is written only after all rows of the main record have been processed. Those rows are just empty blank lines, so my count line ends up at the bottom. I want a way to write the count line as the first line instead of the last.
As you can see in the image, after 4 blank lines the 5th line is the count line, showing a count of 4. How can I write this line as the first line?
I am also adding screenshots of my etext template and the XML from which this data is being retrieved.
etext template
xml file with sensitive data blacked out

data factory special character in column headers

I have a file I am reading into a blob via Data Factory.
It's formatted in Excel. Some of the column headers have special characters and spaces, which isn't good if I want to take it to CSV or Parquet and then SQL.
Is there a way to correct this in the pipeline?
Example
"Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks
Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
Uncheck "Auto mapping".
Click "+ Add mapping"
For the column name, enter "true()" to process all columns.
Enter an appropriate expression to rename the columns. This example uses regular expressions to remove any character that is not a letter.
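The effect of such a rename rule is easy to check outside Data Flow. Below is a minimal Python sketch of the same idea (not Data Flow expression syntax), assuming "not a letter" means anything outside A-Z and a-z; `clean_header` is a hypothetical helper, and the sample headers are the ones from the question.

```python
import re

def clean_header(name):
    """Remove every character that is not an ASCII letter."""
    return re.sub(r"[^A-Za-z]", "", name)

headers = [
    "Activations in last 15 seconds high+Low",
    "first entry speed (serial T/a)",
]
cleaned = [clean_header(h) for h in headers]
print(cleaned)
# ['ActivationsinlastsecondshighLow', 'firstentryspeedserialTa']
```

Note that digits are stripped as well, so two headers that differ only in digits or punctuation would collide after renaming.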
SPECIAL CASE
There may be an issue with this if the column name contains forward slashes ("/"). I accidentally came across this in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this would be the case as Data Flow is clearly aware of the column name. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal:
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on just the first row, you could remove the special characters.
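The "first row only" idea can be illustrated outside Data Flow. This is a rough Python sketch, assuming the file has already been converted to delimited text; `clean_first_row` is a hypothetical helper, and keeping letters, digits and underscores is an assumption about which characters are acceptable downstream.

```python
import csv
import io
import re

def clean_first_row(text, delimiter=","):
    """Clean the header names in the first line; leave all other lines as-is."""
    lines = text.splitlines(keepends=True)
    if not lines:
        return text
    first = lines[0]
    body = first.rstrip("\r\n")
    ending = first[len(body):]  # preserve the original line ending
    # Parse the header with csv so quoted names containing the delimiter survive.
    header = next(csv.reader([body], delimiter=delimiter))
    cleaned = [re.sub(r"[^A-Za-z0-9_]", "", name) for name in header]
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator=ending).writerow(cleaned)
    return out.getvalue() + "".join(lines[1:])
```

Only the header line is rewritten, so data rows that legitimately contain special characters are not affected.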

Oracle OBIEE (BI): export result of analysis without hidden columns to CSV

I have an analysis which contains one hidden column. When I export the result to an .xlsx file, it works correctly: the hidden column isn't printed and the calculation works fine. But when I export it to .csv, with either the ';' delimiter or the tab delimiter, the hidden column appears.
I can't remove this column from the analysis definition, because a field that I need to calculate depends heavily on the hidden column. I also can't leave the export as it is and then remove the column and redo the calculation myself, because the exported file is automatically imported into a database that does not have enough space for such an operation every month, indefinitely. Is there any way to omit the hidden column and keep the prepared calculation when exporting to CSV?
No. CSV exports exactly what's in the analysis. That's its point and task. You can always clone your analysis, prepare the columns as you need and then just expose it as a download link.
CSV = exact, pure raw data as it's in the analysis construction
Excel = formatted based on what's rendered visually

Power Query – File names loaded from folder become column names, causing failure if new files are later loaded

Power Query sourcing multiple Excel files from a folder.
Files are monthly transactions. The month and year are part of the file names. When the next month comes, new files (in the same format of course, but with new file names) replace the previous ones in the source folder. Having the new file names causes the query to fail on refresh in the following way.
When the files are combined and displayed to begin the transformations, the file names constitute a column of data (named Source). One of my steps in transforming the data is to “use first row as headers”; at this point the first file name in that Source column becomes its column header name.
The problem is that when files having new names replace the previous ones, that column name is no longer found, since the row promoted to be the column header is the name of a new file. PQ is looking for a column header having the original file name and doesn’t find it, so subsequent transformations using that column cause errors.
The error message is: “[Expression.Error] The column ‘[OriginalFileName]’ of the table wasn’t found.”
Basically, that original file name takes on a permanent role as a column name that is part of the query.
I successfully managed to get around the problem by manually renaming all the columns instead of promoting the first data row to be the column headers. Now files with new names are processed without complaint. But this solution is clunky and I would like to keep the step of promoting the first row to be the header.
Does anyone know how to overcome this problem?

How do I validate data in a file in SSIS before inserting into a database?

What I want to do is take data from a dbf file and insert it into a table, which I've already done. Since there are many files, a For-Each Container is being used. However, before inserting the data into a table, I want to look at the date fields and compare them to a date variable. If the dates match the variable, then move on to the next step of the flow. But if any of the dates don't match the variable, then that file and its contents are discarded and the next file is looked at.
How do I accomplish this in SSIS?
You're looking for the Conditional Split Component within your Data Flow Task.
Assuming your source column is MyDate and you have an SSIS Variable called @[User::ReferenceDate] then you'd apply an expression like
[MyDate] == @[User::ReferenceDate]
That will evaluate to True when the dates match, False otherwise.
In your Conditional Split, add a row into the component.
OutputName: DatesMatched
Condition: [MyDate] == @[User::ReferenceDate]
Default output name: DatesUnmatched
Now when you connect the output from this to your destination, it'll ask whether you want to route the data using the DatesMatched or DatesUnmatched path. Use the DatesMatched path.
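For readers less familiar with the component, the routing can be pictured as a plain partition of rows. This is only a Python sketch of the logic, not SSIS code; `conditional_split` is a hypothetical function and rows are assumed to be dictionaries with a `MyDate` key.

```python
from datetime import date

def conditional_split(rows, reference_date):
    """Route each row to DatesMatched or DatesUnmatched (the default output)."""
    dates_matched, dates_unmatched = [], []
    for row in rows:
        if row["MyDate"] == reference_date:
            dates_matched.append(row)
        else:
            dates_unmatched.append(row)
    return dates_matched, dates_unmatched

rows = [
    {"MyDate": date(2024, 1, 31), "Amount": 10},
    {"MyDate": date(2024, 2, 1), "Amount": 20},
]
matched, unmatched = conditional_split(rows, date(2024, 1, 31))
```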
As I re-read this ("if any of the dates don't match the variable, then that file and its contents are discarded"), you're looking at processing the file twice. The first pass reads it all in and validates it. The second pass, which is optional, actually loads it to the database.
From your Conditional Split, add a Row Count component to the DatesUnmatched path. Use a Variable of type Integer/Int32 named CountDatesUnmatched. In a perfect world, that will be zero when the validation of the file completes.
In the Precedence Constraint between the validation Data Flow and the actual import Data Flow, double-click the connector line and change the evaluation operation from Constraint to Expression and Constraint. Leave the value as Success and in the Expression use @[User::CountDatesUnmatched] == 0. That data flow will only light up if both conditions are true: parsing was successful and no rows were sent to the Row Count component.
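The validate-then-load control flow can be sketched as follows. This is a Python illustration of the logic only, not SSIS code; `validate_file`, `process_file` and the `load` callback are hypothetical names, and rows are assumed to be a list of dictionaries with a `MyDate` key.

```python
def validate_file(rows, reference_date):
    """First pass: count rows whose date differs (the Row Count on DatesUnmatched)."""
    return sum(1 for row in rows if row["MyDate"] != reference_date)

def process_file(rows, reference_date, load):
    """Second pass runs only when the first pass found no unmatched dates."""
    count_dates_unmatched = validate_file(rows, reference_date)
    if count_dates_unmatched == 0:  # mirrors the precedence constraint expression
        load(rows)
        return True
    return False  # file is discarded; the loop moves on to the next file
```

Because `rows` is read twice, it has to be a list (or be re-read from the file) rather than a one-shot iterator, which is the double processing mentioned above.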
Finally, you can cheat and sometimes this approach makes sense. If you're using an OLE DB Destination, then you can use the MaximumInsertCommitSize of the default 2B and a data access mode of fast load. This translates to "Everything is going to commit or none of it is". That can lock up your target table and cause your transaction log to grow heavily depending on how much data you're loading. Use the Conditional Split as described above but for the DatesUnmatched path, induce a failure. A Derived column with divide by zero or a script task with an explicit FireError event will cause that transaction to go belly up. You'd need to do some magic in the OnError event handler to not abort the overall file processing but it's a lazy hack (or one that is useful when double reading the file is prohibitive but impacting the database is less so)
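The "everything commits or none of it does" behaviour can be demonstrated with a single transaction. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the OLE DB Destination; the table `target`, its columns and the function name are assumptions made for illustration.

```python
import sqlite3

def load_all_or_nothing(conn, rows, reference_date):
    """Insert all rows in one transaction; any mismatched date rolls it all back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for my_date, value in rows:
                if my_date != reference_date:
                    # Induced failure, analogous to the divide-by-zero trick.
                    raise ValueError(f"date {my_date} does not match {reference_date}")
                conn.execute(
                    "INSERT INTO target (my_date, value) VALUES (?, ?)",
                    (my_date, value),
                )
    except ValueError:
        return False  # swallow the error so the next file can still be processed
    return True
```

Catching the exception plays the role of the OnError handling mentioned above: the failed file leaves no rows behind, and the loop over files continues.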
