I have some tables in Access that I'm trying to export to CSV so that I can import them into Oracle. I don't use the export via ODBC because some of these tables have 70K - 500K records and that feature takes way too long, and I have about 25 tables to do. So I want to export to CSV (which is much faster) and then load via sqlldr.
Some numeric columns can go out to 16 decimal places and I need them all. However, when I export, they only go out to 2. I've done some googling around this. Regional settings only allow 9 decimal places (Win XP), and formatting the column via a query will change it to text, which I don't want when I import to Oracle (maybe I can use to_number() in the control file?).
Why is this so difficult? Why can't Access just export numeric columns as they are?
In my Access 2007 test case, I'm not seeing quite the same result you described. When I export to CSV, I get all the decimal places.
Here is my sample table with decimal_field as decimal(18, 16).
id  some_text  decimal_field
--  ---------  ------------------
 1  a          1.0123456789012345
 2  b          2
Unfortunately, those exported decimal_field values are quoted in the CSV:
"id","some_text","decimal_field"
1,"a","1.0123456789012345"
2,"b","2"
The only way I could find to remove the quotes surrounding the decimal_field values also removed the quotes surrounding genuine text values.
If quoted numeric values are unworkable, perhaps you could create a VBA custom CSV export procedure, where you write your values to each file line formatted as you wish.
Regarding "Why is this so difficult?", I suspect decimal data type as the culprit. I don't recall encountering this type of problem with other numeric data types. Unfortunately, that's only my speculation and won't help even if it's correct.
Create a query selecting all the records from your table. Format the troublesome column using the Format function, with enough decimal placeholders for the precision you need (16 in your case):
Select Format(Fieldname, "0.0000000000000000") AS FormattedField
Save this query and export the query instead of the table.
One disadvantage of this approach is that your numeric field is then treated as text, so you get quotes around the exported numbers. And if you use the option not to enclose text in quotes, then any actual text fields you export in the same query lose their quotes too.
The other (quicker, dirtier, bodge job) method is to export first into Excel and from there to text. This leaves decimal places intact, but obviously it's not very elegant.
I have an analysis that contains one hidden column. When I export the result to an .xlsx file, it works right: the hidden column doesn't print and the calculation works fine. But when I export it to .csv - with either a ';' delimiter or a tab delimiter - the hidden column appears.
There is no way to exclude this column from the analysis definition, because a field that I need to calculate depends strongly on the hidden column. I also can't just remove the column and add the calculation myself, because after export this file is automatically imported into a database which doesn't have enough space to run such an operation every month till forever. Is there any way not to print the hidden column and keep the prepared calculation while exporting to CSV?
No. CSV exports exactly what's in the analysis. That's its point and task. You can always clone your analysis, prepare the columns as you need and then just expose it as a download link.
CSV = exact, pure raw data as it's in the analysis construction
Excel = formatted based on what's rendered visually
In one scenario we are dynamically creating SQL to create temp tables on the fly. There is no issue with the table_name, as it is decided by us; however, the column names are provided by sources not in our control.
Usually we would check the column names using the query below:
select ..
where NOT REGEXP_LIKE (Column_Name_String,'^([a-zA-Z])[a-zA-Z0-9_]*$')
OR Column_Name_String is NULL
OR Length(Column_Name_String) > 30
However, is there any built-in function which can do a more extensive check? Also, any input on the above query is welcome as well.
Thanks in advance.
Final query, based on the answers below:
select ..
where NOT REGEXP_LIKE (Column_Name_String,'^([a-zA-Z])[a-zA-Z0-9_]{0,29}$')
OR Column_Name_String is NULL
OR Upper(Column_Name_String) in (select Upper(RESERVED_WORDS.Keyword) from V$RESERVED_WORDS RESERVED_WORDS)
In particular, I'm not happy with characters like $ in column names either, hence I won't be using
dbms_assert.simple_sql_name('VALID_NAME')
Instead, with the regexp I can decide my own set of characters to allow.
This answer does not necessarily offer either a performance or logical improvement, but you can actually validate the column names using a single regex:
SELECT ...
WHERE NOT
REGEXP_LIKE (COALESCE(Column_Name_String, ' '), '^([a-zA-Z])[a-zA-Z0-9_]{0,29}$')
This works because:
It uses the same pattern to match columns, i.e. starting with a letter and afterwards using only alphanumeric characters and underscore
NULL column names are coalesced to a single space, which fails the regex (note that in Oracle the empty string '' is itself NULL, so coalescing to '' would leave the condition unknown rather than true)
We use a length quantifier {0,29} to check the column length directly in the regex
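Not part of the validation query itself, but a quick way to sanity-check the pattern against a few made-up names:
-- hypothetical sample names run through the same pattern
SELECT name,
       CASE
         WHEN REGEXP_LIKE(COALESCE(name, ' '), '^([a-zA-Z])[a-zA-Z0-9_]{0,29}$')
         THEN 'valid'
         ELSE 'invalid'
       END AS verdict
FROM (SELECT 'GOOD_NAME'         AS name FROM dual UNION ALL
      SELECT '1LEADS_WITH_DIGIT' AS name FROM dual UNION ALL
      SELECT 'HAS$DOLLAR'        AS name FROM dual UNION ALL
      SELECT NULL                AS name FROM dual);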
" is there any build in function which can do a more extensive check."
Oracle has the DBMS_ASSERT.SIMPLE_SQL_NAME() function. This returns the passed name if it meets the Oracle naming rules ...
select dbms_assert.simple_sql_name('VALID_NAME') from dual;
... and hurls ORA-44003 if the name is invalid.
Valid names permit any characters if the name is double-quoted (yuck, but then so is creating "temp tables on the fly"). Also, the function doesn't check the length of the name, so you will still need to validate that yourself.
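For illustration, the behaviour described above looks like this (the quoted value is just an example):
-- quoted identifiers pass, even with otherwise-illegal characters
select dbms_assert.simple_sql_name('"spaces & $ ok"') from dual;
-- an unquoted name that breaks the naming rules raises ORA-44003
select dbms_assert.simple_sql_name('1INVALID') from dual;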
Find out more in the docs.
Also here is a SQL Fiddle.
"creating a table with comment column is not possible as its a invalid identifier"
Fair point. DBMS_ASSERT is primarily aimed at preventing SQL injection, so it verifies that a value conforms to Oracle's naming rules, not that the value is a valid Oracle name. To catch things like comment you will also need to check the value against V$RESERVED_WORDS, probably where reserved != 'Y'. As this is a V$ view, SELECT on it is not granted by default; if you don't have access you'll need to ask your friendly DBA to help out.
" For validating column names I believe I should check with the entire list"
Up to you. The distinction is that some keywords can legitimately be used as identifiers. For instance, TYPE only became a keyword in Oracle version 8, when they introduced the object-relational stuff. But there were a lot of tables and views in existing systems which used 'TYPE' as a column name (not least the Oracle data dictionary). If Oracle had made TYPE a properly reserved word it would have broken all those systems, so the list of reserved words which cannot be used as identifiers is a sub-set of all the Oracle keywords.
Opinions on the general task:
"we are getting data from external sources (files) and the job of the program/script is to push that data to oracle tables."
There are two parts to this task.
The first is that you should have agreed a standard format for these files with the third parties. There should be no need to discover the files' structure or content. (Or, if there is such a need because the files are randomly sourced from a carousel of third parties, you probably should not be using a relational database but something else: Endeca? The Python Pandas library?)
The second is creating tables on the fly. If you have an agreed file structure then you should be loading into standard tables, using either SQL*Loader or external tables according to your circumstances. If you're on 12c, maybe SQL*Loader Express Mode could be of interest.
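As a rough sketch of what the external-table route looks like, assuming an agreed three-column CSV layout, a server directory object called data_dir and a file called feed.csv (all of these names are assumptions):
-- staging table over the agreed CSV layout; adjust the columns to the real format
CREATE TABLE feed_stage_ext (
  id        NUMBER,
  some_text VARCHAR2(100),
  amount    NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('feed.csv')
)
REJECT LIMIT UNLIMITED;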
I am exporting data from an Oracle table into a CSV file. I have a column of varchar2 datatype and it has values like 1.1 and 1.10. When I export these to a CSV file the value 1.10 becomes 1.1, thus creating duplicate records. Is there a way to get both values, 1.10 and 1.1, into the CSV file without losing the trailing zero in "1.10"?
Thanks
When I export these to a csv file the value 1.10 becomes 1.1 and thus creating duplicate records.
This has nothing to do with Oracle. It is a display problem with the tool you are using; use proper formatting of cells to display the required number of decimal places.
Also, as numbers, 1.1 and 1.10 are the same. Appending zeroes to the right of the decimal point makes no difference to the value.
Excel Text Formatting
Right click on the cell.
Select Format Cells.
In the first tab Number, select Text.
Click OK.
Text format cells are treated as text even when a number is in the cell.
The cell is displayed exactly as entered.
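Alternatively, if the file is destined to be opened directly in Excel, a workaround sometimes used on the export side (not part of the formatting steps above, and the table and column names here are made up) is to emit the value as an Excel text formula so the trailing zero survives:
-- hypothetical export query: Excel treats ="1.10" as text and keeps the zero
SELECT '="' || version_no || '"' AS version_no_for_excel
FROM   my_versions;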
For a few columns from the source, i.e. a .csv file, we have values like 1:52:00 and 14:45:00.
I am supposed to load these into an Oracle table.
Which data type should I choose in Target as well as source?
Should I be doing anything in the expression transformation?
Use SQL*Loader to load the data into the database with the time format described in the link below, i.e. 'HH24:MI:SS':
http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements004.htm
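A minimal control-file sketch along those lines; the file, table and column names are placeholders, and the target column is assumed to be a DATE:
LOAD DATA
INFILE 'times.csv'
APPEND INTO TABLE time_stage
FIELDS TERMINATED BY ','
(
  -- LPAD turns 1:52:00 into 01:52:00; the date part defaults to the first of the current month
  event_time "TO_DATE(LPAD(:event_time, 8, '0'), 'HH24:MI:SS')"
)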
Oracle does not support time-only values; it supports dates (with a time component).
You have a few options:
Store the value as a string, perhaps providing a leading zero for the hour.
Store the value as the number of seconds (or minutes) past midnight.
Store the value as the time component of some arbitrarily defined date, for example 0001-JAN-01 01:52:00 and 0001-JAN-01 14:45:00. Tell your report writers to ignore the date portion of the value.
Your source datatype will be string(8). Use LPAD to add leading zeroes.
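A hedged SQL illustration of that padding, using a made-up staging table and column:
-- '1:52:00' -> '01:52:00', stored as the time portion of an arbitrary date
SELECT LPAD(time_str, 8, '0')                        AS padded_time,
       TO_DATE(LPAD(time_str, 8, '0'), 'HH24:MI:SS') AS time_as_date
FROM   time_stage_raw;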
I have one field when importing that can contain large data. It seems that CSV has an unofficial limitation of about 65000 (likely 65535*) characters, as both LibreOffice Calc and Magento truncate the data for that particular field. I have investigated well and I'm certain it is not because of a special character or quotes; the data is pretty straightforward and the lines are similar in format to each other.
Question: How do I increase that size? Or at least, where should I look to find it?
Note: I counted in LibreOffice Writer and it was about 65040, but probably with carriage return characters it could reach 65535.
I changed:
1) in the table catalog_category_entity_text, the type of the field "value" from "text" to "longtext"
2) in the file app/code/core/Mage/ImportExport/Model/Import/Entity/Abstract.php
const DB_MAX_TEXT_LENGTH = 65536;
to
const DB_MAX_TEXT_LENGTH = 16777215;
and everything works OK.
You are right, there is a limitation in Magento, because it sets text fields as TEXT in the MySQL database and, according to the MySQL docs, this kind of field supports a maximum of 65535 chars.
http://dev.mysql.com/doc/refman/5.0/es/storage-requirements.html
So you could change the column type in your Magento database to use MEDIUMTEXT. I guess the correct place is in the catalog_product_entity_text table, where you should modify the 'value' field type to match your needs. But please, keep in mind this is dangerous. Make a full backup before trying. And you may even need to play with core files... not recommended!
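If you do go down that road, the change itself is a single statement (take a full backup first, as noted above; the table and column are the ones guessed at):
-- widen the EAV text column; attributes not specified here revert to their defaults
ALTER TABLE catalog_product_entity_text MODIFY `value` MEDIUMTEXT;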
I'm having the same issue with 8 products from a list of more than 400, and I think I'm not going to mess with Magento core and database, we can reduce the description strings for those few products.
The CSV format itself couldn't care less. Because Microsoft Access allows Memo fields which can contain quite a bit of data, I've exported 2-3k descriptions in CSV format to be imported into Magento quite successfully.
Your limitation comes either from a spreadsheet that has a cell or export size limitation, or from the field you are trying to import into having a maximum character limit set in its table.
You can determine the latter by using phpMyAdmin to see what the maximum character setting is for that field.
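If you prefer a query to clicking through phpMyAdmin, the same information is available from MySQL directly (the table and column names are assumed):
-- show the column definition, including its type
SHOW COLUMNS FROM catalog_product_entity_text LIKE 'value';
-- or ask information_schema for the exact maximum length
SELECT data_type, character_maximum_length
FROM   information_schema.columns
WHERE  table_name = 'catalog_product_entity_text' AND column_name = 'value';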