We're migrating our Oracle Forms and Oracle Reports applications from 6i to 10g on Windows 7. When we moved users to the new Windows 7 PCs, they reported that several reports and some forms that generate CSV files were producing incomplete data or blank files (no records, just headers).
Looking around, we found out that the problem shows up when we use a BETWEEN clause like this:
SELECT id, name, lastname FROM employee WHERE date_start BETWEEN :P_INIT_DATE AND :P_FINAL_DATE
The resulting file was blank or contained records with mismatched dates, so we deduced there was some problem between the Windows 7 date handling and the Oracle database, or something else we haven't identified yet. We were able to work around all of this with a double conversion, TO_DATE(TO_CHAR(:P_DATE)). But now, when we generate a CSV file from Forms 10g using CLIENT_TEXT_IO.PUT_LINE, we're seeing strange behavior: WebUtil starts writing the file, but when it reaches a certain number of lines it starts overwriting the same file from the beginning again. So when you open the file in Excel you only see the last X lines.
I would really appreciate any help fixing these problems. There is no specific question; I'm just explaining the problem we have and looking for help.
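For reference, the double conversion we ended up with looks roughly like the sketch below; the DD/MM/YYYY format mask is only an illustration, since the real mask depends on the NLS settings of the client and the database.

SELECT id, name, lastname
  FROM employee
 WHERE date_start BETWEEN TO_DATE(TO_CHAR(:P_INIT_DATE, 'DD/MM/YYYY'), 'DD/MM/YYYY')
                      AND TO_DATE(TO_CHAR(:P_FINAL_DATE, 'DD/MM/YYYY'), 'DD/MM/YYYY')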
CLIENT_TEXT_IO caches records before writing them to your file. I've seen several different thresholds in the range you cite. If your Form code issues a SYNCHRONIZE; every so many records written, the cache will be flushed each SYNCHRONIZE. I'm not writing large files at the moment, but in the past 100 records per SYNCHRONIZE has worked well. Check your timings carefully; 100 may be too few records per SYNCHRONIZE. Since the number I've seen varies from shop to shop, I'd wager it's NOT related solely to number of records, but how many bytes you stuff into your cache.
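A minimal sketch of that pattern, assuming a made-up block name, item names, and file path (the flush interval of 100 is just the starting point mentioned above and should be tuned):

DECLARE
  out_file CLIENT_TEXT_IO.FILE_TYPE;
  line_cnt NUMBER := 0;
BEGIN
  out_file := CLIENT_TEXT_IO.FOPEN('C:\temp\export.csv', 'w');
  GO_BLOCK('EMPLOYEE');
  FIRST_RECORD;
  LOOP
    -- Write one CSV line from the current record
    CLIENT_TEXT_IO.PUT_LINE(out_file,
      :EMPLOYEE.ID || ',' || :EMPLOYEE.NAME || ',' || :EMPLOYEE.LASTNAME);
    line_cnt := line_cnt + 1;
    -- Flush the client-side cache every 100 records written
    IF MOD(line_cnt, 100) = 0 THEN
      SYNCHRONIZE;
    END IF;
    EXIT WHEN :SYSTEM.LAST_RECORD = 'TRUE';
    NEXT_RECORD;
  END LOOP;
  CLIENT_TEXT_IO.FCLOSE(out_file);
END;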
I thought for sure this would be an easy issue, but I haven't been able to find anything. In SQL Server SSMS, if I run a SQL statement, I get back all the records of that query, but in Oracle SQL Developer I can apparently get back at most 200 records, so I cannot really test the speed or look at the data. How can I increase this limit to be as much as I need, to match how SSMS works in that regard?
I thought this would be a quick Google search, but it seems very difficult to find, if it is even possible. I found one article on Stack Overflow that states:
You can also edit the preferences file by hand to set the Array Fetch Size to any value.
Mine is found at C:\Users\<user>\AppData\Roaming\SQL Developer\system4.0.2.15.21\o.sqldeveloper.12.2.0.15.21\product-preferences.xml on Win 7 (x64).
The value is on line 372 for me. I have changed it to 2000 and it works for me.
But I cannot find that location. I can find the SQL Developer folder, but my system is 19.xxxx and there is no corresponding file in that location. I did a search for "product-preferences.xml" and couldn't find it in the SQL Developer folder. Not sure if Windows 10 has a different location.
As such, is there any way I can edit a config file of some sort to change this setting, or any other way to do it?
If you're testing execution times you're already good. Adding more rows to the result screen is just adding fetch time.
If you want to add fetch time to your testing, execute the query as a script (F5). However, this still has a max number of rows you can print to the screen, also set in preferences.
Your best bet I think is the AutoTrace feature. You can tell it to fetch all the rows, you'll also get a ton of performance metrics and the actual execution plan.
Check the box in the Autotrace preferences that tells it to fetch all the rows, then use the Autotrace button in the worksheet toolbar to run the scenario.
I have a 30-million-row CSV that gets created each month. I am trying to add 2 fields that are populated based on a lookup from a separate file, and to let it run unattended. I am trying to choose the technology right now. I'd rather use a scripting language that can be run from the command line (Windows), and ideally something free, but I'm open to suggestions. A SQL database is not really an option.
Take a look at Pentaho Data Integration. It’s Java based, multi-threaded and can cope with large CSV files at 100k+ rows per second.
You can call it from the command line on either Linux or Windows, and you can parameterize the jobs and transformations to take command-line parameters for things such as file paths, DB connections, etc.
There's a paid-for Enterprise Edition, but also a free, open-source community version.
See community.pentaho.com.
Beware: steep learning curve. Shout if you need additional pointers.
I have been asked to create a simple program to submit user-defined queries to SQLite databases (.db). I have not worked with offline databases before and have a question about optimizing performance.
There are a few hundred .db files that I need to query. Is it quicker to attach them all to a single query using ATTACH, or to join them all into a single database and work from there? My thought is that there will be some trade-off between how much time the initial setup takes versus the query speed. Is there perhaps a different method that would result in better performance?
I don't think it will matter, but this will be written in C# for a Windows desktop.
Thanks!
The documentation says:
The number of simultaneously attached databases is limited to SQLITE_MAX_ATTACHED which is set to 10 by default. [...] The number of attached databases cannot be increased above 62.
So attaching a few hundred databases will be very quick because outputting an error message can be done really fast. ☺
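Given that limit, merging everything into one database (in small batches, well under the attach limit) is the more practical route. A rough sketch, assuming each file holds a table called measurements (a made-up name):

ATTACH DATABASE 'file_001.db' AS src1;
ATTACH DATABASE 'file_002.db' AS src2;
-- Create the combined table once, copying the column layout of the first source
CREATE TABLE IF NOT EXISTS measurements AS SELECT * FROM src1.measurements WHERE 0;
-- Copy this batch into the combined database
INSERT INTO measurements SELECT * FROM src1.measurements;
INSERT INTO measurements SELECT * FROM src2.measurements;
-- Detach and repeat with the next batch of files
DETACH DATABASE src1;
DETACH DATABASE src2;

The C# program can generate these statements per batch; afterwards all user queries run against the single combined file.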
We need a CSV viewer that can handle 10MM-15MM rows in a Windows environment, and each column should have some filtering capability (some regex or text searching is fine).
I strongly suggest using a database instead, and running queries (e.g., with Access). With proper SQL queries you should be able to filter on the columns you need to see, without handling such huge files all at once. You may need to have someone write a script to load each row of the CSV file (and future CSV file changes) into the database.
I don't want to be the end user of that app. Store the data in SQL. Surely you can define criteria to query on before generating a .csv file. Give the user an online interface with the column headers and filters to apply. Then generate a query based on the selected filters, providing the user only with the lines they need.
This will save many people time, headaches and eye sores.
We had this same issue and used a 'report builder' to build the criteria for the reports prior to actually generating the downloadable csv/Excel file.
As others suggested, I would also choose a SQL database. It's already optimized to perform queries over large data sets. There are a couple of embedded databases, like SQLite or FirebirdSQL (embedded):
http://www.sqlite.org/
http://www.firebirdsql.org/manual/ufb-cs-embedded.html
You can easily import the CSV into a SQL database with just a few lines of code and then build a SQL query, instead of writing your own solution for filtering large tabular data.
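For example, with the sqlite3 command-line shell it is roughly this (the file, table, and column names below are made up):

sqlite3 bigdata.db
sqlite> .mode csv
sqlite> .import data.csv csvdata
sqlite> CREATE INDEX idx_csvdata_region ON csvdata(region);
sqlite> SELECT * FROM csvdata WHERE region = 'EMEA' AND customer LIKE '%ACME%' LIMIT 100;

In csv mode, .import creates the csvdata table from the file's header row if it doesn't already exist; the index speeds up the equality filter on region, and the LIKE covers the per-column text searching the question asks for.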
I'm using Crystal Reports 11 (and VB6) to open a report file, load the data from an Access database, and either print the report to a printer or export the report to another .rpt file (for later printing without the database).
Even for small amounts of data the process is somewhat slow. Profiling showed about 1.5 seconds for three records (one page). For about 500 records on 10 pages, it's 1.7 seconds.
Can I do something to speed it up? Can I tweak the data or the report?
This is just an idea, but I'd first try to build a view on the server, so that CR can directly access the report's data without dealing with any joins or filters on the user's side. I have read terrible things about CR querying the server multiple times with the same query before displaying the report...
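Something along these lines, where the table and column names are made up (with an Access back end the equivalent would be a saved query rather than a CREATE VIEW statement):

-- Pre-join the report's data on the database side,
-- so the report only has to read one flat result set
CREATE VIEW rpt_order_lines AS
SELECT o.order_id,
       o.order_date,
       c.customer_name,
       d.product_code,
       d.quantity,
       d.line_total
  FROM orders o
  JOIN customers c ON c.customer_id = o.customer_id
  JOIN order_details d ON d.order_id = o.order_id;

The report then selects from rpt_order_lines instead of letting Crystal Reports build the joins itself.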
Seeing that there are no other replies, I'll just post what I'm really thinking, which is that honestly I wish my Crystal Reports 11 app was as fast as yours. In my application there are so many suppressed fields and sections that Crystal drags for about a minute to generate any report.