How to pass a String with more than 250 characters as job parameter in Spring Batch? - spring

In BATCH_JOB_EXECUTION_PARAMS table, column "STRING_VAL" is defined as varchar(250). If any string longer than 250 is passed as job parameter, the database will complain that data is too long. I did some research and what some people did was to manually change the definition of the column to hold more data. Is there any side effect to store large params in the table? If so, what is the best solution to pass a large job param?
Thanks.

There shouldn't be a side effect; especially, if it is a non-identifiying parameter.
But also then, the only place this could have a sideeffect is the generation of the "JOB_KEY"-field in the JOB_INSTANCE table (have a look at JdbcJobInstanceDao).
The content for this field is generated using a "JobKeyGenerator" and having a look at the used default implementation "org.springframework.batch.core.DefaultJobKeyGenerator", I don't see anything that could cause a side effect.

I would not go down that road since it is part of Spring Framework developed outside of your control. Even if it is safe now to change what if they decide to use 250 character limit in some important framework functionality. You will get either funny bugs when you upgrade to new version or you will get version lock since you changed library code yourself.
I answered similar question in this post. You can create new table for holding parameters or whatever near Spring Batch meta data (in same database) and you can pass just ID. Inside Spring Batch job you can pull whatever from that table based on passed ID.

Related

Serializing query result

I have a financial system with all its business logic located in the database and i have to code an automated workflow for transactions batch processing, which consists of steps listed below:
A user or an external system inserts some data in a table
Before further processing a snapshot of this data in the form of CSV file with a digital signature has to be made. The CSV snapshot itself and its signature have to be saved in the same input table. Program updates successfully signed rows to make them available for further steps of code
...further steps of code
Obvious trouble is step#2: I don't know, how to assign results of a query as a BLOB, that represents a CSV file, to a variable. It seems like some basic stuff, but I couldn't find it. The CSV format was chosen by users, because it is human-readable. Signing itself can be made with a request to external system, so it's not an issue.
Restrictions:
there is no application server, which could process the data, so i have to do it with plsql
there is no way to save a local file, everything must be done on the fly
I know that normally one would do all the work on the application layer or with some local files, but unfortunately this is not the case.
Any help would be highly appreciated, thanks in advance
I agree with #william-robertson. you just need to create a comma delimited values string (assuming header and data row) and write that to a CLOB. I recommend an "insert" trigger. There are lots of SQL tricks you can do to make that easier). On usage of that CSV string will need to be owned by the part of the application that reads it in and needs to do something with it.
I understand yo stated you need to create a CVS, but see if you could do XML instead. Then you could use DBMS_XMLGEN to generate the necessary snapshot into a database column directly from the query for it.
I do not accept the concept that a CVS is human-readable (actually try it sometime as straight text). What is valid is that Excel displays it in human-readable form. But is should also be able to display the XML as human-readable. Further, if needed the data in it can be directly back-ported into the original columns.
Just a alternate idea.

Parse database for each username

I'm trying to make a database table for every single username. I see that for every username, I can add more columns in it's row, but I want to attribute a full table for each one. How can I do that?
Thanks,
Eli
First let me say, what you are trying to do sounds like really, really bad database design and you should rethink your idea of creating a table per user. To get help for this you should add way more detail about the reasoning to your question to get a good answer. As far as I know there is also a maximum number of classes you can create on Parse so sooner or later you will run into problems, either performance wise or due to technical limitations of the platform.
That being said, you can use the Schema API to programmatically create/delete/update tables of your Parse app. It always requires the master key, so doing this from the client side is not recommended for security reasons. You could put this into a Cloud Code function for example and call this one from your app/admin tool to create a new table for a user on the fly or delete a table of a user.
Again, better don't do it and think about a better way to design your database, it would be out of scope here to discuss it.

Cassandra Best Practice on edits: delete & re-insert vs. update?

I am new to Cassandra. I am looking at many examples online. Here is one from JHipster Cassandra examples on GitHub:
https://gist.github.com/jdubois/c3d3bedb869466731316
The repository save(user) method does a read (to look for existence) then a delete and re-insert of the existing user across all the denormalized tables whenever the user data changed.
Is this best practice?
Is this only because of how the data model for this sample is designed?
Is this sample's design a result of twisting a POJO framework into a NoSQL database design?
When would I want to just do a update in Cassandra? It supports updates at the field-level, so it seems like that would be preferred.
First of all, the delete operations should be part of the batch for more robust error handling. But it looks like there are also some concurrency issues with the code. It will update the user based on the current user value read before. It's not save to assume this will still be the latest value while save() is actually executed. It will also just overwrite any keys in the lookup table that might be in use for a different user at that point. E.g. the login could already exist for another user while executing insertByLoginStmt.
It is not necessary to delete a row before inserting a new one.
But if you are replacing rows and new columns are different from existing columns then you need to delete all existing columns and insert new columns. Or insert new and delete old, does not matter if happens in batch.

Disable Hibernate auto update on flush on read only synonyms

I have a table and two databases which have the same table, but one is a symlink of the other one and only read is permitted on this table.
I have mapped the table to Java using Hibernate and I use spring to set the Entity Manager's data source as one of the two databases based on some input criteria.
I call only read only operations (selects) when I am connected to the second database, but it seems Hibernate tries to flush something back to the database and it fails telling update is not allowed on this view.
How do I disable this update only for the second datasource and keep it normal for the first one?
Update:
Looking at the stack trace, the flush seems to be started here:
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:504)
... 55 more
Is this related to hibernate.transaction.flush_before_completion property? Can I set it to false for the second data source?
Most probably your entities become "dirty" the same moment they are loaded from the database, and Hibernate thinks that it needs to store the changes. This happens, if your accessors (get and set methods) are not returning the exact same value or reference that had been set by Hibernate.
In our code, this happened with lists, developers created new list instances because they didn't like the type they got in the setter.
If you don't want to change the code, change the mapping to field access.
You can also prevent Hibernate of storing changes by setting FlushMode to never on the session, but this only hides the real problem which will still occur in other situations an will lead to unnecessary updates.
First you need to determine if this is DDL or DML. If you don't know, then I recommend you set hibernate.show_sql=true to capture the offending statement.
If it is DDL, then it's most likely going to be Hibernate updating the schema for you and you'd want to additionally configure the hibernate.hbm2ddl.auto setting to be either "update" or "none", depending on whether you're using the actual db or the symlinked (read-only) one, respectivley. You can use "validate" instead of none, too.
If it is DML, then I would first determine whether your code is for some reason making a change to an instance which is still attached to an active Hibernate Session. If so, then a subsequent read may cause a flush of these changes without ever explicitly saving the object (Grails?). If this is the case, consider evicting the instance causing the flush ( or using transport objects instead ).
Are you perhaps using any aspects or Hibernate lifecycle events to provide auditing of the objects? This, too, could cause access of a read-only to result in an insert or update being run.
It may turn out that you need to provide alternative mappings for the offending class should the updatability of a field come into play, but the code is doing everything exactly as you'd like ( this is unlikely ;0 ). If you are in an all-annotation world, this may be tricky. If working with hbm.xml, then providing an alternative mapping is easier.

Referencing object's identity before submitting changes in LINQ

is there a way of knowing ID of identity column of record inserted via InsertOnSubmit beforehand, e.g. before calling datasource's SubmitChanges?
Imagine I'm populating some kind of hierarchy in the database, but I wouldn't want to submit changes on each recursive call of each child node (e.g. if I had Directories table and Files table and am recreating my filesystem structure in the database).
I'd like to do it that way, so I create a Directory object, set its name and attributes,
then InsertOnSubmit it into DataContext.Directories collection, then reference Directory.ID in its child Files. Currently I need to call InsertOnSubmit to insert the 'directory' into the database and the database mapping fills its ID column. But this creates a lot of transactions and accesses to database and I imagine that if I did this inserting in a batch, the performance would be better.
What I'd like to do is to somehow use Directory.ID before commiting changes, create all my File and Directory objects in advance and then do a big submit that puts all stuff into database. I'm also open to solving this problem via a stored procedure, I assume the performance would be even better if all operations would be done directly in the database.
One way to get around this is to not use an identity column. Instead build an IdService that you can use in the code to get a new Id each time a Directory object is created.
You can implement the IdService by having a table that stores the last id used. When the service starts up have it grab that number. The service can then increment away while Directory objects are created and then update the table with the new last id used at the end of the run.
Alternatively, and a bit safer, when the service starts up have it grab the last id used and then update the last id used in the table by adding 1000 (for example). Then let it increment away. If it uses 1000 ids then have it grab the next 1000 and update the last id used table. Worst case is you waste some ids, but if you use a bigint you aren't ever going to care.
Since the Directory id is now controlled in code you can use it with child objects like Files prior to writing to the database.
Simply putting a lock around id acquisition makes this safe to use across multiple threads. I've been using this in a situation like yours. We're generating a ton of objects in memory across multiple threads and saving them in batches.
This blog post will give you a good start on saving batches in Linq to SQL.
Not sure off the top if there is a way to run a straight SQL query in LINQ, but this query will return the current identity value of the specified table.
USE [database];
GO
DBCC CHECKIDENT ("schema.table", NORESEED);
GO

Resources