Spring Data MongoDB - Encrypting a single field using a converter - spring

I have a collection that contains several arrays of objects. One of the sub-objects has a field called secret, of type String, which has to be stored in encrypted form.
What is the best way of achieving this?
I don't think writing a custom writer for the entire document is feasible.
How do I write a String converter that is applied only to this single field?

There are many answers to this question and different approaches that depend on your actual requirements.
The first question to ask is whether MongoDB is a good place to store encrypted values at all, or whether there is a better option that gives you features like rewrapping (re-encryption), key rotation, audit logging, access control, key management, …
Another thing that comes into play is decryption: every time you retrieve data from MongoDB, the secret is decrypted. Also, encrypting a lot of entries with the same key facilitates cryptanalysis, so you need to ensure regular key rotation. Last but not least, you're in charge of storing the crypto keys securely and making sure it's hard to get hold of them.
Having a dedicated data type makes it very convenient to write a converter with a signature of e.g. Converter<Secret, String> or Converter<Secret, Binary>, as you get full control over serialization.
Alternatively, have a look at https://github.com/bolcom/spring-data-mongodb-encrypt or external crypto tools like HashiCorp Vault.
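A converter pair along the lines described above might look roughly like this. This is a minimal sketch, assuming a Secret wrapper type and an AES-GCM key managed elsewhere; in Spring Data MongoDB the two operations would live in classes implementing org.springframework.core.convert.converter.Converter (Secret to String and back) and be registered as custom conversions, but the Spring types are omitted here so the sketch compiles on its own.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Sketch of a field-level converter pair. The Secret wrapper type and the way
// the key is supplied are assumptions, not part of any Spring API.
public final class SecretConverters {

    public record Secret(String value) {}

    private static final int IV_LEN = 12;    // recommended GCM nonce size
    private static final int TAG_BITS = 128; // GCM auth tag length

    // Secret -> String: what a Converter<Secret, String> would do on write.
    public static String encrypt(Secret secret, SecretKey key) throws Exception {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(secret.value().getBytes(StandardCharsets.UTF_8));
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return Base64.getEncoder().encodeToString(out); // what ends up in MongoDB
    }

    // String -> Secret: what a Converter<String, Secret> would do on read.
    public static Secret decrypt(String stored, SecretKey key) throws Exception {
        byte[] in = Base64.getDecoder().decode(stored);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_LEN));
        byte[] pt = cipher.doFinal(in, IV_LEN, in.length - IV_LEN);
        return new Secret(new String(pt, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        SecretKey key = kg.generateKey();
        String stored = encrypt(new Secret("s3cr3t"), key);
        System.out.println(decrypt(stored, key).value()); // prints s3cr3t
    }
}
```

Because only the Secret type goes through these converters, every other field in the document is mapped as usual; that is what limits the encryption to the single field.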

Related

Gorm relationship and issues

I was creating my first-ever REST API in Golang with Fiber and GORM. I wanted to create a model that had a slice of strings, but GORM does not allow me to do so. So, the next thing I tried was to use a map, hoping that it would be easily converted to JSON and saved to my Postgres instance. But again, GORM does not support maps. So, I created another struct into which I put all the data in a not-so-elegant way, where I made a single string field for each possible string I can save, and then I embedded this struct into the other. But now the compiler complains that I have to save a primary key into it, and not the raw JSON given from the request. I am a bit overwhelmed by now.
If someone knows a way I can use to save all the data I need in a way that respects my requirements (a slice of strings, easy to parse when I read from the database), and to finish this CRUD app, I would be really thankful. Thank you a lot.

cachemanager.net - How to get List of objects from redis cache based on key passed?

How do I get a List of objects from redis cache based on the key passed?
I am exploring cachemanager.net for redis cache. I have gone through the examples. But I could not find any example related to getting the List of objects based on the key passed.
var lst = cache.Get("Key_1");
It is returning only one object.
But I would like it like this: I have stored 1000 objects in the cache with keys like Key_1, Key_2, Key_3, ..., Key_1000. I want to get a list of all 1000 objects if I pass Key_* as the key.
CacheManager does not provide any functionality to search keys or get many keys via wildcard. That's simply not how caches work.
As Karthikeyan pointed out, in Redis you could use the KEYS command, but that's not a good solution and should only be used for manual debugging. Other cache systems don't even have something like that, therefore CacheManager cannot provide that feature either. Hope that makes sense ;)
With CacheManager, you can either store all your objects under one cache key and cache the whole list. That might have some limitations if you use Redis, because serialization might become an issue.
Or you store each object separately and retrieve them in a loop. The Redis client will optimize certain things; also, in CacheManager, if you have 2 layers of caching, the performance will get better over time.
You can use redis hash instead. And you can use hgetall command to retrieve all the values in that hash.
http://redis.io/commands#hash
Or, if you want to use a normal key-value pair, you have to write a Lua script to achieve it:
local keys = redis.call('keys','key_*')
return redis.call('mget',keys)
KEYS is not advisable in production as it is blocking.
You can use the SCAN command instead of KEYS to get all the keys matching the pattern, and then follow the same procedure to achieve the same result.

hadoop CustomWritables

I have more of a design question regarding the necessity of a CustomWritable for my use case:
So I have a document pair that I will process through a pipeline and write out intermediate and final data to HDFS. My key will be something like ObjectId - DocId - Pair - Lang. I do not see why/if I will need a CustomWritable for this use case. I guess if I did not have a key, I would need a CustomWritable? Also, when I write data out to HDFS in the Reducer, I use a Custom Partitioner. So, that would kind of eliminate my need for a Custom Writable?
I am not sure if I got the concept of the need for a Custom Writable right. Can someone point me in the right direction?
Writables can be used for de/serializing objects. For example, a log entry can contain a timestamp, a user IP and the browser agent. So you should implement your own WritableComparable for a key that identifies this entry, and you should implement a value class implementing Writable that reads and writes the attributes of your log entry.
These serializations are just a handy way to get the data from a binary format into an object. Some frameworks like HBase still require byte arrays to persist the data, so doing this transfer yourself would add a lot of overhead and mess up your code.
Thomas' answer explains a bit. It's way too late, but I'd like to add the following for prospective readers:
Partitioner only comes into play between the map and reduce phase and has no role to play in writing from reducer to output files.
I don't believe writing INTERMEDIATE data to hdfs is a requirement in most cases, although there are some hacks that can be applied to do the same.
When you write from a reducer to HDFS, the keys will automatically be sorted and each reducer will write to ONE SEPARATE file. Keys are sorted based on their compareTo method. So if you want to sort based on multiple variables, go for a custom key class that implements WritableComparable, and implement the write, readFields and compareTo methods. You can then control the way the keys are sorted through the compareTo implementation.
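A minimal sketch of such a composite key, shaped like the ObjectId - DocId - Pair - Lang key from the question, might look like the following. In real Hadoop code the class would implement org.apache.hadoop.io.WritableComparable<PairKey>; the same write/readFields/compareTo shape is mirrored here with plain java.io types so the sketch compiles without Hadoop on the classpath, and the field names are assumptions.

```java
import java.io.*;

// Composite key sketch: write/readFields handle serialization, compareTo
// controls the sort order of keys arriving at the reducers.
public class PairKey implements Comparable<PairKey> {
    private String objectId = "";
    private String docId = "";
    private String pair = "";
    private String lang = "";

    public PairKey() {} // Hadoop requires a no-arg constructor for Writables

    public PairKey(String objectId, String docId, String pair, String lang) {
        this.objectId = objectId;
        this.docId = docId;
        this.pair = pair;
        this.lang = lang;
    }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(objectId);
        out.writeUTF(docId);
        out.writeUTF(pair);
        out.writeUTF(lang);
    }

    public void readFields(DataInput in) throws IOException {
        objectId = in.readUTF();
        docId = in.readUTF();
        pair = in.readUTF();
        lang = in.readUTF();
    }

    // Sort by objectId, then docId, then pair, then lang.
    @Override
    public int compareTo(PairKey o) {
        int c = objectId.compareTo(o.objectId);
        if (c == 0) c = docId.compareTo(o.docId);
        if (c == 0) c = pair.compareTo(o.pair);
        if (c == 0) c = lang.compareTo(o.lang);
        return c;
    }

    public static void main(String[] args) throws IOException {
        PairKey k1 = new PairKey("5f1", "doc9", "p1", "en");
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        k1.write(new DataOutputStream(buf));
        PairKey k2 = new PairKey();
        k2.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(k1.compareTo(k2)); // prints 0: round-trip preserves the key
    }
}
```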

Is there any data typing for the parameters in HTTP POST?

I am building a RESTful api using a Ruby server and a MongoDB database. The database stores objects as they are, preserving their natural data types (at least those that it supports).
At the moment I am using HTTP GET to pass params to the API, and understandably everything in my database gets stored as strings (because that's what the Ruby code sees when it accesses the params[] hash). After deployment the API will use exclusively HTTP POST, so my question is whether it's possible to specify the data types that get sent via POST individually for each parameter (say I have a "uid" which is an integer and a "name" which is a string), or do I need to cast them within Ruby before passing them on to my database?
If I need to cast them, are there any issues related to it?
No, it's not possible.
POST variables are just string key-value pairs.
You could however implement your own higher level logic.
For example, a common practice is to add a suffix to the names: everything that ends with _i gets parsed as an integer, and so on.
However, what benefit would preserving the types bring? Or, better asked: how do you output them? Is it only for storage?
Then it should not be a problem to convert the strings to proper types if that benefits your application, and cast them back to strings before delivering.
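The suffix convention suggested above can be sketched in a few lines. The question's stack is Ruby; Java is used here only to keep the example self-contained, and the _i suffix and the parameter names are assumptions, not a standard.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: parameter names ending in "_i" are parsed as integers before
// storage, everything else stays a string. Any other suffix (e.g. "_f" for
// floats) could be added the same way.
public final class ParamTyping {

    public static Map<String, Object> coerce(Map<String, String> params) {
        Map<String, Object> typed = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (e.getKey().endsWith("_i")) {
                // strip the suffix and store a real integer
                typed.put(e.getKey().substring(0, e.getKey().length() - 2),
                          Integer.parseInt(e.getValue()));
            } else {
                typed.put(e.getKey(), e.getValue());
            }
        }
        return typed;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = coerce(Map.of("uid_i", "42", "name", "alice"));
        System.out.println(doc.get("uid") instanceof Integer); // prints true
    }
}
```

The resulting map can then be handed to the database driver with real types, and rendered back to strings on the way out.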

Appropriate data structure for flat file processing?

Essentially, I have to get a flat file into a database. The flat files come in with the first two characters on each line indicating which type of record it is.
Do I create a class for each record type with properties matching the fields in the record? Should I just use arrays?
I want to load the data into some sort of data structure before saving it in the database so that I can use unit tests to verify that the data was loaded correctly.
Here's a sample of what I have to work with (BAI2 bank statements):
01,121000358,CLIENT,050312,0213,1,80,1,2/
02,CLIENT-STANDARD,BOFAGB22,1,050311,2359,,/
03,600812345678,GBP,fab1,111319005,,V,050314,0000/
88,fab2,113781251,,V,050315,0000,fab3,113781251,,V,050316,0000/
88,fab4,113781251,,V,050317,0000,fab5,113781251,,V,050318,0000/
88,010,0,,,015,0,,,045,0,,,100,302982205,,,400,302982205,,/
16,169,57626223,V,050311,0000,102 0101857345,/
88,LLOYDS TSB BANK PL 779300 99129797
88,TRF/REF 6008ABS12300015439
88,102 0101857345 K BANK GIRO CREDIT
88,/IVD-11 MAR
49,1778372829,90/
98,1778372839,1,91/
99,1778372839,1,92
I'd recommend creating classes (or structs, or whatever value type your language supports), as
record.ClientReference
is so much more descriptive than
record[0]
and, if you're using the (wonderful!) FileHelpers Library, then your terms are pretty much dictated for you.
Validation logic usually has at least 2 levels, the grosser level being "well-formatted" and the finer level being "correct data".
There are a few separate problems here. One issue is that of simply verifying the data, or writing tests to make sure that your parsing is accurate. A simple way to do this is to parse into a class that accepts a given range of values, and throws the appropriate error if not,
e.g.
public void setField1(int i)
{
    if (i > 100) throw new InvalidDataException...
}
Creating different classes for each record type is something you might want to do if the parsing logic is significantly different for different codes, so you don't have conditional logic like
public void setField2(String s)
{
    if (field1 == 88 && s.equals ...
    else if (field2 == 22 && s
}
yechh.
When I have had to load this kind of data in the past, I have put it all into a work table with the first two characters in one field and the rest in another. Then I have parsed it out to the appropriate other work tables based on the first two characters. Then I have done any cleanup and validation before inserting the data from the second set of work tables into the database.
In SQL Server you can do this through a DTS (2000) or an SSIS package, and using SSIS you may be able to process the data on the fly without storing it in work tables first, but the process is similar: use the first two characters to determine the data flow branch to use, then parse the rest of the record into some type of holding mechanism, and then clean up and validate before inserting. I'm sure other databases also have some type of mechanism for importing data and would use a similar process.
I agree that if your data format has any sort of complexity you should create a set of custom classes to parse and hold the data, perform validation, and do any other appropriate model tasks (for instance, return a human readable description, although some would argue this would be better to put into a separate view class). This would probably be a good situation to use inheritance, where you have a parent class (possibly abstract) define the properties and methods common to all types of records, and each child class can override these methods to provide their own parsing and validation if necessary, or add their own properties and methods.
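The inheritance idea above can be sketched briefly against the BAI2 sample. This is a minimal, hedged example: the parent class holds what is common, each child parses its own fields, and a factory dispatches on the two-character record code. The field choices are assumptions; real BAI2 records carry many more fields than shown here.

```java
// Class-per-record-type sketch with a dispatch on the record code.
public class RecordDemo {

    public abstract static class Record {
        public final String code;
        protected Record(String code) { this.code = code; }
    }

    public static final class FileHeader extends Record {        // record code 01
        public final String sender, receiver;
        FileHeader(String[] f) { super("01"); sender = f[1]; receiver = f[2]; }
    }

    public static final class TransactionDetail extends Record { // record code 16
        public final String typeCode;
        public final long amount;
        TransactionDetail(String[] f) {
            super("16"); typeCode = f[1]; amount = Long.parseLong(f[2]);
        }
    }

    // Dispatch on the first two characters of the line; the trailing "/" record
    // terminator is stripped before splitting on commas.
    public static Record parse(String line) {
        String[] f = line.replaceAll("/$", "").split(",", -1);
        switch (f[0]) {
            case "01": return new FileHeader(f);
            case "16": return new TransactionDetail(f);
            default:   throw new IllegalArgumentException("unknown record code " + f[0]);
        }
    }

    public static void main(String[] args) {
        Record r = parse("16,169,57626223,V,050311,0000,102 0101857345,/");
        System.out.println(((TransactionDetail) r).amount); // prints 57626223
    }
}
```

Each parsed object is a natural unit for the unit tests mentioned in the question: assert on typed properties rather than on array indices. Handling the 88 continuation records, which extend the previous record, would be the next step and is omitted here.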
Creating a class for each type of row would be a better solution than using Arrays.
That said, however, in the past I have used Arraylists of Hashtables to accomplish the same thing. Each item in the arraylist is a row, and each entry in the hashtable is a key/value pair representing column name and cell value.
Why not start by designing the database that will hold the data? Then you can use the Entity Framework to generate the classes for you.
Here's a wacky idea:
If you were working in Perl, you could use DBD::CSV to read data from your flat file, provided you gave it the correct values for the separator and EOL characters. You'd then read rows from the flat file by means of SQL statements; DBI will turn them into standard Perl data structures for you, and you can run whatever validation logic you like. Once each row passes all the validation tests, you'd be able to write it into the destination database using DBD::whatever.
-steve
