Questions regarding inserting relations when using upsert in Laravel

On an hourly basis I have a scheduled job that connects to my bank and fetches my bank transactions.
Unfortunately, my bank's API returns transactions without any transaction IDs, and the transaction details also change over time. For example:
When paying for something, the amount is initially only reserved on the account, not actually withdrawn. So if I pay $100, the amount is reserved on my account and a transaction item is created with the following details:
{
  "amount": {
    "amount": -100,
    "currencyCode": "USD"
  },
  "accountingDate": "2021-01-19",
  "description": "PAYMENT",
  "transactionCode": "123",
  "transactionType": null
}
A few hours later, or maybe even up to a day later, this transaction entry in the API will have changed to something like:
{
  "amount": {
    "amount": -100,
    "currencyCode": "USD"
  },
  "accountingDate": "2021-01-19",
  "description": "WALMART",
  "transactionCode": "R_123",
  "transactionType": "Purchase"
}
In this case three different values change: description, transactionCode and transactionType.
The description field gets updated with details of the vendor or receiver of the money.
The transactionCode field references the bank's transaction categorization (which is not public), but when the code is prefixed with R_ it means the amount has been withdrawn and accounted for.
The transactionType field is updated with information regarding the type of transaction (not necessarily correlated with the transactionCode field). Examples are "Visa", "Purchase", "Bill payment", "Fees", "Transfer between own accounts" etc.
To handle this in my application I use the upsert function, where I basically check for changes in any of the fields from the API, which works in 99% of cases.
But I thought I would extract the transactionCode values to a separate table and use them as categories in my own app. What is the quickest and easiest way to extract the data (regardless of the R_ prefix) without doing lots of queries to check the database for already existing values?
I'm thinking of something like the following for each transaction:
Get transactionCode value
Strip the R_ prefix
Check if transaction code exists in database
Insert if not
Create relation for transaction
But for several hundred transactions at a time it might not be suitable to do a check for each transaction code value. Would it make more sense to fetch all transaction codes before looping the transactions and then create the relations afterwards?
And also, how do I insert the relation when using upsert?
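A minimal sketch of that batched approach, assuming hypothetical transactions and transaction_codes tables, a transaction_code_id foreign key, and Laravel's query builder; the unique-by columns in the upsert are placeholders for whatever the app already matches on:

use Illuminate\Support\Facades\DB;

$transactions = collect($apiTransactions);

// Strip the R_ prefix once, up front.
$codes = $transactions
    ->map(fn ($t) => preg_replace('/^R_/', '', $t['transactionCode']))
    ->unique()
    ->values();

// One query to find the codes we already know about...
$existing = DB::table('transaction_codes')
    ->whereIn('code', $codes->all())
    ->pluck('id', 'code');

// ...and one insert for the missing ones.
DB::table('transaction_codes')->insert(
    $codes->diff($existing->keys())
        ->map(fn ($c) => ['code' => $c])
        ->all()
);

// Re-read so every code maps to a primary key.
$codeIds = DB::table('transaction_codes')
    ->whereIn('code', $codes->all())
    ->pluck('id', 'code');

// Attach the foreign key to each row, then upsert as before.
$rows = $transactions->map(function ($t) use ($codeIds) {
    $code = preg_replace('/^R_/', '', $t['transactionCode']);
    return [
        'code'                => $code,
        'accounting_date'     => $t['accountingDate'],
        'description'         => $t['description'],
        'transaction_code_id' => $codeIds[$code],
    ];
})->all();

DB::table('transactions')->upsert(
    $rows,
    ['code', 'accounting_date'],           // unique-by columns (placeholder)
    ['description', 'transaction_code_id'] // columns to update on conflict
);

This keeps the number of database round-trips constant (two reads, one insert, one upsert) no matter how many transactions the API returns.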

Related

SSAS: How to hide certain fields in a table from certain users

For a Microsoft Analysis Services Tabular (1500) data cube, given a Sales table:
CREATE TABLE SalesActual (
  Id Int,
  InvoiceNumber Char(10),
  InvoiceLineNumber Char(3),
  DateKey Date,
  SalesAmount money,
  CostAmount money
)
Where the GP Calculation in DAX would be
GP := SUM('SalesActual'[SalesAmount]) - SUM('SalesActual'[CostAmount])
I want to limit some users from accessing cost / GP data. Which approach would you recommend?
I can think of the following:
Split the Sales and Cost amounts into separate rows, create a MetricType flag ('C', 'S', etc.), and set up Row-Level Security so that some users cannot see the cost rows.
Separate them into two different tables and handle it through object-level security (OLS).
Any other recommendations?
I am leaning towards approach 1 as I have some other RLS set up and OLS doesn't mix well with RLS, but I also want to hear from the experts what other approaches could fulfill such requirements.
Thanks!
UPDATE: I ended up going with the first approach.
Tabular DB is fast for this kind of split.
OLS renders the field invalid, so I'd have to create and maintain two reports, which is undesirable.
RLS is easier to control. Cost / GP is the only thing I need to exclude for now, but RLS also gives me some flexibility in the filter if I need to restrict other fields. My data will grow vertically, but I can also add additional data types such as sales budget, sales forecast, expenses and other costs into the model in the future, all easily controlled by RLS.
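For reference, a minimal sketch of the row filter behind approach 1, assuming the split table keeps a hypothetical MetricType column as described above (a DAX filter expression on the role):

// Hide cost rows from restricted users; 'C' marks cost lines.
'SalesActual'[MetricType] <> "C"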
The accepted answer works and would suit many scenarios. I appreciate the answerer's sharing; it just doesn't solve my particular situation.
You can create a role where column-level security (CLS) does the job. There is no GUI for CLS, but we can use a script. (You can script your current role from SSMS via "Script Role As" and modify it, but it's better to test this on a new role.)
{
  "createOrReplace": {
    "object": {
      "database": "YourDatabase",
      "role": "CLS1"
    },
    "role": {
      "name": "CLS1",
      "modelPermission": "read",
      "members": [
        {
          "memberName": "YourOrganization\\userName"
        }
      ],
      "tablePermissions": [
        {
          "name": "Sales",
          "columnPermissions": [
            {
              "name": "SalesBonus",
              "metadataPermission": "none"
            },
            {
              "name": "CostAmount",
              "metadataPermission": "none"
            }
          ]
        }
      ]
    }
  }
}
The key elements are tablePermissions and columnPermissions, in which we define which column or columns the user cannot use.

Riot Games API: Requests return same identifiers for same player name but different region

I have these two URLs:
https://euw1.api.riotgames.com/lol/summoner/v4/summoners/by-name/okusen
https://eun1.api.riotgames.com/lol/summoner/v4/summoners/by-name/okusen
They just have the same player name and they are two different players from two different regions (Europe West and Europe Nordic & East).
Then, the two JSON responses respectively:
{
  "profileIconId": 4275,
  "name": "Okusen",
  "puuid": "KFM4xJBwzy7T-rytrj9J8lGx0QduGLsBJ-WY9xdx4Q9cZNvxXCSNv_k4YQdfPgQjS52ppwlO_f9vhA",
  "summonerLevel": 121,
  "accountId": "PsopchdPCOnlQJB4AjXZ6TCrHuEZ9JlMqZMrDP6iAtznGQ",
  "id": "zYkVlVUGHDuDmbfo1lmU0neHdpQdqxBNJ-hHMunqC__2K-4",
  "revisionDate": 1583882906000
}
{
  "profileIconId": 25,
  "name": "Okusen",
  "puuid": "KFM4xJBwzy7T-rytrj9J8lGx0QduGLsBJ-WY9xdx4Q9cZNvxXCSNv_k4YQdfPgQjS52ppwlO_f9vhA",
  "summonerLevel": 30,
  "accountId": "PsopchdPCOnlQJB4AjXZ6TCrHuEZ9JlMqZMrDP6iAtznGQ",
  "id": "zYkVlVUGHDuDmbfo1lmU0neHdpQdqxBNJ-hHMunqC__2K-4",
  "revisionDate": 1495766289000
}
They have the same identifiers, which is incorrect. I need puuid, accountId or id as a parameter in other requests in order to get data for a specific player, but I can't do that correctly if I don't have the correct identifier.
LoLCHESS.GG does not seem to have this problem, as they display different data for these two players, so I'm probably missing something, but I really don't know what.
Neither of those IDs are guaranteed to be unique.
summonerId and accountId are guaranteed to be unique on a per region basis (so we won't find two summoners with the same ID on EUW).
puuid is guaranteed to be unique globally but if a user transfers regions, the two accounts will have the same puuid.
Thanks to thomasmarton on GitHub; more details in this thread.

How to transform nested JSON-payloads with Kiba-ETL?

I want to transform nested JSON-payloads into relational tables with Kiba-ETL. Here's a simplified pseudo-JSON-payload:
{
  "bookings": [
    {
      "bookingNumber": "1111",
      "name": "Booking 1111",
      "services": [
        {
          "serviceNumber": "45",
          "serviceName": "Extra Service"
        }
      ]
    },
    {
      "bookingNumber": "2222",
      "name": "Booking 2222",
      "services": [
        {
          "serviceNumber": "1",
          "serviceName": "Super Service"
        },
        {
          "serviceNumber": "2",
          "serviceName": "Bonus Service"
        }
      ]
    }
  ]
}
How can I transform this payload into two tables:
bookings
services (every service belongsTo a booking)
I read about yielding multiple rows with the help of Kiba::Common::Transforms::EnumerableExploder on the wiki, blog, etc.
Would you solve my use-case by yielding multiple rows (the booking and multiple services), or would you implement a Destination which receives a whole booking and calls some Sub-Destinations (i.e. to create or update a service)?
Author of Kiba here!
This is a common requirement, but it can (and this is not specific to Kiba) be more or less complex to handle. Here are a few points you'll need to think about.
Handling of foreign keys
The main problem here is that you'll want to keep the relationships between services and bookings, once they are inserted.
Foreign keys using business keys
A first (and easiest) way to handle this is to use a foreign-key constraint on "booking number", and make sure to insert that booking number in each service row, so that you can leverage it later in your queries. If you do this (see https://stackoverflow.com/a/18435114/20302) you'll have to set a unique constraint on "booking number" in the bookings target table.
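A minimal sketch of that constraint setup in SQL, with table and column names assumed from the example payload:

-- Each booking number may appear only once in bookings...
ALTER TABLE bookings
  ADD CONSTRAINT bookings_booking_number_key UNIQUE (booking_number);

-- ...so services can reference it directly via the business key.
ALTER TABLE services
  ADD CONSTRAINT services_booking_number_fkey
  FOREIGN KEY (booking_number) REFERENCES bookings (booking_number);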
Foreign keys using primary keys
If you instead prefer to have a booking_id which points to the bookings table id key, things are a bit more complicated.
If this is a one-off import targeting an empty table, I recommend that you arbitrarily force the primary key using something like:
transform do |r|
  # Assign an arbitrary, strictly increasing primary key.
  @row_index ||= 0
  @row_index += 1
  r.merge(id: @row_index)
end
If this not a one-off import, you will have to:
* Upsert bookings in a first pass
* In a second pass, look-up (via SQL queries) "bookings" to figure out what is the id to store in booking_id, then upsert the services
As you can see it's a bit more work, so stick with option 1 if you don't have strong requirements around this (although option 2 is more solid in the long run).
Example implementation (using Kiba Pro & business keys)
The simplest way to achieve this (assuming your target is Postgres) is to use Kiba Pro's SQL Bulk Insert/Upsert destination.
It would go this way (in single pass):
extend Kiba::DSLExtensions::Config
config :kiba, runner: Kiba::StreamingRunner

source Kiba::Common::Sources::Enumerable, -> { Dir["input/*.json"] }

transform { |r| JSON.parse(IO.read(r)).fetch('bookings') }
transform Kiba::Common::Transforms::EnumerableExploder

# SNIP (remapping / renaming of fields etc)

first_destination = nil

destination Kiba::Pro::Destinations::SQLBulkInsert,
  row_pre_processor: -> (row) { row.except("services") },
  dataset: -> (dataset) {
    dataset.insert_conflict(target: :booking_number)
  },
  after_read: -> (d) { first_destination = d }

destination Kiba::Pro::Destinations::SQLBulkInsert,
  row_pre_processor: -> (row) { row.fetch("services") },
  dataset: -> (dataset) {
    dataset.insert_conflict(target: :service_number)
  },
  before_flush: -> { first_destination.flush }
Here we iterate over each input file, parsing it and grabbing the "bookings", then generating one row per element of "bookings".
We have 2 destinations doing "upsert" (insert or update), plus one trick to ensure we save the parent rows before we insert the children, to avoid a failure due to a missing referenced record.
You can of course implement this yourself, but it is a bit of work!
If you need to use primary-key based foreign keys, you'll (likely) have to split into 2 passes (one for each destination), then add some form of lookup in the middle.
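A minimal sketch of such a lookup step, assuming a Sequel database handle DB and the business-key schema from above (all names hypothetical):

# Second pass: resolve each booking's primary key before upserting services.
transform do |row|
  booking_id = DB[:bookings]
    .where(booking_number: row.fetch("bookingNumber"))
    .get(:id)
  row.merge("booking_id" => booking_id)
end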
Conclusion
I know that this is not trivial (depending on what you'll need, and whether you'll use Kiba Pro or not), but at least I'm sharing the patterns that I'm using in such situations.
Hope it helps a bit!

Merging and Aggregating data held in JSON objects

I have two JSON objects, created using JSON.parse, that I would like to merge and aggregate.
I do not have the ability to store the data in a Mongo database and am unclear how to proceed.
The first JSON file contains the raw data:
[
  {
    "sector": {
      "url": "http://TestUrl/api/sectors/11110",
      "code": "11110",
      "name": "Education policy and administrative management"
    },
    "budget": 5742
  },
  {
    "sector": {
      "url": "http://TestUrl/api/sectors/11110",
      "code": "11110",
      "name": "Education policy and administrative management"
    },
    "budget": 5620
  },
  {
    "sector": {
      "url": "http://TestUrl/api/sectors/12110",
      "code": "12110",
      "name": "Health policy and administrative management"
    },
    "budget": 5524
  }
]
The second JSON file contains the mappings that I require for the data merge operation:
[
  {
    "Code (L3)": 11110,
    "High Level Code (L1)": 1,
    "High Level Sector Description": "Education",
    "Name": "Education policy and administrative management",
    "Description": "Education sector policy, planning and programmes; aid to education ministries, administration and management systems; institution capacity building and advice; school management and governance; curriculum and materials development; unspecified education activities.",
    "Category (L2)": 111,
    "Category Name": "Education, level unspecified",
    "Category Description": "The codes in this category are to be used only when level of education is unspecified or unknown (e.g. training of primary school teachers should be coded under 11220)."
  },
  {
    "Code (L3)": 12110,
    "High Level Code (L1)": 2,
    "High Level Sector Description": "Health",
    "Name": "Health policy and administrative management",
    "Description": "Health sector policy, planning and programmes; aid to health ministries, public health administration; institution capacity building and advice; medical insurance programmes; unspecified health activities.",
    "Category (L2)": 121,
    "Category Name": "Health, general",
    "Category Description": ""
  },
  {
    "Code (L3)": 99999,
    "High Level Code (L1)": 9,
    "High Level Sector Description": "Unused Code",
    "Name": "Extra Code",
    "Description": "Shows Data Issue",
    "Category (L2)": 998,
    "Category Name": "Extra, Code",
    "Category Description": ""
  }
]
I would like to connect the data in the two files using the "code" value in the first file and the "Code (L3)" value in the second file. In SQL terms I would like to do an "inner join" on the files using these values as the connection point.
I would then like to aggregate all of the budget values from the first file by the "High Level Code (L1)" value from the second file, to produce the following JSON:
[
  {
    "High Level Code (L1)": 1,
    "High Level Sector Description": "Education",
    "Budget": 11362
  },
  {
    "High Level Code (L1)": 2,
    "High Level Sector Description": "Health",
    "Budget": 5524
  }
]
This would be a very simple task with a database, but I'm afraid that option is not available. We are running our site on Sinatra, so any Rails-specific helper methods are not available to me.
Update: I am now using real data for the inputs and I have found that there are multiple JSON objects in the mappings file that have "Code (L3)" values that do not map to any of the [Sector][code] values in the raw data file.
I have tried a number of workarounds (breaking the data into 2D arrays then trying to bring the resultant array back as a hash table) but I have been unable to get anything to work.
I have come back to the answer that I accepted for this question as it is a very elegant solution and I don't want to ask the same question twice - I just can't figure out how to make it ignore items from the mappings file when they don't match anything from the raw data file.
This is quite easy. Imagine your first list is named "sources" while the second is named "values", or whatever. We will go through "values" and extract the required fields, and for each one, find in "sources" the values needed:
values.map do |elem|
  {
    "High Level Code (L1)" => elem["High Level Code (L1)"],
    "High Level Sector Description" => elem["High Level Sector Description"],
    "Budget" => sources.select do |source|
      source["sector"]["code"] == elem["Code (L3)"].to_s
    end.map { |elem| elem["budget"] }.sum
  }
end
The equivalent of the database "join" is done with the "select" operation. We loop through the sources array to find sector/code values identical to "Code (L3)", then we extract the "budget" values and sum them all.
The result is the following:
[{"High Level Code (L1)"=>1,
"High Level Sector Description"=>"Education",
"Budget"=>11362},
{"High Level Code (L1)"=>2,
"High Level Sector Description"=>"Health",
"Budget"=>5524}]
How about just going through the first dataset and indexing it into a hash using the code as the key, then going through the second dataset and finding the appropriate data for every key from the hash? Sort of brute force, but...
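A minimal sketch of that indexing idea, reusing the sources / values names from the answer above; the select also skips mapping entries that match nothing in the raw data, which addresses the update in the question:

# Index the raw data once: code => summed budget.
budgets = Hash.new(0)
sources.each do |source|
  budgets[source["sector"]["code"]] += source["budget"]
end

values
  .select { |elem| budgets.key?(elem["Code (L3)"].to_s) } # inner join: drop unmatched codes
  .map do |elem|
    {
      "High Level Code (L1)" => elem["High Level Code (L1)"],
      "High Level Sector Description" => elem["High Level Sector Description"],
      "Budget" => budgets[elem["Code (L3)"].to_s]
    }
  end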

How to remove luis entity marker from utterance

I am using LUIS to determine which state a customer lives in. I have set up a list entity called "state" that has the 50 states, with their two-letter abbreviations as synonyms, as described in the documentation. LUIS is returning certain two-letter words, such as "hi" or "in", as state entities.
I have set up an intent with phrases such as "My state is Oregon", "I am from WA", etc. Inside the intent, if the word "in" is included in the utterance, for example "I live in Kentucky", the word "in" is automatically marked by LUIS as a state entity and I am unable to remove that marker.
Below is a snippet of the LUIS JSON response to the utterance "I live in Kentucky". As you can see, the response includes both Indiana and Kentucky as entities when there should only be Kentucky.
"query": "I live in Kentucky",
"topScoringIntent": {
"intent": "STATE_INQUIRY",
"score": 0.9338141
},
....
"entities": [
....
{
"entity": "in",
"type": "state",
"startIndex": 7,
"endIndex": 8,
"resolution": {
"values": [
"indiana"
]
}
},
{
"entity": "kentucky",
"type": "state",
"startIndex": 10,
"endIndex": 17,
"resolution": {
"values": [
"kentucky"
]
}
}
], ....
How do I train LUIS not to mark the words "in" and "hi" as states in this context if I can't remove the entity marker from the utterance?
In this particular case (populating a list entity with state abbreviations/names), you would be better served using the geographyV2 prebuilt entity or the Places.AbsoluteLocation prebuilt domain entity. (Please note that at the time of this writing, the geographyV2 prebuilt entity has a slight bug, so using the prebuilt domain entity would be the better option.)
The reason for this is two-fold:
One, geographic locations are already baked into LUIS and they don't collide with regular syntactic words like "in", "hi", or "me". I tested this in reverse by creating a [Medical] list that contained "ct" as the normalized value and "ct scan" as a synonym. When I typed "get me a ct in CT" it resulted in "get me a [Medical] in [Medical]". To fix, I selected the second "CT" value and re-assigned it to the Places.AbsoluteLocation entity. After retraining, I tested "when in CT show me ct options" which correctly resulted in "when in [Places.AbsoluteLocation] show me [Medical] options". Further examples and training will refine the results.
Two, lists work well for words that have disparate words that can reference one. This tutorial shows a simple example where loosely associated words are assigned as synonyms to a canonical name (normalized value).
Hope this helps!
@StevenKanberg's answer was very helpful but unfortunately not complete for my situation. I tried to implement both geographyV2 and Places.AbsoluteLocation (separately). Neither works entirely in the way I need it to (recognizing states and their two-letter abbreviations in a way that can be queried from the entities in the response).
So my choices are:
Create my own list of states, using the state name and the two-letter abbrev as synonyms, as described in the list description itself. This works except for two letter abbrevs that are also words, such as "in", "hi" and "me".
Use geographyV2 prebuilt which does not allow synonyms and does not recognize two-letter abbrevs at all, or
Use Places.AbsoluteLocation which does recognize two-letter abbrevs for states, does not confuse them with words, but also grabs all locations including cities, countries and addresses and does not differentiate between them so I have no way of parsing which entity is the state in an utterance like "I live in Lake Stevens, Snohomish County, WA".
Solution: If I combine 1 with 3, I can query for entities that have both of those types. If LUIS marks the word "in" as a state (Indiana), I can then check whether that word has also been flagged as an AbsoluteLocation. If it has not, I can safely discard that entity. It's not ideal, but it's a workaround that solves the problem; a sketch follows below.
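A rough sketch of that cross-check in TypeScript, using the entity shape from the LUIS v2 response above (the function name and interface are illustrative):

// Keep a "state" entity only if a Places.AbsoluteLocation entity
// overlaps the same span of the utterance.
interface LuisEntity {
  entity: string;
  type: string;
  startIndex: number;
  endIndex: number;
  resolution?: { values: string[] };
}

function confirmedStates(entities: LuisEntity[]): LuisEntity[] {
  const locations = entities.filter(e => e.type === "Places.AbsoluteLocation");
  return entities.filter(e =>
    e.type === "state" &&
    locations.some(loc =>
      loc.startIndex <= e.endIndex && loc.endIndex >= e.startIndex
    )
  );
}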
