Apache Pulsar JDBC sink: differentiation between insert/update/delete

I'm currently examining Pulsar JDBC sinks, as we plan to use a PostgreSQL sink soon.
Now, it's mentioned that JDBC sinks support insert/update/delete ops, but I wasn't able to find any documentation on HOW the sink connector actually decides WHAT to execute (is a new event an insert, an update, or a delete?).
After browsing the source code and digging into JdbcAbstractSink.java, I think I might have an idea now, but I need confirmation that my idea is right.
Please tell me if this is correct:
1.) There need to be 3 different topics for 1 DB entity type: one topic for inserting the entity type into a table, one for updating the same entity type, one for deletions. There also need to be 3 different sink connectors, each with a different configuration.
2.) The command decision is made by configuration properties:
if both nonKey and key properties are missing --> an insert is executed
if both nonKey and key properties are provided --> an update is executed, as in
update nonKey columns where key column(s) = event.value
if only key properties are provided --> a delete is executed, as in
delete where key column(s) = event.value
Is this the way it's done?
In the mentioned source class there's a code snippet:
for (Record<T> record : swapList) {
    String action = record.getProperties().get(ACTION);
    if (action == null) {
        action = INSERT;
    }
    switch (action) {
        case DELETE: ...
        case UPDATE: ...
but nowhere is it mentioned where and how the ACTION property of the record is set...
If I just missed the relevant documentation somehow, it would be nice if you could provide a link.
I know about this configuration doc page: https://pulsar.apache.org/docs/en/io-jdbc-sink/#configuration
but it's very vague and there are no real examples.

The documentation for this connector is lacking, to say the least, so I will do my best to explain it. As you can see from the code, the "action" to take (insert, update, or delete) is passed as a property inside the Pulsar message itself.
String action = record.getProperties().get(ACTION);
Therefore, in order to control the action taken by the sink, you need to add that property to the message you publish to the "source" topic of the JDBC sink connector (unless you want the action to be INSERT, which is the default). This also means you do not need three separate topics or connectors per operation: a single topic can carry inserts, updates, and deletes, because the action is decided per message.
Here is an example of how to publish a message with a different action in the message properties:
producer.newMessage().value("1234").property("action", "delete").send();
Now when the JDBC Sink connector reads this message, it will perform a DELETE operation on the record with the primary key value of "1234".
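As for the key/nonKey configuration properties from your second point: if I'm reading the connector source correctly, they don't select the operation; they tell the sink which columns to use when it builds the statements (the key columns form the WHERE clause for UPDATE and DELETE, while the nonKey columns are the ones an UPDATE sets). A minimal sketch of a sink config for a PostgreSQL table with primary key id (connection values are hypothetical):

configs:
  userName: "postgres"
  password: "password"
  jdbcUrl: "jdbc:postgresql://localhost:5432/pulsar_postgres"
  tableName: "my_table"
  key: "id"
  nonKey: "name,description"

With a config like this, the producer example above would roughly translate to DELETE FROM my_table WHERE id = '1234'.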

Related

Loading records into Dynamics 365 through ADF

I'm using the Dynamics connector in Azure Data Factory.
TLDR
Does this connector support loading child records which need a parent record key passed in? For example, if I want to create a contact and attach it to a parent account, I upsert a record with a null contactid, a valid parentcustomerid GUID, and parentcustomeridtype set to 1 (or 2), but I get an error.
Long Story
I'm successfully connecting to Dynamics 365 and extracting data (for example, the lead table) into a SQL Server table.
To test that I can transfer data the other way, I am simply loading the data back from the lead table into the lead entity in Dynamics.
I'm getting this error:
Failure happened on 'Sink' side. ErrorCode=DynamicsMissingTargetForMultiTargetLookupField,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=,Source=,''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot find the target column for multi-target lookup field: 'ownerid'.
As a test, I removed ownerid from the list of source columns and it loads OK.
This is obviously a foreign key value.
It raises two questions for me:
Specifically with regard to the error message: if I knew which lookup it needed to use, how can I specify which lookup table it should validate against? There are no settings in the ADF connector that allow me to do this.
This is obviously a foreign key value. If I only had the name (or business key) for this row, how can I easily look up the foreign key value?
How is this normally done through other APIs, i.e. the web API?
Is there an XrmToolBox add-in that would help clarify?
I've also read some posts that imply that you can send pre-connected data in an XML document so perhaps that would help also.
EDIT 1
I realised that the lead.ownertypeid field in my source dataset is NULL (that's what was exported). It's also NULL if I browse it in various XrmToolBox tools. I tried hard-coding it to systemuser (which is what it actually is in the owner table against the actual owner record) but I still get the same error.
I also notice there's a record with the same PK value in the systemuser table.
So the same record is in two tables, but how do I tell the Dynamics connector which one to use? And why does it even care?
EDIT 2
I was getting a similar message for msauto_testdrive for customerid.
I excluded all records with customerid=null, and got the same error.
EDIT 3
This link appears to indicate that I need to set customeridtype to 1 (Account) or 2 (Contact). I did so, but still got the same error.
Also I believe I have the same issue as this guy.
Maybe the ADF connector suffers from the same problem.
At the time of writing, @Arun Vinoth was 100% correct. However, shortly afterwards there was a documentation update (in response to a GitHub issue I raised) that explained how to do it.
I'll document how I did it here.
To populate a contact against a parent account, you need the parent account's GUID. Then you prepare a dataset like this:
SELECT
-- a NULL contactid means this is a new record
CAST(NULL as uniqueidentifier) as contactid,
-- the GUID of the parent account
CAST('A7070AE2-D7A6-EA11-A812-000D3A79983B' as uniqueidentifier) parentcustomerid,
-- the customer id refers to an account, so target the account entity
'account' [parentcustomerid#EntityReference],
'Joe' as firstname,
'Bloggs' lastname
Now you can select from this dataset and load into contact, applying the usual automapping approach in ADF: create datasets without schemas and perform a copy activity without mapping columns.
This is an ADF limitation with respect to CDS polymorphic lookups like Customer and Owner. Upvote this ADF idea.
The workaround is to use two temporary source lookup fields (owner team and user in the case of owner; account and contact in the case of customer) and parallel branches in a MS Flow to solve the issue. Read more; you can also download the Flow sample to use.
First, create two temporary lookup fields on the entity that you wish to import Customer lookup data into, pointing to the Account and Contact entities respectively.
Within your ADF pipeline flow, you will then need to map the GUID values for your Account and Contact fields to the respective lookup fields created above. The simplest way of doing this is to have two separate columns within your source dataset: one containing Account GUIDs to map and the other, Contact.
Then, finally, you can put together a Microsoft Flow that performs the appropriate mapping from the temporary fields to the Customer lookup field. First, define the trigger point for when your affected entity record is created (in this case, Contact) and add parallel branches to check for values in either of these two temporary lookup fields.
Then, if either of these conditions is hit, set up an Update record task to perform a single field update, for example when the ADF Account Lookup field has data in it.

Debezium, Kafka Connect: updates to Postgres are not showing up as messages, only inserts are

I am using the command:
# bin/connect-standalone.sh config/connect-standalone.properties config/debezium-config.properties
My debezium-config.properties is :
name=publications-connector
connector.class=io.debezium.connector.postgresql.PostgresConnector
database.hostname=localhost
database.port=5432
database.user=andy
database.password=postgres
database.dbname=postgres
database.server.name=dbserver1
table.whitelist=public.publications
In Postgres, I have a table called publications
When I insert a new record into the publications table, I can see that my consumer shows the new message in json format.
However, when I update an existing record in the publications table, no new message is published to the topic, and hence there is nothing to consume.
How can I fix this?
Also, I would like to add another table, 'comments', to my database. What changes do I need to make to the debezium-config.properties file (or anywhere else) to have those messages published to their own topic?
I see the following logs from the console :
WARN: no values found for table 'public.publications' from update message at 'source_info[server='dbserver1' db='postgres', ... schema=public, table=publications]'; skipping record (io.debezium.connector.postgresql.RecordsStreamProducer:333)
Thanks,
After some detailed reading of many Debezium topics, I was able to solve this by setting the table's REPLICA IDENTITY to FULL. Once I did this, I was able to see update messages and consume them.
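For reference, REPLICA IDENTITY is a per-table Postgres setting; for the table above, the statement is:

ALTER TABLE public.publications REPLICA IDENTITY FULL;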
I think the reason is this:
If a table does not have a primary key, the connector does not emit UPDATE or DELETE events for that table. For a table without a primary key, the connector emits only create events. Typically, a table without a primary key is used for appending messages to the end of the table, which means that UPDATE and DELETE events are not useful.
Reference: Debezium PostgreSQL connector documentation.

Proforma SalesInvoice doesn't show data from all tables

In the SalesInvoice SSRS report I have added a table called carTableEquipTmp, which is not there by default and which I insert into along with the other tables (SalesInvoiceTmp and SalesInvoiceHeaderFooterTmp) in SalesInvoiceDP.InsertIntoSalesInvoiceTmp().
Even though my table carTableEquipTmp is successfully inserted into, the data doesn't show up on the report if I print a proforma report.
If I add test values to the carTableEquipTmp table in SalesInvoiceDP.processReport(), they show up on the proforma invoice, but there's no way for me to get the parameters needed to set the correct data into the table at that point. If I stop there in the debugger, none of the data is present, because processReport() is being called from a lower level in the code.
I think it might be a problem with pack/unpack, or that the proforma code runs from a server instance, as the code path for proforma is quite different.
I can see that SalesInvoiceJournalPostBase.CreateReportData() creates an instance of SalesInvoiceDP:
salesInvoiceDP = new SalesInvoiceDP();
salesInvoiceDP.parmDataContract(salesInvoiceContract);
salesInvoiceDP.parmUserConnection(new UserConnection(true));
salesInvoiceDP.createData();
and that this might have something to do with it... but I still can't get the data I want into the carTableEquipTmp table.
So, any idea on how to make AX 2012 accept this new table I have added? It gets inserted into just like the other tables, and there seems to be no problem with the insert itself...
I hope you guys can help.
The SalesInvoice report has two data classes you need to look at for the data provider: SalesInvoiceDP and SalesInvoiceDPBase. SalesInvoiceDPBase extends SrsReportDataProviderPreProcess, so there are a couple of extra steps you need to take in order to add new datasources to the report.
In the SalesInvoiceDP class there is a method called useExistingReportData(), which re-inserts the pro-forma temp table data under a user connection so that the SrsReportDataProviderPreProcess framework will pick it up in your report. When the pro-forma process creates the report data, it doesn't insert with a user connection, so the data doesn't get added to the report. This method only gets called when the report is run pro-forma.
You will need to add your temp table to this method, following the pattern of the other tables, so your code will look something like this:
// This is a different buffer from the one you insert your data with
CarTableEquipTmp localCarTableEquipTmp;
...
recordList = new RecordSortedList(tableNum(CarTableEquipTmp));
recordList.sortOrder(fieldNum(CarTableEquipTmp, RecId));

// You will need to add a field relating your temp table
// to the current invoice journal, and populate it in
// InsertIntoSalesInvoiceTmp() if that's where you're inserting your table.
while select localCarTableEquipTmp
    where localCarTableEquipTmp.JournalRecId == jourRecId
{
    recordList.ins(localCarTableEquipTmp);
}

delete_from localCarTableEquipTmp
    where localCarTableEquipTmp.JournalRecId == jourRecId;

recordList.insertDatabase(this.parmUserConnection());
This method re-inserts your data under the framework's user connection and deletes the original data. The re-inserted data will then get picked up by the framework and show in your report. If you open CarTableEquipTmp in the table browser, you will most likely see data still there from all the times you have tried running the report; this is why we have the delete_from operation after we re-insert the data. When data is inserted under a user connection, it is automatically deleted when the report is finished.
The other method you will want to modify is SalesInvoiceDP.setTableConnections(), where you will just need to add the following line:
CarTableEquipTmp.setConnection(this.parmUserConnection());
This will set the user connection for your table when running regular (not pro-forma) reports. You will probably want to delete the data currently stored in your temp table using Alt+F9 in the table browser.
Other than that, it's all standard RDP stuff, but it sounds like you have that part working fine. Your temp table must be of type "Regular" for this to work.

EF 6 Rollback and Update Best Approach

I have a working solution in place and I'm initiating this thread to have a discussion on the best approach.
Environment : EF6, SQL 2012
Scenario:
I have Task and TaskDetail tables which have a parent/child relationship through TaskID.
Create Method:
While creating a task, I need to ensure an entry is made in the TaskDetail table as well.
First approach:
An entry is made in the Task table. SaveChanges. Get the TaskID and assign it to the DTO which holds the information for the detail table. Pass the DTO to the TaskDetail create method. SaveChanges. Commit. If any error occurs, roll back the entire transaction.
Second Approach:
Add the relevant fields of the Task table. Add the relevant fields of the TaskDetail table as well. Add the new detail object to the Task entity through the navigation property: Task.TaskDetail.Add(newObj). Finally, SaveChanges.
Question 1:
Both approaches yield the same SQL; I couldn't notice much difference. But which would be the best approach for doing this?
Question 2:
Also, if you look at my scenario, you will have noticed that it's a save-all-or-save-none approach. Initially I tried looping through DbEntityEntries and then rolling back the changes, but that only works for the second approach described above, not for the first, since I'm making a save after my insertion in order to get the TaskID. So I finally ended up using the DbContextTransaction introduced in EF 6 (see the sketch below). But what is the best approach?
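For reference, a minimal sketch of how I'm using it, assuming Task and TaskDetail entity classes and a TasksContext DbContext (all names are illustrative):

using (var context = new TasksContext())
using (var transaction = context.Database.BeginTransaction())
{
    try
    {
        var task = new Task { Name = "New task" };
        context.Tasks.Add(task);
        context.SaveChanges();                 // TaskID is generated here

        var detail = new TaskDetail { TaskID = task.TaskID, Active = true };
        context.TaskDetails.Add(detail);
        context.SaveChanges();

        transaction.Commit();                  // save all
    }
    catch
    {
        transaction.Rollback();                // save none
        throw;
    }
}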
Question 3:
Update Method:
While doing an update, as per my requirement, I will not touch the Task table. The update deals with the TaskDetail table alone, though the TaskID is required and will be passed from the UI.
• Get the existing task detail using the task ID and active flag (there is a one-to-many relationship)
• Update the active flag to false
• Create a new entry in the TaskDetail table
I have translated the above statements into code as well (sketch below), but what would be the best approach to handle it?
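A rough sketch of that translation, with entity and property names assumed, and taskId coming from the UI:

using (var context = new TasksContext())
{
    // deactivate the currently active detail rows for this task
    var activeDetails = context.TaskDetails
        .Where(d => d.TaskID == taskId && d.Active)
        .ToList();
    activeDetails.ForEach(d => d.Active = false);

    // append the new detail version
    context.TaskDetails.Add(new TaskDetail
    {
        TaskID = taskId,
        Active = true
    });

    // SaveChanges runs in a single transaction, so this is save-all-or-save-none
    context.SaveChanges();
}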

Determine new record in PreWriteRecord event handler and check value of joined field

There is a custom field "Lock Flag" in the Account BC, namely in the S_ORG_EXT_X table. This field is made available in the Opportunity BC using a join to the above table. The join specification is as follows: Opportunity.Account Id = Account.Id. Account Id is always populated when creating a new opportunity. The requirement is that for newly created records in the Opportunity BC, if "Lock Flag" is equal to 'Y', we should not allow the record to be created and should show a custom error message.
My initial proposal was to use a Runtime Event that calls the Data Validation Manager business service, where a validation rule is evaluated and the error message shown. Assuming that we have to decide whether to write the record or not, the logic should be placed in the PreWriteRecord event handler, since by WriteRecord the row has already been committed to the database.
The main problem was how to determine whether it is a new record or an updated one. We have the WriteRecordNew and WriteRecordUpdated runtime events, but they are fired after the record is actually written, so they don't prevent the user from saving the record. My next approach was to use eScript: write custom code in the BusComp_PreWriteRecord server script and call the BC's IsNewRecordPending method to determine if it is a new record, then check the flag and show an error message if needed, roughly like the sketch below.
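A sketch of what I had in mind (the exact return value of IsNewRecordPending is an assumption on my part):

function BusComp_PreWriteRecord ()
{
    // Assumption: IsNewRecordPending returns "Y" while a new record is pending
    var isNew = this.InvokeMethod("IsNewRecordPending");
    if (isNew == "Y" && this.GetFieldValue("Lock Flag") == "Y")
    {
        // RaiseErrorText aborts the save with the custom error message
        TheApplication().RaiseErrorText("The account is locked; the opportunity cannot be created.");
    }
    return (ContinueOperation);
}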
But unfortunately I am faced with another problem: that joined field "Lock Flag" is not populated for newly created opportunity records. Remember, we are talking about the Opportunity BC, and the field lives in the S_ORG_EXT_X table. When we create a new opportunity, we pick the account it belongs to. It is reproducible: OpportunityBC.GetFieldValue("Lock Flag") returns null for a newly created record and returns the correct value for records that were saved previously. For newly created opportunities we have to re-query the BC to see "Lock Flag" populated. I have found several documents, including Oracle's recommendation to use the PreDefaultValue property if we want to display a joined field value immediately after record creation. The most suitable expression I've found was Parent: BCName.FieldName, but it is not applicable here, because the active BO is Opportunity and the Opportunity BC is the primary one.
Thanks for your patience if you have read up to here; finally, here are my questions:
Is there any way to handle PreWrite event and determine if it is new record or not, without using eScript and BC.IsNewRecordPending method?
How to get value of joined field for newly created record especially in PreWriteRecord event handler?
It is Siebel 8.1
UPDATE: I have found an answer to the first part of my question. Now it seems so simple that I wonder why I didn't do it initially. Here is the solution.
Create a Runtime Event triggered on PreWriteRecord. Specify a call to the Data Validation Manager business service.
In DVM, create a ruleset and a rule where the condition is
NOT(BCHasRows("Opportunity", "Opportunity", "[Id]='"+[Id]+"'", "AllView"))
That's it. We are searching for a record with the same Row Id. If it is a new record, there shouldn't be anything in the database yet (remember that we are in the PreWriteRecord handler) and the function returns FALSE. If we are updating some row, we get TRUE. Reversing the result with NOT makes DVM raise an error for new records.
As for the second part of my question, credit goes to @RanjithR, who proposed using a PickMap to populate the joined field (see below). I have checked that method and it works fine, at least when you have an appropriate PickMap.
We Siebel developers have used scripting to correctly determine if a record is new. One non-scripting way you could try is to use Runtime Events to set a profile attribute during the BusComp NewRecord event, then check that in the PreWrite event to see if the record is new. However, there is always a chance that the user might undo a record; those scenarios are tricky.
Another option: try invoking the BC method IsNewRecordPending from a Runtime Event. I haven't tried this.
For the second part of the query, I think you could easily solve your problem using a PickMap.
On the Opportunity BC, when you pick the Account, just add one more PickMap entry to copy the lock flag from Account into the corresponding field on the Opportunity BC. When the user picks the Account, he will also pick up the lock flag, and your script will work in PreWriteRecord.
May I suggest another solution? Again, I haven't tried it.
When new records are created, the ModificationNumber field will be set to 0. Every time you modify the record, the ModificationNumber will increment by 1.
Set up a Data Validation Manager ruleset and trigger it from the PreSetFieldValue event of the Account field on the Opportunity BC. Check for LockFlag = Y AND (ModificationNumber IS NULL OR ModificationNumber = 0) and throw the error. DVM should throw the error when new records are created.
Again, best practices say don't use the modification numbers. You could set a profile attribute to signal a new record, then use that attribute in the DVM. But please remember to clear the value of the profile attribute in WriteRecord and UndoRecord.
Let us know how it went!
