I'm currently examining Pulsar JDBC sinks, as we plan to use a PostgreSQL sink soon.
Now, it's mentioned that JDBC sinks support insert/update/delete ops, but I wasn't able to find any documentation on HOW the sink connector actually decides on WHAT to execute (is it an insert, an update or a delete for a new event?)
After browsing the source code and looking into JdbcAbstractSink.java, I think I have an idea now, but I need confirmation that it is right.
Please tell me if this is correct:
1.) There need to be 3 different topics for 1 db entity type. One topic for inserting the entity-type into a table, one for updating same entity-type, one for deletions. Also there need to be 3 different sink connectors, each one having a different configuration.
2.) The command decision is made by configuration properties:
if both nonKey and key properties are missing --> insert is executed
if both nonKey and key props are provided --> update is executed, as in
update nonKey columns where key column(s) = event.value
if only key columns are provided -->
delete where key column = event.value
Is this the way it's done?
In the source class mentioned above there is this bit of code:
for (Record<T> record : swapList) {
    String action = record.getProperties().get(ACTION);
    if (action == null) {
        action = INSERT;
    }
    switch (action) {
        case DELETE: ...
        case UPDATE: ...
but nowhere is it mentioned where and how the ACTION property of the record is set...
If I just missed the relevant documentation somehow, I would appreciate a link.
I know about this configuration doc page: https://pulsar.apache.org/docs/en/io-jdbc-sink/#configuration
but it's very vague and there are no real examples.
The documentation for this connector is lacking, to say the least, so I will do my best to explain it. As you can see from the code, the action to take (insert, update, or delete) is passed in as a property on the Pulsar message itself:
String action = record.getProperties().get(ACTION);
Therefore, to control the action taken by the sink, you need to add that property to each message you publish to the input topic of the JDBC sink connector (unless you want the action to be INSERT, which is the default).
Here is an example of how to publish a message with a different action in message properties:
producer.newMessage().value("1234").property("action", "delete").send();
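Now when the JDBC sink connector reads this message, it will perform a DELETE operation on the record with the primary key value of "1234". Note that the lookup and the switch in the quoted code are both case-sensitive, so the property key and value have to match the ACTION and DELETE constants defined in JdbcAbstractSink.java exactly; check their values in the connector version you run.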
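To make the dispatch concrete, here is a minimal, self-contained sketch of the logic quoted from JdbcAbstractSink.java. It does not use the Pulsar API: the class ActionDispatchSketch and the method resolveAction are hypothetical, the constant values ("ACTION", "INSERT", "UPDATE", "DELETE") are assumptions to verify against the source of your connector version, and the handling of unknown actions is a guess.

```java
import java.util.Map;

// Hypothetical stand-in for the sink's dispatch logic; this is not Pulsar code.
class ActionDispatchSketch {
    // Assumed constant values; verify against JdbcAbstractSink.java.
    static final String ACTION = "ACTION";
    static final String INSERT = "INSERT";
    static final String UPDATE = "UPDATE";
    static final String DELETE = "DELETE";

    // Mirrors the quoted loop body: a missing property falls back to INSERT.
    static String resolveAction(Map<String, String> messageProperties) {
        String action = messageProperties.get(ACTION);
        if (action == null) {
            action = INSERT;
        }
        switch (action) {
            case DELETE:
                return DELETE;
            case UPDATE:
                return UPDATE;
            case INSERT:
                return INSERT;
            default:
                // Assumption: unknown actions are rejected rather than ignored.
                throw new IllegalArgumentException("Unsupported action: " + action);
        }
    }
}
```

Under these assumptions, a message without the property resolves to INSERT, and so does a message whose property key differs only in case, because the map lookup is exact.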
I'm using CRM 2013 and I would like to return a large number of custom entities (around 100) based on values in a list.
I can't do context.MyCustomEntity.Where(i=>list.Contains(i.Id));
I can't use RetrieveMultiple since I want to update these entities and send them back to the server.
So I'm forced to call context.MyCustomEntity.Where(i=>i.Id == id) in a loop.
Is there some way to prefetch the entities from the context? Or to call execute on the where clause in a loop?
If you already have the IDs and you don't need to reference any existing values on the records, you can just update the records. Here's an example:
var toUpdate = new MyCustomEntity{Id=id};
// update appropriate values
orgService.Update(toUpdate);
However, if you do need to update the fields dynamically, you could chunk the retrievals by using PredicateBuilder to do a dynamic OR condition, but you are limited on the number of criteria you can add to your query. Let me know if this is a path you need to go down, and I can provide more guidance.
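The chunking step itself is independent of the CRM SDK and can be sketched on its own. The helper below is written in Java purely for illustration (the same loop carries over to C# directly); the name IdChunker.chunk is hypothetical, and the batch size is an assumption you would tune to stay under the criteria limit. Each batch would then feed one OR-style query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list of ids into fixed-size batches.
class IdChunker {
    static <T> List<List<T>> chunk(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }
}
```

With around 100 ids and a batch size of 25, this yields four queries instead of 100 single-record retrievals.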
I have a stored procedure that loops through a table and may insert some records into that table. It's working fine; I can see the changes in the database using Management Studio.
The problem is that afterwards I call another stored procedure, which returns a collection, but it always returns a cached value or something like that. The latest changes in the database are not reflected in the returned list. Any ideas?
EDIT
I am importing the stored procedure as a function using EF. All the operations I perform go through EF.
Check the following code:
TraktorumEntities db = new TraktorumEntities();
var test = db.GetAvailableAttributes(CategoryID).ToList(); // here I get cached values. How can I force it to fetch data from the database?
If you are querying using the same key, EF will have your results cached.
Note the section of "MergeOption.OverwriteChanges" here
Walkthrough: Mapping an Entity to Stored Procedures (Entity Data Model Tools)
You need to tell EF to 'get new data and overwrite the locally stored version' with this option.
Also you don't really tell us exactly how you are querying this data either. Is this a mapped stored procedure (mapped to an entity operation) or calling it directly on the context, or....?
EDIT
Try something along these lines
var test = db.GetAvailableAttributes(CategoryID);
test.MergeOption = MergeOption.NoTracking;
var results = test.ToList();
Did you leave some Database.SetInitializer in the Global Application_Start?
I am using an IList<Employee> that holds more than 5,000 records retrieved with LINQ. What would be a better approach? empdetailsList has 5,000 items.
Example :
foreach (Employee emp in empdetailsList)
{
    Employee employee = new Employee();
    employee = Details.GetFeeDetails(emp.Emplid);
}
The above example takes a lot of time to iterate over empdetailsList, because I need to get the corresponding fee details for each employee.
Can anybody suggest what to do?
Linq to SQL/Linq to Entities use a deferred execution pattern. As soon as you call For Each or anything else that indirectly calls GetEnumerator, that's when your query gets translated into SQL and performed against the database.
The trick is to make sure your query is completely and correctly defined before that happens. Use Where(...), and the other Linq filters to reduce as much as possible the amount of data the query will retrieve. These filters are built into a single query before the database is called.
Linq to SQL/Linq to Entities also both use Lazy Loading. This is where if you have related entities (like Sales Order --> has many Sales Order Lines --> has 1 Product), the query will not return them unless it knows it needs to. If you did something like this:
Dim orders = entities.SalesOrders
For Each o In orders
    For Each ol In o.SalesOrderLines
        Console.WriteLine(ol.Product.Name)
    Next
Next
You will get awful performance, because at the time of calling GetEnumerator (the start of the For Each), the query engine doesn't know you need the related entities, so "saves time" by ignoring them. If you observe the database activity, you'll then see hundreds/thousands of database roundtrips as each related entity is then retrieved 1 at a time.
To avoid this problem, if you know you'll need related entities, use the Include() method in Entity Framework. If you've got it right, when you profile the database activity you should only see a single query being made, and every item being retrieved by that query should be used for something by your application.
If the call to Details.GetFeeDetails(emp.Emplid); involves another round-trip of some sort, then that's the issue. I would suggest altering your query in this case to return fee details with the original IList<Employee> query.
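The difference between one lookup per employee and a single batched lookup can be illustrated with a small simulation. This is only a sketch, written in Java for illustration: FeeStore is a hypothetical in-memory stand-in for the database, and it counts calls instead of performing real round-trips.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the database; counts simulated round-trips.
class FeeStore {
    private final Map<Integer, String> feesByEmployeeId = new HashMap<>();
    int roundTrips = 0;

    FeeStore(Map<Integer, String> seed) {
        feesByEmployeeId.putAll(seed);
    }

    // One simulated round-trip per employee (the pattern in the question).
    String getFee(int employeeId) {
        roundTrips++;
        return feesByEmployeeId.get(employeeId);
    }

    // One simulated round-trip for the whole list of ids.
    Map<Integer, String> getFees(List<Integer> employeeIds) {
        roundTrips++;
        Map<Integer, String> result = new HashMap<>();
        for (int id : employeeIds) {
            if (feesByEmployeeId.containsKey(id)) {
                result.put(id, feesByEmployeeId.get(id));
            }
        }
        return result;
    }
}
```

Calling getFee in a loop over 5,000 employees would register 5,000 round-trips, whereas getFees registers one regardless of the list size.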
Our development policy dictates that all database accesses are made via stored procedures, and this is creating an issue when using LINQ.
The scenario discussed below has been somewhat simplified, in order to make the explanation easier.
Consider a database that has 2 tables.
Orders (OrderID (PK), InvoiceAddressID (FK), DeliveryAddressID (FK) )
Addresses (AddressID (PK), Street, ZipCode)
The resultset returned by the stored procedure has to rename the address related columns, so that the invoice and delivery addresses are distinct from each other.
OrderID  InvAddrID  DelAddrID  InvStreet  DelStreet  InvZipCode  DelZipCode
1        27         46         Main St    Back St    abc123      xyz789
This, however, means that LINQ has no idea what to do with these columns in the resultset, as they no longer match the property names in the Address entity.
The frustrating thing about this is that there seems to be no way to define which resultset columns map to which Entity properties, even though it is possible (to a certain extent) to map entity properties to stored procedure parameters for the insert/update operations.
Has anybody else had the same issue?
I'd imagine that this would be a relatively common scenario, from a schema point of view, but the stored procedure seems to be the key factor here.
Have you considered creating a view like the below for the stored procedure to select from? It would add complexity, but allow LINQ to see the Entity the way you wanted.
Create view OrderAddress as
Select o.OrderID
      ,i.AddressID as InvID
      ,d.AddressID as DelID
      ...
from Orders o
left join Addresses i
  on o.InvoiceAddressID = i.AddressID
left join Addresses d
  on o.DeliveryAddressID = d.AddressID
LINQ is a bit fussy about querying data; it wants the schema to match. I suspect you're going to have to bring that back into an automatically generated type, and do the mapping to your entity type afterwards in LINQ to Objects (i.e. after AsEnumerable() or similar), as it doesn't like you creating instances of the mapped entities manually inside a query.
Actually, I would recommend challenging the requirement in one respect: rather than SPs, consider using UDFs to query data; they work similarly in terms of being owned by the database, but they are composable at the server (paging, sorting, joinable, etc).
(this bit is a bit random - take it with a pinch of salt)
UDFs can be associated with entity types if the schema matches, so another option (I haven't tried it) would be to have a GetAddress(id) udf, and a "main" udf, and join them:
var qry = from row in ctx.MainUdf(id)
          select new {
              Order = ctx.GetOrder(row.OrderId),
              InvoiceAddress = ctx.GetAddress(row.InvoiceAddressId),
              DeliveryAddress = ctx.GetAddress(row.DeliveryAddressId) };
(where the udf just returns the ids - actually, you might have the join to the other udfs, making it even worse).
or something - might be too messy for serious consideration, though.
If you know exactly what columns your result set will include, you should be able to create a new entity type that has properties for each column in the result set. Rather than trying to pack the data into an Order, for example, you can pack it into an OrderWithAddresses, which has exactly the structure your stored procedure returns. If you're using LINQ to Entities, you should even be able to indicate in your .edmx file that an OrderWithAddresses is an Order with two additional properties. In LINQ to SQL you will have to specify all of the columns as if it were an entirely unrelated data type.
If your columns get generated dynamically by the stored procedure, you will need to try a different approach: Create a new stored procedure that only pulls data from the Orders table, and one that only pulls data from the addresses table. Set up your LINQ mapping to use these stored procedures instead. (Of course, the only reason you're using stored procs is to comply with your company policy). Then, use LINQ to join these data. It should be only slightly less efficient, but it will more appropriately reflect the actual structure of your data, which I think is better programming practice.
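The client-side join described above amounts to indexing the addresses by ID and looking each order's two addresses up in that index. Here is a minimal sketch of that idea, in Java purely for illustration; all type and member names (OrderJoinSketch, Order, Address, OrderWithAddresses, join) are hypothetical and only loosely follow the tables in the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types; they stand in for the results of the two stored procedures.
class OrderJoinSketch {
    static class Address {
        final int addressId;
        final String street;
        final String zipCode;
        Address(int addressId, String street, String zipCode) {
            this.addressId = addressId;
            this.street = street;
            this.zipCode = zipCode;
        }
    }

    static class Order {
        final int orderId;
        final int invoiceAddressId;
        final int deliveryAddressId;
        Order(int orderId, int invoiceAddressId, int deliveryAddressId) {
            this.orderId = orderId;
            this.invoiceAddressId = invoiceAddressId;
            this.deliveryAddressId = deliveryAddressId;
        }
    }

    static class OrderWithAddresses {
        final Order order;
        final Address invoiceAddress;
        final Address deliveryAddress;
        OrderWithAddresses(Order order, Address invoiceAddress, Address deliveryAddress) {
            this.order = order;
            this.invoiceAddress = invoiceAddress;
            this.deliveryAddress = deliveryAddress;
        }
    }

    // Index addresses by id once, then resolve both foreign keys per order.
    static List<OrderWithAddresses> join(List<Order> orders, List<Address> addresses) {
        Map<Integer, Address> byId = new HashMap<>();
        for (Address a : addresses) {
            byId.put(a.addressId, a);
        }
        List<OrderWithAddresses> joined = new ArrayList<>();
        for (Order o : orders) {
            joined.add(new OrderWithAddresses(
                o, byId.get(o.invoiceAddressId), byId.get(o.deliveryAddressId)));
        }
        return joined;
    }
}
```

An address missing from the second result set shows up as null here, which corresponds to the behaviour of a left join.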
I think I understand what you're after, but I could be wildly off...
If you mock up classes in a DBML (right-click -> new -> class) that are the same structure as your source tables, you could simply create new objects based on what is read from the stored procedure. Using LINQ to objects, you could still query your selection. It's more code, but it's not that hard to do. For example, mock up your DBML like this:
Pay attention to the associations http://geeksharp.com/screens/orders-dbml.png
Make sure you pay attention to the associations I added. You can expand "Parent Property" and change the name of those associations to "InvoiceAddress" and "DeliveryAddress." I also changed the child property names to "InvoiceOrders" and "DeliveryOrders" respectively. Notice the stored procedure up top called "usp_GetOrders." Now, with a bit of code, you can map the columns manually. I know it's not ideal, especially if the stored proc doesn't expose every member of each table, but it can get you close:
public List<Order> GetOrders()
{
    // our DBML classes
    List<Order> dbOrders = new List<Order>();
    using (OrderSystemDataContext db = new OrderSystemDataContext())
    {
        // call stored proc
        var spOrders = db.usp_GetOrders();
        foreach (var spOrder in spOrders)
        {
            Order ord = new Order();
            Address invAddr = new Address();
            Address delAddr = new Address();
            // set all the properties
            ord.OrderID = spOrder.OrderID;
            // add the invoice address
            invAddr.AddressID = spOrder.InvAddrID;
            invAddr.Street = spOrder.InvStreet;
            invAddr.ZipCode = spOrder.InvZipCode;
            ord.InvoiceAddress = invAddr;
            // add the delivery address
            delAddr.AddressID = spOrder.DelAddrID;
            delAddr.Street = spOrder.DelStreet;
            delAddr.ZipCode = spOrder.DelZipCode;
            ord.DeliveryAddress = delAddr;
            // add to the collection
            dbOrders.Add(ord);
        }
    }
    // at this point I have a List of orders I can query...
    return dbOrders;
}
Again, I realize this seems cumbersome, but I think the end result is worth a few extra lines of code.
This isn't very efficient at all, but if all else fails, you could try making two procedure calls from the application: one to get the invoice address and another to get the delivery address.