I have a WCF named pipe service that receives a byte array and writes it into an SQLite DB.
When I moved the SQLite insert logic into the WCF service the write performance decreased almost by half.
I went through various recommendation online but nothing seems to help.
My current configuration looks like this:
pipeBinding.MaxBufferPoolSize = 5000000;
pipeBinding.MaxBufferSize = 5000000;
pipeBinding.MaxReceivedMessageSize = 5000000;
pipeBinding.ReaderQuotas.MaxArrayLength = 5000000;
pipeBinding.Security.Transport.ProtectionLevel = ProtectionLevel.None;
More tweaking recommendations would be more than welcome.
Using protobuf helped increasing the speed however most consuming action was a sum action on the SQLite table so I had change the structure of my db.
Related
My google script web app is recently hitting qps limits. What would be a better way to improve performance.
I have about 50 active users. I use 15,000 rows google spreadsheet as a database and my app is serving json data requested by users from this spreadsheet. I use long-poll to keep connection alive for 5 min and close it if no update in spreadsheet happens. Then client reconnects. Web App is published to be executed as me.
My polling works like this:
function doGet(e){
var userHasVersion = e.parameter.userVersion
while (runningTime < 300001) {
var currentServerVersion = parseInt(cache.get("currentVersion"),10)
if(userVersion<currentServerVersion){
var returndata = []
for(var i = userVersion+1; i <= currentServerVersion;i++){
var newData = cache.get(i)
if(newData!=null){returnData.push(JSON.parse(cache.get(newData)))}
}
return ContentService.createTextOutput(JSON.stringify({currentServerVersion,data:returnData })).setMimeType(ContentService.MimeType.JSON);
} else {
Utilities.sleep(20000)
}
runningTime = calculateRunningTime()
}
}
What I have tried so far:
1) I optimized requests with CacheService to reduce calls to Spreadsheet. It helped for few months, but now I'm getting qps errors more and more often.
2) Asking Google team about quotas. They explained me, that there is no published quotas/limits for simultanous executions and they are subject to change without notice. They advised further usage of cacheService and better error handling.
I think to switch from long-polling to short-polling. But it feels like drawback. Should I try to further optimize performance or move to another service?
Would trying to use "execute app as user accessing the app" help? (users should use the same database)
Is Google Script API Executable different from Web App? It looks like it might fit, but I'm not sure if they share the same qps quotas.
I'm also considering GAE service, but I'd like to avoid going over free quota.
Any advice will be much appreciated!
I think that a following part can be improved. When data is retrieved from cache service, getAll() is more effectively than get(). I have ever measured the difference. That is about 890 times faster than get(). If the number of data retrieving from cache service is large, I think that the improvement of this part is important for performance.
Your script :
var returndata = []
for(var i = userVersion+1; i <= currentServerVersion;i++){
var newData = cache.get(i)
if(newData!=null){returnData.push(JSON.parse(cache.get(newData)))}
}
Improved script :
var ar = [];
for(var i = userVersion+1; i <= currentServerVersion;i++){
ar.push([i]);
}
var r = JSON.parse(JSON.stringify(cache.getAll(ar))); // Since key is number, I used this.
var returnData = [r[j] for each (j in r)if (!r[j])];
Since I cannot see your data, I cannot confirm this execution. So if errors occur, please tell me.
If I misunderstand your question, I'm sorry.
I get a file with 4000 entries and debatch it, so i dont lose the whole message if one entry has corrupting data.
The Biztalkmap is accessing an SQL server, before i debatched the Message I simply cached the SLQ data in the Map, but now i have 4000 indipendent maps.
Without caching the process takes about 30 times longer.
Is there a way to cache the data from the SQL Server somewhere out of the Map without losing much Performance?
It is not a recommendable pattern to access a database in a Map.
Since what you describe sounds like you're retrieving static reference data, another option is to move the process to an Orchestration where the reference data is retrieved one time into a Message.
Then, you can use a dual input Map supplying the reference data and the business message.
In this patter, you can either debatch in the Orchestration or use a Sequential Convoy.
I would always avoid accessing SQL Server in a map - it gets very easy to inadvertently make many more calls than you intend (whether because of a mistake in the map design or because of unexpected volume or usage of the map on a particular port or set of ports). In fact, I would generally avoid making any kind of call in a map that has to access another system or service, but if you must, then caching can help.
You can cache using, for example, MemoryCache. The pattern I use with that generally involves a custom C# library where you first check the cache for your value, and if there's a miss you check SQL (either for the paritcular entry or the entire cache, e.g.:
object _syncRoot = new object();
...
public string CheckCache(string key)
{
string check = MemoryCache.Default.Get(key) as string;
if (check == null)
{
lock (_syncRoot)
{
// make sure someone else didn't get here before we acquired the lock, avoid duplicate work
check = MemoryCache.Default.Get(key) as string;
if (check != null) return check;
string sql = #"SELECT ...";
using (SqlConnection conn = new SqlConnection(connStr))
{
conn.Open();
using (SqlCommand cmd = conn.CreateCommand())
{
cmd.CommandText = sql;
cmd.Parameters.AddWithValue(...);
// ExecuteScalar or ExecuteReader as appropriate, read values out, store in cache
// use MemoryCache.Default.Add with sensible expiration to cache your data
}
}
}
}
else
{
return check;
}
}
A few things to keep in mind:
This will work on a per AppDomain basis, and pipelines and orchestrations run on separate app domains. If you are executing this map in both places, you'll end up with caches in both places. The complexity added in trying to share this accross AppDomains is probably not worth it, but if you really need that you should isolate your caching into something like a WCF NetTcp service.
This will use more memory - you shouldn't just throw everything and anything into a cache in BizTalk, and if you're going to cache stuff make sure you have lots of available memory on the machine and that BizTalk is configured to be able to use it.
The MemoryCache can store whatever you want - I'm using strings here, but it could be other primitive types or objects as well.
I'm working with Oracle BPMN (Fusion middleware), using JDeveloper to create BPMN processes, and writing Java code for a custom page to display the flow diagram for running processes. The problem being encountered is that the BPMN diagrams do not display/update until they hit certain trigger events (apparently asynchronous event points). So in many cases the diagrams do not even show up in a query until the BPMN process completes. Note we don't normally have user input tasks, which qualify as async events and also result in the diagram then showing up.
Our team has talked to Oracle about it and their solution was to wrap every BPMN call (mostly service calls) in asynchronous BPEL wrappers, so that the BPMN calls an async request/response (thus as two actions) that calls the service. Doing this does work, but it adds a huge overhead to the effort of developing BPMN processes, because every action has to be wrapped.
So I'm wondering if anyone else has explored or potentially solved this problem.
Some code snippets of what we're doing (partial code only):
To get the running instance IDs:
List<Column> columns = new ArrayList<Column>();
columns.add(...); // repeated for all relevant fields
Ordering ...
Predicate ...
IInstanceQueryInput input = new IInstanceQueryInput();
List<IProcessInstance> instances = client.getInstanceQueryService().queryProcessInstances(context, columns, predicate, ordering, input);
// however, instances doesn't return the instance until the first async event, or until completion
After that the AuditProcessDiagrammer is used to get the flow diagram, and DiagramEvents uesd to update / highlight the flow in progress. The instanceId does show up in the Oracle fusion control panel, so it must at least potentially be available. But trying to get an image for it results in a null image:
IProcessInstance pi = client.getInstanceQueryService().getProcessInstance(context, instance);
// HERE --> pi is null until the image is available (so the rest of this isn't run)
String compositeDn = pi.getSca().getCompositeDN();
String componentName = pi.getSca().getComponentName();
IProcessModelPackage package = client.getProcessModelService().getProcessModel(context, compositeDn, componentName);
ProcessDiagramInfo info = new ProcessDiagramInfo();
info.setModelPackage(package);
AuditProcessDiagrammer dg = new AuditProcessDiagrammer(info.getModelPackage().getProcessModel().getProcess());
List<IAuditInstance> audits = client.getInstanceQueryService().queryAuditInstanceByProcessId(context, instance);
List<IDiagramEvent> events = // function to get these
dg.highlight(events);
String base64image = dg.getImage();
See the HERE --> part. That's where I need instance to be valid.
If there are good alternatives (setting, config, etc...) that others have successfully used, I'd love to hear it. I'm really not interested in strange workarounds (already have that in the BPEL wrapper). I'm looking for a solution that allows the BPMN process flow to remain simple. Thanks.
I have the need to cache a collection of objects that is mostly static (might have changes 1x per day) that is avaliable in my ASP.NET Web API OData service. This result set is used across calls (meaning not client call specific) so it needs to be cached at the application level.
I did a bunch of searching on 'caching in Web API' but all of the results were about 'output caching'. That is not what I'm looking for here. I want to cache a 'People' collection to be reused on subsequent calls (might have a sliding expiration).
My question is, since this is still just ASP.NET, do I use traditional Application caching techniques for persisting this collection in memory, or is there something else I need to do? This collection is not directly returned to the user, but rather used as the source behind the scenes for OData queries via API calls. There is no reason for me to go out to the database on every call to get the exact same information on every call. Expiring it hourly should suffice.
Any one know how to properly cache the data in this scenario?
The solution I ended up using involved MemoryCache in the System.Runtime.Caching namespace. Here is the code that ended up working for caching my collection:
//If the data exists in cache, pull it from there, otherwise make a call to database to get the data
ObjectCache cache = MemoryCache.Default;
var peopleData = cache.Get("PeopleData") as List<People>;
if (peopleData != null)
return peopleData ;
peopleData = GetAllPeople();
CacheItemPolicy policy = new CacheItemPolicy {AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(30)};
cache.Add("PeopleData", peopleData, policy);
return peopleData;
Here is another way I found using Lazy<T> to take into account locking and concurrency. Total credit goes to this post: How to deal with costly building operations using MemoryCache?
private IEnumerable<TEntity> GetFromCache<TEntity>(string key, Func<IEnumerable<TEntity>> valueFactory) where TEntity : class
{
ObjectCache cache = MemoryCache.Default;
var newValue = new Lazy<IEnumerable<TEntity>>(valueFactory);
CacheItemPolicy policy = new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(30) };
//The line below returns existing item or adds the new value if it doesn't exist
var value = cache.AddOrGetExisting(key, newValue, policy) as Lazy<IEnumerable<TEntity>>;
return (value ?? newValue).Value; // Lazy<T> handles the locking itself
}
Yes, output caching is not what you are looking for. You can cache the data in memory with MemoryCache for example, http://msdn.microsoft.com/en-us/library/system.runtime.caching.memorycache.aspx . However, you will lose that data if the application pool gets recycled. Another option is to use a distributed cache like AppFabric Cache or MemCache to name a few.
i'm batching updates with jdbc
ps = con.prepareStatement("");
ps.addBatch();
ps.executeBatch();
but in the background it seems, that the prostgres driver sends the query bit by bit to the database.
org.postgresql.core.v3.QueryExecutorImpl:398
for (int i = 0; i < queries.length; ++i)
{
V3Query query = (V3Query)queries[i];
V3ParameterList parameters = (V3ParameterList)parameterLists[i];
if (parameters == null)
parameters = SimpleQuery.NO_PARAMETERS;
sendQuery(query, parameters, maxRows, fetchSize, flags, trackingHandler);
if (trackingHandler.hasErrors())
break;
}
is there a possibility to let him send 1000 a time to speed it up?
AFAIK is no server-side batching in the fe/be protocol, so PgJDBC can't use it.. Update: Well, I was wrong. PgJDBC (accurate as of 9.3) does send batches of queries to the server if it doesn't need to fetch generated keys. It just queues a bunch of queries up in the send buffer without syncing up with the server after each individual query.
See:
Issue #15: Enable batching when returning generated keys
Issue #195: PgJDBC does not pipeline batches that return generated keys
Even when generated keys are requested the extended query protocol is used to ensure that the query text doesn't need to be sent every time, just the parameters.
Frankly, JDBC batching isn't a great solution in any case. It's easy to use for the app developer, but pretty sub-optimal for performance as the server still has to execute every statement individually - though not parse and plan them individually so long as you use prepared statements.
If autocommit is on, performance will be absolutely pathetic because each statement triggers a commit. Even with autocommit off running lots of little statements won't be particularly fast even if you could eliminate the round-trip delays.
A better solution for lots of simple UPDATEs can be to:
COPY new data into a TEMPORARY or UNLOGGED table; and
Use UPDATE ... FROM to UPDATE with a JOIN against the copied table
For COPY, see the PgJDBC docs and the COPY documentation in the server docs.
You'll often find it's possible to tweak things so your app doesn't have to send all those individual UPDATEs at all.