Drools - Creating a KnowledgeBuilder Faster? - performance

I have a use case where I need to create a rule dynamically for one-time use only. The rules are dynamic and, in the worst case, can be different for each iteration/request, so I cannot store them.
My current understanding of Drools is that you need to create a KnowledgeBuilder and then add the rules.
I'm creating the KnowledgeBase as follows:
private KnowledgeBase readKnowledgeBase() throws Exception {
    long t1 = System.currentTimeMillis();
    KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
    long t11 = System.currentTimeMillis();
    kbuilder.add( org.drools.io.ResourceFactory.newByteArrayResource(getRule()), ResourceType.DRL);
    long t2 = System.currentTimeMillis();
    KnowledgeBuilderErrors errors = kbuilder.getErrors();
    if (errors.size() > 0) {
        for (KnowledgeBuilderError error: errors) {
            System.err.println(error);
        }
        throw new IllegalArgumentException("Could not parse knowledge.");
    }
    long t3 = System.currentTimeMillis();
    KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
    long t4 = System.currentTimeMillis();
    kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());
    long t5 = System.currentTimeMillis();
    Collection<KnowledgePackage> kpackages = kbase.getKnowledgePackages();
    for (KnowledgePackage knowledgePackage : kpackages) {
        System.out.println("Package -------- " + knowledgePackage.getName());
        Collection<Rule> rules = knowledgePackage.getRules();
        for (Rule rule : rules) {
            System.out.println("****" + rule.getName());
        }
    }
    long t6 = System.currentTimeMillis();
    System.out.println( (t11-t1) + " " + (t2-t11) + " " + ( t3-t2) + " " + ( t4-t3) + " " + ( t5-t4) + " " + ( t6-t5)+ " " );
    return kbase;
}
Based on the timing logs, the code spends the majority (more than 80%) of its time in just these two operations. Is there a way to make them faster, or is there another way of achieving my use case?
KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
kbuilder.add( org.drools.io.ResourceFactory.newByteArrayResource(getRule()), ResourceType.DRL);

The following call actually compiles your code, i.e., it calls the JDT Java compiler:
kbuilder.add( org.drools.io.ResourceFactory.newByteArrayResource(getRule()), ResourceType.DRL);
There is no way of avoiding that unless you use internal APIs, which are in fact very complicated to use.
There is an intermediate step, which is to generate the canonical model using the descriptor builder API. It will save you a few cycles, but not many, as the heaviest operation is the Java compilation. Examples of using the Descr Builder API:
https://github.com/droolsjbpm/drools/blob/master/drools-compiler/src/test/java/org/drools/lang/api/DescrBuilderTest.java
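Since compilation is the expensive step, one thing that might help outside the worst case is to reuse the compiled result whenever the same rule text shows up again. The following is only a sketch under that assumption: CompiledRuleCache is a hypothetical helper, not part of Drools, and the function passed in stands for whatever builds your KnowledgeBase from the DRL text. It never evicts entries, so a real version would need a size bound.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not a Drools class): remembers the compiled form
 * of each distinct rule text so identical rules are compiled only once.
 */
class CompiledRuleCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;

    /** @param compiler the expensive step, e.g. building a knowledge base from DRL text */
    CompiledRuleCache(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    /** Returns the cached result for this rule text, compiling it on first use. */
    T get(String ruleText) {
        return cache.computeIfAbsent(ruleText, compiler);
    }

    /** Number of distinct rule texts compiled so far. */
    int size() {
        return cache.size();
    }
}
```

If every request really carries a different rule, this gains nothing; it only pays off when some rule texts repeat.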

Related

Bing custom search apis returning only limited results from one location and full result from different location

I am trying to use Bing Custom Search's API for documents from Cognitive Services. The strange thing is that when I run it from India, it gives me more than a thousand results, but when I run it from a US server, it returns only 25 (sometimes 50 results). Here is the sample code for that:
var totalCount = 0;
var filetypes = new List<string> { "pdf", "docx", "doc" };
foreach (var filetype in filetypes)
{
    var searchTerm = "microsoft%20.net%20resume+filetype%3a" + filetype;
    Console.WriteLine("Searching for : " + filetype);
    for (var i = 0; i < 40; i++)
    {
        var nextCount = 0;
        var url = "https://api.cognitive.microsoft.com/bingcustomsearch/v7.0/search?" +
            "q=" + searchTerm +
            "&customconfig=" + customConfigId +
            "&count=25" + "&offset=" + ((i * 25) + nextCount);
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
            var httpResponseMessage = client.GetAsync(url).Result;
            var responseContent = httpResponseMessage.Content.ReadAsStringAsync().Result;
            BingCustomSearchResponse response =
                JsonConvert.DeserializeObject<BingCustomSearchResponse>(responseContent);
            if (response.webPages == null || response.webPages.value.Length <= 0)
            {
                Console.WriteLine("response.webPages is null ");
                break;
            }
            foreach (var webPage in response.webPages.value)
            {
                Console.WriteLine("name: " + webPage.name);
                Console.WriteLine("url: " + webPage.url);
                Console.WriteLine("displayUrl: " + webPage.displayUrl);
                Console.WriteLine("snippet: " + webPage.snippet);
                Console.WriteLine("dateLastCrawled: " + webPage.dateLastCrawled);
                Console.WriteLine();
            }
            totalCount = totalCount + response.webPages.value.Length;
        }
    }
}
The subscription key I am using is a trial key.
I found the reason for this behavior. It actually had nothing to do with region, country, or market.
After looking into the response, I found this message:
"Rate limit is exceeded. Try again in 1 seconds"
It means that after each call in the loop I have to wait one second before making the next call. What I still need to know is whether this limit applies only to trial subscriptions or to all calls, e.g. to prevent DDoS attacks.
It may have worked from India because each iteration was already taking a second or more there.
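To illustrate the "wait before the next call" idea, here is a minimal sketch in Java (the question's code is C#, so treat it as a description of the technique rather than drop-in code). MinIntervalThrottle is a hypothetical helper; the one-second interval comes from the error message above, and whether that is the actual limit for a given subscription is an assumption.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: keeps successive calls at least a fixed interval apart. */
class MinIntervalThrottle {
    private final long intervalNanos;
    private long nextAllowedNanos;   // earliest System.nanoTime() at which the next call may start
    private boolean first = true;    // the very first call never waits

    MinIntervalThrottle(long intervalMillis) {
        this.intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
    }

    /** Blocks until at least one interval has passed since the previous call was released. */
    synchronized void acquire() {
        if (!first) {
            long remaining;
            // Loop because sleep may wake early; re-check against the monotonic clock.
            while ((remaining = nextAllowedNanos - System.nanoTime()) > 0) {
                try {
                    TimeUnit.NANOSECONDS.sleep(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
        first = false;
        nextAllowedNanos = System.nanoTime() + intervalNanos;
    }
}
```

In the loop from the question, the equivalent would be to create one throttle with an interval of 1000 ms and call acquire() right before each request.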
Two things you can try: 1) In searchTerm, there is no need to use %20 and %3a; just use punctuation as you would type it in Bing, e.g. var searchTerm = "microsoft.net resume filetype:" + filetype, and 2) Enforce the market by appending mkt=en-in (for India) or mkt=en-us (for US) to the query. You can do this by appending + "&mkt=en-in" at the end of url.
I presume for custom search you have selected domains (for both en-in and en-us markets) that return thousands of results for this query.

MongoDB Native Query vs C# LINQ Performance

I am using the following two options, and the Mongo C# driver seems to be taking more time. I'm using Stopwatch to measure the timings.
Case 1: Native Mongo QueryDocument (takes 0.0011 ms to return data)
string querytext = @"{schemas:{$elemMatch:{name: " + n + ",code : " + c + "} }},{schemas:{$elemMatch:{code :" + c1 + "}}}";
string printQueryname = "Query: " + querytext;
BsonDocument query1 = MongoDB.Bson.Serialization.BsonSerializer.Deserialize<BsonDocument>(querytext);
QueryDocument queryDoc1 = new QueryDocument(query1);
var queryResponse = collection.FindAs<BsonDocument>(queryDoc1);
Case 2: Mongo C# Driver (takes more than 3.2 ms to return data)
Schema _result = new Schema();
_result = (from c in _coll.AsQueryable<Schema>()
where c.schemas.Any(s => s.code.Equals(c) && s.name.Equals(n) ) &&
c.schemas.Any(s => s.code.Equals(c1))
select c).FirstOrDefault();
Any thoughts ? Anything wrong here ?

ldap searchtype substring not working when value is an integer less than one thousand

I search for users in Active Directory. My code is below:
List<DirectoryEntry> dirEntries = ActiveDirectoryActions.getListByQuery("(&(objectClass=user)(displayName~=*" + q + "*))");
for (int i = 0; i < dirEntries.Count; i++)
{
SiteSearchResult r = new SiteSearchResult();
r.title = dirEntries[i].Properties["displayName"].Value.ToString();
r.url = "/" + lang + "/directory/user/" + dirEntries[i].Properties["sAMAccountName"].Value.ToString();
r.content = dirEntries[i].Properties["title"].Value.ToString();
result.Add(r);
}
And this is the getListByQuery() function:
public static List<DirectoryEntry> getListByQuery(string q)
{
DirectorySearcher drSearch = new DirectorySearcher(rootEntry);
drSearch.Filter = "(distinguishedName=" + Config.xml().Root.Elements("active_directory").Elements("root_ou").Select(x => x.Value).FirstOrDefault().ToString() + ")";
DirectoryEntry searchRoot = drSearch.FindAll()[0].GetDirectoryEntry();
drSearch.SearchRoot = searchRoot;
drSearch.Filter = q;
List<DirectoryEntry> r = new List<DirectoryEntry>();
SearchResultCollection sr = drSearch.FindAll();
for (int i = 0; i < sr.Count; i++)
{
r.Add(sr[i].GetDirectoryEntry());
}
return r;
}
Everything is OK on my local server, but on the global server I get an error when I search for an integer value. Interestingly, it happens only when the value is less than 1000.
[NullReferenceException: Object reference not set to an instance of an
object.] Myproject.Controllers.SearchController.Index(String
lang, String q) in
D:\dotNET\Myproject\Myproject\Controllers\SearchController.cs:60
Help please.
I think there are a few issues you need to check:
1) Does it make sense to call DirectorySearcher.FindAll() two times? Instead of using DirectorySearcher(rootEntry) you could try to set
string ldapPath = "LDAP://" + Config.xml().Root.Elements("active_directory").Elements("root_ou").Select(x => x.Value).FirstOrDefault().ToString();
DirectoryEntry de = new DirectoryEntry(ldapPath);
DirectorySearcher deSearch = new DirectorySearcher(de);
deSearch.Filter = ...
2) Approximate search (~=) might not be compatible with a substring search (=*substring*); for me, for example, it does not work. So try changing it to =*1000* (without the ~).
3) FindAll() could return null in both cases, so you should check for null in both cases.
4) [MSDN]
Due to implementation restrictions, the SearchResultCollection class
cannot release all of its unmanaged resources when it is garbage
collected. To prevent a memory leak, you must call the Dispose method
when the SearchResultCollection object is no longer needed.
So you either need to call sr.Dispose() or use a using statement.

InputStream read is blocked while reading BLOB from Oracle column through JDBC

While trying to read from an InputStream obtained from a BLOB, as in the code below, the InputStream goes into an indefinite wait state.
I set this data from a web application running on a JBoss app server, where read/write works absolutely fine; the problem occurs when running standalone Java code using plain JDBC.
The environment is JDK 6 and Oracle 10g.
ResultSet rs = this.stmt.executeQuery();
log.println("ResultSetType: " + (rs != null ? rs.getClass() : null));
while (rs != null && rs.next()) {
. . . // read other columns
Blob savedBlob = rs.getBlob("PERSISTENCE_BLOB");
long len = savedBlob.length();
log.println("Going to read bytes..." + len);
InputStream is = savedBlob.getBinaryStream();
log.println("IS Received...");
log.println("Available : " + is.available());
ObjectInputStream oip = new ObjectInputStream(is);
Object obj = oip.readObject();
oip.close();
is.close();
savedBlob.free();
. . .
Output is as Below...
ResultSetType: class oracle.jdbc.driver.OracleResultSetImpl
RowID: XXXXXXXXXXXXXXX // Row is selected and printed properly
Going to read bytes...6022
IS Received...
Available : 0
But if I try to read in chunks as below, it works fine. I don't want that, as I am reading a serialized object and want to have an ObjectInputStream opened from the InputStream.
. . .
ResultSet rs = this.stmt.executeQuery();
log.println("ResultSetType: " + (rs != null ? rs.getClass() : null));
while (rs != null && rs.next()) {
. . .
Blob savedBlob = rs.getBlob("PERSISTENCE_BLOB");
long len = savedBlob.length();
int start = 1;
int totalBytesRead = 0;
int buffSize = 2048;
byte[] byteBuff = null;
log.println("Going to read bytes..." + len);
do {
byteBuff = new byte[buffSize];
byteBuff = savedBlob.getBytes(start, buffSize);
totalBytesRead += buffSize;
log.println(start + "," + buffSize + " #BLOB bytes: " + new String(byteBuff));
start += buffSize;
. . .
} while (. . . );
log.println("Total Bytes: " + totalBytesRead);
Output:
ResultSetType: class oracle.jdbc.driver.OracleResultSetImpl
Going to read bytes...6022
1,2048 #BLOB bytes: //......bytes data..........
.....
Total Bytes: 6022
InputStream.available() doesn't indicate how much you can read; it indicates how much it can return to you (e.g. from a buffer) without going into a potentially blocking read operation.
The Javadoc also indicates:
Note that while some implementations of InputStream will return the
total number of bytes in the stream, many will not. It is never
correct to use the return value of this method to allocate a buffer
intended to hold all data in this stream.
and
The available method for class InputStream always returns 0.
So instead of using available() as any sort of indication, just read it (which clearly works as indicated by your other code).
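To make the point about available() concrete, here is a small self-contained sketch. It does not touch Oracle or JDBC and does not explain the hang itself; it only shows that an ObjectInputStream can deserialize from a stream whose available() always reports 0. The class and method names are made up for illustration, and the byte array stands in for the BLOB content.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class AvailableZeroDemo {

    /** Serializes an object to bytes, standing in for the data stored in the BLOB. */
    static byte[] serialize(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Wraps the bytes in a stream whose available() always reports 0. */
    static InputStream zeroAvailableStream(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int available() {
                return 0;
            }
        };
    }

    /** Reads one object from the stream without ever consulting available(). */
    static Object deserialize(InputStream in) {
        try (ObjectInputStream objects = new ObjectInputStream(in)) {
            return objects.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes, then deserializes through the zero-available stream. */
    static Object roundTrip(Object value) {
        return deserialize(zeroAvailableStream(serialize(value)));
    }
}
```

Since the getBytes() variant in the question works, another option that might be worth trying is to fetch all bytes with savedBlob.getBytes(1, (int) len) and open the ObjectInputStream on a ByteArrayInputStream over them.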

jdbc resultSet slow performance

Hi,
I am calling a stored procedure from a Java class in my web application. Some reports take too much time to render the JSP/HTML. I have checked the stored procedure's query, and it takes only 2 seconds to execute in the Oracle browser. In my SP-calling code, obtaining the result set takes very little time, but iterating over it with while (rst.next()) takes almost 3 minutes just to print the System.out.println output inside the loop. I suspect the issue is with rst.next().
My code is as follows:
Connection connection = null;
CallableStatement stmt = null;
ResultSet rst = null ;
connection = DBConnector.getConnection();
stmt = connection.prepareCall("{call MIS_GSGR_ASON.MIS_DIVNETSALE_ASON(?,?,?,?,?,?,?,?,?)}");
stmt.setString(1,START_DT);
stmt.setString(2,END_DT);
stmt.setString(3,DIVISION);
stmt.setString(4,LOC_ID);
stmt.setInt(5,USER_GRP);
stmt.setInt(6,FIELD_ID);
stmt.setInt(7,Integer.parseInt(PERIOD_ID));
stmt.setString(8,zone);
stmt.registerOutParameter(9+INC,OracleTypes.CURSOR);
stmt.execute();
rst = (ResultSet) stmt.getObject(9+INC);
System.out.println("Got resultset . . . .");
data = new ArrayList<MainActionAll>();
while(rst.next()){
System.out.println("In loop");}
Any help will be highly appreciated.
Thanks, Amol
Maybe you are running your query over a slow network; if many rows are returned, that will be very slow.
Add some timing checks and, instead of printing to the console, simply count the records:
...
...
...
stmt.registerOutParameter(9+INC,OracleTypes.CURSOR);
long t0, t1, t2, t3, t4, t5;
t0 = System.currentTimeMillis();
stmt.execute();
t1 = System.currentTimeMillis();
rst = (ResultSet) stmt.getObject(9+INC);
t2 = System.currentTimeMillis();
System.out.println("Got resultset . . . .");
data = new ArrayList<MainActionAll>();
t3 = System.currentTimeMillis();
t4 = t3;
t5 = t3;
int rec_cnt = 0;
if (rst.next())
{
    ++rec_cnt;
    t4 = System.currentTimeMillis();
    while (rst.next())
        ++rec_cnt;
    t5 = System.currentTimeMillis();
}
System.out.println("execute: " + (t1 - t0));
System.out.println("stmt.getObject: " + (t2 - t1));
System.out.println("Array: " + (t3 - t2));
System.out.println("1st next: " + (t4 - t3));
System.out.println("loop: " + (t5 - t4));
System.out.println("rec cnt: " + (rec_cnt));
Try to execute it as near the database server as possible to reduce network transfers (localhost preferred). Edit your question with those times.
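To see why a slow network could matter so much with many rows, here is a rough back-of-the-envelope sketch. It assumes rows are fetched from the server in batches, one network round trip per batch. The row count, batch size, and latency in the usage note are made-up numbers, not measurements from this question, and FetchRoundTrips is a hypothetical helper.

```java
/** Hypothetical helper: estimates the network cost of iterating a result set. */
class FetchRoundTrips {

    /** Number of round trips needed to fetch rowCount rows, fetchSize rows at a time. */
    static long roundTrips(long rowCount, int fetchSize) {
        if (rowCount < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and fetchSize > 0");
        }
        return (rowCount + fetchSize - 1) / fetchSize;   // ceiling division
    }

    /** Time spent waiting on the network alone, ignoring transfer and processing time. */
    static long estimatedMillis(long rowCount, int fetchSize, long latencyMillis) {
        return roundTrips(rowCount, fetchSize) * latencyMillis;
    }
}
```

For example, 100000 rows fetched 10 at a time over a link with 20 ms latency would spend about 200 seconds just waiting on round trips, while fetching 500 at a time would spend about 4 seconds. As far as I know, the Oracle driver's default batch is 10 rows, but treat that as an assumption and check your driver's documentation. JDBC's ResultSet.setFetchSize(int) is the standard way to hint a larger batch.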
