Using optional variables in query - linq

I have a search type which is used to query the database as follows,
type UserSearch =
{ limit: int option
skip: int option }
I'm using SQLProvider and would like to use limit and skip variables in my query such as
query {
for user in Db.getDataContext().MyDb.Users do
select user
// if searchParams.limit.IsSome then
// take searchParams.limit.Value
// if searchParams.skip.IsSome then
// skip searchParams.skip.Value
}
Is it possible?
My current approach is as follows;
let mutable search =
query {
for user in Db.getDataContext().MyDb.Users do
select user
}
if searchParams.limit.IsSome then
search <-
query {
for user in search do
select user
take searchParams.limit.Value
}
search |> Seq.toArray
Since queries are lazy, this should result a single query, am i correct on this assumption?

Unfortunately, the query builder turns the result into seq<_> and forces it to run the SQL query, so you cannot easily compose queries this way. There may be more sophisticated tricks you could use, but for your specific case, it may be easier to just use a reasonable default value such as 0 for skip and Int32.MaxValue for take:
query {
for user in Db.getDataContext().MyDb.Users do
skip (defaultArg searchParams.skip 0)
take (defaultArg searchParams.limit.Value Int32.MaxValue)
select user }
This will generate a query with skip and take even when you do not need that, which may have some performance implications - I do not think it matters, but it may be best to test that.

Related

Getting max value on server (Entity Framework)

I'm using EF Core but I'm not really an expert with it, especially when it comes to details like querying tables in a performant manner...
So what I try to do is simply get the max-value of one column from a table with filtered data.
What I have so far is this:
protected override void ReadExistingDBEntry()
{
using Model.ResultContext db = new();
// Filter Tabledata to the Rows relevant to us. the whole Table may contain 0 rows or millions of them
IQueryable<Measurement> dbMeasuringsExisting = db.Measurements
.Where(meas => meas.MeasuringInstanceGuid == Globals.MeasProgInstance.Guid
&& meas.MachineId == DBMatchingItem.Id);
if (dbMeasuringsExisting.Any())
{
// the max value we're interested in. Still dbMeasuringsExisting could contain millions of rows
iMaxMessID = dbMeasuringsExisting.Max(meas => meas.MessID);
}
}
The equivalent SQL to what I want would be something like this.
select max(MessID)
from Measurement
where MeasuringInstanceGuid = Globals.MeasProgInstance.Guid
and MachineId = DBMatchingItem.Id;
While the above code works (it returns the correct value), I think it has a performance issue when the database table is getting larger, because the max filtering is done at the client-side after all rows are transferred, or am I wrong here?
How to do it better? I want the database server to filter my data. Of course I don't want any SQL script ;-)
This can be addressed by typing the return as nullable so that you do not get a returned error and then applying a default value for the int. Alternatively, you can just assign it to a nullable int. Note, the assumption here of an integer return type of the ID. The same principal would apply to a Guid as well.
int MaxMessID = dbMeasuringsExisting.Max(p => (int?)p.MessID) ?? 0;
There is no need for the Any() statement as that causes an additional trip to the database which is not desirable in this case.

Elasticsearch query not returning expected results for multiple should filters

I am performing an Elasticsearch query using the high-level-rest-api for Java and expect to see records that are either active or do not have a reference id. I'm querying by name for the records and if I hit the index directly with /_search?q=, I see the results I want.
Is my logic correct (pseudo-code):
postFilters.MUST {
Should {
MustNotExist {referenceId}
Must {status = Active}
}
Should {
MustNotExist {referenceId}
Must {type = Person}
}
}
What I get are records that are active with a reference id. But, I want to include records that also do not have a referenceId, hence why I have MustNotExist {referenceId}.
For simplicity, the second Should clause can be dropped (for testing) as the first one is not working as expected by itself.
In my case, I had to use a match query instead of a term query because the value I was querying for was not a primitive or a String. For example, the part where Must, type = Person, Person was an enum, and so looking for "Person" was not quite right, whereas match allowed it to "match".

How Should Complex ReQL Queries be Composed?

Are there any best practices or ReQL features that that help with composing complex ReQL queries?
In order to illustrate this, imagine a fruits table. Each document has the following structure.
{
"id": 123,
"name": "name",
"colour": "colour",
"weight": 5
}
If we wanted to retrieve all green fruits, we might use the following query.
r
.db('db')
.table('fruits')
.filter({colour: 'green'})
However, in more complex cases, we might wish to use a variety of complex command combinations. In such cases, bespoke queries could be written for each case, but this could be difficult to maintain and could violate the Don't Repeat Yourself (DRY) principle. Instead, we might wish to write bespoke queries which could chain custom commands, thus allowing complex queries to be composed in a modular fashion. This might take the following form.
r
.db('db')
.table('fruits')
.custom(component)
The component could be a function which accepts the last entity in the command chain as its argument and returns something, as follows.
function component(chain)
{
return chain
.filter({colour: 'green'});
};
This is not so much a feature proposal as an illustration of the problem of complex queries, although such a feature does seem intuitively useful.
Personally, my own efforts in resolving this problem have involved the creation of a compose utility function. It takes an array of functions as its main argument. Each function is called, passed a part of the query chain, and is expected to return an amended version of the query chain. Once the iteration is complete, a composition of the query components is returned. This can be viewed below.
function compose(queries, parameters)
{
if (queries.length > 1)
{
let composition = queries[0](parameters);
for (let index = 1; index < queries.length; index++)
{
let query = queries[index];
composition = query(composition, parameters);
};
return composition;
}
else
{
throw 'Must be two or more queries.';
};
};
function startQuery()
{
return RethinkDB;
};
function filterQuery1(query)
{
return query.filter({name: 'Grape'});
};
function filterQuery2(query)
{
return query.filter({colour: 'Green'});
};
function filterQuery3(query)
{
return query.orderBy(RethinkDB.desc('created'));
};
let composition = compose([startQuery, filterQuery1, filterQuery2, filterQuery3]);
composition.run(connection);
It would be great to know whether something like this exists, whether there are best practises to handle such cases, or whether this is an area where ReQL could benefit from improvements.
In RethinkDB doc, they state it clearly: All ReQL queries are chainable
Queries are constructed by making function calls in the programming
language you already know. You don’t have to concatenate strings or
construct specialized JSON objects to query the database. All ReQL
queries are chainable. You begin with a table and incrementally chain
transformers to the end of the query using the . operator
You do not have to compose another thing which just implicit your code, which gets it more difficult to read and be unnecessary eventually.
The simple way is assign the rethinkdb query and filter into the variables, anytime you need to add more complex logic, add directly to these variables, then run() it when your query is completed
Supposing I have to search a list of products with different filter inputs and getting pagination. The following code is exposed in javascript (This is simple code for illustration only)
let sorterDirection = 'asc';
let sorterColumnName = 'created_date';
var buildFilter = r.row('app_id').eq(appId).and(r.row('status').eq('public'))
// if there is no condition to start up, you could use r.expr(true)
// append every filter into the buildFilter var if they are positive
if (escapedKeyword != "") {
buildFilter = buildFilter.and(r.row('name').default('').downcase().match(escapedKeyword))
}
// you may have different filter to add, do the same to append them into buildFilter.
// start to make query
let query = r.table('yourTableName').filter(buildFilter);
query.orderBy(r[sorterDirection](sorterColumnName))
.slice(pageIndex * pageSize, (pageIndex * pageSize) + pageSize).run();

How to combine collection of linq queries into a single sql request

Thanks for checking this out.
My situation is that I have a system where the user can create custom filtered views which I build into a linq query on the request. On the interface they want to see the counts of all the views they have created; pretty straight forward. I'm familiar with combining multiple queries into a single call but in this case I don't know how many queries I have initially.
Does anyone know of a technique where this loop combines the count queries into a single query that I can then execute with a ToList() or FirstOrDefault()?
//TODO Performance this isn't good...
foreach (IMeetingViewDetail view in currentViews)
{
view.RecordCount = GetViewSpecificQuery(view.CustomFilters).Count();
}
Here is an example of multiple queries combined as I'm referring to. This is two queries which I then combine into an anonymous projection resulting in a single request to the sql server.
IQueryable<EventType> eventTypes = _eventTypeService.GetRecords().AreActive<EventType>();
IQueryable<EventPreferredSetup> preferredSetupTypes = _eventPreferredSetupService.GetRecords().AreActive<EventPreferredSetup>();
var options = someBaseQuery.Select(x => new
{
EventTypes = eventTypes.AsEnumerable(),
PreferredSetupTypes = preferredSetupTypes.AsEnumerable()
}).FirstOrDefault();
Well, for performance considerations, I would change the interface from IEnumerable<T> to a collection that has a Count property. Both IList<T> and ICollection<T> have a count property.
This way, the collection object is keeping track of its size and you just need to read it.
If you really wanted to avoid the loop, you could redefine the RecordCount to be a lazy loaded integer that calls GetViewSpecificQuery to get the count once.
private int? _recordCount = null;
public int RecordCount
{
get
{
if (_recordCount == null)
_recordCount = GetViewSpecificQuery(view.CustomFilters).Count;
return _recordCount.Value;
}
}

Using LINQ Expression Instead of NHIbernate.Criterion

If I were to select some rows based on certain criteria I can use ICriterion object in NHibernate.Criterion, such as this:
public List<T> GetByCriteria()
{
SimpleExpression newJobCriterion =
NHibernate.Criterion.Expression.Eq("LkpStatu", statusObject);
ICriteria criteria = Session.GetISession().CreateCriteria(typeof(T)).SetMaxResults(maxResults);
criteria.Add(newJobCriterion );
return criteria.List<T>();
}
Or I can use LINQ's where clause to filter what I want:
public List<T> GetByCriteria_LINQ()
{
ICriteria criteria = Session.GetISession().CreateCriteria(typeof(T)).SetMaxResults(maxResults);
return criteria.Where(item=>item.LkpStatu=statusObject).ToList();
}
I would prefer the second one, of course. Because
It gives me strong typing
I don't need to learn yet-another-syntax in the form of NHibernate
The issue is is there any performance advantage of the first one over the second one? From what I know, the first one will create SQL queries, so it will filter the data before pass into the memory. Is this kind of performance saving big enough to justify its use?
As usual it depends. First note that in your second snippet there is .List() missing right after return criteria And also note that you won't get the same results on both examples. The first one does where and then return top maxResults, the second one however first selects top maxResults and then does where.
If your expected result set is relatively small and you are likely to use some of the results in lazy loads then it's actually better to take the second approach. Because all entities loaded through a session will stay in its first level cache.
Usually however you don't do it this way and use the first approach.
Perhaps you wanted to use NHibernate.Linq (located in Contrib project ). Which does linq translation to Criteria for you.
I combine the two and made this:
var crit = _session.CreateCriteria(typeof (T)).SetMaxResults(100);
return (from x in _session.Linq<T>(crit) where x.field == <something> select x).ToList();

Resources