Solr: combine date fields in sorting

We have documents that are indexed with a date field d1 on some documents and d2 on others, and we want to sort on both of them, depending on which one is available.
sort=d1 desc, d2 desc
sorts the documents with d1 separately from the documents with d2, like this:
d1: 2014-03-12
d1: 2010-03-12
d2: 2013-03-12
d2: 2011-03-12
What we want is everything sorted like this:
d1: 2014-03-12
d2: 2013-03-12
d2: 2011-03-12
d1: 2010-03-12
Reindexing all documents with a new common field is not an option unfortunately.

Abdul's approach of using a function is the right one, but his exact solution does not work for me.
The following does; I've tested it successfully.
Just add this parameter to your query:
sort=max(d1,d2) desc
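To see why this produces the interleaved order, here is a minimal Python sketch that emulates the sort outside Solr. It assumes that a missing date field evaluates to 0 inside the function, so max simply picks whichever field is present. The document list and the helper names are made up for illustration.

```python
from datetime import datetime, timezone

def ms(date_str):
    """Epoch milliseconds for a YYYY-MM-DD date; 0 when the field is missing (assumption)."""
    if date_str is None:
        return 0
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

# Hypothetical documents: each one carries either d1 or d2, never both.
docs = [
    {"d1": "2014-03-12"},
    {"d1": "2010-03-12"},
    {"d2": "2013-03-12"},
    {"d2": "2011-03-12"},
]

def sort_key(doc):
    # Emulates max(d1,d2): the missing field contributes 0, so the present one wins.
    return max(ms(doc.get("d1")), ms(doc.get("d2")))

ordered = sorted(docs, key=sort_key, reverse=True)
for doc in ordered:
    print(doc)
```

The printed order matches the one asked for in the question: the d1 and d2 documents are merged into a single descending sequence.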

As far as I know, you can use Solr's function queries. For this sort, it would be something like:
sort=if((abs(ms(d1,d2)) > 0),d1,d2) desc
I have not tested it yet, but this link should help you sort out your problem:
https://wiki.apache.org/solr/FunctionQuery
Sort result by date difference

Maybe you can add a copyField from d1 and d2 to dAll and then sort on dAll?
You can find more information about copyField here: https://cwiki.apache.org/confluence/display/solr/Copying+Fields

As mentioned in this discussion, you can update your data with ConcatFieldUpdateProcessorFactory by trying something like:
<processor class="com.test.solr.update.CustomConcatFieldUpdateprocessorFactory">
<str name="field">d1</str>
<str name="field">d2</str>
<str name="dest">date</str>
<str name="delimiter"></str>
</processor>
After that, you can try sorting on the new date field.

Two suggestions, one of which has already been proposed:
(1) Use copyField.
<field name="d" type="date" indexed="true" stored="true" multiValued="false"/>
<copyField source="d1" dest="d" />
<copyField source="d2" dest="d" />
Even if the d1 and d2 fields are important, you can still include them in your query, but just sort on the merged d field.
(2) Depending upon the type of your data source, you could potentially modify the query in your data-config.xml file to merge these two fields into one. In our environment, we are using Solr to index data from a MySQL instance. There are many times where data from different databases is being consolidated into Solr. This introduces a similar problem where we need to normalize data that is coming from different databases. In these situations, we often use constructs like CASE or IFNULL in our queries. I can get into more specifics if this is applicable to your situation.

I had a problem very similar to yours. In my case all documents had the "d1" and "d2" fields, and all of them had the "d1" value. The "d2" value is used to overwrite the "d1" value.
My solution for this is:
sort=map(ms(d2),0,0,ms(d1)) desc
map(ms(d2),0,0,ms(d1)) returns the "d2" timestamp when d2 is set; when d2 is empty, ms(d2) evaluates to 0 and the "d1" timestamp is used instead.
Hope it helps someone.
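The fallback logic can be sketched in Python as follows. This is only an emulation of how map(x,min,max,target) behaves (target when x lies in [min,max], otherwise x), under the assumption that an empty d2 shows up as 0. The documents, their timestamps, and the function names are hypothetical.

```python
def solr_map(x, lo, hi, target):
    """Sketch of map(x,min,max,target): target if lo <= x <= hi, otherwise x."""
    return target if lo <= x <= hi else x

# Hypothetical documents with epoch-millisecond timestamps; 0 stands for an empty d2 (assumption).
docs = [
    {"id": "a", "d1": 1000, "d2": 0},
    {"id": "b", "d1": 2000, "d2": 5000},
    {"id": "c", "d1": 3000, "d2": 0},
]

def effective_ts(doc):
    # Emulates map(ms(d2),0,0,ms(d1)): use d2 when set, fall back to d1 otherwise.
    return solr_map(doc["d2"], 0, 0, doc["d1"])

ordered = sorted(docs, key=effective_ts, reverse=True)
print([d["id"] for d in ordered])
```

Document "b" sorts first because its d2 overrides its d1, while "a" and "c" fall back to d1.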

Related

Rails compare same object's 1 field with another + addition of string in Active Record

I have two string fields that contain dates, like field_1 = "2003.11.14". I use them through the ORM and they work just fine. Now I want to compare one field's value with the other field's value minus 18 months. Here is an example:
User.where("users.field_1 > '#{Date.today - 18.months}' AND users.field_2 > (users.field_1 - 18.months)")
or something like that. Can anyone help me?
Thanks in advance
Most databases support date calculations in SQL. Something like this should work:
query = User.where("users.field_1 > ?", 18.months.ago)
query.where("users.field_2 > users.field_1 - :time", time: 18.months.ago)
Edit: I just saw that the values are stored as strings, in which case you cannot do this calculation in SQL.
I cannot do that because the table has millions of records.
I don't really understand why the size of the table prevents you from using the correct data type?

Solr 7.5.0, error sorting a "pdate" field in ascending 'asc' direction

Using the Solr 7.5.0 web UI, I am trying to sort on a field of my schema, "data_creazione", by setting:
&fq=data_creazione:[2017-11-12T00:00:00Z TO NOW]&sort=data_creazione asc
My complete query URL is:
http://localhost:8983/solr/NUR/select?fq=data_creazione:[2017-11-12T00:00:00Z%20%20TO%20%20NOW]&q=regione%20lazio&sort=data_creazione%20asc
When I inspect the results, I observe a strange and erroneous sorting behavior:
results from 0 to 9 (start=0,rows=10) are correctly ordered (from 2018-03-01 to 2018-03-02)
results from 10 to 19 (start=10,rows=10) are correctly ordered (from 2018-03-05 to 2018-03-07)
results from 20 to 29 (start=20,rows=10) are incorrectly ordered (from 2018-02-23 to 2018-03-1)
Moreover, when I send the same query to Solr 7.5.0 from a .NET Core 2.1 application using the Solr.net driver, I get the identical sort error.
NOTE: when I run the same query in descending ('desc') direction, everything works fine. All result pages are correctly ordered.
This error does not appear with Solr 7.2.0, where both ascending and descending sorting on date fields work fine.
In my managed-schema the field "data_creazione" is declared this way:
<fieldType name="pdate" class="solr.DatePointField" docValues="true"/>
...
<field name="data_creazione" type="pdate" multiValued="false" indexed="false" stored="true"/>

ElasticSearch sort by id's in array

Is there a way to sort an Elasticsearch response in the same order as an array of IDs that I post?
Example: given array[23,45,67], the results should be sorted the same way as the IDs: first all rows with ID 23, then all rows with ID 45, and finally all rows with ID 67.
Thanks
Nik
You can use a script in the sort; the other option is a bool query with should clauses in which you boost documents with these values.
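If reordering on the client side is acceptable, the desired order can also be illustrated without any Elasticsearch-specific API. Note that this is a different technique from the script sort or boosting suggested above, shown only to make the target ordering concrete. The shape of the hits and the field name "id" are assumptions.

```python
# The ID order requested by the caller.
wanted = [23, 45, 67]

# Map each ID to its position in the requested order.
rank = {doc_id: pos for pos, doc_id in enumerate(wanted)}

# Hypothetical hits as they might come back from a search, in arbitrary order.
hits = [
    {"id": 67, "title": "c1"},
    {"id": 23, "title": "a1"},
    {"id": 45, "title": "b1"},
    {"id": 23, "title": "a2"},
]

# Stable sort by rank; IDs that were not requested go to the end.
ordered = sorted(hits, key=lambda h: rank.get(h["id"], len(wanted)))
print([h["title"] for h in ordered])
```

Because Python's sort is stable, rows sharing the same ID keep their original relative order.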

How to sort documents by a field in Lucene?

Hi guys,
I've got billions of records, each with two attributes:
RecordCreatedTime, RecordContent
I've used Lucene to index the records, and that part is done.
Now I want to query records by RecordCreatedTime, for example to fetch only the documents from November 2013.
I am considering sorting the documents by RecordCreatedTime and have tried some approaches such as NumericDocValuesSorter, but it didn't work.
Can you point me to some more material so I can take a careful look?
Many thanks.
You should check out Lucene's DateTools which provides you with the tools to represent dates in a way that is appropriate for searching and sorting in the index. A TermRangeQuery can be used to search a particular range (such as the month of November, 2013), when indexed in that format.
You can also sort easily, by passing a Sort into your search call.
For example, something like:
String startDateString = DateTools.dateToString(startDate, DateTools.Resolution.DAY);
String endDateString = DateTools.dateToString(endDate, DateTools.Resolution.DAY);
TermRangeQuery query = TermRangeQuery.newStringRange("recordCreatedTime", startDateString, endDateString, true, false);
SortField field = new SortField("recordCreatedTime", SortField.Type.STRING);
Sort sort = new Sort(field);
TopDocs results = searcher.search(query, numDocs, sort);
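The reason a string range and a string sort work here is that DateTools at DAY resolution produces fixed-width yyyyMMdd strings, whose lexicographic order matches chronological order. A minimal Python sketch of that idea, outside Lucene, might look like this. The records and the helper function are hypothetical, and time zone handling is left out.

```python
from datetime import date

def to_day_string(d):
    # Mirrors the yyyyMMdd shape of DateTools.Resolution.DAY (time zones ignored here).
    return d.strftime("%Y%m%d")

# Hypothetical records: (created date, content).
records = [
    (date(2013, 12, 1), "december"),
    (date(2013, 11, 30), "late november"),
    (date(2013, 10, 31), "october"),
    (date(2013, 11, 1), "early november"),
]
indexed = [(to_day_string(d), content) for d, content in records]

start = to_day_string(date(2013, 11, 1))  # inclusive lower bound
end = to_day_string(date(2013, 12, 1))    # exclusive upper bound

# Plain string comparison selects November 2013 and sorts it chronologically.
matches = sorted((s, c) for s, c in indexed if start <= s < end)
print(matches)
```

The inclusive lower bound and exclusive upper bound correspond to the true, false flags passed to newStringRange above.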

Using FetchXML in CRM 2011

I am using FetchXML, grouping and counting on two of the entities, but I don't need the rest of the entities grouped; I just need their data to be pulled down. For example, this is my code:
string groupby1 = @"
<fetch distinct='false' mapping='logical' aggregate='true'>
<entity name='opportunity'>
<attribute name='name' alias='opportunity_count' aggregate='countcolumn' />
<attribute name='ownerid' alias='ownerid' groupby='true' />
<attribute name='createdon' alias='createdon' />
<attribute name='customerid' alias='customerid' />
</entity>
</fetch>";
EntityCollection groupby1_result = orgProxy.RetrieveMultiple(new FetchExpression(groupby1));
foreach (var c in groupby1_result.Entities)
{
Int32 count = (Int32)((AliasedValue)c["opportunity_count"]).Value;
string OwnerID = ((EntityReference)((AliasedValue)c["ownerid"]).Value).Name;
DateTime dtCreatedOn = ((DateTime)((AliasedValue)c["createdon"]).Value);
string CustomerName = ((EntityReference)((AliasedValue)c["customerid"]).Value).Name;
}
But I get this error:
EXCEPTION: System.ServiceModel.FaultException`1[Microsoft.Xrm.Sdk.OrganizationServiceFault]: An attribute can not be requested when an aggregate operation has been specified and its neither groupby nor aggregate. NodeXml : (Fault Detail is equal to Microsoft.Xrm.Sdk.OrganizationServiceFault).
How do you use aggregates on some values and not others?
The error message is giving you the answer - this isn't possible with FetchXML. You'll need to make two FetchXML calls; one for your aggregates and one for your data.
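As a rough sketch of how the two result sets could be combined afterwards, here is a Python example on in-memory data. The first step plays the role of the aggregate call (count per owner) and the second that of the plain data call. The records and field names are hypothetical stand-ins for whatever the two FetchXML calls return.

```python
from collections import Counter

# Hypothetical rows standing in for the result of the non-aggregate call.
opportunities = [
    {"owner": "alice", "customer": "Contoso", "createdon": "2013-01-05"},
    {"owner": "alice", "customer": "Fabrikam", "createdon": "2013-02-10"},
    {"owner": "bob", "customer": "Contoso", "createdon": "2013-03-15"},
]

# Stand-in for the aggregate call: number of opportunities per owner.
counts = Counter(o["owner"] for o in opportunities)

# Attach the per-owner count to every detail row.
rows = [dict(o, opportunity_count=counts[o["owner"]]) for o in opportunities]
for row in rows:
    print(row)
```

Each detail row keeps its own createdon and customer values and additionally carries the count for its owner.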
So, there are a few things going on here. The problem isn't so much that your query is not achievable with FetchXML as that it's not achievable in any query language, including SQL. The SQL equivalent of your FetchXML above is as follows:
SELECT COUNT(name) AS opportunity_count, ownerID, createdon, customerid
FROM dbo.Opportunity
GROUP BY ownerID
If you try to run this SQL against your database, you'd get the standard Column 'dbo.Opportunity.CreatedOn' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. error, which is exactly what your exception is trying to tell you. To solve this, you need to also group by every non-aggregated column (in this case, createdon and customerid) by including the groupby='true' attribute in each respective attribute element.
Secondly, grouping by the create date of each opportunity is more or less a fruitless grouping, as opportunities are generally created at different datetimes, so this grouping would just return the whole table. Microsoft rightly recognizes this, so if you fix the grouping issue above, you will encounter another exception related to the datetime column. If you continue reading the article I'd previously shared, there is an example that addresses the different ways you can group by different date intervals (year, month, quarter, etc.) using the dategrouping attribute.
But, I get the feeling that even after fixing the general grouping issues, and then the dategrouping issue, the result set might still not be what you want. If it's not, it might help to post an example of a result set that you'd expect to see and then address whether FetchXML has the power to deliver that set to you.
