GSA sorting over multiple metadata indexes

I am familiar with how to sort GSA results on metadata.
I'm interested in sorting across multiple indexes.
For example, sort by Last Name, then by First Name, so that Alice Smith appears before Bob Smith.
In SQL, this would be quite simple, equivalent to:
SELECT value FROM table ORDER BY last, first
Does GSA support this?
I've been playing with a few different syntaxes, but haven't found a way yet.
If it's only possible to sort on one index, how does GSA order results within a set of equivalent results? For example, how does GSA determine whether Alice or Bob appears first? I can't find a good explanation of this.

Sorry for posting this as an answer; I can't comment on your question because my reputation is still too low.
I just want to know whether you found a way to solve this problem. Thank you!

From what I can tell, GSA does not support sorting on multiple dependent keys.
Instead, I've built an additional meta index that combines the two indexes I want to sort.
So, for example, I have index A for "First Name", index B for "Last Name", and index C which is the combination of both values into "Last Name"_"First Name".
This seems to be working well for me so far.
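The combined-index idea above can be sketched outside GSA as follows. This is only an illustration in Python: the `composite_key` helper, the `people` records and the `_` separator are assumptions made for the example, not GSA features. Note that the separator character takes part in the comparison, so it should be one that cannot appear in the names themselves.

```python
def composite_key(last: str, first: str, sep: str = "_") -> str:
    """Build the value for the combined index C: "Last Name"_"First Name"."""
    return f"{last}{sep}{first}"


# Hypothetical records standing in for documents with metadata A and B.
people = [
    {"first": "Bob", "last": "Smith"},
    {"first": "Alice", "last": "Smith"},
    {"first": "Carol", "last": "Jones"},
]

# Attach the combined value, then sort on that single key only,
# which is what a single-index sort would do.
for person in people:
    person["last_first"] = composite_key(person["last"], person["first"])

ordered = sorted(people, key=lambda p: p["last_first"])
ordered_keys = [p["last_first"] for p in ordered]
print(ordered_keys)
```

Sorting on the single combined value gives the same order as sorting by last name and then first name, as long as the separator never occurs inside a name.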

Related

Google Search Appliance sort by metadata content

I'm trying to refine the search results received by my application by including the sort parameter in my HTTP requests. I've combed through the documentation here, but I can't find exactly what I'm looking for.
I'm searching for DOC filetypes, and I am able to sort by date or sort by metadata, as in alphabetizing by title, author, etc. I can also filter by whether or not the title contains certain keywords. What I want to do is to sort by whether or not the title contains certain keywords (these documents appearing first in the results), but to still keep the other results.
For example, with keywords [winter, Christmas, holiday] I could do a descending sort by the sum of inmeta:title~winter, inmeta:title~Christmas, inmeta:title~holiday and the top result might be
Winter holidays other than Christmas
followed by documents with one or two of the keywords, followed by documents that meet the other search parameters but contain no keywords.
Is this possible in GSA?
I finally achieved what I was trying to do, so figured I'd post in case it helps anyone else.
As far as I know, it is impossible to create a query with this capability, but with Google's Custom Search API, you can create a search engine with the desired keywords in the context file (by editing the XML file directly or by adding keywords through the CSE console). Then you can formulate the query as usual, but perform the search on your personalized engine.
https://developers.google.com/custom-search/docs/ranking

Good way to exclude records in SOLR or Elasticsearch

For a matchmaking portal, we have a requirement that if a customer has viewed the complete profile details of a bride or groom, that profile must be excluded from further search results. Currently, along with other details, we store the IDs of the members who viewed the profile in a comma-separated field on that bride's or groom's record.
E.g., if A viewed B, then in B's record we add A to the saw_me field (comma separated).
While searching, let's say the ID of the member currently searching is 123456; then we fire a query like
Select * from profiledetails where (OTHER CON) AND 123456 not in saw_me;
The problem is that the saw_me field value keeps growing without bound. Is there a better way to handle this requirement? Please guide.
If this is using Solr:
First, don't add the 'AND NOT ...' clauses to the main query in the q param; add them to fq instead. This has several benefits, including that the fq result is cached.
Until the list of values reaches maybe thousands of entries, this approach is simple and should work fine.
Once the list becomes huge, it may be time to move to a post filter with a high cost (so that it is evaluated last). This would look up the docs to remove in an external source (Redis, a database, ...).
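A minimal sketch of the first suggestion, assuming saw_me is indexed so that each viewer ID is a separate searchable value (for example a multi-valued field). The helper name `build_solr_params` and the example main query are hypothetical; only the q and fq request parameters come from Solr itself.

```python
from urllib.parse import urlencode


def build_solr_params(main_query: str, viewer_id: str) -> dict:
    """Keep the scoring query in q and the exclusion in fq.

    The leading "-" negates the filter, so documents whose saw_me
    field contains viewer_id are dropped from the results.
    """
    return {
        "q": main_query,
        "fq": f"-saw_me:{viewer_id}",
    }


# "gender:F AND city:Chennai" is a placeholder for the other conditions.
params = build_solr_params("gender:F AND city:Chennai", "123456")
query_string = urlencode(params)
print(query_string)
```

The resulting string can be appended to the select handler URL of the collection being queried.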
In my opinion, no matter how much the saw_me field grows, it will not make much difference in search time, because tokens are stored in an inverted index and doc_values are created at index time in a column-major fashion for efficient reads, with caching support from the OS. ES handles these things for you efficiently.
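For the Elasticsearch case, the exclusion could be expressed with a bool query, roughly as sketched below. This assumes saw_me is indexed as an array of keyword values rather than one comma-separated string; the `build_es_query` helper and the `gender` condition are placeholders for the example.

```python
import json


def build_es_query(other_conditions: list, viewer_id: str) -> dict:
    """Bool query: required conditions in filter, viewer exclusion in must_not."""
    return {
        "query": {
            "bool": {
                "filter": other_conditions,
                "must_not": [{"term": {"saw_me": viewer_id}}],
            }
        }
    }


# Placeholder standing in for "(OTHER CON)" from the question.
body = build_es_query([{"term": {"gender": "F"}}], "123456")
print(json.dumps(body, indent=2))
```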

Sort by multiple fields in specific order in Solr

So I want to sort my Solr response by the following fields:
published_year (desc)
series_number (asc)
status_color
The problem is that status_color must be sorted by the following values (i.e. not alphabetically):
"Green"
"Yellow"
"Red"
This field may only contain one of these values.
I'm hoping there's a way of doing this in the Solr query instead of massaging the result in code. With a result of hundreds of thousands of documents, that's not really an option.
Any help is appreciated.
I think the answer for this question will be valid for you too:
Is it possible in solr to specify an ordering of documents
I believe Solr has enum field types, though I have rarely seen them used. They would be a perfect match here, so they are worth a try.
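If an enum field is not available, one alternative (not taken from the answers above) is to store a numeric rank next to status_color at index time and sort on that. The sketch below assumes a hypothetical extra field named `status_rank`; the sort string uses Solr's usual "field direction" list syntax.

```python
# Desired order: Green first, then Yellow, then Red.
STATUS_RANK = {"Green": 0, "Yellow": 1, "Red": 2}


def add_status_rank(doc: dict) -> dict:
    """Return a copy of the document with a sortable numeric rank added."""
    return {**doc, "status_rank": STATUS_RANK[doc["status_color"]]}


# Value for the sort request parameter, using the hypothetical status_rank field.
SORT_PARAM = "published_year desc, series_number asc, status_rank asc"

docs = [
    {"published_year": 2020, "series_number": 1, "status_color": "Red"},
    {"published_year": 2020, "series_number": 1, "status_color": "Green"},
    {"published_year": 2021, "series_number": 2, "status_color": "Yellow"},
]
indexed = [add_status_rank(d) for d in docs]

# Local check of the ordering the sort parameter is meant to produce.
ordered = sorted(
    indexed,
    key=lambda d: (-d["published_year"], d["series_number"], d["status_rank"]),
)
colors = [d["status_color"] for d in ordered]
print(colors)
```

The sorting itself would still happen inside Solr; the local `sorted` call is only there to show the intended order on a few sample documents.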

Web search algorithm with multiple words

I want to add database-backed search to my website, so I am thinking about an effective algorithm to use.
For example if I try to search "Hello my name is xxx" I want to see results:
Hello my name is John
Hello my name is Peter
Hello mr. xxx
His name is Peter
He is here
So I want to search all data in the database for parts of this text and sort the results by the number of matching words.
I wrote an algorithm, but I am worried that it's complicated and slow:
I split the search text into words and run a SQL select with multiple LIKE ... OR conditions. Then I save the results into a list, count the number of matched words in each result, and sort by this count.
The problem comes when I try to search for a long text.
Should I use a better algorithm, or should I learn something about things like Sphinx?
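For reference, the approach described in the question (split into words, count matches per row, sort by count) can be sketched in memory as below. The function names are made up for the example, and a real implementation would fetch candidate rows from the database instead of using a hard-coded list.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def rank_by_matching_words(query: str, rows: list) -> list:
    """Return (match_count, row) pairs, best matches first, zero matches dropped."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(row)), row) for row in rows]
    scored = [pair for pair in scored if pair[0] > 0]
    # list.sort is stable, so rows with equal counts keep their input order.
    scored.sort(key=lambda pair: -pair[0])
    return scored


rows = [
    "He is here",
    "Hello mr. xxx",
    "Hello my name is John",
    "Something unrelated",
    "His name is Peter",
    "Hello my name is Peter",
]
ranked = rank_by_matching_words("Hello my name is xxx", rows)
for count, row in ranked:
    print(count, row)
```

Using sets keeps the per-row counting cheap, but every candidate row is still scanned, which is the part a full-text index such as Sphinx is designed to avoid.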
For the first two results, a simple regex search should be able to retrieve results like that.
For the later ones, you might consider using an existing search product, such as Google Search Appliance, which can be used to search database information.

Making a Database query "Intelligent"?

I have the following requirement.
I have a table with a column that contains the city names. I am going to implement a search option by City.
But the user may not enter the city name correctly.
Examples :
The city "Matara" is sometimes spelled as "Mathara".
The city "Nuwara Eliya" is sometimes written as "Nuwaraeliya"
I can keep the database column consistent, but I want to return hits even if the end user enters an alternative spelling.
What is the approach I need to use to implement this effectively?
You should probably implement a string distance check such as Levenshtein distance.
More approaches can be found here: How do you implement a "Did you mean"?
I think the above problem can be adequately solved using Levenshtein distance, PHP's similar_text, or Jaro-Winkler similarity. All of these approaches gave me sufficiently correct results.
Edit Distance Tool
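A minimal sketch of the Levenshtein suggestion, using the city names from the question. The `normalize` and `best_matches` helpers and the `max_distance=2` threshold are assumptions for the example; a suitable threshold depends on the data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(name: str) -> str:
    """Lowercase and remove whitespace so "Nuwara Eliya" equals "Nuwaraeliya"."""
    return "".join(name.lower().split())


def best_matches(query: str, cities: list, max_distance: int = 2) -> list:
    """Return the cities within max_distance edits of the query, closest first."""
    q = normalize(query)
    scored = sorted((levenshtein(q, normalize(city)), city) for city in cities)
    return [city for distance, city in scored if distance <= max_distance]


cities = ["Matara", "Nuwara Eliya", "Colombo"]
print(best_matches("Mathara", cities))
print(best_matches("Nuwaraeliya", cities))
```

Comparing the query against every city is fine for a short list of names; for large tables, a database-side function or an index-based fuzzy search is likely to scale better.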
You want something like a phonetic search.
Several algorithms exist. You can get an overview here.
The idea is to add a column to your table holding the phonetic equivalent of your city,
and perform the search against this (after having performed the same function for the searched term).
Some RDBMS such as Oracle possess a pre-implemented SOUNDEX function, that could allow you to perform the search without the added column.
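To illustrate the phonetic idea, here is a sketch of the classic American Soundex code in Python. It is meant only as an illustration and is not guaranteed to match the output of any particular database's SOUNDEX function in every edge case. The stored column would hold `soundex(city)`, and the search would compare it with `soundex(user_input)`.

```python
# Letter groups of American Soundex; vowels, H, W and Y have no code.
_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {letter: digit for letters, digit in _GROUPS for letter in letters}


def soundex(name: str) -> str:
    """Return the 4-character Soundex code, or "" if name has no letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = _CODES.get(first, "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(letter, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this, so a repeated code counts again
    return (first + "".join(digits) + "000")[:4]


print(soundex("Matara"), soundex("Mathara"))
print(soundex("Nuwara Eliya"), soundex("Nuwaraeliya"))
```

Both spellings of each city from the question map to the same code, so a lookup on the phonetic column would find the row in either case.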

Resources