Get a single, matching row from a .csv without searching twice? - performance

I was wondering if there was a better way I can grab a single row matching a specific query, without searching twice.
In my code originally, I wanted to count how many rows in my .CSV matched my search query.
$raisins=($table | ? {$_.'fruit' -like '*grapes*' -and $_.'hours in sunlight' -like '*a lot*'}| measure).Count
Now, I'm tasked with taking a single row from those same search query results.
$raisinHit=($table | ? {$_.'fruit' -like '*grapes*' -and $_.'hours in sunlight' -like '*a lot*'})
$raisinHit=$raisinHit[0]
However, I find it really inefficient that I have to search through my .CSV using the same query from earlier, just to find a single result I had already glossed over.
Is there a better way to do this? If so, can you explain how?

Just assign the matches to a variable, then return the count and an element from the variable.
$raisins = @($table |
? {$_.'fruit' -like '*grapes*' -and $_.'hours in sunlight' -like '*a lot*'})
$raisins.Count
$raisins[0]
The @() around the pipeline ensures that $raisins will be an array, even if only one match is returned.

Related

Is it possible to use -Like with ADSI query?

Sometimes my AD queries using the ADSI method fail on surnames with Jr. or III on the end of the name. Is it possible with ADSI to capture the last names using the -LIKE filter?
For example:
$search = [adsisearcher]"(&(objectCategory=person)(objectClass=User)(sn -like Perry))"

How to boost "starts with" search results above "contains" search results in elastic

I am trying to find a way that I can boost the search results for a particular query such that the search results that have the query at the beginning of the field (i.e. starts with) are above the results that do not.
e.g. Suppose my query is for 'bat'
I want my results to look like
bat
bath
bathe
abate
debate
etc.
You could try adding a prefix query with a boost value to make the score for prefix matches higher than the rest of the items.
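As a rough sketch of that idea, the request body could combine a "contains" clause that every hit must satisfy with an optional prefix clause that carries the boost. The field name `title`, the boost of 2.0, and the use of a wildcard query for the "contains" part are assumptions made for illustration; they are not from the question, and the field is assumed to be an un-analyzed keyword field. Leading wildcards can be slow on large indexes.

```python
import json


def starts_with_boosted_query(field, term, prefix_boost=2.0):
    """Build a search body where prefix matches outscore plain 'contains' matches.

    `field` and `prefix_boost` are illustrative values, not taken from the question.
    """
    return {
        "query": {
            "bool": {
                # Every hit must contain the term somewhere in the field.
                "must": [{"wildcard": {field: {"value": f"*{term}*"}}}],
                # Hits that also start with the term get the extra boost.
                "should": [
                    {"prefix": {field: {"value": term, "boost": prefix_boost}}}
                ],
            }
        }
    }


body = starts_with_boosted_query("title", "bat")
print(json.dumps(body, indent=2))
```

With a body like this, "bat", "bath" and "bathe" would match both clauses, while "abate" and "debate" would match only the wildcard clause and so score lower.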

Solr boost query sort by whether result is boosted then by another field

I'm using Solr to run a query on one of our cores. Suppose my documents have two fields: ID, and Name. I also have a separate list of IDs I'm grabbing from a database and passing into the query to boost certain results.
If the document gets returned in the query and the ID is in the list it goes to the top of the results, and if it gets returned in the query and the ID is not in the list then it goes below those that are in the list. The former is from the "boost". My query is something like this -
http://mysolrserver:8983/solr/MyCore/MyQueryHandler?q=Smith&start=0&rows=25&bq=Id%3a(36+OR+76+OR+90+OR+224+OR+391)
I am able to get the boost query working, but I need the boosted results to be in alphabetical order by name, then the non-boosted results under that, also in alphabetical order by name. I need to know what to use for the &sort= parameter.
&sort=score%20desc,Name+asc does not work.
I've looked over a lot of documentation, but I still don't know if this is even possible. Any help is appreciated. Thanks!
Solr version is 6.0.1. I am actually using SolrNet to interface with Solr, but I think I can figure out the SolrNet part if I know what the url's &sort= parameter value needs to be.
I figured it out by doing away with the boost query. I added a sort using the "exists" function and passed it a sub-query for the ID. exists() returns a boolean value to sort on, and I added the name as a second sort. It works perfectly!
The URL looks like this:
http://mysolrserver:8983/solr/MyCore/MyQueryHandler?q=Smith&start=0&rows=25&sort=exists(query({!v=%27Id:(36+OR+76+OR+90+OR+224+OR+391)%27}))%20DESC,%20Name%20ASC
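Since the ID list comes from a database, the sort value probably needs to be assembled at run time. A minimal sketch of building that URL follows; the helper names `build_sort` and `build_url` are made up for this example, and the encoding is left to the standard library rather than written by hand, so spaces come out as `+` instead of `%20`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sort(ids, id_field="Id", name_field="Name"):
    """Sort by 'ID is in the list' first, then alphabetically by name."""
    id_clause = " OR ".join(str(i) for i in ids)
    # Doubled braces are literal braces inside an f-string.
    return f"exists(query({{!v='{id_field}:({id_clause})'}})) DESC, {name_field} ASC"


def build_url(base, term, ids, start=0, rows=25):
    """Assemble the request URL; urlencode handles the escaping."""
    params = {"q": term, "start": start, "rows": rows, "sort": build_sort(ids)}
    return f"{base}?{urlencode(params)}"


url = build_url(
    "http://mysolrserver:8983/solr/MyCore/MyQueryHandler",
    "Smith",
    [36, 76, 90, 224, 391],
)
# Decode the URL again to check what Solr would receive as the sort value.
decoded_sort = parse_qs(urlsplit(url).query)["sort"][0]
print(decoded_sort)
```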
The closest match to your requirement is the query elevation component [1].
In your particular case I would first sort my Ids according to my requirements (sorting them by name, for example), then maintain them in the elevate.xml.
At query time you can use the "forceElevation" parameter to force the elevation and then sort the remaining results by name.
[1] https://cwiki.apache.org/confluence/display/solr/The+Query+Elevation+Component

SQLite - how to return rows containing a text field that contains one or more strings?

I need to query a table in an SQLite database to return all the rows in a table that match a given set of words.
To be more precise: I have a database with ~80,000 records in it. One of the fields is a text field with around 100-200 words per record. What I want to be able to do is take a list of 200 single word keywords {"apple", "orange", "pear", ... } and retrieve a set of all the records in the table that contain at least one of the keyword terms in the description column.
The immediately obvious way to do this is with something like this:
SELECT stuff FROM table
WHERE (description LIKE '% apple %') or (description LIKE '% orange %') or ...
If I have 200 terms, I end up with a big and nasty looking SQL statement that seems to me to be clumsy, smacks of bad practice, and not surprisingly takes a long time to process - more than a second per 1000 records.
This answer Better performance for SQLite Select Statement seemed close to what I need, and as a result I created an index, but according to http://www.sqlite.org/optoverview.html sqlite doesn't use any optimisations if the LIKE operator is used with a beginning % wildcard.
Not being an SQL expert, I am assuming I'm doing this the dumb way. I was wondering if someone with more experience could suggest a more sensible and perhaps more efficient way of doing this?
Alternatively, is there a better approach I could use to the problem?
Using the SQLite fulltext search would be faster than a LIKE '%...%' query. I don't think there's any database that can use an index for a query beginning with %, as if the database doesn't know what the query starts with then it can't use the index to look it up.
An alternative approach is putting the keywords in a separate table instead, and making an intermediate table that has the information about which row in your main table has which keywords. If you indexed all the relevant columns that way, it could be queried very quickly.
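A minimal sketch of that keyword-table layout, using Python's built-in sqlite3 module, might look like this. The table and column names (`records`, `keywords`, `record_keywords`) and the simple lowercase word tokenizer are assumptions made for the example; a real description field may need smarter tokenizing.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE keywords (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    -- Intermediate table: which record contains which keyword.
    CREATE TABLE record_keywords (
        record_id INTEGER REFERENCES records(id),
        keyword_id INTEGER REFERENCES keywords(id),
        PRIMARY KEY (keyword_id, record_id)
    );
""")


def add_record(description):
    """Insert a record and link it to every distinct word it contains."""
    cur = conn.execute("INSERT INTO records (description) VALUES (?)", (description,))
    record_id = cur.lastrowid
    for word in set(re.findall(r"[a-z]+", description.lower())):
        conn.execute("INSERT OR IGNORE INTO keywords (word) VALUES (?)", (word,))
        keyword_id = conn.execute(
            "SELECT id FROM keywords WHERE word = ?", (word,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO record_keywords (record_id, keyword_id) VALUES (?, ?)",
            (record_id, keyword_id),
        )
    return record_id


def find_records(words):
    """Return records containing at least one of the given whole words."""
    placeholders = ",".join("?" for _ in words)
    sql = f"""
        SELECT DISTINCT r.id, r.description
        FROM keywords k
        JOIN record_keywords rk ON rk.keyword_id = k.id
        JOIN records r ON r.id = rk.record_id
        WHERE k.word IN ({placeholders})
        ORDER BY r.id
    """
    return conn.execute(sql, [w.lower() for w in words]).fetchall()


add_record("An apple a day")
add_record("Pineapple juice")
add_record("Pear and orange salad")
matches = find_records(["apple", "orange"])
print(matches)
```

The lookup goes through the unique index on `keywords.word` and the primary key of the intermediate table, so no LIKE with a leading % is needed. It matches whole words only, so "Pineapple juice" is not returned for "apple".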
Sounds like you might want to have a look at Full Text Search. It was contributed to SQLite by someone from Google. The description:
"allows the user to efficiently query the database for all rows that contain one or more words (hereafter "tokens"), even if the table contains many large documents."
This is the same problem as full-text search, right? In which case, you need some help from the DB to construct indexes into these fields if you want to do this efficiently. A quick search for SQLite full text search yields this page.
The solution you correctly identify as clumsy is probably going to do up to 200 regular expression matches per document in the worst case (i.e. when a document doesn't match), where each match has to traverse the entire field. Using the index approach will mean that your search speed will be independent of the size of each document.

Solr OR query for different combination of facets

I have a sample Solr schema as follows
isPublic = boolean
source = facebook| twitter | wordpress
I want to write a query which returns all documents from the index where either isPublic is true, or isPublic is false and source is facebook. Something like this
solrUrl/?q=blah&fq=(isPublic:true OR (isPublic:false AND source:facebook))
Is such a thing possible or should I search the index two times with each of these conditions and then combine + de-duplicate the results?
Sure you can run such a filter query, but I think that particular query will not get you the results you're looking for, see this question about it. A logically equivalent query would be: isPublic:true OR source:facebook
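The equivalence of the two conditions can be checked with a small truth table. This sketch only covers the boolean logic; it does not model how Solr's query parser handles the nested expression, which is the separate concern raised above.

```python
from itertools import product

SOURCES = ["facebook", "twitter", "wordpress"]


def original(is_public, source):
    # isPublic:true OR (isPublic:false AND source:facebook)
    return is_public or (not is_public and source == "facebook")


def simplified(is_public, source):
    # isPublic:true OR source:facebook
    return is_public or source == "facebook"


# Collect every combination where the two conditions disagree.
mismatches = [
    (is_public, source)
    for is_public, source in product([True, False], SOURCES)
    if original(is_public, source) != simplified(is_public, source)
]
print(mismatches)
```

An empty list means the two conditions select the same documents for every combination of isPublic and source.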
