solr query for field value starting with a number - ajax

I have to modify a query that searches for a value starting with a letter (relevant snippet of the query): &fq=Organization:"+letter+"*&
If I pass 'A' as the letter param I'll get 'ABC Hardware', i.e. something that starts with an A.
How would I modify the letter variable to return only values that start with a number, such as '1A Widgets'?
I tried things like letter = '[0 TO 5]', but I honestly have no idea if I'm on the right track with that.

Seems like a dupe of this question
For cases like this, my favourite approach is to index another boolean field called "StartsWithNumber" and then it's a simple boolean filter. This might not work for you if you can't reindex all of your documents.
For a brute force approach, you could do something like:
fq=Organization:0* OR Organization:1* OR Organization:2* OR .. etc
Not pretty, but fq's get cached so at least it should be fast.
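As a rough sketch of the brute-force filter above, the ten clauses can be generated instead of typed out. The helper name buildStartsWithDigitFilter is made up for illustration; only the Organization field and the fq parameter come from the question.

```javascript
// Hypothetical helper: builds "field:0* OR field:1* OR ... OR field:9*".
function buildStartsWithDigitFilter(field) {
  const clauses = [];
  for (let digit = 0; digit <= 9; digit++) {
    clauses.push(field + ":" + digit + "*");
  }
  return clauses.join(" OR ");
}

// URL-encode the whole filter before appending it to the request.
const digitFilter = buildStartsWithDigitFilter("Organization");
const fqParam = "&fq=" + encodeURIComponent(digitFilter);
```

Because the filter string is identical on every request, it should benefit from the fq caching mentioned above.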

Related

IFTTT JavaScript filter - How to make case insensitive searches + How to search Include and Exclude sets of terms

First off, I'm a total novice at JavaScript, so please go gently. I'm aware of how people feel about having to now pay for IFTTT, but it's perfect for what I need.
I am using a more expansive version of this code below to capture certain keywords from Tweets to then generate emails if the search returns a positive result. This search works very nicely, except it is case sensitive which is a problem.
Yes, I know you can manipulate the twitter search to pick up specific words or phrases. I am very proficient in achieving searches this way. I am casting a wide net to pick up approx 120 search words or phrases which is too long to achieve through "OR" Twitter search parameters alone which is why I'm using this.
Q1 - I have tried adding item.toLowerCase() and just .toLowerCase() in various parts of the code so it wouldn't matter if the sentence case of the search term is different to that of the original tweet text case. I just can't get it to work though. I've seen various posts on here but I can't get any of them to work in IFTTT. I believe IFTTT doesn't accept REGEX either, which is annoying.
Any advice of how to get this code running so it's case-insensitive for text within IFTTT?
Q2 - I have approx 120 search terms for the tweet text to return positive results. There is a lot of junk that comes through with that. Does anyone know how to add a second layer of 'and exclude' search terms?
I have something like 300-400 words and specific phrases which would be used to stop the email from being triggered - so it'd be something like "IF tweet text contains a, b, c BUT text ALSO contains x, y, z... do not send the email"
let str = Twitter.newTweetFromSearch.Text;
let searchTerms = [
  "Northbound",
  "Westbound",
  "Southbound",
  "Eastbound"
];
let foundOne = 0;
if (searchTerms.some(function (v) { return str.indexOf(v) >= 0; })) {
  foundOne = 1;
}
if (foundOne == 0) {
  Email.sendMeEmail.skip();
}
I have looked at the Twitter API, but that is a step too far for my coding ability which is why I'm using IFTTT.
Any help is very much appreciated
Thank you.
I'm playing with IFTTT Filter myself at the moment, so here are some thoughts about solving your problem.
If you want to do a case-insensitive search on the original text, convert the original text to lowercase, then write all your search terms in lowercase.
I first thought you would need to iterate over the searchTerms array yourself, but .some() already does the iteration for you. I do prefer includes() over indexOf(), though.
let str = Twitter.newTweetFromSearch.Text.toLowerCase();
let searchTerms = [
  "northbound",
  "westbound",
  "southbound",
  "eastbound"
];
let foundOne = 0;
if (searchTerms.some(function (term) { return str.includes(term); })) {
  foundOne = 1;
}
if (foundOne == 0) {
  Email.sendMeEmail.skip();
}
Or you could just skip having the foundOne variable, and do the search in the if() statement.
let str = Twitter.newTweetFromSearch.Text.toLowerCase();
let searchTerms = [
  "northbound",
  "westbound",
  "southbound",
  "eastbound"
];
if (!searchTerms.some(function (term) { return str.includes(term); })) {
  Email.sendMeEmail.skip();
}
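Q2 (the exclude list) can be handled with the same .some() pattern: skip the email unless at least one include term matches and no exclude term matches. This is a minimal sketch, assuming the decision is pulled out into a plain function. The name shouldSendEmail and the sample term lists are invented for illustration, and the code has not been tested inside IFTTT.

```javascript
// Hypothetical helper: true only if the text contains at least one include
// term and none of the exclude terms (all comparisons are case-insensitive).
function shouldSendEmail(text, includeTerms, excludeTerms) {
  const lower = text.toLowerCase();
  const contains = function (term) {
    return lower.includes(term.toLowerCase());
  };
  return includeTerms.some(contains) && !excludeTerms.some(contains);
}

// Sample lists, made up for this sketch.
const includeTerms = ["northbound", "southbound"];
const excludeTerms = ["cleared", "reopened"];

// In the IFTTT filter this would be used roughly as:
//   if (!shouldSendEmail(Twitter.newTweetFromSearch.Text, includeTerms, excludeTerms)) {
//     Email.sendMeEmail.skip();
//   }
```

Lowercasing each term inside the helper means the term lists do not have to be maintained in lowercase by hand.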

Solr query conundrum

I've recently swapped from using Lucene for Sitecore to Solr.
For the most part it has been smooth, but some of the queries I was writing (using the Sitecore.ContentSearch.Linq abstraction) no longer seem to be compatible.
Specifically, I have a situation where I've got "global" content and "regional" content, like so:
Home (000)
  X
  Y
  Z
  Regions (ID: 111)
    Region 1 (ID: 221)
      A
      B
    Region 2 (ID: 222)
      D
My code worked on Lucene, but now doesn't on Solr. It should find all "global" content plus a single region's content, excluding all other regions' content. So as an example, if the user's current region was Region 1, I'd want the query to return content X, Y, Z, A, B.
Sitecore's Item Crawler has a field for each item in the index called "_path" which is a multivalued string field of IDs, so as an example, Region 1's _path field value would be [000, 111, 221 ].
When I write this using the Linq abstraction it comes out as below, which doesn't return any results.
-_path:(111) OR _path:(221)
But _path:(111) on its own does return results. Mind blown.
When I use the Solr interface and wrap each side of the OR in extra brackets like below (which I'd consider redundant) it works! Mind blown v2.
(-_path:(111)) OR (_path:(221))
Firstly, what's the difference between those queries?
Secondly, my real problem is that I can't add these extra brackets, as I'm working in the Linq abstraction and the brackets will be "optimized" out.
Any advice would be awesome! Cheers.
The problem here is that Lucene's negative queries don't work like you think they do. They only remove results from what has been found. -_path:111 doesn't find all documents which aren't in 111; it doesn't find anything at all. It only removes results. So you are finding all results with path "221", then removing any that also have path "111", which, from your hierarchy, I assume is all of them. See my answer here for a bit more on that topic.
The OR makes it seem like it ought to work, but really -_path:(111) OR _path:(221) is the same as -_path:(111) _path:(221). The moral here is: Don't use Lucene's AND/OR/NOT syntax, if you can help it. Use +/-. +/- syntax actually expresses how the query operates, AND/OR/NOT doesn't. It attempts to shoehorn it into a different, SQL-like retrieval model and leads to some unexpected behavior like this.
So, what about: (-_path:(111)) OR (_path:(221))
Well, first, does it actually work? Or does it just get some results?
If it only gets the same results as _path:221: the reason is that -_path:111 gets no results, so your query is, in practice, something like (nothing) OR (_path:221), which is equivalent to _path:221.
If it really does get the results you expect (I'm guessing it probably does): something is translating your query into something like (*:* -_path:111) (_path:221). Solr does have some logic along these lines, though I'm not quite sure it applies in this case. Essentially, it puts a match-all in front of any lonely negative queries it finds, allowing them to do what you were expecting. If the implicit *:* makes you nervous about performance, well, it should. Lucene is an inverted index, so it does well at finding matches on a term quickly. Getting everything that doesn't match goes against the grain of that retrieval model, and will pretty much have to do a full scan of the index.
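The difference between the two query shapes can be illustrated with a toy model of the "negatives only remove results" rule. This is only a sketch of the semantics described above, not Solr or Lucene code; the documents, the simplified _path values, and all function names are assumptions made for illustration.

```javascript
// Toy documents: each content item with the IDs on its (simplified) _path.
const docs = [
  { name: "X", path: ["000"] },
  { name: "Y", path: ["000"] },
  { name: "Z", path: ["000"] },
  { name: "A", path: ["000", "111", "221"] },
  { name: "B", path: ["000", "111", "221"] },
  { name: "D", path: ["000", "111", "222"] },
];

const hasPath = function (id) {
  return function (doc) { return doc.path.includes(id); };
};
const matchAll = function () { return true; };

// Simplified boolean query: a document is found only if some positive
// ("should") clause matches it; "mustNot" clauses then remove documents.
// With no positive clause, nothing is found at all.
function booleanQuery(should, mustNot) {
  return docs
    .filter(function (doc) { return should.some(function (c) { return c(doc); }); })
    .filter(function (doc) { return !mustNot.some(function (c) { return c(doc); }); })
    .map(function (doc) { return doc.name; });
}

// -_path:(111) OR _path:(221), read as: -_path:111 _path:221
const flat = booleanQuery([hasPath("221")], [hasPath("111")]);

// (*:* -_path:111) (_path:221): the union of two sub-queries
const globalOnly = booleanQuery([matchAll], [hasPath("111")]);
const regionOne = booleanQuery([hasPath("221")], []);
const nested = globalOnly.concat(regionOne.filter(function (n) {
  return !globalOnly.includes(n);
}));
```

In this model flat is empty, because A and B are found by _path:221 and then removed by -_path:111, while nested contains X, Y, Z, A and B.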

LDAP search on multiple fields like an if/else-statement

I have a question regarding LDAP search. I have three attributes that I want to involve in my filter.
I want the filter to always search on objectClass; then, if the attribute skaPersonType has a value, match on that, otherwise match on employeeType.
I'm stuck and really don't know how to continue.
Best regards / C
1. Always search for objectclass
Unnecessary, but (objectClass=*): all LDAP entries have an objectClass.
2. IF skaPerson=EMP is met, look for that value
(skaPerson=EMP)
3. ELSE look for employeetype=External
(employeetype=External)
Any ideas how I can manage that?
You're looking for (2) or (3). So:
(|(skaPerson=EMP)(employeetype=External))
If you must have the redundant objectClass test:
(&(objectClass=*)(|(skaPerson=EMP)(employeetype=External)))
Not sure what filter you actually want:
...always shall search for objectClass, if attribute skaPersonType has a
value, look for that, else look for employeeType...
Are you looking for something like this?
(&(objectClass=MyClass)(|(skaPersonType=A)(&(!(skaPersonType=*))(employeeType=B))))
The above filter returns objects for which:
objectClass equals MyClass, AND
one of the following conditions is met:
  skaPersonType equals A, OR
  skaPersonType has no value and employeeType equals B
The filter has not been tested.
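To make the if/else reading of that filter concrete, here is a small sketch that assembles the filter string from its two branches and, next to it, the same logic as a predicate over a plain object. Both function names are hypothetical. The predicate treats attributes as single-valued and case-sensitive, which real LDAP attributes often are not, and the builder inserts values as-is, so real input would need escaping as described in RFC 4515.

```javascript
// Hypothetical helper: "skaPersonType equals personType, else (when
// skaPersonType is absent) employeeType equals employeeType", restricted to
// one objectClass. Values are inserted without escaping.
function buildFilter(objectClass, personType, employeeType) {
  const ifBranch = "(skaPersonType=" + personType + ")";
  const elseBranch = "(&(!(skaPersonType=*))(employeeType=" + employeeType + "))";
  return "(&(objectClass=" + objectClass + ")(|" + ifBranch + elseBranch + "))";
}

// The same logic as a predicate over a plain object, to make the branches explicit.
function matches(entry, objectClass, personType, employeeType) {
  if (entry.objectClass !== objectClass) return false;
  if (entry.skaPersonType !== undefined) return entry.skaPersonType === personType;
  return entry.employeeType === employeeType;
}
```

Calling buildFilter("MyClass", "A", "B") reproduces the filter shown above.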

Prefix the result of a XPATH query

I use libxmljs to parse some HTML.
I have an XPath query that uses an "or" to retrieve, in effect, the results of two queries.
Example:
doc.find("//div[contains(@class,'important') or contains(@class,'overdue')]")
This returns all the divs with either important or overdue.
Can I prefix the results, or otherwise see within my result set which result comes from which condition?
The result could be an array with an index for the match: 0 for the first condition and 1 for the second. Is this possible?
Or how else can I find out which result comes from which query condition?
Thanks for any help.
P.S.: this is a simplified example of a sequence of elements, each of which may have an important item, an overdue item, both, or neither, so I cannot go by looking at every second entry, etc.
This is the result I want to get...
message:{},
message:{
.....
important: "some important text",
overdue: "overdue date",
.....
}
There is no way to know which clause of an or XPath query caused a particular result to be included. It's simply not information that's kept around.
You'll either need to do entirely separate queries for important and overdue, or do one large query to get the entire result set (as you are now) and then further test each result's class to find out which one it is.
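The second option, testing each result's class after the single query, might look like the sketch below. classifyClass is a hypothetical helper that mirrors the two contains() tests with plain substring checks. The commented libxmljs usage assumes that attr("class") returns an attribute object with a value() method (or null when the attribute is missing), and it is untested.

```javascript
// Hypothetical helper: mirrors the two contains(@class, ...) tests from the
// XPath expression, so each result can be labelled after the query has run.
function classifyClass(classAttr) {
  const value = classAttr || "";
  return {
    important: value.includes("important"),
    overdue: value.includes("overdue"),
  };
}

// With libxmljs this would be applied to each result, roughly (untested):
//   const nodes = doc.find("//div[contains(@class,'important') or contains(@class,'overdue')]");
//   const labelled = nodes.map(function (node) {
//     const attr = node.attr("class");
//     return { node: node, flags: classifyClass(attr ? attr.value() : "") };
//   });
```

An element that carries both classes gets both flags set, which covers the "both, one or none" case from the question.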

CouchDB - Querying array key value for first key element only

I have a couchdb view set up using an array key value, in the format:
[articleId, -timestamp]
I want to query for all entries with the same article id. All timestamps are acceptable.
Right now I am using a query like this:
?startkey=["A697CA3027682D5JSSC",-9999999999999]&endkey=["A697CA3027682D5JSSC",0]
but I would like something a bit simpler.
Is there an easy way to completely wildcard the second key element? What would be the simplest syntax for this?
First, as a comment pointed out, there is indeed a special value {} that is ordered after any value, so your query becomes:
startkey=["target ID"]&endkey=["target ID",{}]
This is equivalent to a wildcard match.
As a side note, there is no need to reverse the ordering in the map function by emitting a negative timestamp. You can reverse the order with an option on the view invocation (your start and end keys will be swapped):
startkey=["target ID",{}]&endkey=["target ID"]&descending=true
For future reference, in CouchDB 3 you can use "\ufff0" instead of {}, which would be ordered after a string or number, but before an object.
From the CouchDB 3 docs:
Beware that {} is no longer a suitable “high” key sentinel value. Use a string like "\ufff0" instead.
The query startkey=["foo"]&endkey=["foo",{}] will match most array keys with “foo” in the first element, such as ["foo","bar"] and ["foo",["bar","baz"]]. However, it will not match ["foo",{"an":"object"}].
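Since the keys are JSON, the start and end keys can be built and URL-encoded programmatically. The following is a minimal sketch; prefixRangeQuery and its highSentinel parameter are hypothetical names, and the choice between {} and "\ufff0" follows the notes above.

```javascript
// Hypothetical helper: builds the startkey/endkey pair that selects every row
// whose key array starts with `firstElement`. `highSentinel` defaults to {}
// as in the answer; pass "\ufff0" where {} is not a suitable high key.
function prefixRangeQuery(firstElement, highSentinel) {
  const high = highSentinel === undefined ? {} : highSentinel;
  const startkey = JSON.stringify([firstElement]);
  const endkey = JSON.stringify([firstElement, high]);
  return "startkey=" + encodeURIComponent(startkey) +
         "&endkey=" + encodeURIComponent(endkey);
}

const query = prefixRangeQuery("A697CA3027682D5JSSC");
```

The resulting string is appended to the view URL after the question mark. For descending order, the two keys would be swapped and descending=true added, as described above.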
