Substring search with spaces in RavenDB

I'm using the following query:
var query = "*" + QueryParser.Escape(input) + "*";
session.Query<User, UsersByEmailAndName>().Where(x => x.Email.In(query) || x.DisplayName.In(query));
It is backed by a simple index:
public UsersByEmailAndName()
{
    Map = users => from user in users
                   select new
                   {
                       user.Email,
                       user.DisplayName,
                   };
}
Here I've read that:
"By default, RavenDB uses a custom analyzer called
LowerCaseKeywordAnalyzer for all content. (...) The default values for
each field are FieldStorage.No in Stores and FieldIndexing.Default in
Indexes."
The index contains fields:
DisplayName - "jarek waliszko" and Email - "my_email@domain.com"
And finally, the problem is:
If the query is something like *_email@* or *ali*, the result is fine. But when the search term contains a space, e.g. *ek wa*, nothing is returned. Why, and how can I fix it?
Btw: I'm using RavenDB - Build #960

Change the Index option for the fields you want to search on to Analyzed instead of Default.
Also, take a look here:
http://ayende.com/blog/152833/orders-search-in-ravendb

Lucene's query parser interprets a space in the search term as a break in the query and doesn't include it in the search.
Any part of the search term that appears after the space is also disregarded.
So you should escape the space character by putting a backslash before it.
Try the query *jarek\ waliszko*.
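The escaping suggested above can be sketched as a small helper. This is only an illustration in JavaScript, not RavenDB or Lucene API code: escapeWhitespace and containsQuery are hypothetical names, and the sketch assumes the remaining special characters have already been escaped (for example by QueryParser.Escape).

```javascript
// Hypothetical helper: prefix every whitespace character with a backslash
// so that a Lucene-style query parser treats it as part of the term.
function escapeWhitespace(term) {
  return term.replace(/(\s)/g, "\\$1");
}

// Wrap the escaped term in wildcards for a "contains" style search.
function containsQuery(term) {
  return "*" + escapeWhitespace(term) + "*";
}

console.log(containsQuery("jarek waliszko")); // *jarek\ waliszko*
```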

So, I've come up with a way to do it. I don't know if this is the "right way", but it works for me.
The query changes to:
var query = string.Format("*{0}*", Regex.Replace(QueryParser.Escape(input), @"\s+", "-"));
The index changes to:
public UsersByEmailAndName()
{
    Map = users => from user in users
                   select new
                   {
                       user.Email,
                       DisplayName = user.DisplayName.Replace(" ", "-"),
                   };
}
I've just replaced whitespace with dashes in the user's input and spaces with dashes in the indexed display name. The query now gives the expected results. Nothing else has changed; I'm still using LowerCaseKeywordAnalyzer as before.
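The idea behind this workaround, normalizing the indexed value and the search term in the same way, can be sketched outside RavenDB. The JavaScript below is only a simulation: dashify and matchesContains are hypothetical helpers, and the wildcard match is approximated with a lowercase substring test rather than a real Lucene query.

```javascript
// Hypothetical helper: collapse each run of whitespace into a single dash.
function dashify(text) {
  return text.replace(/\s+/g, "-");
}

// Simulated "*term*" match against a lowercased, single-token field value.
function matchesContains(indexedValue, pattern) {
  const needle = pattern.slice(1, -1).toLowerCase(); // strip the two wildcards
  return indexedValue.toLowerCase().includes(needle);
}

const indexedDisplayName = dashify("jarek waliszko"); // index side
const pattern = "*" + dashify("ek wa") + "*";         // query side

console.log(matchesContains(indexedDisplayName, pattern)); // true
console.log(matchesContains("jarek waliszko", pattern));   // false
```

Note that the query above collapses whole runs of whitespace (\s+), while the index replaces each single space. Applying the same normalization on both sides, as in this sketch, avoids a mismatch when a name contains consecutive spaces.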

Related

CloudSearch or CloudQuery to search by 'contains' in CloudBoost

I need to filter data by substring. For example, given this data:
'John','Markus','james'
if I search for all elements that contain 'm', it should return:
'Markus','james'
Or if I filter by 'hn', the result should be:
'John'
How can I do this using CloudSearch or CloudQuery?
EDIT: I have seen the wildcard method, which seems to fit my requirements, except that it only accepts a single column (string) parameter. I would also need to filter by several columns (array), as in the searchOn method.
This should work, I think. Did you try this:
var query = new CB.CloudQuery('TableName');
//then you can:
query.substring('ColName','Text');
//or
query.substring(['ColName1','ColName2'],'Text');
//or
query.substring('ColName',['Text1', 'Text2']);
//or
query.substring(['ColName1','ColName2'],['Text1', 'Text2']);
query.find(callback);
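The behaviour asked for in the question can be illustrated locally, without calling CloudBoost at all. The sketch below is plain JavaScript over an in-memory array; filterBySubstring is a hypothetical helper, and treating several columns or several texts as an OR, as well as matching case-insensitively, are assumptions taken from the examples in the question rather than documented CloudBoost behaviour.

```javascript
// Hypothetical in-memory filter: keep rows where any of the given columns
// contains any of the given texts (case-insensitive).
function filterBySubstring(rows, columns, texts) {
  const cols = Array.isArray(columns) ? columns : [columns];
  const needles = (Array.isArray(texts) ? texts : [texts]).map(t => t.toLowerCase());
  return rows.filter(row =>
    cols.some(col => {
      const value = row[col] == null ? "" : String(row[col]).toLowerCase();
      return needles.some(needle => value.includes(needle));
    })
  );
}

const rows = [{ name: "John" }, { name: "Markus" }, { name: "james" }];
console.log(filterBySubstring(rows, "name", "m").map(r => r.name));  // [ 'Markus', 'james' ]
console.log(filterBySubstring(rows, "name", "hn").map(r => r.name)); // [ 'John' ]
```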

Sphinx search infix and exact words in different fields

I'm using sphinx as search engine and I need to be able to do a search in different fields but using infix for one of the fields and exact word matches for another.
Simple example:
My source has for field_1 the value "abcdef" and for field_2 the value "12345", what I need to accomplish is to be able to search by infix in field_1 and exact word in field_2. So a search like "cde 12345" would return the doc I mentioned.
Previously, with Sphinx v2.0.4, I was able to obtain these results just by defining infix_fields/prefix_fields on my index, but I'm now using v2.2.9 with the new dict=keywords mode, in which infix_fields is deprecated.
My index definition:
index my_index : my_base_index
{
    source = my_src
    path = /path/to/my_index
    min_word_len = 1
    min_infix_len = 3
}
So far I've tried using the extended query syntax in the following way:
$cl = new SphinxClient();
$q = '(@(field_1) *cde* *12345*) | (@(field_2) cde 12345)';
$result = $cl->Query($q, 'my_index');
This doesn't work because, for each field, Sphinx does an AND search and one of the words is not in the specified field: "12345" is not a match in field_1 and "cde" is not a match in field_2. Also, I don't want to do an OR search; I need both words to match.
Is there a way to accomplish what I need?
It's a bit tricky, but it can be done:
$q = "((@field_1 *cde*) | (@field_2 cde)) ((@field_1 *12345*) | (@field_2 12345))";
(You don't need the brackets around the field name in the @ syntax if there is just one field, so I removed them for brevity.)
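The pattern in this answer generalizes to any number of search words: each word gets its own group that ORs the infix match in one field with the exact match in the other, and the groups are ANDed by juxtaposition. A minimal JavaScript sketch of that string construction follows; buildSphinxQuery is a hypothetical name, and the sketch assumes the words contain no characters that are special in the extended query syntax.

```javascript
// Hypothetical builder: one "(infix | exact)" group per word, joined by
// spaces, which the extended query syntax treats as an implicit AND.
function buildSphinxQuery(words, infixField, exactField) {
  return words
    .map(word => `((@${infixField} *${word}*) | (@${exactField} ${word}))`)
    .join(" ");
}

console.log(buildSphinxQuery(["cde", "12345"], "field_1", "field_2"));
// ((@field_1 *cde*) | (@field_2 cde)) ((@field_1 *12345*) | (@field_2 12345))
```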

Rails 4 and Mongoid: programmatically build query to search for different conditions on the same field

I'm building an advanced search feature and, thanks to the help of some Ruby fellows on SO, I've already been able to combine AND and OR conditions programmatically on different fields of the same class.
I ended up writing something similar to the accepted answer mentioned above, which I reproduce here:
query = criteria.each_with_object({}) do |(field, values), query|
  field = field.in if(values.is_a?(Array))
  query[field] = values
end
MyClass.where(query)
Now, what might happen is that someone wants to search on a certain field with multiple criteria, something like:
"all the users where names contains 'abc' but not contains 'def'"
How would you write the query above?
Please note that I already have the regexes to do what I want to (see below), my question is mainly on how to combine them together.
# contains
Regexp.new('.*' + val + '.*')
# not contains
Regexp.new('^((?!' + val + ').)*$')
Thanks for your time!
* UPDATE *
I was playing with the console and this is working:
MyClass.where(name: /.*abc.*/).and(name: /^((?!def).)*$/)
My question remains: how do I do that programmatically? I shouldn't end up with more than two conditions on the same field but it's something I can't be sure of.
You could use an :$and operator to combine the individual queries:
MyClass.where(:$and => [
  { name: /.*abc.*/ },
  { name: /^((?!def).)*$/ }
])
That would change the overall query builder to something like this:
components = criteria.map do |field, value|
  field = field.in if(value.is_a?(Array))
  { field => value }
end
query = components.length > 1 ? { :$and => components } : components.first
You build a list of the individual components and then, at the end, either combine them with :$and or, if there aren't enough components for :$and, just unwrap the single component and call that your query.
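The same shape can be sketched in JavaScript to show why a list of components is needed: an object keyed by field name cannot hold two conditions for the same field, whereas a list of [field, condition] pairs can. Everything below is illustrative only; escapeRegex, contains, notContains and buildQuery are hypothetical helpers, and the final filter merely simulates how a $and of regex conditions would be evaluated.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// The two regexes from the question, built from a literal value.
function contains(val) {
  return new RegExp(".*" + escapeRegex(val) + ".*");
}
function notContains(val) {
  return new RegExp("^((?!" + escapeRegex(val) + ").)*$");
}

// criteria is a list of [field, condition] pairs, so one field may repeat.
function buildQuery(criteria) {
  const components = criteria.map(([field, value]) => ({ [field]: value }));
  return components.length > 1 ? { $and: components } : components[0];
}

const query = buildQuery([["name", contains("abc")], ["name", notContains("def")]]);

// Simulated evaluation of the $and components against sample names.
const names = ["xabcx", "abcdef", "def"];
console.log(names.filter(n => query.$and.every(c => c.name.test(n)))); // [ 'xabcx' ]
```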

Couchdb view filtering by date

I have a simple document structure named Order with the fields id, name,
userId and timeScheduled.
What I would like to do is create a view where I can find the
document.id for those whose userId is some value and timeScheduled is
after a given date.
My view:
"by_users_after_time": {
"map": "function(doc) { if (doc.userId && doc.timeScheduled) {
emit([doc.timeScheduled, doc.userId], doc._id); }}"
}
If I do
localhost:5984/orders/_design/Order/_view/by_users_after_time?startKey="[2012-01-01T11:40:52.280Z,f98ba9a518650a6c15c566fc6f00c157]"
I get every result back. Is there a way to access key[1] to do an if
doc.userId == key[1] or something along those lines and simply emit on the
time?
This would be the SQL equivalent of
select id from Order where userId =
"f98ba9a518650a6c15c566fc6f00c157" and timeScheduled >
2012-01-01T11:40:52.280Z;
I did quite a few Google searches but I can't seem to find a good tutorial
on working with multiple keys. It's also possible that my approach is
entirely flawed so any guidance would be appreciated.
You only need to reverse the order of the key, because the user ID is known:
function (doc) {
  if (doc.userId && doc.timeScheduled) {
    emit([doc.userId, doc.timeScheduled], 1);
  }
}
Then query with:
?startkey=["f98ba9a518650a6c15c566fc6f00c157","2012-01-01T11:40:52.280Z"]
NOTES:
The query parameter is startkey, not startKey.
The value of startkey is an array, not a string, so the double quotes go around the user ID and date values, not around the array.
I emit 1 as the value instead of doc._id to save disk space. Every row of the result already has an id field containing doc._id, so there's no need to repeat it.
Don't forget to set endkey=["f98ba9a518650a6c15c566fc6f00c157",{}], otherwise you get the data of all users > "f98ba9a518650a6c15c566fc6f00c157".
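The notes above can be combined into a small URL builder. This is a sketch only: buildViewUrl is a hypothetical helper, and the base URL is taken from the question. JSON.stringify produces the array syntax with the quotes in the right places, and encodeURIComponent keeps the brackets and quotes intact in the query string.

```javascript
// Hypothetical helper: build the view URL for one user and a start time.
function buildViewUrl(baseUrl, userId, afterIso) {
  const startkey = JSON.stringify([userId, afterIso]); // JSON array, not a quoted string
  const endkey = JSON.stringify([userId, {}]);         // {} sorts after any string
  return baseUrl +
    "?startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey);
}

const viewUrl = buildViewUrl(
  "http://localhost:5984/orders/_design/Order/_view/by_users_after_time",
  "f98ba9a518650a6c15c566fc6f00c157",
  "2012-01-01T11:40:52.280Z"
);
console.log(viewUrl);
```

Keep in mind that startkey is inclusive, so a document scheduled at exactly the given timestamp is also returned, unlike the strict > in the SQL version.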
The answer actually came from the couchdb mailing list:
Essentially, the Date.parse() doesn't like the +0000 on the timestamps. By
doing a substring and removing the +0000, everything worked.
For the record,
document.write(new Date("2012-02-13T16:18:19.565+0000")); // Outputs "Invalid Date"
document.write(Date.parse("2012-02-13T16:18:19.565+0000")); // Outputs NaN
But if you remove the +0000, both lines of code work perfectly.
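As an alternative to stripping the suffix, the offset can be rewritten into the +00:00 form that the ECMAScript date-time format defines, which keeps the time zone information. This is a different technique from the one described above, shown only as a sketch; parseTimestamp is a hypothetical helper and it assumes the offset, if present, is the last thing in the string.

```javascript
// Hypothetical helper: turn a trailing "+HHMM" / "-HHMM" offset into
// "+HH:MM" / "-HH:MM" before handing the string to Date.parse.
function parseTimestamp(timestamp) {
  const normalized = timestamp.replace(/([+-]\d{2})(\d{2})$/, "$1:$2");
  return Date.parse(normalized);
}

const parsed = parseTimestamp("2012-02-13T16:18:19.565+0000");
console.log(parsed === Date.UTC(2012, 1, 13, 16, 18, 19, 565)); // true
```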

How do I search for a wildcard character in Microsoft CRM 4.0?

I need to search for accounts in Microsoft CRM, using a wildcard search to get a "contains" search for the user's input. So if the user enters "ABC", I use ConditionOperator.Like and the value "%ABC%".
My question is, how would I search for a customer name that contains a percentage sign, such as "100% Great llc"? I can't find a way to escape the %.
It sounds like you're looking for a SQL-based approach, so I'm not sure if this helps.
One way I know is through the user interface, with an asterisk (*).
So if you want to find all of the accounts that have a % sign, just type *% into the account search.
Try using square brackets for special characters, for instance [%]. So the condition would be: 100[%] Great llc or %100[%] Great llc%.
--EDIT--
This is in response to your comment.
Try using a ConditionExpression, something like the following:
//1. Condition expression.
ConditionExpression nameCondition= new ConditionExpression();
nameCondition.AttributeName = "name"; // attribute name of the account name, matching the column set below
nameCondition.Operator = ConditionOperator.Like;
nameCondition.Values = new string[] { "%100[%] Great llc%" };
//2. Create filter expression
FilterExpression nameFilter = new FilterExpression();
nameFilter.Conditions = new ConditionExpression[] { nameCondition };
//3. Provide columns
ColumnSet resultSetColumns = new ColumnSet();
resultSetColumns.Attributes = new string[] { "name", "address" };
//4. Prepare query expression
QueryExpression qryExpression = new QueryExpression();
qryExpression.Criteria = nameFilter;
qryExpression.ColumnSet = resultSetColumns;
//5. Set the table to query.
qryExpression.EntityName = EntityName.account.ToString();
//6. Retrieve the matching accounts.
BusinessEntityCollection accountsResultSet = service.RetrieveMultiple(qryExpression);
I have played a lot with CRM but have never come across the special-characters scenario. Let me know your findings. This article has some revelations.
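The bracket escaping suggested above can be applied to arbitrary user input before the surrounding wildcards are added. The JavaScript sketch below only builds the pattern string; escapeLike and containsPattern are hypothetical helpers, and it assumes the Like operator follows SQL Server's LIKE rules, where %, _ and [ are the characters that need to be wrapped in square brackets.

```javascript
// Hypothetical helper: wrap each LIKE wildcard character in square
// brackets so that it is matched literally.
function escapeLike(input) {
  return input.replace(/[\[%_]/g, "[$&]");
}

// Build a "contains" pattern around the escaped input.
function containsPattern(input) {
  return "%" + escapeLike(input) + "%";
}

console.log(containsPattern("100% Great llc")); // %100[%] Great llc%
```

Doing the replacement in a single pass matters: escaping [ in a separate step after % would re-escape the brackets that were just added.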
