I'm developing a mini search engine, and I want to implement searches based on the logical operators AND, OR...
I'm having difficulty parsing a query containing AND, OR, NOT... especially when it comes to parentheses, e.g. (cat or dog) not (bike not mike)
For simple AND and OR queries it's straightforward and I figured out how to formulate the SQL query, but when it becomes that complicated I'm lost!
I'm not sure if search engines have this feature, but I want to dive into it for learning purposes.
I apologize for my last question which wasn't really clear, I hope this time I'm doing better.
I'd recommend looking at a lexer/parser generator like ANTLR. A simple grammar should be able to sort you out. There might even be an existing grammar for such a thing.
Take a look at the searchparser.py example from the pyparsing project.
It shows a way to implement:
AND,
OR,
NOT,
grouping and
wildcards.
All done in 293 lines of code (including comments and tests) ...
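If you would rather see the idea without a parser library, here is a minimal hand-written recursive-descent sketch in Python. It is not the pyparsing example; the grammar is my own assumption: OR binds loosest, AND is implied between adjacent operands, and NOT is a prefix operator, so "a not b" reads as "a AND (NOT b)". The names `tokenize`, `Parser` and `matches` are hypothetical.

```python
import re

# A token is a parenthesis or a run of non-space, non-parenthesis characters.
TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")
KEYWORDS = {"and", "or", "not"}


def tokenize(query):
    """Split the query into lowercase tokens."""
    return [tok.lower() for tok in TOKEN_RE.findall(query)]


class Parser:
    """Assumed grammar:
    or_expr  := and_expr ("or" and_expr)*
    and_expr := unary (["and"] unary)*      # AND may be implicit
    unary    := "not" unary | "(" or_expr ")" | WORD
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        node = self.or_expr()
        if self.peek() is not None:
            raise ValueError(f"unexpected token {self.peek()!r}")
        return node

    def or_expr(self):
        node = self.and_expr()
        while self.peek() == "or":
            self.take()
            node = ("or", node, self.and_expr())
        return node

    def and_expr(self):
        node = self.unary()
        # Anything that is not "or", ")" or end of input starts another operand.
        while self.peek() not in (None, "or", ")"):
            if self.peek() == "and":
                self.take()
            node = ("and", node, self.unary())
        return node

    def unary(self):
        tok = self.take()
        if tok is None:
            raise ValueError("unexpected end of query")
        if tok == "not":
            return ("not", self.unary())
        if tok == "(":
            node = self.or_expr()
            if self.take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if tok in KEYWORDS or tok == ")":
            raise ValueError(f"unexpected token {tok!r}")
        return ("term", tok)


def matches(node, words):
    """Evaluate a parsed query against the set of words in one document."""
    kind = node[0]
    if kind == "term":
        return node[1] in words
    if kind == "not":
        return not matches(node[1], words)
    left, right = matches(node[1], words), matches(node[2], words)
    return (left and right) if kind == "and" else (left or right)


tree = Parser(tokenize("(cat or dog) not (bike not mike)")).parse()
```

Once you have the tree, turning it into a SQL WHERE clause is a walk over the same tuples, emitting text instead of evaluating booleans the way `matches` does.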
If you are using MySQL you can use the builtin boolean search:
http://dev.mysql.com/doc/refman/5.1/en/fulltext-boolean.html
Related
I'd like to know what suggestions there are for Googling (or using other search engines if preferable) for Ruby syntax. I'm very new, and a substantial part of my baptism by fire comes by way of reading other people's code. Ruby in particular can be challenging this way-- it's fantastically compact and easy to read if you know how to read it, so to speak. But figuring that out can be difficult at times. It's worth it, but difficult. So, for example, let's say I encounter an expression like this:
tquery = "#{MASTER_URL}#{query_str}"
Well, apparently there's something going on with the syntax #{stuff}, but what? A variable being manipulated, it seems? If you encountered such an expression and didn't know about interpolation/substitution and have no ready access to someone to ask directly, how would you go about Googling that? That's just an example, of course, but I hope it illustrates the type of problem I'd like to address.
Also, if there are better tags to apply to this post, please let me know and I will add them. Thank you.
Symbolhound is pretty good for this. For example, here's the search for Ruby's #{}.
Mind, as you can see from the results, it doesn't necessarily come immediately back and tell you what the notation you searched for is named, or how it's defined, but it does return some helpful results to get you started. It's especially useful for punctuation-based syntax elements that are difficult or impossible to search for in other search engines.
Can anyone please help me with a general Entity Framework question? I'm a newbie and trying to teach myself from reading and trial & error. However, I'm getting REALLY confused on all the syntax and terminology. And the more I google, the more confused I get!
What in the world are those little arrows (=>) used in the syntax? And I'm not even sure what the name of the syntax is...is it Entity Framework syntax? Linq to method syntax? Linq to Entity syntax?
Why does it seem like you can use random letters when using that syntax? The "f" below seems interchangeable with any letter of the alphabet, since Intellisense gives me options no matter what letter I type. So what is that letter supposed to stand for anyway? There seems to be no declaration for it.
var query = fruits.SelectMany(f => f.Split(' '));
Is it better to use the syntax with the little arrows or to use the "pseudo SQL" that I keep seeing, like below? This seems a little easier to understand, but is it considered not the Real Entity Framework Way?
var query = from f in fruits from word in f.Split(' ') select word;
And, for any of them, is there any documentation out there ANYWHERE?? I've been scouring the internet for tutorials, articles, anything, but all that comes back are small sample queries varying between the little arrows and that pseudo SQL, with no explanations beyond "here's how to do a select:"
I would much appreciate any guidance or assistance. I think if I can just find out where to start, then I can build myself from there. Thanks!
There is no "real Entity Framework way". There is LINQ query syntax and there are the LINQ extension methods, which in my opinion are much easier on the eyes. Also, you can use LINQ with more than just EF.
Language Integrated Query
LINQ extends the language by the addition of query expressions, which are akin to SQL statements, and can be used to conveniently extract and process data from arrays, enumerable classes, XML documents, relational databases, and third-party data sources. Other uses, which utilize query expressions as a general framework for readably composing arbitrary computations, include the construction of event handlers or monadic parsers.
1 It is called a lambda expression, and it is basically an anonymous method.
Exploring Lambda Expression in C#
2 You can use anything you want, a word or a letter, anything that is a valid name for a parameter, because that is what it is: a parameter.
3 I find the LINQ extension methods to be cleaner, and to be honest the last thing I want to see is SQL-like statements lying around in the code.
4 A good start can be found here
101 LINQ SAMPLES
The arrow is called a Lambda operator, and it's used to create Lambda expressions. This has nothing to do with EF, or Linq or anything else. It's a feature of C#. EF and Linq just use this feature a lot because it's very useful for writing queries.
Marco has given links to the relevant documentation.
Linq is a library of extension methods that primarily operate on types implementing the IEnumerable and IQueryable interfaces, and it gives you a lot of power to work with collections of various types. You can write Linq queries in either of two formats, so-called Method Syntax and Query Syntax. They are functionally identical, and which one you use is generally a matter of personal preference (although many of us use both, since depending on the context one or the other is easier to use).
It seems there are two ways to build queries -- either using query expressions:
IEnumerable<Customer> result =
from customer in customers
where customer.FirstName == "Donna"
select customer;
or using extension methods:
IEnumerable<Customer> result =
customers.Where(customer => customer.FirstName == "Donna");
Which do you use and why? Which do you think will be more popular in the long-run?
Only a limited number of operations are available in the expression syntax, for example, Take() or First() are only available using extension methods.
I personally prefer query expressions when all the required operations are available, as I find them easier to read than lambdas; if not, I fall back to extension methods.
Take a look at this answer:
Linq Extension methods vs Linq syntax
I use the method syntax (almost) exclusively, because the query syntax has more limitations. For maintainability reasons, I find it preferable to use the method syntax right away, rather than maybe converting it later, or using a mix of both syntaxes.
It might be a little harder to read at first, but once you get used to it, it feels fairly natural.
I only use the method syntax. This is because I find it a lot faster to write, and I write a ton of linq. I also like it because it is more terse. If working on a team, it's probably best to come to a consensus as to which is the preferred style, as mixing the two styles is hard to read.
Microsoft recommends the query syntax. "In general, we recommend query syntax because it is usually simpler and more readable". http://msdn.microsoft.com/en-us/library/bb397947.aspx
It depends on which you and your team find more readable, and I would choose on a case-by-case basis. There are some queries that read better in query form and some that read better in method form. And of course, there is that broad middle ground where you can't say one way or the other, or where some prefer it this way and others that way.
Keep in mind that you can mix both forms together where it might make it more readable.
I see no reason to suspect that either form will disappear in the future.
I'm looking for an application to display what a linq expression would do, in particular regarding the usage of multiple access to the same list in a query.
Better yet, the tool would tell if the linq query is good.
I used the expression tree visualizer in the past to at least help decode what is inside an expression tree. It helps in figuring out the parts of the tree and how each part is related to the others.
Well, to begin with, I could easily foresee a tool that would pick a query apart and detect that the Where-clause is the standard runtime implementation, and thus not examine that method, but "know" what the execution plan for that method would be, and could thus piece together a plan for the whole query.
Right up until the point where you introduce a custom Linq provider, where the only way to figure out what it will be doing would be to read the code.
So I daresay there is no such tool, and making one would be very hard.
Would be fun to try though, at least for standard classes, would be a handy debugging visualizer for Visual Studio.
What about making the tool yourself?! ;)
Take a look at expression trees; I believe they might be useful.
What are the drawbacks of LINQ in general?
Can be hard to understand when you first start out with it
Deferred execution can separate errors from their causes (in terms of time)
Out-of-process LINQ (e.g. LINQ to SQL) will always be a somewhat leaky abstraction - you need to know what works and what doesn't, essentially
I still love LINQ massively though :)
EDIT: Having written this short list, I remembered that I've got an answer to a very similar question...
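To make the deferred-execution point concrete, here is a small sketch in Python rather than C#. A Python generator expression behaves much like a lazily evaluated LINQ query for this purpose; the function names `build_query` and `run_query` are hypothetical. The line that builds the query succeeds even though the data is bad, and the failure only shows up later, wherever the query happens to be consumed.

```python
def build_query(raw_values):
    """Return a lazy pipeline; no element is converted until it is iterated."""
    return (int(value) for value in raw_values)


def run_query(query):
    """Consume the pipeline; this is where a bad element finally raises."""
    return list(query)


# Building the query succeeds even though "oops" cannot be converted.
query = build_query(["1", "2", "oops"])

try:
    run_query(query)
    error = None
except ValueError as exc:
    # The exception surfaces here, far from the line that defined the query.
    error = exc
```

In a real program the build step and the consuming step may be in different methods or layers, which is what makes the cause hard to trace from the stack trace alone.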
The biggest pain with LINQ is that (with database backends) you can't use it over a repository interface without it being a leaky abstraction.
LINQ is fantastic within a layer (especially the DAL etc), but since different providers support different things, you can't rely on Expression<Func<...>> or IQueryable<T> features working the same for different implementations.
As examples, between LINQ-to-SQL and Entity Framework:
EF doesn't support Single()
EF will error if you Skip/Take/First without an explicit OrderBy
EF doesn't support UDFs
etc. The LINQ provider for ADO.NET Data Services supports different combinations. This makes mocking and other abstractions unsafe.
But: for in-memory (LINQ-to-Objects), or in a single layer/implementation... fantastic.
Some more thoughts here: Pragmatic LINQ.
Like any abstraction in programming, it is vulnerable to a misunderstanding: "If I just understand this abstraction, I don't need to understand what's happening under the covers."
The truth is, if you do understand what's happening under the covers, you'll get much better value out of the abstraction, because you'll understand where it ceases to be applicable, so you'll be able to apply it with greater confidence of success where it is appropriate.
This is true of all abstractions, and applies to Linq in bucketfuls. To understand Linq to Objects, the best thing to do is to learn how to write Select, Where, Aggregate, etc. in C# with yield return. And then figure out how yield return replaces a lot of hand-written code by writing it all with classes. Then you'll be able to use it with an appreciation of the effort it is saving you, and it will no longer seem like magic, so you'll understand the limitations.
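As a rough illustration of that exercise, here is what hand-written versions might look like, sketched in Python, whose `yield` plays the same role here as C#'s `yield return`. The function names mirror the Linq operators but are my own, and this is not how the real library is implemented.

```python
def where(source, predicate):
    """Lazily yield only the items for which predicate(item) is true."""
    for item in source:
        if predicate(item):
            yield item


def select(source, selector):
    """Lazily yield selector(item) for every item."""
    for item in source:
        yield selector(item)


def aggregate(source, seed, func):
    """Eagerly fold the sequence into a single value, starting from seed."""
    accumulator = seed
    for item in source:
        accumulator = func(accumulator, item)
    return accumulator


# Sum of the squares of the even numbers below 10: 0 + 4 + 16 + 36 + 64.
evens = where(range(10), lambda n: n % 2 == 0)
squares = select(evens, lambda n: n * n)
total = aggregate(squares, 0, lambda acc, n: acc + n)
```

Note that `where` and `select` do no work until something iterates over them; only `aggregate` actually pulls items through the pipeline.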
Same for the variants of Linq where the predicates are captured as expressions and transported off to another environment to be executed. You have to understand how it works in order to safely use it.
So the number 1 drawback of Linq is: the simple examples look deceptively short and simple. The problem is, how did the author of the sample know what to write? Because they knew how to write it all out in long form, and they knew how pieces of Linq could be used as abbreviations, and so they arrived at the nice short version.
As I say, not really specific to Linq, but highly relevant to it anyway.
Anonymous types. A proper ORM should always return objects of 'your' type (a partial class, with the possibility of adding your own methods, overriding, etc.). There are dozens of tutorials and examples of different complex queries using linq, but none of them care to explain the advantage of returning a 'bag of properties' (return new { .........} ). How am I supposed to work with an anonymous type, wrap it in another class again?
Actually I can't think of any drawbacks. It makes programming life a lot simpler because a lot of things can be written in a more compact but still more readable way.
But having said this, I must also agree with Jon that you should have some idea what you're doing (but that holds for all technological advances).
The only drawback it has is its performance; see this article.