Big ontology based on rules - prolog

Could someone tell me where I can find an ontology based on rules, not an OWL one. It can be a Prolog one, or similar.
The ontology should have between 10.000 and 100.000 rules in size, or even more. The maximum limit is not very important, the important is the minimum.
Thank you very much.

Well, mostly depends on what you mean by 'rule'. It's a very generic term... If you just want a shallow data model, you can easily generate your own 'ontology' populating SWI-Prolog well engineered Semantic Web DB with some data of your own choice.
For instance, attempting to learn more about RDF, I've done a similar job, putting filesystem info (Prolog sources, actually) in an 'ontology' I designed, and then drawn various hierarchies, testing my pqGraphviz interface.
I think you can get millions 'rules' this way, in seconds... but, how to do something useful with such data it's much more difficult. A good step would be integrating in Cliopatria, but I didn't get it.
There are so many resource available... for instance ROS, and specifically knowrob, seems a good place to search, since it has specific SWI-Prolog hooks.
I hope someone other will contribute a more comprehensive/interesting answer...

Related

Artificial Intelligence/Rules to guess user taste in Apparel/Clothing

Are there standard rules engine/algorithms around AI that would predict the user taste on a particular kind of product like clothes.
I know it's one thing all e-commerce website will kill for. But I am looking out for theoretical patterns defined out there which would help make that prediction in a better way, if not accurately.
Two books that cover recommender systems:
Programming Collective Intelligence: Python, does a good job explaining the algorithm, but doesn't provide enough help IMO in terms of understanding how to scale.
Algorithms of the Intelligent Web: Java, harder to follow, but also covers using persistence, in this case MySQL, to facilitate scaling and identifiers areas in example code that will not scale as-is.
Basically two ways of approaching the problem, user or item based. Netflix appears to use the former, while Amazon the latter. Typically user based requires more time and/or processing power to generate recommendations because you tend to have more users than items to consider.
Not sure how to answer this, as this question is overly broad. What you are describing is a Machine Learning kind of task, and thus would fall under that (very broad) umbrella. There are a number of different algorithms that can be used for something like this, but most texts would tell you that the definition of the problem is the important part.
What parts of fashion are important? What parts are not? How are you going to gather the data? How noisy is the data? All of these are important considerations to the problem space. Pandora does a similar type of thing with music, with their big benefit being that their users tell them initially what they like and don't like.
To categorize their music, they actually have trained musicians listening to the music to identify all sorts of stuff. See the article on Ars Technica here for more information about that. Based on what I know about fashion tastes, I would say that it is a similar problem space, and would probably require experts to "codify" the information before you could attempt to draw parallels.
Sorry for the vague answer - if you want more specifics, I would recommend asking a more specific question, about specific algorithms or data sets, etc.

Anyone with specific pointers or experience with a 'Generic Rules Engine'?

I am looking to integrate with a 'Generic Rules Engine' based on the request of a customer.
I think the objective is to allow business stakeholders to add 'Rules', and have those be incorporated into an overall metric calculated on a dataset. So far, the Rules i have heard seem like straightforward snippets of logic in the code. I suppose the drawback is that even though simple, this would still need to be coded... (as opposed to some kind of runtime or data driven rule specification automatically used in the analysis.)
hopefully not too vague - but anyone have any success with such a thing? which open source projects have the most promise?
thanks
I have played around with DROOLS, a rule engine from JBOSS. I have seen it use in large scale production systems. It offers representation of rules in various different formats such as -- flat rule file written in JAVA or MVEL; using DROOLS rule flow, and decision tables composed in EXCEL.
The execution of rules are using RETE algorithm, which is supposedly faster due to rule memorization and variable sharing. As pointed out by Doug, there are a lot of information on Wikipedia
Expert Systems were the AI rage in the 80s.
There's lots of info on Wikipedia on the Rete Algorithm
See also Inference Engine
One well regarded toolkit is CLIPS

What are algorithms and data structures in layman’s terms?

I currently work with PHP and Ruby on Rails as a web developer. My question is why would I need to know algorithms and data structures? Do I need to learn C, C++ or Java first? What are the practical benefits of knowing algorithms and data structures? What are algorithms and data structures in layman’s terms? (As you can tell unfortunately I have not done a CS course.)
Please provide as much information as possible and thank you in advance ;-)
Data structures are ways of storing stuff, just like you can put stuff in stacks, queues, heaps and buckets - you can do the same thing with data.
Algorithms are recipes or instructions, the quick start manual for your coffee maker is an algorithm to make coffee.
Algorithms are, quite simply, the steps by which you do something. For instance the Coffee Maker Algorithm would run something like
Turn on Coffee Maker
Grind Coffee Beans
Put in filter and place coffee in filter
Add Water
Start brewing process
Drink coffee
A data structure is a means by which we store information in a organized fashion. For further info, check out the Wikipedia Article.
An algorithm is a list of instructions and data structures are ways to represent information. If you're writing computer programs then you're already using algorithms and data structures even if you don't know what the words mean.
I think the biggest advantages in knowing standard algorithms and data structures are:
You can communicate with other programmers using a common language.
Other people will be able to understand your code once you've left.
You will also learn better methods for solving common problems. You could probably solve these problems eventually anyway even without knowing the standard way to do it, but you will spend a lot of time reinventing the wheel and it's unlikely your solutions will be as good as those that thousands of experts have worked on and improved over the years.
An algorithm is a sequence of well defined steps leading to the solution of a type of problem.
A data structure is a way to store and organize data to facilitate access and modifications.
The benefit of knowing standard algorithms and data structures is they are mostly better than you yourself could develop. They are the result of months or even years of work by people who are far more intelligent than the majority of programmers. Knowing a range of data structures and algorithms allows you to fit a problem roughly to a data structure or/and algorithm and tweak as required.
In the classic "cooking/baking equivalent", algorithms are recipes and data structures are your measuring cups, your baking sheets, your cookie cutters, mixing bowls and essentially any other tool you would be using (your cooker is your compiler/interpreter, though).
(source: mit.edu)
This book is the bible on algorithms. In general, data structures relate to how to organize your data to access it in memory, and algorithms are methods / small programs to resolve problems (ex: sorting a list).
The reason you should care is first to understand what can go wrong in your code; poorly implemented algorithms can perform very badly compared to "proven" ones. Knowing classic algorithms and what performance to expect from them helps in knowing how good your code can be, and whether you can/should improve it.
Then there is no need to reinvent the wheel, and rewrite a buggy or sub-optimal implementation of a well-known structure or algorithm.
An algorithm is a representation of the process involved in a computation.
If you wanted to add two numbers then the algorithm might go:
Get first number;
Get second number;
Add first number to second number;
Return result.
At its simplest, an algorithm is just a structured list of things to do - its use in computing is that it allows people to see the intent behind the code and makes logical (as opposed to syntactical) errors easier to spot.
e.g. if step three above said multiply instead of add then someone would be able to point out the error in the logic without having to debug code.
A data structure is a representation of how a system's data should be referenced. It might match a table structure exactly or may be de-normalised to make data access easier. At its simplest it should show how the entities in a system are related.
It is too large a topic to go into in detail but there are plenty of resources on the web.
Data structures are critical the second your software has more than a handful of users. Algorithms is a broad topic, and you'll want to study it if a good knowledge of data structures doesn't fix your performance problems.
You probably don't need a new programming language to benefit from data structures knowledge, though PHP (and other high level languages) will make a lot of it invisible to you, unless you know where to look. Java is my personal favorite learning language for stuff like this, but that's pretty subjective.
My question is why would I need to know algorithms and data structures?
If you are doing any non-trivial programming, it is a good idea to understand the class data structures and algorithms and their uses in order to avoid reinventing the wheel. For example, if you need to put an array of things in order, you need to understand the various ways of sorting, so that you can choose the most appropriate one for the task in hand. If you choose the wrong approach, you can end up with a program that is grossly inefficient in some circumstances.
Do I need to learn C, C++ or Java first?
You need to know how to program in some language in order to understand what the algorithms and data structures do.
What are the practical benefits of knowing algorithms and data structures?
The main practical benefits are:
to avoid having to reinvent the wheel all of the time,
to avoid the problem of square wheels.

What does this software quote mean?

I was reading Code Complete (2nd Edition), and came across a quote in the margin on page 87 by Bertrand Meyer.
Ask not first what the system does; ask WHAT it does it to!
What exactly is the point Mr. Meyer is trying to get across here. I have some rough ideas, but I would like to make sure I really understand.
... So this is the second fallacy of teleology
- to attribute goal-directed
behavior to things that are not
goal-directed, perhaps without even
thinking of the things as alive and
spirit-inhabited, but only thinking, X
happens in order to Y. "In order to"
is mentalistic language, even though
it doesn't seem to name a blatantly
mental property like "fearful" or
"thinks it can fly". — Eliezer Yudkowsky, artificial intelligence theorist
concerned with self-improving AIs with stable goal systems
Bertrand Meyer's homily suggests that sound reasoning about systems is grounded in knowing what concrete entities are altered by the system; the purpose of the alterations is an emergent property.
I believe the point here is not on what the system does, but on the data it operates on and what those operations are.
This provides two major thinking shifts:
You think of the data and concepts first
You think of operations on that data
With those two "baselines" you will better prepared to organize a system to achieve your goals so that operations on data are well understood and make sense.
In effect, he is laying the ground work to be able to write the "contracts" on the code you write.
From Google search it picked up Art Gittleman's Computing With C# and the .Net Framework:
Bertrand Meyer gives an example of
payroll program, which produces
paychecks from timecards. Management
may later want to extend this program
to produce statistics or tax
information. The payroll function
itself may need to be changed to
produce weekly checks instead of
biweekly checks, for example. The
procedures used to implement the
original payroll program would need to
be changed to make any of these
modifications. Meyer notes that any of
these payroll programs will manipulate
the same sort of data, employee
records, company regulations, and so
forth.
Focusing on the more stable
aspect of such systems, Mayer states a
principle: "Ask not first what the
system does: Ask WHAT it does to!";
and a definition: "Object-oriented
design is the method which leads to
software architectures based on
objects every system or subsystem
manipulates (rather than "the"
function it meant to ensure)."
We today take UML's class diagram and other OOAD approach for granted, but it was something that was "discovered" along the way.
Also see Object-Oriented Design.
My opinion is that the quote is meant as a method to find good abstractions in your software. The text next to this quote deals with finding real-world objects to design your classes.
An simple example would be something like this:
You are making software for a bank. Because your software is working with bank accounts, it should have a class for an account. Then you start thinking what properties accounts have and the interactions you can have with accounts.
Of course, this quote makes more sense if the objects you are trying to model aren't as clear as this case.
Fred Brooks stated it this way:
"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious."
Domain-Driven design... Understand the problem the software is designed to solve. What "domain" entities, (data abstractions) does the system manipulate ? And what does it do to those domain entities?

Minimum CompSci Knowledge Needed for Writing Desktop Apps

Having been a hobbyist programmer for 3 years (mainly Python and C) and never having written an application longer than 500 lines of code, I find myself faced with two choices :
(1) Learn the essentials of data structures and algorithm design so I can become a l33t computer scientist.
(2) Learn Qt, which would help me build projects I have been itching to build for a long time.
For learning (1), everyone seems to recommend reading CLRS. Unfortunately, reading CLRS would take me at least an year of study (or more, I'm not Peter Krumins). I also understand that to accomplish any moderately complex task using (2), I will need to understand at least the fundamentals of (1), which brings me to my question : assuming I use C++ as the programming language of choice, which parts of CLRS would give me sufficient knowledge of algorithms and data structures to work on large projects using (2)?
In other words, I need a list of theoretical CompSci topics absolutely essential for everyday application programming tasks. Also, I want to use CLRS as a handy reference, so I don't want to skip any material critical to understanding the later sections of the book.
Don't get me wrong here. Discrete math and the theoretical underpinnings of CompSci have been on my "TODO: URGENT" list for about 6 months now, but I just don't have enough time owing to college work. After a long time, I have 15 days off to do whatever the hell I like, and I want to spend these 15 days building applications I really want to build rather than sitting at my desk, pen and paper in hand, trying to write down the solution to a textbook problem.
(BTW, a less-math-more-code resource on algorithms will be highly appreciated. I'm just out of high school and my math is not at the level it should be.)
Thanks :)
This could be considered heresy, but the vast majority of application code does not require much understanding of algorithms and data structures. Most languages provide libraries which contain collection classes, searching and sorting algorithms, etc. You generally don't need to understand the theory behind how these work, just use them!
However, if you've never written anything longer than 500 lines, then there are a lot of things you DO need to learn, such as how to write your application's code so that it's flexible, maintainable, etc.
For a less-math, more code resource on algorithms than CLRS, check out Algorithms in a Nutshell. If you're going to be writing desktop applications, I don't consider CLRS to be required reading. If you're using C++ I think Sedgewick is a more appropriate choice.
Try some online comp sci courses. Berkeley has some, as does MIT. Software engineering radio is a great podcast also.
See these questions as well:
What are some good computer science resources for a blind programmer?
https://stackoverflow.com/questions/360542/plumber-programmers-vs-computer-scientists#360554
Heed the wisdom of Don and just do it. Can you define the features that you want your application to have? Can you break those features down into smaller tasks? Can you organize the code produced by those tasks into a coherent structure?
Of course you can. Identify any 'risky' areas (areas that you do not understand, e.g. something that requires more math than you know, or special algorithms you would have to research) and either find another solution, prototype a solution, or come back to SO and ask specific questions.
Moving from 500 loc to a real (eve if small) application it's not that easy.
As Don was pointing out, you'll need to learn a lot of things about code (flexibility, reuse, etc), you need to learn some very basic of configuration management as well (visual source safe, svn?)
But the main issue is that you need a way to don't be overwhelmed by your functiononalities/code pair. That it's not easy. What I can suggest you is to put in place something to 'automatically' test your code (even in a very basic way) via some regression tests. Otherwise it's going to be hard.
As you can see I think it's no related at all to data structure, algorithms or whatever.
Good luck and let us know
I must say that sitting down with a dry old textbook and reading it through is not the way to learn how to do anything effectively, even if you are making notes. Doing it is the best way to learn, using the textbooks as a reference. Indeed, using sites like this as a reference.
As for data structures - learn which one is good for whatever situation you envision: Sets (sorted and unsorted), Lists (ArrayList, LinkedList), Maps (HashMap, TreeMap). Complexity of doing basic operations - adding, removing, searching, sorting, etc. That will help you to select an appropriate library data structure to use in your application.
And also make sure you're reasonably warm with MVC - i.e., ensure your model is separate from your view (the QT front-end) as best as possible. Best would be to have the model and algorithms working on their own, and then put the GUI on top. Or a unit test on top. Etc...
Good luck!
It's like saying you want to move to France, so should you learn french from a book, and what are the essential words - or should you just go to France and find out which words you need to know from experience and from copying the locals.
Writing code is part of learning computer science. I was writing code long before I'd even heard of the term, and lots of people were writing code before the term was invented.
Besides, you say you're itching to write certain applications. That can't be taught, so just go ahead and do it. Some things you only learn by doing.
(The theoretical foundations will just give you a deeper understanding of what you wind up doing anyway, which will mainly be copying other people's approaches. The only caveat is that in some cases the theoretical stuff will tell you what's futile to attempt - e.g. if one of your itches is to solve an NP complete problem, you probably won't succeed :-)
I would say the practical aspects of coding are more important. In particular, source control is vital if you don't use that already. I like bzr as an easy to set up and use system, though GUI support isn't as mature as it could be.
I'd then move on to one or both of the classics about the craft of coding, namely
The Pragmatic Programmer
Code Complete 2
You could also check out the list of recommended books on Stack Overflow.

Resources