How to extract fuzzy knowledge when building a knowledge graph? - methods

I have recently been researching how to extract fuzzy knowledge when building a knowledge graph. So far I only know of approaches based on fuzzy mathematics, and I don't have a clear idea of how to proceed. Does anyone have a good suggestion?
By the way, is ontology construction considered part of knowledge extraction?


What's the most accurate algorithm in finding the similarities between different images

I've recently been looking for a way to search for similar images.
There are some popular algorithms in the feature matching area, for example the Perceptual Hash Algorithm, and SIFT and SURF in OpenCV. I'm wondering which one is the most accurate. Or is using multiple algorithms a good idea?
Or are there any established conclusions about the popular algorithms?
Thanks in advance.
There are a lot of algorithms for checking similarity, which in practice means matching features.
The feature algorithms I found include SURF, SIFT, BRISK, LBP, Harris, MSER, A-KAZE, FAST and so on.
In many applications, SIFT is the one selected for feature matching. However, I think you should evaluate the performance of each algorithm to find the right one for your application.
If you can't evaluate the algorithms, I think using multiple algorithms is the better option for you.
If you want to look into the features themselves, I recommend this link to understand feature extractors, descriptors, and matching.
Thank you.
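To make the perceptual-hash idea from the question concrete, here is a minimal sketch of an average hash, one simple member of that family. It assumes the images have already been decoded, converted to grayscale, and downscaled to small grids of equal size. The function names and the tiny 2x2 "images" are made up for illustration and are not part of OpenCV.

```python
def average_hash(pixels):
    """Build an integer hash: one bit per pixel, set when the pixel is above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count differing bits; a smaller distance means more similar images."""
    return bin(hash_a ^ hash_b).count("1")


# Hypothetical grayscale grids (values 0-255), standing in for downscaled images.
original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]   # same picture, uniformly brighter
inverted = [[245, 55], [35, 225]]   # light and dark areas swapped

print(hamming_distance(average_hash(original), average_hash(brighter)))
print(hamming_distance(average_hash(original), average_hash(inverted)))
```

A hash like this is cheap and tolerates brightness changes, but unlike local-feature methods such as SIFT it is not expected to cope with rotation or cropping, which is one reason to evaluate the candidates on your own data.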

Fast Downward planner

I am currently studying the Fast Downward AI planner, and I would like some help in this area. I know that the planner receives a domain.pddl file and a problem.pddl file. In addition, it receives a search algorithm and a heuristic function.
Many planners (not just Fast Downward; e.g., the pyperplan planner) give us the opportunity to modify existing search algorithms or create our own to reach a solution. But as far as I have seen, there are already so many search algorithms.
My question is: what is the point of implementing our own search algorithm?
Or am I missing something?
I'm not sure what your question is, so I'm going to give two different answers.
Why do planning systems like Fast Downward have the option to write your own search algorithms, heuristics, etc.?
Automated domain-independent planning is an active research area where new ideas are constantly developed (for example at ICAPS). Implementing a new idea to evaluate it is much easier if you can base the implementation on an existing framework than if you have to start from scratch every time. It also helps with comparability. For example, if you develop a new search algorithm but leave the heuristic the same, your implementation is much easier to compare to a baseline if the baseline uses the same heuristic implementation. That is why a lot of work is based on Fast Downward and similar frameworks.
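The separation between search algorithm and heuristic can be sketched as follows. This is not Fast Downward's or pyperplan's actual interface. It is a hypothetical, self-contained greedy best-first search in which the heuristic is just a function argument, so a different search could be swapped in while reusing the same heuristic. The toy counting problem and all names are assumptions made for illustration.

```python
import heapq
from itertools import count


def greedy_best_first_search(initial, is_goal, successors, heuristic):
    """Expand the state with the lowest heuristic value first; return a list of actions."""
    tie_breaker = count()  # keeps heap entries comparable when heuristic values tie
    frontier = [(heuristic(initial), next(tie_breaker), initial, [])]
    visited = {initial}
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        for action, next_state in successors(state):
            if next_state not in visited:
                visited.add(next_state)
                entry = (heuristic(next_state), next(tie_breaker), next_state, plan + [action])
                heapq.heappush(frontier, entry)
    return None  # no plan found


# Toy problem (hypothetical): reach 5 from 1 using "inc" (+1) and "double" (*2).
GOAL = 5

def successors(state):
    if state + 1 <= 10:
        yield "inc", state + 1
    if state * 2 <= 10:
        yield "double", state * 2

plan = greedy_best_first_search(
    initial=1,
    is_goal=lambda s: s == GOAL,
    successors=successors,
    heuristic=lambda s: abs(GOAL - s),
)
print(plan)
```

Because the heuristic is passed in rather than built into the search, two search algorithms can be compared against each other with the heuristic held fixed, which is the comparability argument made above.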
How do I come up with an idea for a new search algorithm?
This is much harder to answer. As a general approach I would say: Try to find cases where existing search algorithms "don't get it", for example, a problem that you can solve by looking at it but the search algorithm fails to solve it. Then try to figure out what you did to solve it, generalize that idea so it works on other cases as well and write it down as an algorithm.

How to tackle twitter sentiment analysis?

I'd like some advice on how to tackle this problem. At college I've been solving opinion mining tasks, but with Twitter the approach is quite different. For example, I used an ensemble learning approach to classify users' opinions about a certain hotel in Spain. Of course, I was given a training set with positive and negative opinions and then I evaluated on the test set. But now, with Twitter, I've found this kind of categorization very difficult.
Do I need to have a training set? And if the answer is yes, don't you think Twitter is so time-dependent that, even if I have that set, my performance on future topics will be very poor?
I was thinking of getting a dictionary (mainly adjectives), crossing my tweets with it, and obtaining a term-document matrix, but I have no class assigned to any tweet. Also, positive and negative adjectives could vary depending on the topic and time. So, how should I deal with this?
How to deal with the problem of languages? For instance, I'd like to study tweets written in English and those in Spanish, but separately.
Which programming languages do you suggest to do something like this? I've been trying with R packages like tm, twitteR.
Sure, I think the way sentiment is expressed will stay constant for a few months. Worst case, you relabel and retrain. Unsupervised learning has a shitty track record for industrial applications in my experience.
You'll need some emotion/adjective dictionary for sentiment stuff. There are some datasets out there but I forget where they are. I may have answered previous questions with better info.
Just do English tweets. It's fairly easy to build a language classifier, but you want to start small, so take it easy on yourself.
Python (NLTK) if you want to do it easily in a small amount of code. Java has good NLP stuff, but Python and its libraries are way more user friendly.
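As a rough illustration of the dictionary approach discussed above, here is a minimal lexicon-based scorer in plain Python. The two word sets are tiny, hypothetical stand-ins for a real sentiment lexicon, and the tokenizer is deliberately naive. It needs no labeled training set, but it also ignores negation, sarcasm, and topic-specific vocabulary.

```python
import re

# Hypothetical toy lexicon; a real one would contain thousands of entries.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}


def tokenize(tweet):
    """Lowercase the tweet and keep only alphabetic words."""
    return re.findall(r"[a-z']+", tweet.lower())


def lexicon_sentiment(tweet):
    """Label a tweet by counting positive words minus negative words."""
    tokens = tokenize(tweet)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for tweet in ["I love this hotel, great staff!",
              "Awful service, I hate it",
              "Checking in at noon"]:
    print(tweet, "->", lexicon_sentiment(tweet))
```

Labels produced this way could also serve as a noisy starting point for the relabel-and-retrain loop mentioned above.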
This site provides 3 ways to do sentiment analysis using R.
The twitteR package is now updated to work with the new Twitter API. I'd recommend you download the source version of the package to avoid getting duplicated tweets.
I'm working on a Spanish dictionary for opinion mining, and will publish it somewhere accessible.
Sentiment analysis will give only 3 results, as said above: positive, negative and neutral. I found a tutorial on Twitter sentiment analysis and it's quite easy.
I found it here -
It has only 3 dependencies to download and very little code. Just go through it and you will get the solution.

Recommendations for using graph theory in machine learning? [closed]

I have been learning a lot about using graphs for machine learning by watching Christopher Bishop's videos. I find the topic very interesting and watched a few others in the same category (machine learning/graphs), but was wondering if anyone had any recommendations for ways of learning more.
My problem is that, although the videos gave a great high-level understanding, I don't have many practical skills in it yet. I've read Bishop's book on machine learning/pattern recognition as well as Norvig's AI book, but neither seems to touch much on using graphs specifically. With the emergence of search engines and social networking, I would think machine learning on graphs would be popular.
If possible, can anyone suggest a resource to learn from? (I'm new to this field and development is a hobby for me, so I'm sorry in advance if there's a super obvious resource to learn from. I tried Google and university sites.)
Thanks in advance!
First, I would strongly recommend the book Social Network Analysis for Startups by Maksim Tsvetovat and Alexander Kouznetsov. A book like this is a godsend for programmers who need to quickly acquire a basic fluency in a specific discipline (in this case, graph theory) so that they can begin writing code to solve problems in this domain. Both authors are academically trained graph theoreticians but the intended audience of their book is programmers. Nearly all of the numerous examples presented in the book are in Python using the networkx library.
Second, for the projects you have in mind, two kinds of libraries are very helpful if not indispensable:
graph analysis: e.g., the excellent networkx (Python) or igraph (Python, R, et al.) are two that I can recommend highly; and
graph rendering: the excellent Graphviz, which can be used stand-alone from the command line, but more likely you will want to use it as a library; there are Graphviz bindings in all major languages (e.g., for Python there are at least three I know of, though pygraphviz is my preference; for R there is Rgraphviz, which is part of the Bioconductor package suite). Rgraphviz has excellent documentation (see in particular the vignette included with the package).
It is very easy to install and begin experimenting with these libraries, and in particular using them
to learn the essential graph theoretic lexicon and units of analysis (e.g., degree sequence distribution, node traversal, graph
to distinguish critical nodes in a graph (e.g., degree centrality, eigenvector centrality, assortativity); and
to identify prototype graph substructures (e.g., bipartite structure, triangles, cycles, cliques, clusters, communities, and cores).
The value of using a graph-analysis library to quickly understand these essential elements of graph theory is that for the most part there is a 1:1 mapping between the concepts I just mentioned and functions in the (networkx or igraph) library.
So, e.g., you can quickly generate two random graphs of equal size (node number), render and then view them, then easily calculate, for instance, the average degree or betweenness centrality for both and observe first-hand how changes in the value of those parameters affect the structure of a graph.
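The experiment described above can be sketched without any dependencies, to show what the library functions compute. This is an illustration only: in networkx the corresponding calls would be `gnp_random_graph` and `degree_centrality`, whereas the helper names, graph sizes, and probabilities below are assumptions chosen for the example.

```python
import random
from itertools import combinations


def random_graph(num_nodes, edge_probability, seed):
    """Erdos-Renyi style graph: each possible edge is kept with the given probability."""
    rng = random.Random(seed)
    adjacency = {node: set() for node in range(num_nodes)}
    for u, v in combinations(range(num_nodes), 2):
        if rng.random() < edge_probability:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def average_degree(adjacency):
    """Mean number of neighbours per node."""
    return sum(len(neighbours) for neighbours in adjacency.values()) / len(adjacency)


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is connected to."""
    n = len(adjacency)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adjacency.items()}


# Two graphs of equal size that differ only in edge probability.
sparse = random_graph(30, 0.1, seed=1)
dense = random_graph(30, 0.5, seed=1)

print("average degree:", average_degree(sparse), "vs", average_degree(dense))
print("most central node in dense graph:",
      max(degree_centrality(dense).items(), key=lambda item: item[1]))
```

Using the same seed for both graphs means every edge in the sparse graph also appears in the dense one, so the comparison isolates the effect of the edge probability.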
W/r/t the combination of ML and graph theoretic techniques, here's my limited personal experience. I use ML in my day-to-day work and graph theory less often, but rarely together. This is just an empirical observation limited to my personal experience: I haven't found a problem in which it has seemed natural to combine techniques from these two domains. Most often graph theoretic analysis is useful in ML's blind spot, namely problems that lack a substantial amount of labeled training data; supervised ML techniques depend heavily on such data.
One class of problems that illustrates this point is online fraud detection/prediction. It's almost never possible to gather data (e.g., sets of online transactions attributed to a particular user) that you can with reasonable certainty separate and label as "fraudulent account." If the fraudsters were particularly clever and effective, you will mislabel their accounts as "legitimate"; and for accounts on which fraud was suspected, the first-level diagnostics (e.g., additional ID verification or an increased waiting period to cash out) are quite often enough to make them cease further activity (activity which would have allowed a definite classification). Finally, even if you somehow manage to gather a reasonably noise-free data set for training your ML algorithm, it will certainly be seriously unbalanced (i.e., many more "legitimate" than "fraud" data points); this problem can be managed with statistical pre-processing (resampling) and by algorithm tuning (weighting), but it's still a problem that will likely degrade the quality of your results.
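To illustrate the weighting idea mentioned above, here is a small sketch of inverse-frequency class weights, a common heuristic in which each class receives the weight n_samples / (n_classes * class_count). The label counts are invented for the example and do not come from any real fraud data set.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency so rare classes count more."""
    counts = Counter(labels)
    total = len(labels)
    return {label: total / (len(counts) * n) for label, n in counts.items()}


# Hypothetical, heavily unbalanced label set: 95 legitimate accounts, 5 fraudulent.
labels = ["legitimate"] * 95 + ["fraud"] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each class contributes the same total weight to a loss function, which compensates for the imbalance but does nothing about the label noise described above.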
So while I have never been able to successfully use ML techniques for these types of problems, in at least two instances I have used graph theory with some success; in the most recent instance, by applying a model adapted from the project by a group at Carnegie Mellon initially directed at detection of online auction fraud on eBay.
MacArthur Genius Grant recipient and Stanford Professor Daphne Koller co-authored a definitive textbook on Bayesian networks entitled Probabilistic Graphical Models, which contains a rigorous introduction to graph theory as applied to AI. It may not exactly match what you're looking for, but in its field it is very highly regarded.
You can attend free online classes at Stanford for machine learning and artificial intelligence:
The classes are not simply focused on graph theory, but include a broader introduction in the field and they will give you a good idea of how and when you should apply which algorithm. I understand that you've read the introductory books on AI and ML, but I think that the online classes will provide you with a lot of exercises that you can try.
Although this is not an exact match to what you are looking for, textgraphs is a workshop that focuses on the link between graph theory and natural language processing. Here is a link. I believe the workshop also generated this book.

Efficient way to practice graph theory algorithms

I just read about the breadth-first search algorithm in the Introduction to Algorithms book and I hand simulated the algorithm on paper. What I would like to do now is to implement it in code for extra practice.
I was thinking about implementing all the data structures from scratch (the adjacency list, the "color", "distance", and "parent" arrays) but then I remembered that there are currently graph libraries out there like the Boost graph library and some other graph APIs in Python.
I also tried looking for some BFS-related problems on UVa and Sphere Online Judge, but I can't tell which problems would require a BFS solution.
My question is: what would be the most painless way to practice these graph algorithms (not just BFS; it should also come in useful when I want to implement DFS, Dijkstra, Floyd-Warshall, etc.)? Sites with practice problems are welcome.
I personally think that the best way to understand those would be implementing the graph representation yourself from scratch.
On the one hand, that would show you actual implementation caveats from which you learn why or why not a particular algorithm might be interesting / good / efficient / whatever. On the other hand, I think that understanding graphs and their real life use, including its implications (recursion, performance/scalability, applications, alternatives, ...), is made easier through the bottom-up approach.
But maybe that's just me. The above is very personal taste.
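As a starting point for the from-scratch route, here is a minimal sketch of breadth-first search that keeps the "color", "distance", and "parent" bookkeeping mentioned in the question, with a plain dictionary as the adjacency list. The example graph is made up, and unreachable vertices keep a distance of None.

```python
from collections import deque

WHITE, GRAY, BLACK = "white", "gray", "black"  # undiscovered, in queue, finished


def bfs(adjacency, source):
    """Return (distance, parent) dictionaries for a breadth-first search from source."""
    color = {vertex: WHITE for vertex in adjacency}
    distance = {vertex: None for vertex in adjacency}
    parent = {vertex: None for vertex in adjacency}

    color[source] = GRAY
    distance[source] = 0
    queue = deque([source])

    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if color[v] == WHITE:          # first time we see v
                color[v] = GRAY
                distance[v] = distance[u] + 1
                parent[v] = u
                queue.append(v)
        color[u] = BLACK                   # all neighbours of u have been examined
    return distance, parent


# Hypothetical undirected graph; "e" is isolated and therefore unreachable from "s".
graph = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
    "e": [],
}
distance, parent = bfs(graph, "s")
print(distance)
print(parent)
```

Following the parent pointers back from any vertex reconstructs a shortest path to the source, which is an easy way to check the implementation against a hand simulation.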
I found your question interesting, so I googled a bit and found JGraphEd.
It does not cover all graph algorithms but it looks like a good tool for experimentation.
I agree with balpha. The best way to really learn and understand algorithms is to do the implementation. Just pick an algorithm and implement it. When you reach a point where you get stuck or are unsure, look at a number of existing examples. You will then be able to compare your own thinking with that of others from a position of understanding instead of simply accepting what is offered.
Once you have learned what you want to, the best way to solidify your understanding is to try to teach it or describe it to somebody else. You might have some people willing to listen to you, or at the very least you could write a blog entry for people new to the algorithm you have just studied.
But if you are looking for "painless", then maybe you should stay clear of algorithms altogether ;-)
This site could help you
There you have a description of every problem in the ACM problem set. You can see the category of each problem and a hint for solving it. Just browse for graph-related problems. A good piece of advice is to use those hints only if you have tried to solve the problem yourself and failed.
Visualization of some shortest path algorithms on real data, where the explored area is displayed in yellow:
(bidirectional) Dijkstra
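For practice purposes, the plain one-directional form of Dijkstra's algorithm is a reasonable first step before a bidirectional variant. The sketch below uses a binary heap and assumes non-negative edge weights. The small road network and its weights are invented for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest known distance from source to every reachable node."""
    distances = {source: 0}
    heap = [(0, source)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if dist_u > distances.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, weight in graph.get(u, []):
            candidate = dist_u + weight
            if candidate < distances.get(v, float("inf")):
                distances[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return distances


# Hypothetical directed road network: node -> list of (neighbour, edge weight).
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
print(dijkstra(roads, "A"))
```

A bidirectional version runs two such searches, one from the source and one from the target on the reversed graph, and stops once they meet, which typically shrinks the explored area.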
