Is there a way to find the outputs of a given dataset? - feature-extraction

I have a dataset about which no background knowledge is available, so I don't know which features are inputs and which are outputs. Is there a way to find out which ones are the outputs?

Related

Is there a way to input numeric data into a CNN and get a regression output?

I am currently working on a project where I predict future gold prices using a CNN. I have been searching for a way to use numbers (historical prices) as input and to get the output as a regression rather than a classification. I have searched everywhere for the past few months but found it hard to understand. If someone could assist me with a way to do that, I would be grateful.
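One common way to do this (my suggestion, not something from the thread) is a 1D convolutional network whose last layer is a single linear neuron trained with a mean-squared-error loss; the linear output and MSE loss are what make it regression rather than classification. A minimal sketch, assuming TensorFlow/Keras; the window size, layer sizes, and the synthetic price series are illustrative assumptions:

# Sketch of CNN regression on a numeric price series, assuming
# TensorFlow/Keras. Window size, filter counts, and the synthetic
# data are illustrative, not values from the question.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

window = 30  # predict the next price from the previous 30

# Synthetic stand-in for historical gold prices; replace with real data.
prices = np.cumsum(np.random.randn(1000)) + 100.0

# Shape the series into (samples, window, 1) inputs and scalar targets.
X = np.stack([prices[i:i + window] for i in range(len(prices) - window)])[..., None]
y = prices[window:]

model = keras.Sequential([
    layers.Conv1D(32, kernel_size=3, activation="relu", input_shape=(window, 1)),
    layers.Conv1D(32, kernel_size=3, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(1),  # single linear unit -> a real-valued prediction
])
model.compile(optimizer="adam", loss="mse")  # MSE loss -> regression
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

next_price = model.predict(X[-1:])  # predicted next value

The essential parts are the final Dense(1) with no activation and the mse loss; everything else (window length, filter counts, pooling) is a tunable choice.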

LabVIEW - How to accumulate data in an array?

I made a program aimed at simulating the intensity of light when many light bulbs are put together. I have the intensity data of one bulb in .xls files. I want the program to work as follows.
1. Open the .xls file and get the data.
2. Put the data into different positions. I put one data set (one bulb) in each Excel sheet, to simulate putting the bulb in different places.
3. Sum the data in the same cell across the different sheets.
My LabVIEW front panel and block diagram are attached.
My problem is that this program runs too slowly. How should I improve it? My idea is to make one big array and accumulate the data in that array, but I do not know how to do it. The Insert Into Array and Replace Array Subset functions are not suitable for my purposes.
The most likely reason for the slow performance is that you perform a lot of operations on the Excel file. You should instead read the data into memory and operate on it in your VI. At the end, if you need to, you can update the Excel file with the final results.
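LabVIEW itself is graphical, so the fix cannot be shown as text code, but the read-once, accumulate-in-memory, write-once pattern described above looks like this in Python (a sketch only, assuming pandas and illustrative file and sheet names; the same structure applies inside the VI):

# Accumulate-in-memory sketch, assuming pandas; "bulbs.xls" and the
# one-bulb-per-sheet layout are illustrative assumptions.
import pandas as pd

# Read every sheet once, up front (one file operation instead of many).
sheets = pd.read_excel("bulbs.xls", sheet_name=None, header=None)

# Accumulate all sheets into one big array by element-wise summation.
total = None
for frame in sheets.values():
    total = frame.values if total is None else total + frame.values

# Write the result back to disk once, at the very end.
pd.DataFrame(total).to_excel("total_intensity.xlsx", index=False, header=False)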
It would be difficult to tell you exactly how to do it. As you said, you're a beginner, and I think the best way would be to simply do some LabVIEW exercises and gain more experience in how to work with arrays :) I recommend taking a look at the examples (Help->Find Examples), reading some user guides from ni.com, or finding other "getting started" materials on the Internet.
Check these, you may find them useful:
https://zone.ni.com/reference/en-XX/help/371361R-01/lvhowto/lv_getting_started/
https://www.ni.com/getting-started/labview-basics/data-structures
https://www.ni.com/pl-pl/support/documentation/supplemental/08/labview-arrays-and-clusters-explained.html

How to distinguish input files in Hadoop

I'm just trying to understand how Hadoop distinguishes between multiple files in HDFS. I want to do sentiment analysis using Hadoop (just as a test). I have two files, positive.json and negative.json, and I'm trying to use Naive Bayes classification. So, when I train the model, I want to know which records are positive and which are negative. How do I do this? I haven't written any code to show; I'm stuck on the first part. Any suggestions? I have read tons of papers, and I think I have a basic grasp of the concept. I want to see if I can use this concept in Rhipe, or do you have any other better and easier solutions?
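One approach (an assumption on my part, since the question shows no code) is to have each map task label its records with the file they came from. With Hadoop Streaming, the framework exposes the current input file to the mapper through an environment variable (map_input_file in older Hadoop versions, mapreduce_map_input_file in newer ones), so a small Python mapper can tag every line:

#!/usr/bin/env python
# Streaming-mapper sketch: tag each line with the sentiment of its
# source file. Assumes Hadoop Streaming; the file-name test relies on
# the files being named positive.json and negative.json as above.
import os
import sys

path = os.environ.get("mapreduce_map_input_file",
                      os.environ.get("map_input_file", ""))
label = "positive" if "positive" in path else "negative"

for line in sys.stdin:
    # Emit label<TAB>record so the training step knows each class.
    sys.stdout.write("%s\t%s" % (label, line))

Launched through the Hadoop Streaming jar, this produces labeled records for the Naive Bayes training step; an even simpler alternative is to pre-label each JSON record before loading it into HDFS.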

Practical image processing books - for the example of the Canny filter

I know there are many, many books about image processing, but I need a recommendation for a particularly good one that gives practical hints on using the algorithms. I don't need background information about HOW an algorithm works, e.g. the Hough transform or the Canny filter, as I already know that from various books. What I need is good advice on how to use those filters efficiently, and in particular on how to set the thresholds etc.
Choosing those values currently gives me a huge headache. When I set them to fixed values, they work for one picture, and when the illumination changes slightly, they don't work anymore for various reasons. So I wonder how to derive them dynamically from image-specific values. I read on SO to, e.g., set the Canny thresholds to:
low = 0.666*mean(img)
high = 1.333*mean(img)
(http://www.kerrywong.com/2009/05/07/canny-edge-detection-auto-thresholding/)
but somehow I haven't had much success with it so far.
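For what it's worth, that mean-based rule translates directly into code; a minimal sketch, assuming OpenCV (cv2) and NumPy, with an illustrative input file name:

# Mean-based auto-thresholding for Canny, using the 0.666/1.333
# factors quoted above; assumes OpenCV and an illustrative file name.
import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
img = cv2.GaussianBlur(img, (5, 5), 0)  # smoothing first usually helps

m = np.mean(img)
low = int(max(0, 0.666 * m))
high = int(min(255, 1.333 * m))

edges = cv2.Canny(img, low, high)

A common variant replaces the mean with the median, which is less sensitive to very bright or very dark regions.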
I'm interested in good advice on books etc. in particular, but I included the specific example of how to determine thresholds for Canny to make this a valid SO question :-)
If you want a practical book, it will typically be targeted at a specific library.
Here you can find a list of most of the books about the OpenCV library.

Yahoo! LDA Implementation Questions

All,
I have been running Y!LDA (https://github.com/shravanmn/Yahoo_LDA) on a set of documents and the results look great (or at least what I would expect). Now I want to use the resulting topics to perform a reverse query against the corpus. Does anyone know whether the 3 human-readable text files that are generated after the learntopics executable runs are the final output of this library? If so, are those what I need to parse to perform my queries? I am stuck with a little shoulder shrugging at this point...
Thanks,
Adam
If LDA is working the way I think it is (I use a Java implementation, so details may vary), then what you get out are the three following things:
P(word,concept) -- The probability of getting a word given a concept. So, when LDA finishes figuring out what concepts exist within the corpus, this P(w,c) will tell you (in theory) which words map to which concepts.
A very naive method of determining concepts would be to load this file into a matrix, combine these probabilities across all possible concepts for a test document in some way (add, multiply, root-mean-square), and rank-order the concepts (see the sketch after this list).
Do note that the above method does not account for the various biases introduced by weakly represented or dominating topics in LDA. To accommodate those, you need more complicated algorithms (Gibbs sampling, for instance), but this will get you some results.
P(concept,document) -- If you are attempting to find the intrinsic concepts in the documents in the corpus, you would look here. You can use the documents as examples of documents that have a particular concept distribution, and compare your documents to the LDA corpus documents... There are uses for this, but it may not be as useful as the P(w,c).
Something else probably relating to the weights of words, documents, or concepts. This could be as simple as a set of concept examples with beta weights (for the concepts), or some other variables that are output from LDA. These may or may not be important depending on what you are doing. (If you are attempting to add a document to the LDA space, having the alpha or beta values -- very important.)
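Here is the naive ranking idea from the first point as a sketch, assuming P(w,c) has already been parsed out of the learntopics output into a word-to-probabilities mapping (the parsing itself depends on Y!LDA's exact file format, which I am not assuming here):

# Naive concept scoring: combine P(w,c) over a test document's words
# and rank the concepts. Assumes p_wc maps each word to a NumPy array
# of per-concept probabilities parsed from the Y!LDA output files.
import numpy as np

def concept_scores(doc_words, p_wc, n_concepts):
    # "Add" variant; multiply or root-mean-square are the other
    # combination methods mentioned above.
    scores = np.zeros(n_concepts)
    for w in doc_words:
        if w in p_wc:
            scores += p_wc[w]
    return scores

# Rank-order concepts for a test document, highest score first:
# ranked = np.argsort(concept_scores(words, p_wc, n_concepts))[::-1]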
To answer your 'reverse lookup' question, to determine the concepts of the test document, use P(w,c) for each word w in the test document.
To determine which document is the most like the test document, determine the above concepts, then compare them to the concepts for each document found in P(c,d) (using each concept as a dimension in vector-space and then determining a cosine between the two documents tends to work alright).
To determine the similarity between two documents, same thing as above, just determine the cosine between the two concept-vectors.
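The cosine comparison mentioned above is a one-liner in NumPy (the two concept vectors are assumed to come from the scoring step or from P(c,d)):

# Cosine similarity between two concept vectors, as described above.
import numpy as np

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0

# similarity = cosine(test_doc_concepts, corpus_doc_concepts)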
Hope that helps.
