Cluster Labels Storage in Carrot2

Where are the cluster labels stored in Carrot2? After clustering, the FoamTree and Circles results are generated, but where are these labels stored? How can I retrieve them with some code?

Carrot2 is stateless; it does not store the labels anywhere. It simply generates clusters in response to the documents you feed in.
The simplest way to integrate Carrot2 with FoamTree, Circles or any other JavaScript visualization is to set up the Carrot2 Document Clustering Server (DCS), make requests to the DCS directly from your browser using Ajax, and, once the clusters are received, show them in the visualization.
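If you prefer to fetch the labels in code rather than in the browser, here is a minimal Python sketch of the same idea, assuming a locally running DCS with the 4.x REST API; the endpoint path, port, algorithm name, and response fields are assumptions to verify against your DCS version.

import requests

# Documents to cluster; in a real setup these would come from your search backend.
documents = [
    {"title": "Data mining", "snippet": "Data mining is the analysis of large data sets."},
    {"title": "Text clustering", "snippet": "Clustering groups similar documents together."},
]

# Assumes DCS 4.x running locally on its default port.
resp = requests.post(
    "http://localhost:8080/service/cluster",
    json={"language": "English", "algorithm": "Lingo", "documents": documents},
)
resp.raise_for_status()

# Each cluster carries its labels plus the indices of the documents it groups.
for cluster in resp.json()["clusters"]:
    print(cluster["labels"], "->", cluster["documents"])

This JSON response is the same data that FoamTree or Circles would consume on the browser side.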

Related

Is there a way to import data (csv data) to the winlogbeat kibana dashboard?

I have just started learning about Elasticsearch and Kibana. I created a Winlogbeat dashboard where the logs are working fine. I want to import additional data (CSV data) which I created using Python. I tried uploading the CSV file, but I am only allowed to create a separate index, not merge it with the Winlogbeat data. Does anyone know how to do this?
Thanks in advance
In many use cases, you don't actually need to combine the data into a single index. Here are a few ways you can show combined data, in approximate order of complexity:
Straightforward methods, using separate indices:
Use multiple charts on a dashboard
Use multiple indices in a single chart
More complex methods that combine data into a single index:
Pivot indices using Data Transforms
Combine at ingest-time
Roll your own
Use multiple charts on a dashboard
This is the simplest way: ingest your data into separate indices, make separate visualizations for them, then add those visualizations to one dashboard. If the events are time-based, this simple approach could be all you need.
Use multiple indices in a single chart
Lens, TSVB, and Timelion can all use multiple data sources. (Vega can too, but that's playing on hard mode.)
Here's an official Elastic video about how to do it in Lens: youtube
Create pivot indices using Data Transforms
You can use Elasticsearch's Data Transforms functionality to fetch, combine and aggregate your disparate data sources into a combined data structure which is then available for querying with Kibana. The official tutorial on Transforming the eCommerce sample data is a good place to learn more.
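For a rough idea of what this looks like, the sketch below creates and starts a pivot transform through the REST API from Python; the index names, grouping field, and aggregation are hypothetical placeholders, not taken from your data.

import requests

# Hypothetical transform: pivot two source indices into one combined index,
# grouped by host name. Adapt index names, fields, and aggregations to your mappings.
transform = {
    "source": {"index": ["winlogbeat-*", "my-csv-index"]},
    "dest": {"index": "combined-pivot"},
    "pivot": {
        "group_by": {"host": {"terms": {"field": "host.name"}}},
        "aggregations": {"event_count": {"value_count": {"field": "@timestamp"}}},
    },
}

r = requests.put("http://localhost:9200/_transform/my-combined-transform", json=transform)
r.raise_for_status()

# Transforms are created in a stopped state; start one to populate the dest index.
r = requests.post("http://localhost:9200/_transform/my-combined-transform/_start")
r.raise_for_status()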
Combine at ingest-time
If you have (or can add) Logstash in the mix, you have several options for combining datasets during the filter phase of your pipelines:
Using a file-based lookup table and the translate filter plugin
By waiting for related documents to come in then outputting a combined document to Elasticsearch with the aggregate filter plugin
Using external lookups with filter plugins like elasticsearch or http
Executing arbitrary Ruby code using the ruby filter plugin
Roll your own
If you're generating the CSV file with a Python program, you might want to think about incorporating the Python Elasticsearch DSL library to run queries on the winlogbeat data, then ingest the data in its combined state (whether via a CSV or other means).
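As a rough sketch of that idea, assuming 8.x-style elasticsearch and elasticsearch-dsl packages and hypothetical field names (a host column in the CSV, host.name in Winlogbeat):

import csv

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

es = Elasticsearch("http://localhost:9200")

with open("my_data.csv", newline="") as f:
    for row in csv.DictReader(f):
        # Look up Winlogbeat events for the same host (field names are assumptions).
        search = Search(using=es, index="winlogbeat-*").filter(
            "term", **{"host.name": row["host"]}
        )
        related = [hit.to_dict() for hit in search[:10]]

        # Index one combined document per CSV row.
        es.index(index="combined-data", document={**row, "related_events": related})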
Basically, Winlogbeat is a data shipper for Elasticsearch that ships Windows-specific data to an index named winlogbeat with a specific schema and document structure.
You can't merge another document with a different schema into the winlogbeat index.
If your goal is to correlate different data points, use the Time Series Visual Builder (TSVB) to overlay the two datasets in one visualization.

ElasticSearch - Kibana raw data chart

I have set up an Elasticsearch server and I am feeding data into it with Logstash. My data consists of various numeric fields which are logged through time. I want to create a chart in Kibana that will display the raw data as points, and the average of my data as a line chart on top of that. Pretty much something like the picture below... (This is from Kibana's homepage.)
I have a real problem displaying the raw values of my data and I cannot find a way to do it in the documentation. Has anybody else encountered the same problem? Is this doable in Kibana?

Is there any possibility to extract Google Analytics data and post that to Elastic Search?

I have been working on ways to import Google Analytics raw data without having to use a premium account. So far, this is the nearest link to what I want to do:
How to extract data from Google Analytics and build a data warehouse (webhouse) from it?
I want to load that data into Elasticsearch and display it using Kibana. What is the best ETL approach for this? Has anyone tried to display GA data using the ELK stack?
You should do it in two steps.
First, get the data. A very useful site is https://developers.google.com/webmaster-tools/v3/how-tos/search_analytics, but you first have to have a Google Webmaster Tools account and create OAuth credentials on https://console.developers.google.com/apis.
Then, once you have your data, find a way to import it into Elasticsearch. I'm still looking for the best way to do so; one option is to transform the result table into CSV and then use https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html.
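To make the two steps concrete, here is a rough Python sketch using the Webmaster Tools (Search Console) v3 API and the elasticsearch bulk helper; the site URL, date range, dimensions, and index name are placeholders, and the OAuth credential setup is omitted.

from elasticsearch import Elasticsearch, helpers
from googleapiclient.discovery import build

creds = ...  # OAuth2 credentials created via https://console.developers.google.com/apis

# Step 1: pull Search Analytics rows (placeholder site URL, dates, and dimensions).
webmasters = build("webmasters", "v3", credentials=creds)
response = webmasters.searchanalytics().query(
    siteUrl="https://example.com/",
    body={
        "startDate": "2023-01-01",
        "endDate": "2023-01-31",
        "dimensions": ["date", "query"],
    },
).execute()

# Step 2: bulk-index the rows into Elasticsearch instead of going through CSV.
actions = (
    {
        "_index": "search-analytics",
        "_source": {
            "date": row["keys"][0],
            "query": row["keys"][1],
            "clicks": row["clicks"],
            "impressions": row["impressions"],
        },
    }
    for row in response.get("rows", [])
)
helpers.bulk(Elasticsearch("http://localhost:9200"), actions)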
Have a look at this:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http_poller.html
You can use this to poll an endpoint, in this case GA, and load the response data into Elasticsearch. You may want to filter the response with the split and/or mutate filter plugins as well.
I have done this same setup.
Extracted data from Google Analytics with 7 dimensions and 6 metrics, of which 2 dimensions formed the primary key (timestamp and ID). This was done using R.
Did some transformations on the data using Linux awk and sed commands.
Loaded the data into Apache Hive with row columnar formatting, creating a total of 9 tables.
Joined all 9 tables in Hive using Hive join queries on the 2 primary keys.
Used the elasticsearch-hadoop connector to load the final resulting table into Elasticsearch. Had to do a few data transformations to match Hive and Elasticsearch data types.
Used Kibana to visualize the data in Elasticsearch.
Now I am planning to avoid all the manual steps and somehow automate all the steps above.

How to create visualisation on the fly using a script in Kibana 4

I have a requirement where I need to create different visualizations for different users, which will differ very slightly in the query parameters. So, I am considering creating a script which will enable me to do this. Has anyone done this on Kibana 4? Some pointers on how to create a visualization from a query would be of great help.
I would also like to create Dashboards on the fly but that can wait till I get this one sorted out.
If you want to go ahead with Java plugin (as mentioned in comments), here are the steps:
Create different visualizations with different X-axis parameters. Visualizations are basically JSON strings, so you can write Java code which changes the value of the X-axis aggregation based on the mapping that you have. Each chart will then have a different id.
While you are creating a custom dashboard based on the user, check the mapping between the user and the visualization, and use the following command to add the visualization:
// visId and visualizationJson are your own values; Kibana 4 stores saved objects in its ".kibana" index.
client.prepareIndex(".kibana", "visualization", visId).setSource(visualizationJson).execute().actionGet();
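If a script is easier than a Java plugin, the same trick can be sketched in Python; the .kibana document fields below (title, visState, searchSourceJSON) follow Kibana 4's saved-visualization format, but treat the exact structure, index pattern, and ids as assumptions to verify against your installation.

import json

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def create_visualization(vis_id, title, field):
    # A simple terms-aggregation bar chart; only the X-axis field varies per user.
    vis_state = {
        "title": title,
        "type": "histogram",
        "aggs": [
            {"id": "1", "type": "count", "schema": "metric", "params": {}},
            {"id": "2", "type": "terms", "schema": "segment",
             "params": {"field": field, "size": 10}},
        ],
    }
    # Kibana 4 reads saved visualizations from the .kibana index.
    es.index(
        index=".kibana",
        doc_type="visualization",  # mapping types existed in Kibana 4-era Elasticsearch
        id=vis_id,
        body={
            "title": title,
            "visState": json.dumps(vis_state),
            "kibanaSavedObjectMeta": {
                "searchSourceJSON": json.dumps(
                    {"index": "my-index-*", "query": {"query_string": {"query": "*"}}}
                )
            },
        },
    )

# One visualization per user, differing only in the aggregated field.
create_visualization("vis-user-a", "Events by host (user A)", "host.name")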

Possible to store images in Elasticsearch?

Is it possible to store images in Elasticsearch clusters? If yes, is there a resource about the workflow? I checked the following link: https://github.com/kzwang/elasticsearch-image
Since we have to handle large image files (over 500GB), we are planning to use HDFS.
Storing whole images in Elasticsearch will not be very beneficial, because if the image is scaled/cropped and then used as a query, it will give incorrect results. What you need depends on why you want to index these images.
In my case, I need to find whether an image, after some scaling or cropping, has a close match in my database. I am extracting local descriptors (SIFT/SURF) from images and using them to build an Elasticsearch index. This reduces the index size since, instead of the whole image, only a few features are stored. I will be storing all these images on S3 for now, and Elasticsearch will store the ids of these images along with the features extracted from them.
Regarding elasticsearch-image: This plugin has not been updated in a while and the most recent responses to issues were from last year. This plugin integrates LIRE with Elasticsearch, where LIRE provides the functionality of a multiple image fingerprints extractor.
Possible solutions:
Integrate the OpenCV library (to compute feature vectors for an image) with Elasticsearch and build your own index using these image features instead of storing whole images (a rough sketch follows this list). For the product architecture, you can get some hints here.
Use an older version of Elasticsearch with a compatible version of elasticsearch-image.
Upgrade elasticsearch-image to work with the latest version of Elasticsearch.
You can also use Solr along with the LireSolr plugin to integrate with the LIRE library.
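To illustrate the first option, here is a rough sketch that extracts local descriptors with OpenCV and stores only the features plus an S3 pointer in Elasticsearch; ORB is used here in place of SIFT/SURF for licensing convenience, and the index and field names are made up.

import cv2
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
orb = cv2.ORB_create(nfeatures=100)  # stand-in for SIFT/SURF

def index_image_features(image_path, s3_url):
    # Compute local descriptors; the image itself stays on S3.
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, descriptors = orb.detectAndCompute(img, None)
    if descriptors is None:
        return  # no features detected
    es.index(
        index="image-features",  # made-up index name
        document={
            "s3_url": s3_url,                     # pointer to the full image
            "descriptors": descriptors.tolist(),  # only features go into ES
        },
    )

index_image_features("cat.jpg", "s3://my-bucket/cat.jpg")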
UPDATE: This is an update on the task of image retrieval, where you need to search for close image matches. I would recommend you go through this link: https://paperswithcode.com/task/image-retrieval. The best solution, Deep Local Features, is already integrated in TensorFlow.
