Watson Discovery Service - watson-discovery

We are trying to design a knowledge search box using Watson Discovery Service. We plan to use the Data Crawler to read real-time data from our knowledge repository. We are looking for information on how the responses from the Discovery service can be ranked.

At this point, you can't affect the ranking of Discovery service results. That feature is on the roadmap, but AFAIK, there's been no announcement date.

Discovery ranks documents by their relevance to your query. You can use filters on the metadata you create to narrow the results to the most relevant documents.
In the latest version of the API, relevancy training is an option. With this training you can improve the search results by giving the system example queries together with documents rated for relevance. This can noticeably improve the ranking of the results.
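As a rough illustration of the filter idea, here is a minimal sketch that only builds the query parameters and makes no network call. The parameter names (natural_language_query, filter, count) follow the Discovery v1 query API as I understand it, and the metadata field metadata.department is purely hypothetical, so check both against your own collection and API version.

```python
from typing import Any, Dict, Optional


def build_discovery_query(
    text: str,
    metadata_filter: Optional[str] = None,
    count: int = 10,
) -> Dict[str, Any]:
    """Build query parameters for a Discovery search (sketch only)."""
    params: Dict[str, Any] = {
        # Free-text question typed into the search box.
        "natural_language_query": text,
        # Maximum number of documents to return.
        "count": count,
    }
    if metadata_filter:
        # The filter narrows the candidate documents before ranking.
        # The field name used by the caller is an assumption.
        params["filter"] = metadata_filter
    return params


params = build_discovery_query(
    "how do I reset my password",
    metadata_filter='metadata.department::"support"',  # hypothetical field
)
```

The resulting dictionary can be passed to whichever client or HTTP library you use to call the query endpoint.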

Related

elasticsearch architecture/development query - ADFS/Security Filtering/SearchUI

I have a few questions about Elasticsearch architecture and the associated services and products that are not clear to me.
The idea is to set up an Elasticsearch instance for searching through file shares, Exchange mailboxes, SharePoint sites and even Teams conversations if possible.
How would I set up the Elasticsearch instance to support the following requirements:
Security filtering of results from these sources per user
Develop a simple and clean web search page, like SearchUI from Elastic themselves
Active Directory or ADFS authentication
Use Node.js on a separate server to proxy requests to Elasticsearch, since relying on Elastic's own user management would give users access to all search results
I can find tutorials and blogs on some of these items, but no comprehensive description of how the architecture would actually work, specifically with SearchUI and the proxying of data to ES.
Please have a look at Workplace Search, a new product released by Elastic and built on the same Elasticsearch framework:
https://www.elastic.co/workplace-search
It closely matches your requirements.
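If you do go the proxy route described in the question, the security filtering step can be sketched as below. This is only an illustration under assumptions: it is written in Python rather than Node.js, the field names title, body and allowed_groups are hypothetical, and it presumes each indexed document stores the AD groups allowed to read it. The proxy would resolve the user's groups after AD/ADFS authentication and never trust a filter sent by the browser.

```python
from typing import Any, Dict, List


def secure_query(user_query: str, user_groups: List[str]) -> Dict[str, Any]:
    """Wrap a user's search text in a bool query restricted to their groups."""
    return {
        "query": {
            "bool": {
                # Scored part: what the user actually typed.
                "must": [
                    {
                        "multi_match": {
                            "query": user_query,
                            "fields": ["title", "body"],  # hypothetical fields
                        }
                    }
                ],
                # Non-scored part: only documents readable by one of the
                # user's groups. "allowed_groups" is a hypothetical field.
                "filter": [{"terms": {"allowed_groups": user_groups}}],
            }
        }
    }


body = secure_query("quarterly report", ["finance", "all-staff"])
```

The proxy would send this body to the index's _search endpoint using its own service credentials, so end users never talk to Elasticsearch directly.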

How to integrate Firestore health check and dashboard metrics with our internal company systems

Context: this is my first use of Firestore. I want to use it to push notification status to our mobile application. I can see that there is a Google Firestore dashboard under the Analytics umbrella. In our company we mainly use three tools for monitoring our applications: Zabbix, Dynatrace and an internal solution based on Elasticsearch. I need to integrate our internal monitoring systems with the metrics produced by our first Firestore project.
What I am looking for, based on my own assumptions:
1) Maybe there are some GET endpoints that I can connect to and poll for information, say, every minute
2) Maybe, following the Realtime Database idea of pushing events across a long-lived connection, I could write a Spring Boot application that imports the Firebase SDK and connects each day to some specific Firestore endpoint, which would push any events of interest (e.g. a delay based on custom logic, or a dead service)
3) Maybe some plugin that I can connect straight to a Kafka instance hosted in our internal data center
4) Some plugin to connect from Firestore/Firebase to third-party tools (e.g. Zabbix, Dynatrace or Elasticsearch)
5) Some dependency I could import into Google Cloud Functions, triggered by the Firestore health check engine, in order to call an internal endpoint and post data
Perhaps there is already an approach commonly used when you have to connect Firestore to an internal monitoring system. I would greatly appreciate it if you could point me to it, so that I can narrow my searches, because I am not finding anything useful.
Please note that comparing monitoring approaches is not part of this question. It is a settled fact that our company uses internal dashboards and custom alert triggers. I mentioned the names above only to clarify what I mean by internal monitoring tools. The focus of this question is HOW TO IMPORT/INTEGRATE/OBSERVE/CONSUME Firestore monitoring data. Our internal stack is beyond the scope of this question.
Here is the official documentation for Cloud Monitoring, which you can use to collect metrics, events, and metadata from Google Cloud Platform products and to create dashboards, charts, and alerts.
Please let me know if you have further questions.
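To make the polling idea (option 1 in the question) more concrete, here is a minimal sketch that builds a request URL for the Cloud Monitoring timeSeries.list REST method. It only constructs the URL and makes no call. The project ID is a placeholder, the metric type firestore.googleapis.com/document/read_count is used as an example and should be confirmed in the Cloud Monitoring metrics list, and a real request also needs an OAuth 2.0 bearer token, which is not shown.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

MONITORING_BASE = "https://monitoring.googleapis.com/v3"


def firestore_timeseries_url(
    project_id: str,
    metric_type: str,
    end: datetime,
    window_minutes: int = 1,
) -> str:
    """Build a timeSeries.list URL covering the last `window_minutes`."""
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(minutes=window_minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamps in UTC
    params = {
        # Select a single metric; the metric type is supplied by the caller.
        "filter": f'metric.type = "{metric_type}"',
        "interval.startTime": start_utc.strftime(fmt),
        "interval.endTime": end_utc.strftime(fmt),
    }
    return f"{MONITORING_BASE}/projects/{project_id}/timeSeries?{urlencode(params)}"


url = firestore_timeseries_url(
    "my-project",  # placeholder project ID
    "firestore.googleapis.com/document/read_count",
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

A small scheduled job could fetch this URL every minute and forward the returned points to Zabbix, Dynatrace or Elasticsearch.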

OEM for using Elastic stack basic subscription

I am using the Basic subscription of Elastic and am planning to deploy it in customer tools under the Basic license. Do I need to become an OEM partner with Elastic if I deploy or distribute Elastic components under the Basic license?
Also what is the difference between the OpenSource and Basic subscriptions?
The Basic subscription is based on the Elastic distribution, which contains features that are not open source. So if you want to distribute features covered by the Basic subscription, you need to look into the OEM partnership.

Visitor / User profiling based on clickstream data?

We built a Rails 4 site and use ES for our travel/accommodation search engine. We created a separate ES index for clickstream data, and we store data for both anonymous visitors (session_id) and logged-in users (user_id). We currently use the stored data to show viewed and favorite accommodations on the site.
Now I want to use click analysis to assign each visitor (anonymous or logged in) to a specified cluster. A cluster could be "Budget", "Couple", "Family", etc.
I want to "feed" these clusters with the user/session ID profiles so I can use them to personalize our channels, such as the site and email.
Can someone guide me? How can we create "rules" so that we can assign profiles to a cluster?
Thanks..remco
I suggest you talk to ActionML. We have several pieces of software that could apply, like our Page Variant Recommender, which learns from user responses which variant of a page or email works better for a given user profile or segment. We have the state-of-the-art Universal Recommender, which can ingest user clickstream info to make recommendations. All this is built on the PredictionIO Machine Learning Framework, which in turn supports a large number of algorithm plug-ins that can be applied to classification (supervised categorization) or clustering (unsupervised categorization) of users and many other use cases.
ActionML does contract consulting and tries to answer questions about their open source Apache 2 licensed software on the PredictionIO Google Group.
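Before bringing in a machine learning framework, the hand-written "rules" the question asks about can be sketched in a few lines. This is not how the ActionML software works; it is a simple majority-vote heuristic, and the event fields (price_per_night, guests, children), the thresholds and the segment names are all hypothetical.

```python
from collections import Counter
from typing import Any, Callable, Dict, Iterable, Optional

Click = Dict[str, Any]  # one clickstream event; field names are assumptions

# Each rule decides whether a single click "votes" for a segment.
RULES: Dict[str, Callable[[Click], bool]] = {
    "budget": lambda c: c.get("price_per_night", float("inf")) <= 60,
    "couple": lambda c: c.get("guests") == 2,
    "family": lambda c: c.get("guests", 0) >= 3 and c.get("children", 0) > 0,
}


def assign_segment(clicks: Iterable[Click], min_share: float = 0.5) -> Optional[str]:
    """Return the segment matched by the largest share of clicks, if any."""
    clicks = list(clicks)
    if not clicks:
        return None
    votes: Counter = Counter()
    for click in clicks:
        for segment, rule in RULES.items():
            if rule(click):
                votes[segment] += 1
    if not votes:
        return None
    segment, hits = votes.most_common(1)[0]
    # Only assign a segment when enough of the clicks agree.
    return segment if hits / len(clicks) >= min_share else None


family_clicks = [{"price_per_night": 120, "guests": 4, "children": 2}] * 3
segment = assign_segment(family_clicks)
```

The same function works for a session_id or a user_id, since it only looks at the list of clicks stored for that profile.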

How to evaluate hosted full text search solutions?

What are the options when it comes to SaaS/hosted full text search? How should I evaluate the different options available?
I'm looking for something that uses Lucene, Solr, or Sphinx on the backend, and provides a REST API for submitting documents to index and running searches.
I could build my own EC2 AMI, but I'd have to configure EBS and other stuff, monitor it, etc.
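For reference when comparing providers, running a search over HTTP against a hosted Solr instance might look roughly like the sketch below. It only builds the URL for Solr's standard select handler. The host name and core name are placeholders, and any authentication scheme depends on the provider.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, core: str, query: str, rows: int = 10) -> str:
    """Build a search URL for Solr's /select handler."""
    params = {
        "q": query,    # Lucene query syntax, e.g. title:hotel
        "rows": rows,  # number of results to return
        "wt": "json",  # ask for a JSON response
    }
    return f"{base_url.rstrip('/')}/solr/{core}/select?{urlencode(params)}"


search_url = solr_select_url(
    "https://search.example.com",  # placeholder host
    "docs",                        # placeholder core name
    "title:hotel",
    rows=5,
)
```

When evaluating a hosted service, it helps to check whether it exposes this interface unchanged or wraps it in its own API.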
Websolr provides a cloud-based Solr with a control panel. It's in private beta as of this writing, but you can get the service through Heroku.
Another hosted Solr service is PowCloud, also in private beta, which seems to offer strong WordPress integration.
SolrHQ: another beta service providing a hosted Solr solution, with Joomla and WordPress integrations.
Acquia Search offers Solr integration for Drupal sites.
If you decide to build your own EC2 instance, the SolrOnAmazonEC2 wiki page might be useful. Or you could just get LucidWorks Solr for EC2, which is probably the easiest and fastest way to get Solr on EC2.
Engine Yard provides a cloud-based Sphinx service.
IndexTank is a hosted real-time full text search solution. It's pretty simple to set up (you can get an index running in a couple of minutes) and it's very powerful (Reddit runs on IndexTank). It provides Java, Python, Ruby and PHP clients as well as a REST API specification. There's an awesome support service (including live chat). You should give it a try.
Another option, particularly for UK people, is http://www.netaphorsearch.com/. I should point out I own Netaphor Ltd. We support the Solr REST API but also have a PHP connector so that you can get up and running very quickly.
Have a look at Artirix, a UK company that also operates in the US: http://www.artirix.com. I know they power some sites such as Globrix.com in the UK based on Solr, and they have a bunch of other products for crawling and data processing.
My five cents
http://indexisto.com/
It offers free hosted Elasticsearch if you are willing to accept advertisements in your search results. In any case, you can start with the free plan and switch to a paid, ad-free account later.
It's also not just hosted Elasticsearch, but a ready-to-use Ajax search box (which is really impressive) that you can embed in your site (adapted for mobile and tablet), plus some useful features like statistics and image resizing. There are several options for filling the index with documents: a crawler, an API and a DB connector.
Another option for lower-volume websites is Midwestern Mac's hosted Solr search (I am the owner of Midwestern Mac, LLC, just fyi).
Although it's not too hard (if you can use a command line respectably well) to provision your own server on a VPS somewhere...

Resources