Visitor / User profiling based on clickstream data? - hadoop

We built a Rails 4 site and use ES for our travel/accommodation search engine. We created a separate ES index for clickstream data, where we store data for anonymous visitors (keyed by session_id) and logged-in users (keyed by user_id). We currently use the stored data to show viewed and favorited accommodations on the site.
Now I want to use click analysis to assign each visitor (anonymous or logged in) to a predefined cluster. A cluster can be "Budget", "Couple", "Family", etc.
I want to "feed" these clusters with the user/session ID profiles so that I can use them to personalize our channels, such as the site and email.
Can someone guide me? How can we create "rules" so that we can assign profiles to a cluster?
Thanks, Remco

I suggest you talk to ActionML. We have several pieces of software that could apply, such as our Page Variant Recommender, which learns from user responses which variant of a page or email works better for a given user profile or segment. We also have the Universal Recommender, which can ingest user clickstream data to make recommendations. All of this is built on the PredictionIO Machine Learning Framework, which in turn supports a large number of algorithm plug-ins that can be applied to classification (supervised categorization) or clustering (unsupervised categorization) of users, and to many other use cases.
ActionML does contract consulting and tries to answer questions about their open source Apache 2 licensed software on the PredictionIO Google Group.
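For the "rules" part of the question, a hand-written rule set is a simple starting point before moving to learned classification or clustering. The following is only a minimal sketch in Python: the feature names (avg_price_viewed, most_common_guests, child_filter_clicks), the thresholds, and the event fields are all hypothetical and would need to be replaced by whatever the clickstream index actually records. It is not how ActionML or PredictionIO assign segments.

```python
from collections import Counter

# Hypothetical rules: each cluster name maps to a predicate over a profile's
# aggregated click features. Names and thresholds are illustrative assumptions.
RULES = {
    "budget": lambda f: f["avg_price_viewed"] < 60,
    "couple": lambda f: f["most_common_guests"] == 2,
    "family": lambda f: f["most_common_guests"] >= 3 or f["child_filter_clicks"] > 0,
}


def build_features(events):
    """Aggregate the raw click events (dicts) of one user_id or session_id."""
    prices = [e["price"] for e in events if "price" in e]
    guests = Counter(e["guests"] for e in events if "guests" in e)
    return {
        # No price data means "unknown", so it must never match the budget rule.
        "avg_price_viewed": sum(prices) / len(prices) if prices else float("inf"),
        "most_common_guests": guests.most_common(1)[0][0] if guests else 0,
        "child_filter_clicks": sum(1 for e in events if e.get("filter") == "children"),
    }


def assign_clusters(events):
    """Return the sorted names of every cluster whose rule matches the profile."""
    features = build_features(events)
    return sorted(name for name, rule in RULES.items() if rule(features))


# Example: a session that viewed two cheap listings for two guests.
events = [
    {"price": 45, "guests": 2},
    {"price": 55, "guests": 2},
]
print(assign_clusters(events))
```

A profile can match several clusters at once here, which is usually what you want for personalization; if each visitor must land in exactly one cluster, the rules would need a priority order.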

Related

Training Model for Each Individual in AzureML

I want to train an ANN model for each individual in Azure ML. For example, there is an application that needs to learn the behavior of each individual separately. How is this possible in Azure ML? Any suggestions?
As far as I know, I can create a model and train it with some data, but I don't know how to train it specifically for each user. I should mention that I am looking for a scalable approach that is applicable to a real situation (possibly 100,000 users).
I highly recommend the article "Create many Machine Learning models and web service endpoints from one experiment using PowerShell" on this topic. It uses Azure ML PowerShell to automate the creation of web services that have an identical structure but user-specific trained models. Your application would need to keep track of which web service corresponds to which user.

How to share/port a dashboard in QlikView over non-web content?

Background:
ETL on source data from Excel, Access, Sql Server '8, .txt files.
Data Cloud is created
Dashboard is in progress
I have searched online because I remember seeing a marketing demo video by QlikView saying that it's possible to share the dashboard with other users. Not just a snapshot image or PDF, but the real dashboard as a working file.
If client pcs receive a link to connect to the same data cloud via web - that's easy.
But what I want to know, is it possible to package and "port" the entire working file with underlying data to another person? (I am not asking for zipping!)
Depending on whether you've purchased a license for QlikView, there are several ways to approach this. The best-case scenario is that you and the client you want to send the .qvw to both have Named licenses: you can just send them the file and they'll be able to open it in their licensed Personal Edition. I imagine this is not the case, since you mentioned they are clients and not colleagues within your organization.
Note that if either you or the client does not own a license, you will not be able to share a working version of your dashboard with them.
The common implementation is to purchase the QlikView Server software and deploy a QlikView server in the cloud that handles incoming web requests and gives clients an access point to your dashboards (and underlying data). This solution requires you (or your company) to have purchased a set of licenses from Qlik as well as the Server software.
You can review Qlik's license structure here. You may also want to review their End User License Agreement to make sure their model works for what you are trying to do.

Reporting tools for web visits to show clients

I have a web portal in which I create content for different clients. For example, I have articles about dentist A, or car mechanic A, with each targeting different keywords on these pages.
Is there a reporting or web analytics tool that would allow me to create reports by keyword and/or certain webpages and allow clients to log in and see this information?
For example, a report that would show all the Dentist keywords, and another all keywords for Mechanic.
Currently I am using Google Analytics, but it is not very user friendly for this type of reporting. What I am doing now is logging in to my GA account, creating a report for the keywords, and emailing it to the client, which is not very efficient.
I am wondering if anyone knows about a pixel tracking reporting package out there for this need or any other way of getting the clients this information.
While searching online I found this link which gives you a list of the top free web analytics tools.
I chose to work with StatCounter, which is exactly what I was looking for. I can open a free account for a client and include their tracking script on individual pages.

Recommendation On Data Store For Website's User Activity Log

I am looking for some recommendations on a good data store for activity feeds. The goal is to have a Twitter/Facebook-type feed log consisting of the various activities users can perform throughout our website. The "wall" or "feed" would be updated via AJAX, showing what the users of the website are currently doing. It will be written to often, and the most recent entries will be displayed on the site.
(e.g. John Smith recommended Jane Smith's article 2 seconds ago)
We are currently storing the feeds in MySQL, but performance has been poor, and I'm concerned about hurting performance throughout the rest of the website if we are constantly hitting the database both to fetch the most recent user activity and to write the feeds.
Any recommendations would be greatly appreciated!
Make use of the best caching solutions like memcache to increase performance. Other than scaling, there are no performance-increasing possibilities for an activity feed.
I would vote for using http://redis.io/ or http://www.mongodb.org/ as an alternative to MySQL for short-term, almost live activity feeds across a site. And a cron job to dump history of activities into MySQL for record keeping.
A look at Tumblr's or Twitter's architectures can point you in the right direction as well.
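To illustrate the "short-term, almost live" feed idea, here is a minimal Python sketch of a capped recent-activity list. It is an in-memory stand-in only: in Redis the same pattern is commonly done with LPUSH followed by LTRIM on a list key, while the class and field names below (RecentActivityFeed, actor, verb, target) are hypothetical.

```python
from collections import deque
import time


class RecentActivityFeed:
    """In-memory stand-in for a capped list (e.g. Redis LPUSH + LTRIM)."""

    def __init__(self, max_items=100):
        # With maxlen set, appending on the left discards from the right,
        # so the oldest activities fall off automatically.
        self._items = deque(maxlen=max_items)

    def record(self, actor, verb, target, timestamp=None):
        """Store one activity, newest first."""
        self._items.appendleft({
            "actor": actor,
            "verb": verb,
            "target": target,
            "ts": timestamp if timestamp is not None else time.time(),
        })

    def latest(self, count=10):
        """Return up to `count` most recent activities, newest first."""
        return list(self._items)[:count]


feed = RecentActivityFeed(max_items=3)
for i in range(5):
    feed.record("user%d" % i, "recommended", "article", timestamp=i)

for entry in feed.latest(2):
    print(entry["actor"], entry["verb"], entry["target"])
```

Because the list is capped, reads for the "wall" stay cheap regardless of how much history exists; the full history would still need to be written somewhere durable, as the cron-job suggestion above describes.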
You should take a microservices approach and separate the datastore that stores the users' actions from the one that stores the actual data.
Pub/Sub is the right approach for handling a large stream of user actions.
Use Kafka or the Google Cloud Pub/Sub service for a scalable data pipeline. Both are designed to scale with the load.
Independently consume the messages from Kafka into a database such as MySQL or Google BigQuery for whatever analytics you need.
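The decoupling described here can be sketched in a few lines of Python. This is only an illustration of the producer/consumer shape, assuming an in-process queue.Queue in place of a real broker such as Kafka and a plain list in place of the analytics database; the names (consume, analytics_store, STOP) are hypothetical.

```python
import queue
import threading


def consume(events, sink, stop):
    """Drain events into the analytics sink until the stop marker arrives."""
    while True:
        event = events.get()
        if event is stop:
            break
        sink.append(event)  # stand-in for an insert into MySQL/BigQuery


events = queue.Queue()   # stand-in for a broker topic
STOP = object()          # sentinel telling the consumer to shut down
analytics_store = []     # stand-in for the analytics database

worker = threading.Thread(target=consume, args=(events, analytics_store, STOP))
worker.start()

# The website only publishes actions; it never waits on the analytics store.
for action in ("view", "favorite", "recommend"):
    events.put({"user": "john", "action": action})

events.put(STOP)
worker.join()
print(len(analytics_store))
```

The point of the design is that the request path only pays for the publish call, while the consumer can be slowed down, restarted, or scaled out without affecting the site.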

Reporting / BI Framework for social websites

I'm looking for ideas or open source frameworks for building individual analytics for user profiles and all the other profile types. Users will have their own custom metrics, businesses will have separate metrics, the admin section will have separate metrics, advertisers will have separate metrics, etc. So the goal is to have one framework in place for all analytics, customized per user, and to use it for the system's own analytics needs as well. It will include data analytics, as there will be user ratings/reviews to perform data mining on for businesses, and users will have basic reporting for their needs (friend demographics, filtering by different preferences, etc.).
The system is being developed in CakePHP.
Thanks.
Check out the myDBR reporting tool. With myDBR you can create reports and include them in your application. myDBR is also written in PHP and can be integrated into an existing web application via single sign-on authentication.
