My app doesn't collect any data as far as I can see. But it includes Google Ads, so I assume it collects some data, since it is Google.
I tried searching for this, but everything I could find was pretty vague.
I am validating the data from Eloqua Insights against the data I pulled using the Eloqua API, and there are some differences in the metrics. Are there any known issues when pulling data via the API versus exporting a .csv file from Eloqua Insights?
Absolutely. Besides any undocumented data discrepancies that might exist, Insights can aggregate, calculate, and expose relationships between data in Eloqua that are not accessible through an API export definition.
Think of the API as the raw data with the ability to pick and choose fields and apply a general filter to them, and Insights/OBIEE as a way to compute on that data, build relationships across tables of raw data, and present the result in a consumable manner to the end user. A user has little use for a 1 GB CSV of individual unsubscribes for the past year, but present that in a few dashboard graphs with running totals, averages, and time series, and it suddenly becomes actionable.
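For concreteness, here is roughly what that raw export path looks like against the Eloqua Bulk API 2.0, as a hedged Python sketch from memory rather than a verified recipe: the pod URL, credentials, field names, and filter are illustrative placeholders, and the payload shapes should be checked against Oracle's Bulk API documentation before use.

import time
import requests

BASE = "https://secure.p01.eloqua.com/api/bulk/2.0"  # pod host varies per instance
AUTH = ("SiteName\\user.name", "password")           # basic auth: company\user

# 1. Define the export: pick fields, apply one general filter. That is
#    essentially all the export definition lets you express.
export = requests.post(f"{BASE}/contacts/exports", auth=AUTH, json={
    "name": "validation pull",
    "fields": {"email": "{{Contact.Field(C_EmailAddress)}}"},
    "filter": "'{{Contact.CreatedAt}}' >= '2023-01-01'",
}).json()

# 2. Start a sync to stage the data server-side, then poll until it completes.
sync = requests.post(f"{BASE}/syncs", auth=AUTH,
                     json={"syncedInstanceUri": export["uri"]}).json()
while sync["status"] in ("pending", "active"):
    time.sleep(5)
    sync = requests.get(f"{BASE}{sync['uri']}", auth=AUTH).json()

# 3. Page through the staged rows: raw records only, none of the rollups
#    or cross-table calculations Insights layers on top.
rows = requests.get(f"{BASE}{sync['uri']}/data", auth=AUTH).json().get("items", [])

Whatever comes back here is row-level data only; any aggregation or cross-object rollup you see in Insights happens after this stage, which is one reason the two sets of numbers can legitimately differ.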
I have a web app that displays analysis data in the browser, with Elasticsearch as the backend data store.
Everything was fine while Elasticsearch was handling about 1 TB of data; search queries were blazing fast.
Then came the decision to add data from all of our services into the app, close to a petabyte, and we switched to BigQuery (yes, we abandoned Elasticsearch and started querying BigQuery directly).
Now users of my app are complaining that their queries are slow: they take anywhere from 4 to 15 seconds, where results used to display in under a second.
Naturally the huge amount of data is to blame, but I am wondering if there is a way to bring Elasticsearch back into the game and make Elasticsearch and BigQuery play together nicely, so that I get the petabytes of storage from BigQuery but retain the lightning-fast search of Elasticsearch.
I am sure I am not the first to face this issue; if anything, I am a bit late to the BigQuery party, so I should be able to reap the benefits of arriving late and find these problems already solved.
Thanks in advance if you can point me in the right direction.
This is a common pattern I see deployed by customers:
Use Elasticsearch to display results from the latest day/week - whatever fits within Elasticsearch's RAM.
Use BigQuery for everything else.
This way your users will get sub-second results for 90% of their queries, and they can still reach everything else when Elasticsearch can't answer from its own resources.
I'm not sure what your users' interface for getting data looks like, but that's where routing logic like the sketch below would need to live.
(of course, expect improvements in the connections and speed as tech progresses)
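If it helps to see the shape of it, below is a minimal Python sketch of that routing, using the official elasticsearch and google-cloud-bigquery client libraries. The index name "events", the table "analytics.events", the timestamp field "ts", and the seven-day hot window are all assumptions made up for the example; BigQuery's SEARCH function stands in for whatever full-text predicate fits your schema.

from datetime import datetime, timedelta, timezone

from elasticsearch import Elasticsearch        # pip install elasticsearch
from google.cloud import bigquery              # pip install google-cloud-bigquery

HOT_WINDOW = timedelta(days=7)  # whatever fits in the Elasticsearch cluster's RAM

es = Elasticsearch("http://localhost:9200")
bq = bigquery.Client(project="my-project")

def search(text: str, since: datetime):
    """Serve the hot window from Elasticsearch, everything older from BigQuery."""
    if datetime.now(timezone.utc) - since <= HOT_WINDOW:
        # Fast path: recent data lives in Elasticsearch (sub-second).
        resp = es.search(index="events", query={
            "bool": {
                "must": [{"match": {"message": text}}],
                "filter": [{"range": {"ts": {"gte": since.isoformat()}}}],
            },
        })
        return resp["hits"]["hits"]
    # Slow path: the full history lives in BigQuery (seconds, not milliseconds).
    job = bq.query(
        "SELECT * FROM `my-project.analytics.events` "
        "WHERE ts >= @since AND SEARCH(message, @q)",
        job_config=bigquery.QueryJobConfig(query_parameters=[
            bigquery.ScalarQueryParameter("since", "TIMESTAMP", since),
            bigquery.ScalarQueryParameter("q", "STRING", text),
        ]),
    )
    return list(job.result())

The point is that the split lives in one small routing function, so the frontend never needs to know which store answered.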
How do we send data from Elasticsearch to Google BigQuery? Is there a specific connector?
I have been looking into various options and need the data to be available in Google BigQuery in near real time.
I found the google_bigquery Logstash output plugin, which might be useful, but I have never used it personally.
Experiment with the settings based on how much log data you generate, how quickly you need to see "fresh" data, and how much data you can afford to lose in the event of a crash. For instance, if you want recent data to appear in BQ quickly, you could configure the plugin to upload data every minute or so (provided you have enough log events to justify that).
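For reference, a minimal output section of a Logstash pipeline might look like the sketch below. I am quoting the option names from the plugin's documentation as best I remember them, so verify them against the plugin version you install; the project, dataset, and file paths are placeholders.

output {
  google_bigquery {
    project_id          => "my-gcp-project"          # placeholder project
    dataset             => "logs"                    # placeholder dataset
    json_key_file       => "/etc/logstash/bq-key.json"
    # Freshness vs. durability trade-off: a shorter flush interval means rows
    # show up in BQ sooner, but more data is in flight if Logstash crashes.
    flush_interval_secs => 60                        # roughly "every minute"
    batch_size          => 512
    error_directory     => "/tmp/bigquery-errors"    # rejected rows land here
    # Schema settings (csv_schema / json_schema) omitted; see the plugin docs.
  }
}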
I recently began reading up on big data and how there are tools like Hadoop or BigInsights that can manage both structured and unstructured data.
Social media analytics is something that can be done on BigInsights: it takes unstructured data and analyzes/structures it accordingly.
This got me wondering: how is social media data unstructured? For example, the information about tweets can be fetched via the Twitter REST API and is returned to you in a structured JSON format.
So isn't social media data already structured? If so, why do you need a platform that mainly manages unstructured data?
Some make the distinction "semi-structured", too.
But the point is the ability to query the data. Yes, tweets etc. usually have some structure, but it's not helpful for analysis.
Given an ugly SQL schema, you could indeed run a query like
SELECT AVG(TweetID) FROM Twitter;
but that functionality is useless in practice. And that is probably why the data is best considered unstructured: you do not benefit from squeezing it into a relational schema.
Beware of buzzword bingo with big data, though. More often than not, "supports unstructured data" actually means "does not benefit from structure in your data (by using indexes) but rereads the data every time".
It's not only about getting the tweets; the real value of the data is knowing what is being tweeted about. Consider Facebook, where we can comment on any picture or video. We need a platform to know how many of the comments are positive about the video, how many are sledging it, how many are real feedback, and how many offer suggestions for making it better. You also need to know how many times the video was shared and liked, and who those people are: who liked it and who disliked it. So many varieties of data can be collected, and that is why it is all called unstructured data.
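To make the structured-envelope versus unstructured-payload distinction concrete, here is a toy Python sketch; the field names and the word lists are made up for illustration. The JSON wrapper parses cleanly into schema-friendly fields, but the text inside only becomes valuable once you analyze the language itself:

import json

# The envelope is structured JSON: ids and timestamps slot straight into a schema.
raw = '{"id": 1, "created_at": "2023-05-01T12:00:00Z", "text": "Loved the video but the audio was awful"}'
post = json.loads(raw)
post_id, ts = post["id"], post["created_at"]   # trivially relational

# The payload is free text: no schema tells you its sentiment, you have to
# analyze the language itself. A toy scorer with made-up word lists:
POSITIVE = {"loved", "great", "helpful"}
NEGATIVE = {"awful", "hate", "broken"}
words = {w.strip(",.").lower() for w in post["text"].split()}
score = len(words & POSITIVE) - len(words & NEGATIVE)
print("positive" if score > 0 else "negative" if score < 0 else "mixed/neutral")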
Here is a description of the application I want to build; I'm not sure whether to use Core Data or SQLite (or something else):
Single user, desktop, not networked, only one frontend accessing the data storage
User occasionally enters some data, no bulk data importing or large data inserts
Simple data model: an entity with up to 20-30 attributes
User searches in the data (about 50k records max.)
Search takes place mostly in attribute values: not looking for any keys here, but searching for text in the values
Writing the data is nothing I see as critical: it happens infrequently and with small amounts of data. But the text search in the attributes has to be blazingly fast; a user would expect almost instant results. This is absolutely critical.
I would rather go with Core Data, but is this a scenario CD can handle?
Thanks
-Fish
Core Data can handle this scenario. But because you're looking for blazingly fast full-text search, you'll have to do some extra work. Session 211 of WWDC 2013 goes into depth about how to do this (slides 117-131). You'll probably want a separate entity holding text search tokens: all of the findable words in your dataset.
Although one of the FTS extensions is available in Apple's deployment of SQLite, it's not exposed in Core Data.
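To illustrate what FTS buys you if you do drop down to raw SQLite instead (or to see what the token-entity approach approximates by hand), here is a minimal sketch using Python's sqlite3 module; it assumes the FTS5 extension is compiled into your SQLite build, which is true of most modern ones.

import sqlite3

con = sqlite3.connect(":memory:")
# One FTS5 virtual table holding all searchable text, one row per record.
con.execute("CREATE VIRTUAL TABLE docs USING fts5(body)")
con.executemany("INSERT INTO docs(body) VALUES (?)",
                [("quick brown fox",), ("lazy dog sleeps",)])
# MATCH hits the full-text index; prefix queries stay near-instant at 50k rows.
print(con.execute("SELECT rowid, body FROM docs WHERE docs MATCH 'brow*'").fetchall())
# -> [(1, 'quick brown fox')]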