How to export Mixpanel events, segmented by time? - mixpanel

Besides "from_date" and "to_date", I'd like to further segment event info from the Mixpanel export() API method by time (events contain a 'time' property, expressed as a Unix timestamp). However, when I specify a "where" param, for example, I get an empty response (see below).
"where" param = 'properties["time"] >= 1401642000'
I definitely have events that match this. Should this be working ?
no "where" param:
args = {'from_date': '2014-06-01', 'api_key': 'REDACTED', 'sig': 'REDACTED', 'to_date': '2014-06-04', 'expire': 1402424767}
response = [ I receive expected events ]
'where' param = 'properties["time"] >= 1401642000'
args = {'from_date': '2014-06-01', 'expire': 1402424767, 'sig': 'REDACTED', 'to_date': '2014-06-04', 'api_key': 'REDACTED', 'where': 'properties["time"] >= 1401642000'}
response = [ HTTP request succeeds, but response is empty ]
Same as above, except putting time value in quotes:
'where' param = 'properties["time"] >= "1401642000"'
args = {'from_date': '2014-06-01', 'expire': 1402424767, 'sig': 'REDACTED', 'to_date': '2014-06-04', 'api_key': 'REDACTED', 'where': 'properties["time"] >= "1401642000"'}
response = [ HTTP request succeeds, but response is empty ]
Try casting properties["time"] to a number:
'where' param = 'number(properties["time"]) >= 1401642000'
args = {'from_date': '2014-06-01', 'expire': 1402424767, 'sig': 'REDACTED', 'to_date': '2014-06-04', 'api_key': 'REDACTED', 'where': 'number(properties["time"]) >= 1401642000'}
response = [ HTTP request succeeds, but response is empty ]

I ran into the same problem on this API and tried various solutions before contacting their customer support. Their response was as follows:
"The bad news is that there's unfortunately no way to selectively export events from a particular time through the export API. This is largely due to the fact that time is not one of the exposed properties on your events to the API. The only properties that are exposed within the export API are the properties that were sent in with your events and actually exposed within the UI."
However, he went on to point out their newer JQL system https://mixpanel.com/help/reference/jql which can apparently filter on things like time.
Hope that helps.

You can use the typecast function datetime() to cast the target timestamp to a datetime type, and then compare as you are above, e.g. "where" param = 'properties["time"] >= datetime(1401642000)'.
You can also do time interval arithmetic inside the parentheses to build periods off a given start or end point, e.g. "where" param = 'properties["time"] >= datetime(1401642000 - 60*60*24)' for events whose "time" property is no earlier than one day (60*60*24 seconds) before the target datetime.
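As a rough illustration of the datetime() suggestion above, the sketch below builds the "where" string from a Python datetime. The helper name build_time_filter is hypothetical (it is not part of any Mixpanel client), and since other answers in this thread report that the export endpoint ignores filters on "time", treat it as something to try rather than a confirmed fix.

```python
from datetime import datetime, timezone

def build_time_filter(start, lookback_seconds=0):
    """Return a 'where' expression for events at or after `start` minus a lookback.

    Hypothetical helper for illustration; it only formats a string.
    """
    epoch_seconds = int(start.timestamp())
    if lookback_seconds:
        inner = "{} - {}".format(epoch_seconds, lookback_seconds)
    else:
        inner = str(epoch_seconds)
    return 'properties["time"] >= datetime({})'.format(inner)

# 2014-06-01 17:00 UTC corresponds to the timestamp used in the question.
start = datetime(2014, 6, 1, 17, 0, tzinfo=timezone.utc)
where_exact = build_time_filter(start)
where_day_before = build_time_filter(start, lookback_seconds=60 * 60 * 24)
print(where_exact)
print(where_day_before)
```

The resulting string would go into the 'where' entry of the args dict shown in the question.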

I was able to solve this problem using Mixpanel's JQL API endpoint. As mentioned in other answers, the other Mixpanel API endpoints seem to ignore a 'time' parameter in the where clause. (Which is frustrating and confusing!)
Below is my solution in Python, in which I used the mixpanel-jql library to make things easy. It fetches 'Chat' events from after noon on Halloween (31-Oct-2018 GMT). Note that the from_date/to_date range and the e.time comparison must overlap to actually get any events.
from datetime import datetime
from mixpanel_jql import JQL, Events

now = datetime(2018, 10, 31)
from_timestamp_epoch_ms = 1540987200000  # Noon, Halloween 2018, GMT
query = JQL(
    MIXPANEL_SECRET,  # your project's API secret
    events=Events({
        # 'event_selectors' can be left out to grab all events
        'event_selectors': [{'event': 'Chat'}],
        'from_date': datetime(now.year, now.month, now.day),
        'to_date': datetime(now.year, now.month, now.day)
    })
).filter('e.time > {}'.format(from_timestamp_epoch_ms))
for row_dict in query.send():
    # Work your magic
    pass
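To double-check the overlap requirement mentioned above, a small stand-alone sketch can derive the epoch-millisecond cutoff and confirm that it falls inside the from_date/to_date day. No Mixpanel call is involved, the helper names are made up for illustration, and it assumes the dates are interpreted in UTC, which may not match your project's timezone setting.

```python
from datetime import datetime, timedelta, timezone

def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)

def cutoff_within_range(cutoff_ms, from_date, to_date):
    """True if the cutoff lies inside [from_date 00:00, to_date + 1 day)."""
    start_ms = to_epoch_ms(from_date)
    end_ms = to_epoch_ms(to_date + timedelta(days=1))
    return start_ms <= cutoff_ms < end_ms

day = datetime(2018, 10, 31, tzinfo=timezone.utc)
cutoff_ms = to_epoch_ms(day.replace(hour=12))  # noon on 31-Oct-2018, UTC
print(cutoff_ms, cutoff_within_range(cutoff_ms, day, day))
```

If the check returns False, the filter and the date range cannot both match any event, which would explain an empty result.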

Answering the same problem with a different method: first export your data into MySQL or any other database.
I used to have my own script to do that, but as the data volume grew I started to see the limitations of my custom script (duplicate events, long and heavy HTTP requests, ...).
I am now using Mixpaneldb, which works very well for my needs.
Treasuredata can apparently also do it, but they are not self-serve.

Related

Web.Contents calling API service and merging pages with List.Transform started to fail

I created a Power BI report which connects to a data source via an API service. The returned JSON contains thousands of entities. The API service is called via the Web.Contents function. It always returns the total record count, so we can calculate the number of pages that have to be requested to obtain the whole dataset. The report displays data from our service desk app, which is deployed on many servers for many customers, and uses query parameters to connect to any of these servers.
The details of the Power Query are below.
Why am I writing here? This report worked without any issue for more than 1.5 years, but on August 17th one of the servers started causing errors in the Pages step, where some random lines (pages) contain errors (see the attached picture labeled "Errors in step Pages"). As a result, the next step in the query, Entities (List.Union), stops the refresh and generates errors with the message:
Expression.Error: We cannot apply field access to the type List. Details: Value=[List] Key=requests
What is notable:
The API service returns records in the same order, but the faulty lists are random when calling with the same parameters.
Sometimes the refresh completes without any error.
The same Power Query called on another server works correctly; the problem is only with one specific server.
This problem started without notice on the most important server after 1.5 years without any problem.
Here is the full text of the Power Query for this main source, which is used later in other queries to extract all necessary data. The JSON is really complicated, and I extract from it the list of requests, list of solvers, list of solver groups, and so on; this base query and its output are the input for many referenced queries.
Errors in step Pages
let
    BaseAPIUrl = apiurl & "apiservice?", /* apiurl is a parameter: the server name, e.g. https://xxxx.xxxxxx.sk/ */
    EntitiesPerPage = RecordsPerPage, /* RecordsPerPage is a parameter defining the number of records per page; we use 200-400 as the optimum, but it also works with 4000 */
    ApiToken = FnApiToken(), /* returns the API token obtained from another API service, apiurl & "api/auth/login", which takes the username and password in the request body */
    GetJson = (QParm) => /* general function to get data from the data source */
        let
            Options =
                [
                    Query = QParm,
                    Headers =
                        [
                            Accept = "application/json",
                            ApiKeyName = "apitoken",
                            Authorization = ApiToken
                        ]
                ],
            RawData = Web.Contents(BaseAPIUrl, Options),
            Json = Json.Document(RawData)
        in
            Json,
    GetEntityCount = () => /* called once to get the number of records, which is returned as part of every response */
        let
            QParm = [pp = "1", pg = "1"],
            Json = GetJson(QParm),
            Count = Json[totalRecord]
        in
            Count,
    GetPage = (Index) => /* called repeatedly to get each page of JSON using GetJson */
        let
            PageNr = Text.From(Index + 1),
            PerPage = Text.From(EntitiesPerPage),
            QParm = [pg = PageNr, pp = PerPage],
            Json = GetJson(QParm),
            Value = Json[data][requests]
        in
            Value,
    EntityCount = List.Max({ EntitiesPerPage, GetEntityCount() }), /* number of records */
    PageCount = Number.RoundUp(EntityCount / EntitiesPerPage), /* number of pages */
    PageIndices = { 0 .. PageCount - 1 },
    Pages = List.Transform(PageIndices, each GetPage(_) /*Function.InvokeAfter(()=>GetPage(_),#duration(0,0,0,1))*/), /* calls GetPage for each page to get the whole dataset; the commented-out code is a test with a delay between GetPage calls, which was not necessary */
    Entities = List.Union(Pages),
    Table = Table.FromList(Entities, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
in
    Table
I also tried another way of appending pages to a list, using List.Generate. This also produces random errors in the list, but unlike the original approach with List.Transform it makes it possible to convert the list to a table. However, other referenced queries then fail and contain errors in the last row.
When I explore the content of a faulty page/list by extracting it via Add as New Query, all the records are always there without any failure.
Source = List.Generate( /* another way to generate the list of all pages */
    () => [Page = 0, ReqPageData = GetPage(0)],
    each [Page] < PageCount,
    each [ReqPageData = GetPage([Page] + 1), /* fetch the page for the incremented index; GetPage([Page]) would repeat page 0 and never request the last page */
          Page = [Page] + 1],
    each [ReqPageData]
),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error), /* here I am able to generate a table from the list, unlike with List.Transform */
#"Expanded Column1" = Table.ExpandListColumn(#"Converted to Table", "Column1"), /* here I can expand the list into a column */
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Expanded Column1", {"Column1"}) /* here I try to exclude errors, but I don't know what happened and which records (if any) are excluded */
Extracting errored page
Finally, I am totally clueless and unable to find the cause of this behavior on this specific server. I tested calling the errored pages via Postman, and I discussed the issue with the author of the API service, who also tried calling the service with all the parameters; the server returns every page OK, and only Power Query is unable to List.Transform them.
I would be grateful for any tips or advice, or to hear from anybody who has solved the same issue in the past.
Kuby
No, each error line of the list in the List.Transform step could be extracted as a new query, and all the records from that page are there and OK. Hmm.
In the end, the problem described in this issue was caused by "corrupted" content in the returned JSON. The provider of the core system informed me that they had found a bug, and after it was fixed on the service desk side everything is OK again. I was looking for the problem in Power Query, and it was in the service desk. :(
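Since the root cause turned out to be malformed JSON from the service, a shape check like the one below might have pointed at the server sooner. This is a minimal Python sketch, not the Power Query used in the report: it assumes each page has already been parsed and has the shape {"totalRecord": ..., "data": {"requests": [...]}} implied by the query above, and the function name find_bad_pages and the sample pages are made up for illustration.

```python
def find_bad_pages(pages):
    """Return (index, reason) for every page whose 'data' is not a record holding a 'requests' list."""
    problems = []
    for index, page in enumerate(pages):
        data = page.get("data") if isinstance(page, dict) else None
        if not isinstance(data, dict):
            # The situation behind "We cannot apply field access to the type List".
            problems.append((index, "data is {}".format(type(data).__name__)))
        elif not isinstance(data.get("requests"), list):
            problems.append((index, "requests missing or not a list"))
    return problems

# Hand-made sample pages; page 1 mimics a corrupted response where 'data' is a list.
sample_pages = [
    {"totalRecord": 3, "data": {"requests": [{"id": 1}, {"id": 2}]}},
    {"totalRecord": 3, "data": []},
    {"totalRecord": 3, "data": {"requests": [{"id": 3}]}},
]
bad = find_bad_pages(sample_pages)
print(bad)
```

Logging the raw response for any page reported here would give the API author something concrete to reproduce.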

using forkJoin multiple times

I am working on a project where our client generates almost 500 requests simultaneously. I am using forkJoin to get all the responses as an array.
But after 40-50 requests the server blocks the requests or sends only errors. I have to split these 500 requests into chunks of 10 requests, loop over this array of chunks, call forkJoin for each chunk, and convert the observable to a Promise.
Is there any way to get rid of this for loop over the chunks?
If I understand your question right, I think you are in a situation similar to this:
import { forkJoin } from 'rxjs'

const clientRequestParams = [params1, params2, ..., params500]
const requestAsObservables = clientRequestParams.map(params => {
  return myRequest(params)
})
forkJoin(requestAsObservables).subscribe(
  responses => {
    // do something with the array of responses
  }
)
and probably the problem is that the server cannot handle so many requests in parallel.
If my understanding is right and if, as you write, there is a limit of 10 concurrent requests, you could try the mergeMap operator, also specifying its concurrent parameter.
A solution could therefore be the following:
import { from } from 'rxjs'
import { mergeMap } from 'rxjs/operators'

const clientRequestParams = [params1, params2, ..., params500]
// use the from function from rxjs to create a stream of params
from(clientRequestParams).pipe(
  mergeMap(params => {
    return myRequest(params)
  }, 10) // 10 here is the concurrent parameter, which limits the number of
         // requests in flight at any time to 10
).subscribe(
  responseNotification => {
    // do something with the response that you get from one invocation
    // of the service in the server
  }
)
If you adopt this strategy, you limit the concurrency but you are not guaranteed the order in the sequence of the responses. In other words, the second request can return before the first one has returned. So you need to find some mechanism to link the response to the request. One simple way would be to return not only the response from the server, but also the params which you used to invoke that specific request. In this case the code would look like this
import { from } from 'rxjs'
import { map, mergeMap } from 'rxjs/operators'

const clientRequestParams = [params1, params2, ..., params500]
// use the from function from rxjs to create a stream of params
from(clientRequestParams).pipe(
  mergeMap(params => {
    return myRequest(params).pipe(
      // pair each response with the params that produced it
      map(resp => {
        return {resp, params}
      })
    )
  }, 10)
).subscribe(
  responseNotification => {
    // do something with the response that you get from one invocation
    // of the service in the server
  }
)
With this implementation you would create a stream which notifies both the response received from the server and the params used in that specific invocation.
You can adopt also other strategies, e.g. return the response and the sequence number representing that response, or maybe others.
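For comparison, the same idea (cap the number of in-flight requests and keep each response paired with its params) can be sketched outside RxJS. The Python asyncio version below is an analogue rather than a translation of mergeMap; fake_request is a stand-in for the real HTTP call, run_limited is a hypothetical name, and the limit of 10 is taken from the question.

```python
import asyncio

async def fake_request(params):
    """Stand-in for the real HTTP call; replace with your own client."""
    await asyncio.sleep(0.001)
    return {"echo": params}

async def run_limited(all_params, limit=10):
    """Run one request per params value with at most `limit` in flight at a time."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def one(params):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                resp = await fake_request(params)
            finally:
                in_flight -= 1
        # Pair the response with its params, as in the RxJS example above.
        return {"resp": resp, "params": params}

    results = await asyncio.gather(*(one(p) for p in all_params))
    return results, peak

results, peak = asyncio.run(run_limited(list(range(500)), limit=10))
print(len(results), peak)
```

Unlike mergeMap, asyncio.gather hands the results back in input order, so here the pairing with params is a convenience rather than a necessity.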

Count matching date data [duplicate]

I need to call ToShortDateString in a LINQ query using lambda expressions:
toRet.Notification = Repositories
    .portalDb.portal_notifications.OrderByDescending(p => p.id)
    .FirstOrDefault(p => p.date.ToShortDateString() == shortDateString);
but I get the error:
An exception of type 'System.NotSupportedException' occurred in
System.Data.Entity.dll but was not handled in user code
Additional information: LINQ to Entities does not recognize the method
'System.String ToShortDateString()' method, and this method cannot be
translated into a store expression.
What can I do, considering that I do need to use ToShortDateString() ?
Thanks.
LINQ to Entities cannot convert the ToShortDateString method into SQL code, so you can't call it on the server side. Either move the filtering to the client side (which will transfer all the data from the server to the client), or use server-side functions to take the date part of the date (you should pass a DateTime object instead of shortDateString):
EntityFunctions.TruncateTime(p.date) == dateWithoutTime
You shouldn't be forcing a string comparison when what you're working with is Date/time data - as soon as you force string comparisons, you're suddenly having to deal with how the strings are formatted.
Instead, have something like:
var endDate = targetDate.AddDays(1);
toRet.Notification = Repositories
    .portalDb.portal_notifications.OrderByDescending(p => p.id)
    .FirstOrDefault(p => p.date >= targetDate && p.date < endDate);
(Assuming that targetDate is whatever DateTime variable you had that was used to produce shortDateString in your code, and is already a DateTime with no time value)
Try this.
You can also use the code below:
Activity = String.Format("{0} {1}", String.Format("{0:dd-MMM-yyyy}", s.SLIDESHEETDATE), String.Format("{0:HH:mm}", s.ENDDATETIME))
The ToShortDateString() method is usually used to work only with the date and ignore the time component.
You will get exactly today's result set by using the following query:
Repositories.portalDb.portal_notifications.OrderByDescending(p => p.id)
    .FirstOrDefault(p => p.date.Date == DateTime.Now.Date);
By using the Date property of the DateTime struct, you fetch only the records for that date.
Note: this applies to LINQ to Objects, and only works if you can (or have the option to) bypass the ToShortDateString() method.
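The half-open range comparison recommended in an earlier answer (p.date >= targetDate && p.date < endDate) is independent of the language. As a minimal sketch of why it avoids string formatting issues, here is the same filter in Python over made-up in-memory rows; the function name on_day and the sample data are assumptions for illustration only.

```python
from datetime import datetime, timedelta

def on_day(rows, target_date):
    """Keep rows whose 'date' falls in [target_date, target_date + 1 day)."""
    end_date = target_date + timedelta(days=1)
    return [row for row in rows if target_date <= row["date"] < end_date]

rows = [
    {"id": 1, "date": datetime(2014, 3, 9, 23, 59, 59)},
    {"id": 2, "date": datetime(2014, 3, 10, 0, 0, 0)},
    {"id": 3, "date": datetime(2014, 3, 10, 18, 30, 0)},
    {"id": 4, "date": datetime(2014, 3, 11, 0, 0, 0)},
]
matches = on_day(rows, datetime(2014, 3, 10))
# Mirror OrderByDescending(p => p.id).FirstOrDefault(...): highest id, or None.
latest = max(matches, key=lambda row: row["id"], default=None)
print([row["id"] for row in matches], latest["id"] if latest else None)
```

Midnight at the start of the day is included and midnight at the start of the next day is excluded, so no row can match two adjacent days.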

Retrieving user metadata about a provisioned SoftLayer server comes back null

I'm trying to retrieve the user meta information for each machine consistently but what I'm finding is most of my machines are missing this data. I'd like to understand better what is required for this user data to be there. I'm curious if a server can be requested and provisioned without requiring user information (e.g. an API call to order a server and no user data is given). Or whether I am missing something in how I retrieve this information. Here is the basic ruby program I'm running:
user = ###
api_key = ###
client = SoftLayer::Client.new(:username => user, :api_key => api_key, :timeout => 999999)
list_of_virtual_machines = client['Account'].result_limit(i*50,50).object_mask("mask[id, billingItem[recurringFee, associatedChildren[recurringFee], orderItem[description, order[userRecord[username], id]]], userData]").getVirtualGuests
list_of_virtual_machines.each do |vm|
  pp vm['userData']
  if vm['billingItem'] && vm['billingItem']['orderItem'] && vm['billingItem']['orderItem']['order'] && vm['billingItem']['orderItem']['order']['userRecord']
    pp vm['billingItem']['orderItem']['order']['userRecord']
  end
end
My prints are consistently showing null. This question is related to a similar question I asked not too long ago (but the focus of that question moved towards the provisionDate):
How to get order username and provisionDate for all SoftLayer machines using Ruby?
They are missing that data because you did not add it.
You can create the user data at the moment you order a new server or VSI; you just have to send the data in your order request, using either the createObject method or the placeOrder method. See http://sldn.softlayer.com/reference/services/SoftLayer_Virtual_Guest/createObject
e.g.
{
    "userData": [
        {
            "value": "someValue"
        }
    ]
}
Or you can set it after the server has been provisioned, using these methods:
http://sldn.softlayer.com/reference/services/SoftLayer_Virtual_Guest/setUserMetadata
http://sldn.softlayer.com/reference/services/SoftLayer_Hardware_Server/setUserMetadata
Basically, user metadata is useful if you are going to use a post-install script. The user metadata value is not required to order a new server.
Take a look at this article for examples:
http://sldn.softlayer.com/blog/jarteche/getting-started-user-data-and-post-provisioning-scripts

How can I change the column name of an existing Class in the Parse.com Web Browser interface?

I couldn't find a way to change the name of a column I just created, either in the browser interface or via an API call. It looks like all object-related API calls manipulate instances, not the class definition itself?
Does anyone know if this is possible, without having to delete and re-create the column?
This is how I did it in Python:
import json, httplib, urllib  # Python 2 modules

connection = httplib.HTTPSConnection('api.parse.com', 443)
params = urllib.urlencode({"limit": 1000})
connection.connect()
connection.request('GET', '/1/classes/Object?%s' % params, '', {
    "X-Parse-Application-Id": "yourID",
    "X-Parse-REST-API-Key": "yourKey"
})
result = json.loads(connection.getresponse().read())
objects = result['results']
for obj in objects:
    # Open a fresh connection for each update
    connection = httplib.HTTPSConnection('api.parse.com', 443)
    connection.connect()
    objectId = obj['objectId']
    objectData = obj['data']
    # Copy the value of the old column ('data') into the new column ('clonedData')
    connection.request('PUT', ('/1/classes/Object/%s' % objectId), json.dumps({
        "clonedData": objectData
    }), {
        "X-Parse-Application-Id": "yourID",
        "X-Parse-REST-API-Key": "yourKey",
        "Content-Type": "application/json"
    })
This is not optimized: you can batch 50 of the updates together at once, but since I was only running it once I didn't do that. Also, since Parse limits a query to 1000 results, you will need to run the load multiple times with a skip parameter, like
params = urllib.urlencode({"limit":1000, "skip":1000})
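The two refinements mentioned above (paging with skip, and grouping updates into batches of 50) can be sketched as plain helpers. This is Python 3 rather than the Python 2 used in the answer, it sends no requests, and the function names and the totals of 2500 and 120 are made up for illustration; only the limit of 1000 and the batch size of 50 come from the answer.

```python
from urllib.parse import urlencode

def page_queries(total, limit=1000):
    """Query strings covering `total` objects, `limit` at a time."""
    return [urlencode({"limit": limit, "skip": skip}) for skip in range(0, total, limit)]

def chunks(items, size=50):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

queries = page_queries(2500)
batches = chunks(list(range(120)))
print(queries)
print([len(batch) for batch in batches])
```

Each query string would replace params in the GET request, and each batch would be sent as one grouped update instead of one PUT per object.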
From this Parse forum answer : https://www.parse.com/questions/how-can-i-rename-a-column
Columns cannot be renamed. This is to avoid breaking an existing app.
If your app is still under development, you can just query for all the
objects in your class and copy the value of the old column to the new
column. The REST API is very useful for this. You may them drop the
old column in the Data Browser
Hope it helps
Yes, it's not a feature provided by Parse (yet). But there are some third-party API management tools that you can use to rename the fields in the response. One free tool is called apibond.com
It's a workaround, but I hope it helps
