I have a DataFrame with event logs:
eventtime, eventname, user, execution_in_s, delta_event_time
The eventname e.g. can be "new_order", "login" or "update_order".
My problem is that I want to know if there is eventname == "error" in the periods between login and update_order by distinct user. A period for me has a start time and an end time.
That all sounded easy until I tried it this morning.
For the time frame of the 24h logs I might not have a pair, because the login might have happened yesterday. I am not sure how to deal with something like that.
delta_event_time is a computed column of the eventtime minus the executions_in_s. I am considering these the real time stamps. I computed them:
event_frame["delta_event_time"] = event_frame["eventtime"] - pandas.to_timedelta(event_frame["execution_in_s"], unit='s')
I tried something like this:
events_keys = numpy.array(["login", "new_order"])
users = numpy.unique(event_frame["user"])
for user in users:
event_name = event_frame[event_frame["eventname"].isin(events_keys) & event_frame["user" == user]]["event_name"]
But this not using the time periods.
I know that Pandas has between_time() but I don't know how to query a DataFrame with periods, by user.
Do I need to iterate over the DataFrame with .iterrows() to calculate the start and end time tupels? It takes a lot of time to do that, just for basic things in my tries. I somehow think that this would make Pandas useless for this task.
I tried event_frame.sort(["user", "eventname"]) which works nicely so that I can see the relevant lines already. I did not have any luck with .groupby("user"), because it mixed users although they are unique row values.
Maybe a better workflow solution is to dump the DataFrame into a MongoDB instead of pursuing a solution with Pandas to perform the analysis in this case. I am not sure, because I am new to the framework.
Here is pseudocode for what I think will solve your problem. I will update it if you share a sample of your data.
grouped = event_frame.groupby('user') # This should work.
# I cannot believe that it didn't work for you! I won't buy it till you show us proof!
for name, group in grouped:
group.set_index('eventtime') # This will make it easier to work with time series.
# I am changing index here because different users may have similar or
# overlapping times, and it is a pain in the neck to resolve indexing conflicts.
login_ind = group[group['eventname'] == 'login'].index
error_ind = group[group['eventname'] == 'error'].index
update_ind = group[group['eventname'] == 'update_order'].index
# Here you can compare the lists login_ind, error_ind and update_ind however you wish.
# Note that the list can even have a length of 0.
# User name is stored in the variable name. So you can get it from there.
Best way might be to create a function that does the comparing. Because then you can create a dict by declaring error_user = {}.
Then calling your function inside for name, group in grouped: like so: error_user[name] = function_which_checks_when_user_saw_error(login_ind, error_ind, update_ind).
I have an API in rails that will handle a lot of requests. Lets say millions a day. I want to be able to assign a unique id to each of these requests. The way i am doing it is starting with a uid of 3 letters/numbers and continue up till 9 letters/numbers when the previous bracket is all taken.
One one i am doing it is generate the uid in real time when the request comes, so the app will try to find the first available uid and assign it. But after a while i have the impression that this will impact the performance of the app.
The second way I am thinking is to auto-generate all the possible uids in advance and have a flag assigned to them [free/taken] so that when a request comes in i assign the first free uid to it, which should be very fast if that filed is indexed.
Any suggestions are much appreciated. Thank you
I would just generate a random string and assign it while it doesn't exist. Before I get into that let me just mention that SecureRandom.uuid is the best way to go. It generates random uuids whose chance of collision are mathematically improbable.
Anyway, here is a way to use your own custom random string generator that will only assign if it doesn't already exist:
def generate_random_uid
begin
uid = my_custom_random_string_method
object.uid = uid
end while ObjectModel.exists? uid: uid
object.save
end
The do while block will execute once and set a variable called uid to be a random string you generate in the method my_custom_random_string_method, then in the while part it checks if a record exists whose uid attribute matches the random uid you just generated, if it does it runs the do block again, rinse and repeat until the expression in the while part returns false meaning the uid doesn't exists, then your object is saved and the uid written to the db. This guarantees that you will only ever save the object with a uid that doesn't exist in your db.
Query's throwing an ORA-00907 Error when I try to paste a list of values into a criteria.
Background: I'm not a developer, I'm just an end user that's studied enough to where I can write queries using PS/Query within Peoplesoft,
for my company's implementation. I work with Peoplesoft's FSCM module
(Financials and Supply Chain Management), currently on Version FSCM
8.90.08.024, using I think Oracle 11g as the base database.
I'm mostly self-taught, and the technical experts we have are busy
with database/application stuff, or they aren't familiar with my
section's specific data needs.
I should point out that I'm unable to directly write SQL statements to
Query the database. I have to use a built-in program called "PS/Query"
(also known as Query Manager) with a GUI that writes the SQL for you
and saves it as a Query that you can run to the database to extract
data. This is relevant to my question only in that:
1. I cannot create or alter views/tables
2. I cannot perform any type of SQL Statement except "SELECT"
3. I can embed PL/SQL, MetaSQL and plain SQL into Expressions
4. At this point, Query Manager is the only option I have.
PS/Query is my only experience with SQL so far, aside from Oracle's
documentation and sites like this. From my research, it's considered
extremely confining by "actual" SQL programmers.The restrictions on it
require you to do things in a manner that violates what seem to be
best practices of SQL coding.
Query Request: I have a query I've been requested to write that pulls out spend (on Vouchers and POs) against certain system-defined
Category Codes. What I'm trying to do is pull in Voucher IDs, sum the
merchandise amounts on them by Vendor and Category Code, and display
the results. Or in other words, for every unique combination of
Vendor/Category, add up all the Voucher Amounts that have that
Vendor/Category combination.
Using the SUM (Fieldname) OVER (PARTITION BY fieldname, fieldname)
syntax.
So the end result should look something like...
Code Vendor Amount
123-45 Acme $5000.00
123-45 Apple $4200.00
123-46 Acme $750.00
With that said, here's the SQL that Query Manager is displaying to get the result set I showed above:
SELECT DISTINCT D.CATEGORY_CD, D.TN_DESCR1000, C.VENDOR_ID, E.NAME1, SUM ( A.MERCH_AMT_VCHR) OVER (PARTITION BY D.CATEGORY_CD, C.VENDOR_ID),E.SETID,E.VENDOR_ID
FROM PS_PO_LINE_MATCHED A, PS_PO_LINE B, PS_PO_HDR C, PS_ITM_CAT_TBL D, PS_VENDOR E, PS_PYMNT_VCHR_XREF F
WHERE A.BUSINESS_UNIT = B.BUSINESS_UNIT
AND A.PO_ID = B.PO_ID
AND A.LINE_NBR = B.LINE_NBR
AND B.BUSINESS_UNIT = C.BUSINESS_UNIT
AND B.PO_ID = C.PO_ID
AND D.CATEGORY_ID = B.CATEGORY_ID
AND D.EFFDT =
(SELECT MAX(D_ED.EFFDT) FROM PS_ITM_CAT_TBL D_ED
WHERE D.SETID = D_ED.SETID
AND D.CATEGORY_TYPE = D_ED.CATEGORY_TYPE
AND D.CATEGORY_CD = D_ED.CATEGORY_CD
AND D.CATEGORY_ID = D_ED.CATEGORY_ID
AND D_ED.EFFDT <= SYSDATE)
AND ( F.SCHEDULED_PAY_DT >= TO_DATE('2010-07-01','YYYY-MM-DD')
AND F.SCHEDULED_PAY_DT <= TO_DATE('2011-06-30','YYYY-MM-DD'))
AND D.CATEGORY_CD LIKE :1
AND E.VENDOR_ID = C.VENDOR_ID
AND A.BUSINESS_UNIT = F.BUSINESS_UNIT
AND A.VOUCHER_ID = F.VOUCHER_ID
ORDER BY 1
Underlying Issue: This works fine, but it can only prompt on one
Category Code at a time. Category Codes are 5 digits, a 3-digit
"Class" followed by a dash and then a 2-digit "subclass. I have a list
of 375 Category Codes I need to get this Query result for.
I've set up a prompt on this version that allows entry of a Wildcard
(So 123-%%), but that's still about a hundred separate runs of the
Query. Query Manager allows use of an "In List" expression type in
Criteria, but it requires you to manually enter each entry in the
list.
I'm trying to set it up to where I can paste a plaintext copy of the
Code list into an Expression, with proper quotes/commas, and have it
evaluate that to give me a combined list of all the NIGP codes
specified. The Prompt field created by Query Manager doesn't allow
pasting of lists (as far as I know).
Attempted Solution: I viewed the page at http://peoplesoft.ittoolbox.com/groups/technical-functional/peoplesoft-other-l/create-an-expression-in-psoft-90-query-to-paste-a-list-of-emplids-2808427 and I've tried some of the answers given there, but none of them worked. That page led to me trying this modified SQL (obviously the list of codes is truncated a bit for display here):
SELECT DISTINCT D.CATEGORY_CD, D.TN_DESCR1000, C.VENDOR_ID, E.NAME1, SUM ( A.MERCH_AMT_VCHR) OVER (PARTITION BY D.CATEGORY_CD, C.VENDOR_ID),E.SETID,E.VENDOR_ID
FROM PS_PO_LINE_MATCHED A, PS_PO_LINE B, PS_PO_HDR C, PS_ITM_CAT_TBL D, PS_VENDOR E, PS_PYMNT_VCHR_XREF F
WHERE A.BUSINESS_UNIT = B.BUSINESS_UNIT
AND A.PO_ID = B.PO_ID
AND A.LINE_NBR = B.LINE_NBR
AND B.BUSINESS_UNIT = C.BUSINESS_UNIT
AND B.PO_ID = C.PO_ID
AND D.CATEGORY_ID = B.CATEGORY_ID
AND D.EFFDT =
(SELECT MAX(D_ED.EFFDT) FROM PS_ITM_CAT_TBL D_ED
WHERE D.SETID = D_ED.SETID
AND D.CATEGORY_TYPE = D_ED.CATEGORY_TYPE
AND D.CATEGORY_CD = D_ED.CATEGORY_CD
AND D.CATEGORY_ID = D_ED.CATEGORY_ID
AND D_ED.EFFDT <= SYSDATE)
AND ( F.SCHEDULED_PAY_DT >= TO_DATE('2010-07-01','YYYY-MM-DD')
AND F.SCHEDULED_PAY_DT <= TO_DATE('2011-06-30','YYYY-MM-DD'))
AND D.CATEGORY_CD = '005-00' OR D.CATEGORY_CD IN ('015-00,'' '015-06,'' '015-10,'' '615-07'')
AND E.VENDOR_ID = C.VENDOR_ID
AND A.BUSINESS_UNIT = F.BUSINESS_UNIT
AND A.VOUCHER_ID = F.VOUCHER_ID
ORDER BY 1
And the SQL above is what's giving me the ORA-00907 error. Has anyone ran into this problem before? Massive wall of text, I know. My apologies. This is my first post here and I'm trying not to leave anything relevant out.
I've got the immediate problem that spurred this question fixed,but that request is just the tip of a very large iceberg, and at some point I need to figure out a way to be able to paste plaintext lists in as criteria using Query Manager, preferably in a way that plays nice with Analytic Grouping.
TL;DR version:
Using Peoplesoft Query Manager to do an Analytic SUM with grouping using OVER, PARTITION BY. When I try to paste a list into the criteria, it throws an ORA-00907 Error.
Any help would be greatly appreciated. Thanks!
Ok, after a bit more tweaking with this, I've found what I think is the underlying issue.
The error, in this case, is two-fold. Part of it was my fault (I didn't check for Peoplesoft mangling the quotation marks I pulled from Word), and part of it was the way Query Manager interprets some kinds of functions (you have to wrap some stuff in a Case When statement to get it to evaluate properly).
First, the "My Fault" part:
Every time I was pasting in my list of test NIGP Codes, I was doing it from a file I kept saved in Microsoft Word.
Which has the probably-handy "replace straight quotes with smart quotes" feature. Peoplesoft goes bonkers when its presented a "smart quote", and will display them as upside-down question marks (there's probably a technical term, I don't know it).
So when I'd test suggestions (such as fixing the quote/comma order as suggested by #Rene Nyffenegger and #WayneH) I'd start with my base test query, add in the expressions and test it, saving it as a separate query. If they didn't work, I'd go back to the base query. That way I could iterate changes and save potential tests as different versions.
My mistake was in not saving the different versions, leaving the application and going back in. It's when you save the query, leave the page, go somewhere else in Peoplesoft, then go back to open Query Manager that it actually shows you that it's doing the character conversion. You can't see it unless you do that. Even though Query Manager is doing it. So it was throwing a character Query Manager wouldn't recognize, but not showing me the character it wouldn't recognize.
I got a new work PC recently, and I've now disabled the Smart Quotes auto-replace for future use.
Second, the "Query Manager: part:
On the version of this that I got to work, I made use of wrapping the "IN" function inside a Case statement. I've found that a lot of SQL functions, when used "plain" (as I'd define them by just copy-pasting from Oracle's definitions pages and filling in the appropriate variables) tend to give PS/Query (Query Manager) heartburn. But if you wrap them inside a CASE...WHEN...END statement that evaluates the result of the function and then build a criteria that selects based on certain values of that result, the function will work and properly display a result.
So for an example, set up this expression (like in the example from #qyb2zm302). I'm using different codes from what was in my original example, but they work the same (they're all five-digit, character-typed codes consisting of three digits, a dash, then two digits)
Case when E.CATEGORY_CD IN
('375-15', '375-30', '375-54', '375-60', '380-30','938-63')
then 'true'
else 'false'
end
And then set a criteria:
AND
Case when E.CATEGORY_CD IN
('375-15', '375-30', '375-54', '375-60', '380-30','938-63')
then 'true'
else 'false'
end
= 'true'
It'll run to completion and return any rows that have that Category Code.
If you don't want to do that, you can do like in #qyb2zm302's Method 2. The only downside to that in Query Manager is that you have to enter them into individual rows in the "List", and if you can only copy-paste 25 at a time.
Wrapping it in a Case Statement lets you paste it directly into an Expression, which is far better for larger lists.
Solutions:
The above is the code I went with that worked. It's simplifying a bit for brevity's sake, but it works.
In List works through the native Query Manager option as long as you manually-populate the list
D.CATEGORY_CD = '005-00' OR works as long as you wrap it in a Case Statement
D.CATEGORY_CD IN ('015-00','015-06','015-10','615-07') works as long as you wrap it in a Case Statement
Peoplesoft hates Smart Quotes. None of the above will work if you're copying quotation marks directly from Word, but you won't see it unless you save, leave and come back to the same query in edit mode
Formatting is important. All of the above require the proper comma/quotation formatting, as pointed out by Rene and Wayne. Meaning: ('xxx-xx', 'xxx-01','xxx-02') etc
Thanks to everyone who helped on this! I don't think I've head-desked this hard before on any question, but I guess that's part of the learning process. Since all the answers posted are valid and correct (or at least a portion of the larger "correct"), I'm going to flag them all.
The
D.CATEGORY_CD IN ('015-00,'' '015-06,'' '015-10,'' '615-07'')
part looks fishy to me
Since a '' within a string "evaluates" to a single ' the first string is
'015-00,'' '
followed by (the non-string)
015-06,
The following '' is probably the thing that the parser stumbles upon since it's pretty meaningless.
Edit try it with a D.CATEGORY_CD IN ('015-00', '015-06', '015-10', '615-07').
Following the link you posted, I see 2 methods for doing what you are trying to accomplish.
I also notice that you tried a 3rd method.
Method 1
Criteria > Add Criteria
Expression Type: Character
Length: 255
Expression Text: D.CATEGORY_CD IN ('015-00','015-06','015-10','615-07') AND 1
Condition Type: equal to
Constant: 1
Method 2
Criteria > Add Criteria
Field: D.CATEGORY_CD
Condition Type: in list
Value: 015-00','015-06','015-10','615-07
Method 3 (Your Method)
Criteria > Add Criteria
Field: D.CATEGORY_CD
Condition Type: equal to
Define Expression: '015-00' OR D.CATEGORY_CD IN ('015-00','015-06','015-10','615-07')
Question) Does the below exactly match the text you are putting the Expression box?
'015-00' OR D.CATEGORY_CD IN ('015-00','015-06','015-10','615-07')
If not, what are you putting in that box?
I think the D.CATEGORY_CD criteria are giving you the problems, I changed the double quotes to single quotes and then it still looked strange to me. I then notice the commas are inside your quotes and not between them, try making the one criteria line look like this:
before:
OR D.CATEGORY_CD IN ('015-00,'' '015-06,'' '015-10,'' '615-07'')
after:
OR D.CATEGORY_CD IN ('015-00', '015-06', '015-10', '615-07')
Also, the "IN" is an implied "OR" and I am not sure if you have parenthesis around the two D.CATEGORY_CD,
I would just put the one additional code into the IN criteria and remove the "D.CATEGORY_CD =" line:
before:
AND D.CATEGORY_CD = '005-00' OR D.CATEGORY_CD IN ('015-00', '015-06', '015-10', '615-07')
after:
AND D.CATEGORY_CD IN ('015-00', '015-06', '015-10', '615-07', '005-00')
Of course, you are already ordering by CATEGORY_CD, you could remove this criteria and pull all categories in one run (that is unless there are too many rows for excel), and then you might also want to include either VENDOR_ID or NAME1 in the ORDER BY clause.
Hope that helps you.
I need a 6 character alphanumeric ID for use in my rails app, which will be presented to users of the system and must be unique among all the object instances in my system. I don't expect more than a few thousand object instances, so 6 characters is far more than I really need.
At this point I'm using the UUIDTools gem in my Rails app to generate a uuid. Which of the UUIDTools generation methods should I use, and which end of the resulting uuid should I take the 6 characters from, to guarantee uniqueness?
for example, if I generate ef1cf087-95c9-4868-bd95-cea950a52b58, would I want to use ef1cf0 from the front of it, or a52b58 from the back end?
... as a side note / question: am i going about this wrong? is there a better way?
No way. UUID is considered unique because it is very long and it is practically impossible to generate same UUIDs. If you trim it to 6 chars then you drammatically increase possiblility of duplicate. You have to use either incrementing id or full UUID.
Only deterministic generation (id(x + 1) = id(x) + 1) can guarantee uniqueness. UUID doesn't guarantee it and 6 chars guarantee it even less.
Other option is to create ID generation service, it will have single method getNewId and will keep knowledge that will be enought to provide unique ids. (Simplest case - counter)
When you say that incrementing the ID isn't an option, is that because you don't want users to see the scheme you're using, or because the generation must be stateless (i.e., you can't keep track of all IDs you've generated)?
If it's the former, then you can generate an ID, check to see if you've already used it, and if so, generate another new ID. (Seems pretty obvious so sorry if I'm on the wrong track.) You could do something like this:
while id = rand(2**256).to_s(36)[0..5]
break unless Ids.exists?(id)
end
where Ids.exists?(id) is the does-this-already-exist method.
I need a technique (an a pointer to sample code if you have) for generating conformation numbers for web payment. I don't want the customer to write down a long sequence like a GUID but I don't want it easily predictable as well.
Using C#
Thanks for all the tips.
I decided on a format like this:
TdddRROOO
T = 2009 (next year will be U = 2010)
ddd = days this year
RR = two random numbers
000 = order number (I'll offset this so folks can't know the order number that day)
So the confirmation number will be something like
P23477098
You could do something with a mixture. Generate the first half of the key as a known, predictable value (e.g. 00001, 00002, 00003, etc.) and then generate the second half as a randomly generated value so it won't be predictable. Then, increment the "known, predictable" value so that you will never get a match.
Your unique code would then become: 00001-53481, 00002-43853, 00003-54511, etc.
Of course, I am sure there are libraries out there that probably do this already. (It might help if you specify what language you are using.)
I recent did same thing in PHP. We use random function in this class,
https://github.com/kohana/core/blob/3.3/master/classes/Kohana/Text.php
We use random('distinct', 8) to generate confirmation number. It generates strings like this,
4CFY24HJ
JH5AYL7J
2TVWTMJ5
As you can see, it has no confusing numbers/letters like (1/l, 0/O etc) so it makes it much clearer when customers have to read the numbers over the phone.
Decide on the characters (char[] chars) that you want in your confirmation code, decide on the length of confirmation code (n), generate n random numbers (i_1, i_2, ... i_n) in the range [0..chars.Length) and return the string chars[i_1]chars[i_2]...chars[i_n].
In C#:
public string ConfirmationCode(char[] chars, int length, Random rg) {
StringBuilder codeBuilder = new StringBuilder();
for(int i = 0; i < length; i++) {
int index = rg.Next(chars.Length);
codeBuilder.Append(chars[index]);
}
return codeBuilder.ToString();
For uniqueness, prepend the current time in yyyyMMddhhmmss format.
Just generate a random number between 100000 and 999999, for example. Also a good idea is to put some letters in front that identify that it is a confirmation number, such as CONF-843682 so that people will recognize it more easily when you ask for it.
Store the number in the database, together with an ID for the order and an expiry date (say 1 year).
You could do something like get a random number of a specified length, convert to base64 and add a checksum character.
How about something like Amazon's PayPhrase? Use a library like Faker (Ruby) or Data::Faker (Perl) to generate random phrases, or write your own utility. Then just use a simple hash function to convert the "confirmation phrase" into a number you can index.
As for C# there exists a port Ruby's Faker gem at http://github.com/slashdotdash/faker-cs