I have a FetchXML report set up to pull data from our CRM instance. Inside Visual Studio 2010, it is laid out like this when it pulls the information:
job number | new lot | rough start date | city | builder
30774-00c | custom | 8/4/2014 | city1 | builder1
30876-19 | 465 | 7/11/2014 | city5 | builder2
30876-19 | 466 | 7/11/2014 | city5 | builder2
30876-19 | 467 | 7/11/2014 | city5 | builder2
30876-19 | 489 | 7/12/2014 | city5 | builder2
30856-01 | 2 | 8/26/201 | city3 | builder5
I want to combine the "new lot" values for rows where the job number and "rough start date" are the same, so it would look like:
job number | new lot | rough start date | city | builder
30774-00c | custom | 8/4/2014 | city1 | builder1
30876-19 | 465,466,467 | 7/11/2014 | city5 | builder2
30876-19 | 489 | 7/12/2014 | city5 | builder2
But I just can't seem to figure out the grouping correctly. Any guidance would be great.
I thought I could do =Join(LookupSet(Fields!jobnumber.Value,Fields!jobnumber.Value,Fields!roughstartdate.Value,"DataSet1"),",")
But that only shows one item when they match and does not combine the lots onto a single line.
First group by "rough start date" and then by "job number", then use the expression below in "new lot":
=Join(LookupSet(Fields!roughstartdate.Value,Fields!roughstartdate.Value,Fields!newlot.Value,"DataSet2"),",")
DataSet2 should be the same as DataSet1.
I was going to comment above but I can't, so: I think the reason all the lots come back is that the group is only on the date.
You need to group on job number AND date and then use the Join(LookupSet(...)) expression.
That way you will have separate groups for job number 30876-19 on 7/11/2014 and 30876-19 on 7/12/2014.
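To make the composite grouping concrete, here is a minimal sketch in Python rather than SSRS. It is only an illustration of grouping on (job number, date) and joining the lots; the function name combine_lots and the tuple layout are assumptions, not anything from the report itself.

```python
# Rows as listed in the question:
# (job number, new lot, rough start date, city, builder)
rows = [
    ("30774-00c", "custom", "8/4/2014", "city1", "builder1"),
    ("30876-19", "465", "7/11/2014", "city5", "builder2"),
    ("30876-19", "466", "7/11/2014", "city5", "builder2"),
    ("30876-19", "467", "7/11/2014", "city5", "builder2"),
    ("30876-19", "489", "7/12/2014", "city5", "builder2"),
]


def combine_lots(rows):
    """Group on (job number, date) and join the lots of each group with commas."""
    groups = {}  # dicts keep insertion order in Python 3.7+
    for job, lot, date, city, builder in rows:
        key = (job, date)  # the composite key: job number AND date
        if key not in groups:
            groups[key] = {"lots": [], "city": city, "builder": builder}
        groups[key]["lots"].append(lot)
    return [
        (job, ",".join(g["lots"]), date, g["city"], g["builder"])
        for (job, date), g in groups.items()
    ]


for line in combine_lots(rows):
    print(line)
```

Grouping on the date alone would put lot 489 in its own group only by luck of the data; with the job number in the key, two jobs that share a date never get merged.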
This example would apply to any interactive dashboard, such as Kibana or Power BI.
Say I have a vehicle, 556, in a fleet of vehicles.
At 08:41:50 said vehicle generates an alarm (engine overheated):
2021-04-21T08:41:50 Priority 2 alarm - Vehicle 556 - Engine overheating
I would like to then be able to click on this alarm and get a dashboard that shows me the known states of the vehicle at the time of the event,
e.g. speed of vehicle, acceleration percentage, braking percentage, and of course data which would come from other data sources, like weather information, driver, etc.
So from the dashboard, I would expect, based on the logs below, to see the following:
Vehicle: 556
Datetime: 2021-04-21 08:41:50
Speed: 42mph
Acceleration: 70%
Braking: 65%
Weather: Sunny, 80 deg
Other alarms: Engine fan stalled
2021-04-21T06:35:50 info - Weather - Sunny 75 deg
2021-04-21T07:35:50 info - Weather - Sunny 78 deg
2021-04-21T08:34:23 Priority 1 alarm - Vehicle 556 - Engine fan stalled
2021-04-21T08:35:50 info - Weather - Sunny 80 deg
2021-04-21T08:41:38 info - Vehicle 556 - Acceleration: 100%
2021-04-21T08:41:38 info - Vehicle 556 - Braking: 40%
2021-04-21T08:41:38 info - Vehicle 556 - Speed 24 mph
2021-04-21T08:41:42 info - Vehicle 556 - Acceleration: 97%
2021-04-21T08:41:42 info - Vehicle 556 - Braking: 50%
2021-04-21T08:41:42 info - Vehicle 556 - Speed 38 mph
2021-04-21T08:41:45 info - Vehicle 556 - Acceleration: 70%
2021-04-21T08:41:45 info - Vehicle 556 - Braking: 65%
2021-04-21T08:41:45 info - Vehicle 556 - Speed 42 mph
2021-04-21T08:41:50 Priority 2 alarm - Vehicle 556 - Engine overheating
For "bonus points", being able to then click the last known speed and see a log showing e.g. the last 10 entries regarding vehicle speed would also be nice.
But is this a thing in any database, or specifically in Power BI or Elasticsearch/Kibana?
Or would this require a custom-built application dashboard?
Or is it just that I don't know the right search term?
For Power BI, there's a useful function, LASTNONBLANKVALUE, which can handle this. The best documentation source (DAX Guide) is linked below; it includes the product documentation plus a live sandpit:
https://dax.guide/lastnonblankvalue/
For your scenario, I imagine using the timestamp for the 1st parameter (ColumnName), then the 3rd segment of the log text (e.g. a derived column with values like "Speed 42 mph") for the 2nd parameter.
I would build a set of DAX columns to prepare the data, using CALCULATE to restrict the set of rows considered. DAX measures might be more flexible and dynamic, but they are much more abstract to develop, and query performance can suffer if they are complex.
Here's an example to get the Last Speed for each non-info Log row. I assume some data transformation to get helper columns for Vehicle etc.
Last Speed =
VAR v_Timestamp_Max = LogTable[Timestamp]
VAR v_Vehicle = LogTable[Vehicle]
RETURN
    IF (
        LogTable[Level] <> "info",
        CALCULATE (
            LASTNONBLANKVALUE ( LogTable[Timestamp], MAX ( LogTable[Value] ) ),
            ALL ( LogTable ),
            LogTable[Timestamp] < v_Timestamp_Max,
            LogTable[Vehicle] = v_Vehicle,
            LogTable[Value Type] = "Speed"
        )
    )
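For anyone who wants to check the "last known value before the alarm" logic outside Power BI, here is a rough Python sketch of the same idea. It parses a few of the log lines from the question; the regular expression and the function name last_known are my own assumptions about the log layout, not part of any product.

```python
import re
from datetime import datetime

# A subset of the log lines from the question.
LOG = """\
2021-04-21T08:41:42 info - Vehicle 556 - Acceleration: 97%
2021-04-21T08:41:42 info - Vehicle 556 - Braking: 50%
2021-04-21T08:41:42 info - Vehicle 556 - Speed 38 mph
2021-04-21T08:41:45 info - Vehicle 556 - Acceleration: 70%
2021-04-21T08:41:45 info - Vehicle 556 - Braking: 65%
2021-04-21T08:41:45 info - Vehicle 556 - Speed 42 mph
2021-04-21T08:41:50 Priority 2 alarm - Vehicle 556 - Engine overheating
"""

# Assumed layout: "<timestamp> info - Vehicle <id> - <type>[:] <value>"
LINE = re.compile(r"^(\S+) info - Vehicle (\d+) - (Acceleration|Braking|Speed):? (.+)$")


def last_known(log_text, vehicle, at, value_type):
    """Return the latest value of value_type for vehicle logged strictly before `at`."""
    best = None  # (timestamp, value) of the newest matching row seen so far
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # alarms, weather rows, and anything else are skipped
        ts = datetime.fromisoformat(m.group(1))
        if m.group(2) == vehicle and m.group(3) == value_type and ts < at:
            if best is None or ts >= best[0]:
                best = (ts, m.group(4))
    return best[1] if best else None


alarm_time = datetime(2021, 4, 21, 8, 41, 50)
for kind in ("Speed", "Acceleration", "Braking"):
    print(kind, last_known(LOG, "556", alarm_time, kind))
```

The three conditions in the loop correspond to the three filters passed to CALCULATE in the DAX column: earlier timestamp, same vehicle, same value type.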
I am looking to merge data in the way described below:
I have a table below:
table: PTLANALYSIS
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
COMPETITOR,
RATE;
The data I am trying to load into the table:
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
LOLY,
KAY,
RATE;
Now LOLY and KAY are supposed to go in the column COMPETITOR in table PTLANALYSIS. Can someone help me merge my data appropriately? The output should look something like this:
Rental Date | OUTBOUND | INBOUND | VEHICLE_SIZE | COMPETITOR | RATE
12/28/2019 | 223 | 333 | small | loly | 33.5
12/28/2019 | 223 | 333 | small | kay | 33.5
Currently it looks like this in my CSV:
Rental Date | OUTBOUND | INBOUND | VEHICLE_SIZE | lolyRATE | KAYRATE
12/28/2019 | 223 | 333 | small | 33.5 | NULL
12/28/2019 | 223 | 333 | small | NULL | 33.5
Thanks in advance!
Most of the columns in the CSV file have fixed targets. You need to evaluate the LOLYRATE and KAYRATE to conditionally populate COMPETITOR and RATE. Something like this:
insert into PTLANALYSIS (
    RENTALDATE,
    OUTBOUND,
    INBOUND,
    VEHICLE_SIZE,
    COMPETITOR,
    RATE
)
select
    RENTALDATE,
    OUTBOUND,
    INBOUND,
    VEHICLE_SIZE,
    case when LOLYRATE is not null then 'loly' else 'kay' end as competitor,
    coalesce(LOLYRATE, KAYRATE) as rate
from ext_table
;
You haven't said how you intend to load the data, but I have assumed an external table, because it allows you to use SQL, and everything is easier with SQL.
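If you want to sanity-check the CASE/COALESCE logic before loading, here is a small Python sketch that does the same transformation on the CSV rows. It assumes a comma-separated file whose header uses the names below and where NULL is an empty field; both are guesses on my part, since the actual file layout wasn't shown.

```python
import csv
import io

# Assumed CSV layout: comma-separated, NULL written as an empty field.
CSV_TEXT = """RENTALDATE,OUTBOUND,INBOUND,VEHICLE_SIZE,LOLYRATE,KAYRATE
12/28/2019,223,333,small,33.5,
12/28/2019,223,333,small,,33.5
"""


def to_target_rows(csv_text):
    """Fold the two per-competitor rate columns into COMPETITOR and RATE."""
    out = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        loly = row["LOLYRATE"] or None  # empty string stands in for NULL
        kay = row["KAYRATE"] or None
        out.append({
            "RENTALDATE": row["RENTALDATE"],
            "OUTBOUND": row["OUTBOUND"],
            "INBOUND": row["INBOUND"],
            "VEHICLE_SIZE": row["VEHICLE_SIZE"],
            # same rule as: case when LOLYRATE is not null then 'loly' else 'kay' end
            "COMPETITOR": "loly" if loly is not None else "kay",
            # same rule as: coalesce(LOLYRATE, KAYRATE)
            "RATE": loly if loly is not None else kay,
        })
    return out


for target in to_target_rows(CSV_TEXT):
    print(target)
```

Note that, like the SQL, this labels a row 'kay' whenever LOLYRATE is empty, even if KAYRATE is empty too, so rows with both rates missing may need separate handling.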
Here is the TSV file, c2is2r3.tsv:
The O
fate O
of O
Lehman ORGANIZATION
Brothers ORGANIZATION
. . .
New ORGANIZATION
York ORGANIZATION
Fed ORGANIZATION
, O
and O
Treasury TITLE
Secretary TITLE
Henry PERSON
M. PERSON
Paulson PERSON
Jr. PERSON
. O
more c2is2r3.prop
trainFile = c2is2r3.tsv
serializeTo = c2is2r3-ner-model.ser.gz
map = word=0,answer=1
useClassFeature=true
useWord=true
useNGrams=true
noMidNGrams=true
maxNGramLeng=6
usePrev=true
useNext=true
useSequences=true
usePrevSequences=true
maxLeft=1
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC
useDisjunctive=true
Here is the original sequence of commands:
java -cp stanford-ner-3.5.2.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop c2is2r3.prop
java -cp stanford-ner-3.5.2.jar -mx2g edu.stanford.nlp.ie.NERClassifierCombiner -ner.model c2is2r3-ner-model.ser.gz,classifiers/english.muc.7class.distsim.crf.ser.gz -ner.useSUTime false -ner.combinationMode HIGH_RECALL -serializeTo c2is2.serialized.ncc.ncc.ser.gz
java -cp stanford-ner-3.5.2.jar -mx1g edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier c2is2.serialized.ncc.ncc.ser.gz -textFile c2is2r3.txt
CRFClassifier invoked on Fri Jul 17 09:51:13 EDT 2015 with arguments:
-loadClassifier c2is2.serialized.ncc.ncc.ser.gz -textFile c2is2r3.txt
loadClassifier=c2is2.serialized.ncc.ncc.ser.gz
textFile=c2is2r3.txt
Loading classifier from /mnt/hgfs/share/nlp/stanford-ner-2015-04-20/c2is2.serialized.ncc.ncc.ser.gz ... Error deserializing /mnt/hgfs/share/nlp/stanford-ner-2015-04-20/c2is2.serialized.ncc.ncc.ser.gz
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: java.util.Properties cannot be cast to [Ledu.stanford.nlp.util.Index;
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1572)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1523)
at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:2987)
Caused by: java.lang.ClassCastException: java.util.Properties cannot be cast to [Ledu.stanford.nlp.util.Index;
at edu.stanford.nlp.ie.crf.CRFClassifier.loadClassifier(CRFClassifier.java:2613)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1451)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1558)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1569)
... 2 more
This is an attempt to use the NERClassifierCombiner instead:
java -cp stanford-ner-3.5.2.jar -mx1g edu.stanford.nlp.ie.NERClassifierCombiner -loadClassifier c2is2.serialized.ncc.ncc.ser.gz -testFile c2is2r3.txt
This is the error stack:
NERClassifierCombiner invoked on Fri Jul 17 10:11:17 EDT 2015 with arguments:
-loadClassifier c2is2.serialized.ncc.ncc.ser.gz -testFile c2is2r3.txt
testFile=c2is2r3.txt
loadClassifier=c2is2.serialized.ncc.ncc.ser.gz
testFile=c2is2r3.txt
ner.useSUTime=false
ner.model=c2is2r3-ner-model.ser.gz,classifiers/english.muc.7class.distsim.crf.ser.gz
serializeTo=c2is2.serialized.ncc.ncc.ser.gz
loadClassifier=c2is2.serialized.ncc.ncc.ser.gz
ner.combinationMode=HIGH_RECALL
loading CRF...
loading CRF...
Error on line 1: The fate of Lehman Brothers, the beleaguered investment bank, hung in the balance on Sunday as Federal Reserve officials and the leaders of major financial institutions continued to gather in emergency meetings trying to complete a plan to rescue the stricken bank. Several possible plans emerged from the talks, held at the Federal Reserve Bank of New York and led by Timothy R. Geithner, the president of the New York Fed, and Treasury Secretary Henry M. Paulson Jr.
Exception in thread "main" java.lang.UnsupportedOperationException: Argument array lengths differ: [word, tag, answer] vs. [The, fate, of, Lehman, Brothers,, the, beleaguered, investment, bank,, hung, in, the, balance, on, Sunday, as, Federal, Reserve, officials, and, the, leaders, of, major, financial, institutions, continued, to, gather, in, emergency, meetings, trying, to, complete, a, plan, to, rescue, the, stricken, bank., Several, possible, plans, emerged, from, the, talks,, held, at, the, Federal, Reserve, Bank, of, New, York, and, led, by, Timothy, R., Geithner,, the, president, of, the, New, York, Fed,, and, Treasury, Secretary, Henry, M., Paulson, Jr.]
at edu.stanford.nlp.ling.CoreLabel.initFromStrings(CoreLabel.java:153)
at edu.stanford.nlp.ling.CoreLabel.<init>(CoreLabel.java:133)
at edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter$ColumnDocParser.apply(ColumnDocumentReaderAndWriter.java:85)
at edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter$ColumnDocParser.apply(ColumnDocumentReaderAndWriter.java:60)
at edu.stanford.nlp.objectbank.DelimitRegExIterator.parseString(DelimitRegExIterator.java:67)
at edu.stanford.nlp.objectbank.DelimitRegExIterator.setNext(DelimitRegExIterator.java:60)
at edu.stanford.nlp.objectbank.DelimitRegExIterator.<init>(DelimitRegExIterator.java:54)
at edu.stanford.nlp.objectbank.DelimitRegExIterator$DelimitRegExIteratorFactory.getIterator(DelimitRegExIterator.java:122)
at edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter.getIterator(ColumnDocumentReaderAndWriter.java:54)
at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.setNextObject(ObjectBank.java:436)
at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.<init>(ObjectBank.java:415)
at edu.stanford.nlp.objectbank.ObjectBank.iterator(ObjectBank.java:253)
at edu.stanford.nlp.sequences.ObjectBankWrapper.iterator(ObjectBankWrapper.java:52)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1160)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1111)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1071)
at edu.stanford.nlp.ie.NERClassifierCombiner.main(NERClassifierCombiner.java:382)
So I'm not sure what to do next. Are there any other combinations I should try?
During the serialize step you are serializing with:
edu.stanford.nlp.ie.NERClassifierCombiner
During the load step you are loading with:
edu.stanford.nlp.ie.crf.CRFClassifier
So in the second command, use edu.stanford.nlp.ie.NERClassifierCombiner instead and the error should go away. You serialized an NERClassifierCombiner, but are trying to load it as a CRFClassifier. Please let me know if you have any other troubles!
The second file, c2is2r3.txt, needs to be converted into a TSV file first and then passed to your command.
You can just assign O to all the generated tokens (if you are not sure of the tags or want to save the time of tagging them manually) and then test with your model.
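As a rough illustration of that conversion, here is a short Python sketch that writes one token per line with a tab and the O tag. The regex tokenizer is a simplifying assumption of mine and will not split text exactly the way the Stanford tokenizer does (for example, it separates "Jr." into "Jr" and "."), so for real testing you would probably want tokens produced the same way as in your training file.

```python
import re


def text_to_tsv(text, tag="O"):
    """Return `token<TAB>tag` lines, one per token, using a naive tokenizer."""
    # Naive tokenization (an assumption): runs of word characters,
    # or any single character that is neither a word character nor whitespace.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return "".join(f"{token}\t{tag}\n" for token in tokens)


sample = "The fate of Lehman Brothers, the beleaguered investment bank."
print(text_to_tsv(sample), end="")
```

The output has the same two-column layout as the training file, which is what the column reader expects when it is given a test file.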
I am trying to parse a bunch of log data using Pig. Unfortunately the data for one command is spread across multiple lines (an audit log). I know that there is an id that correlates all of the log messages and that there are different types that contain pieces of the whole, but I am unsure how to gather them all into one message.
I split the messages by type and then joined on the id, but since there is a one-to-many relationship between SYSCALL and PATH, this doesn't gather all of the information on one line. I can group by id, but then I want to pull the same field (name) out of every PATH tuple, and I don't know of any way to do that.
Should I just write my own UDF? A FOREACH doesn't keep track of state, so I can't concatenate the name field from each tuple.
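To pin down what I am after, here is the gathering step sketched in Python instead of Pig. It groups audit messages by the id inside msg=audit(...) and concatenates the name field of every PATH record. The regular expressions are my reading of the audit format, and the second PATH record in the sample is made up purely for illustration.

```python
import re
from collections import defaultdict

AUDIT_ID = re.compile(r"msg=audit\(([\d.]+:\d+)\)")
REC_TYPE = re.compile(r"\btype=(\w+)")
PATH_NAME = re.compile(r'\bname="([^"]*)"')


def gather_path_names(messages):
    """Map each audit id to the comma-joined name fields of its PATH records."""
    names = defaultdict(list)
    for msg in messages:
        id_match = AUDIT_ID.search(msg)
        type_match = REC_TYPE.search(msg)
        if not id_match or not type_match:
            continue  # not an audit record in the expected shape
        if type_match.group(1) == "PATH":
            name_match = PATH_NAME.search(msg)
            if name_match:
                names[id_match.group(1)].append(name_match.group(1))
    return {audit_id: ",".join(values) for audit_id, values in names.items()}


messages = [
    'node=r01sv06 type=SYSCALL msg=audit(1389047402.069:4455727): syscall=59 items=2',
    'node=r01sv06 type=PATH msg=audit(1389047402.069:4455727): item=0 name="/sbin/ip"',
    # Hypothetical second PATH record, invented for this example only.
    'node=r01sv06 type=PATH msg=audit(1389047402.069:4455727): item=1 name="/lib64/ld-linux-x86-64.so.2"',
]
print(gather_path_names(messages))
```

What I am missing is the Pig equivalent of the per-id list that the loop builds up.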
Edited to add example:
{"message":"Jan 6 15:30:11 r01sv06 auditd: node=r01sv06 type=SYSCALL
msg=audit(1389047402.069:4455727): arch=c000003e syscall=59
success=yes exit=0 a0=7fff8ef30600 a1=7fff8ef30630 a2=270f950
a3=fffffffffffffff0 items=2 ppid=1493 pid=1685 auid=0 uid=0 gid=0
euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=8917
comm=\"ip\" exe=\"/sbin/ip\"
key=\"command\"","#timestamp":"2014-01-06T22:30:14.642Z","#version":"1","type":"audit","host":"r01sv09a","path":"/data/logs/audit.log","syslog_timestamp":"Jan
6 15:30:11","syslog_program":"auditd","received_at":"2014-01-06
22:30:14 UTC", "received_from":"r01sv06" ,"syslog_severity_code":5
,"syslog_facility_code":1
,"syslog_facility":"user-level","syslog_severity":"notice","#source_host":"r01sv06"}
{"message":"Jan 6 15:30:11 r01sv06 auditd: node=r01sv06 type=EXECVE
msg=audit(1389047402.069:4455727): argc=4 a0=\"/sbin/ip\" a1=\"link\"
a2=\"show\"
a3=\"lo\"","#timestamp":"2014-01-06T22:30:14.643Z","#version":"1","type":"audit","host":"r01sv09a","path":"/data/logs/audit.log","syslog_timestamp":"Jan
6 15:30:11","syslog_program":"auditd","received_at":"2014-01-06
22:30:14 UTC", "received_from":"r01sv06", "syslog_severity_code":5,
"syslog_facility_code":1,"syslog_facility":"user-level",
"syslog_severity":"notice","#source_host":"r01sv06"}
{"message":"Jan 6 15:30:11 r01sv06 auditd: node=r01sv06 type=CWD
msg=audit(1389047402.069:4455727):
cwd=\"/root\"","#timestamp":"2014-01-06T22:30:14.644Z","#version":"1","type":"audit","host":"r01sv09a","path":"/data/logs/audit.log","syslog_timestamp":"Jan
6 15:30:11","syslog_program":"auditd","received_at":"2014-01-06
22:30:14 UTC","received_from":"r01sv06", "syslog_severity_code":5,
"syslog_facility_code":1, "syslog_facility":"user-level",
"syslog_severity":"notice", "#source_host":"r01sv06"}
{"message":"Jan 6 15:30:11 r01sv06 auditd: node=r01sv06 type=PATH
msg=audit(1389047402.069:4455727): item=0 name=\"/sbin/ip\"
inode=1703996 dev=08:02 mode=0100755 ouid=0 ogid=0
rdev=00:00","#timestamp":"2014-01-06T22:30:14.645Z","#version":"1","type":"audit","host":"r01sv09a","path":"/data/logs/audit.log","syslog_timestamp":"Jan
6 15:30:11","syslog_program":"auditd","received_at":"2014-01-06
22:30:14 UTC", "received_from":"r01sv06", "syslog_severity_code":5,
"syslog_facility_code":1,"syslog_facility":"user-level",
"syslog_severity":"notice", "#source_host":"r01sv06",}
I have a variable address which for now is a long string containing some unnecessary info, e.g.: "Aboriginal Relations 11th Floor Commerce Place 10155 102 Street Edmonton AB T5J 4G8 Phone 780 427-9658 Fax 780 644-4939 Email gerry.kushlyk#gov.ab.ca"
Aboriginal Relations is in a variable called title, and I'm trying to call address.gsub!(title,''), but it's returning the original string.
I've also tried address.gsub!(/#{title}/,'') and address.gsub!("#{title}",'') but those won't work either. Any ideas?
Sorry, the typo occurred when I typed it into Stack Overflow. Here's the code and the output, copied and pasted:
(this is within a loop, so there will be multiple outputs)
p title
address.gsub!(title,'')
p address
Output:
"Aboriginal Relations "
"Aboriginal Relations 11th Floor Commerce Place 10155 102 Street Edmonton AB T5J 4G8 Phone 780 427-9658 Fax 780 644-4939 Email gerry.kushlyk#gov.ab.ca"
"Aboriginal Tourism Advisory Council "
"Aboriginal Tourism Advisory Council 5th Floor Terrace Building 9515 107 Street Edmonton AB T5K 2C3 Phone 780 427-9687 Fax 780 422-7235 Email foip.fintprccs#gov.ab.ca"
"Acadia Foundation "
"Acadia Foundation PO Box 96 Oyen AB T0J 2J0 Phone 403 664-3384 Fax 403 664-3316 Email acadiafoundation#telus.net"
"Access Advisory Council "
"Access Advisory Council 12th Floor Centre West Building 10035 108 Street Edmonton AB T5J 3E1 Phone 780 427-2805 Fax 780 422-3204 Email barb.joyner#gov.ab.ca"
"ACCM Benevolent Association "
"ACCM Benevolent Association Suite 100 9403 95 Avenue Edmonton AB T6C 4M7 Phone 780 468-4648 Fax 780 468-4648 Email accmmanor#shaw.ca"
"Acme Municipal Library "
"Acme Municipal Library PO Box 326 Acme AB T0M 0A0 Phone 403 546-3845 Fax 403 546-2248 Email aamlibrary#marigold.ab.ca"
Likewise, if I try address.match(/#{title}/) I get nil.
I'm assuming you're using Ruby 1.9 or higher.
It's possible that the trailing whitespace is a non-breaking space:
p "Relations\u00a0" # looks like a trailing space, but strip won't remove it
To get rid of it:
"Relations\u00a0".gsub(/\A\u00a0+|\u00a0+\z/, '') # => "Relations"
A more generic solution for all Unicode whitespace:
"Relations\u00a0".gsub(/\A[[:space:]]+|[[:space:]]+\z/, '') # => "Relations"
To see what the character is in your case:
title[-1].ord # => 160 (example only)
'%x' % title[-1].ord # => "a0" (hex equivalent; example only)
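The same diagnosis can be reproduced in Python, which may help if you want to check the data outside Ruby. This is only a sketch: the trailing U+00A0 in title is the assumption being tested, and normalize_spaces and remove_title are names I made up. One difference to keep in mind is that Python's own str.strip() does treat U+00A0 as whitespace, so the failing strip is specific to Ruby.

```python
# Assumption: title ends in a no-break space (U+00A0) rather than a plain space.
title = "Aboriginal Relations\u00a0"
address = "Aboriginal Relations 11th Floor Commerce Place"


def normalize_spaces(text):
    """Replace every no-break space with an ordinary space."""
    return text.replace("\u00a0", " ")


def remove_title(address, title):
    """Remove the first occurrence of title from address after normalizing both."""
    return normalize_spaces(address).replace(normalize_spaces(title), "", 1)


print(title in address)              # the raw title is not a substring of the address
print(ord(title[-1]))                # code point of the last character
print(remove_title(address, title))  # address with the title removed
```

Normalizing both strings before the substitution avoids having to guess which of the two carries the odd character.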
title = title[0..-2] seemed to solve it. For some reason strip and chomp wouldn't work.