Apache NiFi: can I collect an attribute from multiple flow files?

I have a NiFi flow that takes in .csv files and partitions each one into multiple records, with each CSV column value added as an attribute.
At one point in the flow, I'd like to collect the value of one attribute from each record that passes through. There could be anywhere from 0 to n collected. Once I have the list, it will be emailed out.
I'm trying to avoid myself (or someone else) getting bombarded with emails if there are 200+ bad records in a file. So if I could collect for a fixed period of time, or until another attribute (filename) changes, that would be great.
I've tried MergeContent and MergeRecord. I even tried ReplaceText to replace the content with just the attribute value I want to save and then merging those, and a slew of other things.
Is there a simple way to do this in NiFi?

Have you tried UpdateAttribute with a new attribute that acts as a collected list? As each flowfile passes through the processor, you could keep updating the value of this attribute by appending the new value to it.
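One way to make that work, sketched here as an assumption rather than something stated above, is UpdateAttribute's stateful mode: set the processor's Store State property to "Store state locally", which makes the getStateValue() expression-language function available so a value can survive across flowfiles. The attribute myColumn below is a hypothetical stand-in for the CSV column you want to collect; add a dynamic property such as:

collected.list = ${getStateValue('collected.list')},${myColumn}

Each flowfile that passes through appends its myColumn value to the running list (the first entry carries a leading comma you would trim downstream), and the accumulated value can be read from the last flowfile before emailing.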
However, as @daggett pointed out, it will be helpful if you can provide the input and expected output.

Related

Extract data from webpage to Excel

I tried to automate this portal, but I'm having trouble since I'm new to UiPath.
This is a URL
I have to extract CompanyName, BrokerName, Address, and Phone into Excel for a number of records, as per user input.
Since the client data is in one element and separated by line breaks (br), I would suggest still using the Data Scraping feature (pick the first and second data set group) and pulling the data set in as-is, so it's in block format separated by new lines.
Then iterate through the results, split each block into a string array, iterate through the array, and evaluate each line using a regex. If a line matches an address, an email, a phone number, etc., handle it from there: you could dump the matches into a temp data table and then write that table to Excel.
Granted, your regular expressions might need some tuning and might miss a few records, but it would be a good start.
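Here is a minimal sketch of the split-and-classify idea in VB.NET; the sample block and the patterns are assumptions you would tune to the portal's actual data:

Imports System.Text.RegularExpressions

Module BlockClassifier
    Sub Main()
        ' One scraped block: company, broker, address, phone separated by line breaks.
        Dim block As String = "Acme Insurance" & vbLf & "John Smith" & vbLf & "123 Main St, Springfield" & vbLf & "(555) 123-4567"
        For Each line As String In block.Split(ControlChars.Lf)
            If Regex.IsMatch(line, "\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}") Then
                Console.WriteLine("Phone:   " & line)   ' looks like a phone number
            ElseIf Regex.IsMatch(line, "^\d+\s+\w+") Then
                Console.WriteLine("Address: " & line)   ' starts with a street number
            Else
                Console.WriteLine("Name:    " & line)   ' fall back to name/company
            End If
        Next
    End Sub
End Module

In the real workflow you would add the matched values as rows to a temporary DataTable instead of writing to the console, then dump that table to Excel with a Write Range.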
Hope that helps get you started

How to read an excel sheet and put the cell value within different text fields through UiPath?

I have an Excel sheet holding the values I want to type. I have read the Excel contents, and to iterate over the contents later I have stored them using an Output Data Table activity, configured as follows:
Read Range - Output:
DataTable: CVdatatable
Output Data Table:
DataTable: CVdatatable
Text: opCVdatatable
Finally, I want to read the text opCVdatatable in an iteration and write the values into text fields. So in the desired input fields I entered opCVdatatable, or opCVdatatable + "[k(enter)]" as required.
But UiPath seems to start from the beginning of the Output Data Table whenever I call opCVdatatable. In short, each desired input field is iteratively filled with all of the data stored in the Output Data Table.
Can someone help me out please?
My first recommendation is to use the Workbook Read Range activity to read data from Excel, because it is quicker, works in the background, and does not require Excel to be installed on the system.
Start your sequence with that activity; note that the Add Headers property is not checked.
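A sketch of how that activity might be configured (the path and sheet name are assumptions):

Workbook Read Range:
WorkbookPath: "CV.xlsx"
SheetName: "Sheet1"
Range: "" (empty reads the whole sheet)
AddHeaders: unchecked
DataTable (output): CVdatatable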
You do not need to use Output Data Table, because that activity outputs a single string containing all row items. What you want to do instead is access the items in the data table and output each one as a string in your Type Into activity, e.g., CVDatatable.Rows(0).Item(0).ToString.
You mention you want to read the text opCVdatatable in an iteration and write the values into text fields. This is a little more complex, but I'll give you an example: use a For Each Row activity and loop through each row in CVDatatable, setting the Index property if required.
The challenge is to get the selector correct here and make it dynamic, so that it targets a different text field per iteration. The selector for the Type Into activity will depend on the system you are targeting; a sketch of a dynamic selector is shown below.
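For illustration, here is a sketch of such a dynamic selector, assuming a counter variable rowIndex that you increment each iteration and a hypothetical target application whose text fields are titled Field 0, Field 1, and so on:

"<wnd app='target.exe' cls='Edit' title='Field " + rowIndex.ToString + "' />"

The Text property of the Type Into would then be something like row(0).ToString, so each iteration types the next row's value into the next field.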
Hope this helps.
Chris
Here's a different, more general approach: instead of including the target in the process itself, the Excel sheet is modified to include parts of a selector.
Note that column B now contains an identifier, and this ID depends on the application you will be working with. For example, in my sample app the first text box has an id of 585, the second one 586, and so on (note that you can work with any kind of identifier, including the control's name if it is exposed to UiPath).
Now, instead of adding multiple Type Into activities to your workflow, you would add just a single one, loop over each of the data table's rows, and create a dynamic selector.
In my case the selector for the Type Into activity looks as follows:
"<wnd cls='#32770' title='General' /><wnd ctrlid='" + row(1).ToString() + "' />"
This will allow you to maintain the process from the Excel sheet alone - if there's a new field that needs to be mapped, just add it to your sheet. No changes to the Workflow are required.

JMeter - Need to submit POST data but I only wish to modify a single field

I'm running some JMeter tests for editing a field. If I use the JMeter HTTP(S) Test Script Recorder, I can get an accurate representation of the page and edits I made.
It creates an HTTP POST request with a parameter for every field, checkbox, and dropdown on the page. I only really care about modifying ONE of them.
My problem is I can't just remove all the other parameters from the POST data because the page interprets this as if I removed all of them from the page (and then complains that there's missing data). So I'm left with trying to obtain the current values for the remaining editable fields and checkboxes so that I can re-submit them when I only want to modify a single field.
For an example, imagine I'm submitting some user data with fields for Name, Email and Address. I want to change the name by adding a 1 to the end of it and leave the other two fields as they are.
My thoughts for accomplishing this:
1) Use XPath to try to get the values shown on the page, store them all in variables and re-submit them in the post request. This is messy and also very difficult as the page is shown in a pop-up window, adding to the complexity of it.
2) Query the database for all the information and re-submit it. Seems like a lot of overhead, plus the data isn't freely available .. I'd rather not have to try to do this.
3) Use some other element of JMeter I'm not aware of to obtain the specific element data from the page. Maybe some listener I haven't figured out yet? If I could pull the parameters from the page and save them, that would be VERY convenient.
4) Somehow submit a POST request with only one field, specifying that I do not wish to clear out the remaining fields, I just want to leave them alone. I will freely admit that I am not super familiar with web applications, so there may be a very obvious reason why this can't be done (or it may depend on how the back-end of the application handles it).
Thoughts?
From the whole post, I understand that you want to parameterise a field, so that a different value is passed each time.
If my understanding is correct, the answer lies in CSV Dataset Config, where you can pass the values from a CSV file.
From your Example:
For an example, imagine I'm submitting some user data with fields for
Name, Email and Address. I want to change the name by adding a 1 to
the end of it and leave the other two fields as they are
To achieve this, follow these steps:
1) Create a CSV file and fill in the names as follows:
names
name1
name2
name3
name4
(names is the column header; the remaining lines are values.)
2) Add a CSV Data Set Config to your Test Plan (see the sketch after these steps).
3) Specify the file path.
4) Replace the value of the name field in the HTTP POST request with ${names}. That's it.
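A sketch of the relevant CSV Data Set Config fields (the file name and location are assumptions):

Filename: names.csv
Variable Names: (left empty, so the header row "names" becomes the variable name)
Delimiter: ,
Recycle on EOF?: True
Sharing mode: All threads

Each iteration then picks up the next line of the file, so ${names} resolves to name1, name2, and so on.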

Identify if objects are the same

I want to check certain objects against each other and identify if they are the same.
For example, I need to verify that the total cost on one page is the same as on another page. I developed a script that works; however, the total cost changes every day, so I have to update the object properties in maintenance mode every day.
Is there a way to make UFT automatically recognize that this object changes and update it?
Could you elaborate on your question? For now, you can use .* if certain property values of the object change. Alternatively, you can store the values in an Excel sheet and change them every day depending on the requirement.
If this is not helpful, let me know.
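For instance, a sketch of the .* idea using descriptive programming, with hypothetical object properties:

totalText = Browser("Shop").Page("Cart").WebElement("text:=Total: .*").GetROProperty("innertext")

Because property values in descriptive programming are treated as regular expressions, the object is still found when the amount after "Total:" changes daily.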
It sounds like you actually want to compare the values shown in two different objects, and see if those values are the same. (I assume this because you say they are on two different pages)
Also, you mention maintenance mode, so I assume you are using checkpoints to store their expected values.
I would suggest: instead of storing the expected values in a checkpoint, you could read the value of the first object (GetROProperty), store it in a variable (DataTable field, environment variable, etc.), then navigate to the other page, read the same property from the other object, and compare the two.
i.e.
If {browser.page.object...}.GetROProperty("{whateverPropertyYouNeed}") = Environment("{storedFirstValue}") Then
    Reporter.ReportEvent micPass, "compare step", "{details here}"
End If
*replace the parts inside {} with your own code; I don't know what they are
If you need to actually store the total cost externally, you could use a DataTable field and export the sheet at the end of the run, then import the same sheet at the beginning of the next one. That would save the data to an Excel sheet on a drive.
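A minimal sketch of that round trip (the path, sheet, and object names are assumptions):

' End of today's run: capture the cost and persist the sheet
DataTable("TotalCost", dtGlobalSheet) = Browser("Shop").Page("Cart").WebElement("Total").GetROProperty("innertext")
DataTable.Export "C:\uft\totals.xls"

' Start of the next run: load the saved sheet for comparison
DataTable.Import "C:\uft\totals.xls"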

How do I validate data in a file in SSIS before inserting into a database?

What I want to do is take data from a dbf file and insert it into a table, which I've already done. Since there are many files, a For-Each Container is being used. However, before inserting into the table, I want to look at the date fields and compare them to a date variable. If the dates match the variable, the flow moves on to the next step. But if any of the dates don't match the variable, that file and its contents are discarded and the next file is looked at.
How do I accomplish this in SSIS?
You're looking for the Conditional Split Component within your Data Flow Task.
Assuming your source column is MyDate and you have an SSIS variable called @[User::ReferenceDate], you'd apply an expression like
[MyDate] == @[User::ReferenceDate]
That will evaluate to True when the dates match, False otherwise.
In your Conditional Split, add a row to the component:
Output Name: DatesMatched
Condition: [MyDate] == @[User::ReferenceDate]
Default output name: DatesUnmatched
Now when you connect the output from this to your destination, it'll ask whether you want to route the data using the DatesMatched or DatesUnmatched path. Use the DatesMatched path.
As I re-read this, "if any of the dates don't match the variable, then that file and its contents are discarded" means you're looking at processing the file twice: the first pass reads it all in and validates it, and the second, optional pass actually loads it to the database.
From your Conditional Split, add a RowCount to the DatesUnmatched path. Use a Variable of type Integer/Int32 named CountDatesUnmatched. In a perfect world, that will be zero when the validation of the file completes.
In the Precedence Constraint between the validation Data Flow and the actual import Data Flow, double-click the connector line and change the evaluation criteria from Constraint to Expression and Constraint. Leave the value as Success, and in the Expression use @[User::CountDatesUnmatched] == 0. That data flow will only light up if both conditions are true: parsing was successful and no rows were sent to the Row Count component.
Finally, you can cheat, and sometimes this approach makes sense. If you're using an OLE DB Destination, you can keep the default MaximumInsertCommitSize of 2 billion along with a data access mode of fast load. This translates to "everything is going to commit or none of it is". That can lock up your target table and cause your transaction log to grow heavily, depending on how much data you're loading. Use the Conditional Split as described above, but for the DatesUnmatched path, induce a failure: a Derived Column with a divide-by-zero, or a Script Component with an explicit FireError event, will cause that transaction to go belly up. You'd need to do some magic in the OnError event handler to avoid aborting the overall file processing, but it's a lazy hack (or one that is useful when double-reading the file is prohibitive and impacting the database is less so).
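For the Script Component route, a minimal C# sketch of a transformation attached to the DatesUnmatched path; the buffer class name Input0Buffer is whatever SSIS generates for your component:

// Fails the Data Flow Task as soon as any row arrives on the DatesUnmatched path.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    bool cancel;
    ComponentMetaData.FireError(0, "Date validation",
        "A row's MyDate did not match the reference date; failing the load.",
        string.Empty, 0, out cancel);
}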
