Getting a field value from pipe in outside the pipe in Hadoop Cascading - hadoop

Regarding above subject, is there any way to get the value of a field from a pipe. And use that value outside the pipe's scope in Hadoop Cascading? The data has delimiter as '|':
first_name|description
Binod|nothing
Rohit|nothing
Ramesh|abc
From above pipe I need to get a value from the description, whatever that is 'nothing' or 'abc'.

Hadoop Cascading is developed with a concept of creating real case scenario by flowing data between pipe and executing parallely it over Map-Reduce Hadoop system.
Execution of java program is unnecessary to depend with rest of the cascading flow (from creating source tap to sink tap), and what Hadoop Cascading does is: it executes those two different processes in different independent JVM instances and they will be unable to share their values.
Following code and its output shows brief hints:
System.out.println("Before Debugging");
m_eligPipe = new Each(m_eligPipe, new Fields("first_name"), new Debug("On Middle", true));
System.out.println("After Debugging");
Expected ouput:
Before Debugging
On Middle: ['first_name']
On Middle: ['Binod']
On Middle: ['Rohit']
On Middle: ['Ramesh']
After Debugging
Actual output:
Before Debugging
After Debugging
...
...
On Middle: ['first_name']
On Middle: ['Binod']
On Middle: ['Rohit']
On Middle: ['Ramesh']

I don't understand what you are trying to say. Do you to mean to extract the value of field ${description} outside the scope of the pipe. If possible something like this in pseudo code.
str = get value of description in inputPipe (which is in the scope of the job rather than function or buffer)

I assume this is what you want: you have a pipe with one field, that is the concatenation of ${first_name} and ${description}. And you want the output to be a pipe with field that is ${description}.
If so, this is what I'd do: implement a function that extracts description and have your flow execute it.
You function (let's call it ExtractDescriptionFunction) should override method operate with something like this:
#Override
public void operate(FlowProcess flowProcess, FunctionCall<Tuple> functionCall) {
TupleEntry arguments = functionCall.getArguments();
String concatenation = arguments.getString("$input_field_name");
String[] values = concatenation.split("\\|"); // you might want to have some data sanity check here
String description = values[1];
Tuple tuple = functionCall.getContext();
tuple.set(0, description);
functionCall.getOutputCollector().add(tuple);
}
Then, in your flow definition, add this:
Pipe outputPipe = new Each(inputPipe, new ExtractDescriptionFunction());
Hope this helps.

Related

vars.put function not writing the desired value into the jmeter parameter

Below is the code which i have been trying to address the below UseCase in JMETER.Quick help is appreciated.
Usecase:
A particular text like "History" in a page response needs to be validated and the if the text counts is more than 50 a random selection of the options within the page needs to be made.And if the text counts is less than 50 1st option needs to be selected.
I am new to Jmeter and trying to solve this usingJSR223 POST processor but somehow stuck at vars.put function where i am unable to see the desired number being populated within the V paramter.
Using a boundary extractor where match no 1 should suffice the 1st selection and 0 should suffice the random selection.
def TotalInstanceAvailable = vars.get("sCount_matchNr").toInteger()
log.info("Total Instance Available = ${TotalInstanceAvailable}");
def boundary_analyzer =50;
def DesiredNumber,V
if (TotalInstanceAvailable < boundary_analyzer)
{
log.info("I am inside the loop")
DesiredNumber = 0;
log.info("DesiredNumber= ${DesiredNumber}");
vars.put("V", DesiredNumber)
log.info("v= ${V}");
}
else{
DesiredNumber=1;
log.info("DesiredNumber=${DesiredNumber}");
vars.put("V", "DesiredNumber")
log.info("v= ${V}");
}
def sCount = vars.get("sCount")
log.info("Text matching number is ${sCount_matchNr}")
You cannot store an integer in JMeter Variables using vars.put() function, you either need to cast it to String first, to wit change this line:
vars.put("V", DesiredNumber)
to this one
vars.put("V", DesiredNumber as String)
alternatively you can use vars.putObject() function which can store literally everything however you will be able to use the value only in JSR223 Elements by calling vars.getObject()
Whenever you face a problem with your JMeter script get used to look at jmeter.log file or toggle Log Viewer window - in absolute majority of cases you will find the root cause of your problem in the log file:

How to get value from a column referenced by a number, from JDBC Response object of Jmeter?

I know they advice to get a cell value this way:
columnValue = vars.getObject("resultObject").get(0).get("Column Name");
as stated on jMeter doc : component reference : JDBC_Request.
But: How to access the same RS cell value by just a number of the column?
RS.get(0).get(4);
...instead of giving it a String of column Name/Label.
edit 1: Lets use Groovy/Java, instead of BeanShell. Thanks.
edit 2: The original motivation was the difference between column Name / Label, as these seem to be not fully guaranteed (? seems to be not clear here, not to me), especially due case-sensitivity ("id"/"ID", "name"/"Name"/"NAME" ..)
It should be something like:
String value = (new ArrayList<String>(vars.getObject("resultObject").get(0).values())).get(4)
More information: Debugging JDBC Sampler Results in JMeter
Be aware that according to HashMap documentation:
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
So the order of columns might be a big question mark.
The row itself is a HashMap, defined in source code as:
HashMap<String, Object> row
So using BeanShell syntax, you could get it as
row = vars.getObject("resultObject").get(0); // returns HashMap
In HashMap, you cannot access item (column) by ID. You could, however, apply one of the methods described here, but HashMap doesn't guarantee order, so you cannot be sure what "column 4" will contain.
If you want to be able to loop through all columns, it's better to do it in a Map style, not by index. For example using entrySet() with BeanShell:
for(Map.Entry entry : row.entrySet())
{
log.info(entry.getKey() + "=" + entry.getValue());
}
See various ways to iterate through Map here.

Pyspark Streaming transform error

Hi I am new to Pyspark Streaming.
numbers0 = sc.parallelize([1,2,3,4,5])
numbers1 = sc.parallelize([2,3,4,5,6])
numbers2 = sc.parallelize([3,4,5,6,7])
stream0 = ssc.queueStream([numbers0, numbers1, numbers2])
stream0.pprint()
ssc.start()
ssc.awaitTermination(20)
ssc.stop()
This works fine but as soon as I do the following I get an error:
stream1 = stream0.transform(lambda x: x.mean())
stream1.pprint()
ssc.start()
ssc.awaitTermination(20)
ssc.stop()
What I want is stream that only consists of the mean of my previous stream.
Does anyone know what I must do?
The error you are getting when calling transform is because it requires an RDD-to-RDD function as stated in Spark's documentation for the transform operation. When mean is called on an RDD, it does not return a new RDD and therefore the error.
Now, from what I understand you want to calculate the mean of each RDD that consists the DStream. The DStream is created with the queueStream and since the named parameter oneAtATime is left to default, your program will consume one RDD at every batch interval.
To calculate the mean for each RDD, you would normally do this inside a forEachRDD output operation like this
# Create stream0 as you do in your example
def calculate_mean(rdd):
mean_value = rdd.mean()
# do other stuff with mean_value like saving it to a database or just print it
stream0.forEachRDD(calculate_mean)
# Start and stop the Streaming Context

How to WRITE a structure?

How can I do the following:
data: ls_header type BAPIMEPOHEADER.
" fill it
write ls_header.
currently I'm getting an error because write can not parse the complex type to a string. Is there a simple way to get this code running in abap?
You could use something like:
DATA: g_struct TYPE bapimepoheader.
DO.
ASSIGN COMPONENT sy-index OF STRUCTURE g_struct TO FIELD-SYMBOL(<f>).
IF sy-subrc NE 0.
EXIT.
ENDIF.
WRITE: / <f>.
ENDDO.
Perhaps not exactly the answer you expect: If you list each field.
This can be done quite easy via the Pattern-mask in SE38:
Select the Write-pattern:
Enter the structure you want:
Select the fields
Confirm with "Copy"
Confirm and you get
WRITE: bapimepoheader-po_number,
bapimepoheader-comp_code,
bapimepoheader-doc_type,
bapimepoheader-delete_ind,
bapimepoheader-status,
bapimepoheader-creat_date,
bapimepoheader-created_by,
bapimepoheader-item_intvl,
bapimepoheader-vendor,
bapimepoheader-langu,
bapimepoheader-langu_iso,
bapimepoheader-pmnttrms,
bapimepoheader-dscnt1_to,
bapimepoheader-dscnt2_to,
bapimepoheader-dscnt3_to,
bapimepoheader-dsct_pct1,
bapimepoheader-dsct_pct2,
bapimepoheader-purch_org,
bapimepoheader-pur_group,
bapimepoheader-currency,
bapimepoheader-currency_iso,
bapimepoheader-exch_rate,
bapimepoheader-ex_rate_fx,
bapimepoheader-doc_date,
bapimepoheader-vper_start,
bapimepoheader-vper_end,
bapimepoheader-warranty,
bapimepoheader-quotation,
bapimepoheader-quot_date,
bapimepoheader-ref_1,
bapimepoheader-sales_pers,
bapimepoheader-telephone,
bapimepoheader-suppl_vend,
bapimepoheader-customer,
bapimepoheader-agreement,
bapimepoheader-gr_message,
bapimepoheader-suppl_plnt,
bapimepoheader-incoterms1,
bapimepoheader-incoterms2,
bapimepoheader-collect_no,
bapimepoheader-diff_inv,
bapimepoheader-our_ref,
bapimepoheader-logsystem,
bapimepoheader-subitemint,
bapimepoheader-po_rel_ind,
bapimepoheader-rel_status,
bapimepoheader-vat_cntry,
bapimepoheader-vat_cntry_iso,
bapimepoheader-reason_cancel,
bapimepoheader-reason_code,
bapimepoheader-retention_type,
bapimepoheader-retention_percentage,
bapimepoheader-downpay_type,
bapimepoheader-downpay_amount,
bapimepoheader-downpay_percent,
bapimepoheader-downpay_duedate,
bapimepoheader-memory,
bapimepoheader-memorytype,
bapimepoheader-shiptype,
bapimepoheader-handoverloc,
bapimepoheader-shipcond,
bapimepoheader-incotermsv,
bapimepoheader-incoterms2l,
bapimepoheader-incoterms3l.
Now you can make a simple replacement of bapimepoheader with ls_header and you have an output of all fields of the structure.
Maybe this is not elegant and you must adapt your report, if the structure changes. But I like this way, because often I don't need all fields and I can select the fields in an easy way.
I know two ways, one is procedural, the other is oop.
Here is the procedural approach.
Select the structure's fields (or whatever else You might need ) from the data-dictionary table DD03L into a local internal table.
Loop over the table into a work-area
Check, whether current field is a flat single datatype, and if so,
Assign component workarea-fieldname of structure ls_header into anyfieldsymbol
Write anyfieldsymbol
Do You need the code ?
Class CL_ABAP_CONTAINER_UTILITIES was specially introduced for that by SAP.
Use FILL_CONTAINER_C method for output the structure in a WRITE manner:
DATA: ls_header type BAPIMEPOHEADER.
CALL METHOD CL_ABAP_CONTAINER_UTILITIES=>FILL_CONTAINER_C
EXPORTING
IM_VALUE = ls_header
IMPORTING
EX_CONTAINER = DATA(container)
EXCEPTIONS
ILLEGAL_PARAMETER_TYPE = 1
others = 2.
WRITE container.
You can write your structure to a string and then output the string. Same method idoc segments are created.

How to use "Result Variable Name" in JDBC Request object of Jmeter

In JMeter I added the configuration for oracle server. Then I added a JDBC request object and put the ResultSet variable name to status.
The test executes fine and result is displayed in treeview listener.
I want to use the variable status and compare it with string but jmeter is throwing error about casting arraylist to string.
How to retrieve this variable and compare with string in While Controller?
Just used some time to figure this out and think the accepted answer is slightly incorrect as the JDBC request sampler has two types of result variables.
The ones you specify in the Variable names box map to individual columns returned by your query and these you can access by saying columnVariable_{index}.
The one you specify in the Result variable name contains the entire result set and in practice this is a list of maps to values. The above syntax will obviously not work in this case.
The ResultSet variable returned with JDBC request in JMeter are in the for of array. So if you want to use variable status, you will have to use it with index. If you want to use the first(or only) record user status_1. So you need to use it like status_{index}.
String host = vars.getObject("status").get(0).get("option_value");
print(host);
log.info("----- " + host);
Form complete infromation read the "yellow box" in this link:
http://jmeter.apache.org/usermanual/component_reference.html#JDBC_Request
Other util example:
http://jmeter.apache.org/usermanual/build-db-test-plan.html
You can use Beanshell/Groovy (same code works) in JSR233 PostProcessor to work with “Result Variable Name” from JDBC Request like this:
ArrayList results = vars.getObject("status");
for (HashMap row: results){
Iterator it = row.entrySet().iterator();
while (it.hasNext()){
Map.Entry pair = (Map.Entry)it.next();
log.info(pair.getKey() + "=" + pair.getValue());
}
}
Instead of output to log replace with adding to string with delimiters of your choice.

Resources