Can We generate 2 pairs of (key, value) in one map function? If yes how? - hadoop

I have a dataset of userID and a post related to each UserID.
I want to count the number of posts by each user. I also want to put all the posts of each userID together (concat all the posts with some separation).
Any suggestions how to go about it?

IMHO, you can do this with one mapper and one reducer.
Mapper:
class PostMapper extends Mapper<Object, Text, Text, Text>
map() writes a key, the UserID (Text), and a value, the post (Text), to the Context.
Reducer:
class PostReducer extends Reducer<Text, Text, Text, Text>
reduce() loops over the Iterable of values with (i) a counter that is incremented for every fetched post and (ii) a Text variable that concatenates every fetched post with a suitable delimiter.
After the loop completes, write the key (the UserID) and the value (the concatenated Text) to the reducer's context.
After the job has run successfully, the output file contains the UserID and the concatenated posts, separated by a tab.
Note: Remove all tab characters from the posts before you concatenate them. If you also want the count in the output, prefix the concatenated posts with the count followed by a tab.
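The count-and-concatenate logic described above can be tried out without a cluster. What follows is only a minimal sketch in plain Java, not Hadoop code: the class name PostAggregationSketch, the " | " delimiter and the sample records are assumptions made for illustration, and groupByUser() merely stands in for the grouping that the framework does between map() and reduce().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for the mapper/reducer pair; no Hadoop classes are used. */
final class PostAggregationSketch {

    // Hypothetical delimiter placed between concatenated posts.
    static final String DELIMITER = " | ";

    /** Stand-in for map + shuffle: group each (userId, post) record by userId. */
    static Map<String, List<String>> groupByUser(List<String[]> records) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] record : records) {
            grouped.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }
        return grouped;
    }

    /** Stand-in for reduce(): count the posts and join them as "count<TAB>posts". */
    static String reduce(Iterable<String> posts) {
        int count = 0;
        StringBuilder joined = new StringBuilder();
        for (String post : posts) {
            if (count > 0) {
                joined.append(DELIMITER);
            }
            // Replace tabs so they cannot be confused with the field separator.
            joined.append(post.replace('\t', ' '));
            count++;
        }
        return count + "\t" + joined;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[] {"u1", "first post"},
                new String[] {"u2", "hello"},
                new String[] {"u1", "second\tpost"});
        // One output line per user: userId<TAB>count<TAB>concatenated posts.
        groupByUser(records).forEach(
                (user, posts) -> System.out.println(user + "\t" + reduce(posts)));
    }
}
```

In a real job the body of reduce() would sit inside PostReducer.reduce() and the result would be written to the context as a Text value.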

Your key in the key/value pair would be the userId. The value would be a list of strings (the messages). Most lists have a count property.
The information you are looking for would be accessed something like this:
var userId = 39;
Get the first message of user 39: userMessages[userId][0].
Get the number of messages posted by user 39: userMessages[userId].Count()


Cypress - counting number of elements in an array that contain a specific string

I am trying to confirm that, of all the schemas in the head of a page, exactly 3 contain a specific string. These schemas have no tags or subclasses to differentiate them from each other, only the text within them. I can confirm that the text exists within any of the schemas:
cy.get('head > script[type="application/ld+json"]').should('contain', '"@type":"Product"')
But what I need is to confirm that the string exists 3 times, something like this:
cy.get('head > script[type="application/ld+json"]').contains('"@type":"Product"').should('have.length', 3)
I can't find a way to get this to work, since .filter, .find, .contains, etc. don't filter down the way I need them to. Any suggestions? At this point it seems like I either need to import a custom library or get someone to add ids to these specific schemas. Thanks!
The first thing to note is that .contains() always yields a single result, even when many elements match.
It's not very explicit in the docs, but this is what they say:
Yields
.contains() yields the new DOM element it found.
If you run
cy.get('head > script[type="application/ld+json"]')
  .contains('"@type":"Product"')
  .then(console.log) // logs an object with length: 1
and open up the logged object in devtools, you'll see length: 1, but if you remove the .contains('"@type":"Product"') the log will show a higher length.
You can avoid this by using the jQuery :contains() selector
cy.get('script[type="application/ld+json"]:contains("@type\": \"Product")')
  .then(console.log) // logs an object with length: 3
  .should('have.length', 3);
Note the inner parts of the search string have escape chars (\) for quote marks that are part of the search string.
If you want to avoid escape chars, use a bit of JavaScript inside a .then() to filter
cy.get('script[type="application/ld+json"]')
  .then($els => $els.filter((index, el) => el.innerText.includes('"@type": "Product"')) )
  .then(console.log) // logs an object with length: 3
  .should('have.length', 3);

How to get value from a column referenced by a number, from JDBC Response object of Jmeter?

I know the advice is to get a cell value this way:
columnValue = vars.getObject("resultObject").get(0).get("Column Name");
as stated in the JMeter documentation (Component Reference, JDBC Request).
But how can I access the same result set cell value by just the column number?
RS.get(0).get(4);
...instead of passing the column name/label as a String.
Edit 1: Let's use Groovy/Java instead of BeanShell. Thanks.
Edit 2: The original motivation was the difference between column name and label, as these do not seem to be fully guaranteed (this is not clear, at least not to me), especially due to case sensitivity ("id"/"ID", "name"/"Name"/"NAME", ...).
It should be something like:
String value = (new ArrayList<String>(vars.getObject("resultObject").get(0).values())).get(4)
More information: Debugging JDBC Sampler Results in JMeter
Be aware that according to HashMap documentation:
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
So the order of columns might be a big question mark.
The row itself is a HashMap, defined in source code as:
HashMap<String, Object> row
So using BeanShell syntax, you could get it as
row = vars.getObject("resultObject").get(0); // returns HashMap
In a HashMap, you cannot access an item (column) by index. You could, however, apply one of the methods described here, but HashMap doesn't guarantee order, so you cannot be sure what "column 4" will contain.
If you want to be able to loop through all columns, it's better to do it in a Map style, not by index. For example using entrySet() with BeanShell:
for (Map.Entry entry : row.entrySet())
{
    log.info(entry.getKey() + "=" + entry.getValue());
}
See various ways to iterate through Map here.
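To make the two concerns in this thread concrete (positional access and case-sensitive column names), here is a small plain-Java sketch. It does not touch the JMeter API: the class RowAccessSketch and its helper methods are hypothetical, and the row is built by hand as a stand-in for vars.getObject("resultObject").get(0). A LinkedHashMap is used in the demo only so that the iteration order is predictable; with a plain HashMap the position of a column is not guaranteed, as noted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RowAccessSketch {

    /** Value at a zero-based position in the map's current iteration order. */
    static Object valueAt(Map<String, Object> row, int index) {
        List<Object> values = new ArrayList<>(row.values());
        return values.get(index);
    }

    /** Copy of the row whose keys are matched regardless of case ("id" finds "ID"). */
    static Map<String, Object> caseInsensitive(Map<String, Object> row) {
        Map<String, Object> copy = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        copy.putAll(row);
        return copy;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for one row of the JDBC result object.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", 7);
        row.put("Name", "Alice");

        System.out.println(valueAt(row, 1));                  // second column in iteration order
        System.out.println(caseInsensitive(row).get("name")); // lookup that ignores case
    }
}
```

If two column labels differ only in case, the case-insensitive copy keeps just one of them, so this only helps when labels are otherwise unique.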

Text input value as a number

I want the input field value to be a number instead of a string.
A simple scenario: you have 2 input fields and a submit button on a page. When you click submit, you should get the sum of the numbers entered in the two input fields, not the concatenated strings.
I tried using "number_field_tag" for input type=number, but the value is still a String and not a Fixnum, which is what I want.
As Surya stated in his comment, what comes from the form fields is always a string. But you can do something like this in the controller action that processes the form (presuming you're using Rails):
def process_form
  @result = params[:first_field].to_i + params[:second_field].to_i
end

How to set variable values to specific cell or element in BIRT

I have declared variable in beforeFactory of BIRT Report.
For example:
I increment this variable in the table row's render handler like:
Now, when all the rows are rendered, I want to write the variable's value to a specific cell/element. I tried
document.getElementName("numberOfMobilityFilesProcessed").text = numberOfMobilityFiles;
AND
reportContext.getDesignHandle().getElementByID
but they are not working out for me.
I had some problems with temporary local variables used at multiple steps of data source scripting, so I always used global persisting.
After changing your variable, you convert it to a String (because only Strings can be persisted), and before editing your variable again, you load the String from the persisted context and convert it to the type you want (a String is converted to an Integer automatically by JavaScript's dynamically typed variables, but don't forget the toString() when you are saving, otherwise you will risk an error).
Because you are using reportContext.setPersistentGlobalVariable, your variable is accessible in every element of your report.
Example:
var rowNum = reportContext.getPersistentGlobalVariable("row_number");
if(rowNum == null){
rowNum = -1;
}
rowNum++;
reportContext.setPersistentGlobalVariable("row_number", rowNum.toString());
OK, you have a text element displaying the number of rows in a table element. The text element appears before the table in the report.
If you are using two separate tasks, RunTask and RenderTask:
Add a report variable to your report (see the "variable" node in the Data Explorer view). Then you can change the report variable in the onCreate() event handler of the table row:
vars["numberOfSomething"] = vars["numberOfSomething"] + 1;
and access its value in the onRender() event handler of some text element, for instance a DynamicText:
this.text = "Number of something: " + vars["numberOfSomething"];
If you are using RunAndRenderTask, you must look for another approach, because in this case the order of the onCreate() and onRender() calls is different. You could bind the same DataSet to the text element displaying the counter as the one bound to the table. Then you can add an aggregation binding to the text element that counts all rows in the dataset.

how to get complete list of original fields along with new Fields which has been modified in trident?

Suppose I have a list of fields, i.e. {field1, field2, field3, field4}.
I perform some operation on field2, say incrementing each tuple's value by 5.
I do this in a function that emits the modified field with "M_field2" as the output field name. Now I want to write the complete tuple to a file, but with "M_field2" in place of field2. How can I achieve this?
The Trident API page says:
A function takes in a set of input fields and emits zero or more tuples as output. The fields of the output tuple are appended to the original input tuple in the stream. If a function emits no tuples, the original input tuple is filtered out. Otherwise, the input tuple is duplicated for each output tuple
Digging further into the Trident tutorial page, I found this:
With grouped streams, the output will contain the grouping fields followed by the fields emitted by the aggregator. For example:
stream.groupBy(new Fields("val1"))
.aggregate(new Fields("val2"), new Sum(), new Fields("sum"))
In this example, the output will contain the fields "val1" and "sum".
I am not sure but the closest one I can think of is doing something like
stream.groupBy(new Fields("field1","field3","field4"))
.aggregate(new Fields("field2"), new Sum(), new Fields("M_field2"))
might achieve what you are looking for. Correct me if I am wrong.
I solved this issue. With Trident, you just have to include the modified field name in the list of input fields.
For example:
topology.newStream("dummySpout",new DummySpout()).stateQuery(tridentState, new QueryFunctionClass(), new Fields("outLpi","outFileId"))
.each(new Fields("outLpi"),new DBReaderFunction((ArrayList<String>)conf.get("listOfFields")), new Fields((ArrayList<String>)conf.get("listOfFields")))
.each(new Fields((ArrayList<String>)conf.get("listOfFields")), new LoggerFilter())
.aggregate(new Fields("SAL"), new ApplyAggregator(),new Fields("sum"))
.each(new Fields("sum","SAL"),new LoggerFilter());
In the last line, "sum" is the modified field and "SAL" is the original field.
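The behaviour quoted earlier from the Trident API page (a function's output fields are appended to the input tuple, so the original fields remain available) can be illustrated without Storm. The following is only a rough plain-Java sketch under that assumption: AppendAndProjectSketch, appendIncremented() and project() are hypothetical helpers that model a tuple as an ordered map. In Trident itself, the selection step would presumably be done by naming the wanted fields in the next each() call, as above, or with stream.project(new Fields(...)).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AppendAndProjectSketch {

    /** Mimics a function step: the emitted field is appended, input fields are kept. */
    static Map<String, Object> appendIncremented(
            Map<String, Object> tuple, String inField, String outField, int delta) {
        Map<String, Object> result = new LinkedHashMap<>(tuple);
        result.put(outField, (Integer) tuple.get(inField) + delta);
        return result;
    }

    /** Mimics a projection: keep only the named fields, in the given order. */
    static List<Object> project(Map<String, Object> tuple, List<String> fields) {
        List<Object> values = new ArrayList<>();
        for (String field : fields) {
            values.add(tuple.get(field));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("field1", "a");
        tuple.put("field2", 10);
        tuple.put("field3", "c");
        tuple.put("field4", "d");

        // After the function step the tuple carries both field2 and M_field2.
        Map<String, Object> widened = appendIncremented(tuple, "field2", "M_field2", 5);
        System.out.println(widened.keySet());

        // Select M_field2 in place of field2 before writing the tuple out.
        System.out.println(project(widened,
                Arrays.asList("field1", "M_field2", "field3", "field4")));
    }
}
```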
