Pig: replace a string with a number for a specific column - hadoop

I am new to Pig, so this might be a trivial question. I could not find a reasonable answer, hence I am asking here.
I have 3 columns as follows:
userid itemid action
245 4 'view'
245 6 'click'
149 12 'buy'
149 1 'click'
and so on...
I am given a mapping such as: 'view' = 1, 'click' = 1.4, 'buy' = 2.1, etc.
My desired output is:
userid itemid action
245 4 1
245 6 1.4
149 12 2.1
149 1 1.4
Are there simple commands that can help me achieve this?
I'll need to perform some calculations on the 3rd column, so it can't stay in string format.

Create a mapping file in HDFS with these mapping values, like:
action_string action_value
view 1
click 1.4
buy 2.1
Say this file is stored at <mapping_file>. Then load this file and join your original dataset with it (PigStorage() splits on tabs by default, so pass the delimiter if your file uses a different one):
mapping = LOAD '<mapping_file>' USING PigStorage() AS (action_string:chararray, action_value:double);
joined = JOIN original BY action, mapping BY action_string USING 'replicated';
result = FOREACH joined GENERATE userid, itemid, action_value;
The 'replicated' join keeps the small mapping relation in memory, which is why it is listed last. There are other ways depending on your use case and your file size, but I think this is the most flexible.
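As a rough illustration of what the join above computes, here is a minimal Python sketch (not Pig). It uses the sample rows and mapping from the question; the function name replace_actions is hypothetical, and dropping unmapped actions is an assumption that mirrors an inner join.

```python
# Mapping from the question: action string -> numeric weight.
ACTION_VALUES = {"view": 1.0, "click": 1.4, "buy": 2.1}


def replace_actions(rows, mapping):
    """Return rows with the action string swapped for its numeric value.

    Rows whose action is missing from the mapping are dropped, which is
    what an inner join against the mapping would do.
    """
    return [(userid, itemid, mapping[action])
            for userid, itemid, action in rows
            if action in mapping]


# Sample rows from the question: (userid, itemid, action).
rows = [(245, 4, "view"), (245, 6, "click"), (149, 12, "buy"), (149, 1, "click")]
converted = replace_actions(rows, ACTION_VALUES)

for row in converted:
    print(*row)
```

Because the third field is now a float, it can be used directly in later calculations.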

Related

Elasticsearch query to filter out values in a column based on another column

I'm working with a dataset that I'm trying to filter, but I'm having trouble trying to get the results I want, and I'm not sure it's possible to do so. Some sample data is below:
Column A  Column B  Column C  Column D
good      awe       1         9834
great     niopre    1         78964
bad       nue       1         12
good      btr       2         6543
great     muy       2         8765
bad       xdrg      2         1432
bad       thr       3         648
good      cfg       3         6
bad       mk        3         1958
What I want to do is filter on Columns A and C so that only rows whose Column C value also appears in a row with "great" in Column A are shown. For this dataset the filter would return:
Column A  Column B  Column C  Column D
good      awe       1         9834
great     niopre    1         78964
bad       nue       1         12
good      btr       2         6543
great     muy       2         8765
bad       xdrg      2         1432
I've tried the built-in filters and the Elasticsearch Query DSL without any luck so far. If anyone can point me in the right direction, it would be greatly appreciated.
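To pin down the behaviour being asked for, here is a minimal sketch in plain Python. It is not Elasticsearch Query DSL and calls no Elasticsearch API; it only spells out the two-step logic (collect the Column C values of the "great" rows, then keep every row with one of those values). The function name rows_sharing_c_with is hypothetical.

```python
# Sample rows from the question: (Column A, Column B, Column C, Column D).
ROWS = [
    ("good", "awe", 1, 9834),
    ("great", "niopre", 1, 78964),
    ("bad", "nue", 1, 12),
    ("good", "btr", 2, 6543),
    ("great", "muy", 2, 8765),
    ("bad", "xdrg", 2, 1432),
    ("bad", "thr", 3, 648),
    ("good", "cfg", 3, 6),
    ("bad", "mk", 3, 1958),
]


def rows_sharing_c_with(rows, a_value):
    """Keep rows whose Column C value occurs in some row where Column A == a_value."""
    # Step 1: collect the Column C values of the matching rows.
    wanted_c = {c for a, _, c, _ in rows if a == a_value}
    # Step 2: keep every row whose Column C is in that set.
    return [row for row in rows if row[2] in wanted_c]


filtered = rows_sharing_c_with(ROWS, "great")
for row in filtered:
    print(*row)
```

If no single query can express this, the same two steps may translate into two requests: one to collect the Column C values and one to filter on them. That is an assumption about a possible approach, not a confirmed solution.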

Laravel data order in json column

I am using Laravel 7 and am having some problems with sorting.
To put it briefly, my table has a JSON column.
Example column name: "jsonData". Example JSON data:
$data1 =
{
"rank":12,
"value":"test"
}
$data2 =
{
"rank":105,
"value":"test-2"
}
I run the following query against this data:
DB::table('tablename')->orderBy('jsonData->rank', 'ASC')->get();
I expect the output in ascending order, 12 and then 105. When I print the data, however, I get 105 and then 12.
Another example of the order I get:
1
1
10
100
108
113
12
120
1231885
13631
144
As you can see, this order makes no sense for numbers. I have done a lot of research on how to solve this, but I could not find a solution.
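The order listed above is exactly what a character-by-character (text) comparison produces, which suggests the extracted rank values are being compared as strings rather than as numbers. The Python sketch below reproduces both orders from the values in the question; it is only an illustration of the effect, not Laravel code. A likely remedy, offered as an assumption to verify against your database, is to cast the extracted value to a number in the query (for example with orderByRaw and a SQL CAST).

```python
# Rank values from the question, held as text and deliberately shuffled.
ranks_as_text = ["144", "12", "1", "100", "13631", "1", "113", "10",
                 "1231885", "108", "120"]

# Text comparison: "113" sorts before "12" because '1' < '2' at the second character.
text_order = sorted(ranks_as_text)

# Numeric comparison: convert each value to an integer before comparing.
numeric_order = sorted(ranks_as_text, key=int)

print("text order:   ", text_order)
print("numeric order:", numeric_order)
```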

Sort a dynamic set of data using Google Apps Script

I have a block of data like this in a new spreadsheet:
GOODS Count Sort Index
111770999 128 9
111771000 32 0
111771005 64 5
111771010 64 0
111771011 64 1
The number of rows is dynamic; the number of columns is fixed (3). How can I write a script to sort by column 3, like using Data > Sort range in the spreadsheet? Many thanks in advance!
You can sort like this using Apps Script:
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('YOUR_SHEET_NAME');
sheet.getRange(2,1,sheet.getLastRow()-1,sheet.getLastColumn()).sort(3);
// getRange takes (startRow, startColumn, numRows, numColumns); starting at row 2 skips the header row.
// sort(x) sorts the range by the x-th column, in ascending order.
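For comparison, the same idea (leave the header alone, sort the remaining rows by the third column) can be sketched in plain Python on the sample data from the question. This is an illustration only and is not Apps Script; sort_by_column is a hypothetical helper name.

```python
from operator import itemgetter

# Header and sample rows from the question.
header = ("GOODS", "Count", "Sort Index")
data = [
    (111770999, 128, 9),
    (111771000, 32, 0),
    (111771005, 64, 5),
    (111771010, 64, 0),
    (111771011, 64, 1),
]


def sort_by_column(rows, column):
    """Return rows sorted ascending by the 1-based column.

    Python's sort is stable, so rows with equal keys keep their original order.
    """
    return sorted(rows, key=itemgetter(column - 1))


sorted_rows = sort_by_column(data, 3)

print(*header)
for row in sorted_rows:
    print(*row)
```

The function works for any number of rows, which matches the requirement that the row count is dynamic.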

Group on sum of distinct values in Tableau

I'm using Tableau 8.3 and I'm trying to find out how to group on each of the values that I get from a "count of distinct values".
To illustrate the case I have made a fictitious dataset with 58 rows (purchases), 7 different IDs (customers) and 5 different products. Then I made a count distinct to find out how many of the 5 products each ID has bought. It looks like this:
ID1 = 4
ID2 = 4
ID3 = 5
ID4 = 4
ID5 = 3
ID6 = 4
ID7 = 2
Now I want to turn the view around and find out how many IDs have bought X different products. It should ultimately look like this:
2 = 1
3 = 1
4 = 4
5 = 1
Hope to find a solution by posting here! Thank you,
Mikael
You need to update to Tableau 9.0 to achieve that (in a fast way).
You can create a calculated field named [# of products]:
{ FIXED [id_customer] : COUNTD([id_product]) }
Then you can cross [# of products] with COUNTD([id_customer]) to get what you want.
In older versions of Tableau you need to create a new table in a suitable format (one row per customer with the aggregations) and connect to it.
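The two aggregation levels described above can be sketched in Python (this is not Tableau syntax). The first function plays the role of the per-customer distinct count; the second counts customers per distinct-count value. The purchase rows are hypothetical, since the 58-row dataset is not shown; the per-ID counts are the ones listed in the question.

```python
from collections import Counter, defaultdict


def distinct_products_per_customer(purchases):
    """Level 1: number of distinct products bought by each customer."""
    seen = defaultdict(set)
    for customer, product in purchases:
        seen[customer].add(product)
    return {customer: len(products) for customer, products in seen.items()}


def customers_per_product_count(per_customer):
    """Level 2: how many customers bought exactly N distinct products."""
    return dict(sorted(Counter(per_customer.values()).items()))


# Hypothetical purchase rows (customer, product); repeats count only once.
purchases = [("ID1", "A"), ("ID1", "B"), ("ID1", "A"), ("ID2", "A"),
             ("ID3", "A"), ("ID3", "B")]
per_customer = distinct_products_per_customer(purchases)

# Per-ID distinct counts exactly as listed in the question.
question_counts = {"ID1": 4, "ID2": 4, "ID3": 5, "ID4": 4,
                   "ID5": 3, "ID6": 4, "ID7": 2}
turned_around = customers_per_product_count(question_counts)

for n_products, n_customers in turned_around.items():
    print(n_products, "=", n_customers)
```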

Simple binding questions

I am trying to build a simple application like this:
View:
1. A table view showing the count of entity 1 in the first column and the count of entity 2 in the second column. Each row holds the counts of the entities on a particular date.
2. A text field showing the total count of entity 1 multiplied by 35.
3. A text field showing the sum of the counts in both columns.
E.g.
(entity1) (entity2)
<2> <3>
<4> <2>
<5> <7>
Requirement 1: the text field in pt. 2 should show 385, i.e. (11 * 35).
Requirement 2: the text field in pt. 3 should show 23, i.e. (11 + 12).
Model:
An object with two properties:
int entity1Count
int entity2Count
I am using an array controller object to show the data in the table view.
My question is: can I implement my requirements via bindings in IB? If yes, then how?
Thanks,
Miraaj
I am able to implement requirement 1 via bindings in IB, using NSNumberFormatter and its multiplier property :)
but I am still unable to implement requirement 2 :(
Any suggestions?
Thanks,
Miraaj
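The arithmetic behind the two requirements can be checked with a small Python sketch (not Cocoa code). It also shows one possible way to think about requirement 2: give each row a derived total property, then sum that single property. The class name DayCount and the property total are hypothetical. Whether the same idea works in IB, for instance by aggregating such a derived property with the key-value coding @sum collection operator, is an assumption to verify.

```python
from dataclasses import dataclass


@dataclass
class DayCount:
    """One table row: the counts of both entities on a particular date."""
    entity1_count: int
    entity2_count: int

    @property
    def total(self) -> int:
        # Derived per-row value, so one sum over rows covers both columns.
        return self.entity1_count + self.entity2_count


# Sample rows from the question.
rows = [DayCount(2, 3), DayCount(4, 2), DayCount(5, 7)]
MULTIPLIER = 35

# Requirement 1: total count of entity 1, multiplied by 35.
requirement_1 = sum(row.entity1_count for row in rows) * MULTIPLIER

# Requirement 2: sum of the counts in both columns.
requirement_2 = sum(row.total for row in rows)

print(requirement_1, requirement_2)
```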
