Facing problems while updating rows in HBase - hadoop

I've run the samples SampleUploader, PerformanceEvaluation and rowcount as described on the
Hadoop wiki: http://wiki.apache.org/hadoop/Hbase/MapReduce
The problem I'm facing is this: table1 is my table, with the column family column.
>create 'table1','column'
>put 'table1','row1','column:address','SanFrancisco'
hbase(main):020:0> scan 'table1'
ROW COLUMN+CELL
row1 column=column:address, timestamp=1276351974560, value=SanFrancisco
>put 'table1','row1','column:name','Hannah'
hbase(main):020:0> scan 'table1'
ROW COLUMN+CELL
row1 column=column:address,timestamp=1276351974560,value=SanFrancisco
row1 column=column:name, timestamp=1276351899573, value=Hannah
I want both columns to appear in the same row, as a different version.
Similarly, if I change the name column to sarah, the scan shows the updated row, but I want both the old row and the changed row to appear as two different versions so that I can run analysis on the data.
What mistake am I making?
Thanks a lot,
sammy

To see multiple versions of the same row, you need to specify the VERSIONS option:
get 'my_table', 'my_row_key', {VERSIONS => 4}
When the hbase shell prints out
row1 column=column:address,timestamp=1276351974560,value=SanFrancisco
row1 column=column:name, timestamp=1276351899573, value=Hannah
That's a single row with multiple columns. The text representation just happens to use multiple lines of text, one per column.
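To make the row / column / version distinction concrete, here is a minimal sketch in plain JavaScript. It is a toy in-memory model, not the HBase client API: the ToyTable class and its maxVersions setting are hypothetical names used only for illustration, and it assumes the table is configured to retain more than one version per cell.

```javascript
// Toy in-memory model of HBase cells; this is NOT the HBase client API.
class ToyTable {
  constructor(maxVersions) {
    this.maxVersions = maxVersions; // versions retained per cell (hypothetical setting)
    this.rows = new Map();          // rowKey -> Map(column -> [{ ts, value }, ...])
  }

  // Store a value as a new version of the cell instead of overwriting it.
  put(rowKey, column, value, ts) {
    if (!this.rows.has(rowKey)) this.rows.set(rowKey, new Map());
    const cells = this.rows.get(rowKey);
    const versions = cells.get(column) || [];
    versions.push({ ts, value });
    versions.sort((a, b) => b.ts - a.ts); // newest first
    cells.set(column, versions.slice(0, this.maxVersions));
  }

  // Return up to `versions` values per column, newest first.
  get(rowKey, versions = 1) {
    const result = {};
    const cells = this.rows.get(rowKey) || new Map();
    for (const [column, list] of cells) {
      result[column] = list.slice(0, versions).map((cell) => cell.value);
    }
    return result;
  }
}

const table = new ToyTable(3);
table.put('row1', 'column:address', 'SanFrancisco', 1);
table.put('row1', 'column:name', 'Hannah', 2);
table.put('row1', 'column:name', 'Sarah', 3);

const latestOnly = table.get('row1');     // one version per column
const withHistory = table.get('row1', 4); // up to four versions per column
console.log(latestOnly, withHistory);
```

In this model a default read shows only the newest value of each column, while asking for more versions also returns the older name. Both results describe the same single row.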

Related

How to add an additional column showing sheet names & row source in a QUERY / IMPORTRANGE formula (Google Sheets)

I have a QUERY over IMPORTRANGE formula that combines hundreds of my sheets into one new Google Sheets file, but I need an additional column that shows the source sheet/tab name and the source row.
This is my query so far:
=QUERY(
{IMPORTRANGE("SheetID-1","Sheet1!A5:Y1000");
IMPORTRANGE("SheetID-2","Sheet2!A5:Y1000");
IMPORTRANGE("SheetID-3","Sheet3!A5:Y1000")},
"SELECT * WHERE Col2 IS NOT NULL Order By Col1", 0)
My query looks like that (but for hundreds of different sheets), and I would like to add an additional column, in column Z, that shows the source tab/sheet name together with the row:
Example:
Column Z: Sheet1 - Row 7
If I use FILTER I can add the additional column with the source row and sheet name like that, but I have no idea how to add the column using QUERY. I need some help here.

Power Query - Modify tables before combining them

In my table (point 1), for each table in the grouped rows (point 2), I am trying to insert a new row (point 3) at the beginning of the table, with the value in the column "Metadata1" equal to the value from "Column2" of the original row number 2 (counting from 0).
Link to excel file:
https://filebin.net/cnb4pia0vvkg937g
It's hard to know how much of your requirement is generic or specific, but
TransformFirst = Table.TransformColumns(#"PriorStepName",{{"Count", each
#table(
{"Column1","Column2","Metadata1"},
{{_{0}[Column1],_{0}[Column2],_{2}[Column2]}}
) & _
}}),
Together=Table.Combine(TransformFirst[Count])
in Together
This modifies every table in the Count column to include an extra first row made up of Row0/Col1, Row0/Col2 and Row2/Col2.
It then combines all those tables into one table.
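For readers less familiar with M, the same idea can be sketched in plain JavaScript. This is only a model of the transformation, not Power Query code: it assumes each grouped table is an array of row objects with Column1 and Column2, that every table has at least three rows, and that the original rows have no Metadata1 value. The function names are hypothetical.

```javascript
// Plain-JavaScript model of the transformation; not Power Query M.
// Each "table" is an array of row objects with Column1 and Column2.
function prependMetadataRow(table) {
  const extraRow = {
    Column1: table[0].Column1,
    Column2: table[0].Column2,
    Metadata1: table[2].Column2, // value taken from the row at index 2
  };
  // Original rows carry no Metadata1 value, so fill it with null.
  const padded = table.map((row) => ({ ...row, Metadata1: null }));
  return [extraRow, ...padded];
}

// Apply the change to every table, then append the tables into one.
function combineTables(tables) {
  return tables.map(prependMetadataRow).flat();
}

const groupedTables = [
  [
    { Column1: 'a', Column2: 1 },
    { Column1: 'b', Column2: 2 },
    { Column1: 'c', Column2: 3 },
  ],
  [
    { Column1: 'x', Column2: 10 },
    { Column1: 'y', Column2: 20 },
    { Column1: 'z', Column2: 30 },
  ],
];

const combined = combineTables(groupedTables);
console.log(combined);
```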

How to auto-create row space / spatial definition between rows with different column values?

I want to create a row, or some kind of definition, between Google Sheets rows whenever one of the values in my columns contains a new / different value.
Here's an example of the kind of spreadsheet I'm putting together, which shows how Agencies are listed in Column A, and Sub-Agencies are listed in Column B.
I'd like to create some kind of row space / spatial definition between every new / different Sub-Agency, which are values that I enter in Column B of my spreadsheet, but no space between rows with the same Sub-Agency value. Here's an example of the kind of row space / spatial definition that I'm seeking.
Any ideas / suggestions on how to do this?
Many thanks for your help!
You can insert a row wherever your condition is met.
insertRowAfter(afterPosition) inserts a row after the given row position:
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getSheets()[0];
// This inserts a row after the first row position
sheet.insertRowAfter(1);
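Building on insertRowAfter, a minimal sketch of the full task might look like the following. It assumes a header in row 1, data starting in row 2, and the Sub-Agency values in column B. The function names findBreakRows and insertSpacerRows are hypothetical, and the sheet argument is expected to be a Sheet object such as the one obtained above.

```javascript
// Return the 1-based sheet row numbers after which a blank row is needed,
// i.e. every row whose value differs from the value in the next row.
function findBreakRows(values, firstRow) {
  const breaks = [];
  for (let i = 0; i < values.length - 1; i++) {
    if (values[i] !== values[i + 1]) breaks.push(firstRow + i);
  }
  return breaks;
}

// Insert the blank rows bottom-up so that earlier row numbers stay valid.
// Assumption: row 1 is a header and the Sub-Agency values are in column B.
function insertSpacerRows(sheet) {
  const firstRow = 2;
  const numRows = sheet.getLastRow() - firstRow + 1;
  if (numRows < 2) return;
  const values = sheet
    .getRange(firstRow, 2, numRows, 1)
    .getValues()
    .map((row) => row[0]);
  findBreakRows(values, firstRow)
    .reverse()
    .forEach((row) => sheet.insertRowAfter(row));
}
```

Inserting from the bottom up matters because each inserted row shifts everything below it. Note that running the function a second time would add more spacers, since the blank rows themselves count as a different value.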

Update One table Column with Values from Another table Having Similar

Hi guys, I have two tables (MIGADM.CORPMISCELLANEOUSINFO and CRMUSER.PREFERENCES), and each has fields called PREFERENCE_ID and ORGKEY. I want to update the PREFERENCE_ID in MIGADM.CORPMISCELLANEOUSINFO with the PREFERENCE_ID from CRMUSER.PREFERENCES for each corresponding ORGKEY, so I wrote this query:
update migadm.CORPMISCELLANEOUSINFO s set s.PREFERENCE_ID = (
select e.PREFERENCE_ID from crmuser.preferences e where s.ORGKEY = e.ORGKEY)
But I get:
ORA-01427: single-row subquery returns more than one row
What should I do?
It means the columns you have selected are not unique enough to identify one row in your source table. Your first step would be to identify those columns.
To see the set of rows that have this problem, run this query.
select e.orgkey,
count(*)
from crmuser.preferences e
group by e.orgkey
having count(*) > 1
For example, for an ORGKEY of 2, let's say there are two rows in the preferences table:
ORGKEY PREFERENCE_ID
2 202
2 201
Oracle cannot tell which of these should be used to update the PREFERENCE_ID column in CORPMISCELLANEOUSINFO.
Identify the rows for which the subquery returns more than one row (you could use the REJECT ERROR clause to do it, for instance), or add the condition 'rownum = 1' to the subquery so that it returns a single, arbitrary match.
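The duplicate check can also be illustrated outside the database. The following is a minimal sketch in plain JavaScript that mirrors the GROUP BY ... HAVING COUNT(*) > 1 idea on an in-memory array. The sample rows and the function name findDuplicateKeys are made up for illustration and are not taken from the real tables.

```javascript
// Count how many preference rows share each orgkey and report the keys
// that occur more than once (the same idea as GROUP BY ... HAVING COUNT(*) > 1).
function findDuplicateKeys(rows) {
  const counts = new Map();
  for (const row of rows) {
    counts.set(row.orgkey, (counts.get(row.orgkey) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([orgkey, count]) => ({ orgkey, count }));
}

// Made-up sample data: orgkey 2 has two candidate preference IDs.
const preferences = [
  { orgkey: 1, preferenceId: 101 },
  { orgkey: 2, preferenceId: 202 },
  { orgkey: 2, preferenceId: 201 },
];

const duplicates = findDuplicateKeys(preferences);
console.log(duplicates);
```

Every key reported here is one for which the correlated subquery in the UPDATE would return more than one row.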

How to use "multiple AND in columns" to filter rows in HBase?

The scenario is like this:
In an HBase table, different rows have different columns, e.g.:
row1 has columns fm:a, fm:b, fm:c
row2 has columns fm:a, fm:d
I want to use a scan to test whether a row has both columns fm:a and fm:b; if so, return the row, otherwise the row should not be returned in the scan result. In the case above, I want only row1 to be returned.
I checked the Filters in HBase, and it looks like doing this at the Filter level will perform badly. Any ideas?
You need to add each column to the scan, something like this:
Create a hash table, say ht, that maps each column family to its columns
for each columnFamily in ht:
    for each column in ht[columnFamily]:
        scan.addColumn(Bytes.toBytes(columnFamily), Bytes.toBytes(column));
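One caveat: as far as I know, adding columns to a scan limits which columns come back, but it does not by itself require that a row contains all of them, so a row with only fm:a would still be returned. A client-side check can enforce the "AND". The sketch below is plain JavaScript over a toy result shape, not the HBase client API. The result objects and the function names are assumptions made for illustration.

```javascript
// Toy model: each scan result is { rowKey, columns: { 'fm:a': value, ... } }.
// This is plain JavaScript, not the HBase client API.
function hasAllColumns(result, requiredColumns) {
  return requiredColumns.every((column) =>
    Object.prototype.hasOwnProperty.call(result.columns, column)
  );
}

// Keep only the rows that contain every required column.
function filterRows(results, requiredColumns) {
  return results.filter((result) => hasAllColumns(result, requiredColumns));
}

// Rows as they might look after a scan restricted to fm:a and fm:b.
const scanResults = [
  { rowKey: 'row1', columns: { 'fm:a': '1', 'fm:b': '2' } },
  { rowKey: 'row2', columns: { 'fm:a': '3' } },
];

const matching = filterRows(scanResults, ['fm:a', 'fm:b']).map((r) => r.rowKey);
console.log(matching);
```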
