Hive Length outputs more than seen - hadoop

I am trying to run a Hive query that joins two tables on matching records. However, the join never matches, even though the record exists in the other table. When I take the length of a given string it returns 27, but it should be just 12.
When I download the output file from S3, I see a weird row like
U S 3 F F 1 2 1 4 9 3 3
but the Hive console shows it as
US3FF1214933
I also cannot query the row with
select * from table where item like "US3FF1214933";
It is a total mess right now, and trimming does not work for me either.
I am in need of help.
Thanks in advance,

Thanks to legato for giving me the idea to investigate this by running
od -c on the file and looking at the actual characters between the visible ones.
After that, using regexp_replace(ExString,'\0',"") in the Hive query to replace the weird characters with an empty string solved my issue.
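To illustrate why the length can be larger than what the console shows, here is a minimal Python sketch (not Hive). It assumes the hidden characters are NUL bytes, as the '\0' pattern in the fix suggests, and that they come from a two-byte encoding such as UTF-16; that origin is an assumption, not something confirmed in the question. The sample value is the one from the question.

```python
# Assumption: the stored value carries a NUL character after every visible one,
# which is what UTF-16-LE text looks like when it is read byte by byte.
visible = "US3FF1214933"
stored = visible.encode("utf-16-le").decode("latin-1")

# The NULs are invisible in most consoles but still count toward the length.
print(len(visible), len(stored))

# Equality (and therefore a join or a LIKE match) fails until the NULs are removed.
cleaned = stored.replace("\x00", "")
print(stored == visible, cleaned == visible)
```

Stripping the NULs here plays the same role as the regexp_replace call in the Hive query.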


Power Query Replace null values with values from another column

I am working with data imported from a PDF file. There is an extra column in the Power Query import (Data.Column7) containing data that belongs in the adjacent columns on either side (Data.Column6 and Data.Column8). Columns 6 and 8 have null values in the cells where the data was pushed into Column 7. I would like to replace the null values in Columns 6 and 8 with the correct data from Column 7, leaving all other values in Columns 6 and 8 as they are.
After looking at the post here:
Power Query / Power BI - replacing null values with value from another column
and watching this video:
https://www.youtube.com/watch?v=ikzeQgdKA0Q
I tried the following formula:
= Table.ReplaceValue(#"Expanded Data",null, each _[Data.Column7] ,Replacer.ReplaceText,{"Data.Column6","Data.Column8"})
(Note, "Expanded Data" is the last step before this Replace Value step.)
I am not getting any kind of syntax error, but the Replace Value step isn't doing anything at all. My null values in Columns 6 and 8 have not been replaced with the correct data from Column 7.
Any insight into how to achieve replacement would be greatly appreciated. Thank you.
(I should mention, I am a new Power Query user, so please be detailed and assume I know nothing!)
I'm sure there must be some way to do this with the ReplaceValue function, but I think it might be easier to do the following:
1: Create a new column with the definition NewData6 = if [Data.Column6] = null then [Data.Column7] else [Data.Column6]
2: Do the same thing for column 8: NewData8 = if [Data.Column8] = null then [Data.Column7] else [Data.Column8]
3: Delete Data.Column6/7/8
4: Rename the newly made columns if necessary.
You can do these steps either in the Advanced Editor or with the Custom Column button on the Add Column tab.
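The logic of those steps can be sketched outside Power Query. The following Python snippet is only an illustration of the "take Column 7 when the neighbour is null" rule; the sample rows and the function name fill_from_column7 are hypothetical, and None stands in for Power Query's null.

```python
# Hypothetical sample rows standing in for the imported PDF table.
rows = [
    {"Data.Column6": "a", "Data.Column7": None, "Data.Column8": "b"},
    {"Data.Column6": None, "Data.Column7": "x", "Data.Column8": "c"},
    {"Data.Column6": "d", "Data.Column7": "y", "Data.Column8": None},
]

def fill_from_column7(row):
    """Return a new row where null values in columns 6 and 8 take column 7's value."""
    middle = row["Data.Column7"]
    return {
        "NewData6": middle if row["Data.Column6"] is None else row["Data.Column6"],
        "NewData8": middle if row["Data.Column8"] is None else row["Data.Column8"],
    }

# Building new rows with only NewData6/NewData8 mirrors steps 1 to 3 above.
fixed = [fill_from_column7(r) for r in rows]
for r in fixed:
    print(r)
```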
If the columns are of the text data type, they might contain empty strings instead of actual nulls.
Try replacing null with "" in your formula.

Get all columns from a table in Monetdb

I have to study a table in MonetDB that probably has many columns.
When I do
SELECT * from cat.data limit 1;
I get
1 tuple !5600 columns dropped!
Which I interpret as the console not showing all the columns.
I am using mclient to connect to the database.
I tried DESC and DESCRIBE, but they didn't work. Any help?
Indeed. See mclient --help
It is possible to extend the width of your output to see more columns.
Alternatively, within the mclient console use the \d tablename command
You need to use the \w-1 command before executing your query.
sql>\w-1
sql>SELECT * from cat.data limit 1 ;
This will show all the columns in the terminal. The text will be wrapped.

How to identify column types during sql injection?

The situation is as follows:
I have identified a SQL injection attack vector and have the following information about the target table:
It has six columns (identified using "order by").
I can see the output of three of them (the table is displayed). Two seem to be some kind of enum value (an integer in the database?), and one is a date. I strongly suspect that column #6 is a date column.
I'm almost sure the database is Oracle (ROWNUM works and LIMIT gives an error).
I don't get error messages (a generic text is always returned: "something went wrong").
The frontend is PHP, if that matters. But there might be a middle layer between it and the database (e.g. a Java service), so I'm not sure where the query is being constructed.
E.g. the following search query works as expected:
test' AND ROWNUM <= 5 ORDER BY 6--
EDIT-FROM-HERE:
OK, after help from the comments, the following query works:
test' UNION ALL SELECT null,null,null,null,null,null FROM dual--
(I was missing the FROM dual part. Thank you very much, @kordirko!)
This query adds one empty record to the output table (it is visible on the page), so I'm definitely on the right track!
Now the following line also works:
test' UNION ALL SELECT null,null,null,n't',null,null FROM dual--
I correctly identified the 4th column, and it now displays an uppercase(?) letter T where I expected it to appear. So far so good. But I get an error when I input any string longer than one character! So the following gives an error:
test' UNION ALL SELECT null,null,null,n'test',null,null FROM dual--
I'm no expert in SQL injection, especially on Oracle (though I have experience with MS SQL).
I think the problem is related to Unicode/ANSI or some other encoding issue. For the other rows (selected by the original query before my UNION ALL SELECT addition), the 4th column shows normal multi-character strings. But when I try to inject a string of my own, it only works if it is one character long, and it is also mysteriously displayed in uppercase. I only discovered that I needed the n prefix for Unicode strings after an hour of searching and struggling. Maybe some Oracle gurus can quickly spot the mistake in my query?

How MAX of a concatenated column in oracle works?

In Oracle, I have a question about concatenating two columns of NUMBER type and then taking the MAX of the result.
That is, columns A and B are both of NUMBER data type:
Select MAX(A||B) from table
Table data
A B
20150501 95906
20150501 161938
When I run the query Select MAX(A||B) from table
O/P - 2015050195906
Shouldn't 20150501161938 be the output?
When I format column B with TO_CHAR(B,'FM000000') and execute the query, I get the expected output.
Select MAX(A || TO_CHAR(B,'FM000000')) FROM table
O/P - 20150501161938
Why is 2015050195906 considered the MAX in the first case?
Presumably, column A is a date and column B is a time.
If that's true, treat them as such:
select max(to_date(to_char(a)||to_char(b,'FM000000'),'YYYYMMDDHH24MISS')) from your_table;
That will add a leading zero to the time component (if necessary), then concatenate the columns into a string, which is then passed to the to_date function. The max function will then treat the result as a DATE datatype, which is presumably what you want.
PS: The real solution here is to fix your data model. Don't store dates and times as numbers. In addition to sorting issues like this, the optimizer can get confused. (If you store a date as a number, how can the optimizer know that '20141231' will immediately be followed by '20150101'?)
You should convert to a number:
select MAX(TO_NUMBER(A||B)) from table
Concatenation produces a character/text value. As such, it sorts alphabetically, character by character, so a string starting with 9 sorts after one starting with 16.
In the second case, you are specifying a format that pads the number to six digits. That works well, because 095906 will now sort before 161938.
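The difference between text ordering and numeric ordering can be reproduced in a few lines. This is a Python illustration rather than Oracle SQL, using the two rows from the question; Python's string comparison is character by character, which is the behaviour described above.

```python
# Values from the question, concatenated the way A||B would produce them.
rows = [(20150501, 95906), (20150501, 161938)]

as_text = [f"{a}{b}" for a, b in rows]
# String comparison goes character by character, so "...9" beats "...1".
print(max(as_text))                  # 2015050195906

# Zero-padding B to six digits makes both strings the same length,
# so character order and numeric order agree.
padded = [f"{a}{b:06d}" for a, b in rows]
print(max(padded))                   # 20150501161938

# Converting to a number compares by value instead of by character.
print(max(int(s) for s in as_text))  # 20150501161938
```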

Appending trailing zeros to a number in Oracle

I need to format the NUMBER data type in Oracle as follows:
Problem Statement:
-> Append trailing zeros to the ATM_CARD-NUMBER whose length is 14
My Attempt:
SELECT to_char(atm_card_nbr,'9999999999999900')as new_atm_nbr,atm_card_key from atm_card_dm where LENGTH(TRANSLATE(TO_CHAR(atm_card_nbr),'1234567890.-','1234567890'))=14;
Result:
I've gone through several Oracle-related sites online but could not find a working solution. So, how can I get the correct result?
Thanks in advance!
SELECT rpad(to_char(atm_card_nbr),16,'0') as new_atm_nbr,
atm_card_key
from atm_card_dm
where LENGTH(TRANSLATE(TO_CHAR(atm_card_nbr),'1234567890.-','1234567890'))=14;
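For readers who want to check the padding behaviour outside the database, here is a small Python sketch of the same idea: convert the number to text, then right-pad with zeros to 16 characters. The function name pad_card_number and the sample card number are hypothetical.

```python
def pad_card_number(card_number: int, width: int = 16) -> str:
    """Right-pad the decimal digits of card_number with '0' up to width characters."""
    return str(card_number).ljust(width, "0")

# Hypothetical 14-digit card number used only for illustration.
print(pad_card_number(12345678901234))  # 1234567890123400
```

One difference to keep in mind: Oracle's RPAD truncates values longer than the target length, while str.ljust leaves them unchanged.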
