How to Handle NULL values with ToDate function in Apache Pig - hadoop

I have datetime data in my input and would like to load it correctly from Pig. I googled and learned that the suggested approach is to load it as chararray and then convert it to datetime with the ToDate function. However, sometimes my datetime fields are NULL, and then I get a NullPointerException from Pig when I try to apply the ToDate function. I am trying to use the bincond operator, but I'm getting the following error:
mismatched input '?' expecting SEMI_COLON
Which does not make sense.
=====================================================================
Here is the code that I have so far:
transactions_edited = FOREACH transactions GENERATE
id,
code,
user_id,
visit_code,
channel_id,
transaction_type,
product_category,
product_subcategory,
specific_id,
specific_type,
email,
cpf,
name,
last_name,
gender,
birth_date,
phone_code,
phone,
additional_phone_code,
additional_phone,
zip_code,
monthly_income,
status,
opportunity_status,
ToDate(created_at,'yyyy-MM-dd HH:mm:ss') AS created_at,
ToDate(updated_at,'yyyy-MM-dd HH:mm:ss') AS updated_at,
old_status,
old_masked_id,
address_type,
address,
address_number,
address_complement,
neighborhood,
city,
state,
landing_path,
referrer,
source,
source_advertising,
keyword,
ad_id,
ad_name,
ad_network,
ad_placement,
ad_device,
cpf_restriction,
mother_name,
registration_form_closed,
(opportunity_status_updated_at is not NULL ) ?ToDate(opportunity_status_updated_at,'yyyy-MM-dd HH:mm:ss') : AS opportunity_status_updated_at,
potential,
interest,
lead_id,
(integrated_at is not NULL) ? ToDate(integrated_at,'yyyy-MM-dd HH:mm:ss') : AS integrated_at,
starred,
channel_input_type,
rg
;
Any help will be really appreciated.
Thanks !

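The ternary (bincond) expression needs a small correction: the else branch after the colon is missing, and the whole expression has to be wrapped in parentheses before AS. Please use the modified code below: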
transactions_edited = FOREACH transactions GENERATE
id,
code,
user_id,
visit_code,
channel_id,
transaction_type,
product_category,
product_subcategory,
specific_id,
specific_type,
email,
cpf,
name,
last_name,
gender,
birth_date,
phone_code,
phone,
additional_phone_code,
additional_phone,
zip_code,
monthly_income,
status,
opportunity_status,
ToDate(created_at,'yyyy-MM-dd HH:mm:ss') AS created_at,
ToDate(updated_at,'yyyy-MM-dd HH:mm:ss') AS updated_at,
old_status,
old_masked_id,
address_type,
address,
address_number,
address_complement,
neighborhood,
city,
state,
landing_path,
referrer,
source,
source_advertising,
keyword,
ad_id,
ad_name,
ad_network,
ad_placement,
ad_device,
cpf_restriction,
mother_name,
registration_form_closed,
(opportunity_status_updated_at is not NULL ? ToDate(opportunity_status_updated_at,'yyyy-MM-dd HH:mm:ss') : opportunity_status_updated_at) AS opportunity_status_updated_at,
potential,
interest,
lead_id,
(integrated_at is not NULL ? ToDate(integrated_at,'yyyy-MM-dd HH:mm:ss') : integrated_at) AS integrated_at,
starred,
channel_input_type,
rg
;

Related

How to pass all in between dates from date range in "IN" operator for Oracle Pivot?

What I want is to pass a complete range of dates in the PIVOT "IN" clause. But what I am doing is using only the values that I am getting from the database.
For example:
Suppose the user selects the From date as '10/10/2015' and the To date as '10/15/2015'; then I want to use all the values (10/10/2015, 10/11/2015, 10/12/2015, 10/13/2015, 10/14/2015, 10/15/2015).
But what my query actually uses is ('10/10/2015','10/15/2015').
SELECT * FROM
(
SELECT
Employee.M_NAME AS MANAGER_NAME,
Employee.PHONE,
Employee.JOB_ID,
Employee.Assigned_DATE,
Employee.SHIFT,
Employee.Dept as Assignment,
Employee.E_ID,
Employee.NAME AS EMP_NAME,
Employee.DEPT_COLOR
FROM Employee
WHERE Assigned_Date BETWEEN
TO_DATE('FD_Selected','MM/DD/YYYY') AND
TO_DATE('TD_Selected','MM/DD/YYYY')
ORDER by Employee.E_ID
) x
PIVOT
(
min(Assignment)
FOR Assigned_Date IN (TO_DATE('FD_Selected','YYYY-MM-DD'),
TO_DATE('TD_Selected','YYYY-MM-DD')
)
) p
Now the data is coming like this:
M_NAME PHONE JOB_ID Assigned_DATE SHIFT Assignment E_ID EMP_NAME DEPT_COLOR '10/10/2015' '10/15/2015'
But I want like this:
M_NAME PHONE JOB_ID Assigned_DATE SHIFT Assignment E_ID EMP_NAME DEPT_COLOR '10/10/2015' '10/11/2015' '10/12/2015' '10/13/2015' '10/14/2015' '10/15/2015'
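
As far as I know, Oracle's PIVOT ... IN (...) list only takes a fixed set of constant expressions (a subquery is accepted only with PIVOT XML). One common workaround is to build the list in the calling code and splice it into the statement text. Below is a minimal sketch of that idea in Python. The helper name pivot_in_list and the choice of column aliases are my own assumptions, not something from the question:

```python
from datetime import date, timedelta


def pivot_in_list(start: date, end: date) -> str:
    """Build the body of a PIVOT ... IN (...) clause, one entry per day.

    Hypothetical helper: the name and the alias format are illustrative.
    """
    if end < start:
        raise ValueError("end date must not be before start date")
    entries = []
    for offset in range((end - start).days + 1):
        label = (start + timedelta(days=offset)).strftime("%m/%d/%Y")
        # One constant expression per day, aliased so the pivoted
        # column is named after the date itself.
        entries.append(f"TO_DATE('{label}','MM/DD/YYYY') AS \"{label}\"")
    return ",\n".join(entries)


in_list = pivot_in_list(date(2015, 10, 10), date(2015, 10, 15))
print(in_list)
```

The generated text would replace the two TO_DATE(...) entries inside the IN (...) parentheses. Because the values come from date objects rather than raw user strings, nothing user-typed is concatenated into the SQL.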

Codeigniter IF inside $this->db->select

I have this CodeIgniter code:
$this->db->select("items.name as name, items.category as category, items.supplier_id as supplier_id, items.item_number as item_number,
items.product_id as product_id, items.description as description,
items.size as size, items.tax_included as tax_included, items.cost_price as cost_price,
if(price_tiers.name='".$hasilsplit1."',items_tier_prices.unit_price,items.unit_price) as unit_price, items.promo_price as promo_price,
items.start_date as start_date, items.end_date as end_date, items.reorder_level as reorder_level, items.item_id as item_id, items.allow_alt_description as allow_alt_description, items.is_serialized as is_serialized, items.image_id as image_id, items.override_default_tax as override_default_tax, items.is_service as is_service, items.deleted as deleted");
$this->db->from('items');
$this->db->join('items_tier_prices','items_tier_prices.item_id=items.item_id','left');
$this->db->join("price_tiers","price_tiers.id=items_tier_prices.tier_id and price_tiers.name='".$hasilsplit1."'","left");
$this->db->where('items.item_id', $hasilsplit0);
$this->db->where('items.deleted', 0);
It generates an error on this part:
if(price_tiers.name='".$hasilsplit1."',items_tier_prices.unit_price,items.unit_price) as unit_price
The error is:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'as unit_price, items.promo_price as promo_price, items.start_date as sta' at line 1
How do I write the IF syntax correctly inside $this->db->select?
When I var_dump the query, it shows:
SELECT items.name as name, items.category as category, items.supplier_id as supplier_id, items.item_number as item_number, items.product_id as product_id, items.description as description, items.size as size, items.tax_included as tax_included, items.cost_price as cost_price, if(price_tiers.name='Jendela Swing Double J106', items_tier_prices.unit_price, items.unit_price) as unit_price, items.promo_price as promo_price, items.start_date as start_date, items.end_date as end_date, items.reorder_level as reorder_level, items.item_id as item_id, items.allow_alt_description as allow_alt_description, items.is_serialized as is_serialized, items.image_id as image_id, items.override_default_tax as override_default_tax, items.is_service as is_service, items.deleted as deleted FROM (items) LEFT JOIN items_tier_prices ON items_tier_prices.item_id=items.item_id LEFT JOIN price_tiers ON price_tiers.id=items_tier_prices.tier_id and price_tiers.name='Jendela Swing Double J106' WHERE items.item_id = '1' AND items.deleted = 0
It seems the IF block is picking up a backtick (`) character.
Thanks
I think you need to check the manual and try disabling escaping when building the SELECT:
$this->db->select() accepts an optional second parameter. If you set it to FALSE, CodeIgniter will not try to protect your field or table names. This is useful if you need a compound select statement where automatic escaping of fields may break them.
My second recommendation is to escape any raw data that you concatenate into the query, for example:
$this->db->escape($someString); // instead of using $someString when concatenating string for query.

Custom Reporting Table Query Boolean Column

I am new to Kentico and really enjoy developing so far! I have exhausted all search efforts and thought I'd reach out to the community. I am creating a custom report (table to be exact) using the reporting built into Kentico. I have a custom query:
Select FirstName as [First Name], LastName as [Last Name], Email, Phone, StreetAddress as [Street Address], City, State, Country, Zip, Email, Phone, PaymentDate as [Payment Date], TransactionID as [Transaction ID], PaymentStatus as [Payment Status]
from TableName E
WHERE E.ID = 1 AND E.PaymentStatus = False
ORDER BY E.ItemCreatedWhen ASC
The issue I find is that PaymentStatus comes through as a checkbox (unchecked or checked) instead of True or False. In the actual table and data it shows True/False. Is there any way around this? Thanks for your help!
I get the same checkbox. You can get a text value instead by wrapping your boolean (bit) field in a CASE expression. Your report table query would look like this:
SELECT FirstName as [First Name], LastName as [Last Name], Email, Phone, StreetAddress as [Street Address], City, State, Country, Zip, Email, Phone, PaymentDate as [Payment Date], TransactionID as [Transaction ID],
CASE WHEN PaymentStatus = 0 THEN 'False'
ELSE 'True'
END as [Payment Status]
FROM TableName E
WHERE E.ID = 1
AND E.PaymentStatus = 0
ORDER BY E.ItemCreatedWhen ASC
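
To see the CASE mapping in isolation, here is a small runnable sketch using Python's built-in sqlite3 module as a stand-in for the real database. The payments table and its two rows are made up for illustration, and SQLite stores the bit as a plain integer, so treat this only as a demonstration of the expression itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the real one; 0/1 plays the role of a bit.
conn.execute("CREATE TABLE payments (FirstName TEXT, PaymentStatus INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [("Ana", 0), ("Ben", 1)],
)

rows = conn.execute(
    """
    SELECT FirstName,
           CASE WHEN PaymentStatus = 0 THEN 'False'
                ELSE 'True'
           END AS [Payment Status]
    FROM payments
    ORDER BY FirstName
    """
).fetchall()
print(rows)
```

One thing to keep in mind with this shape of CASE: a NULL status falls into the ELSE branch and would be reported as 'True', so add an explicit WHEN ... IS NULL branch if the column is nullable.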

Not a Group By Expression, with own declared functions

OK, so I know that when using aggregate functions such as MAX, MIN, AVG and so forth in a SELECT statement, you need a GROUP BY clause listing all the selected columns that DON'T use an aggregate function. Example:
SELECT name, MAX(age)
FROM person
GROUP BY name
But my issue arises when I use my own functions for certain columns together with an aggregate function in my SELECT statement. Example:
SELECT f_fullname(name, surname) as fullname, max(age)
FROM person
Should I add the whole function call to the GROUP BY clause?
GROUP BY f_fullname(name, surname)
Because at the moment I get the ORA-00979 "not a GROUP BY expression" error.
Thanks for your help!
PS: the SELECT statements are just for explanation purposes.
You can group by either the whole function call or the columns that are its parameters.
select f_fullname(name , surname) full_name, max (age)
from person
group by name, surname;
or
select f_fullname(name , surname) full_name, max (age)
from person
group by f_fullname(name , surname);
Here is a sqlfiddle demo
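
If you want to try both variants locally, the sketch below runs them against an in-memory SQLite database from Python. The Python lambda registered as f_fullname is a hypothetical stand-in for the PL/SQL function, and the three rows are invented. Note that SQLite is more lenient than Oracle about GROUP BY, so this shows that the two forms return the same groups, not the ORA-00979 error itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the user-defined PL/SQL function f_fullname.
conn.create_function("f_fullname", 2, lambda name, surname: f"{name} {surname}")

conn.execute("CREATE TABLE person (name TEXT, surname TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Ann", "Lee", 30), ("Ann", "Lee", 41), ("Bob", "Ray", 25)],
)

# Variant 1: group by the columns that feed the function.
by_columns = conn.execute(
    "SELECT f_fullname(name, surname) AS full_name, MAX(age) "
    "FROM person GROUP BY name, surname ORDER BY full_name"
).fetchall()

# Variant 2: group by the function call itself.
by_function = conn.execute(
    "SELECT f_fullname(name, surname) AS full_name, MAX(age) "
    "FROM person GROUP BY f_fullname(name, surname) ORDER BY full_name"
).fetchall()

print(by_columns)
print(by_function)
```

The two variants agree here because the function maps each distinct (name, surname) pair to a distinct result. If a function can map different inputs to the same output, grouping by the function call merges those rows while grouping by the columns keeps them apart.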

Temporary tables in Packages - Oracle

I am kind of new in Oracle.
I am trying to create a package that has several functions.
This is the pseudocode of what I want to do
function FunctionA(UserID, startdate, enddate)
/* Select TransactionDate, Amount
from TableA
where TransactionDate between startdate and enddate
and TableA.UserID = UserID */
Return TransactionDate, Amount
end FunctionA
function FunctionB(UserID, startdate, enddate)
/* Select TransactionDate, Amount
from TableB
where TransactionDate between startdate and enddate
and TableB.UserID = UserID */
Return TransactionDate, Amount
end FunctionB
TYPE TRANSACTION_REC IS RECORD(
TransactionDate DATE,
TransactionAmt NUMBER);
function MainFunction(startdate, enddate)
return TBL
is
vTrans TRANSACTION_REC;
begin
FOR rec IN
( Select UserID, UserName, UserStatus
from UserTable
where EntryDate between startdate and enddate )
LOOP
vTrans := FunctionA(rec.UserID, startdate, enddate);
if vTrans.TransactionDate is null then
vTrans := FunctionB(rec.UserID, startdate, enddate);
if vTrans.TransactionDate is null then
rec.UserStatus := 'Inactive';
end if;
end if;
PIPE ROW(USER_OBJ_TYPE(rec.UserID,
rec.UserName,
rec.UserStatus,
vTrans.TransactionDate,
vTrans.TransactionAmt));
END LOOP;
end MainFunction
Running this kind of code takes a long time because TableA and TableB are very large tables, and I am only getting 1 entry per record from them.
I would want to create a temporary table (TempTableA, TempTableB) within the package that will temporarily store all records based on the startdate and enddate, so that when I try to retrieve the TransactionDate and Amount for each rec, I will only refer to the TempTables (which is smaller than TableA and TableB).
I also want to take into consideration if the UserID is not found in TableA and TableB. So basically, when there are no records found in TableA and TableB, I also want the entry in the output, but it is indicated that the user is inactive.
Thank you for all your help.
SQL is a set-based language. It is far more efficient to execute one statement which returns all the rows you need than to execute many statements which each return a single row.
Here is one way of getting all your rows at once. It uses a common table expression because you read the whole of the UserTable and you should only do that once.
with cte as
(select UserID
, UserStatus
from UserTable )
select cte.UserID
, cte.UserStatus
, TableA.TransactionDate
, TableA.Amount
from cte join TableA
on (cte.UserID = TableA.UserID)
where cte.UserStatus = 'A'
and TableA.TransactionDate between startdate and enddate
union
select cte.UserID
, cte.UserStatus
, TableB.TransactionDate
, TableB.Amount
from cte join TableB
on (cte.UserID = TableB.UserID)
where cte.UserStatus != 'A'
and TableB.TransactionDate between startdate and enddate
By the way, be careful with temporary tables. They aren't like temporary tables in T-SQL. They are permanent heap tables, it's just their data that's temporary. This means that populating a temporary table is an expensive process, because the database writes all those rows to disk. Consequently we need to be certain that the performance gain we get by reading a dataset from a temporary table is worth the overhead of all those writes.
That certainly would not be the case with your code. In fact, it is really pretty rare that the answer to a performance question turns out to be "Use a Global Temporary Table", at least not in Oracle. Better queries are the way to go, and in particular, embracing the Joy of Sets!
Probably better to do it in one query, e.g.:
Select UserTable.UserID, UserTable.UserName, UserTable.UserStatus
,TableA.TransactionDate AS ATransactionDate
,TableA.Amount AS AAmount
,TableB.TransactionDate AS BTransactionDate
,TableB.Amount AS BAmount
from UserTable
left join TableA
on (UserTable.UserID = TableA.UserID)
left join TableB
on (UserTable.UserID = TableB.UserID)
where UserTable.EntryDate between startdate and enddate
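
Building on that single-query idea, here is a runnable sketch of the fallback logic from the question (TableA first, then TableB, otherwise mark the user as inactive). It uses Python's sqlite3 module with made-up rows, so take it as an illustration of the join shape rather than tuned Oracle code. It assumes at most one matching transaction per user per table, stores dates as ISO text, and leaves out the EntryDate filter for brevity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE UserTable (UserID INTEGER, UserName TEXT, UserStatus TEXT);
    CREATE TABLE TableA (UserID INTEGER, TransactionDate TEXT, Amount REAL);
    CREATE TABLE TableB (UserID INTEGER, TransactionDate TEXT, Amount REAL);
    INSERT INTO UserTable VALUES (1, 'ann', 'Active'),
                                 (2, 'bob', 'Active'),
                                 (3, 'cy',  'Active');
    INSERT INTO TableA VALUES (1, '2015-10-10', 50.0),
                              (3, '2014-01-01', 10.0);  -- outside the range
    INSERT INTO TableB VALUES (2, '2015-10-12', 75.0);
    """
)

params = {"startdate": "2015-10-01", "enddate": "2015-10-31"}

# The date filters live in the ON clauses so that users without a matching
# transaction are still returned by the outer joins.
rows = conn.execute(
    """
    SELECT u.UserID,
           u.UserName,
           CASE WHEN a.UserID IS NULL AND b.UserID IS NULL
                THEN 'Inactive' ELSE u.UserStatus
           END AS UserStatus,
           COALESCE(a.TransactionDate, b.TransactionDate) AS TransactionDate,
           COALESCE(a.Amount, b.Amount) AS Amount
    FROM UserTable u
    LEFT JOIN TableA a
           ON a.UserID = u.UserID
          AND a.TransactionDate BETWEEN :startdate AND :enddate
    LEFT JOIN TableB b
           ON b.UserID = u.UserID
          AND b.TransactionDate BETWEEN :startdate AND :enddate
    ORDER BY u.UserID
    """,
    params,
).fetchall()

for row in rows:
    print(row)
```

COALESCE prefers the TableA values and falls back to TableB, which mirrors the FunctionA-then-FunctionB order in the pseudocode. If a user can have several transactions in the range, the joins will return one row per combination, so you would need to aggregate or pick a single row first.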
