hive UDF - convert StringObjectInspector to String


I am writing a generic UDF. If I use the UDF directly it works; however, if I use it together with another function (distinct, max, min), the evaluate function is not even called.
I want to see what is happening, so I am trying to log the values. For that I need to understand how to convert a StringObjectInspector to a String.
Code
@Description(name = "Decrypt", value = "Decrypt the Given Column", extended = "SELECT Decrypt('Hello World!');")
public class Decrypt extends GenericUDF {
Logger logger = Logger.getLogger(getClass().getName());
PrimitiveObjectInspector col;
StringObjectInspector databaseName;
StringObjectInspector schemaName;
StringObjectInspector tableName;
StringObjectInspector colName;
@Override
public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
System.out.println("****************************** initialize called ******************************");
logger.info("****************************** initialize called ******************************");
if (arguments.length != 5) {
throw new UDFArgumentLengthException("Decrypt only takes 5 arguments: T, String, String, String, String");
}
ObjectInspector colObject = arguments[0];
ObjectInspector databaseNameObject = arguments[1];
ObjectInspector schemaNameObject = arguments[2];
ObjectInspector tableNameObject = arguments[3];
ObjectInspector colNameNameObject = arguments[4];
if ( !(databaseNameObject instanceof StringObjectInspector) ||
!(schemaNameObject instanceof StringObjectInspector) ||
!(tableNameObject instanceof StringObjectInspector) ||
!(colNameNameObject instanceof StringObjectInspector)
) {
throw new UDFArgumentException("Error: databaseName, schemaName, tableName and colName should be String");
}
this.col = (PrimitiveObjectInspector) colObject;
this.databaseName = (StringObjectInspector) databaseNameObject;
this.tableName = (StringObjectInspector) tableNameObject;
this.schemaName = (StringObjectInspector) schemaNameObject;
this.colName = (StringObjectInspector) colNameNameObject;
logger.info("****************************** initialize end ******************************");
logger.info(col.toString());
logger.info(col);
logger.info(databaseNameObject.toString());
logger.info(databaseNameObject);
logger.info(colName.toString());
logger.info(colName);
logger.info(colNameNameObject);
logger.info(colNameNameObject.toString());
// evaluate() returns a Text, so the writable string inspector is the matching return type
return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
}
@Override
public Object evaluate(DeferredObject[] deferredObjects) throws HiveException {
System.out.println("******************** Decrypt ********************");
logger.info("******************** Decrypt ******************** ");
if(col.getPrimitiveJavaObject(deferredObjects[0].get()) == null){
return null;
}
String stringToDecrypt = col.getPrimitiveJavaObject(deferredObjects[0].get()).toString();
String database = databaseName.getPrimitiveJavaObject(deferredObjects[1].get());
String schema = schemaName.getPrimitiveJavaObject(deferredObjects[2].get());
String table = tableName.getPrimitiveJavaObject(deferredObjects[3].get());
String col = colName.getPrimitiveJavaObject(deferredObjects[4].get());
return new Text(AES.decrypt(stringToDecrypt, database, schema, table, col));
}
@Override
public String getDisplayString(String[] strings) {
return null;
}
}

Try the getPrimitiveJavaObject method instead of toString.
Another thought on your problem, check out optimization flags:
vectorization: hive.vectorized.execution, hive.vectorized.execution.enabled, hive.vectorized.execution.reduce.groupby.enabled
Cost-Based Optimization: hive.cbo.enable
predicate push down: hive.optimize.ppd
Check whether these flags are enabled or disabled by typing set <option> (e.g., set hive.optimize.ppd;) in the Hive shell, and try switching the value.

Related

How to read numeric value from excel file using spring batch excel

I am reading values from an .xlsx file using spring-batch-excel and POI. Numeric values are printed in a different format from the original values in the .xlsx file.
Please suggest how to print the values exactly as they appear in the .xlsx file. The details are below.
In my Excel values are as follows
The values are printing as below
My code is as below
public ItemReader<DataObject> fileItemReader(InputStream inputStream){
PoiItemReader<DataObject> reader = new PoiItemReader<DataObject>();
reader.setLinesToSkip(1);
reader.setResource(new InputStreamResource(inputStream));
reader.setRowMapper(excelRowMapper());
reader.open(new ExecutionContext());
return reader;
}
private RowMapper<DataObject> excelRowMapper() {
return new MyRowMapper();
}
public class MyRowMapper implements RowMapper<DataObject> {
@Override
public DataObject mapRow(RowSet rowSet) throws Exception {
DataObject dataObj = new DataObject();
dataObj.setFieldOne(rowSet.getColumnValue(0));
dataObj.setFieldTwo(rowSet.getColumnValue(1));
dataObj.setFieldThree(rowSet.getColumnValue(2));
dataObj.setFieldFour(rowSet.getColumnValue(3));
return dataObj;
}
}

I had this same problem, and its root is the class org.springframework.batch.item.excel.poi.PoiSheet inside PoiItemReader.
The problem happens in the method public String[] getRow(final int rowNumber), where it gets an org.apache.poi.ss.usermodel.Row object and converts it to an array of Strings after detecting the type of each column in the row. In this method, we have the code:
switch (cellType) {
case NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
Date date = cell.getDateCellValue();
cells.add(String.valueOf(date.getTime()));
} else {
cells.add(String.valueOf(cell.getNumericCellValue()));
}
break;
case BOOLEAN:
cells.add(String.valueOf(cell.getBooleanCellValue()));
break;
case STRING:
case BLANK:
cells.add(cell.getStringCellValue());
break;
case ERROR:
cells.add(FormulaError.forInt(cell.getErrorCellValue()).getString());
break;
default:
throw new IllegalArgumentException("Cannot handle cells of type '" + cell.getCellTypeEnum() + "'");
}
Here a cell identified as NUMERIC is handled by cells.add(String.valueOf(cell.getNumericCellValue())). The cell value is read as a double (cell.getNumericCellValue()) and that double is converted to a String (String.valueOf()). The problem lies in String.valueOf(), which produces scientific notation if the number is too big (>=10000000) or too small (<0.001), and appends ".0" to integer values.
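The behaviour of String.valueOf() on doubles is easy to reproduce with the JDK alone. A minimal sketch (the class name ValueOfDemo, the method name and the sample values are illustrative, not taken from the question):

```java
// Reproduces the String.valueOf(double) formatting described above,
// using only the JDK (no POI or Spring Batch required).
final class ValueOfDemo {

    // The same conversion that is applied to a NUMERIC cell value.
    static String asPoiSheetWould(double cellValue) {
        return String.valueOf(cellValue);
    }

    public static void main(String[] args) {
        System.out.println(asPoiSheetWould(42));       // 42.0
        System.out.println(asPoiSheetWould(12345678)); // 1.2345678E7
        System.out.println(asPoiSheetWould(0.0001));   // 1.0E-4
        System.out.println(asPoiSheetWould(2.5));      // 2.5
    }
}
```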
As an alternative to the line cells.add(String.valueOf(cell.getNumericCellValue())), you could use
DataFormatter formatter = new DataFormatter();
cells.add(formatter.formatCellValue(cell));
which returns the cell values as Strings, exactly as they are displayed. However, this also means that your decimal numbers will be locale dependent (you'll receive the string "2.5" from a document saved on an Excel configured for UK or India and the string "2,5" from France or Brazil).
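DataFormatter needs POI on the classpath, but the kind of locale dependence described above can be seen with the JDK's own java.text.NumberFormat. This is only an analogy, not a statement about DataFormatter's internals; the class name LocaleSeparatorDemo is made up for the example:

```java
import java.text.NumberFormat;
import java.util.Locale;

// Shows how the decimal separator changes with the locale used for formatting.
final class LocaleSeparatorDemo {

    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(2.5, Locale.UK));                        // 2.5
        System.out.println(format(2.5, Locale.FRANCE));                    // 2,5
        System.out.println(format(2.5, Locale.forLanguageTag("pt-BR")));   // 2,5
    }
}
```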
To avoid this dependency, we can use the solution presented on https://stackoverflow.com/a/25307973/9184574:
DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
df.setMaximumFractionDigits(340);
cells.add(df.format(cell.getNumericCellValue()));
That converts the cell to a double and then formats it with the English pattern, without scientific notation and without adding ".0" to integers.
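The effect of this DecimalFormat setup can be checked in isolation. A small sketch using only the JDK; the class and method names (PlainNumberDemo, toPlainString) are hypothetical:

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Formats a double with English symbols, no scientific notation and no trailing ".0".
final class PlainNumberDemo {

    static String toPlainString(double value) {
        // DecimalFormat is not thread-safe, so a new instance is created per call.
        DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
        df.setMaximumFractionDigits(340); // 340 is the most fraction digits DecimalFormat keeps for a double
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(toPlainString(42));       // 42
        System.out.println(toPlainString(12345678)); // 12345678
        System.out.println(toPlainString(0.0001));   // 0.0001
        System.out.println(toPlainString(2.5));      // 2.5
    }
}
```

Creating the formatter on every call is the simplest safe choice; if that turns out to be too slow for large sheets, one instance per reader thread would be an alternative.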
My implementation of the CustomPoiSheet (small adaptation on original PoiSheet) was:
class CustomPoiSheet implements Sheet {
protected final org.apache.poi.ss.usermodel.Sheet delegate;
private final int numberOfRows;
private final String name;
private FormulaEvaluator evaluator;
/**
* Constructor which takes the delegate sheet.
*
* @param delegate the apache POI sheet
*/
CustomPoiSheet(final org.apache.poi.ss.usermodel.Sheet delegate) {
super();
this.delegate = delegate;
this.numberOfRows = this.delegate.getLastRowNum() + 1;
this.name=this.delegate.getSheetName();
}
/**
* {@inheritDoc}
*/
@Override
public int getNumberOfRows() {
return this.numberOfRows;
}
/**
* {@inheritDoc}
*/
@Override
public String getName() {
return this.name;
}
/**
* {@inheritDoc}
*/
@Override
public String[] getRow(final int rowNumber) {
final Row row = this.delegate.getRow(rowNumber);
if (row == null) {
return null;
}
final List<String> cells = new LinkedList<>();
final int numberOfColumns = row.getLastCellNum();
for (int i = 0; i < numberOfColumns; i++) {
Cell cell = row.getCell(i);
CellType cellType = cell.getCellType();
if (cellType == CellType.FORMULA) {
FormulaEvaluator evaluator = getFormulaEvaluator();
if (evaluator == null) {
cells.add(cell.getCellFormula());
} else {
cellType = evaluator.evaluateFormulaCell(cell);
}
}
switch (cellType) {
case NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
Date date = cell.getDateCellValue();
cells.add(String.valueOf(date.getTime()));
} else {
// Returns the numeric value as close as possible to the stored value and the displayed string, formatted only with English symbols.
// The result is an integer string (no decimal places) if the value is an integer, and otherwise
// the double as a string without trailing zeros. It also suppresses scientific notation.
// Credit to https://stackoverflow.com/a/25307973/9184574
DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
df.setMaximumFractionDigits(340);
cells.add(df.format(cell.getNumericCellValue()));
//DataFormatter formatter = new DataFormatter();
//cells.add(formatter.formatCellValue(cell));
//cells.add(String.valueOf(cell.getNumericCellValue()));
}
break;
case BOOLEAN:
cells.add(String.valueOf(cell.getBooleanCellValue()));
break;
case STRING:
case BLANK:
cells.add(cell.getStringCellValue());
break;
case ERROR:
cells.add(FormulaError.forInt(cell.getErrorCellValue()).getString());
break;
default:
throw new IllegalArgumentException("Cannot handle cells of type '" + cell.getCellTypeEnum() + "'");
}
}
return cells.toArray(new String[0]);
}
private FormulaEvaluator getFormulaEvaluator() {
if (this.evaluator == null) {
this.evaluator = delegate.getWorkbook().getCreationHelper().createFormulaEvaluator();
}
return this.evaluator;
}
}
And my implementation of CustomPoiItemReader (small adaptation on original PoiItemReader) calling CustomPoiSheet:
public class CustomPoiItemReader<T> extends AbstractExcelItemReader<T> {
private Workbook workbook;
@Override
protected Sheet getSheet(final int sheet) {
return new CustomPoiSheet(this.workbook.getSheetAt(sheet));
}
public CustomPoiItemReader(){
super();
}
@Override
protected int getNumberOfSheets() {
return this.workbook.getNumberOfSheets();
}
@Override
protected void doClose() throws Exception {
super.doClose();
if (this.workbook != null) {
this.workbook.close();
}
this.workbook=null;
}
/**
* Open the underlying file using the {@code WorkbookFactory}. We keep track of the used {@code InputStream} so that
* it can be closed cleanly on the end of reading the file. This to be able to release the resources used by
* Apache POI.
*
* @param inputStream the {@code InputStream} pointing to the Excel file.
* @throws Exception is thrown for any errors.
*/
@Override
protected void openExcelFile(final InputStream inputStream) throws Exception {
this.workbook = WorkbookFactory.create(inputStream);
this.workbook.setMissingCellPolicy(Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);
}
}

Just change your code like this when reading data from Excel:
dataObj.setField(Float.valueOf(rowSet.getColumnValue(idx)).intValue());
This only works for columns A, B and C.

Hive UDF - Generic UDF for all Primitive Type

I am trying to implement a Hive UDF with parameters, so I am extending the GenericUDF class.
The problem is that my UDF works fine on the String data type but throws an error when I run it on other data types. I want the UDF to run regardless of data type.
Could someone please let me know what is wrong with the following code?
@Description(name = "Encrypt", value = "Encrypt the Given Column", extended = "SELECT Encrypt('Hello World!', 'Key');")
public class Encrypt extends GenericUDF {
StringObjectInspector key;
StringObjectInspector col;
@Override
public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
if (arguments.length != 2) {
throw new UDFArgumentLengthException("Encrypt only takes 2 arguments: T, String");
}
ObjectInspector keyObject = arguments[1];
ObjectInspector colObject = arguments[0];
if (!(keyObject instanceof StringObjectInspector)) {
throw new UDFArgumentException("Error: Key Type is Not String");
}
this.key = (StringObjectInspector) keyObject;
this.col = (StringObjectInspector) colObject;
return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
}
@Override
public Object evaluate(DeferredObject[] deferredObjects) throws HiveException {
String keyString = key.getPrimitiveJavaObject(deferredObjects[1].get());
String colString = col.getPrimitiveJavaObject(deferredObjects[0].get());
return AES.encrypt(colString, keyString);
}
@Override
public String getDisplayString(String[] strings) {
return null;
}
}
Error
java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector

I suggest replacing StringObjectInspector col with PrimitiveObjectInspector col, along with the corresponding cast this.col = (PrimitiveObjectInspector) colObject. Then there are two ways to proceed.
The first is to handle every possible primitive type, like this:
switch (col.getPrimitiveCategory()) {
case BYTE:
case SHORT:
case INT:
case LONG:
case TIMESTAMP:
// convert the value via a long
break;
case FLOAT:
case DOUBLE:
// convert the value via a double
break;
case STRING:
// already a string, nothing to convert
break;
case DECIMAL:
case BOOLEAN:
throw new UDFArgumentTypeException(0, "Unsupported yet");
default:
throw new UDFArgumentTypeException(0, "Unsupported type");
}
The other way is to use the PrimitiveObjectInspectorUtils.getString method, which takes the raw value together with its inspector:
Object colValue = deferredObjects[0].get();
String colString = PrimitiveObjectInspectorUtils.getString(colValue, col);
These are just pseudocode-like examples. Hope it helps.

Custom fields with FormBuilder in the Microsoft Bot Framework - not working

I tried this solution: Custom fields with FormBuilder in the Microsoft Bot Framework
But I failed to get it working. The problem I encountered is that when I assign base.Form = value, the _prompt in the _field gets a default recognizer, and it is not overridden by the SetRecognizer call on the next line, which only replaces the _field's recognizer.
However, the matching process seems to use the _prompt's recognizer internally (?).
Here is my code:
public class LuisIntentRecognizer<T> : RecognizePrimitive<T>
where T : class
{
public LuisIntentRecognizer(IField<T> field, string luisModelID, string luisSubscriptionKey)
: base(field)
{
_luisModelID = luisModelID;
_luisSubscriptionKey = luisSubscriptionKey;
}
public override DescribeAttribute ValueDescription(object value)
{
return new DescribeAttribute((string)value);
}
public override IEnumerable<string> ValidInputs(object value)
{
yield return (string)value;
}
public override TermMatch Parse(string input)
{
TermMatch result = null;
if (!string.IsNullOrWhiteSpace(input))
{
var luisModel = new LuisModelAttribute(_luisModelID, _luisSubscriptionKey);
var luisService = new LuisService(luisModel);
var luisResult = luisService.QueryAsync(input).Result; // TODO refactor somehow to async
var winner = luisResult.Intents.MaxBy(i => i.Score ?? 0d);
if (winner != null && !string.IsNullOrEmpty(winner.Intent))
{
result = new TermMatch(0, winner.Intent.Length, 0.0, winner.Intent);
}
else
{
result = new TermMatch(0, input.Length, 0.0, input);
}
}
return result;
}
public override string Help(T state, object defaultValue)
{
var prompt = new Prompter<T>(_field.Template(TemplateUsage.StringHelp), _field.Form, null);
var args = HelpArgs(state, defaultValue);
return prompt.Prompt(state, _field.Name, args.ToArray()).Prompt;
}
private string _luisModelID;
private string _luisSubscriptionKey;
}
public class LuisIntentField<T> : FieldReflector<T>
where T : class
{
public LuisIntentField(string name, string luisModelID, string luisSubscriptionKey, bool ignoreAnnotations = false)
: base(name, ignoreAnnotations)
{
_luisModelID = luisModelID;
_luisSubscriptionKey = luisSubscriptionKey;
}
public override IForm<T> Form
{
set
{
base.Form = value;
base.SetRecognizer(new LuisIntentRecognizer<T>(this, _luisModelID, _luisSubscriptionKey));
}
}
private string _luisModelID;
private string _luisSubscriptionKey;
}
Could anyone get it working?
Thanks

It seems to be a bug in the framework indeed: https://github.com/Microsoft/BotBuilder/issues/879

Reading a file with newlines as a tuple in pig

Is it possible to change the record delimiter from newline to some other string, so as to read a file containing newlines into a single tuple in Pig?

Yes.
A = LOAD '...' USING PigStorage(',') AS (...); -- comma is the field delimiter
SET textinputformat.record.delimiter '<delimiter>'; -- record delimiter, `\n` by default; it can be changed to any delimiter

As mentioned here
You can use PigStorage
A = LOAD '/some/path/COMMA-DELIM-PREFIX*' USING PigStorage(',') AS (f1:chararray, ...);
B = LOAD '/some/path/SEMICOLON-DELIM-PREFIX*' USING PigStorage(';') AS (f1:chararray, ...);
You can even try writing load/store UDF.
There is java code example for both load and store.
Load Functions : LoadFunc abstract class has the main methods for loading data and for most use cases it would suffice to extend it. You can read more here
Example
The loader implementation in the example is a loader for text data
with line delimiter as '\n' and '\t' as default field delimiter (which
can be overridden by passing a different field delimiter in the
constructor) - this is similar to current PigStorage loader in Pig.
The implementation uses an existing Hadoop supported Inputformat -
TextInputFormat - as the underlying InputFormat.
public class SimpleTextLoader extends LoadFunc {
protected RecordReader in = null;
private byte fieldDel = '\t';
private ArrayList<Object> mProtoTuple = null;
private TupleFactory mTupleFactory = TupleFactory.getInstance();
private static final int BUFFER_SIZE = 1024;
public SimpleTextLoader() {
}
/**
* Constructs a Pig loader that uses specified character as a field delimiter.
*
* @param delimiter
* the single byte character that is used to separate fields.
* ("\t" is the default.)
*/
public SimpleTextLoader(String delimiter) {
this();
if (delimiter.length() == 1) {
this.fieldDel = (byte)delimiter.charAt(0);
} else if (delimiter.length() > 1 && delimiter.charAt(0) == '\\') {
switch (delimiter.charAt(1)) {
case 't':
this.fieldDel = (byte)'\t';
break;
case 'x':
fieldDel =
Integer.valueOf(delimiter.substring(2), 16).byteValue();
break;
case 'u':
this.fieldDel =
Integer.valueOf(delimiter.substring(2)).byteValue();
break;
default:
throw new RuntimeException("Unknown delimiter " + delimiter);
}
} else {
throw new RuntimeException("PigStorage delimeter must be a single character");
}
}
@Override
public Tuple getNext() throws IOException {
try {
boolean notDone = in.nextKeyValue();
if (!notDone) {
return null;
}
Text value = (Text) in.getCurrentValue();
byte[] buf = value.getBytes();
int len = value.getLength();
int start = 0;
for (int i = 0; i < len; i++) {
if (buf[i] == fieldDel) {
readField(buf, start, i);
start = i + 1;
}
}
// pick up the last field
readField(buf, start, len);
Tuple t = mTupleFactory.newTupleNoCopy(mProtoTuple);
mProtoTuple = null;
return t;
} catch (InterruptedException e) {
int errCode = 6018;
String errMsg = "Error while reading input";
throw new ExecException(errMsg, errCode,
PigException.REMOTE_ENVIRONMENT, e);
}
}
private void readField(byte[] buf, int start, int end) {
if (mProtoTuple == null) {
mProtoTuple = new ArrayList<Object>();
}
if (start == end) {
// NULL value
mProtoTuple.add(null);
} else {
mProtoTuple.add(new DataByteArray(buf, start, end));
}
}
@Override
public InputFormat getInputFormat() {
return new TextInputFormat();
}
@Override
public void prepareToRead(RecordReader reader, PigSplit split) {
in = reader;
}
@Override
public void setLocation(String location, Job job)
throws IOException {
FileInputFormat.setInputPaths(job, location);
}
}
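The field-splitting loop in getNext above can be tried without Hadoop or Pig on the classpath. A minimal sketch of the same idea, assuming plain Strings in place of DataByteArray; the class name FieldSplitDemo and its methods are hypothetical and not part of Pig:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Splits one record into fields on a single-byte delimiter, mirroring the loop in getNext():
// an empty field becomes null, and the last field is picked up after the loop.
final class FieldSplitDemo {

    static List<String> splitFields(byte[] buf, int len, byte fieldDel) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < len; i++) {
            if (buf[i] == fieldDel) {
                addField(fields, buf, start, i);
                start = i + 1;
            }
        }
        // pick up the last field
        addField(fields, buf, start, len);
        return fields;
    }

    private static void addField(List<String> fields, byte[] buf, int start, int end) {
        if (start == end) {
            fields.add(null); // empty field is treated as NULL
        } else {
            fields.add(new String(buf, start, end - start, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) {
        byte[] record = "a\t\tc".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitFields(record, record.length, (byte) '\t')); // [a, null, c]
    }
}
```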
Store Functions : StoreFunc abstract class has the main methods for storing data and for most use cases it should suffice to extend it
Example
The storer implementation in the example is a storer for text data
with line delimiter as '\n' and '\t' as default field delimiter (which
can be overridden by passing a different field delimiter in the
constructor) - this is similar to current PigStorage storer in Pig.
The implementation uses an existing Hadoop supported OutputFormat -
TextOutputFormat as the underlying OutputFormat.
public class SimpleTextStorer extends StoreFunc {
protected RecordWriter writer = null;
private byte fieldDel = '\t';
private static final int BUFFER_SIZE = 1024;
private static final String UTF8 = "UTF-8";
public SimpleTextStorer() {
}
public SimpleTextStorer(String delimiter) {
this();
if (delimiter.length() == 1) {
this.fieldDel = (byte)delimiter.charAt(0);
} else if (delimiter.length() > 1 && delimiter.charAt(0) == '\\') {
switch (delimiter.charAt(1)) {
case 't':
this.fieldDel = (byte)'\t';
break;
case 'x':
fieldDel =
Integer.valueOf(delimiter.substring(2), 16).byteValue();
break;
case 'u':
this.fieldDel =
Integer.valueOf(delimiter.substring(2)).byteValue();
break;
default:
throw new RuntimeException("Unknown delimiter " + delimiter);
}
} else {
throw new RuntimeException("PigStorage delimeter must be a single character");
}
}
ByteArrayOutputStream mOut = new ByteArrayOutputStream(BUFFER_SIZE);
@Override
public void putNext(Tuple f) throws IOException {
int sz = f.size();
for (int i = 0; i < sz; i++) {
Object field;
try {
field = f.get(i);
} catch (ExecException ee) {
throw ee;
}
putField(field);
if (i != sz - 1) {
mOut.write(fieldDel);
}
}
Text text = new Text(mOut.toByteArray());
try {
writer.write(null, text);
mOut.reset();
} catch (InterruptedException e) {
throw new IOException(e);
}
}
@SuppressWarnings("unchecked")
private void putField(Object field) throws IOException {
//string constants for each delimiter
String tupleBeginDelim = "(";
String tupleEndDelim = ")";
String bagBeginDelim = "{";
String bagEndDelim = "}";
String mapBeginDelim = "[";
String mapEndDelim = "]";
String fieldDelim = ",";
String mapKeyValueDelim = "#";
switch (DataType.findType(field)) {
case DataType.NULL:
break; // just leave it empty
case DataType.BOOLEAN:
mOut.write(((Boolean)field).toString().getBytes());
break;
case DataType.INTEGER:
mOut.write(((Integer)field).toString().getBytes());
break;
case DataType.LONG:
mOut.write(((Long)field).toString().getBytes());
break;
case DataType.FLOAT:
mOut.write(((Float)field).toString().getBytes());
break;
case DataType.DOUBLE:
mOut.write(((Double)field).toString().getBytes());
break;
case DataType.BYTEARRAY: {
byte[] b = ((DataByteArray)field).get();
mOut.write(b, 0, b.length);
break;
}
case DataType.CHARARRAY:
// oddly enough, writeBytes writes a string
mOut.write(((String)field).getBytes(UTF8));
break;
case DataType.MAP:
boolean mapHasNext = false;
Map<String, Object> m = (Map<String, Object>)field;
mOut.write(mapBeginDelim.getBytes(UTF8));
for(Map.Entry<String, Object> e: m.entrySet()) {
if(mapHasNext) {
mOut.write(fieldDelim.getBytes(UTF8));
} else {
mapHasNext = true;
}
putField(e.getKey());
mOut.write(mapKeyValueDelim.getBytes(UTF8));
putField(e.getValue());
}
mOut.write(mapEndDelim.getBytes(UTF8));
break;
case DataType.TUPLE:
boolean tupleHasNext = false;
Tuple t = (Tuple)field;
mOut.write(tupleBeginDelim.getBytes(UTF8));
for(int i = 0; i < t.size(); ++i) {
if(tupleHasNext) {
mOut.write(fieldDelim.getBytes(UTF8));
} else {
tupleHasNext = true;
}
try {
putField(t.get(i));
} catch (ExecException ee) {
throw ee;
}
}
mOut.write(tupleEndDelim.getBytes(UTF8));
break;
case DataType.BAG:
boolean bagHasNext = false;
mOut.write(bagBeginDelim.getBytes(UTF8));
Iterator<Tuple> tupleIter = ((DataBag)field).iterator();
while(tupleIter.hasNext()) {
if(bagHasNext) {
mOut.write(fieldDelim.getBytes(UTF8));
} else {
bagHasNext = true;
}
putField((Object)tupleIter.next());
}
mOut.write(bagEndDelim.getBytes(UTF8));
break;
default: {
int errCode = 2108;
String msg = "Could not determine data type of field: " + field;
throw new ExecException(msg, errCode, PigException.BUG);
}
}
}
@Override
public OutputFormat getOutputFormat() {
return new TextOutputFormat<WritableComparable, Text>();
}
@Override
public void prepareToWrite(RecordWriter writer) {
this.writer = writer;
}
@Override
public void setStoreLocation(String location, Job job) throws IOException {
job.getConfiguration().set("mapred.textoutputformat.separator", "");
FileOutputFormat.setOutputPath(job, new Path(location));
if (location.endsWith(".bz2")) {
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, BZip2Codec.class);
} else if (location.endsWith(".gz")) {
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
}
}
}

Pass UDT defined in a package as a parameter to stored proc in Oracle

A package is created to define a custom collection and a stored procedure that takes this custom collection as an input parameter. How do I call this procedure from C#?
Here's the package:
CREATE OR REPLACE PACKAGE pkg_name
AS
TYPE customCollectionType IS VARRAY(200) OF VARCHAR2 (1000);
PROCEDURE ProcName(p_collection IN customCollectionType);
END pkg_name;
/
CREATE OR REPLACE PACKAGE BODY pkg_name
AS
PROCEDURE StudyProc (p_StudyNum IN customCollectionType)
IS
........................
END pkg_name;
Here's factory implementation of customCollectionType:
public class PlaceHolderType : IOracleCustomType, INullable
{
[OracleArrayMapping()]
public string[] Array;
private bool m_bIsNull;
private OracleUdtStatus[] m_statusArray;
public OracleUdtStatus[] StatusArray
{
get
{
return this.m_statusArray;
}
set
{
this.m_statusArray = value;
}
}
public virtual bool IsNull
{
get
{
return m_bIsNull;
}
}
public static PlaceHolderType Null
{
get
{
PlaceHolderType p = new PlaceHolderType();
p.m_bIsNull = true;
return p;
}
}
public virtual void FromCustomObject(OracleConnection con, IntPtr pUdt)
{
OracleUdt.SetValue(con, pUdt, 0, Array, m_statusArray);
return;
}
public virtual void ToCustomObject(OracleConnection con, IntPtr pUdt)
{
object objectStatusArray = null;
Array = (string[])OracleUdt.GetValue(con, pUdt, 0, out objectStatusArray);
m_statusArray = (OracleUdtStatus[])objectStatusArray;
}
public override string ToString()
{
return string.Empty;
}
}
[OracleCustomTypeMappingAttribute("USER_NAME.PKG_NAME.CUSTOMCOLLECTIONTYPE")]
public class CUSTOMCOLLECTIONTYPE: IOracleCustomTypeFactory, IOracleArrayTypeFactory
{
// Implementation of IOracleCustomTypeFactory.CreateObject()
public IOracleCustomType CreateObject()
{
// Return a new custom object
//OracleString or;
return new PlaceHolderType();
}
#region IOracleArrayTypeFactory Members
public Array CreateArray(int numElems)
{
return new string[numElems];
}
public Array CreateStatusArray(int numElems)
{
return new OracleUdtStatus[numElems];
}
#endregion
}
Here's the call:
cmd.Connection = OracleConnectionObj;
cmd.BindByName = true;
cmd.CommandText = "PKG_NAME.PROC_NAME";
cmd.CommandType = System.Data.CommandType.StoredProcedure;
OracleParameter param1 = new OracleParameter();
Array inputValue = (new CUSTOMCOLLECTIONTYPE()).CreateArray(5);
System.Array.Copy(SomeArray, inputValue, 5);
param1.OracleDbType = OracleDbType.Array;
param1.Direction = ParameterDirection.Input;
param1.UdtTypeName = "USER_NAME.CUSTOMCOLLECTIONTYPE";
param1.Value = inputValue;
cmd.Parameters.Add(param1);
cmd.ExecuteNonQuery();
The error in .Net is:
"OCI-22303: type \"USER_NAME\".\"CUSTOMCOLLECTIONTYPE\" not found"

When you pass an object that is mapped to a UDT from C# to the procedure, the type needs to be defined at the schema level, not at the package level. So you need to execute the following command in SQL*Plus:
CREATE TYPE customCollectionType AS VARRAY(200) OF VARCHAR2 (1000);
/
and remove the declaration of customCollectionType from your package spec.
