tab1 = {key1="aaa1","aaa2",key2="bbb"}
print(tab1.key1(1)) -- fails
print(tab1.key1{1}) -- fails
print(tab1.key1.1) -- fails
I can't retrieve the second value of key1.
Note that "aaa2" is not stored under "key1": only "aaa1" is assigned to that key.
Also, do you know what tables and indices are in Lua?
Please read about them in the Lua documentation.
It's just a normal table; you can use the "next" or "pairs" functions to print all of its values.
tab1 = {key1="aaa1","aaa2",key2="bbb"}
for key, value in next, tab1 do
print(key, value)
end
-- result (iteration order is not guaranteed):
--[[
1 aaa2
key2 bbb
key1 aaa1
]]
If it's a key, just use '.key1' or '["key1"]'.
If it has no key, then by default it gets a numeric index.
E.g.:
print(tab1.key1) -- or tab1["key1"]
print(tab1[1])
print(tab1.key2) -- or tab1["key2"]
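To make the key/index distinction concrete, here is a minimal sketch that reuses the table from the question. The printed values in the comments are what plain Lua gives for this exact table.

```lua
local tab1 = {key1 = "aaa1", "aaa2", key2 = "bbb"}

-- "aaa2" has no explicit key, so it lands in the array part at index 1.
print(tab1[1])      -- aaa2
print(tab1.key1)    -- aaa1
print(tab1["key2"]) -- bbb

-- The length operator only counts the array part, not the named keys.
print(#tab1)        -- 1
```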
And if you want to store more than one value under a key, simply put another table inside that key.
tab1 = {
key1 = {'a', 'b', 'c'}
}
print(tab1.key1[1])
print(tab1.key1[2])
print(tab1.key1[3])
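Applied to the values from the original question, the nested-table approach might look like the sketch below. The layout (both "aaa" strings grouped under key1) is an assumption about what the asker intended.

```lua
local tab1 = {
  key1 = {"aaa1", "aaa2"},
  key2 = "bbb",
}

-- Second value stored under key1.
print(tab1.key1[2])  -- aaa2

-- Visit every value under key1 in order.
for i, value in ipairs(tab1.key1) do
  print(i, value)
end

-- Number of values stored under key1.
print(#tab1.key1)    -- 2
```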
Sorry if I did not explain it well enough.
My avro files contain the following column:
{"name":"my_column","type":["null",{"type":"array","items":{"type":"record","name":"my_column","namespace":"v11","fields":[{"name":"my_column","type":["null","int"],"default":null}]}}],"default":null}
I loaded the data into Vertica and stored as VARBINARY. Example:
db=> select MapToString(my_column) from tab limit 1;
MapToString
------------------------------------------------------------------------------------------------------------------------------------------------------
{
"0.__name__": "my_column",
"0.my_column": "5",
"1.__name__": "my_column",
"1.my_column": "9"
}
(1 row)
The data can actually be simplified into ARRAY[INT]. (I.e. ARRAY[5,9]).
What is the correct way of performing this transformation?
Extend Vertica via UDTF or UDParser? Perform this transformation via SQL? Something else?
EDIT: I am going to check whether a scalar UDF can be embedded in the COPY command alongside the AVROPARSER, or whether it requires extra ETL.
Thank you!
Try materialising the column - well, you need to know what to expect in a flex table ...
I had this table first:
SELECT
__id__
, REGEXP_REPLACE(MAPTOSTRING(__raw__),'\s+',' ') AS rawasstring
FROM a;
-- out __id__ | rawasstring
-- out --------+--------------------------------------------------------------------------------------------------
-- out | { "0.__name__": "my_column", "0.my_column": "5", "1.__name__": "my_column", "1.my_column": "9" }
-- out (1 row)
Then, I just added a materialised column, like so:
ALTER TABLE a ADD int_array ARRAY[int,10]
DEFAULT ARRAY[
MAPLOOKUP(__raw__, '0.my_column')
, MAPLOOKUP(__raw__, '1.my_column')
, MAPLOOKUP(__raw__, '2.my_column')
, MAPLOOKUP(__raw__, '3.my_column')
, MAPLOOKUP(__raw__, '4.my_column')
, MAPLOOKUP(__raw__, '5.my_column')
, MAPLOOKUP(__raw__, '6.my_column')
, MAPLOOKUP(__raw__, '7.my_column')
, MAPLOOKUP(__raw__, '8.my_column')
, MAPLOOKUP(__raw__, '9.my_column')
]::ARRAY[INT,10];
Keys that don't exist in the Map are NULL and not really added to the array. And now I have the array:
SELECT
__id__::VARCHAR
, REGEXP_REPLACE(MAPTOSTRING(__raw__),'\s+',' ') AS rawasstring
, int_array
FROM a ;
-- out __id__ | rawasstring | int_array
-- out --------+--------------------------------------------------------------------------------------------------+-----------
-- out | { "0.__name__": "my_column", "0.my_column": "5", "1.__name__": "my_column", "1.my_column": "9" } | [5,9]
-- out (1 row)
I want to create a function that gets the first value of a table field if two other field values match the two given function parameters.
I thought this would be easy, but I found nothing on the internet or in the M documentation that could solve this.
I don't know if I have to loop through a record or if there is a top level function.
= (val1 as text, val2 as text) as text =>
let
result = if [Field1] = val1 and [Field2] = val2 then [Field3] else ""
in
result
As far as I understand your request, the table and column names are hard-coded (i.e. you intend to apply the function only to one specific table). In that case you may use the following approach:
// table
let
t1 = #table({"Field1"}, List.Zip({{"a".."e"}})),
t2 = #table({"Field2"}, List.Zip({{"α".."ε"}})),
join = Table.Join(t1&t1,{}, t2&t2,{}),
add = Table.AddIndexColumn(join, "Field3", 0, 1)
in
add
// func
(val1 as text, val2 as text) => Table.SelectRows(table, each [Field1] = val1 and [Field2] = val2)[Field3]{0}
// result
func("d","β") //31
I need to select from Tarantool all rows that match either of two values in one space.
How can I perform a request to Tarantool like this one in MySQL?
select * from aaa where a='1a22cadbdb' or a='7f626e0123'
Right now I can make two requests:
box.space.logs:select({'1a22cadbdb'})
box.space.logs:select({'7f626e0123'})
but I don't know how to merge the results into one ;(
The following code merges the first tuple of each result into a Lua table:
a = box.space.logs:select({'1a22cadbdb'})
b = box.space.logs:select({'7f626e0123'})
c = { field_1 = a[1], field_2 = b[1] }
select returns a table of tuples (possibly empty), indexed from 1 like any Lua table, so you can extract values with [].
More details about select: http://tarantool.org/doc/book/box/box_index.html?highlight=select#lua-function.index_object.select
More details about tuple: http://tarantool.org/doc/book/box/box_tuple.html?highlight=tuple#lua-module.box.tuple
Nowadays Tarantool allows you to retrieve data via SQL, for example box.execute([[select * from "aaa" where "a"='1a22cadbdb' or "a"='7f626e0123';]]). Before doing this, you have to define the field names and types of aaa with the format() function.
For me this works fine, but you need to check the return value of the first select (it may be empty):
local res = {}
for k, v in pairs (box.space.email:select({email})[1]) do
if type(v) == 'string' then
table.insert(res, box.space.logs:select({v})[1])
end
end
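The loop above generalises to any list of keys. The sketch below is one way to write it so that keys with no match are simply skipped. select_any is a hypothetical helper name, and fake_logs is a stand-in that exists only so the snippet runs outside Tarantool. It assumes, as in the answers above, that space:select(key) returns a Lua table of matching tuples.

```lua
-- Hypothetical helper: collect the tuples found for each key into one list.
local function select_any(space, keys)
  local res = {}
  for _, key in ipairs(keys) do
    -- select is assumed to return a (possibly empty) table of tuples
    for _, tuple in ipairs(space:select({key})) do
      table.insert(res, tuple)
    end
  end
  return res
end

-- Stand-in for box.space.logs, for illustration only.
local fake_logs = {
  rows = {
    {"1a22cadbdb", "first"},
    {"7f626e0123", "second"},
  },
}
function fake_logs:select(key)
  local found = {}
  for _, row in ipairs(self.rows) do
    if row[1] == key[1] then table.insert(found, row) end
  end
  return found
end

local merged = select_any(fake_logs, {"1a22cadbdb", "7f626e0123", "missing"})
print(#merged)  -- 2
```

Inside Tarantool the call would presumably be select_any(box.space.logs, {'1a22cadbdb', '7f626e0123'}).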
Let's say I have a table like so:
{
value = 4
},
{
value = 3
},
{
value = 1
},
{
value = 2
}
and I want to iterate over this and print the value in order so the output is like so:
1
2
3
4
How do I do this? I understand how to use ipairs, pairs, and table.sort, but that only works when the entries were added with table.insert and the keys are valid. I need to loop over this in order of the value.
I tried a custom function, but it simply printed them in the wrong order.
I have tried:
Creating an index and looping that
Sorting the table (throws error: attempt to perform __lt on table and table)
And a combination of sorts, indexes and other tables that not only didn't work, but also made it very complicated.
I am well and truly stumped.
"Sorting the table"
This was the right solution.
"(throws error: attempt to perform __lt on table and table)"
Sounds like you tried to use a < b.
For Lua to be able to sort values, it has to know how to compare them. It knows how to compare numbers and strings, but by default it has no idea how to compare two tables. Consider this:
local people = {
{ name = 'fred', age = 43 },
{ name = 'ted', age = 31 },
{ name = 'ned', age = 12 },
}
If I call sort on people, how can Lua know what I intend? It doesn't know what 'age' or 'name' means or which I'd want to use for comparison. I have to tell it.
It's possible to add a metatable to a table which tells Lua what the < operator means for a table, but you can also supply sort with a callback function that tells it how to compare two objects.
You supply sort with a function that receives two values and you return whether the first is "less than" the second, using your knowledge of the tables. In the case of your tables:
table.sort(t, function(a,b) return a.value < b.value end)
for i,entry in ipairs(t) do
print(i,entry.value)
end
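Put together with the table from the question, a complete runnable version might look like this. The variable name t is just the one used in the snippet above.

```lua
local t = {
  {value = 4},
  {value = 3},
  {value = 1},
  {value = 2},
}

-- The comparator returns true when a should come before b.
table.sort(t, function(a, b) return a.value < b.value end)

-- Prints 1, 2, 3, 4 on separate lines.
for _, entry in ipairs(t) do
  print(entry.value)
end
```

Note that table.sort reorders t in place, so the original order is lost after this call.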
If you want to leave the original table unchanged, you could create a custom 'sort by value' iterator like this:
local function valueSort(a,b)
return a.value < b.value;
end
function sortByValue( tbl ) -- use as iterator
-- build new table to sort
local sorted = {};
for i,v in ipairs( tbl ) do sorted[i] = v end;
-- sort new table
table.sort( sorted, valueSort );
-- return iterator
return ipairs( sorted );
end
When sortByValue() is called, it clones tbl to a new sorted table, and then sorts the sorted table. It then hands the sorted table over to ipairs(), and ipairs outputs the iterator to be used by the for loop.
To use:
for i,v in sortByValue( myTable ) do
print(v)
end
While this ensures your original table remains unaltered, it has the downside that each time you do an iteration the iterator has to clone myTable to make a new sorted table, and then table.sort that sorted table.
If performance is vital, you can greatly speed things up by 'caching' the work done by the sortByValue() iterator. Updated code:
local resort, sorted = true;
local function valueSort(a,b)
return a.value < b.value;
end
function sortByValue( tbl ) -- use as iterator
if not sorted then -- rebuild sorted table
sorted = {};
for i,v in ipairs( tbl ) do sorted[i] = v end;
resort = true;
end
if resort then -- sort the 'sorted' table
table.sort( sorted, valueSort );
resort = false;
end
-- return iterator
return ipairs( sorted );
end
Each time you add or remove an element to/from myTable set sorted = nil. This lets the iterator know it needs to rebuild the sorted table (and also re-sort it).
Each time you update a value property within one of the nested tables, set resort = true. This lets the iterator know it has to do a table.sort.
Now, when you use the iterator, it will try and re-use the previous sorted results from the cached sorted table.
If it can't find the sorted table (eg. on first use of the iterator, or because you set sorted = nil to force a rebuild) it will rebuild it. If it sees it needs to resort (eg. on first use, or if the sorted table was rebuilt, or if you set resort = true) then it will resort the sorted table.
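One limitation of the version above is that a single cache is shared by every table passed to sortByValue. A possible variation, sketched here with hypothetical names (makeSortedView, invalidate), keeps one cache per table inside a closure. For simplicity it rebuilds and re-sorts on any change rather than tracking the two cases separately.

```lua
local function valueSort(a, b)
  return a.value < b.value
end

-- Hypothetical helper: returns an iterator factory and an invalidate
-- function, both bound to a single table.
local function makeSortedView(tbl)
  local sorted  -- cached sorted copy; nil means "rebuild needed"

  local function iterate()
    if not sorted then
      sorted = {}
      for i, v in ipairs(tbl) do sorted[i] = v end
      table.sort(sorted, valueSort)
    end
    return ipairs(sorted)
  end

  local function invalidate()
    sorted = nil
  end

  return iterate, invalidate
end

local myTable = {{value = 4}, {value = 3}, {value = 1}, {value = 2}}
local sortedPairs, invalidate = makeSortedView(myTable)

for _, v in sortedPairs() do print(v.value) end  -- 1 2 3 4

table.insert(myTable, {value = 0})
invalidate()  -- the table changed, so drop the cached copy
for _, v in sortedPairs() do print(v.value) end  -- 0 1 2 3 4
```

The original table keeps its insertion order throughout; only the cached copy is sorted.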
I have a column of strings that I load using Pig:
A
B
C
D
How do I convert this column into a single string like this?
A,B,C,D
You are going to have to first GROUP ALL to put everything into one bag, then join the contents of the bag together using a UDF. Something like this:
-- myudfs.py
-- #!/usr/bin/python
--
-- @outputSchema('concated: string')
-- def concat_bag(BAG):
--     return ','.join(BAG)
Register 'myudfs.py' using jython as myfuncs;
A = LOAD 'myfile.txt' AS (letter:chararray) ;
B = GROUP A ALL ;
C = FOREACH B GENERATE myfuncs.concat_bag(A.letter) AS all_letters ;
If your file/schema contains multiple columns, you are probably going to want to project out the column you want to generate the string for. Something like:
A0 = LOAD 'myfile.txt' AS (letter:chararray, val:int, extra:chararray) ;
A = FOREACH A0 GENERATE letter ;
This way you are not keeping around extra columns that will slow down an already expensive operation.