How can I create an array of key,value pairs within a transformation? - upsolver

I have some source JSON files that contain {key:value} pairs, for example:
{firstName: "jason", lastName: "smith"}
I would like to take this JSON and create an array of key-value pairs as follows:
[{key: "firstName", value: "jason"},{key: "lastName", value: "smith"}]
I've seen the FROM_KEY_VALUE() function in the documentation, but what I want to do is the opposite of FROM_KEY_VALUE(). Do you have any ideas?

If the number of key-value pairs is static, you can build two arrays, one of keys and one of values, and then ZIP_WITH_INDEX them together to build your desired output. I've included some pseudocode below, using the example from your question, to show how this works; in a real use case keys_array[] and values_array[] would likely come from a staging table.
SELECT
ZIP_WITH_INDEX('key,value',keys_array[],values_array[]) as output_array
FROM staging_table
LET keys_array[] = ['firstName','lastName'],
values_array[] = ['jason','smith']
WHERE $commit_time BETWEEN RUN_START_TIME() and RUN_END_TIME();
The resulting array would be:
[{"index":0, "key":"firstName", "value":"jason"}
,{"index":1, "key":"lastName", "value":"smith"}]
If the fields in the two arrays are not static, you can use the ZIP function. The ZIP function will simply merge together any number of arrays, with automatically assigned field names.
SELECT
ZIP(keys_array[],values_array[]) as output_array
FROM staging_table
LET keys_array[] = ['firstName','lastName'],
values_array[] = ['jason','smith']
WHERE $commit_time BETWEEN RUN_START_TIME() and RUN_END_TIME();
The resulting array would look as follows:
[{"field0":"firstName","field1":"jason"}
,{"field0":"lastName","field1":"smith"}]
The field names could always be remapped if needed.
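To make the two output shapes concrete, here is a minimal sketch in plain Python (not Upsolver SQL). The helper names to_key_value_pairs and zip_arrays are made up for illustration; they only mimic the shapes shown above and say nothing about how Upsolver implements ZIP or ZIP_WITH_INDEX.

```python
# Illustration only: plain Python, not Upsolver SQL.
# Both helper names are hypothetical and exist just for this sketch.
record = {"firstName": "jason", "lastName": "smith"}


def to_key_value_pairs(obj):
    """Turn {k: v, ...} into [{"key": k, "value": v}, ...] in insertion order."""
    return [{"key": k, "value": v} for k, v in obj.items()]


def zip_arrays(keys, values):
    """Mimic the ZIP-style output above, with positional names field0 and field1."""
    return [{"field0": k, "field1": v} for k, v in zip(keys, values)]


pairs = to_key_value_pairs(record)
zipped = zip_arrays(list(record.keys()), list(record.values()))
```

Remapping field0 and field1 to key and value afterwards gives the same result as the first helper.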

Related

Performance: Replacing Series values with keys from a Dictionary in Python

I have a data series that contains various names of the same organizations. I want to harmonize these names into a given standard using a mapping dictionary. I am currently using a nested for loop to iterate through each series element, and if it is within the dictionary's values, I update the series value with the dictionary key.
# For example, corporation_series is:
0 'Corp1'
1 'Corp-1'
2 'Corp 1'
3 'Corp2'
4 'Corp--2'
dtype: object
# Dictionary is:
mapping_dict = {
'Corporation_1': ['Corp1', 'Corp-1', 'Corp 1'],
'Corporation_2': ['Corp2', 'Corp--2'],
}
# I use this logic to replace the values in the series
for index, value in corporation_series.items():
    for key, list in mapping_dict.items():
        if value in list:
            corporation_series = corporation_series.replace(value, key)
So, if the series has a value of 'Corp1', and it exists in the dictionary's values, the logic replaces it with the corresponding key of corporations. However, it is an extremely expensive method. Could someone recommend me a better way of doing this operation? Much appreciated.
I found a solution by using python's .map function. In order to use .map, I had to invert my dictionary:
# Inverted dict: each name variant points to its standard name
newdict = {
    'Corp1': 'Corporation_1',
    'Corp-1': 'Corporation_1',
    'Corp 1': 'Corporation_1',
    'Corp2': 'Corporation_2',
    'Corp--2': 'Corporation_2',
}
# use .map
corporation_series = corporation_series.map(newdict)
Instead of 5 minutes of processing, it took around 5 seconds. While this works, I'm sure there are better solutions out there. Any suggestions would be most welcome.
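The inverted dictionary does not have to be typed out by hand. A minimal sketch, assuming the original mapping_dict from the question; the names alias_to_standard, names and harmonized are made up here, and a plain list stands in for the series so the example runs without pandas:

```python
# Original mapping from the question: standard name -> list of variants.
mapping_dict = {
    "Corporation_1": ["Corp1", "Corp-1", "Corp 1"],
    "Corporation_2": ["Corp2", "Corp--2"],
}

# Invert it once: every variant points at its standard name.
alias_to_standard = {
    alias: standard
    for standard, aliases in mapping_dict.items()
    for alias in aliases
}

# A plain list stands in for the series (hypothetical sample data).
names = ["Corp1", "Corp-1", "Corp 1", "Corp2", "Corp--2", "Unknown Co"]

# One dictionary lookup per element; unmatched names are kept as they are.
harmonized = [alias_to_standard.get(name, name) for name in names]
```

With a pandas Series the same dictionary can be passed to .map, as in the post above. As far as I know, .map turns values missing from the dictionary into NaN, whereas .replace with a dictionary leaves them unchanged, so the choice depends on how unmatched names should be treated.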

Better way to loop through a set of hashes assigning array values in Ruby

I am trying to insert data into Postgres. I have an array of data and I am trying to assign each column a value of the array. Here is an example.
pg_insert = ['12/09/2015', 41, 'test account', '41.0']
Table.create([date: pg_insert[0],
account_number: pg_insert[1],
account_name: pg_insert[2],
values: pg_insert[3]])
Is there a way where I can loop this so I can put i in pg_insert instead of having to type out numbers? I'm not sure how to loop inside of the create() parameter. Is there any way around this?
Any suggestions would be great thanks.
Table.create is accepting a Hash, I'm sure.
So here is what you can do:
Make an Array called keys that contains 4 symbols :date, :account_number, :account_name, and :values.
pg_insert is already an Array.
Now you can put the two Arrays together to make the Hash you need: Hash[keys.zip(pg_insert)]
This allows you to call Table.create like this: Table.create(Hash[keys.zip(pg_insert)])
Here is the finished code then:
keys = [:date, :account_number, :account_name, :values]
pg_insert = ['12/09/2015', 41, 'test account', '41.0']
Table.create(Hash[keys.zip(pg_insert)]) # or Table.create Hash[keys.zip(pg_insert)] if you don't want so many parentheses.
Note that pg_insert will always have to be in the same order as keys.
You can read more about Array#zip and Hash.[] to understand how those work. This SO link might also be helpful: Converting an array of keys and an array of values into a hash in Ruby
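For comparison, the same zip idea can be sketched in Python. This is only an illustration of pairing column names with a row by position; row_to_attributes is a made-up helper, and the length check is an extra assumption that is not part of the Ruby answer.

```python
keys = ["date", "account_number", "account_name", "values"]
pg_insert = ["12/09/2015", 41, "test account", "41.0"]


def row_to_attributes(keys, row):
    """Pair each column name with the value at the same position."""
    # Guard against a silently truncated mapping: zip stops at the shorter input.
    if len(keys) != len(row):
        raise ValueError("keys and row must have the same length")
    return dict(zip(keys, row))


attributes = row_to_attributes(keys, pg_insert)
```

As in the Ruby version, the row must be in the same order as the keys.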

Populating array (by 'name') in array of arrays

Let's say I have an array of arrays whose names I don't know; I only know that they are arrays and how many of them there are.
bigArray=[smallArrayA[], smallArrayB[]]
Now I can fetch the array(s) by index position, like:
smallA = bigArray[0]
smallA << 'input'
But what I'd like to know is the names of the arrays stored in the 'big' one..
bigArray.inspect
..just gives me:
[['input'],[]]
My problem is that the names of the smaller ones are going to be created dynamically, and I need to know their names to modify the right one later on.
Sounds like you need a hash:
bigHash = { :a => smallArrayA, :b => smallArrayB }
Now you can refer to each element of the hash by name:
bigHash[:a]

Associatively sorting a table by value in Lua

I have a key => value table I'd like to sort in Lua. The keys are all integers, but aren't consecutive (and have meaning). Lua's only sort function appears to be table.sort, which treats tables as simple arrays, discarding the original keys and their association with particular items. Instead, I'd essentially like to be able to use PHP's asort() function.
What I have:
items = {
[1004] = "foo",
[1234] = "bar",
[3188] = "baz",
[7007] = "quux",
}
What I want after the sort operation:
items = {
[1234] = "bar",
[3188] = "baz",
[1004] = "foo",
[7007] = "quux",
}
Any ideas?
Edit: Based on answers, I'm going to assume that it's simply an odd quirk of the particular embedded Lua interpreter I'm working with, but in all of my tests, pairs() always returns table items in the order in which they were added to the table. (i.e. the two above declarations would iterate differently).
Unfortunately, because that isn't normal behavior, it looks like I can't get what I need; Lua doesn't have the necessary tools built-in (of course) and the embedded environment is too limited for me to work around it.
Still, thanks for your help, all!
You seem to misunderstand something. What you have here is an associative array. Associative arrays have no explicit order on them, i.e. it's only the internal representation (usually sorted) that orders them.
In short -- in Lua, both of the arrays you posted are the same.
What you would want instead, is such a representation:
items = {
{1004, "foo"},
{1234, "bar"},
{3188, "baz"},
{7007, "quux"},
}
While you can't get them by index now (they are indexed 1, 2, 3, 4, but you can create another index array), you can sort them using table.sort.
A sorting function would be then:
function compare(a,b)
  return a[1] < b[1]
end
table.sort(items, compare)
As Kornel said, you're dealing with associative arrays, which have no guaranteed ordering.
If you want key ordering based on its associated value while also preserving associative array functionality, you can do something like this:
function getKeysSortedByValue(tbl, sortFunction)
  local keys = {}
  for key in pairs(tbl) do
    table.insert(keys, key)
  end
  table.sort(keys, function(a, b)
    return sortFunction(tbl[a], tbl[b])
  end)
  return keys
end
items = {
[1004] = "foo",
[1234] = "bar",
[3188] = "baz",
[7007] = "quux",
}
local sortedKeys = getKeysSortedByValue(items, function(a, b) return a < b end)
sortedKeys is {1234,3188,1004,7007}, and you can access your data like so:
for _, key in ipairs(sortedKeys) do
  print(key, items[key])
end
result:
1234 bar
3188 baz
1004 foo
7007 quux
Hmm, I missed the part about not being able to control the iteration.
But in Lua there is usually a way.
http://lua-users.org/wiki/OrderedAssociativeTable
That's a start. Now you would need to replace the pairs() that the library uses. That could be as simple as pairs=my_pairs. You could then use the solution in the link above.
PHP arrays are different from Lua tables.
A PHP array may have an ordered list of key-value pairs.
A Lua table always contains an unordered set of key-value pairs.
A Lua table acts as an array when a programmer chooses to use integers 1, 2, 3, ... as keys. The language syntax and standard library functions, like table.sort offer special support for tables with consecutive-integer keys.
So, if you want to emulate a PHP array, you'll have to represent it using list of key-value pairs, which is really a table of tables, but it's more helpful to think of it as a list of key-value pairs. Pass a custom "less-than" function to table.sort and you'll be all set.
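For readers more at home in Python, the "list of key-value pairs plus a custom ordering" idea can be sketched as follows. It is a comparison sketch only, not Lua code; the names pairs_by_value and keys_by_value are made up for illustration.

```python
items = {
    1004: "foo",
    1234: "bar",
    3188: "baz",
    7007: "quux",
}

# Build a list of (key, value) pairs and order it by value,
# which keeps each key attached to its value.
pairs_by_value = sorted(items.items(), key=lambda pair: pair[1])

# Alternatively, sort only the keys by their associated value and
# look the values up in the original mapping when iterating.
keys_by_value = sorted(items, key=items.get)
```

The first form corresponds to sorting a table of {key, value} tables with table.sort; the second corresponds to keeping a separate sorted list of keys next to the original table.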
N.B. Lua allows you to mix consecutive-integer keys with any other kinds of keys in the same table—and the representation is efficient. I use this feature sometimes, usually to tag an array with a few pieces of metadata.
Coming to this a few months later, with the same query. The recommended answer seemed to pinpoint the gap between what was required and how this looks in Lua, but it didn't get me exactly what I was after, which was a hash sorted by key.
The first three functions on this page DID however : http://lua-users.org/wiki/SortedIteration
I did a brief bit of Lua coding a couple of years ago but I'm no longer fluent in it.
When faced with a similar problem, I copied my array to another array with keys and values reversed, then used sort on the new array.
I wasn't aware of a possibility to sort the array using the method Kornel Kisielewicz recommends.
The proposed compare function works but only if the values in the first column are unique.
Here is a slightly enhanced compare function: if the values in the current column are equal, it compares the values in the next column instead.
Since {1234, "baam"} < {1234, "bar"} is true under this ordering, the array containing "baam" is placed before the array containing "bar".
local items = {
{1004, "foo"},
{1234, "bar"},
{1234, "baam"},
{3188, "baz"},
{7007, "quux"},
}
local function compare(a, b)
  for inx = 1, #a do
    -- move on to the next column only while the current columns are equal
    if a[inx] ~= b[inx] then
      return a[inx] < b[inx]
    end
  end
  -- all columns are equal, so a is not less than b
  return false
end
table.sort(items, compare)

Can't sort table with associative indexes

Why can't I use table.sort to sort tables with associative indexes?
In general, Lua tables are pure associative arrays. There is no "natural" order other than as a side effect of the particular hash table implementation used in the Lua core. This makes sense because values of any Lua data type (other than nil) can be used as both keys and values; but only strings and numbers have any kind of sensible ordering, and then only between values of like type.
For example, what should the sorted order of this table be:
unsortable = {
  answer=42,
  [true]="Beauty",
  [function() return 17 end] = function() return 42 end,
  [math.pi] = "pi",
  [ {} ] = {},
  12, 11, 10, 9, 8
}
It has one string key, one boolean key, one function key, one non-integral key, one table key, and five integer keys. Should the function sort ahead of the string? How do you compare the string to a number? Where should the table sort? And what about userdata and thread values which don't happen to appear in this table?
By convention, values indexed by sequential integers beginning with 1 are commonly used as lists. Several functions and common idioms follow this convention, and table.sort is one example. Functions that operate over lists usually ignore any values stored at keys that are not part of the list. Again, table.sort is an example: it sorts only those elements that are stored at keys that are part of the list.
Another example is the # operator. For the above table, #unsortable is 5 because unsortable[5] ~= nil and unsortable[6] == nil. Notice that the value stored at the numeric index math.pi is not counted even though pi is between 3 and 4 because it is not an integer. Furthermore, none of the other non-integer keys are counted either. This means that a simple for loop can iterate over the entire list:
for i = 1,#unsortable do
  print(i,unsortable[i])
end
Although that is often written as
for i,v in ipairs(unsortable) do
  print(i,v)
end
In short, Lua tables are unordered collections of values, each indexed by a key; but there is a special convention for sequential integer keys beginning at 1.
Edit: For the special case of non-integral keys with a suitable partial ordering, there is a work-around involving a separate index table. The described content of tables keyed by string values is a suitable example for this trick.
First, collect the keys in a new table, in the form of a list. That is, make a table indexed by consecutive integers beginning at 1 with keys as values and sort that. Then, use that index to iterate over the original table in the desired order.
For example, here is foreachinorder(), which uses this technique to iterate over all values of a table, calling a function for each key/value pair, in an order determined by a comparison function.
function foreachinorder(t, f, cmp)
  -- first extract a list of the keys from t
  local keys = {}
  for k,_ in pairs(t) do
    keys[#keys+1] = k
  end
  -- sort the keys according to the function cmp. If cmp
  -- is omitted, table.sort() defaults to the < operator
  table.sort(keys,cmp)
  -- finally, loop over the keys in sorted order, and operate
  -- on elements of t
  for _,k in ipairs(keys) do
    f(k,t[k])
  end
end
It constructs an index, sorts it with table.sort(), then loops over each element in the sorted index and calls the function f for each one. The function f is passed the key and value. The sort order is determined by an optional comparison function which is passed to table.sort. It is called with two elements to compare (the keys to the table t in this case) and must return true if the first is less than the second. If omitted, table.sort uses the built-in < operator.
For example, given the following table:
t1 = {
a = 1,
b = 2,
c = 3,
}
then foreachinorder(t1,print) prints:
a 1
b 2
c 3
and foreachinorder(t1,print,function(a,b) return a>b end) prints:
c 3
b 2
a 1
You can only sort tables with consecutive integer keys starting at 1, i.e., lists. If you have another table of key-value pairs, you can make a list of pairs and sort that:
function sortpairs(t, lt)
  local u = { }
  for k, v in pairs(t) do table.insert(u, { key = k, value = v }) end
  table.sort(u, lt)
  return u
end
Of course this is useful only if you provide a custom ordering (lt) which expects as arguments key/value pairs.
This issue is discussed at greater length in a related question about sorting Lua tables.
Because they don't have any order in the first place. It's like trying to sort a garbage bag full of bananas.
