Airtable: Join two tables into a unified master output - view

I have data in two different linked tables in Airtable and I need to join them together. See example:
The PERSON table looks like:
Name | Classes
----------------
John | A,B,C,F
Sally | B,F
Max | B,C
While the linked CLASSES table looks like:
Class | Date | People
---------------------------
A | 1975 | John
B | 2000 | John,Sally,Max
C | 1823 | John,Max
D | 1492 |
E | 2020 |
F | 2010 | John,Sally
What I need is:
Person|Class|Date
--------------
John | A | 1975
John | B | 2000
John | C | 1823
John | F | 2010
Sally | B | 2000
Sally | F | 2010
Max | B | 2000
Max | C | 1823
How do I get this view / table as output?

The more questions like this I see go unanswered, the more I realise that Airtable just isn't a database in any real sense.
This is a perfectly reasonable question about how to join two tables after they have been normalised. The answer? It can't be done, at least not easily!
So what is Airtable supposed to be used for? Building non-normalised databases, otherwise known as spreadsheets!

If you click a linked value such as "A" or "B" in the Classes field of the PERSON table, a popup opens showing that class's details.
If you really need that kind of table, my suggestion is this:
Create a new table called "xxx", then write a script in the scripting block that reads the PERSON and CLASSES tables and populates the new table.
PS: The scripting block is only supported on the "Pro" plan.
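The logic such a script needs is just a nested loop over each person's linked classes. Here is a minimal sketch in plain Python rather than Airtable's scripting API; the dictionaries are hand-copied from the question and the function name join_people_classes is made up for illustration.

```python
# Hypothetical in-memory copies of the two tables from the question.
people = {
    "John": ["A", "B", "C", "F"],
    "Sally": ["B", "F"],
    "Max": ["B", "C"],
}
class_dates = {"A": 1975, "B": 2000, "C": 1823, "D": 1492, "E": 2020, "F": 2010}


def join_people_classes(people, class_dates):
    """Return one (person, class, date) row per linked class."""
    rows = []
    for person, classes in people.items():
        for cls in classes:
            # Look up the date of each linked class.
            rows.append((person, cls, class_dates[cls]))
    return rows


joined = join_people_classes(people, class_dates)
for row in joined:
    print(" | ".join(str(value) for value in row))
```

Classes D and E never show up in the result because nobody is linked to them, which matches the desired output in the question.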


How can I merge multiple columns from two different files in Talend

Let's say I have multiple columns coming from two different files, like this:
USERNAME | AGE | GENDER | CHILDREN
Joe | 23 | male | 2
Annie | 45 | female | 5
| | |
And another one like this :
USERNAME | AGE |
Jonathan | 33 |
Mike | 41 |
I want to merge the data from the columns that have the same name into one, while keeping the data from the columns that exist in only one file:
USERNAME | AGE | GENDER | CHILDREN
Joe | 23 | male | 2
Annie | 45 | female | 5
Jonathan | 33 | |
Mike | 41 | |
Sorry if the answer is obvious, I'm new to Talend. Thanks.
What tools are available to you?
The Append function in SAS, for example, can do this for you.
You can use the same append approach in Python, R, or whatever other language you intend to use.
For Talend:
Copy the complete subjob1 – copy me sub job and paste it to create a second sub job.
Link the two sub jobs using an onSubjobOK link.
Open tFixedFlowInput, and change Records from first subjob to Records from second subjob.
Open tFileOutputDelimited on the new sub job, and tick Append.
Use a tUnite component to accomplish that.
Here is the link to the documentation: https://help.talend.com/r/fr-FR/8.0/orchestration/tunite
Your flow would be:
tFileInput1 (Excel or CSV) --------------------------------------------------+
                                                                             +-> tUnite -> tLogRow
tFileInput2 (Excel or CSV) -> tMap (add empty GENDER and CHILDREN fields) ---+
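For comparison, here is roughly what that tMap + tUnite flow does, sketched in plain Python. This is only an illustration of the idea, not Talend code: the rows are typed in by hand from the question and the helper name unite is my own.

```python
def unite(columns, *sources):
    """Stack rows from every source, filling columns a source lacks with ''."""
    merged = []
    for rows in sources:
        for row in rows:
            # row.get() supplies the empty value for missing columns,
            # which is the job the tMap step does in the Talend flow.
            merged.append({col: row.get(col, "") for col in columns})
    return merged


file1 = [
    {"USERNAME": "Joe", "AGE": "23", "GENDER": "male", "CHILDREN": "2"},
    {"USERNAME": "Annie", "AGE": "45", "GENDER": "female", "CHILDREN": "5"},
]
file2 = [
    {"USERNAME": "Jonathan", "AGE": "33"},
    {"USERNAME": "Mike", "AGE": "41"},
]

columns = ["USERNAME", "AGE", "GENDER", "CHILDREN"]
merged = unite(columns, file1, file2)
for row in merged:
    print(" | ".join(row[col] for col in columns))
```

The point is that both inputs must be brought to the same schema before they can be stacked, which is why the second file goes through tMap first.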

Oracle 11g insert into select from a table with duplicate rows

I have one table that needs to be split into several other tables.
The main table is really just an intermediate (staging) table.
I dump data from an Excel file into it (from 5k to 200k rows) and, using INSERT INTO ... SELECT, split it into the correct tables (five different tables).
However, the latest dataset that my client sent has records with duplicate values.
The primary key for my table is usually ENI. But even this value is duplicated, because the same company can be both a customer and a service provider, so it has two different records that share the same ENI.
What I have so far:
I found a script that uses MERGE and modified it to find rows with the same ENI and assign the same main_id to all of them.
|Main_id| ENI | company_name| Type
| 1 | 1864 | JOHN | C
| 2 | 351485 | JOEL | C
| 3 | 16546 | MICHEL | C
| 2 | 351485 | JOEL J. | S
| 1 | 1864 | JOHN E. E. | C
Main_id: primary key that the main DB uses
ENI: unique company number
Type: 'C' - CUSTOMER, 'S' - SERVICE PROVIDER
In some cases the duplicates can have the same type, as with id 1.
There are several other columns...
What I need:
Insert any one of the rows for each main_id that my other script already sorted, and set a flag on the others to show that they were not inserted. I can't delete any data, because I'll need to send this info to the customer to validate.
Or maybe I simply can't do it this way and should go back to good old Excel.
Edit: in response to a question below, here is an example:
|Main_id| ENI | company_name| Type| RANK|
| 1 | 1864 | JOHN | C | 1 |
| 2 | 351485 | JOEL | C | 1 |
| 3 | 16546 | MICHEL | C | 1 |
| 2 | 351485 | JOEL J. | S | 2 |
| 1 | 1864 | JOHN E. E. | C | 2 |
RANK would work like this: ENI 1864 appears 2 times,
so the first one found gets 1, the second gets 2, and so on. I tried using
RANK() OVER (PARTITION BY MAIN_ID ORDER BY ENI)
RANK() OVER (PARTITION BY company_name ORDER BY ENI)
Thanks to TEJASH I was able to come up with this solution:
MERGE INTO TABLEA S
USING (Select ROWID AS ID,
row_number() Over(partition by eni order by eni, type) as RANK_DUPLICATED
From TABLEA
) T
ON (S.ROWID = T.ID)
WHEN MATCHED THEN UPDATE SET S.RANK_DUPLICATED= T.RANK_DUPLICATED;
As far as I understand your problem, you just need to identify the duplicates based on two columns. You can achieve that with an analytic function, as follows:
Select t.*,
row_number() Over(partition by main_id, eni order by company_name) as rnk
From your_table t
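If it helps to see what row_number() does with that partition, here is a small Python sketch of the same numbering. It is only meant to illustrate the semantics, not to replace the SQL; the rows are copied from the question and the function name add_row_number is made up.

```python
from collections import defaultdict

rows = [
    {"main_id": 1, "eni": 1864, "company_name": "JOHN", "type": "C"},
    {"main_id": 2, "eni": 351485, "company_name": "JOEL", "type": "C"},
    {"main_id": 3, "eni": 16546, "company_name": "MICHEL", "type": "C"},
    {"main_id": 2, "eni": 351485, "company_name": "JOEL J.", "type": "S"},
    {"main_id": 1, "eni": 1864, "company_name": "JOHN E. E.", "type": "C"},
]


def add_row_number(rows):
    """Number rows within each (main_id, eni) partition, ordered by company_name.

    Adds an 'rnk' key to each row dict in place and returns the same list.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row["main_id"], row["eni"])].append(row)
    for group in partitions.values():
        ordered = sorted(group, key=lambda r: r["company_name"])
        for rnk, row in enumerate(ordered, start=1):
            row["rnk"] = rnk
    return rows


ranked = add_row_number(rows)
to_insert = [r for r in ranked if r["rnk"] == 1]  # first row of each partition
flagged = [r for r in ranked if r["rnk"] > 1]     # duplicates kept for review
```

Rows with rnk = 1 are the ones to insert; everything with rnk > 1 stays in the staging table with its flag so it can be sent back for validation.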

Efficient way to join by levenshtein in Hive or Impala

I have two tables: one (NLIST) contains about 17K records, while the other (FNAMES) contains about 57K.
I would like to join the two by comparing the records using the Levenshtein distance.
Here is the example for the content of tables:
Table NLIST:
+------+-------------+
| ID | S_NAME |
+------+-------------+
| 1 | Avi |
| 2 | Moshe |
| 3 | David |
....
Table FNAMES:
+------+-------------+
| ID | NICKNAMES |
+------+-------------+
| 1 | Avile |
| 2 | Dudi |
| 3 | Moshiko |
| 4 | Avi |
| 5 | DAVE |
....
The above tables are just examples. In the real case the names column can include more than one word.
The required result should be:
+------+-------------+--------+
| ID | NICKNAMES | S_NAME |
+------+-------------+--------+
| 1 | Avile | Avi |
| 2 | Dudi | David |
| 3 | Moshiko | Moshe |
| 4 | Avi | Avi |
| 5 | DAVE | David |
...
Here is the code I use:
select FNAMES.NICKNAMES, NLIST.S_NAME
from FNAMES
LEFT OUTER JOIN NLIST
ON(true)
WHERE levenshtein (FNAMES.NICKNAMES, NLIST.S_NAME) <=4
The above query ran for a very long time, so I stopped it.
How can I make it run in a reasonable time?
In addition, I think the Levenshtein distance depends on the length of the words. How can I find the optimal threshold for the distance (in this case I chose 4 arbitrarily)?
Hive query performance depends on several factors:
Query engine
File format
Use vectorization: set hive.vectorized.execution.enabled = true; set hive.vectorized.execution.reduce.enabled = true;
If you have a good server you can try Impala, which is typically faster than Hive.
You can also fine-tune Impala to run this query faster; see "Tuning Impala for Performance" in the Impala documentation.
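Independent of engine tuning, it may help to cut down the number of comparisons and to make the threshold relative to word length. The Python sketch below illustrates both ideas under my own assumptions: the 0.5 ratio is an arbitrary choice, the function names are made up, and nothing here is Hive or Impala syntax. The pruning relies on the fact that the Levenshtein distance between two strings is never smaller than the difference of their lengths.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def best_match(nickname, names, max_ratio=0.5):
    """Return the closest name with distance / longest length <= max_ratio, else None."""
    best, best_ratio = None, None
    nick = nickname.lower()
    for name in names:
        cand = name.lower()
        longest = max(len(nick), len(cand))
        if longest == 0:
            continue
        limit = max_ratio if best_ratio is None else best_ratio
        # The length difference is a lower bound on the distance,
        # so pairs that cannot beat the current limit are skipped cheaply.
        if abs(len(nick) - len(cand)) / longest > limit:
            continue
        ratio = levenshtein(nick, cand) / longest
        if ratio <= max_ratio and (best_ratio is None or ratio < best_ratio):
            best, best_ratio = name, ratio
    return best


names = ["Avi", "Moshe", "David"]
matches = {nick: best_match(nick, names) for nick in ["Avile", "Moshiko", "Avi", "DAVE"]}
```

In SQL the same pruning would be an extra condition on the length difference placed before the levenshtein call, so that most pairs of the cross join are discarded without computing the distance.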

Re: Transpose data using Linq

I have a table, in the following format,
|BallotNo | City | CandidateNo | Votes
|Box1 | AA | Cand1 | 1200
|Box1 | AA | Cand2 | 1500
|Box2 | BB | Cand1 | 2500
|Box2 | BB | Cand2 | 3600
Using LINQ, I want to get a result in the format
|Box1 |AA |Cand1 |1200 |Cand2 |1500
|Box2 |BB |Cand1 |2500 |Cand2 |3600
Thanks
You are looking for a grouping operation.
As I understand it, you need to group by the City column. This is fairly easy; see http://msdn.microsoft.com/library/bb534492.aspx for how to use the GroupBy extension method.
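To show the shape of the grouping, here is the same idea sketched in Python instead of LINQ. It is an illustration only: I am assuming the key is the (BallotNo, City) pair, and the name transpose is made up.

```python
votes = [
    ("Box1", "AA", "Cand1", 1200),
    ("Box1", "AA", "Cand2", 1500),
    ("Box2", "BB", "Cand1", 2500),
    ("Box2", "BB", "Cand2", 3600),
]


def transpose(rows):
    """Group by (ballot, city) and lay each candidate/votes pair out side by side."""
    grouped = {}
    for ballot, city, candidate, count in rows:
        # Dicts keep insertion order, so groups come out in first-seen order.
        grouped.setdefault((ballot, city), []).extend([candidate, count])
    return [list(key) + values for key, values in grouped.items()]


wide = transpose(votes)
for row in wide:
    print(" | ".join(str(value) for value in row))
```

In LINQ the equivalent would be a GroupBy on the key followed by a projection that flattens each group's candidate and vote values into one row.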

Rails ActiveRecord use join or includes instead of find_by_sql to get attributes of two tables

The following gets me the results I want but I am trying to figure out if I could have done this with a join or includes instead.
@items = Item.find_by_sql("SELECT *
FROM items_with_metadata
FULL OUTER JOIN items ON items.id = items_with_metadata.item_id")
The result should be that I get all attributes from both tables and the attributes are null wherever the items_with_metadata did not match an item in the items table.
Also, I do not have any associations between the two tables; the ids of some items just happen to be in both tables.
So for example if I have
items table with
id | name | active
------------------
123 | a | 0
456 | b | 1
and items_with_metadata has
color | usable | location | item_id
-----------------------------------
red | yes | north | 123
the result of the query will be
id | name | active | color | usable | location | item_id
--------------------------------------------------------
123 | a | 0 | red | yes | north | 123
456 | b | 1 | | | |
I was hoping there was a way to do this using ActiveRecord's joins or includes or any other ActiveRecord method that is not find_by_sql
How about:
Item.joins('FULL OUTER JOIN items_with_metadata ON items.id = items_with_metadata.item_id')
EDIT:
You could also use (this assumes an association named items_with_metadata is declared on Item):
@items = Item.includes(:items_with_metadata)
This will return only Item models, but will also load all relevant ItemWithMetadata models into memory, which makes them available via:
@items.first.items_with_metadata
The last statement won't trigger a DB query; it reads the item metadata from memory.
