Redis - Representing hierarchies - data-structures

I have a lot of data (about 6.5M rows) that I want to supplement with Redis. Basically, I want to use sorted sets to store the hierarchy of the data. So, some sample data may look like this (using a family tree as an example).
royalty_id | name | date | favorite_color | parent
================================================================
990 | George VI | 1895-1952 | purple | NULL
991 | Margaret | 1930-2002 | green | 990
992 | Elizabeth II | 1926- | yellow | 990
993 | Edward | 1964- | purple | 992
994 | Andrew | 1960- | brown | 992
995 | Anne | 1950- | pink | 992
996 | Charles | 1948- | purple | 992
997 | Harry | 1984- | red | 996
998 | William | 1982- | blue | 996
999 | George | 2013- | blue | 998
You get the idea. The hierarchies that I have aren't very deep (maybe 8 levels deep at most); there are just a LOT of them. As I go through the list of entries, I'd like to store each entry's ancestors in Redis like this (shown here with list commands):
lpush ancestors_of.999 998
lpush ancestors_of.999 996
lpush ancestors_of.998 996
And so on. The problem is that these entries aren't in a nice order in the database like they are in this sample data. For each row, I could be looking at the bottom, top, or middle of a hierarchy. So, what's the best way to do this?
Here are some options I've thought about:
1) Continue as in the example above, but for each database row, search the existing entries of every list to see whether the new row is in there already (seems very inefficient).
2) Maintain another set that maps each parent to the lists it appears in (seems to double the work).
I'm sure I'm missing something here.
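One possible way around the ordering problem, sketched here in Python as an assumption rather than an established answer: load every (royalty_id, parent) pair into a lookup table first, then walk upward from each entry. Because the lookup table is complete before any walk starts, the order of the database rows no longer matters. The function name ancestor_lists is hypothetical, the sketch assumes the data contains no cycles, and it only builds the lists in memory. Each resulting list could then be written to a key such as ancestors_of.999 with RPUSH.

```python
def ancestor_lists(parent_of):
    """Map each id to its ancestors, nearest ancestor first.

    parent_of: dict of id -> parent id (None for a root). The rows may
    have been loaded in any order. Assumes there are no cycles.
    """
    ancestors = {}
    for node in parent_of:
        chain = []
        current = parent_of[node]
        while current is not None:
            chain.append(current)
            # .get() returns None if a parent row is missing, ending the walk.
            current = parent_of.get(current)
        ancestors[node] = chain
    return ancestors


# Some sample rows from the question, deliberately out of order.
parent_of = {999: 998, 990: None, 996: 992, 998: 996, 992: 990, 997: 996}
lists = ancestor_lists(parent_of)
print(lists[999])  # [998, 996, 992, 990]
```

With hierarchies at most about 8 levels deep, each walk is short, so the total work stays roughly proportional to the number of rows.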

Related

Automated sorting of table after filter is selected

timestamp | product | performance | sort_quantity
--------------|------------------|-----------------------|------------------------
2020-01-01 | Product_A | high | 819
2020-03-15 | Product_A | high | 819
2020-01-01 | Product_B | low | -214
2020-03-15 | Product_B | low | -214
2020-01-01 | Product_C | high | -100
2020-03-15 | Product_C | high | -100
2020-01-01 | Product_D | low | 933
2020-03-15 | Product_D | low | 933
2020-01-01 | Product_E | high | 501
2020-03-15 | Product_E | high | 501
I insert the table above into Tableau looking like this:
(Sorry for only having it available in German)
All this works perfectly.
Now I add a filter for column performance to the report.
When I select one of the values in the filter (e.g. high) the report looks like this:
The filter function is correct but I also want that once the filter is clicked the table is automatically sorted (descending) based on column sort_quantity.
Is it possible to do this with Tableau?
If yes how can I achieve it?
Create a string version of sort quantity and place it as the first pill on Rows, to the left of Product:
str([sort quantity])
Set this field to sort descending by Field, using Sort Quantity (not the string version) as the sort field.
On the string version of sort quantity, deselect Show Header to hide the column.
Final view should look like this.

Is there a way to cache and filter a table locally in PL SQL?

I have to process a table in Oracle 11g that contains around 5 million records. These records are speed limits along a divided highway. Each record has a SpeedLimitId, a HighwayId, and a from mile post and to mile post that mark the stretch the speed limit applies to. Currently all the records are on only one side of the divided highway, and they need to be processed so that they also apply to the other side. A measure equation table tells us which range of measures on one side of the highway equals which range of measures on the other side. This lets us calculate where the speed limit event falls on the other side: work out how far along its range the measure sits, as a percentage, then take the same percentage of the range on the opposing side. A speed limit record can fall within one measure equation record or cross several of them. Based on the information in the speed limit table and the measure equation table, one or more records need to be inserted into a third table.
SPEED_LIMIT
+--------------+-----------+--------------+------------+------------+-------+
| SpeedLimitId | HighwayId | FromMilePost | ToMilePost | SpeedLimit | Lane |
+--------------+-----------+--------------+------------+------------+-------+
| 1 | 75N | 115 | 123 | 60 | South |
+--------------+-----------+--------------+------------+------------+-------+
MEASURE_EQUATION
+------------+----------------+-----------+---------+-------+----------------+-----------+---------+-------+------------------+
| EquationId | NorthHighwayId | NFromMile | NToMile | NGain | SouthHighwayId | SFromMile | SToMile | SGain | IsHighwayDivided |
+------------+----------------+-----------+---------+-------+----------------+-----------+---------+-------+------------------+
| 1 | 75N | 105 | 120 | 15 | 75S | 100 | 110 | 10 | No |
| 2 | 75N | 120 | 125 | 5 | 75S | 110 | 125 | 15 | Yes |
| 3 | 75N | 125 | 130 | 5 | 75S | 125 | 130 | 5 | No |
+------------+----------------+-----------+---------+-------+----------------+-----------+---------+-------+------------------+
Depending on the information in the SPEED_LIMIT and MEASURE_EQUATION tables, at least one and as many as three records need to be inserted into a third table. There are a dozen or so different scenarios that can arise from different values in the fields.
Using the above data, you can see that SpeedLimitId 1 is noted as being on the south side of the highway, but it is currently on the north side, and it spans the two equation records with the ids 1 and 2. It spans two measure ranges because a single roadway splits off and becomes a divided highway. We need to split the original record into two events, add them to the third processing table, and calculate the new measure for the southbound lane.
SPEED_LIMIT_PROCESSING
+--------------+-----------+-------+----------+--------+
| SpeedLimitId | HighwayId | LANE | FromMile | ToMile |
+--------------+-----------+-------+----------+--------+
| 1 | 75N | North | 115 | 120 |
| 1 | 75S | South | 110 | 119 |
+--------------+-----------+-------+----------+--------+
The methodology to calculate the measure on the south bound lane is as follows:
+--------------------+----------------------------+-----------------------------+
| | From Measure Translation | To Measure Translation |
+--------------------+----------------------------+-----------------------------+
| Event Measure as % | ((120 – 120)/5) * 100 = 0% | ((123 – 120)/5) * 100 = 60% |
| Offset Measure | ((15 * 0) / 100) = 0 | ((15 * 60) / 100) = 9 |
| Translated Measure | 110 + 0 = 110 | 110 + 9 = 119 |
+--------------------+----------------------------+-----------------------------+
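The translation in the table above is a linear interpolation from the north range onto the south range. A minimal Python sketch of that arithmetic follows. The function name translate_measure and its parameter names are hypothetical, and the values come from equation record 2 in the question.

```python
def translate_measure(measure, n_from, n_to, s_from, s_to):
    """Map a measure in the north range [n_from, n_to] onto the south range.

    The measure keeps the same relative position in the south range
    that it had in the north range.
    """
    north_gain = n_to - n_from
    south_gain = s_to - s_from
    return s_from + (measure - n_from) * south_gain / north_gain


# Equation 2: north 120-125 maps to south 110-125.
print(translate_measure(120, 120, 125, 110, 125))  # 110.0
print(translate_measure(123, 120, 125, 110, 125))  # 119.0
```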
My concern is to do this in the most efficient way possible. The idea would be to loop through each record in the SPEED_LIMIT table, select the corresponding records in the MEASURE_EQUATION table, and then, based on information from those two tables, insert records into a third table. To limit PL/SQL context switches, I planned to use BULK COLLECT and FORALL statements to query the event table and to run the insert statements, which would let me work in batches. The missing piece is how to get the corresponding records from the MEASURE_EQUATION table without running a SQL query for every SPEED_LIMIT record in the loop. The MEASURE_EQUATION table has only about 700 records in it, so I was wondering whether I can cache it in PL/SQL and then filter it down to the appropriate records for the current SPEED_LIMIT record.
As you have probably gleaned from my question, I'm fairly new to PL/SQL and Oracle in general, so maybe I'm going about it in completely the wrong way.
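The caching idea can be illustrated outside the database. The sketch below is in Python, not PL/SQL, and is only an assumption about how such a cache might be shaped: the small equation table is loaded once into a dictionary keyed by highway id, and each speed limit record is then matched against that dictionary with no further query. The names build_cache and overlapping are hypothetical, and only the north-side columns of the question's sample rows are used. In PL/SQL a comparable structure might be a collection filled once with BULK COLLECT.

```python
from collections import defaultdict

# (EquationId, NorthHighwayId, NFromMile, NToMile) from the question's table.
EQUATIONS = [
    (1, "75N", 105, 120),
    (2, "75N", 120, 125),
    (3, "75N", 125, 130),
]


def build_cache(equations):
    """Group the equation rows by highway id so lookups need no query."""
    cache = defaultdict(list)
    for eq_id, highway, n_from, n_to in equations:
        cache[highway].append((eq_id, n_from, n_to))
    return cache


def overlapping(cache, highway, from_mile, to_mile):
    """Return (EquationId, from, to) for each equation range the speed
    limit overlaps, clipped to the speed limit's own range."""
    pieces = []
    for eq_id, n_from, n_to in cache.get(highway, []):
        low, high = max(from_mile, n_from), min(to_mile, n_to)
        if low < high:  # ranges that merely touch do not overlap
            pieces.append((eq_id, low, high))
    return pieces


cache = build_cache(EQUATIONS)
pieces = overlapping(cache, "75N", 115, 123)
print(pieces)  # [(1, 115, 120), (2, 120, 123)]
```

For SpeedLimitId 1 this yields the split described in the question: one piece inside equation 1 and one inside equation 2. The dozen or so scenarios that decide what to insert for each piece are not covered here.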

How to calculate the "distance" between two rectangles?

Given one rectangle and a bunch of images (also rectangles), I need to find the best image to place in it. That would be the one that requires less stretching or shrinking and that covers the area the best. I want to find the one with the least distance (as in, least transformation) to the target rectangle. The images are screenshots of websites, so, they contain a mix of text and images. The screenshots suffer whether they are stretched (pixelation) or shrunk (text becomes unreadable).
But it also feels like one of these problems that someone might have looked into already and there might be an algorithm to properly solve it.
The data is stored in a SQL database so I would need the analysis to be doable in SQL. The data might look like this:
---------------------------------------------------------
| Id | Width | Height |
---------------------------------------------------------
| 00b701c6-1c31-4323-a292-700b4dff2e45 | 784 | 1310 |
| 0a46a0f6-a3b2-4a5d-a8be-55bad84ba37d | 1414 | 957 |
| 0b79fbe8-6b9e-48d1-89da-8981570e23d7 | 784 | 561 |
| 0e9f5935-0e58-42d2-bba2-3e89db55260f | 400 | 400 |
| 0ebf14fb-094b-47f5-9e25-b4f54bc2eab9 | 2260 | 957 |
| 17131cd6-f5b2-4e4d-a63b-b909e04e2d89 | 1414 | 957 |
| 2298fc73-0bcb-49c8-b54e-3184cf4153d4 | 784 | 1310 |
| 28ffee4a-2d08-4862-aeb0-6546cda4e225 | 2560 | 1387 |
| 29cf92ad-b6fd-43c6-abb1-7c5a7e4af92d | 2260 | 957 |
| 307b2b6e-1f66-4784-bd7d-b6bfc4768fbd | 2560 | 1387 |
| 3edc916b-4b3d-4fd8-a1f9-6418a4d8d27a | 2333 | 435 |
| 3ef1132a-d059-487a-9cad-dbb3895ad25a | 1414 | 957 |
| 43e044e5-5f82-4b86-95ba-a9e76f5d2519 | 657 | 435 |
| 464be0ec-5cb7-4f3f-856d-6beb5fbc2f5e | 657 | 435 |
| 510d0236-e61a-4f1c-bb0b-754c4c1f80f7 | 2260 | 957 |
| 52f217d5-038c-475d-af96-89d1930e8c2f | 657 | 435 |
| 532cadf5-c20b-4b1c-84d4-78e1b501495f | 2333 | 435 |
| 5f3e55aa-12a4-4502-a159-fdc128b53e11 | 2260 | 957 |
| 626c33a9-aaa0-47b6-a6f3-bd5235f1655b | 784 | 561 |
| 6711a717-e1ee-4930-9f21-5e225a99a769 | 657 | 435 |
| 7125c301-c311-4339-b36c-519dc3714c68 | 784 | 561 |
| 8f5d8e3b-8213-4cd6-8ea0-311297f4cfc3 | 2333 | 435 |
| c3d7661f-12e6-4297-8830-15e82850bc32 | 784 | 1310 |
| cd32106e-2f3e-4614-ac40-19e3f5d7fa1f | 784 | 561 |
| d7191194-1f8a-4230-8ee0-8a8b427b86e7 | 784 | 1310 |
| d737de66-849d-4ec3-bf3b-cc48bfa1f3a6 | 2560 | 1387 |
| d935e10b-88f3-4aba-a2b4-a1a9cfd8acb4 | 2560 | 1387 |
| dcc8e9e6-4ee3-4737-a530-d2fcffd35a86 | 2333 | 435 |
| ec3187be-5a81-4ecb-a908-ddedaa5930ec | 1414 | 957 |
---------------------------------------------------------
You can compute the Jaccard index as follows:
function jaccard(rect : Rectangle, img : Rectangle) : float
rectArea := rect.width * rect.height
imgArea := img.width * img.height
interArea := min(rect.width, img.width) * min(rect.height, img.height)
return interArea / (rectArea + imgArea - interArea)
end
Then choose the highest scoring image (values go from zero to one).
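As a runnable illustration of the scoring above, here is a minimal Python sketch applied to a few rows from the question's table. The target size of 800 x 600 is a made-up assumption, as are the names IMAGES, TARGET and best_id. The arithmetic uses only min, multiplication and division, so it should carry over to a SQL expression, though that is not shown here.

```python
def jaccard(rect, img):
    """Jaccard index of two (width, height) rectangles that share a corner."""
    rect_w, rect_h = rect
    img_w, img_h = img
    inter = min(rect_w, img_w) * min(rect_h, img_h)
    union = rect_w * rect_h + img_w * img_h - inter
    return inter / union


# A few rows from the question's table: id -> (width, height).
IMAGES = {
    "00b701c6-1c31-4323-a292-700b4dff2e45": (784, 1310),
    "0a46a0f6-a3b2-4a5d-a8be-55bad84ba37d": (1414, 957),
    "0b79fbe8-6b9e-48d1-89da-8981570e23d7": (784, 561),
    "0e9f5935-0e58-42d2-bba2-3e89db55260f": (400, 400),
}

TARGET = (800, 600)  # hypothetical target rectangle

# Pick the image with the highest score.
best_id = max(IMAGES, key=lambda image_id: jaccard(TARGET, IMAGES[image_id]))
print(best_id, IMAGES[best_id])
```

For this target the 784 x 561 image scores highest, because it fits almost entirely inside the rectangle and leaves little of it uncovered.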
I don't have a complete algorithm, but my approach would be to score each image on how well it matches the rectangle. The interesting parameter would be the aspect ratio (width/height): calculate the ratio for each image and compare it to the ratio of the rectangle. The nearest ratio wins.
As for the second problem, I'd probably set a threshold. If the ratio of the best fit is really close to that of the rectangle (the difference is below the threshold), you can get away with stretching, which looks better than two very thin borders. If the difference is above the threshold, add black borders, since distorted text is hideous.
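The ratio comparison and the threshold rule can be sketched in Python as follows. The threshold of 0.1 (a 10% relative difference in aspect ratio) is an arbitrary assumption, as are the function name closest_ratio, the 800 x 600 target, and the mode labels "stretch" and "letterbox".

```python
def closest_ratio(target, images, threshold=0.1):
    """Pick the image whose width/height ratio is nearest the target's.

    Returns (image, mode), where mode is "stretch" when the relative
    ratio difference is within the threshold and "letterbox" (add
    borders) otherwise. The threshold value is a guess, not a standard.
    """
    target_w, target_h = target
    target_ratio = target_w / target_h

    def relative_difference(size):
        width, height = size
        return abs(width / height - target_ratio) / target_ratio

    best = min(images, key=relative_difference)
    mode = "stretch" if relative_difference(best) <= threshold else "letterbox"
    return best, mode


sizes = [(784, 1310), (1414, 957), (784, 561), (400, 400)]
print(closest_ratio((800, 600), sizes))  # ((784, 561), 'stretch')
```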

Compute value based on next row in BIRT

I am creating a BIRT report where each row is a receipt matched with a purchase order. There is usually more than one receipt per purchase order. My client wants the qty_remaining on the purchase order to show only on the last receipt for each purchase order. I am not able to alter the data before BIRT gets it. I see two possible solutions, but I am unable to find out how to implement either. This question deals with the first possible solution.
If I can compare the purchase order number (po_number) with that of the next row, then I can set the current row's qty_remaining to 0 if the po_numbers match, and otherwise show the actual qty_remaining. Is it possible to access the next row?
Edit
The desired look is similar to this:
| date | receipt_number | po_number | qty_remaining | qty_received |
|------|----------------|-----------|---------------|--------------|
| 4/9 | 723 | 6026 | 0 | 985 |
| 4/9 | 758 | 6026 | 2 | 1 |
| 4/20 | 790 | 7070 | 58 | 0 |
| 4/21 | 801 | 833 | 600 | 0 |
But I'm currently getting this:
| date | receipt_number | po_number | qty_remaining | qty_received |
|------|----------------|-----------|---------------|--------------|
| 4/9 | 723 | 6026 | 2 | 985 |
| 4/9 | 758 | 6026 | 2 | 1 |
| 4/20 | 790 | 7070 | 58 | 0 |
| 4/21 | 801 | 833 | 600 | 0 |
I think you are looking at this the wrong way. If you want behavior that resembles for loops, you should use grouping and aggregate functions. You can build quite complex things by using (or not using) the group headers and footers.
In your case I would group the receipts on po_number and order them by receipt_number. Then add an aggregate function such as MAX or LAST on receipt_number and name it 'last_receipt'. It should aggregate on the group, not the whole table. This 'total' is then available on every row within the group.
Then you can use the visibility setting to show qty_remaining only when row['receipt_number'] == row['last_receipt'].
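The grouping logic described above can be checked outside BIRT. The Python sketch below is not BIRT code and uses a hypothetical function name, mask_qty_remaining. It computes the highest receipt_number per po_number and keeps qty_remaining only on that row, writing 0 elsewhere to match the desired table in the question.

```python
def mask_qty_remaining(rows):
    """Keep qty_remaining only on the last receipt of each purchase order."""
    # First pass: the per-group aggregate (MAX of receipt_number per po_number).
    last_receipt = {}
    for row in rows:
        po = row["po_number"]
        last_receipt[po] = max(last_receipt.get(po, row["receipt_number"]),
                               row["receipt_number"])

    # Second pass: compare each row with its group's aggregate.
    result = []
    for row in rows:
        is_last = row["receipt_number"] == last_receipt[row["po_number"]]
        result.append({**row, "qty_remaining": row["qty_remaining"] if is_last else 0})
    return result


# The "currently getting" rows from the question.
rows = [
    {"receipt_number": 723, "po_number": 6026, "qty_remaining": 2},
    {"receipt_number": 758, "po_number": 6026, "qty_remaining": 2},
    {"receipt_number": 790, "po_number": 7070, "qty_remaining": 58},
    {"receipt_number": 801, "po_number": 833, "qty_remaining": 600},
]
masked = mask_qty_remaining(rows)
print([r["qty_remaining"] for r in masked])  # [0, 2, 58, 600]
```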

Get the best route between two different paths in wp8

I have created a bus transit application for a city on Windows Phone 8. I have stored all my routes and bus stops in separate tables in the database.
My bus stop table contains:
stop_id | stop_name | latitude | longitude | status | alias_name
---------------------------------------------------------------------------------------
1736 | Atlas Company | 18.629243 | 73.833814 | Active | Centurenca Corner
1737 | Atlas company | 18.629243 | 73.833814 | Active |
681 | Atma Anand Dhyan Kendra | 18.600349 | 73.926251 | InActive |
My Routes Table contains
bus_id | bus_no | bus_source | bus_destination | days_of_week | total_distance | estimated_time | source_stop_names | destination_stop_names | total_stops | source_trip_time | destination_trip_time | bus_status
source_stop_names contains
-----------------
1 | Swargate
2 | Parvati payatha
3 | Dandekar pul
4 | Pan mala Sinhgad Road
5 | Jal Shuddhikarn Kendra Sinhgad Road
6 | Ganesh mala
7 | Vitbhatti Sinhgad Road
8 | Vitthalwadi jakat naka
9 | Jaydeo nagar
10 | Rajaram pul
11 | Vitthalwadi Mandir Hingne
12 | Hingne rasta
13 | Anand nagar singhgad rd
14 | Manik Bag
15 | Indian hum company
16 | Wadgaon phata
17 | Patil colony
18 | Dhayari phata
19 | Sanas Vidyalaya
20 | Dangat wasti
21 | Gar mala
22 | Dhayarai gaon
23 | Raykar wasti
24 | Poultry farm singhgad road
25 | Dhayarigaon shala
26 | Chavan mala
27 | DSK Vishwa
I want to find the shortest path between two bus stops that are not connected by a single bus route, and show the number of routes and buses the user has to take while travelling from one point to the other, as Google Maps does.
I have used the default map control in Windows Phone 8.
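One common way to approach this kind of problem is a breadth-first search over the routes, which finds a journey using the fewest buses. The Python sketch below illustrates that idea only. It is not Windows Phone code, it minimises the number of buses and not the distance travelled, and it treats every route as usable in both directions. The route numbers R1 and R2 and the stops assigned to them are invented for the example, although the stop names themselves come from the question. The function name fewest_buses is hypothetical.

```python
from collections import deque


def fewest_buses(routes, start, goal):
    """Return a list of (route, board_stop, alight_stop) legs using the
    fewest buses, or None if the goal cannot be reached.

    routes: dict of route number -> list of stop names on that route.
    """
    # Index which routes serve each stop.
    routes_at = {}
    for route, stops in routes.items():
        for stop in stops:
            routes_at.setdefault(stop, []).append(route)

    queue = deque([(start, [])])
    seen_stops = {start}
    seen_routes = set()
    while queue:
        stop, legs = queue.popleft()
        if stop == goal:
            return legs
        for route in routes_at.get(stop, []):
            if route in seen_routes:
                continue
            seen_routes.add(route)
            # Riding this route reaches every stop on it with one more bus.
            for next_stop in routes[route]:
                if next_stop not in seen_stops:
                    seen_stops.add(next_stop)
                    queue.append((next_stop, legs + [(route, stop, next_stop)]))
    return None


# Invented routes for illustration only.
routes = {
    "R1": ["Swargate", "Dandekar pul", "Rajaram pul"],
    "R2": ["Rajaram pul", "Manik Bag", "DSK Vishwa"],
}
journey = fewest_buses(routes, "Swargate", "DSK Vishwa")
print(journey)
```

The length of the returned list is the number of buses the user has to take, and each leg names the stop where the user changes.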
