Can I access a Django ORM column as a string + variable? - django-queryset

Can I access a Django ORM column as a string + variable?
For example, is something like this possible?
for x in range(i, 98):
    CategoryNick.objects.filter(author=request.user).update("ca" + (x + 1)=F('ca' + x))
Or is there another way?
Thanks for letting me know.

You can pass it as a dictionary and use dictionary unpacking. In fact, we can do the updates in bulk with:
from django.db.models import F

data = {'ca{}'.format(x + 1): F('ca{}'.format(x)) for x in range(i, 98)}
CategoryNick.objects.filter(
    author=request.user
).update(**data)
This will thus perform a single update query, which will look like:
UPDATE app_categorynick
SET ca2 = ca1, ca3 = ca2, ca4 = ca3, …, ca98 = ca97
Notice the two asterisks (**) in front. If you call a function as f(**some_dict), with for example some_dict = {'a': 4, 'b': 2}, it will call f as f(a=4, b=2).
That being said, if a model needs to store an arbitrary number of values, it is usually not a good idea to do this with a huge number of columns, since you can never know when an extra column will be needed. Usually one makes an extra model and defines a many-to-one relation from that model to the target model.
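For example (a rough sketch with hypothetical model and field names, not taken from the question), the values could live in rows of a separate model instead of in columns ca1, ca2, ...:

from django.conf import settings
from django.db import models

class CategoryNick(models.Model):
    author = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)

class CategoryValue(models.Model):
    # many-to-one: each value row points back to its CategoryNick
    nick = models.ForeignKey(CategoryNick, on_delete=models.CASCADE, related_name='values')
    position = models.PositiveIntegerField()
    value = models.CharField(max_length=100)

Shifting everything one position up then becomes a single update on position, for example CategoryValue.objects.filter(nick__author=request.user).update(position=F('position') + 1), instead of copying 97 columns.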

Related

Hashmap (O(1)) supporting joker/match-all keys

The title is not so clear, because I cannot put my problem in a sentence (if you have a better title for this question, please suggest one). I'll try to clarify my requirement with an example:
Suppose I have a table like this:
| Origin | Destination | Airline   | Free Baggage |
|--------|-------------|-----------|--------------|
| NYC    | London      | American  | 20KG         |
| NYC    | *           | Southwest | 30KG         |
| *      | *           | Southwest | 25KG         |
| *      | LA          | *         | 20KG         |
| *      | *           | *         | 15KG         |
and so on ...
This table describes the free baggage amount that airlines provide on different routes. You can see that some cells have the value *, meaning that they match all possible values (those values are not necessarily known).
So we have a large list of baggage rules (like the table above) and a large list of flights (whose origin, destination and airline are known), and we intend to find the baggage amount for each of the flights in the most efficient way (iterating over the list is obviously not efficient, as it costs O(N) per lookup). It is possible for more than one rule to match a flight, but we will assume that in this case either the first match or the most specific one is preferred (whichever is simpler for you to work with).
If there were no * signs in the table, the problem would be easy, and we could use a hashmap or dictionary with a tuple of values as the key. But in the presence of those * (let's say match-all) keys, it is not so straightforward to provide a general solution.
Please note that the above is just an example, and I need a solution that can be used for any number of keys, not just three.
Do you have any idea or implementation for this problem, with a lookup method whose time complexity is equal or close to O(1), like a regular hashmap (memory will not be an issue)? What would be the best possible solution?
Regarding the comments, the more I think about it, the more it looks like a relational database with indexes rather than a hashmap...
A trivial, quite easy solution could be something like an in-memory SQLite database. But it would probably be in O(log2(n)), not O(1). The main advantage is that it's easy to set up, and IF performance is good enough, it could be the final solution.
Here, the key is to use proper indexes, the LIKE operator, and of course well-defined JOIN clauses.
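For what it's worth, a minimal sketch of that idea with Python's sqlite3 module (the table and column names are mine); instead of LIKE it uses IN (value, '*') predicates and ranks candidates by how many columns match exactly, but the spirit is the same:

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE rules (origin TEXT, destination TEXT, airline TEXT, baggage TEXT)")
con.execute("CREATE INDEX idx_rules ON rules (origin, destination, airline)")
con.executemany("INSERT INTO rules VALUES (?, ?, ?, ?)", [
    ("NYC", "London", "American", "20KG"),
    ("NYC", "*", "Southwest", "30KG"),
    ("*", "*", "Southwest", "25KG"),
    ("*", "LA", "*", "20KG"),
    ("*", "*", "*", "15KG"),
])

def lookup(origin, destination, airline):
    # Keep every rule whose columns are either an exact match or '*',
    # then prefer the rule with the most exact (non-wildcard) matches.
    row = con.execute("""
        SELECT baggage FROM rules
        WHERE origin IN (?, '*') AND destination IN (?, '*') AND airline IN (?, '*')
        ORDER BY (origin = ?) + (destination = ?) + (airline = ?) DESC
        LIMIT 1""",
        (origin, destination, airline, origin, destination, airline)).fetchone()
    return row[0] if row else None

print(lookup("NYC", "Dallas", "Southwest"))  # 30KG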
From scratch, I can't think of any solution that, with N rows and M columns, isn't at least O(M)... But usually you'll have far fewer columns than rows. Quickly (I may have skipped a detail, I'm writing this on the fly), I can propose this algorithm / container:
Data must be stored in a vector-like container VECDATA, accessed by a simple index in O(1). Think of this as a primary key in databases, and we'll call it PK. Knowing the PK gives you the required data instantly, in O(1). You'll have N rows grand total.
For each row NOT containing any *, you insert into a real hashmap called MAINHASH the pair (<tuple>, PK). This is your primary index, for exact results. It is O(1), BUT what you ask for may not be in it... Obviously, you must maintain consistency between MAINHASH and VECDATA with whatever is needed (mutexes, locks, don't care, as long as both stay consistent).
This hash contains at most N entries. Without any joker, it acts almost like a standard hashmap, except for the indirection to VECDATA. It's still O(1) in this case.
For each searchable column, you build a specific index dedicated to that column.
The index has N entries. It will be a standard hashmap, but it MUST allow multiple values for a given key. That's quite a common container, so it shouldn't be an issue.
For each row, the index entry will be (<VECDATA value>, PK). The containers are stored in a vector of indexes, INDEX[i] (with 0<=i<M).
As with MAINHASH, consistency must be enforced.
Obviously, all these indexes / subcontainers should be built when an entry is inserted into VECDATA, and saved to disk across sessions if needed; you don't want to rebuild all of this each time you start the application...
Searching a row
So, the user searches for a given tuple.
Search for it in MAINHASH. If found, return it; the search is done.
Upgrade (see below): also search in CACHE before going to step #2.
For each tuple element tuple[i] (0<=i<M), search INDEX[i] both for tuple[i] (returning a vector of PKs, EXACT[i]) AND for * (returning another vector of PKs, FUZZY[i]).
With these two vectors, build another (temporary) hash TMPHASH, associating (PK, integer COUNT). It's quite simple: COUNT is initialized to 1 if the entry comes from EXACT, and 0 if it comes from FUZZY.
For the next column, build EXACT and FUZZY again (see #2). But instead of making a new TMPHASH, MERGE the results into it rather than creating a new temporary hash.
The method is: if TMPHASH doesn't have this PK entry, drop the entry: it can't match at all. Otherwise, read the COUNT value, add 1 or 0 to it depending on where the entry comes from, and put it back into TMPHASH.
Once all columns are done, you have to analyze TMPHASH.
Analyzing TMPHASH
First, if TMPHASH is empty, you don't have any suitable answer. Return that to the user. If it contains only one entry, same thing: return it to the user directly.
For more than one element in TMPHASH:
Walk the whole TMPHASH container, searching for the maximum COUNT. Keep in memory the PK associated with the current maximum COUNT.
Developer's choice: in case of multiple COUNTs at the same maximum value, you can either return them all, return the first one, or return the last one.
COUNT is obviously always strictly lower than M; otherwise, you would have found the tuple in MAINHASH. Compared to M, this value can give a confidence mark for your result (= 100*COUNT/M % confidence).
You can also now store the original tuple searched and the corresponding PK in another hashmap called CACHE.
Since it would be far too complicated to properly update CACHE when adding/modifying something in VECDATA, simply purge CACHE when that happens. It's only a cache, after all...
This is quite complex to implement if the language doesn't help you, in particular by allowing operator overloading and providing all the base containers, but it should work.
Exact matches / cached matches are O(1). The fuzzy search is O(n.M), n being the number of matching rows (and 0<=n<N, of course).
Without further research, I can't see anything better than that. It will consume an obscene amount of memory, but you said that won't be an issue.
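To make the description above concrete, here is a minimal Python sketch of the same idea (the class and variable names are mine, mirroring VECDATA / MAINHASH / INDEX and the COUNT merge; it returns one best candidate and leaves out the CACHE and persistence parts):

from collections import defaultdict

class WildcardTable:
    def __init__(self, num_columns):
        self.M = num_columns
        self.VECDATA = []                                  # PK -> (row tuple, payload)
        self.MAINHASH = {}                                 # exact (wildcard-free) rows only
        self.INDEX = [defaultdict(list) for _ in range(num_columns)]

    def insert(self, row, payload):
        pk = len(self.VECDATA)
        self.VECDATA.append((row, payload))
        if '*' not in row:
            self.MAINHASH[row] = pk
        for i, value in enumerate(row):
            self.INDEX[i][value].append(pk)

    def lookup(self, query):
        # 1. exact match first
        pk = self.MAINHASH.get(query)
        if pk is not None:
            return self.VECDATA[pk][1]
        # 2. per-column EXACT/FUZZY candidates, merged with a running COUNT
        counts = None
        for i, value in enumerate(query):
            exact = self.INDEX[i].get(value, [])
            fuzzy = self.INDEX[i].get('*', [])
            col = {p: 1 for p in exact}
            col.update({p: 0 for p in fuzzy if p not in col})
            if counts is None:
                counts = col
            else:
                counts = {p: c + col[p] for p, c in counts.items() if p in col}
        if not counts:
            return None
        # 3. prefer the candidate with the most exact (non-wildcard) matches
        best = max(counts, key=counts.get)
        return self.VECDATA[best][1]

table = WildcardTable(3)
for row, kg in [(("NYC", "London", "American"), "20KG"),
                (("NYC", "*", "Southwest"), "30KG"),
                (("*", "*", "Southwest"), "25KG"),
                (("*", "LA", "*"), "20KG"),
                (("*", "*", "*"), "15KG")]:
    table.insert(row, kg)

print(table.lookup(("NYC", "Dallas", "Southwest")))  # 30KG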
I would recommend doing this with tries that are decorated with a little extra data. For routes, you want to know the lowest route ID, so we can match a flight to the first available route. For flights, you want to track how many flights are left to match.
What this allows you to do, for instance, is to realize ONLY ONCE, partway through the match, that flights from city1 to city2 might match routes that start with (city1, city2), (city1, *), (*, city2) or (*, *), without having to repeat that logic for each route or flight.
Here is a proof of concept in Python:
import heapq
import weakref

class Flight:
    def __init__(self, fields, flight_no):
        self.fields = fields
        self.flight_no = flight_no

class Route:
    def __init__(self, route_id, fields, baggage):
        self.route_id = route_id
        self.fields = fields
        self.baggage = baggage

class SearchTrie:
    def __init__(self, value=0, item=None, parent=None):
        # value = number of unmatched flights, for the flight trie
        # value = lowest route id, for the route trie
        self.value = value
        self.item = item
        self.trie = {}
        self.parent = None
        if parent:
            self.parent = weakref.ref(parent)

    def add_flight(self, flight, i=0):
        self.value += 1
        fields = flight.fields
        if i < len(fields):
            if fields[i] not in self.trie:
                self.trie[fields[i]] = SearchTrie(0, None, self)
            self.trie[fields[i]].add_flight(flight, i + 1)
        else:
            self.item = flight

    def remove_flight(self):
        self.value -= 1
        if self.parent and self.parent():
            self.parent().remove_flight()

    def add_route(self, route, i=0):
        route_id = route.route_id
        fields = route.fields
        if i < len(fields):
            if fields[i] not in self.trie:
                self.trie[fields[i]] = SearchTrie(route_id)
            self.trie[fields[i]].add_route(route, i + 1)
        else:
            self.item = route

def match_flight_baggage(route_search, flight_search):
    # Construct a heap of one search to do.
    tmp_id = 0
    todo = [((0, tmp_id), route_search, flight_search)]
    # This will hold, by flight number, the baggage.
    matched = {}
    while 0 < len(todo):
        priority, route_search, flight_search = heapq.heappop(todo)
        if 0 == flight_search.value:  # There are no flights left to match
            # Already matched all flights.
            pass
        elif flight_search.item is not None:
            # We found a match!
            matched[flight_search.item.flight_no] = route_search.item.baggage
            flight_search.remove_flight()
        else:
            for key, r_search in route_search.trie.items():
                if key == '*':  # Found wildcard.
                    for a_search in flight_search.trie.values():
                        if 0 < a_search.value:
                            heapq.heappush(todo, ((r_search.value, tmp_id), r_search, a_search))
                            tmp_id += 1
                elif key in flight_search.trie and 0 < flight_search.trie[key].value:
                    heapq.heappush(todo, ((r_search.value, tmp_id), r_search, flight_search.trie[key]))
                    tmp_id += 1
    return matched

# Sample data - the id is the position.
route_data = [
    ["NYC", "London", "American", "20KG"],
    ["NYC", "*", "Southwest", "30KG"],
    ["*", "*", "Southwest", "25KG"],
    ["*", "LA", "*", "20KG"],
    ["*", "*", "*", "15KG"],
]
routes = []
for i in range(len(route_data)):
    data = route_data[i]
    routes.append(Route(i, [data[0], data[1], data[2]], data[3]))

flight_data = [
    ["NYC", "London", "American"],
    ["NYC", "Dallas", "Southwest"],
    ["Dallas", "Houston", "Southwest"],
    ["Denver", "LA", "American"],
    ["Denver", "Houston", "American"],
]
flights = []
for i in range(len(flight_data)):
    data = flight_data[i]
    flights.append(Flight([data[0], data[1], data[2]], i))

# Convert to searches.
flight_search = SearchTrie()
for flight in flights:
    flight_search.add_flight(flight)

route_search = SearchTrie()
for route in routes:
    route_search.add_route(route)

print(match_flight_baggage(route_search, flight_search))
As Wisblade notes in his answer, for an array of N rows and M columns the best possible complexity is O(M). You can get O(1) only if you consider M to be a constant.
You can easily solve your problem in O(2^M), which is practical for a small M and is effectively O(1) if you consider M to be a constant.
Create a single hashmap which contains (as keys) strings of concatenated column values, possibly separated by some special character, e.g. a slash:
map.put("NYC/London/American", "20KG");
map.put("NYC/*/Southwest", "30KG");
map.put("*/*/Southwest", "25KG");
map.put("*/LA/*", "20KG");
map.put("*/*/*", "15KG");
Then, when you query, you try different combinations of actual data and wildcard characters. E.g. let's assume you want to query NYC/LA/Southwest; then you try the following combinations:
map.get("NYC/LA/Southwest"); // null
map.get("NYC/LA/*"); // null
map.get("NYC/*/Southwest"); // found: 30KG
If the answer in the third step was null, you would continue as follows:
map.get("NYC/*/*"); // null
map.get("*/LA/Southwest"); // null
map.get("*/LA/*"); // found: 20KG
And there still remain two options:
map.get("*/*/Southwest"); // found: 25KG
map.get("*/*/*"); // found: 15KG
Basically, for three data columns you have 8 possibilities to check in the hashmap -- not bad! And you may well find an answer much earlier.
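Here is a small Python sketch of the same lookup (names are mine), generating all 2^M exact/wildcard combinations and trying the most specific ones (fewest *) first:

from itertools import product

rules = {
    "NYC/London/American": "20KG",
    "NYC/*/Southwest": "30KG",
    "*/*/Southwest": "25KG",
    "*/LA/*": "20KG",
    "*/*/*": "15KG",
}

def lookup(*values):
    # Every column can either keep its value or be replaced by '*': 2^M keys to try.
    candidates = sorted(product(*[(v, '*') for v in values]),
                        key=lambda combo: combo.count('*'))
    for combo in candidates:
        result = rules.get("/".join(combo))
        if result is not None:
            return result
    return None

print(lookup("NYC", "LA", "Southwest"))  # 30KG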

Python: Printing vertically

The final program will print the distance between states. I'm trying to print the menu with the names of the states, numbered and listed vertically, but I really struggle to find my mistakes.
This code doesn't raise any error; it just prints nothing.
state_data = """
LA 34.0522°N 118.2437°W
Florida 27.6648°N 81.5158°W
NY 40.7128°N 74.0060°W"""

states = []

import re
state_data1 = re.sub("[°N#°E]", "", state_data)

def process_states(string):
    states_temp = string.split()
    states = [(states_temp[x], float(states_temp[x + 1]), float(states_temp[x + 2])) for x in
              range(0, len(states_temp), 3)]
    return states

def menu():
    for state_data in range(state_data1):
        print(f'{state_data + 1} {name[number]}')
My first guess is that your code does not print anything (without raising errors) because you never actually call process_airports() or menu().
You have to call them like this at the end of your script:
something = process_airports(airport_data1)
menu()
This will now raise some errors though, so let's address them.
The menu() function will raise an error because neither name nor number is defined, and because you are trying to apply the range() function to a string (airport_data1) instead of an integer.
First, fixing the range error: you mixed two ideas in your for-loop, iterating over the elements of your list airport_data1 and iterating over the indexes of those elements.
You have to choose one (we'll see later that you can do both at once); in this example, I chose to iterate over the indexes of the list.
Then, since neither name nor number exists anywhere, they will raise an error. You always need to declare variables somewhere; however, in this case they are not needed at all, so let's just remove them:
def menu(data):
    for i in range(len(data)):
        print(f'{i + 1} {data[i]}')

processed_airports = process_airports(airport_data1)
menu(processed_airports)
(where data is the output of process_airports())
Now for some general advice and improvements.
First, global variables.
Notice how you can access airport_data1 within the menu() function just fine. While this works, it is not recommended; it's usually better to explicitly pass variables as arguments.
Notice how in the function I proposed above, every single variable is declared in the function itself; there is no information coming from a higher scope. Again, this is not mandatory, but it makes the code much easier to work with and understand.
airport_data = """
Alexandroupoli 40.855869°N 25.956264°E
Athens 37.936389°N 23.947222°E
Chania 35.531667°N 24.149722°E
Chios 38.343056°N 26.140556°E
Corfu 39.601944°N 19.911667°E"""

airports = []

import re
airport_data1 = re.sub("[°N#°E]", "", airport_data)

def process_airports(string):
    airports_temp = string.split()
    airports = [(airports_temp[x], float(airports_temp[x + 1]), float(airports_temp[x + 2])) for x in
                range(0, len(airports_temp), 3)]
    return airports

def menu(data):
    for i in range(len(data)):
        print(f'{i + 1} {data[i]}')

# I'm adding the call to the functions for clarity
data = process_airports(airport_data1)
menu(data)
The printed menu now looks like that:
1 ('Alexandroupoli', 40.855869, 25.956264)
2 ('Athens', 37.936389, 23.947222)
3 ('Chania', 35.531667, 24.149722)
4 ('Chios', 38.343056, 26.140556)
5 ('Corfu', 39.601944, 19.911667)
Second, and this is mostly FYI: you can access both the index of an iterable and the element itself by looping over enumerate(), meaning the following function will print the exact same thing as the one with range(len(data)). This is handy if you need to work with both the element itself and its index.
def menu(data):
    for the_index, the_element in enumerate(data):
        print(f'{the_index + 1} {the_element}')
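As a small extra note, enumerate() also accepts a start argument, so the + 1 can be dropped entirely:

def menu(data):
    # start=1 makes the index begin at 1 instead of 0
    for the_index, the_element in enumerate(data, start=1):
        print(f'{the_index} {the_element}')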

Programming and data structures

Suggest a data structure for representing a subset S of integers from 1 to n. The following operations on the set S are to be performed in constant time (independent of the cardinality of S).
You may assume that the data structure has been suitably initialized.
(i). MEMBER (X):
Check whether X is in the set S or not
(ii). FIND-ONE(S): If S is not empty, return one element of the set S (any arbitrary element will do)
(iii). ADD (X): Add integer X to set S
(iv). DELETE (X): Delete integer X from S.
My analysis:
I think a hash table will work fine here, but how will a hash table handle the FIND-ONE(S) operation? I might need to scan the entire hash table to find an element that is present.
You can just use a regular HashSet for this in Java. For FIND-ONE(S), call isEmpty(); if that returns false, use the built-in iterator and just take the first value it returns.
A hash table would work, but you need to think about the specific implementation. If you use the compact dict layout introduced in Python 3.6, you can perform FIND-ONE(S) in constant time by inspecting the entries list.
For example, the dictionary:
d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}
is represented as follows:
indices = [None, 1, None, None, None, 0, None, 2]
entries = [[-9092791511155847987, 'timmy', 'red'],
           [-8522787127447073495, 'barry', 'green'],
           [-6480567542315338377, 'guido', 'blue']]
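For illustration, a minimal Python sketch of that dict-backed idea (the wrapper class is mine): MEMBER, ADD and DELETE are average O(1) hash operations, and FIND-ONE peeks at one entry through the dict's iterator instead of scanning the whole table (with the caveat that, after many deletions, the iterator may have to skip over empty slots):

class IntSet:
    # Subset of the integers 1..n, backed by a plain dict.
    def __init__(self):
        self._d = {}

    def member(self, x):            # MEMBER(X)
        return x in self._d

    def add(self, x):               # ADD(X)
        self._d[x] = True

    def delete(self, x):            # DELETE(X)
        self._d.pop(x, None)

    def find_one(self):             # FIND-ONE(S)
        if not self._d:
            return None
        return next(iter(self._d))  # one arbitrary element, no full scan

s = IntSet()
s.add(3); s.add(7)
print(s.member(3), s.find_one())    # True 3
s.delete(3)
print(s.find_one())                 # 7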

take two arrays and make a third array from values that are NOT unique

I'm attempting to de-dupe an enormous email list migration; however, there's a catch. I'd like to take the duplicates and turn them into their own (third) array.
Let's make these arrays very simple and short.
a = ["rich#aol.com", "ian#aol.com"]
b = ["rich#aol.com"]
Essentially I'm trying to make c = ["rich#aol.com"], because it's the only email that resides on both lists.
What I've attempted so far:
Is there an opposite to uniq?
ab = a + b
ab.uniq
returns: ["rich#aol.com", "ian#aol.com"]
Could I dump a + b into a third array c, and compare c to ab.uniq to get what's duplicated?
Am I missing an easier way to do this? Any help will be much appreciated!
You want the intersection of the arrays.
c = a & b

Associatively sorting a table by value in Lua

I have a key => value table I'd like to sort in Lua. The keys are all integers, but aren't consecutive (and have meaning). Lua's only sort function appears to be table.sort, which treats tables as simple arrays, discarding the original keys and their association with particular items. Instead, I'd essentially like to be able to use PHP's asort() function.
What I have:
items = {
    [1004] = "foo",
    [1234] = "bar",
    [3188] = "baz",
    [7007] = "quux",
}
What I want after the sort operation:
items = {
    [1234] = "bar",
    [3188] = "baz",
    [1004] = "foo",
    [7007] = "quux",
}
Any ideas?
Edit: Based on answers, I'm going to assume that it's simply an odd quirk of the particular embedded Lua interpreter I'm working with, but in all of my tests, pairs() always returns table items in the order in which they were added to the table. (i.e. the two above declarations would iterate differently).
Unfortunately, because that isn't normal behavior, it looks like I can't get what I need; Lua doesn't have the necessary tools built-in (of course) and the embedded environment is too limited for me to work around it.
Still, thanks for your help, all!
You seem to misunderstand something. What you have here is an associative array. Associative arrays have no explicit order on them; it's only the internal representation that orders them.
In short: in Lua, both of the arrays you posted are the same.
What you would want instead is a representation such as this:
items = {
    {1004, "foo"},
    {1234, "bar"},
    {3188, "baz"},
    {7007, "quux"},
}
While you can't get the entries by key any more (they are indexed 1, 2, 3, 4, but you can build a separate index table), you can sort them using table.sort.
A sorting function would then be:
function compare(a, b)
    return a[1] < b[1]
end

table.sort(items, compare)
As Komel said, you're dealing with associative arrays, which have no guaranteed ordering.
If you want key ordering based on its associated value while also preserving associative array functionality, you can do something like this:
function getKeysSortedByValue(tbl, sortFunction)
    local keys = {}
    for key in pairs(tbl) do
        table.insert(keys, key)
    end

    table.sort(keys, function(a, b)
        return sortFunction(tbl[a], tbl[b])
    end)

    return keys
end
items = {
    [1004] = "foo",
    [1234] = "bar",
    [3188] = "baz",
    [7007] = "quux",
}
local sortedKeys = getKeysSortedByValue(items, function(a, b) return a < b end)
sortedKeys is {1234,3188,1004,7007}, and you can access your data like so:
for _, key in ipairs(sortedKeys) do
    print(key, items[key])
end
result:
1234 bar
3188 baz
1004 foo
7007 quux
Hmm, I missed the part about not being able to control the iteration.
But in Lua there is usually a way.
http://lua-users.org/wiki/OrderedAssociativeTable
That's a start. Now you would need to replace the pairs() that the library uses. That could be as simple as pairs = my_pairs. You could then use the solution in the link above.
PHP arrays are different from Lua tables.
A PHP array may have an ordered list of key-value pairs.
A Lua table always contains an unordered set of key-value pairs.
A Lua table acts as an array when a programmer chooses to use integers 1, 2, 3, ... as keys. The language syntax and standard library functions, like table.sort, offer special support for tables with consecutive-integer keys.
So, if you want to emulate a PHP array, you'll have to represent it using a list of key-value pairs, which is really a table of tables, but it's more helpful to think of it as a list of key-value pairs. Pass a custom "less-than" function to table.sort and you'll be all set.
N.B. Lua allows you to mix consecutive-integer keys with any other kinds of keys in the same table—and the representation is efficient. I use this feature sometimes, usually to tag an array with a few pieces of metadata.
Coming to this a few months later, with the same query. The recommended answer seemed to pinpoint the gap between what was required and how this looks in Lua, but it didn't get me exactly what I was after: a hash sorted by key.
The first three functions on this page DID, however: http://lua-users.org/wiki/SortedIteration
I did a brief bit of Lua coding a couple of years ago but I'm no longer fluent in it.
When faced with a similar problem, I copied my array to another array with keys and values reversed, then used sort on the new array.
I wasn't aware of a possibility to sort the array using the method Kornel Kisielewicz recommends.
The proposed compare function works, but only if the values in the first column are unique.
Here is a slightly enhanced compare function which ensures that, if the values in the current column are equal, the values from the next column are used to decide...
With {1234, "baam"} < {1234, "bar"} being true, the array containing "baam" will be sorted before the array containing "bar".
local items = {
    {1004, "foo"},
    {1234, "bar"},
    {1234, "baam"},
    {3188, "baz"},
    {7007, "quux"},
}

local function compare(a, b)
    for inx = 1, #a do
        -- print("A " .. inx .. " " .. a[inx])
        -- print("B " .. inx .. " " .. b[inx])
        if a[inx] ~= b[inx] then
            -- the values differ in this column, so it decides the order
            return a[inx] < b[inx]
        end
        -- the values are equal in this column, fall through to the next one
    end
    return false
end

table.sort(items, compare)

Resources