Logistic Regression: How to maximize function parameters? - algorithm

I have a Python function MyFunction (a, b, c, d, e, f, g, h, i, j) which takes several parameters, processes some real data and returns a numerical value x. More specifically, the function processes a data table with 150000 rows and counts how many rows fulfill certain conditions based on the inputs.
For example MyFunction (1, 1, 1, 1, 1, 1, 2, 1, 1, 2) returns 79107, MyFunction (1, 3, -1.5545, 7, 3, 1, 3, 15, 1.785, -2.5454) returns 68758 and so on.
How can I find which combination of those 10 parameters a, b, c, d, e, f, g, h, i, j gives the maximum possible value of x? The parameters can be any numbers (float or integer) in any range. x is always in the range 0-150000.
EDIT: Here's the code with data I use if somebody wants to take a look. Colab

I solved my case using the scipy.optimize.minimize function. The calculation was very fast, took less than a minute. I'm very surprised at how efficient it is. But I had to try different calculation methods, it's only the method='Powell' that worked like a charm in my case.
from scipy.optimize import minimize
def MyFunction(params):
    # minimize passes all parameters as one array, so unpack them here
    a, b, c, d, e, f, g, h, i, j = params
    # do something that computes x
    return -x  # minimize() minimizes, so returning -x maximizes x
StartValues = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
res = minimize(MyFunction, StartValues, method='Powell', tol=0.01)
print(res)

Related

Generating rows of combinations

I don't know how I should formulate this question, but I hope I can explain what I want to achieve.
So I got a set of characters [A, B, C].
I want to generate the minimal amount of rows with length of N needed to contain all possible combinations of [A, B, C].
Example: when N = 4, it generates something like this, with 9 rows of length N (1 column = 1 row):
AAABBBCCC
ABCCABBCA
BACACBCBA
ABCABCABC
For example, the first row [A, A, B, A] contains the following combinations (1 column = 1 combination); notice how the combinations can wrap around the row.
A, A, A, B
A, A, B, A
A, B, A, A
B, A, A, A
However, a combination is allowed to appear more than once among all generated rows, but such repeats should be kept to a minimum.
How should I go about this programmatically?
Are you counting in base 3, with the digits 1, 2, 3 instead of 0, 1, 2?
If you don't know how to count in base three, let's do it now.
0, 1, 2, 10, 11, 12, 20, 21, 22, 100
If you want to know the number of rows with N digits, the answer is 3^N.
If you want to know which sequence sits at a given position (the first position is zero) in the sorted list of rows of a given length, you can use the following function:
def row(k, N):
    d = []
    assert k < 3**N
    for _ in range(N):
        k, r = divmod(k, 3)
        d.append(r + 1)
    return ''.join(str(di) for di in d[::-1])
An easy one to verify: row(1, 3)='112' is the second of the rows of length 3.
One that is not so easy to verify: the billionth term of length 25 is given by row(10**9-1, 25)='1111113231311311132121111'.
Using generic symbols
If you want to return a list of arbitrary objects (not necessarily three of them), it is just a matter of changing how the output is mapped.
def row(k, symbols, N):
    d = []
    B = len(symbols)
    assert k < B**N
    for _ in range(N):
        k, r = divmod(k, B)
        d.append(symbols[r])
    return d[::-1]
Using it
print(row(1, ['red', 'green', 'blue'], 6))
print(row(100, ['red', 'green', 'blue'], 6))
> ['red', 'red', 'red', 'red', 'red', 'green']
> ['red', 'green', 'red', 'blue', 'red', 'green']
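As a sanity check, here is a minimal sketch that assumes the intended ordering is the lexicographic order produced by itertools.product (last position varying fastest). Under that assumption, the k-th tuple from product should equal row(k, symbols, N). The row function is repeated so the snippet runs on its own; the names all_rows and matches are only illustrative.

```python
from itertools import product

def row(k, symbols, N):
    """Return the k-th (zero-based) length-N sequence over symbols."""
    B = len(symbols)
    assert 0 <= k < B**N
    d = []
    for _ in range(N):
        k, r = divmod(k, B)   # peel off the least significant digit
        d.append(symbols[r])
    return d[::-1]            # most significant digit first

symbols = ['red', 'green', 'blue']
N = 4

# Enumerate every row explicitly and compare with the direct formula.
all_rows = [list(t) for t in product(symbols, repeat=N)]
matches = all(row(k, symbols, N) == all_rows[k]
              for k in range(len(symbols) ** N))
print(matches)
```

The explicit enumeration is only practical for small N; the point of row is that it reaches a single position without building the other 3^N - 1 rows.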

Biggest non-contiguous submatrix with all ones

I'm tackling the problem of finding a non-contiguous submatrix of a boolean matrix with maximum size such that all of its cells are ones.
As an example, consider the following matrix:
M = [[1, 0, 1, 1],
     [0, 0, 1, 0],
     [1, 1, 1, 1]]
A non-contiguous submatrix of M is specified as a set of rows R and a set of columns C. The submatrix is formed by all the cells that are in some row in R and in some column in C (the intersections of R and C). Note that a non-contiguous submatrix is a generalization of a submatrix, so any (contiguous) submatrix is also a non-contiguous submatrix.
There is one maximum non-contiguous submatrix of M that has a one in all of its cells. This submatrix is defined by R={1, 3} and C={1, 3, 4}, which yields:
M[1, 3][1, 3, 4] = [[1, 1, 1],
                    [1, 1, 1]]
I'm having difficulties finding existing literature about this problem. I'm looking for efficient algorithms that don't necessarily need to be optimal (so I can relax the problem to finding maximal size submatrices). Of course, this can be modeled with integer linear programming, but I want to consider other alternatives.
In particular, I want to know whether this problem is already known and covered by the literature, whether my definition of a non-contiguous submatrix makes sense, and whether a different name for it already exists.
Thanks!
Since per your response to Josef Wittmann's comment you want to find the Rectangle Covering Number, my suggestion would be to construct the Lovász–Saks graph and apply a graph coloring algorithm.
The Lovász–Saks graph has a vertex for each 1 entry in the matrix and an edge between each pair of vertices whose 2x2 matrix contains a zero. In your example,
[[1, 0, 1, 1],
 [0, 0, 1, 0],
 [1, 1, 1, 1]]
we can label the 1s with letters:
[[a, 0, b, c],
 [0, 0, d, 0],
 [e, f, g, h]]
and then get edges
a--d, a--f, b--f, c--d, c--f, d--e, d--f, d--h,
one for each of the following 2x2 matrices:
a b   a 0   0 b   b c   0 c   0 d   0 d   d 0
0 d   e f   f g   d 0   f h   e f   f g   g h
I think an optimal coloring is
{a, b, c, e, g, h} -> 1
{d} -> 2
{f} -> 3.
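The construction can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names lovasz_saks_graph and greedy_coloring are made up here, cells are identified by zero-based (row, column) pairs rather than letters, and the coloring is a simple greedy pass in row-major order, which is not guaranteed to be optimal in general.

```python
from itertools import combinations

def lovasz_saks_graph(matrix):
    """Vertices are the 1-cells; two cells are joined when they lie in
    different rows and columns and the 2x2 submatrix they span has a 0."""
    ones = [(r, c) for r, row in enumerate(matrix)
            for c, v in enumerate(row) if v == 1]
    edges = [
        (p, q) for p, q in combinations(ones, 2)
        if p[0] != q[0] and p[1] != q[1]
        and (matrix[p[0]][q[1]] == 0 or matrix[q[0]][p[1]] == 0)
    ]
    return ones, edges

def greedy_coloring(vertices, edges):
    """Give each vertex the smallest color unused by its colored neighbors."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    color = {}
    for v in vertices:
        used = {color[n] for n in neighbors[v] if n in color}
        color[v] = next(c for c in range(len(vertices)) if c not in used)
    return color

M = [[1, 0, 1, 1],
     [0, 0, 1, 0],
     [1, 1, 1, 1]]
ones, edges = lovasz_saks_graph(M)
coloring = greedy_coloring(ones, edges)
print(len(edges), max(coloring.values()) + 1)  # 8 3
```

On this example the greedy pass happens to reproduce the three color classes listed above. For larger matrices a stronger coloring heuristic or an exact solver may be needed.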

Lua - Choose a random value from a range (or table) excluding the values of a (or another) table

A range, 1, 2, 3, 4, 5, 6, 7, 8 (it can populate a Lua table if it makes it easier)
table = {1, 4, 3}
The possible random choice should be among 2, 5, 6, 7, 8.
In Python I have used this to get it:
possibleChoices = random.choice([i for i in range(1, 9) if i not in table])
Any ideas how to achieve the same in Lua?
Lua has a very minimal library, so you will have to write your own functions to do some tasks that are automatically provided in many other languages.
A good way to go about this is to write small functions that solve part of your problem, and to incorporate those into a final solution. Here it would be nice to have a range of numbers, with certain of those numbers excluded, from which to randomly draw a number. A range can be obtained by using a range function:
-- Returns a sequence containing the range [a, b].
function range (a, b)
   local r = {}
   for i = a, b do
      r[#r + 1] = i
   end
   return r
end
To get a sequence with some numbers excluded, a seq_diff function can be written; this version makes use of a member function:
-- Returns true if x is a value in the table t.
function member (x, t)
   for k, v in pairs(t) do
      if v == x then
         return true
      end
   end
   return false
end
-- Returns the sequence u - v.
function seq_diff (u, v)
   local result = {}
   for _, x in ipairs(u) do
      if not member(x, v) then
         result[#result + 1] = x
      end
   end
   return result
end
Finally these smaller functions can be combined into a solution:
-- Returns a random number from the range [a, b],
-- excluding numbers in the sequence seq.
function random_from_diff_range (a, b, seq)
   local selections = seq_diff(range(a, b), seq)
   return selections[math.random(#selections)]
end
Sample interaction:
> for i = 1, 20 do
>> print(random_from_diff_range(1, 8, {1, 4, 3}))
>> end
8
6
8
5
5
8
6
7
8
5
2
5
5
7
2
8
7
2
6
5

Mathematica enumerate combinations

I need to enumerate combinations for 3 groups of values that I have. The groups are (a,b,c,d), (e,f,g,h), (i,j,k,l) for example. The total combinations are 4x4x4=64.
Does anyone have an idea how I can define the ascending numbering of these combinations?
I have written something in that form:
Do[Do[Do[x["formula is needed here"]=s[[i,j,k]],{k,1,4}],{j,1,4}],{i,1,4}]
I cannot find the formula for the numbering of the combinations. I have read something about "Generating the mth Lexicographical Element of a Mathematical Combination" but I am more lost than helped. x is supposed to take values 1,2,3,....,64.
Thank you for your suggestions!
If you need a "formula" for the nth tuple, it looks like this:
{Floor[(# - 1)/16] + 1,
  Floor[Mod[# - 1, 16]/4] + 1,
  Floor[Mod[# - 1, 4]] + 1} & /@ Range[64] ==
 Tuples[Range[4], 3]
True
So if you want, say, the 12th combination of your sets, you could do something like this:
({Floor[(# - 1)/16] + 1,
   Floor[Mod[# - 1, 16]/4 + 1],
   Mod[# - 1, 4] + 1} &@12);
{{a, b, c, d}[[%[[1]]]], {e, f, g, h}[[%[[2]]]], {i, j, k, l}[[%[[3]]]]}
{a, g, l}
Note that whatever you are doing, it is almost always best to use the built-in object-oriented functions.
Tuples[{{a, b, c, d}, {e, f, g, h}, {i, j, k, l}}][[12]]
{a, g, l}
Edit: for completeness a generalization of the first expression:
listlen = 6;
nsamp = 4;
Table[Floor[Mod[# - 1, listlen^i]/listlen^(i - 1) + 1],
   {i, nsamp, 1, -1}] & /@ Range[listlen^nsamp] ==
 Tuples[Range[listlen], nsamp]
True
Tuples[{{a, b, c, d}, {e, f, g, h}, {i, j, k, l}}]
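The Floor/Mod formula above is a mixed-radix decomposition of n - 1. As a cross-check outside Mathematica, here is a minimal Python sketch of the same idea; the function name nth_combination is made up for illustration, and the groups are allowed to have different lengths, which goes slightly beyond the 4x4x4 case in the question.

```python
from itertools import product

def nth_combination(n, groups):
    """Return the n-th (1-based) tuple, with the last group varying fastest."""
    k = n - 1
    picked = []
    # Peel off one "digit" per group, starting from the fastest-varying one.
    for group in reversed(groups):
        k, r = divmod(k, len(group))
        picked.append(group[r])
    return tuple(reversed(picked))

groups = [list("abcd"), list("efgh"), list("ijkl")]
print(nth_combination(12, groups))  # ('a', 'g', 'l')

# Compare against the full enumeration, the analogue of Tuples[...].
all_tuples = list(product(*groups))
agrees = all(nth_combination(n, groups) == all_tuples[n - 1]
             for n in range(1, len(all_tuples) + 1))
print(agrees)
```

For the 12th combination this gives the same (a, g, l) as the Mathematica expressions above.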

How to check for equality element-wise for two vectors?

I have two vectors that I need to check element wise for equality and return the total number of elements that are equal. So comparing a = {1,0,1} and b = {1,0,0} would return 2.
The example below is an effort I've made of a recursive function, but is returning errors.
Elementcompare[list1_, list2_] := If[First[list1] == First[list2], 1, 0] + Elementcompare[Rest[list1], Rest[list2]];
Thanks
I assume the vectors have the same length in general. There is a built-in function for this, HammingDistance, which you can use to define:
elcom[a_List, b_List] := Length[a] - HammingDistance[a, b]
Test it out
elcom[a, b]
2
Also check out EditDistance.
An easy and fast method is to use vector-level numeric operations.
a = {0, 1, 0, 1, 2};
b = {2, 1, 3, 1, 2};
a - b
{-2, 0, -3, 0, 0}
Unitize[a - b]
{1, 0, 1, 0, 0}
Tr @ Unitize[a - b]
2
This is equivalent to HammingDistance in this use:
HammingDistance[a, b]
2
I use Tr to sum because it is very fast on packed arrays. Speed comparison with HammingDistance on version 7 with two long lists:
a = RandomInteger[3, 500000];
b = RandomInteger[3, 500000];
Do[HammingDistance[a, b], {50}] // Timing // First
Do[Tr @ Unitize[a - b], {50}] // Timing // First
0.968
0.171
Performance is more similar when a and b are not packed arrays but the numeric method still wins. You can subtract the returned value from Length[a] to get your target metric just as Vitaliy showed.
If your vectors are bit-vectors (0s and 1s), you can squeeze more speed out of this computation by using bit operators:
a = RandomInteger[1, 500000];
b = RandomInteger[1, 500000];
First, check for consistency:
HammingDistance[a, b]
249965
Tr@Unitize[a - b]
249965
Total@BitXor[a, b]
249965
Check for speed:
Do[HammingDistance[a, b], {50}] // Timing // First
1.98993
Do[Tr@Unitize[a - b], {50}] // Timing // First
0.437551
Do[Total@BitXor[a, b], {50}] // Timing // First
0.139816
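For comparison, the same counting logic can be sketched in plain Python. This is only an illustration of the idea, not a translation of the timings above; the function names count_equal and hamming_bits are made up here, and equal-length inputs are assumed, as in the answers.

```python
def count_equal(a, b):
    """Number of positions where a and b hold equal elements."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x == y for x, y in zip(a, b))

def hamming_bits(a, b):
    """Hamming distance for 0/1 vectors, using XOR as in the BitXor variant."""
    return sum(x ^ y for x, y in zip(a, b))

print(count_equal([1, 0, 1], [1, 0, 0]))              # 2
print(count_equal([0, 1, 0, 1, 2], [2, 1, 3, 1, 2]))  # 3
print(hamming_bits([1, 0, 1], [1, 0, 0]))             # 1
```

As in the Mathematica versions, the number of equal elements is the length minus the Hamming distance.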
