Modular inversion in algorithms

I am reading about the extended Euclid algorithm in the Algorithms book by Sanjoy Dasgupta, page 33 of the following link:
http://www.cse.iitd.ernet.in/~naveen/courses/CSL630/all.pdf
Suppose we wish to compute 11^-1 mod 25.
Using the extended Euclid algorithm, we find that 15 * 25 - 34 * 11 = 1. Reducing both sides modulo 25, we have -34 * 11 ≡ 1 (mod 25). So -34 ≡ 16 (mod 25) is the inverse of 11 mod 25.
My question is how the author concluded that "-34 ≡ 16 (mod 25) is the inverse of 11 mod 25" from the previous statement.

Since 15 * 25 - 34 * 11 = 1, you have 15 * 25 - 34 * 11 ≡ 1 (mod 25), which leads to -34 * 11 ≡ 1 (mod 25).
If you have a * b = 1, then a is the multiplicative inverse of b, regardless of whether a and b are matrices, field elements, or elements of a residue class ring.
You get the result 16 when you normalize -34 to the range between 0 and 24: 16 = 2 * 25 - 34, and thus 16 * 11 ≡ 1 (mod 25). Note: there is exactly one natural k with k * 25 - 34 between 0 and 24, and in this case it is 2.
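To make the normalization step concrete, here is a small Python sketch (the helper names are my own, not from the book) that runs the extended Euclid algorithm and reduces the Bézout coefficient into the range 0..m-1:

```python
def ext_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(a, m):
    g, x, _ = ext_gcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return x % m  # normalize into 0..m-1, just as -34 -> 16 (mod 25)

print(mod_inverse(11, 25))  # 16, since 11 * 16 = 176 = 7*25 + 1
```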

Related

Quickly compute `dot(a(n:end), b(1:end-n))`

Suppose we have two one-dimensional arrays of values, a and b, both of length N. I want to create a new array c such that c(n) = dot(a(n:N), b(1:N-n+1)). I can of course do this using a simple loop:
for n = 1:N
    c(n) = dot(a(n:N), b(1:N-n+1));
end
but given that this is such a simple operation which resembles a convolution, I was wondering if there isn't a more efficient way to do this (using Matlab).
A solution using 1D convolution conv:
out = conv(a, flip(b));
c = out(ceil(numel(out)/2):end);
conv computes a true convolution, in which the first vector is slid against a reversed copy of the second; since we want correlation-style dot products, we convolve a with the flipped b and trim the unnecessary leading part of the result.
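For readers working in Python, the same trick can be sketched with NumPy (this port is my own, not part of the original answer): np.convolve plays the role of conv, and slicing from index N-1 keeps the trailing half.

```python
import numpy as np

a = np.array([9, 10, 2, 10, 7])
b = np.array([1, 3, 6, 10, 10])
N = len(a)

# convolving a with the reversed b turns the convolution into the desired dot products
out = np.convolve(a, b[::-1])  # full convolution, length 2*N - 1
c = out[N - 1:]                # keep the trailing half
print(c.tolist())              # [221, 146, 74, 31, 7]
```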
This is an interesting problem!
I am going to assume that a and b are column vectors of the same length. Let us consider a simple example:
a = [9;10;2;10;7];
b = [1;3;6;10;10];
% yields:
c = [221;146;74;31;7];
Now let's see what happens when we compute the convolution of these vectors:
>> conv(a,b)
ans =
9
37
86
166
239
201
162
170
70
>> conv2(a, b.')
ans =
9 27 54 90 90
10 30 60 100 100
2 6 12 20 20
10 30 60 100 100
7 21 42 70 70
We notice that c is the sum of elements along the lower diagonals of the result of conv2. To show this more clearly, we transpose to get the diagonals in the same order as the values in c:
>> triu(conv2(a.', b))
ans =
9 10 2 10 7
0 30 6 30 21
0 0 12 60 42
0 0 0 100 70
0 0 0 0 70
So now it becomes a question of summing the diagonals of a matrix, which is a more common problem with existing solutions, for example this one by Andrei Bobrov:
C = conv2(a.', b);
p = sum( spdiags(C, 0:size(C,2)-1) ).'; % This gives the same result as the loop.

Find the Big-O of the modular algorithm

i = 1
while i <= n do:
    j = 0
    k = i
    while k % 3 == 0 do:
        k = k / 3
        j++
    end
    print i, j
    i++
end
What is the Big-O of the given algorithm, and how do I show my work? Also, what does this algorithm output?
My answer and approach
O(n log n), because the outer loop runs in linear time, O(n), while the inner loop depends on i and takes O(log n). But I'm not sure the inner factor really is log n.
When n = 10,
i  j
0  0
1  0
2  0
3  1
4  0
5  0
6  1
7  0
8  0
9  2
10 0
When n = 30
i j
1 0
2 0
3 1
4 0
5 0
6 1
7 0
8 0
9 2
10 0
11 0
12 1
13 0
14 0
15 1
16 0
17 0
18 2
19 0
20 0
21 1
22 0
23 0
24 1
25 0
26 0
27 3
28 0
29 0
30 1
Note that every third number in the series from 1 to n is divisible by 3. In the worst case, such a number ends up being divided by 3 in the k loop up to log_3(i) times. So the loop body does O(1) work two thirds of the time, and at most O(log_3 n) work one third of the time. We can therefore claim that your code is upper bounded by O(n log_3 n), although there is a bound that is tighter than this.
The code prints each value i in the series along with the "three depth" of that number, i.e. how many times we can divide i by 3. Obviously, for i values that are not multiples of 3, the depth is 0.
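The tighter bound is in fact Θ(n): the total inner-loop work is the sum of the three depths, ⌊n/3⌋ + ⌊n/9⌋ + ⌊n/27⌋ + ... < n/2, so it is dominated by the outer loop. A small Python check (my own sketch, not part of the original answer):

```python
def total_inner_iterations(n):
    """Total number of divisions performed by the inner k loop for i = 1..n."""
    total = 0
    for i in range(1, n + 1):
        k = i
        while k % 3 == 0:
            k //= 3
            total += 1
    return total

def depth_sum(n):
    """Closed form: floor(n/3) + floor(n/9) + ... which is always < n/2."""
    total, power = 0, 3
    while power <= n:
        total += n // power
        power *= 3
    return total

print(total_inner_iterations(30), depth_sum(30))  # both 14
```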

Time complexity for a very complicated recursion code

I am having trouble calculating the time complexity of this code:
function foo(int a):
    if a < 1:
        return 1
    else:
        for i = 1 to 4:
            foo(a - 3)
        for i = 1 to 4:
            foo(a / 2)
end function
As far as I can go:
T(n) = 1 if n < 1
T(n) = 4T(n-3) + 4T(n/2) if n >= 1
     = 4(4T(n-6) + 4T((n-3)/2)) + 4(4T(n/2 - 3) + 4T(n/4))
     ~ 4^2 (T(n-6) + T((n-3)/2) + T(n/2 - 3) + T(n/4))
This quickly becomes very complicated, since the number of T terms doubles with every expansion and the arguments themselves get messy.
Are there any other ways to solve this problem?
Let's expand the recursive cost function:
T(n) = 4 [T(n-3) + T(n/2)]
T(n) = 4^2 [T(n-6) + T((n-3)/2) + T((n-6)/2) + T(n/4)]
T(n) = 4^3 [T(n-9) + T((n-6)/2) + T((n-9)/2) + T((n-12)/2) + T((n-3)/4) + T((n-6)/4) + T((n-12)/4) + T(n/8)]
From the moment the x in T(x) drops below 1, you should replace T(x) with 1. And from that moment on, the T(x) doesn't generate any "children" anymore so to speak.
What does this mean? It means that after the k-th expansion of T(n), the function will look like:
T(n) = 4^k * [number of paths of length k]
and we keep increasing k until all paths have "died". This is definitely the case after n/3 iterations, because that is the longest possible path.
We thus have some kind of graph, for instance for n=9:
9 + 6 + 3 + 0
| | ` 1
| `3 + 0
| ` 1
`4 + 1
` 2 + -1
` 1
so 6 paths. Now the problem is how to count the number of paths. To do this, we first represent the main path n, n-3, n-6, ... as a horizontal line of nodes; this is definitely the longest path:
n n-3 n-6 n-9 ... 1
Now each of these nodes (except for one) spawns a child node i -> i/2:
n n-3 n-6 n-9 ... 4 1
| | | |
n/2 (n-3)/2 (n-6)/2 (n-9)/2 ... 2
(the second row shows all nodes created by divisions by 2). These nodes in turn generate offspring via n -> n-3, which, since everything has been divided by two, is n/2 -> (n-6)/2; in other words, there are edges that make jumps of two:
n n-3 n-6 n-9 ... 4 1
| | /-----+-------(n-9)/2 |
n/2 (n-3)/2 (n-6)/2 (n-9)/2 ... 2
\---------->(n-6)/2 \------->...
In other words, except for the first two elements, all other nodes in the second row count for two. If we represent this as a graph with nodes labeled by their weight, it looks like:
1 -- 1 -- 1 -- 1 -- 1 -- .. -- .. -- 1
| | | | | | |
1 -- 1 -- 2 -- 2 -- 2 -- .. -- 2
Or if we keep doing this for this process:
1 -- 1 -- 1 -- 1 -- 1 -- .. -- .. -- .. -- .. -- ..-- 1
| | | | | | | | | |
1 -- 1 -- 2 -- 2 -- 2 -- .. -- .. -- .. -- .. -- 2
| | | | | | | |
1 -- 1 -- 2 -- 2 -- 3 -- .. -- .. -- 4
(the third row generates children 4 items further).
Now we need to calculate the sum of the last row, which is at most O(log n).
This results in an upper bound of O(4^(n/3) * log n). It is definitely possible that the bound is tighter, say 4^(n/3 + epsilon); the log doesn't really matter when it comes down to the exponent.
Experiments
One can turn the program into a program that calculates the cost (in Python):
def memodict(f):
    """ Memoization decorator for a function taking a single argument """
    class memodict(dict):
        def __missing__(self, key):
            ret = self[key] = f(key)
            return ret
    return memodict().__getitem__

@memodict
def foo(a):
    if a < 1:
        return 1
    else:
        return 1 + 4 * (foo(a - 3) + foo(a // 2))

for i in range(1000):
    print('{0} {1}'.format(i, foo(i)))
Mind the 1+: calling the function at a non-leaf node carries computational cost as well.
It shows the following graph (with the y axis in log space; the plot is not reproduced here):
If one looks very closely, it looks as if log n is a better estimate, although I don't know if it is safe to say this.
This results in the table below (calculated further, up to 2000):
1 9
2 41
3 41
4 201
5 329
6 329
7 969
8 2121
9 2121
10 5193
11 9801
12 9801
13 22089
14 43081
15 43081
16 96841
17 180809
18 180809
19 395849
20 744009
21 744009
22 1622601
23 3015241
24 3015241
25 6529609
26 12149321
27 12149321
28 26290761
29 48769609
30 48769609
31 105335369
32 195465801
33 195465801
34 422064713
35 782586441
36 782586441
37 1688982089
38 3131929161
39 3131929161
40 6758904393
41 12530692681
42 12530692681
43 27038593609
44 50129261129
45 50129261129
46 108166435401
47 200529105481
48 200529105481
49 432677802569
50 802142540361
51 802142540361
52 1730759807561
53 3208618758729
54 3208618758729
55 6923087827529
56 12834580197961
57 12834580197961
58 27692546388553
59 51338515870281
60 51338515870281
61 110770380632649
62 205354484822601
63 205354484822601
64 443082304393801
65 821418721153609
66 821418721153609
67 1772329999438409
68 3285676572873289
69 3285676572873289
70 7089323128099401
71 13142709421838921
72 13142709421838921
73 28357295642743369
74 52570844443284041
75 52570844443284041
76 113429195098690121
77 210283390300852809
78 210283390300852809
79 453716792922477129
80 841133588239028809
81 841133588239028809
82 1814867221812679241
83 3364534403078885961
84 3364534403078885961
85 7259468937373487689
86 13458137720469918281
87 13458137720469918281
88 29037875950010995273
89 53832551082396717641
90 53832551082396717641
91 116151504000561025609
92 215330204762252612169
93 215330204762252612169
94 464606016804360524361
95 861320819851126870601
96 861320819851126870601
97 1858424068019558519369
98 3445283281135218692681
99 3445283281135218692681
100 7433696275286804238921
101 13781133127749444932169
102 13781133127749444932169
103 29734785104355787117129
104 55124532517920818958921
105 55124532517920818958921
106 118939140430257623503433
107 220498130084517750870601
108 220498130084517750870601
109 475756561733864969048649
110 881992520365763354792521
111 881992520365763354792521
112 1903026246986798196986441
113 3527970081514391739961929
114 3527970081514391739961929
115 7612104987998531108737609
116 14111880326168337145401929
117 14111880326168337145401929
118 30448419952199478498431561
119 56447521304878702645088841
120 56447521304878702645088841
121 121793679809003268057207369
122 225790085219957892102885961
123 225790085219957892102885961
124 487174719236834490168119881
125 903160340880652986350834249
126 903160340880652986350834249
127 1948698876948159378611769929
128 3612641363524384274620912201
129 3612641363524384274620912201
130 7794795507795923189331694153
131 14450565454100822773368263241
132 14450565454100822773368263241
133 31179182031186978432211391049
134 57802261816410380413470806601
135 57802261816410380413470806601
136 124716728124761056435137057353
137 231209047265654664360174719561
138 231209047265654664360174719561
139 498866912499057368446839722569
140 924836189062647014733211275849
141 924836189062647014733211275849
142 1995467649996282044625046245961
143 3699344756250640629770532459081
144 3699344756250640629770532459081
145 7981870599985180749337872339529
146 14797379025002675948264700809801
147 14797379025002675948264700809801
148 31927482399940933280729262494281
149 59189516100010914076436576375369
150 59189516100010914076436576375369
151 127709929599763943406294823113289
152 236758064400044110022526700261961
153 236758064400044110022526700261961
154 510839718399056614758740495864393
155 947032257600177281223668004459081
156 947032257600177281223668004459081
157 2043358873596227300168523186868809
158 3788129030400710939761843707744841
159 3788129030400710939761843707744841
160 8173435494384912565208445703590473
161 15152516121602847123581727787094601
162 15152516121602847123581727787094601
163 32693741977539653625368135770477129
164 60610064486411395753795798399095369
165 60610064486411395753795798399095369
166 130774967910158627959610155397452361
167 242440257945645596473320805911925321
168 242440257945645596473320805911925321
169 523099871640634525296578233905353289
170 969761031782582414931158973141652041
171 969761031782582414931158973141652041
172 2092399486562538155018863817501086281
173 3879044127130329713557186774446281289
174 3879044127130329713557186774446281289
175 8369597946250152673908006151884018249
176 15516176508521318970380250897829106249
177 15516176508521318970380250897829106249
178 33478391785000610910962228937122943561
179 62064706034085276096851207920903295561
180 62064706034085276096851207920903295561
181 133913567140002443859179120078078644809
182 248258824136341104852010847685857284681
183 248258824136341104852010847685857284681
184 535654268560009776298037299361325027913
185 993035296545364420269364209792439587401
186 993035296545364420269364209792439587401
187 2142617074240039106053470016494310560329
188 3972141186181457682935880906387200447049
189 3972141186181457682935880906387200447049
190 8570468296960156427659163345381749723721
191 15888564744725830735188806904953309270601
192 15888564744725830735188806904953309270601
193 34281873187840625714081936660931506377289
194 63554258978903322948188923891891471159881
195 63554258978903322948188923891891471159881
196 137127492751362502870108879768266900279881
197 254217035915613291806536828692106759410249
198 254217035915613291806536828692106759410249
199 548509971005450011494216652197608475890249
200 1016868143662453167255882099869574254596681
Here is a simple and rigorous analysis that shows why T(n) ~ 4^{n/3} is a tight estimate.
We have the recurrence
T(n) = 4T(n-3) + 4T(n/2)
To get the tight result, we want to see that T(n/2) is negligible compared to T(n-3). We can do this as follows.
First, T is nonnegative for all n, so in particular T(n/2) >= 0, so for all n we have an inequality,
T(n) >= 4T(n-3)
Now, we want to use that inequality to compare T(n-3) and T(n/2).
By applying that inequality n/6 - 1 times, we get that
T(n-3) >= 4^{n/6 - 1} * T(n/2)
(Because, (n/6 - 1) * 3 = n/2 - 3, and n/2 - 3 + n/2 = n - 3).
It implies that T(n/2) is small compared to T(n-3):
T(n/2) <= 4^{-n/6 + 1} * T(n-3)
Now, for any epsilon > 0, there is an n_0 such that for n > n_0, 4^{-n/6 + 1} < epsilon. (Because, the limit of 4^{-n/6 + 1} is zero as n gets large.)
This implies that for any epsilon > 0, there is large enough n so that
4T(n-3) <= T(n) <= (4 + epsilon) T(n-3)
This yields the tight bound T(n) = 4^(n/3 + o(n)).
Getting a sharper estimate
There's some question in the comments about getting rid of the o(n) above, to get an even sharper estimate.
I fear this is basically just going to get pedantic -- usually no one cares about the low-order terms, and nailing them down exactly is just some calculus work. But we can do a little more today anyway.
What's the difference
First of all, what is the difference between O(4^{n/3}) and 4^{n/3 + o(n)}? (Alternatively, we could write the latter as (4+o(1))^{n/3}.)
The difference is in how tightly they control the low-order terms. O(4^{n/3}) controls them very tightly -- it says you don't exceed the (concrete) value 4^{n/3} by more than a constant factor.
4^{n/3 + o(n)} allows that you may exceed 4^{n/3} by more than a constant factor, but that factor is subexponential in n; it is negligible compared to 4^{n/3}.
For example, consider the function f(n) = n * 4^{n/3}. This function is not O(4^{n/3}). Indeed, it exceeds it by a factor n, more than a constant factor.
However, f(n) is in the class 4^{n/3 + o(n)}. Why? Because n = O(4^{epsilon n}) for every epsilon > 0.
When you have an inequality like,
4T(n-3) <= T(n) <= (4 + epsilon) T(n-3)
for every epsilon > 0, you can only deduce from this T(n) = (4 + o(1))^{n/3}.
To get a sharper bound, we need to treat epsilon as a function of n and not as a constant (like I did in the lazier version.)
Proof
Let epsilon(n) = 4^{-n/6 + 1} in what follows. Then we already showed
T(n) <= (4 + epsilon(n)) T(n-3)
and we want to see T = O(4^{n/3}).
This can be expanded as an iterated product:
T(n) <= PI_{i=1}^{n/3} (4 + epsilon(3i))
We can factor each term and pull out a factor of 4 to get
T(n) <= 4^{n/3} * PI_{i=1}^{n/3} (1 + epsilon(3i)/4)
The goal is now to show that
PI_{i=1}^{n/3} (1 + epsilon(3i)/4 ) = O(1)
and then we will be finished.
To do this we take the log, and show that that is O(1).
SUM_{i=1}^{n/3} log(1 + epsilon(3i)/4)
We bound that using log(1+x) <= x for x >= 0.
SUM_{i=1}^{n/3} epsilon(3i)/4
Now we use the definition of epsilon. In fact we only need to know epsilon(n) <= C^{-n} for some C > 1. The above becomes
SUM_{i=1}^{n/3} C'^{-i}
for some constant C' > 1. But this is a geometric series, so it is bounded above by the infinite geometric series as
1 / (1 - 1/C') = O(1)
Thus T(n) = O(4^{n/3}).
Since we already had T(n) = Omega(4^{n/3}), we now have it tight up to constants: T(n) = Θ(4^{n/3}).
You can decide for yourself whether this extra work made things any clearer :p Personally, I usually prefer to leave the o(n)'s in.
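As a numeric sanity check of T(n) = Θ(4^{n/3}) (my own sketch, reusing the cost recurrence from the experiments above, with floor division for n/2):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def cost(n):
    """Total number of calls made by foo(n), counting each call as 1."""
    if n < 1:
        return 1
    return 1 + 4 * (cost(n - 3) + cost(n // 2))

# T(n) / 4^(n/3) should stay between constant bounds on multiples of 3
ratios = [cost(n) / 4 ** (n / 3) for n in range(24, 120, 3)]
print(min(ratios), max(ratios))  # stays roughly within [46, 47] on multiples of 3
```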
IMO, the time complexity is Θ(r^n), where r = ³√4.
Indeed, plugging this expression into the recurrence relation,
r^n = 1 + 4·r^n/r³ + 4·r^(n/2) = 1 + r^n + 4·√(r^n),
where the second term dominates asymptotically.
Here is a plot of the exact number of total calls to foo, divided by r^n for easy reading (we assumed the floor [n/2] in foo(n/2)).
The ratios tend to the repeating sequence 46.6922952502, 63.4656065932, 74.1193985991. This seems to confirm Θ(r^n).
Update:
By induction we can show that, for n >= 21,
T(n) < B(n) = 75·(s^(2n) - 4·s^n),
with s = ³√2.
Indeed, by the recurrence equation and the induction hypothesis,
T(n+3) = 1 + 4·T(n) + 4·T([(n+3)/2])
       < 1 + 4·75·(s^(2n) - 4·s^n) + 4·75·(s^(2·[(n+3)/2]) - 4·s^[(n+3)/2])
We compare this to the bound B(n+3) = 75·(s^(2n+6) - 4·s^(n+3)) to establish
1 + 4·75·(s^(2n) - 4·s^n) + 4·75·(s^(2·[(n+3)/2]) - 4·s^[(n+3)/2])
  < 75·(s^(2n+6) - 4·s^(n+3))
The terms 4·75·s^(2n) = 75·s^(2n+6) cancel (since s^6 = 4), and dividing by 300·s^n gives
s^(-n)/300 - 4 + s^(2·[(n+3)/2]-n) - 4·s^([(n+3)/2]-n) < - s^3 = -2
Since 2·[(n+3)/2] - n <= 3, we have s^(2·[(n+3)/2]-n) <= s^3 = 2, so it suffices that
s^(-n)/300 < 4·s^([(n+3)/2]-n).
This inequality holds for any n, so that T(n) < B(n) => T(n+3) < B(n+3).
Now for the base case, we use the table of T(n) given by @CommuSoft (and checked independently) and verify numerically:
T(21) = 744009 < 75·(s^42 - 4·s^21) = 1190400
T(22) = 1622601 < 75·(s^44 - 4·s^22) = 1902217.444...
T(23) = 3015241 < 75·(s^46 - 4·s^23) = 3035425.772...
...
T(41) = 12530692681 < 75·(s^82 - 4·s^41) = 12678879361
This shows that the induction step can be applied from n=39 onwards ([(39+3)/2]=21).
Then
T(n) = O(75·(s^(2n) - 4·s^n)) = O(r^n).
(Actually, for all n >= 23, 46·r^n < T(n) < 75·r^n, and this is very tight; T(n) = Θ(r^n).)

Where is my mistake in this answer to Project Euler #58?

I am solving Project Euler question 58. Here a square is created by starting with 1 and spiralling anticlockwise in the following way (here the side length equals 7):
37 36 35 34 33 32 31
38 17 16 15 14 13 30
39 18 5 4 3 12 29
40 19 6 1 2 11 28
41 20 7 8 9 10 27
42 21 22 23 24 25 26
43 44 45 46 47 48 49
The question is: as we keep spiralling around the square, when does the ratio of primes on the diagonals to the total amount of numbers on the diagonals first become smaller than 0.10?
I am convinced I have the solution with the code below (see the code comments for clarification), but the site states that the answer is wrong when I enter it.
require 'prime'
# We use a mathematical derivation of the corner values, keep increasing the value till we find a ratio smaller
# than 0.10, and increase the grid size and amount of numbers on the diagonals each iteration
side_length = 3   # start with a grid size of 3x3 so that we do not get into trouble with the 1x1 grid
prime_count = 3   # 3, 5, 7 are prime and on a diagonal in a 3x3 grid
diagonal_size = 5
prime_ratio = 1   # dummy value bigger than 0.10 so we can start the loop
while prime_ratio >= 0.10
  # Add one to the prime count for each corner that is prime
  # Corners are given by n^2 (top left), n^2-n+1, n^2-2n+2, and n^2-3n+3
  prime_count += 1 if (side_length**2).prime?
  prime_count += 1 if (side_length**2 - side_length + 1).prime?
  prime_count += 1 if (side_length**2 - 2*side_length + 2).prime?
  prime_count += 1 if (side_length**2 - 3*side_length + 3).prime?
  # Divide the number of primes counted by the diagonal length to get the prime ratio
  prime_ratio = prime_count / diagonal_size.to_f
  # Increase the side length by two (a full spiral) and the diagonal size by four
  side_length += 2 and diagonal_size += 4
end
puts side_length - 2 # -2 to account for the last addition in the while loop
# => 26612
It probably is wrong and the site is right. I have been stuck on this problem for quite some time now. Can anyone point out my mistake?
side_length += 2 and diagonal_size += 4 should be at the beginning of the loop; as written, the first iteration tests the corners of the 3x3 grid again, even though prime_count = 3 already accounts for them.
I couldn't check this since I do not have Ruby installed, but I can reproduce the same problem in my Python solution.
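Since the answer mentions a Python solution, here is a hedged Python sketch with the increment moved to the top of the loop, as suggested. The helper names are mine, and plain trial division is used to keep it self-contained (slow, but workable, for the real 0.10 threshold):

```python
def is_prime(n):
    """Trial division primality test (6k +/- 1 wheel)."""
    if n < 2:
        return False
    if n % 2 == 0 or n % 3 == 0:
        return n in (2, 3)
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

def spiral_side(threshold):
    side = 3       # 3x3 grid
    primes = 3     # 3, 5 and 7 lie on its diagonals
    diagonal = 5
    while primes / diagonal >= threshold:
        side += 2            # grow the spiral FIRST, then test the new corners
        diagonal += 4
        for k in (1, 2, 3):  # side**2 itself is a perfect square, never prime
            if is_prime(side * side - k * (side - 1)):
                primes += 1
    return side

print(spiral_side(0.5))  # 11; with threshold 0.10 the same loop yields the puzzle's answer (slowly)
```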

Prime factorization of a factorial

I need to write a program to input a number and output its factorial's prime factorization in the form:
4!=(2^3)*(3^1)
5!=(2^3)*(3^1)*(5^1)
The problem is I still can't figure out how to get that result.
Apparently each first number in brackets is one of the ascending prime numbers up to the actual factorial, and the second number in brackets is the number of times the number occurs in the factorial.
What I can't figure out is, for example in 5!=(2^3)*(3^1)*(5^1), how 2 only occurs 3 times, 3 only 1 time, and 5 only 1 time in 120 (5! = 120).
I have now solved this thanks to the helpful people who commented, but I'm now having trouble figuring out how I could take a number and produce the factorial in this format without actually calculating the factorial.
Every number can be represented by a unique (up to re-ordering) multiplication of prime numbers, called the prime factorization of the number, since you are finding the prime factors that uniquely create that number.
2^3=8
3^1=3
5^1=5
and 8*3*5=120
But this also means that: (2^3)*(3^1)*(5^1) = 120
It's not saying that 2 occurs 3 times as a digit in the number 120, which it obviously does not, but rather to multiply 2 by 2 by 2, for a total of 3 twos. Likewise for the 3 and 5, which occur once in the prime factorization of 120. The expression which you mention is showing you this unique prime factorization of the number 120. This is one way of getting the prime factorization of a number in Python:
def pf(number):
    factors = []
    d = 2
    while number > 1:
        while number % d == 0:
            factors.append(d)
            number = number // d  # integer division
        d += 1
    return factors
Running it you get:
>>> pf(120)
[2, 2, 2, 3, 5]
Which multiplied together give you 120, as explained above. Here's a little diagram to illustrate this more clearly:
Consider e.g. 33!. It's a product of:
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
the factors are:
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2
2 2 2 2
2 2
2
3 3 3 3 3 3 3 3 3 3 3
3 3 3
3
5 5 5 5 5 5
5
7 7 7 7
11 11 11
13 13
17
19
23
29 31
Do you see the pattern?
33! = 2^( 33 div 2 + 33 div 4 + 33 div 8 + 33 div 16 + 33 div 32) *
3^( 33 div 3 + 33 div 9 + 33 div 27) *
5^( 33 div 5 + 33 div 25) *
----
7^( 33 div 7) * 11^( 33 div 11) * 13^( 33 div 13) *
----
17 * 19 * 23 * 29 * 31
Thus, to find the prime factorization of n! without doing any multiplications or factorizations, we just need the ordered list of primes not greater than n, which we process (with repeated integer division and a possible summation) in three stages: primes not greater than the square root of n; those not greater than n/2; and the rest.
Actually, with lazy evaluation it's even simpler than that. Assuming primes is already implemented and returns a stream of prime numbers in order, in Haskell the factorial factorization is found as
ff n = [(p, sum . takeWhile (> 0) . tail . iterate (`div` p) $ n)
       | p <- takeWhile (<= n) primes]
-- Prelude> ff 33
-- [(2,31),(3,15),(5,7),(7,4),(11,3),(13,2),(17,1),(19,1),(23,1),(29,1),(31,1)]
because 33 div 4 is (33 div 2) div 2, etc..
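The same computation can be sketched in Python (my own port of the idea, not from the answer), using Legendre's formula so that no factorial or multiplication is ever performed:

```python
def factorial_factorization(n):
    """Prime factorization of n! via Legendre's formula: the exponent of p is
    floor(n/p) + floor(n/p^2) + ... (assumes n >= 2)."""
    # simple Eratosthenes sieve for the primes up to n
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    result = []
    for p in range(2, n + 1):
        if is_prime[p]:
            exponent, q = 0, n
            while q:
                q //= p          # floor(n/p), floor(n/p^2), ...
                exponent += q
            result.append((p, exponent))
    return result

print(factorial_factorization(5))  # [(2, 3), (3, 1), (5, 1)], i.e. 5! = (2^3)*(3^1)*(5^1)
```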
2^3 is another way of writing 2³, or two to the third power. (2^3)(3^1)(5^1) = 2³ × 3 × 5 = 120.
(2^3)(3^1)(5^1) is just the prime factorization of 120 expressed in plain ASCII text rather than with pretty mathematical formatting. Your assignment requires output in this form simply because it's easier for you to output than it would be for you to figure out how to output formatted equations (and probably because it's easier to process for grading).
The conventions used here for expressing equations in plain text are standard enough that you can directly type this text into google.com or wolframalpha.com and it will calculate the result as 120 for you: (2^3)(3^1)(5^1) on wolframalpha.com / (2^3)(3^1)(5^1) on google.com
WolframAlpha can also compute prime factorizations, which you can use to get test results to compare your program with. For example: prime factorization of 1000!
A naïve solution that actually calculates the factorial will only handle numbers up to 12 (if using 32 bit ints). This is because 13! is ~6.2 billion, larger than the largest number that can be represented in a 32 bit int.
However it's possible to handle much larger inputs if you avoid calculating the factorial first. I'm not going to tell you exactly how to do that because either figuring it out is part of your assignment or you can ask your prof/TAs. But below are some hints.
a^b × a^c = a^(b+c)
equation (a): 10 = 2^1 × 5^1
equation (b): 15 = 3^1 × 5^1
10 × 15 = ? Answer using the right-hand sides of equations (a) and (b), not with the number 150.
10 × 15 = (2^1 × 5^1) × (3^1 × 5^1) = 2^1 × 3^1 × (5^1 × 5^1) = 2^1 × 3^1 × 5^2
As you can see, computing the prime factorization of 10 × 15 can be done without multiplying 10 by 15; You can instead compute the prime factorization of the individual terms and then combine those factorizations.
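The combining step in the hint can be sketched with a Counter (my own illustration, not the assignment's required approach): adding counters adds exponents for like bases.

```python
from collections import Counter

def combine(f1, f2):
    """Multiply two numbers given only their prime factorizations,
    by adding the exponents of like bases."""
    return dict(Counter(f1) + Counter(f2))

ten = {2: 1, 5: 1}      # 10 = 2^1 * 5^1
fifteen = {3: 1, 5: 1}  # 15 = 3^1 * 5^1
print(combine(ten, fifteen))  # equals {2: 1, 3: 1, 5: 2} -> 150 = 2^1 * 3^1 * 5^2
```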
If you write out the factorial 5!:
1 * 2 * 3 * 4 * 5,
you will notice that there is one non-prime number: 4. 4 can be written as 2 * 2 or 2^2, which is where the extra 2s come from.
Add up all of the occurrences (exponential forms are in parentheses; add exponents for like bases):
2 (2^1) * 3 (3^1) * 4 (2^2) * 5 (5^1), and you get the proper answer.
You can use an O(n/2 · log log n) algorithm using only sums (no need to precalculate primes).
This is a sieve using the relation
f = a * b ~> f^k = a^k * b^k
We then reduce all the initial factors 1 * 2 * 3 * ... * n, moving exponents from big numbers to small numbers.
Using a Sieve of Atkin, the Will Ness algorithm could be better for very big n; if not, I think this one could be better:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int n = atoi(argv[1]);
    int *p = (int *) malloc(sizeof(int) * (n + 1));
    int i, j, d;
    for (i = 0; i <= n; i++)
        p[i] = 1;
    for (i = n >> 1; i > 1; i--)
        if (p[i]) {
            for (j = i + i, d = 2; j <= n; j += i, d++) {
                if (p[j]) {
                    p[i] += p[j];
                    p[d] += p[j];
                    p[j] = 0;
                }
            }
        }
    printf("1");
    for (i = 2; i <= n; i++)
        if (p[i])
            printf(" * %i^%i", i, p[i]);
    printf("\n");
    return 0;
}
