Multiplying 64-bit number by a 32-bit number in 8086 asm - algorithm

I have been struggling even to start on a solution to this problem. I have considered the multiplication-as-repeated-addition algorithm, but whatever algorithm I consider, I keep getting stuck on a single issue: the largest register in the 8086 is 16 bits.
data segment
num1 dw 0102h,0304h,0506h,0708h
num2 dw 0102h,0304h
res dw ?,?,?,?,?,?
data ends
code segment
assume CS:CODE, DS:DATA
start:
mov ax,DATA
mov DS,ax
..........fill code..........
At this point I am stuck. Even a slight hint of code or just the algorithm would be appreciated.

For multiplying a 64-bit number by a 32-bit number with a 16-bit CPU:
Step 1: Assume the numbers are in "base 65536". For example, a 64-bit number has 16 digits in hex (e.g. 0x0123456789ABCDEF) and would have 4 digits in "base 65536" (e.g. {0123}{4567}{89AB}{CDEF}); and in the same way a 32-bit number would have 2 digits in "base 65536".
Step 2: To multiply a pair of numbers, multiply each digit from the first number with each digit from the second number, and add this into the right place in the result. The right place depends on the position of each digit in the original numbers. For example (in decimal) for 90 * 30 you'd do "9*3 = 27" and put it in the "hundreds" place (e.g. 2700) because there was one digit to the right in the first number plus another digit to the right in the second number, and that means it needs to go where there's 2 digits to the right in the result.
Example:
0x0123456789ABCDEF * 0x87654321
= {0123}{4567}{89AB}{CDEF} * {8765}{4321}
              {0123}{4567}{89AB}{CDEF}
*                         {8765}{4321}
------------------------------------------
=                         {3600}{18CF}  (from 4321 * CDEF)
+                   {6CEA}{484B}{0000}  (from 8765 * CDEF)
+                   {2419}{800B}{0000}  (from 4321 * 89AB)
+             {48CF}{7D77}{0000}{0000}  (from 8765 * 89AB)
+             {1232}{E747}{0000}{0000}  (from 4321 * 4567)
+       {24B4}{B2A3}{0000}{0000}{0000}  (from 8765 * 4567)
+       {004C}{4E83}{0000}{0000}{0000}  (from 4321 * 0123)
+ {0099}{E7CF}{0000}{0000}{0000}{0000}  (from 8765 * 0123)
------------------------------------------
= {009A}{0CD0}{5C28}{F5C1}{FE56}{18CF}
------------------------------------------
= 0x009A0CD05C28F5C1FE5618CF
Note that 8086 has a "16 bit multiplied by 16 bit = 32-bit result" instruction (MUL); and addition can be done 16 bits at a time by using one ADD instruction followed by however many ADC instructions you need.
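As a rough illustration of the two steps above, here is a minimal Python sketch (a model of the arithmetic, not 8086 code). The helper names to_limbs, from_limbs and mul_limbs are made up for this example, and storing the least significant "base 65536" digit first is just a choice made here. Each inner step mimics one MUL followed by an ADD, an ADC, and further carry propagation.

```python
MASK16 = 0xFFFF

def to_limbs(value, count):
    """Split value into `count` base-65536 digits, least significant first."""
    return [(value >> (16 * i)) & MASK16 for i in range(count)]

def from_limbs(limbs):
    """Reassemble an integer from base-65536 digits (least significant first)."""
    return sum(d << (16 * i) for i, d in enumerate(limbs))

def mul_limbs(a, b):
    """Schoolbook multiplication using only 16x16 -> 32-bit products."""
    res = [0] * (len(a) + len(b))
    for i, da in enumerate(a):
        for j, db in enumerate(b):
            prod = da * db                    # like MUL: result in DX:AX
            lo, hi = prod & MASK16, prod >> 16
            k = i + j                         # the "right place" in the result
            total = res[k] + lo               # like ADD
            res[k] = total & MASK16
            carry = total >> 16
            total = res[k + 1] + hi + carry   # like ADC
            res[k + 1] = total & MASK16
            carry = total >> 16
            k += 2
            while carry:                      # like ADC with 0 until no carry
                total = res[k] + carry
                res[k] = total & MASK16
                carry = total >> 16
                k += 1
    return res

a = to_limbs(0x0123456789ABCDEF, 4)
b = to_limbs(0x87654321, 2)
print(hex(from_limbs(mul_limbs(a, b))))  # 0x9a0cd05c28f5c1fe5618cf
```

The result has len(a) + len(b) = 6 digits, which matches the six words reserved for res in the question.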
Also note that you can avoid some additions by merging. For example:
              {0123}{4567}{89AB}{CDEF}
*                         {8765}{4321}
------------------------------------------
=                         {3600}{18CF}  (from 4321 * CDEF)
+                   {6CEA}{484B}{0000}  (from 8765 * CDEF)
+                   {2419}{800B}{0000}  (from 4321 * 89AB)
+             {48CF}{7D77}{0000}{0000}  (from 8765 * 89AB)
+             {1232}{E747}{0000}{0000}  (from 4321 * 4567)
+       {24B4}{B2A3}{0000}{0000}{0000}  (from 8765 * 4567)
+       {004C}{4E83}{0000}{0000}{0000}  (from 4321 * 0123)
+ {0099}{E7CF}{0000}{0000}{0000}{0000}  (from 8765 * 0123)
------------------------------------------
=                         {3600}{18CF}  (from 4321 * CDEF)
+             {48CF}{7D77}{0000}{0000}  (from 8765 * 89AB)
+ {0099}{E7CF}{0000}{0000}{0000}{0000}  (from 8765 * 0123)
+                   {6CEA}{484B}{0000}  (from 8765 * CDEF)
+       {24B4}{B2A3}{0000}{0000}{0000}  (from 8765 * 4567)
+                   {2419}{800B}{0000}  (from 4321 * 89AB)
+       {004C}{4E83}{0000}{0000}{0000}  (from 4321 * 0123)
+             {1232}{E747}{0000}{0000}  (from 4321 * 4567)
------------------------------------------
= {0099}{E7CF}{48CF}{7D77}{3600}{18CF}  (THESE WERE MERGED WITHOUT ADDITION)
+       {24B4}{B2A3}{6CEA}{484B}{0000}  (THESE WERE MERGED WITHOUT ADDITION)
+       {004C}{4E83}{2419}{800B}{0000}  (THESE WERE MERGED WITHOUT ADDITION)
+             {1232}{E747}{0000}{0000}  (from 4321 * 4567)
------------------------------------------
= {009A}{0CD0}{5C28}{F5C1}{FE56}{18CF}
------------------------------------------
= 0x009A0CD05C28F5C1FE5618CF
Of course, because it doesn't matter in which order you multiply the pairs of ("base 65536") digits, you can do all the multiplications in the optimum order for merging.
For the final code (with merging), you'd end up with 8 MUL instructions, 3 ADD instructions and about 7 ADC instructions. I'm too lazy to write the code. ;)
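The merging idea can be checked with a small Python sketch. It models the arithmetic only, not the 8086 instruction sequence, and the function name mul_64x32_merged is hypothetical. Because each 16 x 16 product fits in 32 bits, products whose digit positions are two apart never overlap, so they can be placed side by side (modelled here with a bitwise OR) and only the final three row additions need carries.

```python
def mul_64x32_merged(a, b):
    """Multiply a 64-bit a by a 32-bit b from eight 16x16 products."""
    a0, a1, a2, a3 = [(a >> (16 * i)) & 0xFFFF for i in range(4)]
    b0, b1 = b & 0xFFFF, (b >> 16) & 0xFFFF

    # Each row holds non-overlapping 32-bit products, merged without addition.
    row1 = (a0 * b0) | ((a1 * b1) << 32) | ((a3 * b1) << 64)
    row2 = ((a0 * b1) << 16) | ((a2 * b1) << 48)
    row3 = ((a1 * b0) << 16) | ((a3 * b0) << 48)
    row4 = (a2 * b0) << 32   # the one product left over

    # Three multi-word additions (one ADD plus ADCs each on the 8086).
    return row1 + row2 + row3 + row4

print(hex(mul_64x32_merged(0x0123456789ABCDEF, 0x87654321)))
```

The four rows correspond, top to bottom, to the four lines of the merged sum in the example above.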

Finding the formula for an alphanumeric code

A script I am making scans a 5-character code and assigns it a number based on the contents of characters within the code. The code is a randomly-generated number/letter combination. For example 7D3B5 or HH42B where any position can be any one of (26 + 10) characters.
Now, the issue I am having is I would like to figure out the number from 1-(36^5) based on the code. For example:
00000 = 0
00001 = 1
00002 = 2
0000A = 10
0000B = 11
0000Z = 36
00010 = 37
00011 = 38
So on and so forth until the final possible code which is:
ZZZZZ = 60466176 (36^5)
What I need to work out is a formula to figure out, let's say, G47DU in its number form, following the examples above.
Something like this?
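// Map one character to its digit value: '0'-'9' -> 0-9, 'A'-'Z' -> 10-35.
function getCount(s){
    if (!isNaN(s))
        return Number(s);
    return s.charCodeAt(0) - 55; // 'A' has char code 65, so 'A' -> 10
}
// Sum digit * 36^position over all characters, most significant first.
function f(str){
    let result = 0;
    for (let i=0; i<str.length; i++)
        result += Math.pow(36, str.length - i - 1) * getCount(str[i]);
    return result;
}
const strs = [
    '00000',
    '00001',
    '00002',
    '0000A',
    '0000B',
    '0000Z',
    '00010',
    '00011',
    'ZZZZZ'
];
for (const str of strs)
    console.log(str, f(str));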
You are trying to create a base-36 number system. Since there are 5 'digits', each ranging from 0 to Z, the value can go from 0 to 36^5 - 1. (For comparison, in the hexadecimal system each 'digit' goes from 0 to F.) To convert this to decimal, you can use the same method used to convert from hex, binary, etc. to the decimal system.
It will be something like d4 * (36 ^ 4) + d3 * (36 ^ 3) + d2 * (36 ^ 2) + d1 * (36 ^ 1) + d0 * (36 ^ 0)
Note: Here 36 is the total number of symbols.
d0, d1, d2, d3, d4 can range from 0 to 35 in decimal (Important: Not 0 to 36).
Also, you can extend this to any number of digits or symbols, and you can implement operations like addition and subtraction directly in this system as well. (It would be fun to implement that. :) ) It will be easier, though, to convert to decimal, do the operations, and convert back.
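The positional formula above can be sketched in Python as follows. The names DIGITS and code_to_number are made up for this example, and upper-casing the input is an assumption about how codes should be normalised. The loop is the same sum of d_i * 36^i, evaluated left to right.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def code_to_number(code):
    """Interpret `code` as a base-36 number; raises ValueError on other characters."""
    value = 0
    for ch in code.upper():
        # Shift what we have one base-36 place left, then add the new digit.
        value = value * 36 + DIGITS.index(ch)
    return value

print(code_to_number("0000Z"))   # 35
print(code_to_number("00010"))   # 36
print(code_to_number("ZZZZZ"))   # 60466175, i.e. 36**5 - 1
print(code_to_number("G47DU"))   # 27070050
```

Python's built-in int(code, 36) performs the same conversion, so the function is mainly useful for seeing how the formula works.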

Speed up multiple cross product and dot product operations in numpy python

I have a loop in one part of my code. I replaced all the vectors with example values to keep it simple, as shown in the sample below. I have to run this loop 230000 times inside another loop, and this part takes about 26.36.
Is there any way to speed this up or tune it for better performance?
import time
import numpy as np

trr = time.time()
for i in range(230000):
    print(+1 * 0.0001 * 1 * 1000 * (
        1 * np.dot(np.subtract([2, 1], [4, 3]), [1, 2])
        + 1 * np.dot(np.cross(np.array([0, 0, 0.5]), np.array([1, 2, 3])),
                     np.array([1, 0, 0]))
        - 1 * np.dot(np.cross(np.array([0, 0, -0.5]), np.array([2, 4, 1])),
                     np.array([0, 1, 0]))))
print(time.time() - trr)
The code with variables:
for i in range (23000):
    .......
    .....
    else:
        delta_fs = +1 * dt * 1 * ks_smoot * A_2d * (
            np.dot(np.subtract(grains[p].v, grains[w].v), vti) * sign +
            np.dot(np.cross(np.array([0, 0, grains[p].rotational_speed]),
                            np.array(np.array(xj_c) - np.array(xj_p))),
                   np.array([vti[0], vti[1], 0])) * sign
            - np.dot((np.cross(np.array([0, 0, grains[w].rotational_speed]),
                               np.array(np.array(xj_c) - np.array(xj_w)))),
                     np.array([vti[0], vti[1], 0])) * sign)
It would've been better if you kept your examples in variables, since your code is very difficult to read. Ignoring the fact that the loop in your example just computes the same constant value over and over again, I am working under the assumption that you need to run a specific set of numpy operations many times on various numpy arrays/vectors. You may find it useful to spend some time looking into the documentation for numba. Here's a very basic example:
import numpy as np
import numba as nb

CONST = 1*0.0001*1*1000
a0 = np.array([2.,1.])
a1 = np.array([4.,3.])
a2 = np.array([1.,2.])
b0 = np.array([0., 0., 0.5])
b1 = np.array([1.,2.,3.])
b2 = np.array([1.,0.,0.])
c0 = np.array([0.,0.,-0.5])
c1 = np.array([2.,4.,1.])
c2 = np.array([0.,1.,0.])

# JIT-compiled version
@nb.jit()
def op1(iters):
    for i in range(iters):
        op = CONST * (1 * np.dot(a0-a1,a2)
                      + 1 * np.dot(np.cross(b0,b1),b2)
                      - 1 * np.dot(np.cross(c0,c1),c2))

op1(1) # Initial compilation

# Plain NumPy version, for comparison
def op2(iters):
    for i in range(iters):
        op = CONST * (1 * np.dot(a0-a1,a2)
                      + 1 * np.dot(np.cross(b0,b1),b2)
                      - 1 * np.dot(np.cross(c0,c1),c2))

# %timeit is an IPython magic; run these lines in IPython or Jupyter
%timeit -n 100 op1(100)
# 54 µs ± 2.49 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit -n 100 op2(100)
# 15.5 ms ± 313 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
It looks like it will be multiple orders of magnitude faster, which should easily bring your time down to a fraction of a second.
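Separately from numba, part of the per-iteration cost is likely the construction of tiny arrays and the call overhead on 2- and 3-element vectors. A minimal sketch, assuming (as in the question's code) that every rotation vector has the form [0, 0, w] and that the vector it is dotted with has a zero z-component: in that case dot(cross([0, 0, w], r), [tx, ty, 0]) reduces to w * (r[0]*ty - r[1]*tx), which can be evaluated with plain floats. The function names rot_term and dot2 are hypothetical, and whether this is faster in practice would need measuring.

```python
def rot_term(w, r, t):
    """dot(cross([0, 0, w], r), [t[0], t[1], 0]) written out by hand."""
    return w * (r[0] * t[1] - r[1] * t[0])

def dot2(u, v):
    """Dot product of two 2-component vectors."""
    return u[0] * v[0] + u[1] * v[1]

# Values taken from the question's constant example.
CONST = 1 * 0.0001 * 1 * 1000
value = CONST * (
    dot2((2 - 4, 1 - 3), (1, 2))
    + rot_term(0.5, (1, 2, 3), (1, 0))
    - rot_term(-0.5, (2, 4, 1), (0, 1))
)
print(value)  # approximately -0.6
```

In the version with variables, the sign factor and the constant prefactor can also be multiplied in once, outside the bracket, instead of once per term.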

Program that sends a random number with SendKeys

I have to make a VBScript program that uses SendKeys to send a random 6-digit number. Any ideas?
I tried this
WscriptSchell.SendKeys "randomNumber = Int( (999999 - 100000 + 1) * Rnd + 100000
Create a file (for example test.vbs) with the following content and run it:
Randomize
WScript.CreateObject("WScript.Shell").SendKeys Int((999999 - 100000 + 1) * Rnd + 100000)
Info:
Randomize must be called before Rnd to get random results.
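For reference, the range formula used above can be written out in Python to see why it always yields six digits. This is a sketch only, since the question itself needs VBScript; the function name random_six_digit is made up, and random.random() stands in for Rnd (both return a value from 0 up to, but not including, 1).

```python
import random

def random_six_digit():
    """Same formula as the VBScript line: Int((high - low + 1) * Rnd + low)."""
    low, high = 100000, 999999
    # (high - low + 1) * r is in [0, 900000), so the sum is in [100000, 1000000).
    return int((high - low + 1) * random.random() + low)

print(random_six_digit())
```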

CRC Polynomial Division

I am trying to use polynomial division to find the CRC check bits, but I am struggling with the last stage of the calculation.
I believe the conversions below are correct:
Pattern = 1010
= x^3 + x
Dataword = 9 8 7
= 1001 1000 0111
= x^11 + x^8 + x^7 + x^2 + x + 1
And finally, the polynomial long division I attempted is:
x^8 + x^6 + x^5 + x^3 + x
_______________________________________
x^3 + x | x^11 + x^8 + x^7 + x^2 + x + 1
x^11 + x^9
....
x^4 + x^2 + x + 1
x^4 + x^2
= x + 1
My question is, is the remainder / answer x + 1 or do I take it a step further and remove the x leaving the remainder as just 1?
Thank you for your help!
We can check this with mod-2 division (XOR) too. The following code shows a Python implementation of CRC checking; it follows the steps listed below:
Convert the CRC / data polynomials to their binary equivalents.
If the CRC key (the binary representation obtained from the polynomial) has k bits, we need to pad the data with an additional k-1 bits to check for errors. In the example given, k=4, so the check bits are 3 bits wide (for instance 011, not 0011).
At the transmitter end:
First augment the binary data by adding k-1 zeros at the end.
Use modulo-2 binary division to divide the augmented data by the CRC key and store the remainder.
Append the remainder to the end of the data to form the encoded data, and send it.
At the receiver end (to check whether errors were introduced in transmission):
Perform modulo-2 division again on the received data with the CRC key; if the remainder is 0, there are no errors.
Now let's implement the above:
def CRC_polynomial_to_bin_code(pol):
    # Evaluate the polynomial at x = 2 to get its bit pattern
    return bin(eval(pol.replace('^', '**').replace('x','2')))[2:]

def get_remainder(data_bin, gen_bin):
    ng = len(gen_bin)
    data_bin += '0'*(ng-1)  # pad with k-1 zeros
    nd = len(data_bin)
    divisor = gen_bin
    i = 0
    remainder = ''
    print('\nmod 2 division steps:')
    print('divisor dividend remainder')
    while i < nd:
        # Bring down enough bits to make the dividend ng bits long
        j = i + ng - len(remainder)
        if j > nd:
            remainder += data_bin[i:]
            break
        dividend = remainder + data_bin[i:j]
        # XOR the dividend with the generator, bit by bit
        remainder = ''.join(['1' if dividend[k] != gen_bin[k] else '0' for k in range(ng)])
        print('{:8s} {:8s} {:8s}'.format(divisor, dividend, remainder[1:]))
        remainder = remainder.lstrip('0')
        i = j
    return remainder.zfill(ng-1)

gen_bin = CRC_polynomial_to_bin_code('x^3+x')
data_bin = CRC_polynomial_to_bin_code('x^11 + x^8 + x^7 + x^2 + x + 1')
print('transmitter end:\n\nCRC key: {}, data: {}'.format(gen_bin, data_bin))
r = get_remainder(data_bin, gen_bin)
data_crc = data_bin + r
print('\nencoded data: {}'.format(data_crc))
print('\nreceiver end:')
r = get_remainder(data_crc, gen_bin)
print('\nremainder {}'.format(r))
# int(r, 2) parses the bit string; eval(r) fails on strings such as '010'
if int(r, 2) == 0:
    print('data received at the receiver end has no errors')
# ---------------------------------
# transmitter end:
#
# CRC key: 1010, data: 100110000111
#
# mod 2 division steps:
# divisor dividend remainder
# 1010 1001 011
# 1010 1110 100
# 1010 1000 010
# 1010 1000 010
# 1010 1011 001
# 1010 1100 110
# 1010 1100 110
#
# encoded data: 100110000111110
# ---------------------------------
# receiver end:
#
# mod 2 division steps:
# divisor dividend remainder
# 1010 1001 011
# 1010 1110 100
# 1010 1000 010
# 1010 1000 010
# 1010 1011 001
# 1010 1111 101
# 1010 1010 000
#
# remainder 000
# data received at the receiver end has no errors
# ---------------------------------
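The same mod-2 division can also be sketched with integers and XOR instead of strings. The function name mod2_remainder is made up for this example; bit i of each integer stands for the coefficient of x^i. Division stops as soon as the remainder has a lower degree than the divisor, which is why, in the question's unpadded division, x + 1 is already the final remainder for the degree-3 divisor x^3 + x.

```python
def mod2_remainder(dividend, divisor):
    """Remainder of polynomial division over GF(2), with bits as coefficients."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        # Align the divisor with the leading 1 of the dividend and XOR it away.
        shift = dividend.bit_length() - dlen
        dividend ^= divisor << shift
    return dividend

gen = 0b1010                # x^3 + x
data = 0b100110000111       # x^11 + x^8 + x^7 + x^2 + x + 1

print(format(mod2_remainder(data, gen), '03b'))       # 011 (unpadded, x + 1)
crc = mod2_remainder(data << 3, gen)                  # pad with k-1 = 3 zeros
print(format(crc, '03b'))                             # 110
codeword = (data << 3) | crc
print(mod2_remainder(codeword, gen))                  # 0 means no error detected
```

The padded remainder 110 matches the check bits at the end of the encoded data shown in the output above.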

Avoiding while loops

This isn't strictly speaking a coding question as I'm in charge of a spreadsheet rather than code, but the same principles apply.
I'm trying to create a piece of my spreadsheet that is an "average predictor". As an example: say a batsman has an average of 24 from 40 innings (in other words, has scored 960 runs). If he consistently performs at an average of 40 runs per innings from here on in, how many innings will it take for him to raise his career average to 30?
It's pretty easy to work this example out by hand, and just as easy to solve the general problem with a while loop. However, as mentioned, I can't use loops. Any ideas or suggestions?
You don't need a loop for this purpose. You can solve it by using the following formula (moving average):
(current_avg * current_innings + avg * x)/(current_innings + x) = goal_avg
You have to solve the equation for x.
Your example calculated on Wolfram Alpha:
Input: (24 * 40 + 40 * x)/(40 + x) = 30 solve x
Result: x=24
As recommended by @StriplingWarrior in a comment on the question, write out the general equation, solve it algebraically, and use the resulting formula in your spreadsheet. The raw equation is given in the prior answer by @trylimits. I'm using slightly different identifiers to bring out the symmetry of the problem:
(old_avg * old_innings + new_avg * new_innings)/(old_innings + new_innings)
= goal_avg
old_avg * old_innings + new_avg * new_innings
= goal_avg * (old_innings + new_innings)
old_avg * old_innings + new_avg * new_innings
= goal_avg * old_innings + goal_avg * new_innings
new_avg * new_innings - goal_avg * new_innings
= goal_avg * old_innings - old_avg * old_innings
new_innings * (new_avg - goal_avg) = old_innings * (goal_avg - old_avg)
If new_avg - goal_avg is zero there is no solution unless goal_avg - old_avg is also zero, in which case no change is required and new_innings can be zero. If new_avg - goal_avg is non-zero:
new_innings = old_innings * (goal_avg - old_avg) / (new_avg - goal_avg)
The right hand side of this equation is a formula you could put in a spreadsheet.
The values from the example are:
old_avg = 24
old_innings = 40
new_avg = 40
goal_avg = 30
new_innings = 40 * (30 - 24) / (40 - 30)
= 240 / 10
= 24
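The closed-form result can be sketched as a small Python function, mainly to show the special cases next to the formula. The function name innings_needed is made up for this example, and raising an error when the goal is unreachable is just one possible choice.

```python
def innings_needed(old_avg, old_innings, new_avg, goal_avg):
    """new_innings = old_innings * (goal_avg - old_avg) / (new_avg - goal_avg)."""
    denominator = new_avg - goal_avg
    if denominator == 0:
        if goal_avg == old_avg:
            return 0.0   # already at the goal, no change required
        raise ValueError("the goal average can never be reached at this rate")
    return old_innings * (goal_avg - old_avg) / denominator

print(innings_needed(24, 40, 40, 30))  # 24.0
```

In a spreadsheet the same thing is a single cell formula built from the four input cells, with the zero-denominator case handled by an IF around it.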
