{Two's complement} Bit shifting - bit

I got confused about shifting after seeing two different results from shifting the same number. I know there are tons of questions about this, but I still couldn't find what I was looking for (feel free to post a link to a question or a website that could help).
So, first I saw the number 13 written in binary as 001101 (not a whole word of bits).
When it was shifted left by 2, they kept the leading bit (probably the sign bit) and the result was 0|10100 = 20. However, in another place I saw 13 represented as 01101, and there 01101<<2 was 0|0100 = 4. I know shifting left by one is the same as multiplying by the base, but this confused me. Should I represent 13 as 001101 or as 01101 before applying the shift?
I think we omit the overflow, considering the results.
Thank you!

This behaviour seems to correspond to integers of length 5 and 4 (in bits, not counting the sign bit). So it seems overflow is indeed the problem. If it isn't, could you add some context as to where these strange results occur?

001101, 01101 and also 1101 and 00001101 and other sizes have equal claim to "being" 13. You can't really say that 13 has one definitive size, rather it is the operation that has a size (which may be infinite, then a left shift never wraps).
So you have to decide what size of shift you're doing, independently of the value you're shifting. Common choices are 32 or 64 bits, but you're certainly not limited to that, although "strange" sizes take more effort to implement on typical machines and in typical programming languages.
The sign is never deliberately kept in left shifts, by the way; there is no useful way to do so. Forcefully keeping it means the wrapping happens in a really odd way, instead of the usual wrapping modulo a power of two (which has nice properties).
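To make the "the operation has a size" point concrete, here is a minimal Python sketch. The helper name shl and its width parameter are made up for illustration; Python integers are unbounded, so the width has to be imposed with a mask. The 4-bit and 5-bit widths are the ones suggested in the comment above.

```python
def shl(value: int, amount: int, width: int) -> int:
    """Left-shift value as an unsigned integer that is `width` bits wide."""
    mask = (1 << width) - 1          # keeps only the low `width` bits
    return (value << amount) & mask  # bits shifted past the top are dropped


# Same value, same shift amount, different operation sizes.
for width in (4, 5, 8, 32):
    result = shl(13, 2, width)
    print(f"{width:2d} bits: {result:0{width}b} = {result}")
```

With a 5-bit operation 13<<2 wraps to 20, with a 4-bit operation it wraps to 4, and from 6 bits upward nothing is lost and the result is the expected 52.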

Related

Bitmasking--when to use hex vs binary

I'm working on a problem out of Cracking The Coding Interview which requires that I swap the odd and even bits in an integer with as few instructions as possible (e.g. bits 0 and 1 are swapped, bits 2 and 3 are swapped, etc.).
The author's solution revolves around using a mask to grab the odd bits in one number and the even bits in another, and then shifting each by 1.
I get her solution, but I don't understand how she grabbed the even/odd bits. She creates two bit masks --both in hex -- for a 32 bit integer. The two are: 0xaaaaaaaa and 0x55555555. I understand she's essentially creating the equivalent of 1010101010... for a 32 bit integer in hexadecimal and then ANDing it with the original num to grab the even/odd bits respectively.
What I don't understand is why she used hex? Why not just code in 10101010101010101010101010101010? Did she use hex to reduce verbosity? And when should you use one over the other?
It's to reduce verbosity. Binary 10101010101010101010101010101010, hexadecimal 0xaaaaaaaa, and decimal 2863311530 all represent exactly the same value; they just use different bases to do so. The only reason to use one or another is for perceived readability.
Most people would clearly not want to use decimal here; it looks like an arbitrary value.
The binary is clear: alternating 1s and 0s, but with so many, it's not obvious that this is a 32-bit value, or that there isn't an adjacent pair of 1s or 0s hiding in the middle somewhere.
The hexadecimal version takes advantage of chunking. Assuming you recognize that 0x0a == 0b1010, you can mentally picture the 8 groups of 1010 in the assumed value.
Another possibility would be octal 25252525252, since... well, maybe not. You can see that something is alternating, but unless you use octal a lot, it's not clear what that alternating pattern in binary is.
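For reference, a small Python sketch of the mask-and-shift idea from the question. It is my own reconstruction, not the book's code, and the names ODD_MASK, EVEN_MASK and swap_adjacent_bits are invented here. It also checks that the binary, hex, decimal and octal spellings really are the same number.

```python
ODD_MASK = 0xAAAAAAAA   # binary 1010...1010: selects bits 1, 3, 5, ..., 31
EVEN_MASK = 0x55555555  # binary 0101...0101: selects bits 0, 2, 4, ..., 30

# The 32-digit binary pattern is built from a string so no digit can be miscounted.
BINARY_FORM = int("10" * 16, 2)


def swap_adjacent_bits(x: int) -> int:
    """Swap bit 0 with bit 1, bit 2 with bit 3, and so on, in a 32-bit value."""
    odd_bits_moved_down = (x & ODD_MASK) >> 1
    even_bits_moved_up = (x & EVEN_MASK) << 1
    return odd_bits_moved_down | even_bits_moved_up


# One value, four notations.
print(BINARY_FORM == 0xAAAAAAAA == 2863311530 == 0o25252525252)
print(bin(swap_adjacent_bits(0b1001)))
```

Whichever base the mask is written in, the interpreter or compiler sees the same integer, so the choice only affects the human reading the code.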

What do the digits in WELL generators represent?

In pseudo-random number generators like WELL512a, WELL1024, and WELL44497b, I understand what WELL (well equidistributed long-period linear) stands for, but I can't find any information on the suffix.
I'm writing a paper on RNGs and I'm not sure if this is relevant.
This is, I believe, log2(RNG period). Thus, WELL512a will have a period of 2^512, WELL1024 will have a period of 2^1024, etc.
Reference: http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wsc05rng.pdf, Table 1
This is an old question, and I'm sure that OP has moved on, but others may be interested in the answer. #SeverinPappadeux's answer is pretty much correct. The number n in the suffix is roughly the number of bits in the internal state. The period is 2^n - 1. The letters after the numbers indicate different variants of the PRNG with the corresponding period. The different letters don't have any meaning other than indicating different versions.
The Wikipedia page is very brief:
https://en.wikipedia.org/wiki/Well_equidistributed_long-period_linear
This is the official paper on the WELL generators:
http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf
The table on page 9 lists parameters for the various WELL generators. You have to study the paper to understand the parameters, but the upper Δ1 in the right-hand column is worth noticing. Zero is the best value for Δ1--it's the number of dimensions in which the random numbers are not equidistributed. So it's worth noticing, for example, that Δ1 is not zero for WELL19937a or WELL19937b, but it is zero for WELL19937c. Thus if you want a WELL generator and like the idea of a generator with period 2^19937 - 1, and you don't mind 624 words of state (624 * 32 = 19968), it's probably slightly better to use WELL19937c rather than the other two. (This is probably one reason why WELL19937c is currently the default generator for Apache Commons Math lib, release 3.6.1, btw.)
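The "roughly the number of bits" part can be checked with a little arithmetic. The sketch below assumes 32-bit state words and assumes the state is the smallest whole number of words that can hold n bits. That matches the 624 words quoted above for n = 19937, but it is my assumption for the other sizes. The name state_words is made up.

```python
WORD_BITS = 32  # assumption: the generators keep their state in 32-bit words


def state_words(n: int) -> int:
    """Smallest number of 32-bit words that can hold n bits of state."""
    return -(-n // WORD_BITS)  # ceiling division


for n in (512, 1024, 19937, 44497):
    words = state_words(n)
    spare = words * WORD_BITS - n
    print(f"n = {n}: {words} words, {spare} spare bits, period 2^{n} - 1")

# The period itself is an n-bit number made entirely of 1 bits.
period = (1 << 19937) - 1
print(period.bit_length())
```

So the suffix equals the state size in bits exactly when n is a multiple of 32, and falls a few bits short of it otherwise.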

How is data stored in a bit vector?

I'm a bit confused how a fixed size bit vector stores its data.
Let's assume that we have a bit vector bv that I want to store hello in as ASCII.
So we do bv[0]=104, bv[1]=101, bv[2]=108, bv[3]=108, bv[4]=111.
How is the ASCII of hello represented in the bit vector?
Is it as binary like this: [01101000][01100101][01101100][01101100][01101111]
or as ASCII like this: [104][101][108][108][111]
In the HAMPI paper, at section 3.5 step 2, the author assigns ASCII codes to a bit vector, but I'm confused about how a character is represented in the bit vector.
Firstly, you should probably read up on what a bit vector is, just to make sure we're on the same page.
Bit vectors don't represent ASCII characters, they represent bits. Trying to do bv[0]=104 on a bit vector will probably not compile / run, or, if it does, it's very unlikely to do what you expect.
The operations that you would expect to be supported are along the lines of: set the 5th bit to 1, set the 10th bit to 0, set all of these bits to this value, OR the bits of these two vectors, and probably some others.
How these are actually stored in memory is completely up to the programming language, and, on top of that, it may even be completely up to a given implementation of that language.
The general consensus (not a rule) is that each bit should take up roughly 1 bit in memory (maybe, on average, slightly more, since there could be overhead related to storing these).
As one example (how Java does it), you could have an array of 64-bit numbers and store 64 bits in each position. The translation to ASCII won't make sense in this case.
Another thing you should know - even ASCII gets stored as bits in memory, so those 2 arrays are essentially the same, unless you meant something else.
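As a rough illustration of the "array of 64-bit numbers" layout, here is a Python sketch. The BitVector class and from_ascii helper are hypothetical and only mimic that layout; they are not how Java or the HAMPI tool actually implement it. Storing each character most-significant-bit first is also an arbitrary choice made here.

```python
class BitVector:
    """Fixed-size bit vector packed into 64-bit words (illustrative sketch)."""

    WORD_BITS = 64

    def __init__(self, size: int):
        self.size = size
        # Round up so that every bit has a word to live in.
        self.words = [0] * ((size + self.WORD_BITS - 1) // self.WORD_BITS)

    def _locate(self, index: int):
        if not 0 <= index < self.size:
            raise IndexError(index)
        return divmod(index, self.WORD_BITS)  # (word number, bit offset)

    def set(self, index: int, bit: int) -> None:
        word, offset = self._locate(index)
        if bit:
            self.words[word] |= 1 << offset
        else:
            self.words[word] &= ~(1 << offset)

    def get(self, index: int) -> int:
        word, offset = self._locate(index)
        return (self.words[word] >> offset) & 1


def from_ascii(text: str) -> BitVector:
    """Store each character as 8 bits, most significant bit first."""
    data = text.encode("ascii")
    bv = BitVector(8 * len(data))
    for char_index, byte in enumerate(data):
        for bit_index in range(8):
            bv.set(8 * char_index + bit_index, (byte >> (7 - bit_index)) & 1)
    return bv


bv = from_ascii("hello")
bits = "".join(str(bv.get(i)) for i in range(bv.size))
print(bits)
```

The 40 bits of "hello" fit in a single 64-bit word here. Reading them back 8 at a time gives 104, 101, 108, 108, 111 again, which is the sense in which the two representations in the question are the same thing.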

Arbitrary precision arithmetic with Ruby

How the heck does Ruby do this? Does Jörg or anyone else know what's happening behind the scenes?
Unfortunately I don't know C very well, so bignum.c is of little help to me. I was just kind of curious if someone could explain (in plain English) the theory behind whatever miracle algorithm it's using.
irb(main):001:0> 999**999
36806348825922326789470084006052186583833823203735320465595962143702560930047223153010387361450517521869134525758989639113039318944796977164583238219236607653663113200177617597793217865870366077846576581183082787698201412402294867197567813172495806442794990281049897327103078771678146741952418004073439899695293083250893411694596612017673512082315195977953685229009037745250223699083945341679064045611647113975154675004860218929102864097057476260018595022613824453018748921161586402113531207791201884463078030746220525280773775767209432069237310103251745951849752401512016516672418981676639724782417539480202822816002710062399887366743579907305461890685546048835142661131063402348904429186051035230191242660848880746231212659020683041378266455426041126637886662665375576362779656908293178564560081623689116814177499326748817170217219107273106921688166829462567949269614897699986871567144087420642721205671737309963971116890119744041659022652419278284289641541461168818739123204832773896582026593409310817205487518824659176087713165789563358657661185727701178249794352294501124843043920129701511946873071236400763937391081195343030947683245323012399675023571078708664107031028872538959513893678471527415042649541619666983267998025343680786418716005458904566402715881795854937449051239905544881914848704936367461166460989003008854959199246636005004256627034833091179548764704594930128661465865007129969565224526608067298992179934250929163533082787426478958730697447232771870430635244592599615561915378391323721271601041029499987756974528735342290344338756274645252286042041668901973291379807377328153357091020520776715712817418487335705083075277790004194325673849906782148842105387086902273869881605981057922100256088299988476325216174756689383517855896114234930446650640237355631870717571086698303531312206832110245782411201496938722547625934287286636355038384072001083290669536055355664754529584996627998083056124296001365452951499511358490905081301519892828320218919461550140343555306014771313976632
3195743324848047347575473228198492343231496580885057330510949058490527738662697480293583612233134502078182014347192522391449087738579081585795613547198599661273567662441490401862839817822686573112998663038868314974259766039340894024308383451039874674061160538242392803580758232755749310843694194787991556647907091849600704712003371103926967137408125713631396699343733288014254084819379380555174777020843568689927348949484201042595271932630685747613835385434424807024615161848223715989797178155169951121052285149157137697718850449708843330475301440373094611119631361702936342263219382793996895988331701890693689862459020775599439506870005130750427949747071390095256759203426671803377068109744629909769176319526837824364926844730545524646494321826241925107158040561607706364484910978348669388142016838792902926158979355432483611517588605967745393958061959024834251565197963477521095821435651996730128376734574843289089682710350244222290017891280419782767803785277960834729869249991658417000499998999
Simple: it does it the same way you do, ever since first grade. Except it doesn't compute in base 10, it computes in base 4 billion (and change).
Think about it: with a single digit in our number system, we can only represent numbers from 0 to 9. So, how can we compute 6+7 without overflowing? Easy: we do actually overflow! We cannot represent the result of 6+7 as a number between 0 and 9, but we can overflow to the next place and represent it as two numbers between 0 and 9: 3×10^0 + 1×10^1. If you want to add two numbers, you add them digit-wise from the right and overflow ("carry") to the left. If you want to multiply two numbers, you have to multiply every digit of one number individually with the other number, then add up the intermediate results.
BigNum arithmetic (this is what this kind of arithmetic where the numbers are bigger than the native machine numbers is usually called) works basically the same way. Except that the base is not 10, and it's not 2, either – it's the size of a native machine integer. So, on a 32-bit machine, it would be base 2^32 or 4 294 967 296.
Specifically, in Ruby Integer is actually an abstract class that is never instantiated. Instead, it has two subclasses, Fixnum and Bignum, and numbers automagically migrate between them, depending on their size. In MRI and YARV, Fixnum can hold a 31 or 63 bit signed integer (one bit is used for tagging) depending on the native word size of the machine. In JRuby, a Fixnum can hold a full 64 bit signed integer, even on a 32-bit machine.
The simplest operation is adding two numbers. And if you look at the implementation of + or rather bigadd_core in YARV's bignum.c, it's not too bad to follow. I can't read C either, but you can clearly see how it loops over the individual digits.
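The grade-school procedure described above can be sketched in a few lines of Python. This is not a translation of bignum.c; the function names and the least-significant-digit-first list layout are my own choices. Python's built-in integers are already arbitrary precision, so they are used here only to convert to and from the digit lists. Every intermediate value inside add and multiply stays below 2^64, as it would on real hardware.

```python
BASE = 2 ** 32  # one "digit" per 32-bit machine word


def to_digits(n: int) -> list:
    """Split a non-negative integer into base-2^32 digits, least significant first."""
    digits = []
    while n:
        n, digit = divmod(n, BASE)
        digits.append(digit)
    return digits or [0]


def from_digits(digits: list) -> int:
    return sum(d * BASE ** i for i, d in enumerate(digits))


def add(a: list, b: list) -> list:
    """Add digit by digit from the least significant end, carrying upward."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        total = carry
        if i < len(a):
            total += a[i]
        if i < len(b):
            total += b[i]
        carry, digit = divmod(total, BASE)
        result.append(digit)
    if carry:
        result.append(carry)  # the number grows by one digit instead of wrapping
    return result


def multiply(a: list, b: list) -> list:
    """Multiply every digit of a by every digit of b and sum the partial products."""
    result = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            total = result[i + j] + x * y + carry
            carry, result[i + j] = divmod(total, BASE)
        result[i + len(b)] += carry
    while len(result) > 1 and result[-1] == 0:
        result.pop()  # drop leading zero digits
    return result


x = to_digits(999 ** 999)
print(len(x), "digits in base 2^32")
print(add([BASE - 1], [1]))  # the largest single digit plus one carries into a second digit
```

Real implementations add faster multiplication algorithms for very large operands, but the carry loop is the core idea.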
You could read the source for bignum.c...
At a very high level, without going into any implementation details, bignums are calculated "by hand" like you used to do in grade school. Now, there are certainly many optimizations that can be applied, but that's the gist of it.
I don't know of the implementation details so I'll cover how a basic Big Number implementation would work.
Basically, instead of relying on CPU "integers", it will create its own using multiple CPU integers. To see how arbitrary precision is stored, let's say you have 2 bits and the current integer is 11. You want to add one. With normal CPU integers, this would roll over to 00.
But for a big number, instead of rolling over and keeping a "fixed" integer width, it would allocate another bit and simulate an addition so that the number becomes the correct 100.
Try looking up how binary math can be done on paper. It's very simple and is trivial to convert to an algorithm.
Beaconaut APICalc 2 just released on Jan.18, 2011, which is an arbitrary-precision integer calculator for bignum arithmetic, cryptography analysis and number theory research......
http://www.beaconaut.com/forums/default.aspx?g=posts&t=13
It uses the Bignum class
irb(main):001:0> (999**999).class
=> Bignum
Rdoc is available of course

How would I go about implementing this algorithm?

A while back I was trying to bruteforce a remote control which sent a 12 bit binary 'key'.
The device I made worked, but was very slow as it was trying every combination at about 50 bits per second (4096 codes = 49152 bits = ~16 minutes)
I opened the receiver and found it was using a shift register to check the codes and no delay was required between attempts. This meant that the receiver was simply looking at the last 12 bits to be received to see if they were a match to the key.
This meant that if the stream 111111111111000000000000 was sent through, it had effectively tried all of these codes.
111111111111 111111111110 111111111100 111111111000
111111110000 111111100000 111111000000 111110000000
111100000000 111000000000 110000000000 100000000000
000000000000
In this case, I have used 24 bits to try 13 12-bit combinations (about 85% fewer bits than the 156 needed to send them one after another).
Does anyone know of an algorithm that could reduce my 49152 bits sent by taking advantage of this?
What you're talking about is a de Bruijn sequence. If you don't care about how it works, you just want the result, here it is.
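In case the linked result is not handy, here is a sketch of one common way to generate such a sequence, the recursive construction that concatenates Lyndon words. The function names are mine. A de Bruijn sequence is cyclic, so linear_stream repeats the first n - 1 bits at the end; this is needed because the receiver described in the question only ever sees a linear stream.

```python
def de_bruijn(k: int, n: int) -> list:
    """Cyclic de Bruijn sequence over the alphabet 0..k-1 for words of length n."""
    a = [0] * (k * n)
    sequence = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def linear_stream(n: int) -> list:
    """Bit stream in which every n-bit code appears as a window at least once."""
    cycle = de_bruijn(2, n)
    return cycle + cycle[:n - 1]  # unroll the cycle for a shift-register receiver


stream = linear_stream(12)
print(len(stream), "bits instead of", 4096 * 12)
```

For 12-bit codes that is 4096 + 11 = 4107 bits, which at the 50 bits per second mentioned in the question takes roughly 82 seconds instead of about 16 minutes.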
Off the top of my head, I suppose flipping one bit in each 12-bit sequence would take care of another 13 combinations, for example 111111111101000000000010, then 111111111011000000000100, etc. But you still have to do a lot of permutations; even with one bit, I think you still have to do 111111111101000000000100 etc. Then flip two bits on one side and one on the other, etc.

Resources