Binary masks in CAPL

CAPL allows bitwise operations. Since writing parsers is becoming a tedious operation these days, I'm wondering if there is a way to write binary numbers for masks, e.g.
variables
{
byte a = 0x03;
}
on key 'a'
{
a &= 0b11; // <- invalid, how can we write this?
a &= 0x03;
a &= 3;
}

CAPL does not support binary literals.
You just have to add up the bit values and write the resulting number in hex or decimal format.
Alternatively, you can create a function that displays the value in binary in your report, if you really want to.
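The "add the bits" step is mechanical, so it can be checked outside CAPL. The following is a minimal sketch in Python, not CAPL; the helper names mask_from_bits and to_binary_string are made up for illustration. It builds the constant from bit positions and renders it in binary as a sanity check before the hex value is pasted into the CAPL source.

```python
def mask_from_bits(*bit_positions):
    """Build a mask by setting each listed bit (bit 0 is the least significant)."""
    mask = 0
    for pos in bit_positions:
        mask |= 1 << pos
    return mask


def to_binary_string(value, width=8):
    """Render a non-negative value as a fixed-width binary string."""
    return format(value, "0{}b".format(width))


mask = mask_from_bits(0, 1)    # bits 0 and 1 set
print(hex(mask), mask)         # hex and decimal forms of the same constant
print(to_binary_string(mask))  # human-readable check of the bit pattern
```

The same shift-and-OR loop is what a hand-written CAPL display function would have to do in reverse, one bit at a time.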

How to change a boost::multiprecision::cpp_int from big endian to little endian

I have a boost::multiprecision::cpp_int in big endian and have to change it to little endian. How can I do that? I tried with boost::endian::conversion but that did not work.
boost::multiprecision::cpp_int bigEndianInt("0xe35fa931a0000");
boost::multiprecision::cpp_int littleEndianInt;
littleEndianInt = boost::endian::endian_reverse(bigEndianInt);
The memory layout of Boost multiprecision types is an implementation detail, so you cannot assume much about it anyway (they're not supposed to be bitwise serializable).
Just read a random section of the docs:
MinBits
Determines the number of Bits to store directly within the object before resorting to dynamic memory allocation. When zero, this field is determined automatically based on how many bits can be stored in union with the dynamic storage header: setting a larger value may improve performance as larger integer values will be stored internally before memory allocation is required.
It's not immediately clear that you have any chance at some level of "normal int behaviour" in memory layout. The only exception would be when MinBits==MaxBits.
Indeed, we can static_assert that the size of cpp_int with such backend configs match the corresponding byte-sizes.
It turns out that there's even a promising tag in the backend base-class to indicate "triviality" (this is truly promising): trivial_tag, so let's use it:
#include <boost/multiprecision/cpp_int.hpp>
namespace mp = boost::multiprecision;
template <int bits> using simple_be =
mp::cpp_int_backend<bits, bits, mp::unsigned_magnitude>;
template <int bits> using my_int =
mp::number<simple_be<bits>, mp::et_off>;
using my_int8_t = my_int<8>;
using my_int16_t = my_int<16>;
using my_int32_t = my_int<32>;
using my_int64_t = my_int<64>;
using my_int128_t = my_int<128>;
using my_int192_t = my_int<192>;
using my_int256_t = my_int<256>;
template <typename Num>
constexpr bool is_trivial_v = Num::backend_type::trivial_tag::value;
int main() {
static_assert(sizeof(my_int8_t) == 1);
static_assert(sizeof(my_int16_t) == 2);
static_assert(sizeof(my_int32_t) == 4);
static_assert(sizeof(my_int64_t) == 8);
static_assert(sizeof(my_int128_t) == 16);
static_assert(is_trivial_v<my_int8_t>);
static_assert(is_trivial_v<my_int16_t>);
static_assert(is_trivial_v<my_int32_t>);
static_assert(is_trivial_v<my_int64_t>);
static_assert(is_trivial_v<my_int128_t>);
// however it doesn't scale
static_assert(sizeof(my_int192_t) != 24);
static_assert(sizeof(my_int256_t) != 32);
static_assert(not is_trivial_v<my_int192_t>);
static_assert(not is_trivial_v<my_int256_t>);
}
Concluding: you can have a trivial int representation up to a certain point, after which you get the allocator-based dynamic-limb implementation no matter what.
Note that using unsigned_packed instead of unsigned_magnitude representation never leads to a trivial backend implementation.
Note that triviality might depend on compiler/platform choices (it's likely that cpp_128_t uses some builtin compiler/standard library support on GCC, e.g.)
Given this, you MIGHT be able to pull off what you wanted to do with hacks IF your backend configuration supports triviality. Sadly, I think it requires you to manually overload endian_reverse for the 128-bit case, because the GCC builtins do not include __builtin_bswap128, nor does Boost Endian define one.
I'd suggest working off the information here How to make GCC generate bswap instruction for big endian store without builtins?
Final Demo (not complete)
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/endian/buffers.hpp>
#include <iostream>
namespace mp = boost::multiprecision;
namespace be = boost::endian;
template <int bits> void check() {
using T = mp::number<mp::cpp_int_backend<bits, bits, mp::unsigned_magnitude>, mp::et_off>;
static_assert(sizeof(T) == bits/8);
static_assert(T::backend_type::trivial_tag::value);
be::endian_buffer<be::order::big, T, bits, be::align::no> buf;
buf = T("0x0102030405060708090a0b0c0d0e0f00");
std::cout << std::hex << buf.value() << "\n";
}
int main() {
check<128>();
}
(Changing be::order::big to be::order::native obviously makes it compile. The other way to complete it would be to have an ADL accessible overload for endian_reverse for your int type.)
This is both trivial and in the general case unanswerable, let me explain:
For a general N-bit integer, where N is a large number, there is unlikely to be any well defined byte order, indeed even for 64 and 128 bit integers there are more than 2 possible orders in use: https://en.wikipedia.org/wiki/Endianness#Middle-endian.
On any platform, with any native endianness you can always extract the bytes of a cpp_int, the first example here: https://www.boost.org/doc/libs/1_73_0/libs/multiprecision/doc/html/boost_multiprecision/tut/import_export.html#boost_multiprecision.tut.import_export.examples shows you how. When exporting bytes like this, they are always most significant byte first, so you can subsequently rearrange them how you wish. You should not however, rearrange them and load them back into a cpp_int as the class won't know what to do with the result!
If you know that the value is small enough to fit into a native integer type, then you can simply cast to the native integer and use a system API on the result. As in endian_reverse(static_cast<int64_t>(my_cpp_int)). Again, don't assign the result back into a cpp_int as it requires native byte order.
If you wish to check whether a value is small enough to fit in an N-bit integer for the approach above, you can use the msb function, which returns the index of the most significant bit in the cpp_int. Add one to that to obtain the number of bits used, and filter out the zero case; the code looks like:
unsigned bits_used = my_cpp_int.is_zero() ? 0 : msb(my_cpp_int) + 1;
Note that all of the above use completely portable code - no hacking of the underlying implementation is required.
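The export-then-rearrange idea is independent of Boost, so here is a minimal sketch of it in Python, whose built-in int is also arbitrary precision. It is an illustration only, not the Boost API: export_bytes_msb_first is a made-up helper standing in for the export step, and bit_length() plays the role of msb() + 1 with the zero case already handled.

```python
def export_bytes_msb_first(value):
    """Return the magnitude of a non-negative integer as bytes, most significant byte first."""
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "big")


value = 0xE35FA931A0000

msb_first = export_bytes_msb_first(value)  # portable, always MSB first
lsb_first = msb_first[::-1]                # rearranged for a little-endian consumer

# Number of bits actually used; 0 for the value zero.
bits_used = value.bit_length()

print(msb_first.hex())
print(lsb_first.hex())
print(bits_used)
```

Note that the reversed bytes are kept as a byte string for output. They are not loaded back into the integer, which matches the warning above about not feeding rearranged bytes back into a cpp_int.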

I can not write integer in LCD AVR

I cannot write an integer to the LCD using these functions; it shows something weird on the screen.
I have added the function below, please check it for me.
I have included everything needed.
my_delay(1000);
LCDWriteStringXY(0,0,"Welcome..");
my_delay(1000);
LCDWriteStringXY(0,0,"Welcome...");
my_delay(1000);
LCDClear();
LCDWriteStringXY(4,0,"Testing");
LCDGotoXY(2,1);
int m=952520;
LCDWriteInt(m,6);//I can not write it!!!
void LCDWriteInt(int val,unsigned int field_length)
{
char str[5]={0,0,0,0,0};
int i=4,j=0;
while(val)
{
str[i]=val%10;
val=val/10;
i--;
}
if(field_length==-1)
while(str[j]==0) j++;
else
j=5-field_length;
if(val<0) LCDData('-');
for(i=j;i<5;i++)
{
LCDData(48+str[i]);
}
}
I think the function is written for 16-bit integers, for which the maximum value would be 65535 (5 digits, the same as the length of str[]). You are giving it a 6-digit value, which first runs off the start of the buffer when the loop tries to store the sixth digit in str[-1], and then produces j = -1.
My suggestion is to either use smaller integers (16-bit only), or write another function like the one you showed us to do the same thing for larger values.
Lastly, the if(val<0) LCDData('-') check can never work as written, because val has already been reduced to 0 by the first while loop.
Use the itoa function. It converts an integer to a string, which you can then display on the LCD. Best of luck!
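To make the two problems above concrete (a buffer sized for five digits, and a sign test that runs after val has been consumed), here is a minimal sketch of the digit-extraction logic in Python rather than AVR C. The function name int_to_digit_chars is made up, and the returned string stands in for the sequence of LCDData calls. Unlike the original, it never truncates a value that is wider than field_length.

```python
def int_to_digit_chars(val, field_length=-1):
    """Return the characters that would be sent to the display for val."""
    negative = val < 0                 # remember the sign before val is consumed
    magnitude = -val if negative else val

    digits = []
    while magnitude:
        digits.append(chr(48 + magnitude % 10))  # 48 is ASCII '0'
        magnitude //= 10
    if not digits:
        digits.append("0")             # the value 0 still needs one digit
    digits.reverse()                   # digits were produced least significant first

    text = "".join(digits)
    if field_length != -1:
        text = text.rjust(field_length, "0")     # left-pad with zeros, never truncate
    return ("-" if negative else "") + text


print(int_to_digit_chars(952520, 6))
print(int_to_digit_chars(-42))
```

In C the same structure applies: size the buffer for the widest type you intend to pass (for example a long), and test the sign before the division loop.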

Making a list of integers more human friendly

This is a bit of a side project I have taken on to solve a no-fix issue for work. Our system outputs a code to represent a combination of things on another thing. Some example codes are:
9-9-0-4-4-5-4-0-2-0-0-0-2-0-0-0-0-0-2-1-2-1-2-2-2-4
9-5-0-7-4-3-5-7-4-0-5-1-4-2-1-5-5-4-6-3-7-9-72
9-15-0-9-1-6-2-1-2-0-0-1-6-0-7
The max number in one of the slots I've seen so far is about 150 but they will likely go higher.
When the system was designed there was no requirement for what this code would look like. But now the client wants to be able to type it in by hand from a sheet of paper, something the code above isn't suited for. We've said we won't do anything about it, but it seems like a fun challenge to take on.
My question is where is a good place to start loss-less compressing this code? Obvious solutions such as store this code with a shorter key are not an option; our database is read only. I need to build a two way method to make this code more human friendly.
1) I agree that you definitely need a checksum - data entry errors are very common, unless you have really well trained staff and independent duplicate keying with automatic cross-checking.
2) I suggest http://en.wikipedia.org/wiki/Huffman_coding to turn your list of numbers into a stream of bits. To get the probabilities required for this, you need a decent sized sample of real data, so you can make a count, setting Ni to the number of times number i appears in the data. Then I suggest setting Pi = (Ni + 1) / (Sum_i (Ni + 1)) - which smooths the probabilities a bit. Also, with this method, if you see e.g. numbers 0-150 you could add a bit of slack by entering numbers 151-255 and setting them to Ni = 0. Another way round rare large numbers would be to add some sort of escape sequence.
3) Finding a way for people to type the resulting sequence of bits is really an applied psychology problem but here are some suggestions of ideas to pinch.
3a) Software licences - just encode six bits per character in some 64-character alphabet, but group characters in a way that makes it easier for people to keep place e.g. BC017-06777-14871-160C4
3b) UK car license plates. Use a change of alphabet to show people how to group characters e.g. ABCD0123EFGH4567IJKL...
3c) A really large alphabet - get yourself a list of 2^n words for some decent sized n and encode n bits as a word e.g. GREEN ENCHANTED LOGICIAN...
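Step 2 of the list above can be sketched as follows. This is an illustration in Python under stated assumptions, not a tuned implementation: the function names are made up, the sample is just one of the example codes from the question, and the weights are the smoothed counts Ni + 1 rather than normalized probabilities (Huffman construction gives the same tree either way). The slack symbols up to 255 are included with Ni = 0.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_code(sample, alphabet_size):
    """Return {symbol: bitstring} built from add-one smoothed counts (Ni + 1)."""
    counts = Counter(sample)
    tie = count()  # tie-breaker so the heap never has to compare the dict payloads
    heap = [(counts[s] + 1, next(tie), {s: ""}) for s in range(alphabet_size)]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next(tie), merged))
    return heap[0][2]


def encode(numbers, code):
    """Concatenate the bitstrings for each number."""
    return "".join(code[n] for n in numbers)


def decode(bits, code):
    """Walk the bit stream, emitting a symbol whenever a full codeword is matched."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


sample = [int(x) for x in "9-9-0-4-4-5-4-0-2-0-0-0-2-0-0-0-0-0-2-1-2-1-2-2-2-4".split("-")]
code = build_huffman_code(sample, 256)  # symbols 0..255, unseen ones get Ni = 0
bits = encode(sample, code)
print(len(bits), "bits for", len(sample), "numbers")
```

A real table would be built once from a large sample of production codes and then frozen, since encoder and decoder must share exactly the same table. Values of 256 or more would need the escape sequence mentioned in step 2.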
i worried about this problem a while back. it turns out that you can't do much better than base64 - trying to squeeze a few more bits per character isn't really worth the effort (once you get into "strange" numbers of bits encoding and decoding becomes more complex). but at the same time, you end up with something that's likely to have errors when entered (confusing a 0 with an O etc). one option is to choose a modified set of characters and letters (so it's still base 64, but, say, you substitute ">" for "0"). another is to add a checksum. again, for simplicity of implementation, i felt the checksum approach was better.
unfortunately i never got any further - things changed direction - so i can't offer code or a particular checksum choice.
ps i realised there's a missing step i didn't explain: i was going to compress the text into some binary form before encoding (using some standard compression algorithm). so to summarize: compress, add checksum, base64 encode; base 64 decode, check checksum, decompress.
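The pipeline summarized above (compress, add checksum, base64 encode, and the reverse) can be sketched with the Python standard library. The specific choices here are assumptions made for illustration, since the answer names none: zlib for compression, CRC-32 for the checksum, and the function names encode_code and decode_code.

```python
import base64
import zlib


def encode_code(code):
    """compress -> append checksum -> base64 encode"""
    payload = zlib.compress(code.encode("ascii"), 9)
    checksum = zlib.crc32(payload).to_bytes(4, "big")
    return base64.b64encode(payload + checksum).decode("ascii")


def decode_code(text):
    """base64 decode -> verify checksum -> decompress"""
    raw = base64.b64decode(text)
    payload, checksum = raw[:-4], raw[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        raise ValueError("checksum mismatch: the code was probably mistyped")
    return zlib.decompress(payload).decode("ascii")


original = "9-15-0-9-1-6-2-1-2-0-0-1-6-0-7"
typed = encode_code(original)
print(typed)
print(decode_code(typed) == original)
```

For inputs this short, a general-purpose compressor adds header overhead and may well make the result longer than the original, so a purpose-built packing of the numbers would probably be needed in practice. The sketch only shows the order of the steps and where the checksum catches a typing error.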
This is similar to what I have used in the past. There are certainly better ways of doing this, but I used this method because it was easy to mirror in Transact-SQL, which was a requirement at the time. You could certainly modify this to incorporate Huffman encoding if the distribution of your IDs is non-random, but it's probably unnecessary.
You didn't specify a language, so this is in C#, but it should be very easy to port to any language. In the lookup table you'll see that commonly confused characters are omitted. This should speed up entry. I also had the requirement to have a fixed length, but it would be easy for you to modify this.
using System;
using System.Collections.Generic;
using System.Text;
static public class CodeGenerator
{
static Dictionary<int, char> _lookupTable = new Dictionary<int, char>();
static CodeGenerator()
{
PrepLookupTable();
}
private static void PrepLookupTable()
{
_lookupTable.Add(0,'3');
_lookupTable.Add(1,'2');
_lookupTable.Add(2,'5');
_lookupTable.Add(3,'4');
_lookupTable.Add(4,'7');
_lookupTable.Add(5,'6');
_lookupTable.Add(6,'9');
_lookupTable.Add(7,'8');
_lookupTable.Add(8,'W');
_lookupTable.Add(9,'Q');
_lookupTable.Add(10,'E');
_lookupTable.Add(11,'T');
_lookupTable.Add(12,'R');
_lookupTable.Add(13,'Y');
_lookupTable.Add(14,'U');
_lookupTable.Add(15,'A');
_lookupTable.Add(16,'P');
_lookupTable.Add(17,'D');
_lookupTable.Add(18,'S');
_lookupTable.Add(19,'G');
_lookupTable.Add(20,'F');
_lookupTable.Add(21,'J');
_lookupTable.Add(22,'H');
_lookupTable.Add(23,'K');
_lookupTable.Add(24,'L');
_lookupTable.Add(25,'Z');
_lookupTable.Add(26,'X');
_lookupTable.Add(27,'V');
_lookupTable.Add(28,'C');
_lookupTable.Add(29,'N');
_lookupTable.Add(30,'B');
}
public static bool TryPCodeDecrypt(string iPCode, out Int64 oDecryptedInt)
{
//Prep the result so we can exit without having to fiddle with it if we hit an error.
oDecryptedInt = 0;
if (iPCode.Length > 3)
{
Char[] Bits = iPCode.ToCharArray(0,iPCode.Length-2);
int CheckInt7 = 0;
int CheckInt3 = 0;
if (!int.TryParse(iPCode[iPCode.Length-1].ToString(),out CheckInt7) ||
!int.TryParse(iPCode[iPCode.Length-2].ToString(),out CheckInt3))
{
//Unsuccessful -- the last check ints are not integers.
return false;
}
//Adjust the CheckInts to the right values.
CheckInt3 -= 2;
CheckInt7 -= 2;
int COffset = iPCode.LastIndexOf('M')+1;
Int64 tempResult = 0;
int cBPos = 0;
while ((cBPos + COffset) < Bits.Length)
{
//Calculate the current position.
int cNum = 0;
foreach (int cKey in _lookupTable.Keys)
{
if (_lookupTable[cKey] == Bits[cBPos + COffset])
{
cNum = cKey;
}
}
tempResult += cNum * (Int64)Math.Pow((double)31, (double)(Bits.Length - (cBPos + COffset + 1)));
cBPos += 1;
}
if (tempResult % 7 == CheckInt7 && tempResult % 3 == CheckInt3)
{
oDecryptedInt = tempResult;
return true;
}
return false;
}
else
{
//Unsuccessful -- too short.
return false;
}
}
public static string PCodeEncrypt(int iIntToEncrypt, int iMinLength)
{
int Check7 = (iIntToEncrypt % 7) + 2;
int Check3 = (iIntToEncrypt % 3) + 2;
StringBuilder result = new StringBuilder();
result.Insert(0, Check7);
result.Insert(0, Check3);
int workingNum = iIntToEncrypt;
while (workingNum > 0)
{
result.Insert(0, _lookupTable[workingNum % 31]);
workingNum /= 31;
}
if (result.Length < iMinLength)
{
for (int i = result.Length + 1; i <= iMinLength; i++)
{
result.Insert(0, 'M');
}
}
return result.ToString();
}
}

Little endian data and sha 256

I have to generate sha256 hashes of data that is in little endian form. I would like to know if I have to convert it to big endian first, before using the sha 256 algorithm. Or if, the algorithm is "endian-agnostic".
EDIT: Sorry, I think I wasn't clear. What I would like to know is the following: the SHA-256 algorithm requires padding the end of a message with certain bits. The first step is to append a 1 bit to the message, then pad with zeros, and at the very end append the length of the message in bits. What I would like to know is whether this padding can be performed in little endian. For example, for a 640-bit message, I could write the last word as 0x280 (in big endian) or 0x80020000 (in little endian). Can this padding be done in little endian?
SHA256 is endian-agnostic if all you want is a good hash. But if you are writing SHA256 and want the same results as a correct implementation, then you must play games on little-endian hardware. SHA256 combines arithmetic addition (mod 2^32) and boolean operations, and thus is not endian-agnostic internally.
The SHA-256 implementation itself should take care of padding - you shouldn't have to deal with that unless you're implementing your own specialized SHA-256 code. If you are, note that the padding rules specified in the "pre-processing step" say that the length is a 64-bit big-endian integer. See SHA-2 - Wikipedia
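For anyone who does implement the pre-processing step by hand, here is a minimal sketch of the padding rule in Python. The function name sha256_pad is made up; the rule itself (a single 1 bit, zeros up to 56 bytes modulo 64, then the bit length as a 64-bit big-endian integer) is the one described above. It works on whole bytes only, which is an assumption that covers the usual case.

```python
import struct


def sha256_pad(message):
    """Return message plus SHA-256 padding; the result is a multiple of 64 bytes."""
    bit_length = len(message) * 8
    padded = message + b"\x80"                          # the single 1 bit, then seven 0 bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)  # zeros up to 56 mod 64
    padded += struct.pack(">Q", bit_length)             # 64-bit length, big endian
    return padded


message = bytes(80)                 # a 640-bit message, as in the question
padded = sha256_pad(message)
print(len(padded))                  # total length in bytes
print(padded[-8:].hex())            # the length field
```

The length field ends in the bytes 02 80 regardless of the machine the code runs on. Writing it in little-endian order would produce a different padded block and therefore a different digest from every conforming implementation.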
It's hard to even figure out what "endian-agnostic" would mean, but the order of all the bits, bytes and words for a hash algorithm matter a whole lot, so I sure wouldn't use that term.
Let me reply regarding sha 256 as well as sha 512.
in short:
The algorithm itself is endian agnostic. The endian sensitive parts are when data is imported from a byte buffer to the algorithm working variables and when it is exported back to the digest result - also a byte buffer. If the import / export include casting, then endian matters.
Where could casting occur:
In sha 512 there is a working buffer of 128 bytes.
In my code it's defined like this:
union
{
U64 w[80]; /* see the U64 example below */
byte buffer[128];
};
Input data is copied to this byte buffer and then work is done on w. This means the data was cast to some 64-bit type. This data will have to be swapped; in my case it's swapped on little-endian machines.
A better method would be to prepare a get macro that takes each byte and places it in its correct place in the u64 type.
When the algorithm is done the digest result is output from the working variables to some byte buffer, if this is done by memcpy it will also have to be swapped.
Another casting could occur when implementing sha 512 - which is designed for 64 bit machines - on 32 bit machines. In my case I have a 64 bit type that is defined:
typedef struct {
uint high;
uint low;
} U64;
Assume I define it for little endian as well, as follows:
typedef struct {
uint low;
uint high;
} U64;
And then the k algorithm init is done like this:
static const SHA_U64 k[80] =
{
{0xD728AE22, 0x428A2F98}, {0x23EF65CD, 0x71374491}, ...
...
...
}
But i need the logic value of k[0].high to be the same in any machine.
So in this example I will need another k array with high and low values swapped.
After the data is stored in the working parameters any bitwise manipulation would have the same result on both big/little endian machines.
Good method would be to avoid any casting:
Import bytes from input buffer to your working parameters using macro.
Work with logical values without thinking about the memory mapping.
Export output to digest result with a macro.
Macro for taking 32 bits from a byte buffer to int32 (BE = big endian):
#define GET_BE_BYTES_FROM32(a) \
((((NQ_UINT32) (a)[0]) << 24) | \
(((NQ_UINT32) (a)[1]) << 16) | \
(((NQ_UINT32) (a)[2]) << 8) | \
((NQ_UINT32) (a)[3]))
#define GET_LE_BYTES_FROM32(a) \
((((NQ_UINT32) (a)[3]) << 24) | \
(((NQ_UINT32) (a)[2]) << 16) | \
(((NQ_UINT32) (a)[1]) << 8) | \
((NQ_UINT32) (a)[0]))
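The point of the shift-based macros is that they give the same logical value on any host, because they never reinterpret memory. A minimal sketch of the same two operations in Python (the names get_be32 and get_le32 are made up for illustration), cross-checked against int.from_bytes:

```python
def get_be32(buf, offset=0):
    """Assemble a 32-bit value from 4 bytes, first byte most significant."""
    return (
        (buf[offset] << 24)
        | (buf[offset + 1] << 16)
        | (buf[offset + 2] << 8)
        | buf[offset + 3]
    )


def get_le32(buf, offset=0):
    """Assemble a 32-bit value from 4 bytes, first byte least significant."""
    return (
        (buf[offset + 3] << 24)
        | (buf[offset + 2] << 16)
        | (buf[offset + 1] << 8)
        | buf[offset]
    )


data = bytes([0x00, 0x00, 0x02, 0x80])
print(hex(get_be32(data)))   # the bytes read as a big-endian word
print(hex(get_le32(data)))   # the same bytes read as a little-endian word
```

The four bytes used here are the low half of the 640-bit length field from the question: read as big endian they give 0x280, which is what SHA-256 expects.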

Visual Studio C++ 2008 Manipulating Bytes?

I'm trying to write strictly binary data to files (no encoding). The problem is, when I hex dump the files, I'm noticing rather weird behavior. Using either one of the below methods to construct a file results in the same behavior. I even used the System::Text::Encoding::Default to test as well for the streams.
StreamWriter^ binWriter = gcnew StreamWriter(gcnew FileStream("test.bin",FileMode::Create));
(Also used this method)
FileStream^ tempBin = gcnew FileStream("test.bin",FileMode::Create);
BinaryWriter^ binWriter = gcnew BinaryWriter(tempBin);
binWriter->Write(0x80);
binWriter->Write(0x81);
.
.
binWriter->Write(0x8F);
binWriter->Write(0x90);
binWriter->Write(0x91);
.
.
binWriter->Write(0x9F);
Writing that sequence of bytes, I noticed the only bytes that weren't converted to 0x3F in the hex dump were 0x81,0x8D,0x90,0x9D, ... and I have no idea why.
I also tried making character arrays, and a similar situation happens. i.e.,
array<wchar_t,1>^ OT_Random_Delta_Limits = {0x00,0x00,0x03,0x79,0x00,0x00,0x04,0x88};
binWriter->Write(OT_Random_Delta_Limits);
0x88 would be written as 0x3F.
If you want to stick to binary files then don't use StreamWriter. Just use a FileStream and Write/WriteByte. StreamWriters (and TextWriters in general) are expressly designed for text. Whether you want an encoding or not, one will be applied - because when you're calling StreamWriter.Write, that's writing a char, not a byte.
Don't create arrays of wchar_t values either - again, those are for characters, i.e. text.
BinaryWriter.Write should have worked for you unless it was promoting the values to char in which case you'd have exactly the same problem.
By the way, without specifying any encoding, I'd expect you to get non-0x3F values, but instead the bytes representing the UTF-8 encoded values for those characters.
When you specified Encoding.Default, you'd have seen 0x3F for any Unicode values not in that encoding.
Anyway, the basic lesson is to stick to Stream when you want to deal with binary data rather than text.
EDIT: Okay, it would be something like:
public static void ConvertHex(TextReader input, Stream output)
{
while (true)
{
int firstNybble = input.Read();
if (firstNybble == -1)
{
return;
}
int secondNybble = input.Read();
if (secondNybble == -1)
{
throw new IOException("Reader finished half way through a byte");
}
int value = (ParseNybble(firstNybble) << 4) + ParseNybble(secondNybble);
output.WriteByte((byte) value);
}
}
// value would actually be a char, but as we've got an int in the above code,
// it just makes things a bit easier
private static int ParseNybble(int value)
{
if (value >= '0' && value <= '9') return value - '0';
if (value >= 'A' && value <= 'F') return value - 'A' + 10;
if (value >= 'a' && value <= 'f') return value - 'a' + 10;
throw new ArgumentException("Invalid nybble: " + (char) value);
}
This is very inefficient in terms of buffering etc, but should get you started.
A BinaryWriter() class initialized with a stream will use a default encoding of UTF8 for any chars or strings that are written. I'm guessing that the
binWriter->Write(0x80);
binWriter->Write(0x81);
.
.
binWriter->Write(0x8F);
binWriter->Write(0x90);
binWriter->Write(0x91);
calls are binding to the Write(char) overload so they're going through the character encoder. I'm not very familiar with C++/CLI, but it seems to me that these calls should be binding to Write(Int32), which shouldn't have this problem (maybe your code is really calling Write() with a char variable that's set to the values in your example. That would account for this behavior).
0x3F is commonly known as the ASCII character '?'; the characters that are mapping to it are control characters with no printable representation. As Jon points out, use a binary stream rather than a text-oriented output mechanism for raw binary data.
EDIT -- actually your results look like the inverse of what I would expect. In the default code page 1252, the non-printable characters (i.e. ones likely to map to '?') in that range are 0x81, 0x8D, 0x8F, 0x90 and 0x9D
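The list of unassigned byte values in code page 1252 can be checked quickly. The sketch below uses Python's cp1252 codec, which is an assumption worth stating: Python treats those five positions as strictly undefined, whereas, as far as I know, the Windows conversion passes them through as the matching control characters. It is meant only to show which positions are special and how a character that the encoding cannot represent ends up as '?' (0x3F), not to reproduce .NET's exact output.

```python
# Byte values in 0x80..0x9F that have no character assigned in cp1252.
undefined = [
    b for b in range(0x80, 0xA0)
    if bytes([b]).decode("cp1252", errors="replace") == "\ufffd"
]
print([hex(b) for b in undefined])

# A character the target encoding cannot represent is replaced by '?'.
as_text = "\u0080"                       # the char U+0080, not the byte 0x80
encoded = as_text.encode("cp1252", errors="replace")
print(encoded)

# Writing the byte directly involves no encoder at all.
raw = bytes([0x80])
print(raw.hex())
```

This is the text-versus-bytes distinction made in the answers above: the value 0x80 written as a character goes through an encoder and can be replaced, while the same value written as a byte is stored unchanged.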
