Add a space after every byte in a binary file (image)

I have a binary file of an image, which is a 256x256 array. I just want to add a space after every byte in this file,
e.g. 000101010101110101010100
expected output: 00010101 01011101 01010100
Can anyone suggest how to do this, preferably in MATLAB?

If you already have those 0/1 values in the form of a char array, such as
a =
101001011111011101010100
100101001010100000111111
100100100001011110110101
001010011111110000001010
you can do it this way:
n = 8; %// byte size
b = a(:,ceil(n/(n+1):n/(n+1):end)); %// repeat every n-th position
b(:,n+1:n+1:end) = ' '; %// replace those repeated positions by spaces
which gives
b =
10100101 11110111 01010100
10010100 10101000 00111111
10010010 00010111 10110101
00101001 11111100 00001010
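If you find the indexing trick hard to read, a regexprep-based sketch does the same thing (assuming the same char array a; cellstr is used because a has several rows):
b = char(regexprep(cellstr(a), '(.{8})(?=.)', '$1 ')); %// insert a space after every 8th character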

... and to get the char array mentioned by Luis Mendo you can do the following:
%edge length of image
edge = 256;
%open and read file
fid = fopen('imageData.txt');
C = fread(fid,[edge+2,edge]);
The +2 actually depends on your file: do the lines end with carriage return and line feed characters? If not, leave it out.
% get char array
chararray = char(C(1:edge,:)');
Then I would proceed as follows:
wordlength = 8;
cellarray = mat2cell(chararray,ones(1,edge),wordlength*ones(1,edge/wordlength));
cellarray = cellfun(@(x) [x ' '], cellarray, 'uni', 0);
output = cell2mat(cellarray);
%// remove last column with unwanted trailing space
output = output(:,1:end-1)
and write that to a file again.
Regarding your comment, try this:
wordlength = 8;
cellarray = mat2cell(chararray,ones(1,edge),wordlength*ones(1,edge/wordlength));
fid2 = fopen(yourFilepath, 'w');
fprintf(fid2, '%s\r\n', cellarray{:});
fclose(fid2);

You can do it with a simple Perl one-liner:
perl -e '$/=\1;while(<>){print $_," ";}' < yourFile > newfile
The $/=\1 sets the record separator so we read one byte at a time. Then we go into a loop reading one byte at a time, printing it out followed by a space till end of file.
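And since the question asked for MATLAB: here is a minimal sketch of the same byte-at-a-time idea (both file names are placeholders). Like the one-liner, it leaves a trailing space after the last byte.
fid = fopen('image.bin', 'r');            %// placeholder input name
bytes = fread(fid, Inf, 'uint8=>char')';  %// read every byte as a character
fclose(fid);
fid = fopen('image_spaced.txt', 'w');     %// placeholder output name
fprintf(fid, '%c ', bytes);               %// each byte followed by a space
fclose(fid);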

Related

How to use "column" to center a chart?

I was wondering how best to format a chart with the column command so that each column is centered instead of the default left alignment. I have been using the column -t filename command.
Current Output:
Label1        label2
Anotherlabel  label2442
label152      label42242
label78765    label373737737
Desired Output: Something like this
   Label1         label2
Anotherlabel    label2442
  label152      label42242
 label78765   label373737737
Basically, I want it to be centered instead of left aligned.
Here is an awk solution:
# Collect all lines in "data", keep track of maximum width for each field
{
    data[NR] = $0
    for (i = 1; i <= NF; ++i)
        max[i] = length($i) > max[i] ? length($i) : max[i]
}

END {
    for (i = 1; i <= NR; ++i) {
        # Split record into array "arr"
        split(data[i], arr)

        # Loop over array
        for (j = 1; j <= NF; ++j) {
            # Calculate amount of padding required
            pad = max[j] - length(arr[j])

            # Print field with appropriate padding, see below
            printf "%*s%*s%s", length(arr[j]) + int(pad/2), arr[j], \
                   pad % 2 == 0 ? pad/2 : int(pad/2) + 1, "", \
                   j == NF ? "" : "  "
        }

        # Newline at end of record
        print ""
    }
}
Called like this:
$ awk -f centre.awk infile
   Label1         label2
Anotherlabel    label2442
  label152      label42242
 label78765   label373737737
The printf statement uses padding with dynamic widths:
The first %*s takes care of left padding and the data itself: arr[j] gets printed and padded to a total width of length(arr[j]) + int(pad/2).
The second %*s prints the empty string, left padded to half of the total padding required. pad % 2 == 0 ? pad/2 : int(pad/2) + 1 checks if the total padding was an even number, and if not, adds an extra space.
The last %s prints j == NF ? "" : "  ", i.e., two spaces, unless we're at the last field.
Some older awks don't support the %*s syntax, but the format string can be assembled dynamically in that case, e.g. width = 5; printf "%" width "s", arr[j].
Here's a Python program to do what you want. It's probably too hard to do in bash, so you'll need to use a custom program or awk script. Basic algorithm:
count number of columns
[optional] make sure each line has the same number of columns
figure out the maximum length of data for each column
print each line using the max lengths
#!/usr/bin/env python3

import sys

def column():
    # Read file and split each line into fields (by whitespace)
    with open(sys.argv[1]) as f:
        lines = [line.split() for line in f]

    # Check that each line has the same number of fields
    num_fields = len(lines[0])
    for n, line in enumerate(lines):
        if len(line) != num_fields:
            print('Line {} has wrong number of columns: expected {}, got {}'.format(n, num_fields, len(line)))
            sys.exit(1)

    # Calculate the maximum length of each field
    max_column_widths = [0] * num_fields
    for line in lines:
        line_widths = (len(field) for field in line)
        max_column_widths = [max(z) for z in zip(max_column_widths, line_widths)]

    # Now print them centered using the max_column_widths
    spacing = 4
    format_spec = (' ' * spacing).join('{:^' + str(n) + '}' for n in max_column_widths)
    for line in lines:
        print(format_spec.format(*line))

if __name__ == '__main__':
    column()
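Called like this (assuming the script is saved as centre.py, mirroring the awk example above):
$ python3 centre.py infile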

How to divide an image of any size into equally sized blocks using only a for loop in MATLAB?

How can I divide an image into equally sized blocks using only a simple MATLAB for loop?
As a beginner I have tried, but it shows an error. I have done the same thing using mat2cell and with direct index calculations.
This is my code:
[rows, columns, numberOfColorChannels] = size(p);
r4 = int32(rows/4);
c4 = int32(columns/4);
% Extract images.
image1 = p(1:r4, 1:c4);
image2 = p(1:r4, c4+1:2*c4);
image3 = p(1:r4, 2*c4+1:3*c4);
image4 = p(1:r4, 3*c4+1:4*c4);
I need to do it with a for loop only.
First things first: if you separate both x and y into 4 equally sized sections, you get 16 smaller images, not 4. You need to understand this first part.
[rows, columns, numberOfColorChannels] = size(p);
r4 = int32(rows/4);
c4 = int32(columns/4);
output = zeros(16, r4, c4, numberOfColorChannels);
cnt = 1;
for i = 1:4
    for j = 1:4
        output(cnt,:,:,:) = p((i-1)*r4+1:i*r4, (j-1)*c4+1:j*c4, :); % trailing colon keeps all color channels
        cnt = cnt + 1;
    end
end
The code basically does what you've already done but in two dimensions.
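To pull one of the blocks back out as an ordinary image, drop the leading singleton dimension with squeeze; a minimal sketch (block 7 is an arbitrary choice, and the uint8 cast assumes p held uint8 data):
block = squeeze(output(7,:,:,:)); %// r4-by-c4(-by-channels) block
imshow(uint8(block));             %// output is double, so cast back for display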

Annoying bug in MATLAB

I'm doing a project for which I needed to write two functions. The first is mk_key; its job is to convert a 24-bit RGB image into a vector of consecutive bytes, followed by a 4-byte trailer holding the dimensions of the image for reconstruction (so the vector size is [1 m*n*3+4]). The elements of that vector are then shuffled randomly according to a seed value (acting like a password), and finally the vector is saved to a file as hex bytes. The other function is use_key, and it is used to reconstruct the image from the key file; this is done by re-arranging the vector elements back into their original positions and then using the data in the trailer to rebuild the full RGB image. The only problem I have is that the reconstructed image has most of its pixels lost: it only shows about 1/6 of the image, and only in the red plane; the bytes in the other planes appear gray.
mk_key:
function mk_key(img, dest_file, seed)
s=size(img);
m=s(1);
n=s(2);
rg = zeros([1 m*n 3],'uint8');
for i=drange(1:m)
    for j=drange(1:n)
        rg(1,n*i+j-n,:)=img(i,j,:); %convert rectangular image matrix into row image
    end
end
rgf = zeros([1 (m*n*3)+4],'uint8');
for x=drange(1:3)
    rgf(1,(m*n*(x-1))+1:m*n*x)=rg(1,1:m*n,x);
end
mm=uint16(m);
nn=uint16(n);
rgf(1,(m*n*3)+1)=uint8(bitand(mm,hex2dec('00ff')));
rgf(1,(m*n*3)+2)=uint8(bitshift(bitand(mm,hex2dec('ff00')),-8));
rgf(1,(m*n*3)+3)=uint8(bitand(nn,hex2dec('00ff')));
rgf(1,(m*n*3)+4)=uint8(bitshift(bitand(nn,hex2dec('ff00')),-8));
rng(seed);
idxs = randperm(((m*n*3)+4)); % generate a random sequence representing byte locations
sg = zeros([1 ((m*n*3)+4)],'uint8');
for i=drange(1:((m*n*3)+4))
    sg(1,i)=rgf(1,idxs(i));
end
f = fopen(dest_file, 'w');
for i=drange(1:((m*n*3)+4))
    fprintf(f, '%x %x', sg(1,i));
end
fclose('all');
end
use_key:
function [img]=use_key(source_file, seed)
key_file=fopen(source_file);
key=fscanf(key_file,'%x %x');
key=key'; %Transpose column vector into row vector
key=uint8(key);
s=size(key);
rng(seed);
idxs = randperm(s(2)); % generate a random sequence representing byte locations
mgf = zeros([1 s(2)],'uint8');
for i=drange(1:s(2))
    mgf(1,idxs(i))=key(1,i);
end
m=uint16(mgf(1,s(2)-3))+uint16(mgf(1,s(2)-2))*(16^2);
n=uint16(mgf(1,s(2)-1))+uint16(mgf(1,s(2)))*(16^2);
img = zeros([m n 3],'uint8');
for x=drange(1:3)
    for i=drange(1:m)
        for j=drange(1:n)
            img(i,j,x)=mgf(1,(n*i+j-n)+(m*n)*(x-1)); %convert row matrix into rectangular image matrix
        end
    end
end
fclose('all');
end
Whatever the bug is, it's somewhere in those horrible nested loops. Rather than attempt to fix what looks like a direct port of C code, I started cleaning up all the confusing and needlessly overcomplicated bits so I could make sense of it; by the time I'd finished, there wasn't much left:
function mk_key(img, dest_file, seed)
    s = uint16(size(img));
    s = typecast(s(1:2), 'uint8');
    rg = reshape(img, 1, []);
    rgf = [rg s];
    rng(seed);
    idxs = randperm(numel(rgf));
    sg = rgf(idxs);
    f = fopen(dest_file, 'w');
    fprintf(f, '%x ', sg);
    fclose(f);
end
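(The typecast call simply reinterprets the two uint16 size values as four raw bytes; for example, on a little-endian machine:)
typecast(uint16([256 3]), 'uint8') %// ans = 0 1 3 0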
and correspondingly:
function [img] = use_key(source_file, seed)
    key_file = fopen(source_file);
    key = fscanf(key_file, '%x');
    fclose(key_file);
    rng(seed);
    idxs = randperm(numel(key));
    mgf = zeros(1, numel(key), 'uint8');
    mgf(idxs) = key;
    s = typecast(mgf(end-3:end), 'uint16');
    img = reshape(mgf(1:end-4), s(1), s(2), 3);
end
Whilst the ordering from reshape is different compared to your loops, that makes no practical difference given that the vector then gets shuffled; it's more robust, works as expected, and is considerably quicker.
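To see that ordering difference in isolation: reshape walks down the columns first, because MATLAB arrays are stored in column-major order. A tiny example:
A = [1 2 3; 4 5 6];
v = reshape(A, 1, []) %// gives 1 4 2 5 3 6, not 1 2 3 4 5 6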

Convert Excel Column Number to Column Name in Matlab

I am using Excel 2007, which supports up to 16,384 columns. I would like to obtain the column name corresponding to a column number.
Currently I am using the following code. However, it only supports up to 256 columns. Any idea how to obtain the column name if the column number is greater than 256?
function loc = xlcolumn(column)
if isnumeric(column)
    if column>256
        error('Excel is limited to 256 columns! Enter an integer number <256');
    end
    letters = {'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'};
    count = 0;
    if column-26<=0
        loc = char(letters(column));
    else
        while column-26>0
            count = count + 1;
            column = column - 26;
        end
        loc = [char(letters(count)) char(letters(column))];
    end
else
    letters = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'];
    if size(column,2)==1
        loc = findstr(column,letters);
    elseif size(column,2)==2
        loc1 = findstr(column(1),letters);
        loc2 = findstr(column(2),letters);
        loc = (26 + 26*loc1)-(26-loc2);
    end
end
Thanks
As a diversion, here is an all-function-handle example, with (almost) no file-based functions required. This is based on the dec2base function, since Excel column names are (almost) base-26 numbers, with the frustrating difference that there is no "0" digit.
Note: this is probably a terrible idea overall, but it works. Better solutions are probably found elsewhere in the file exchange.
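For contrast, the "no zero digit" wrinkle can also be handled with plain mod/floor arithmetic; here is a minimal loop-based sketch of that idea (a standalone function, separate from the function-handle construction below):
function s = colName(n)
%COLNAME Sketch of bijective base-26 conversion: subtract 1 before taking
%each digit, so the digits run 0..25 and map directly onto 'A'..'Z'
s = '';
while n > 0
    r = mod(n - 1, 26);     % low digit in 0..25
    s = [char('A' + r) s];  % prepend its letter
    n = floor((n - 1)/26);  % continue with the remaining digits
end
end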
First, the one file-based function that I couldn't get around, which performs arbitrary-depth function composition.
function result = compose( fnHandles )
%COMPOSE Compose a set of functions
%   COMPOSE({fnHandles}) returns a function handle consisting of the
%   composition of the cell array of input function handles.
%
%   For example, if F, G, and H are function handles with one input and
%   one output, then:
%       FNCOMPOSED = COMPOSE({F,G,H});
%       y = FNCOMPOSED(x);
%   is equivalent to
%       y = F(G(H(x)));

if isempty(fnHandles)
    result = @(x)x;
elseif length(fnHandles)==1
    result = fnHandles{1};
else
    fnOuter = fnHandles{1};
    fnRemainder = compose(fnHandles(2:end));
    result = @(x)fnOuter(fnRemainder(x));
end
Then, the bizarre, contrived path that converts base-26 values into the correct string:
%Functions leading to "getNumeric", which creates a numeric, base26 array
remapUpper = @(rawBase)(rawBase + (rawBase>='A')*(-55)); %Map the letters 'A'-'P' to [10:25]
reMapLower = @(rawBase)(rawBase + (rawBase<'A')*(-48)); %Map characters '0123456789' to [0:9]
getRawBase = @(x)dec2base(x, 26);
getNumeric = @(x)remapUpper(reMapLower(getRawBase(x)));

%Functions leading to "correctNumeric"
% This replaces zeros with 26, and reduces the next higher entry by 1.
% Similar to "borrowing" as we learned in longhand subtraction
borrowDownFrom = @(x, fromIndex) [x(1:(fromIndex-1)) (x(fromIndex)-1) (x(fromIndex+1)+26) (x((fromIndex+2):end))];
borrowToIfNeeded = @(x, toIndex) (x(toIndex)<=0)*borrowDownFrom(x,toIndex-1) + (x(toIndex)>0)*(x); %Ugly numeric switch
getAllConditionalBorrowFunctions = @(numeric)arrayfun(@(index)@(numeric)borrowToIfNeeded(numeric, index),(2:length(numeric)),'uniformoutput',false);
getComposedBorrowFunction = @(x)compose(getAllConditionalBorrowFunctions(x));
correctNumeric = @(x)feval(getComposedBorrowFunction(x),x);

%Function to replace numerics with letters, and remove the leading '@'
%(a leading zero)
numeric2alpha = @(x)regexprep(char(x+'A'-1),'^@','');

%Compose complete function
num2ExcelName = @(x)arrayfun(@(x)numeric2alpha(correctNumeric(getNumeric(x))), x, 'uniformoutput',false)';
Now test using some stressing transitions:
>> num2ExcelName([1:5 23:28 700:704 727:729 1024:1026 1351:1355 16382:16384])
ans =
'A'
'B'
'C'
'D'
'E'
'W'
'X'
'Y'
'Z'
'AA'
'AB'
'ZX'
'ZY'
'ZZ'
'AAA'
'AAB'
'AAY'
'AAZ'
'ABA'
'AMJ'
'AMK'
'AML'
'AYY'
'AYZ'
'AZA'
'AZB'
'AZC'
'XFB'
'XFC'
'XFD'
This function I wrote works for any number of columns (until Excel runs out of columns). It just requires a column number input (e.g. 16368 will return a string 'XEN').
If the application of this concept differs from my function, it's important to note that the first column consisting of x A's is number 26^(x-1) + 26^(x-2) + ... + 26^2 + 26 + 1 (e.g. 'AAA' begins at 26^2 + 26 + 1 = 703).
function [col_str] = let_loc(num_loc)
test = 2;
old = 0;
x = 0;
while test >= 1
    old = 26^x + old;
    test = num_loc/old;
    x = x + 1;
end
num_letters = x - 1;
% Offset by the first column number that has this many letters, so that
% the remainder is a plain base-26 number with digits 0..25 (this also
% handles names ending in 'Z', which otherwise need a "borrow")
num_loc = num_loc - (old - 26^num_letters);
str_array = blanks(num_letters);
for i = 1:num_letters
    loc = floor(num_loc/(26^(num_letters-i)));
    num_loc = num_loc - (loc*26^(num_letters-i));
    str_array(i) = char(65 + loc);
end
col_str = str_array;
end
Hope this saves someone some time!
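A quick sanity check at the awkward transition points gives something like:
>> arrayfun(@let_loc, [26 27 52 53 702 703 16384], 'uniformoutput', false)
ans =
    'Z'    'AA'    'AZ'    'BA'    'ZZ'    'AAA'    'XFD'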

How do I read N random lines out of a file without storing the file in memory?

I'm familiar with the algorithm for reading a single random line from a file without reading the whole file into memory. I wonder if this technique can be extended to N random lines?
The use case is for a password generator which concatenates N random words pulled out of a dictionary file, one word per line (like /usr/share/dict/words). You might come up with angela.ham.lewis.pathos. Right now it reads the whole dictionary file into an array and picks N random elements from that array. I would like to eliminate the array, or any other in-memory storage of the file, and only read the file once.
(No, this isn't a practical optimization exercise. I'm interested in the algorithm.)
Update:
Thank you all for your answers.
Answers fell into three categories: modifications of the full read algorithm, random seek, or index the lines and seek to them randomly.
The random seek is much faster, and constant with respect to file size, but it distributes based on file size, not on the number of words. It also allows duplicates (they can be avoided, but that makes the algorithm O(inf)). Here's my reimplementation of my password generator using that algorithm. I realize that by reading forward from the seek point, rather than backwards, it has an off-by-one error should the seek fall in the last line. Correcting it is left as an exercise for the editor.
#!/usr/bin/perl -lw

my $Words      = "/usr/share/dict/words";
my $Max_Length = 8;
my $Num_Words  = 4;

my $size = -s $Words;

my @words;
open my $fh, "<", $Words or die $!;
for (1..$Num_Words) {
    seek $fh, int rand $size, 0 or die $!;
    <$fh>;              # discard the rest of the line we landed in
    my $word = <$fh>;   # the next full line is our word
    chomp $word;
    redo if length $word > $Max_Length;
    push @words, $word;
}

print join ".", @words;
And then there's Guffa's answer, which was what I was looking for: an extension of the original algorithm. It's slower, as it has to read the whole file, but it distributes by word, allows filtering without changing the efficiency of the algorithm, and (I think) has no duplicates.
#!/usr/bin/perl -lw

my $Words      = "/usr/share/dict/words";
my $Max_Length = 8;
my $Num_Words  = 4;

my @words;
open my $fh, "<", $Words or die $!;

my $count = 0;
while (my $line = <$fh>) {
    chomp $line;
    $count++;
    if ($count <= $Num_Words) {
        $words[$count-1] = $line;
    }
    elsif (rand($count) <= $Num_Words) {
        $words[rand($Num_Words)] = $line;
    }
}

print join ".", @words;
Finally, the index and seek algorithm has the advantage of distributing by word rather than by file size. The disadvantage is that it reads the whole file, and its memory usage scales linearly with the number of words in the file. Might as well use Guffa's algorithm.
The algorithm is not implemented in a very good and clear way in that example... Some pseudo code that better explains it would be:
cnt = 0
while not end of file {
    read line
    cnt = cnt + 1
    if random(1 to cnt) = 1 {
        result = line
    }
}
As you see, the idea is that you read each line in the file and calculate the probability that the line should be the one chosen. After reading the first line the probability is 100%, after reading the second line the probability is 50%, and so on; by induction, after cnt lines each of them ends up as the current pick with probability 1/cnt.
This can be expanded to picking N items by keeping an array with the size N instead of a single variable, and calculate the probability for a line to replace one of the current ones in the array:
var result[1..N]
cnt = 0
while not end of file {
    read line
    cnt = cnt + 1
    if cnt <= N {
        result[cnt] = line
    } else if random(1 to cnt) <= N {
        result[random(1 to N)] = line
    }
}
Edit:
Here's the code implemented in C#:
public static List<string> GetRandomLines(string path, int count) {
    List<string> result = new List<string>();
    Random rnd = new Random();
    int cnt = 0;
    string line;
    using (StreamReader reader = new StreamReader(path)) {
        while ((line = reader.ReadLine()) != null) {
            cnt++;
            int pos = rnd.Next(cnt);
            if (cnt <= count) {
                result.Insert(pos, line);
            } else {
                if (pos < count) {
                    result[pos] = line;
                }
            }
        }
    }
    return result;
}
I made a test by running the method 100000 times, picking 5 lines out of 20, and counted the occurrences of the lines. This is the result:
25105
24966
24808
24966
25279
24824
25068
24901
25145
24895
25087
25272
24971
24775
25024
25180
25027
25000
24900
24807
As you see, the distribution is as good as you could ever want. :)
(I moved the creation of the Random object out of the method when running the test, to avoid seeding problems as the seed is taken from the system clock.)
Note:
You might want to scramble the order in the resulting array if you want the lines to be randomly ordered. As the first N lines are placed in order in the array, they are not randomly placed if they remain at the end. For example, if N is three or larger and the third line is picked, it will always be at the third position in the array.
Edit 2:
I changed the code to use a List<string> instead of a string[]. That makes it easy to insert the first N items in a random order. I updated the test data from a new test run, so that you can see that the distribution is still good.
Now my Perl is not what it used to be, but trusting the implicit claim in your reference (that the distribution of line numbers thus selected is uniform), it seems this should work:
srand;
(rand($.) < 1 && ($line1 = $_)) || (rand($.) < 1 && ($line2 = $_)) while <>;
Just like the original algorithm, this is one-pass and constant memory.
Edit
I just realized you need N, and not 2. You can repeat the OR-ed expression N times if you know N in advance.
This is just about the first time I've seen Perl code ... it is incredibly unreadable ... ;) But that should not matter. Why don't you just repeat the cryptic line N times?
If I had to write this, I would just seek to a random position in the file, read to the end of that line (the next newline), and then read one full line up to the following newline. Add some error handling for the case where you seek into the last line, repeat all of this N times, and you are done. I guess
srand;
rand($.) < 1 && ($line = $_) while <>;
is the Perl way to do such a single step. You could also read backwards from the initial position up to the previous newline or the beginning of the file, and then read a line forward again. But this doesn't really matter.
UPDATE
I have to admit that seeking somewhere into the file will not generate a perfectly uniform distribution, because of the different line lengths. Whether this fluctuation matters depends on the usage scenario, of course.
If you need a perfectly uniform distribution, you need to read the whole file at least once to get the number of lines. In this case the algorithm given by Guffa is probably the cleverest solution, because it requires reading the file exactly once.
If you don't need to do it within the scope of Perl, shuf is a really nice command-line utility for this. To do what you're looking to do:
$ shuf -n N file > newfile
Quick and dirty bash
function randomLine {
    numlines=`wc -l $1 | awk '{print $1}'`
    t=`date +%s`
    t=`expr $t + $RANDOM`
    a=`expr $t % $numlines + 1`
    RETURN=`head -n $a $1 | tail -n 1`
    return 0
}
randomLine test.sh
echo $RETURN
Pick a random point in the file, look backwards for the previous EOL, search forward for the current EOL, and return the line.
FILE * file = fopen("words.txt", "r");
int fs = filesize("words.txt");
int ptr = rand(fs); // 0 to fs-1
int start = max(ptr - MAX_LINE_LENGTH, 0);
int end = min(ptr + MAX_LINE_LENGTH, fs - 1);
int bufsize = end - start;

fseek(file, start);
char *buf = malloc(bufsize);
read(file, buf, bufsize);

char *startp = buf + ptr - start;
char *finp = buf + ptr - start + 1;

while (startp > buf && *startp != '\n') {
    startp--;
}
while (finp < buf + bufsize && *finp != '\n') {
    finp++;
}

*finp = '\0';
startp++;
return startp;
Lots of off-by-one errors and crap in there, bad memory management, and other horrors. If this actually compiles, you get a nickel. (Please send a self-addressed stamped envelope and $5 handling to receive your free nickel.)
But you should get the idea.
Longer lines statistically have a higher chance of being selected than shorter lines. But the run time of this is effectively constant regardless of file size. If you have a lot of words of mostly similar length, the statisticians won't be happy (they never are anyway), but in practice it will be close enough.
I'd say:
Read the file and count the occurrences of \n. That's the number of lines; let's call it L
Store their positions in a small array in memory
Pick two random line numbers lower than L, fetch their offsets, and you're done
You'd use just a small array and read the whole file once + 2 lines afterwards.
You could do a 2-pass algorithm. First get the position of each newline, pushing those positions into a vector. Then pick a random item in that vector; call it i.
Read from the file at position v[i] to v[i+1] to get your line.
During the first pass you read the file with a small buffer, so as not to read it all into RAM at once.
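Since the password generator was the motivating example, here is a minimal MATLAB sketch of this two-pass index-and-seek idea (the dictionary path and word count are placeholders):
%// Pass 1: record the byte offset at which each line starts
fid = fopen('/usr/share/dict/words', 'r');
offsets = [];
pos = 0;
line = fgetl(fid);
while ischar(line)
    offsets(end+1) = pos; %#ok<AGROW>
    pos = ftell(fid);
    line = fgetl(fid);
end
%// Pass 2: seek to N random line starts and read those words
N = 4;
picks = offsets(randperm(numel(offsets), N));
words = cell(1, N);
for k = 1:N
    fseek(fid, picks(k), 'bof');
    words{k} = fgetl(fid);
end
fclose(fid);
disp(strjoin(words, '.'))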
