Can't find the error in my printf function - gcc

The following is the code of the function:
void printf(char *ch,void *num,...)
{
int i;
va_list ptr; //to store variable length argument list
va_start(ptr,num); // initialise ptr
for(i=0;ch[i]!='\0';i++)
{
if(ch[i]=='%') // check for % sign in print statement
{ i++;
if( ch[i]=='d')
{
int *no = (int *)va_arg(ptr,int * );
int value=*no; // just used for nothing
printno(value); //print int number
}
if( ch[i]=='u')
{
unsigned long *no =(unsigned long *) va_arg(ptr,unsigned long *);
unsigned long value=*no;
printuno(value); //print unsigned long
}
}
else // if not % sign then its regular character so print it
{
printchar(ch[i]);
}
}
}
This is my code for printf() to print int and unsigned int values.
It works fine for the literal text in the format string, but for %d and %u it shows the same value for every variable: 405067, even though the variables hold different values.
Please tell me how to fix this.

Why are you interpreting the argument as a pointer? I'm surprised you aren't crashing. You should just be using
int num = va_arg(ptr,int);
printno(num);
and
unsigned int num = va_arg(ptr,unsigned int);
printuno(num);
(note, unsigned int, not unsigned long, because that would actually be %lu)
Also, get rid of the num parameter. It's wrong. Your va_list should be initialized as
`va_start(ptr, ch);`
va_start() takes the last argument before the varargs, not the first argument.

As noted in a comment, the C99 prototype for printf() is:
int printf(const char * restrict format, ...);
Therefore, if you're calling your function printf(), you should probably follow its design. I'm going to ignore flags, field width, precision and length modifiers, assuming that the conversion specifiers are simply two characters each, such as %d or %%.
int printf(const char * restrict format, ...)
{
va_list args;
va_start(args, format);
char c;
int len = 0;
while ((c = *format++) != '\0')
{
if (c != '%')
{
putchar(c);
len++;
}
else if ((c = *format++) == '%')
{
putchar(c);
len++;
}
else if (c == 'd')
{
int value = va_arg(args, int);
len += printno(value);
}
else if (c == 'u')
{
unsigned value = va_arg(args, unsigned);
len += printuno(value);
}
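else if (c == '\0')
{
/* Lone '%' at the end of the format: print it and stop before reading past the terminator */
putchar('%');
len++;
break;
}
else
{
/* Print unrecognized formats verbatim */
putchar('%');
putchar(c);
len += 2;
}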
}
va_end(args); /* every va_start() must be paired with va_end() */
return len;
}
Dealing with the full set of format specifiers (especially if you add the POSIX n$ notation as well as flags, field width, precision and length modifiers) is much harder, but this should get you moving in the correct direction. Note that I assume the printno() and printuno() functions both report how many characters were written for the conversion specifier. The function returns the total number of characters written. Note, too, that production code would need to allow for the called functions to fail, and would therefore probably not use the len += printno(value); notation, but would capture the return from printno() into a separate variable that could be tested for an error before adding it to the total length output.
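The error-checking pattern described above can be sketched as follows. This is only an illustration under stated assumptions: mini_format(), append_int() and append_unsigned() are hypothetical names, the output goes to a std::string instead of the terminal so the result can be inspected, and the helpers signal failure with a negative return value (these particular helpers never fail, but the caller checks anyway, as production code would).

```cpp
#include <cstdarg>
#include <string>

// Hypothetical helper: appends value in decimal and returns the number of
// characters appended (a negative return would signal failure).
static int append_unsigned(std::string& out, unsigned value) {
    char digits[32];
    int n = 0;
    do {  // collect digits, least significant first
        digits[n++] = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    for (int i = n - 1; i >= 0; --i) out.push_back(digits[i]);
    return n;
}

// Hypothetical helper: signed variant. The negation is done in unsigned
// arithmetic so that INT_MIN does not overflow.
static int append_int(std::string& out, int value) {
    if (value >= 0) return append_unsigned(out, static_cast<unsigned>(value));
    out.push_back('-');
    int n = append_unsigned(out, 0u - static_cast<unsigned>(value));
    return n < 0 ? n : n + 1;
}

// Appends the formatted text to out. Returns the number of characters
// appended, or -1 if a helper reported failure. Only %d, %u and %% are known.
int mini_format(std::string& out, const char* format, ...) {
    va_list args;
    va_start(args, format);  // format is the last named parameter
    int len = 0;
    char c;
    while ((c = *format++) != '\0') {
        int written = 1;
        if (c != '%') {
            out.push_back(c);
        } else if ((c = *format++) == '%') {
            out.push_back('%');
        } else if (c == 'd') {
            written = append_int(out, va_arg(args, int));
        } else if (c == 'u') {
            written = append_unsigned(out, va_arg(args, unsigned));
        } else if (c == '\0') {
            out.push_back('%');  // lone '%' at the end of the format
            ++len;
            break;               // do not read past the terminator
        } else {
            out.push_back('%');  // unrecognized format: copy it verbatim
            out.push_back(c);
            written = 2;
        }
        if (written < 0) {       // test the helper's result before adding it
            va_end(args);
            return -1;
        }
        len += written;
    }
    va_end(args);
    return len;
}
```

Calling mini_format(s, "a=%d b=%u %%", -42, 7u) on an empty string s leaves "a=-42 b=7 %" in s and returns 11.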

Related

How to return multiple values (vector and one int value) through function

I am finding connected components of a graph.
Condition : Those components should not be printed in same function, but they should be printed in calling function ( i.e. int main() )
I have gone through the linked question "Returning multiple values from a C++ function", but I got some errors.
tuple <vector<int>&, int > connected( vector<int>& store, ...)
{
int component_size = 0;
// code
return make_tuple ( store, component_size);
}
int main()
{
// code
for( int i = 0 ; i < V; i++ )
{
if( !visited[i])
{
tie( ans, compo_size ) = connected(edges,
visited, myQ, store, V, i);
for(int i = 0 ; i < compo_size; i++ )
{
cout<< ans[i] <<" ";
}
}
}
}
There are a few errors:
error: could not convert 'std::make_tuple(_Elements&& ...) [with _Elements = {std::vector<int, std::allocator<int> >&, int&}](component_size)' from 'std::tuple<std::vector<int, std::allocator<int> >, int>' to 'std::tuple<std::vector<int, std::allocator<int> >&, int>'
return make_tuple ( store, component_size);
^
error: invalid initialization of reference of type 'std::vector<int>&' from expression of type 'std::vector<int>'
tie( ans, compo_size ) = connected(edges, visited, myQ, store, V, i);
How to return multiple values (vector and one int value) through function
A function can have at most one return value.
Returning more objects can be emulated by either
modifying one or more objects that are global or are referenced by arguments through indirection or by
returning an object of class type that has multiple sub objects.
You've attempted the latter approach through the use of tuple class template. The reason it doesn't work is explained in the documentation:
template< class... Types >
tuple<VTypes...> make_tuple( Types&&... args );
For each Ti in Types..., the corresponding type Vi in VTypes... is std::decay<Ti>::type unless application of std::decay results in std::reference_wrapper<X> for some type X, in which case the deduced type is X&.
As such, your invocation of make_tuple is deduced to return tuple <vector<int>, int > which is wrong because the function is supposed to return tuple <vector<int>&, int > instead. This can be fixed using std::ref so that the correct type is deduced:
std::make_tuple(std::ref(store), component_size);
As eerorika mentioned, you could use std::ref() as follows:
std::tuple <std::vector<int>&, int > connected( std::vector<int>& store, ...)
{
int component_size = 0;
// code
return std::make_tuple ( std::ref(store), component_size);
}
However, there is really no point in returning a reference to the input vector since it is already a non-const reference on input. So changing the vector in place is going to be enough. On return you get a modified version. However, that's probably not what you are looking to do (i.e. you probably wanted to make a copy of store and return the copy with the other arrays appended...)
That also means you're going to have yet another copy when you create the tuple:
std::tuple <std::vector<int>, int > connected( std::vector<int>& store, ...)
{
int component_size = 0;
std::vector<int> result;
// or maybe a straight copy, depends on your needs in "code"
//std::vector<int> result(store);
// code
return std::make_tuple ( result, component_size);
}
As mentioned by others, having a result in the list of arguments is probably your best bet:
int connected( std::vector<int> & result, std::vector<int> const & store, ...)
{
int component_size = 0;
// code
return component_size;
}
Also, wouldn't component_size == result.size() be true? If so, you should not return anything because it's going to be more confusing.
That simplifies the function to this point:
void connected( std::vector<int> & result, std::vector<int> const & store, ...)
{
// code
}
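As a rough sketch of that last form, the function below fills an output vector with one connected component using a breadth-first search. The parameter list is an assumption made for illustration (an adjacency list adj, a visited array shared between calls, and a start vertex); the question elides the real parameters.

```cpp
#include <queue>
#include <vector>

// Hypothetical signature: collects the component containing start into result.
// visited is shared across calls so each vertex is reported exactly once.
void connected(std::vector<int>& result,
               const std::vector<std::vector<int>>& adj,
               std::vector<bool>& visited,
               int start) {
    result.clear();
    std::queue<int> pending;
    visited[start] = true;
    pending.push(start);
    while (!pending.empty()) {
        int v = pending.front();
        pending.pop();
        result.push_back(v);
        for (int w : adj[v]) {
            if (!visited[w]) {  // enqueue each neighbour only once
                visited[w] = true;
                pending.push(w);
            }
        }
    }
}
```

The caller loops over all vertices, calls connected() for every vertex that is not yet visited, and prints result; result.size() gives the component size, so no separate count has to be returned.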

Is this union compatible with strict-aliasing rules?

Is it OK to use both parts of a union if you know the parts don't overlap? Like in this example, is it OK to use both buf[31] as well as ps?
struct PtrSize {
const char *data;
size_t size;
};
class SmallStringOrNot {
union {
PtrSize ps;
char buf[32];
} pb;
public:
bool IsSmallString() const {
return pb.buf[31] != 0;
}
SmallStringOrNot(const char *str) {
size_t len = strlen(str);
if (len && len < 31) {
memcpy(pb.buf, str, len);
pb.buf[31] = len;
} else {
pb.ps.data = str;
pb.ps.size = len;
pb.buf[31] = 0; // is this OK, accessing buf right after ps?
}
}
PtrSize AsPtrSize() const {
if (IsSmallString()) {
return PtrSize{pb.buf, pb.buf[31]};
} else {
return pb.ps;
}
}
};
Unfortunately the code is not guaranteed to be OK. You are at least not in "undefined behaviour" territory, since in C++ it is always legal to access a union through a char member, but you have no guarantee that by modifying buf[31] you are not altering ps.data or ps.size. On a 128-bit machine you would almost surely be doing so.
On more normal architectures your code should be fine, but for a 100% guarantee you should refer to the compiler documentation, since size_t could in principle be bigger than a void*. For example, even on a 64-bit machine you could theoretically have a 192-bit ps.size member (which, added to the 64 bits of the ps.data pointer, would make the PtrSize completely overlap the buffer).
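One way to turn that assumption into a compile-time check is a static_assert on the sizes. This is a minimal sketch: the union is given the hypothetical name PtrSizeOrBuf so that its members can be named outside the class, and the check only covers the overlap concern raised here.

```cpp
#include <cstddef>

struct PtrSize {
    const char* data;
    std::size_t size;
};

// Hypothetical named version of the anonymous union from the question.
union PtrSizeOrBuf {
    PtrSize ps;
    char buf[32];
};

// Both members start at offset 0, so ps occupies bytes [0, sizeof(PtrSize)).
// The tag byte buf[31] is untouched by ps only if sizeof(PtrSize) <= 31.
static_assert(sizeof(PtrSize) <= sizeof(PtrSizeOrBuf::buf) - 1,
              "PtrSize overlaps the tag byte buf[31]");
```

On a platform where the pointer and size_t together reach 32 bytes, the build fails instead of silently corrupting ps.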

All of the options to replace an unknown number of characters

I am trying to find an algorithm that for an unknown number of characters in a string, produces all of the options for replacing some characters with stars.
For example, for the string "abc", the output should be:
*bc
a*c
ab*
**c
*b*
a**
***
It is simple enough with a known number of stars (just run through all of the options with for loops), but I'm having difficulty generating all of the options.
Every star combination corresponds to a binary number, so you can use a simple loop
for i = 1 to 2^n-1
where n is string length
and set stars to the positions of 1-bits of binary representations of i
for example: i=5=101b => * b *
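The counting loop above could look like this in C++. It is a sketch that assumes the string is shorter than 64 characters so the counter fits in an unsigned 64-bit integer; star_variants is a name chosen for illustration.

```cpp
#include <string>
#include <vector>

// Returns every variant of s with at least one character replaced by '*'.
// Assumes s.size() < 64 so that the counter fits in unsigned long long.
std::vector<std::string> star_variants(const std::string& s) {
    std::vector<std::string> out;
    const std::size_t n = s.size();
    for (unsigned long long mask = 1; mask < (1ULL << n); ++mask) {
        std::string t = s;
        for (std::size_t pos = 0; pos < n; ++pos) {
            // The most significant of the n bits maps to the first character,
            // so mask 5 = 101b turns "abc" into "*b*".
            if (mask & (1ULL << (n - 1 - pos))) t[pos] = '*';
        }
        out.push_back(t);
    }
    return out;
}
```

For "abc" this yields the 7 variants in counting order: ab*, a*c, a**, *bc, *b*, **c, ***.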
This is basically a binary increment problem.
You can create a vector of integer variables to represent a binary array isStar and for each iteration you "add one" to the vector.
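bool AddOne (int* isStar, int size) {
    isStar[size - 1] += 1;                  // increment the least significant digit
    for (int i = size - 1; i >= 0; i--) {   // propagate carries towards index 0
        if (isStar[i] > 1) {
            if (i == 0) { return true; }    // carry out of the top digit: all combinations done
            isStar[i] = 0;
            isStar[i - 1] += 1;
        }
    }
    return false;
}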
That way you still have the original string while replacing the characters
This is a simple binary counting problem, where * corresponds to a 1 and the original letter to a 0. So you could do it with a counter, applying a bit mask to the string, but it's just as easy to do the "counting" in place.
Here's a simple implementation in C++:
(Edit: The original question seems to imply that at least one character must be replaced with a star, so the count should start at 1 instead of 0. Or, in the following, the post-test do should be replaced with a pre-test for.)
#include <iostream>
#include <string>
// A cleverer implementation would implement C++'s iterator protocol.
// But that would cloud the simple logic of the algorithm.
class StarReplacer {
public:
StarReplacer(const std::string& s): original_(s), current_(s) {}
const std::string& current() const { return current_; }
// returns true unless we're at the last possibility (all stars),
// in which case it returns false but still resets current to the
// original configuration.
bool advance() {
for (int i = current_.size()-1; i >= 0; --i) {
if (current_[i] == '*') current_[i] = original_[i];
else {
current_[i] = '*';
return true;
}
}
return false;
}
private:
std::string original_;
std::string current_;
};
int main(int argc, const char** argv) {
for (int a = 1; a < argc; ++a) {
StarReplacer r(argv[a]);
do {
std::cout << r.current() << std::endl;
} while (r.advance());
std::cout << std::endl;
}
return 0;
}

Converting lower/upper case letters without ctype.h

I just saw that this could technically work. The only mistake I couldn't resolve is the extra character that gets printed at the end every time I test it. I also tested this without using the name variable: subtracting 32 from any lower case ASCII letter should give me its upper case one, and it does. But I'm curious why I'm getting an additional char, which from what I see on screen is apparently Û.
#include <stdio.h>
main()
{
char name[22];
int i;
fputs("Type your name ",stdout);
fgets(name,22,stdin);
for (i = 0; name[i] != '\0'; i = i + 1)
printf("%c",(name[i])-32); /*This will convert lower case to upper */
/* using as reference the ASCII table*/
fflush(stdin);
getchar();
}
Perhaps there is a line break character at the end of the string: fgets() stores the newline that ends the input line, and subtracting 32 from it produces a character that is not a letter.
You can check the character code, so that you only convert characters that actually are lower case letters:
for (i = 0; name[i] != '\0'; i = i + 1) {
    char c = name[i];
    if (c >= 97 && c <= 122) { /* ASCII 'a'..'z' */
        c -= 32;
    }
    printf("%c", c);
}
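The same range check can be packaged as a small function. This is a sketch with a hypothetical name, to_upper_ascii, and it assumes plain ASCII input. It leaves the trailing '\n' that fgets() keeps untouched; subtracting 32 from that newline (code 10) gives the byte 0xEA, which would explain the Û on a console that uses a code page such as CP850.

```cpp
#include <string>

// Converts ASCII 'a'..'z' to upper case and leaves every other byte,
// including a trailing '\n', unchanged.
std::string to_upper_ascii(const std::string& text) {
    std::string result = text;
    for (char& c : result) {
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>(c - ('a' - 'A'));  // 'a' - 'A' is 32 in ASCII
        }
    }
    return result;
}
```

For example, to_upper_ascii("john doe\n") returns "JOHN DOE\n".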
void read_chararray(char in_array[], int* Length)
{
    int Indx = 0, Indx2 = 0, Indx3 = 0; // Indx/Indx2: input positions, Indx3: next output position
    int cinput = 0; // int rather than char so that EOF from getchar() can be detected
    char word[255] = { 0 }; // raw input before punctuation is removed
    for (Indx = 0; Indx < 254 && (cinput = getchar()) != '\n' && cinput != EOF; Indx++) { // Read until <enter>, EOF, or a full buffer
        word[Indx] = (char)cinput; // Store the raw character
    }
    Indx2 = Indx; // Number of characters read
    for (Indx = 0; Indx < Indx2; Indx++) { // Convert upper case to lower case and drop everything that is not a letter
        cinput = word[Indx];
        if (cinput >= 65 && cinput <= 90) { // ASCII 65..90 are the upper case letters
            cinput += 32; // Add 32 to shift into the lower case range of the ASCII table
            in_array[Indx3++] = (char)cinput; // Store at the next free output position
        }
        else if (cinput >= 97 && cinput <= 122) // ASCII 97..122 are the lower case letters; punctuation and whitespace are skipped
            in_array[Indx3++] = (char)cinput; // Store at the next free output position
    }
    *Length = Indx3; // Number of characters stored in in_array
}
#include<stdio.h>
void upper(char);
int main(void)
{
    char ch;
    printf("\nEnter the character in lower case");
    scanf("%c", &ch);
    upper(ch);
    return 0;
}
void upper( char c)
{
    if (c >= 'a' && c <= 'z') /* only lower case letters are shifted */
        c -= 32;
    printf("\nUpper Case: %c", c);
}

How do I reverse a UTF-8 string in place?

Recently, someone asked about an algorithm for reversing a string in place in C. Most of the proposed solutions had troubles when dealing with non single-byte strings. So, I was wondering what could be a good algorithm for dealing specifically with utf-8 strings.
I came up with some code, which I'm posting as an answer, but I'd be glad to see other people's ideas or suggestions. I preferred to use actual code, so I've chosen C#, as it seems to be one of the most popular languages on this site, but I don't mind if your code is in another language, as long as it can be reasonably understood by anyone who is familiar with an imperative language. And, as this is intended to show how such an algorithm could be implemented at a low level (by low level I just mean dealing with bytes), the idea is to avoid using libraries for the core code.
Notes:
I'm interested in the algorithm itself, its performance and how it could be optimized (I mean algorithm-level optimization, not replacing i++ with ++i and such; I'm not really interested in actual benchmarks either).
I don't mean to actually use it in production code or to "reinvent the wheel". This is just out of curiosity and as an exercise.
I'm using C# byte arrays, so I'm assuming you can get the length of the string without running through the string until you find a NUL.
That is, I'm not accounting for the complexity of finding the length of the string. But if you're using C, for instance, you could factor that out by using strlen() before calling the core code.
Edit:
As Mike F points out, my code (and other people's code posted here) is not dealing with composite characters. Some info about those here. I'm not familiar with the concept, but if that means that there are "combining characters", i.e., characters / code points that are only valid in combination with other "base" characters / code points, a look-up table of such characters could be used to preserve the order of the "global" character ("base" + "combining" characters) when reversing.
I'd make one pass reversing the bytes, then a second pass that reverses the bytes in any multibyte characters (which are easily detected in UTF8) back to their correct order.
You can definitely handle this in line in a single pass, but I wouldn't bother unless the routine became a bottleneck.
This code assumes that the input UTF-8 string is valid and well formed (i.e. at most 4 bytes per multibyte character):
#include <string.h>
void utf8rev(char *str)
{
/* this assumes that str is valid UTF-8 */
char *scanl, *scanr, *scanr2, c;
/* first reverse the string */
for (scanl= str, scanr= str + strlen(str); scanl < scanr;)
c= *scanl, *scanl++= *--scanr, *scanr= c;
/* then scan all bytes and reverse each multibyte character */
for (scanl= scanr= str; c= *scanr++;) {
if ( (c & 0x80) == 0) // ASCII char
scanl= scanr;
else if ( (c & 0xc0) == 0xc0 ) { // start of multibyte
scanr2= scanr;
switch (scanr - scanl) {
case 4: c= *scanl, *scanl++= *--scanr, *scanr= c; // fallthrough
case 3: // fallthrough
case 2: c= *scanl, *scanl++= *--scanr, *scanr= c;
}
scanr= scanl= scanr2;
}
}
}
// quick and dirty main for testing purposes
#include <stdio.h>
int main(int argc, char* argv[])
{
char buffer[256];
buffer[sizeof(buffer)-1]= '\0';
while (--argc > 0) {
strncpy(buffer, argv[argc], sizeof(buffer)-1); // don't overwrite final null
printf("%s → ", buffer);
utf8rev(buffer);
printf("%s\n", buffer);
}
return 0;
}
If you compile this program (example name: so199260.c) and run it in a UTF-8 environment (a Linux installation in this case):
$ so199260 γεια και χαρά français АДЖИ a♠♡♢♣b
a♠♡♢♣b → b♣♢♡♠a
АДЖИ → ИЖДА
français → siaçnarf
χαρά → άραχ
και → ιακ
γεια → αιεγ
If the code is too cryptic, I will happily clarify.
Agree that your approach is the only sane way to do it in-place.
Personally I don't like revalidating UTF8 inside every function that deals with it, and generally only do what's needed to avoid crashes; it adds up to a lot less code. Dunno much C# so here it is in C:
(edited to eliminate strlen)
void reverse( char *start, char *end )
{
while( start < end )
{
char c = *start;
*start++ = *end;
*end-- = c;
}
}
char *reverse_char( char *start )
{
char *end = start;
while( (end[1] & 0xC0) == 0x80 ) end++;
reverse( start, end );
return( end+1 );
}
void reverse_string( char *string )
{
char *end = string;
while( *end ) end = reverse_char( end );
reverse( string, end-1 );
}
My initial approach could be summarized this way:
1) Reverse bytes naively
2) Run the string backwards and fix the utf8 sequences as you go.
Illegal sequences are dealt with in the second step and in the first step, we check if the string is in "sync" (that is, if it starts with a legal leading byte).
EDIT: improved validation for leading byte in Reverse()
class UTF8Utils {
public static void Reverse(byte[] str) {
int len = str.Length;
int i = 0;
int j = len - 1;
// first, check if the string is "synced", i.e., it starts
// with a valid leading character. Will check for illegal
// sequences thru the whole string later.
if(len == 0) return; // nothing to reverse, and str[0] would be out of range
byte leadChar = str[0];
// if it starts with 10xx xxxx, it's a trailing char...
// if it starts with 1111 10xx or 1111 110x
// it's out of the 4 bytes range.
// EDIT: added validation for 7 bytes seq and 0xff
if( (leadChar & 0xc0) == 0x80 ||
(leadChar & 0xfc) == 0xf8 ||
(leadChar & 0xfe) == 0xfc ||
(leadChar & 0xff) == 0xfe ||
leadChar == 0xff) {
throw new Exception("Illegal UTF-8 sequence");
}
// reverse bytes in-place naïvely
while(i < j) {
byte tmp = str[i];
str[i] = str[j];
str[j] = tmp;
i++;
j--;
}
// now, run the string again to fix the multibyte sequences
UTF8Utils.ReverseMbSequences(str);
}
private static void ReverseMbSequences(byte[] str) {
int i = str.Length - 1;
byte leadChar = 0;
int nBytes = 0;
// loop backwards thru the reversed buffer
while(i >= 0) {
// since the first byte in the unreversed buffer is assumed to be
// the leading char of that byte, it seems safe to assume that the
// last byte is now the leading char. (Given that the string is
// not out of sync -- we checked that out already)
leadChar = str[i];
// check how many bytes this sequence takes and validate against
// illegal sequences
if(leadChar < 0x80) {
nBytes = 1;
} else if((leadChar & 0xe0) == 0xc0) {
if((str[i-1] & 0xc0) != 0x80) {
throw new Exception("Illegal UTF-8 sequence");
}
nBytes = 2;
} else if ((leadChar & 0xf0) == 0xe0) {
if((str[i-1] & 0xc0) != 0x80 ||
(str[i-2] & 0xc0) != 0x80 ) {
throw new Exception("Illegal UTF-8 sequence");
}
nBytes = 3;
} else if ((leadChar & 0xf8) == 0xf0) {
if((str[i-1] & 0xc0) != 0x80 ||
(str[i-2] & 0xc0) != 0x80 ||
(str[i-3] & 0xc0) != 0x80 ) {
throw new Exception("Illegal UTF-8 sequence");
}
nBytes = 4;
} else {
throw new Exception("Illegal UTF-8 sequence");
}
// now, reverse the current sequence and then continue
// with the next one
int back = i;
int front = back - nBytes + 1;
while(front < back) {
byte tmp = str[front];
str[front] = str[back];
str[back] = tmp;
front++;
back--;
}
i -= nBytes;
}
}
}
The best solution:
Convert to a wide char string
Reverse the new string
Never, never, never, never treat single bytes as characters.
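A sketch of that decode, reverse, re-encode approach is shown below, using a vector of char32_t code points as the "wide" string. It is not in place, it assumes the input is well-formed UTF-8 (no validation is done), and it reverses code points, so combining characters are not kept with their base characters. All function names are chosen for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Decodes well-formed UTF-8 into code points. No validation is performed.
static std::vector<char32_t> decode_utf8(const std::string& s) {
    std::vector<char32_t> cps;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        // Payload bits of the lead byte: 7, 5, 4 or 3 of them.
        char32_t cp = extra == 0 ? lead
                                 : static_cast<char32_t>(lead & (0x3F >> extra));
        for (int k = 1; k <= extra && i + k < s.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        }
        cps.push_back(cp);
        i += static_cast<std::size_t>(extra) + 1;
    }
    return cps;
}

// Appends one code point to out as 1 to 4 UTF-8 bytes.
static void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Returns a copy of s with its code points in reverse order.
std::string reverse_utf8(const std::string& s) {
    std::vector<char32_t> cps = decode_utf8(s);
    std::reverse(cps.begin(), cps.end());
    std::string out;
    for (char32_t cp : cps) append_utf8(out, cp);
    return out;
}
```

The trade-off against the in-place versions is extra memory (four bytes per code point) in exchange for a reversal step that never sees partial characters.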

Resources