lldb memory read with count from variable - debugging

Is it possible to use a variable as the count in a "memory read" lldb command?
A minimal example: With a breakpoint at the return statement of the following C program
#include <stdio.h>
#include <string.h>
int main(int argc, const char * argv[]) {
char *str = "Hello";
size_t len = strlen(str);
return 0; // <-- Breakpoint here
}
I can dump the contents of the string variable with
(lldb) memory read --count 5 str
0x100000fae: 48 65 6c 6c 6f Hello
but not with
(lldb) memory read --count len str
error: invalid uint64_t string value: 'len'
How can I use the value of the len variable as the count of the "memory read" command?

lldb's command line doesn't have much syntax, but one useful bit that it does have is that if you surround an argument or option value in backticks, the string inside the backticks gets passed to the expression parser, and the result of the expression evaluation gets substituted for the backtick value before being passed to the command. So you want to do:
(lldb) memory read --count `len` str


Why does fscanf read garbage?

#include <stdio.h>
#include <windows.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#define PATH "F:\\c\\projects\\Banking Management System\\data\\"
#define F_ACCT "accounts.txt"
#define FILENAME(file) PATH file
#define F_ACCT_FPRINTF "%05d%-8s%-30s%d%d%d%-20s%-20s%-20s%c%-15.2lf\n"
#define F_ACCT_FSCANF "%05d%8s%30[^\n]%d%d%d%20[^\n]%20[^\n]%20[^\n]%c%lf\n"
typedef struct Date
{
int dd;
int mm;
int ccyy;
} Date;
typedef struct Account
{
int id;
char acct_no[8];
char name[30];
Date birthday;
char telephone_no[20];
char mobile_no[20];
char tfn[20];
char acct_type; // 'S' - Saving | 'C' - Current | Fixed - 'F' | Recurring - 'R'
double acct_bal;
} Account;
int main(int argc, char *argv[])
{
Account *ac_t=malloc(sizeof(Account));
if (ac_t==NULL)
{
free(ac_t);
perror("Fatal error: ");
exit(EXIT_FAILURE);
}
FILE *fp=fopen(FILENAME(F_ACCT),"a+"); // Save option selected by the user
if (!fp) // fopen returns NULL on failure, so !fp is true
{
free(ac_t);
perror("ERROR:");
exit(EXIT_FAILURE);
}
int tmp = (fscanf(fp,F_ACCT_FSCANF,\
&ac_t->id,\
ac_t->acct_no,\
ac_t->name,\
&ac_t->birthday.dd,\
&ac_t->birthday.mm,\
&ac_t->birthday.ccyy,\
ac_t->telephone_no,\
ac_t->mobile_no,\
ac_t->tfn,\
&ac_t->acct_type,\
&ac_t->acct_bal));
printf("\ntmp=%d", tmp);
printf("\n[%d]",ac_t->id);
printf("\n[%s]",ac_t->acct_no);
printf("\n[%s]",ac_t->name);
printf("\n[%d]",ac_t->birthday.dd);
printf("\n[%d]",ac_t->birthday.mm);
printf("\n[%d]",ac_t->birthday.ccyy);
printf("\n[%s]",ac_t->telephone_no);
printf("\n[%s]",ac_t->mobile_no);
printf("\n[%s]",ac_t->tfn);
printf("\n[%c]",ac_t->acct_type);
printf("\n[%lf]",ac_t->acct_bal);
system("pause");
free(ac_t);
return 0;
}
=========================================================================================
Input file (accounts.txt)
=========================
000011000 Anil Dhar 27111960(02) 8883 2827 0408 942 407 111222333 S 100.21
Note: The record was created successfully using fprintf() as per F_ACCT_FPRINTF.
**Problem**
=======
fscanf is reading garbage values like this:
ac_t->id 1
t_acct_no
name Anil Dhar
birthday.dd 27
birthday.mm 11
birthday.ccyy 1960
telephone_no (02) 8883 2827 0408 942 407 111222333 Sogram Files\Intel\ip¬tαK4
mobile_no 0408 942 407 111222333 Sogram Files\Intel\ip¬tαK4
tfn 111222333 Sogram Files\Intel\ip¬tαK4
t_acct_type
acct_bal 74895632819821970000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000
All my string variables like name, telephone_no, mobile_no, tfn could contain spaces.
The record is not delimited with anything. My fscanf() is not populating the fields properly wherever I am reading the string variables.
What could have gone wrong????
%20[^\n] reads everything up to the end of the line (up to the field width). Read up to a space instead, for example. Note that all conversion specifiers except %[, %c and %n skip leading whitespace (spaces, tabs, newlines), and that any whitespace character in the format string matches zero or more whitespace characters in the input; see the scanf documentation. I would sprinkle spaces into the format anyway to make it more readable. Do not add a trailing \n. Check the return value of scanf instead of leaving it unchecked, because scanning may fail. Finally, the maximum field width must be at most one less than the buffer size (or the buffers must grow by one) to leave room for the terminating zero byte.
"%d %7[^ ] %29[^ ] %d %d %d %19[^ ] %19[^ ] %19[^ ] %c %lf"
%s reads whitespace delimited strings, while %[^\n] reads up to the next newline. It appears that you want a specific number of characters, so you want %c. So you should have something more like:
#define F_ACCT_FSCANF "%5d%8c%30c%2d%2d%4d%20c%20c%20c%c%lf"
I'm not sure if this is exactly correct, because your printing format is somewhat ambiguous (using %d rather than %02d means you don't know how many digits you'll get). It also will go off the rails badly if your input file had its spacing modified in any way, so you might want to use fgets+sscanf rather than fscanf, as that will at least allow you to resynchronize after any corrupted line.
One thing to be careful of -- %8c will read exactly 8 characters into the buffer provided as an argument with NO terminating NUL -- if you want NUL terminated strings, you'll need to arrange for that manually.

Detecting uninitialized variables in gcc

The following (broken) example code uses sscanf to convert an input string to an integer, and is meant to let an empty string through as 0:
#include <stdio.h>
#include <string.h>
int parse(const char *p)
{
int value; // whoops! forgot to 0 here.
if (*p && sscanf(p, "%d", &value) != 1)
return -1;
return value;
}
int main(int argc, char **argv)
{
const char *p = argc > 1 ? argv[1] : "";
int value = parse(p);
printf("value = %d\n", value);
}
but it compiles clean with -Wall.
https://godbolt.org/z/sPbx99Ms5
but fails with obvious problems:
$ ./test 123
value = 123
$ ./test xxx
value = -1
$ ./test ""
value = 32765
I realise it's quite hard for the compiler to work out that sscanf might not fill in a value, as it can't see the code, but is there a flag or a scanner I could run on a (large) body of code to try to find places where uninitialized variables are being passed by pointer?
... even something that would just find all uninitialized variables would be useful.

Unexpected value appears on stack when attempting buffer overflow

I am trying to learn more about cyber security, in this case about buffer overflows. I have a simple program whose control flow I want to change:
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
void win()
{
printf("code flow successfully changed\n");
}
int main(int argc, char **argv)
{
volatile int (*fp)();
char buffer[64];
fp = 0;
gets(buffer);
if(fp) {
printf("calling function pointer, jumping to 0x%08x\n", fp);
fp();
}
}
Using some tools, I have determined that the function pointer (fp) gets its value updated after 72 characters have been written to the buffer. The function win() is located at value 0xe5894855, so after 72 characters I need to supply that value for the program to jump to the desired function.
However I am facing this issue:
By feeding the output of Python 3's print("A"*18*4 + "UH" + "\x89" + "\xe5") to the program's input, I expected to get the desired value 0xe5894855 in the section marked in red. Instead, I am getting the highlighted malformed bytes from somewhere: the 89 gets an extra C2 in front of it, and an incorrect e5 value overflows into the next part of the stack. (The values in those parts of the stack are zero initially, but change to this once the overflow is attempted.)
Why is this happening? Am I putting hex values into C program incorrectly?
Edit: Still have not figured out why passing hex through python did not work, but I found a different method, by using Perl: perl -e 'print "A"x4x18 . "\x55\x48\x89\xe5"', which did work, and address I needed to jump to was also incorrect (which I also fixed)

minimal typing command line calculator - tcsh vs bash

I like to have a command-line calculator handy. The requirements are:
Support all the basic arithmetic operators: +, -, /, *, ^ for exponentiation, plus parentheses for grouping.
Require minimal typing: I don't want to have to start a program, interact with it, and then ask it to exit.
Ideally only one character and a space in addition to the expression itself should be entered into the command line.
It should know how to ignore commas and dollar (or other currency) symbols in numbers, to allow me to copy/paste from the web without having to clean every number before pasting it into the calculator.
Be white-space tolerant, presence or lack of spaces shouldn't cause errors
No need for quoting anything in the expression to protect it from the shell - again for the benefit of minimal typing
Since tcsh supports alias positional arguments, and since alias expansion precedes all other expansions except history expansion, it was straightforward to implement something close to my ideal in tcsh.
I used this:
alias C 'echo '\''\!*'\'' |tr -d '\'',\042-\047'\'' |bc -l'
Now I can do stuff like the following with minimal typing:
# the basic stuff:
tcsh> C 1+2
3
# dollar signs, multiplication, exponentiation:
tcsh> C $8 * 1.07^10
15.73721085831652257992
# parentheses, mixed spacing, zero power:
tcsh> C ( 2+5 ) / 8 * 2^0
.87500000000000000000
# commas in numbers, no problem here either:
tcsh> C 1,250.21 * 1.5
1875.315
As you can see there's no need to quote anything to make all these work.
Now comes the problem. Trying to do the same in bash, where aliases with parameters aren't supported, forces me to implement the calculator as a shell function and pass the parameters using "$@":
function C () { echo "$@" | tr -d ', \042-\047' | bc -l; }
This breaks in various ways e.g:
# works:
bash$ C 1+2
3
# works:
bash$ C 1*2
2
# Spaces around '*' lead to file expansion with everything falling apart:
bash$ C 1 * 2
(standard_in) 1: syntax error
(standard_in) 1: illegal character: P
(standard_in) 1: illegal character: S
(standard_in) 1: syntax error
...
# Non-leading parentheses seem to work:
bash$ C 2*(2+1)
6
# but leading-parentheses don't:
bash$ C (2+1)*2
bash: syntax error near unexpected token `2+1'
Of course, adding quotes around the expression solves these issues, but is against the original requirements.
I understand why things break in bash. I'm not looking for explanations. Rather, I'm looking for a solution which doesn't require manually quoting the arguments. My question to bash wizards: is there any way to make bash support the handy minimal-typing calculator alias without requiring quoting, like tcsh does? Or is this impossible? Thanks!
If you're prepared to type C Enter instead of C Space, the sky's the limit. The C command can take input in whatever form you desire, unrelated to the shell syntax.
C () {
local line
read -p "Arithmetic: " -e line
echo "$line" | tr -d \"-\', | bc -l
}
In zsh:
function C {
local line=
vared -p "Arithmetic: " line
echo $line | tr -d \"-\', | bc -l
}
In zsh, you can turn off globbing for the arguments of a specific command with the noglob modifier. It is commonly hidden in an alias. This prevents *^() from being interpreted specially, but not quotes or $.
quickie_arithmetic () {
echo "$*" | tr -d \"-\', | bc -l
}
alias C='noglob quickie_arithmetic'
At least preventing the expansion of * is possible using 'set -f' (following someone's blog post):
alias C='set -f -B; Cf '
function Cf () { echo "$@" | tr -d ', \042-\047' | bc -l; set +f; };
This turns globbing off in the alias, before the calculation, and back on afterwards:
$ C 2 * 3
6
I downloaded the bash sources and looked very closely. It seems the parenthesis error occurs directly during the parsing of the command line, before any command is run or any alias is expanded, and there is no flag to turn it off.
So it would be impossible to do it from a bash script.
This means it is time to bring out the heavy weapons. Before parsing, the command line is read from stdin using readline. Therefore, if we intercept the call to readline, we can do whatever we want with the command line.
Unfortunately bash is statically linked against readline, so the call cannot be intercepted directly. But at least readline is a global symbol, so we can get the address of the function using dlsym, and with that address we can insert arbitrary instructions in readline.
Modifying readline directly is prone to errors if readline changes between bash versions, so we modify the function calling readline instead, leading to the following plan:
Locate readline with dlsym
Replace readline with our own function that uses the current stack to locate the function calling readline (yy_readline_get) on its first call and then restores the original readline
Modify yy_readline_get to call our wrapper function
Within the wrapper function: Replace the parentheses with non problematic symbols, if the input starts with "C "
Written in C for amd64, we get:
#include <string.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#ifndef __USE_GNU
#define __USE_GNU
#endif
#ifndef __USE_MISC
#define __USE_MISC
#endif
#include <dlfcn.h>
#include <unistd.h>
#include <sys/mman.h>
#include <errno.h>
//-----------Assembler helpers----------
#if (defined(x86_64) || defined(__x86_64__))
//assembler instruction to read rbp, which we need to read the stack
#define MOV_EBP_OUT "mov %%rbp, %0"
//size of a call instruction
#define RELATIVE_CALL_INSTRUCTION_SIZE 5
#define IS64BIT (1)
/*
To replace a function with a new one, we use the push-ret trick, pushing the destination address on the stack and let ret jump "back" to it
This has the advantage that we can set an additional return address in the same way, if the jump goes to a function
This struct corresponds to the following assembler fragment:
68 ???? push <low_dword (address)>
C7442404 ???? mov DWORD PTR [rsp+4], <high_dword (address)>
C3 ret
*/
typedef struct __attribute__((__packed__)) LongJump {
char push; unsigned int destinationLow;
unsigned int mov_dword_ptr_rsp4; unsigned int destinationHigh;
char ret;
// char nopFiller[16];
} LongJump;
void makeLongJump(void* destination, LongJump* res) {
res->push = 0x68;
res->destinationLow = (uintptr_t)destination & 0xFFFFFFFF;
res->mov_dword_ptr_rsp4 = 0x042444C7;
res->destinationHigh = ((uintptr_t)(destination) >> 32) & 0xFFFFFFFF;
res->ret = 0xC3;
}
//Macros to save and restore the rdi register, which is used to pass an address to readline (standard amd64 calling convention)
typedef unsigned long SavedParameter;
#define SAVE_PARAMETERS SavedParameter savedParameters; __asm__("mov %%rdi, %0": "=r"(savedParameters));
#define RESTORE_PARAMETERS __asm__("mov %0, %%rdi": : "r"(savedParameters));
#else
#error only implemented for amd64...
#endif
//Simulates the effect of the POP instructions, popping from a passed "stack pointer" and returning the popped value
static void * pop(void** stack){
void* temp = *(void**)(*stack);
*stack += sizeof(void*);
return temp;
}
//Disables the write protection of an address, so we can override it
static int unprotect(void * POINTER){
const int PAGESIZE = sysconf(_SC_PAGE_SIZE);
if (mprotect((void*)(((uintptr_t)POINTER & ~(PAGESIZE-1))), PAGESIZE, PROT_READ | PROT_WRITE | PROT_EXEC)) {
fprintf(stderr, "Failed to set permission on %p\n", POINTER);
return 1;
}
return 0;
}
//Debug stuff
static void fprintfhex(FILE* f, void * hash, int len) {
for (int i=0;i<len;i++) {
if ((uintptr_t)hash % 8 == 0 && (uintptr_t)i % 8 == 0 && i ) fprintf(f, " ");
fprintf(f, "%.2x", ((unsigned char*)(hash))[i]);
}
fprintf(f, "\n");
}
//---------------------------------------
//Address of the original readline function
static char* (*real_readline)(const char*)=0;
//The wrapper around readline we want to inject.
//It replaces () with [], if the command line starts with "C "
static char* readline_wrapper(const char* prompt){
if (!real_readline) return 0;
char* result = real_readline(prompt);
char* temp = result; while (*temp == ' ') temp++;
if (temp[0] == 'C' && temp[1] == ' ')
for (int len = strlen(temp), i=0;i<len;i++)
if (temp[i] == '(') temp[i] = '[';
else if (temp[i] == ')') temp[i] = ']';
return result;
}
//Backup of the changed readline part
static unsigned char oldreadline[2*sizeof(LongJump)] = {0x90};
//A wrapper around the readline wrapper, needed on amd64 (see below)
static LongJump* readline_wrapper_wrapper = 0;
static void readline_initwrapper(){
SAVE_PARAMETERS
if (readline_wrapper_wrapper) { fprintf(stderr, "ERROR!\n"); return; }
//restore readline
memcpy(real_readline, oldreadline, 2*sizeof(LongJump));
//find call in yy_readline_get
void * frame;
__asm__(MOV_EBP_OUT: "=r"(frame)); //current stackframe
pop(&frame); //pop current stackframe (??)
void * returnToFrame = frame;
if (pop(&frame) != real_readline) {
//now points to current return address
fprintf(stderr, "Got %p instead of %p=readline, when searching caller\n", frame, real_readline);
return;
}
void * caller = pop(&frame); //now points to the instruction following the call to readline
caller -= RELATIVE_CALL_INSTRUCTION_SIZE; //now points to the call instruction
//fprintf(stderr, "CALLER: %p\n", caller);
//caller should point to 0x00000000004229e1 <+145>: e8 4a e3 06 00 call 0x490d30 <readline>
if (*(unsigned char*)caller != 0xE8) { fprintf(stderr, "Expected CALL, got: "); fprintfhex(stderr, caller, 16); return; }
if (unprotect(caller)) return;
//We can now override caller to call an arbitrary function instead of readline.
//However, the CALL instruction accepts only a 32-bit relative offset, so the called function has to be within 32-bit range of the call site
//Solution: Allocate memory at an address close to that CALL instruction and put a long jump to our real function there
void * hint = caller;
readline_wrapper_wrapper = 0;
do {
if (readline_wrapper_wrapper) munmap(readline_wrapper_wrapper, 2*sizeof(LongJump));
readline_wrapper_wrapper = mmap(hint, 2*sizeof(LongJump), PROT_EXEC | PROT_READ | PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if (readline_wrapper_wrapper == MAP_FAILED) { fprintf(stderr, "mmap failed: %i\n", errno); return; }
hint += 0x100000;
} while ( IS64BIT && ( (uintptr_t)readline_wrapper_wrapper >= 0xFFFFFFFF + ((uintptr_t) caller) ) ); //repeat until we get an address really close to caller
//fprintf(stderr, "X:%p\n", readline_wrapper_wrapper);
makeLongJump(readline_wrapper, readline_wrapper_wrapper); //Write the long jump in the newly allocated space
//fprintfhex(stderr, readline_wrapper_wrapper, 16);
//fprintfhex(stderr, caller, 16);
//patch caller to become call <readline_wrapper_wrapper>
//called address is relative to address of CALL instruction
*(uint32_t*)(caller+1) = (uint32_t) ((uintptr_t)readline_wrapper_wrapper - (uintptr_t)(caller + RELATIVE_CALL_INSTRUCTION_SIZE) );
//fprintfhex(stderr, caller, 16);
*(void**)(returnToFrame) = readline_wrapper_wrapper; //change stack to jump to wrapper instead real_readline (or it would not work on the first entered command)
RESTORE_PARAMETERS
}
static void _calc_init(void) __attribute__ ((constructor));
static void _calc_init(void){
if (!real_readline) {
//Find readline
real_readline = (char* (*)(const char*)) dlsym(RTLD_DEFAULT, "readline");
if (!real_readline) return;
//fprintf(stdout, "loaded %p\n", real_readline);
//fprintf(stdout, " => %x\n", * ((int*) real_readline));
if (unprotect(real_readline)) { fprintf(stderr, "Failed to unprotect readline\n"); return; }
memcpy(oldreadline, real_readline, 2*sizeof(LongJump)); //backup readline's instructions
//Replace readline with readline_initwrapper
makeLongJump(real_readline, (LongJump*)real_readline); //add a push/ret long jump from readline to readline, to have readline's address on the stack in readline_initwrapper
makeLongJump(readline_initwrapper, (LongJump*)((char*)real_readline + sizeof(LongJump) - 1)); //add a push/ret long jump from readline to readline_initwrapper, overriding the previous RET
}
}
This can be compiled to an intercepting library with:
gcc -g -std=c99 -shared -fPIC -o calc.so -ldl calc.c
and then loaded in bash with:
gdb --batch-silent -ex "attach $BASHPID" -ex 'print dlopen("calc.so", 0x101)'
Now, when the previous alias extended with parenthesis replacement is loaded:
alias C='set -f -B; Cf '
function Cf () { echo "$@" | tr -d ', \042-\047' | tr [ '(' | tr ] ')' | bc -l; set +f; };
We can write:
$ C 1 * 2
2
$ C 2*(2+1)
6
$ C (2+1)*2
6
It gets even better if we switch from bc to qalculate:
alias C='set -f -B; Cf '
function Cf () { echo "$@" | tr -d ', \042-\047' | tr [ '(' | tr ] ')' | xargs qalc ; set +f; };
Then we can do:
$ C e ^ (i * pi)
e^(i * pi) = -1
$ C 3 c
3 * speed_of_light = approx. 899.37737(km / ms)

using MultiByteToWideChar

The following code prints the desired output but it prints garbage at the end of the string. There is something wrong with the last call to MultiByteToWideChar but I can't figure out what. Please help??
#include "stdafx.h"
#include<Windows.h>
#include <iostream>
using namespace std;
#include<tchar.h>
int main( int, char *[] )
{
TCHAR szPath[MAX_PATH];
if(!GetModuleFileName(NULL,szPath,MAX_PATH))
{cout<<"Unable to get module path"; exit(0);}
char ansiStr[MAX_PATH];
if(!WideCharToMultiByte(CP_ACP,WC_COMPOSITECHECK,szPath,-1,
ansiStr,MAX_PATH,NULL,NULL))
{cout<<"Unicode to ANSI failed\n";
cout<<GetLastError();exit(1);}
string s(ansiStr);
size_t pos = 0;
while(1)
{
pos = s.find('\\',pos);
if(pos == string::npos)
break;
s.insert(pos,1,'\\');
pos+=2;
}
if(!MultiByteToWideChar(CP_ACP,MB_PRECOMPOSED,s.c_str(),s.size(),szPath,MAX_PATH))
{cout<<"ANSI to Unicode failed"; exit(2);}
wprintf(L"%s",szPath);
}
MSDN has this to say about the cbMultiByte parameter:
If this parameter is -1, the function processes the entire input
string, including the terminating null character. Therefore, the
resulting Unicode string has a terminating null character, and the
length returned by the function includes this character.
If this parameter is set to a positive integer, the function processes
exactly the specified number of bytes. If the provided size does not
include a terminating null character, the resulting Unicode string is
not null-terminated, and the returned length does not include this
character.
So if you want the output string to be null-terminated, either include the terminator in the length you pass in (here, s.size() + 1 instead of s.size()) or terminate the string yourself based on the return value.
