Find missing values between text files using VBScript

EDIT 2
Just to make it clear - this question is not asking for "DA CODEZ", just an idea as to possible approaches in VBScript. It's not my weapon of choice, and if I needed to do this on a un*x box then I wouldn't be bothering anyone with this...
I have two text files. File A contains a list of keys:
00001 13756 000.816.000 BE2B
00001 13756 000.816.000 BR1B
00002 16139 000.816.000 BR1B
00001 10003 000.816.000 CH1C
00001 10003 000.816.000 CH3D
00001 13756 000.816.000 CZ1B
....
....
File B, tab separated, contains two columns the keys, and a UUID:
00003 16966 001.001.023 2300 a3af3b1d-ea04-4948-ba25-59b36ae364ae
00001 12119 001.001.023 CZ1B e6efe825-0759-48b0-89b9-05bbe5d49625
00002 16966 001.001.023 BR1B d3a1d62b-a0d5-43c3-ba49-a219de5f32a5
00001 12119 001.001.023 BR1B 5d74af27-ed4b-4f90-8229-90b6d807515b
00001 10009 001.001.024 BR1B 590409cc-496a-49eb-885c-9bbc51863363
00002 24550 001.001.024 2100 46ecea5d-f8f5-4df9-92cf-0b73f6c81adc
00001 12119 001.001.024 CZ1B e415ce6f-7394-4a66-a7f8-f76487e78086
00002 16966 001.001.024 CZ1B c591a726-4d71-4f61-adfd-63310d21d397
....
....
I need to extract, using plain VBScript, the UUIDs of those entries in File B which have no matching entry in File A. (I need to optimise for speed, in case that affects the choice of approach.) The result should be a file of orphaned UUIDs.
If this isn't easy/possible, that's also an answer - I can do it in the db we're using, but the performance is woefully inadequate. I've just been blown away by how much faster VBScript was than the db for a previous processing task.
EDIT
Someone's suggested using some sort of ADO library, after converting the file to CSV, which I'm looking into.

Maybe the fastest way to do it is just to ask the OS to do it
Dim orphan
orphan = WScript.CreateObject("WScript.Shell").Exec( _
"findstr /g:keys.txt /v uuids.txt" _
).StdOut.ReadAll()
WScript.Echo orphan
That is, use findstr to check uuids.txt for lines that do not match any of the lines in keys.txt.
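For comparison, the asker's "un*x box" version of the same idea is a grep/awk one-liner. This is only a sketch with made-up sample data (the file names and UUID values below are hypothetical, matching the shapes of File A and File B):

```shell
# Sample data in the shape of File A (keys.txt) and File B (uuids.txt, tab separated)
printf '00001 13756 000.816.000 BE2B\n' > keys.txt
printf '00001 13756 000.816.000 BE2B\taaaa-1111\n00002 99999 000.816.000 XXXX\tbbbb-2222\n' > uuids.txt

# grep: -F fixed strings, -f read patterns from file, -v keep NON-matching lines;
# awk then keeps only the last (UUID) column of each surviving line
grep -v -F -f keys.txt uuids.txt | awk -F'\t' '{ print $NF }'
# → bbbb-2222
```

This mirrors findstr's default substring matching, so it works as long as each File A key appears verbatim inside the corresponding File B line.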

Related

How to replace same word from config file to different words from array

Hi there.
I am very new to shell scripting, so I need your help...
I have config file with below info
config name AAAAA
root root
port number 00000
Hostname hahahahah
config name AAAAA
root less
port number 00001
Hostname nonononono
config name AAAAA
root less
port number 00002
Hostname nonononono
And inside my bash file, there's arraylist with below info
${array1[0]} # Has value of value11111
${array2[1]} # Has value of value22222
${array2[2]} # Has value of value33333
I want to change config file and save as below
config name value11111
root root
port number 00000
Hostname hahahahah
config name value22222
root less
port number 00001
Hostname nonononono
config name value33333
root less
port number 00002
Hostname nonononono
I tried awk and sed but no luck... Could you please help with this?
Check out some of these answers.
I second the advice by Ed and David (and in hindsight, this whole post could have been a comment instead of an answer), that awk/sed might not be the best tool for this job, and you'd want to take a step back and re-think the process. There's a whole bunch of things that could go wrong; the array values might not be populated correctly, there's no check that enough values for all substitutions exist, and in the end, you can't roll changes back.
Nevertheless, here's a starting point, just to illustrate some sed. It's certainly not the most performant approach, and it only works with GNU sed, but it produces the output you required:
#!/bin/bash
declare -a array
array=(value11111 value22222 value33333)
for a in "${array[@]}"; do
# Use sed inline, perform substitutions directly on the file
# Starting from the first line, search for the first match of `config name AAAAA`
# and then execute the substitution in curly brackets
sed -i "0,/config name AAAAA/{s/config name AAAAA/config name $a/}" yourinputconfigfile
done
# yourinputconfigfile
config name value11111
root root
port number 00000
Hostname hahahahah
config name value22222
root less
port number 00001
Hostname nonononono
config name value33333
root less
port number 00002
Hostname nonononono
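If GNU sed isn't available, the same replace-the-next-occurrence logic can be sketched portably in awk instead. The file name and value list below are placeholders standing in for the question's config file and array:

```shell
# Minimal config in the question's layout (placeholder for yourinputconfigfile)
cat > yourinputconfigfile <<'EOF'
config name AAAAA
root root
config name AAAAA
root less
EOF

# Each time "config name AAAAA" is seen, consume the next value from the list;
# all other lines pass through unchanged
awk -v vals="value11111 value22222" '
BEGIN { n = split(vals, v, " "); i = 0 }
$0 == "config name AAAAA" && i < n { i++; $0 = "config name " v[i] }
{ print }
' yourinputconfigfile
```

Unlike the sed loop, this makes a single pass over the file, so it scales better when the file or the array grows.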

Disassemble ELF file - debugging the area where a specific string in the binary is loaded

I would like to disassemble / debug an ELF file. Is it somehow possible to track down the function in which a specific string from the ELF file is used?
What I mean is: I have a string that I know the binary uses to search within a file. Is it somehow possible, e.g. with gdb, to debug exactly that position in the executable?
Or is the position of the string in the ELF file somehow visible in the objdump -d output?
In order to do that you need a disassembler - objdump just dumps the info, which might not be enough, since some analysis is needed before you can tell where a string is being used. What you need is to get the XREFs (cross-references) for the string you have in mind.
If you open your binary in the disassembler it will probably have the ability to show you strings that are present in the binary with the ability to jump to the place where the string is being used (it might be multiple places).
I'll showcase this using radare2.
Open the binary (I'll use ls here)
r2 -A /bin/ls
and then
iz
to display all the strings. There's a lot of them so here's an extract
000 0x00004af1 0x100004af1 7 8 (4.__TEXT.__cstring) ascii COLUMNS
001 0x00004af9 0x100004af9 39 40 (4.__TEXT.__cstring) ascii 1#ABCFGHLOPRSTUWabcdefghiklmnopqrstuvwx
002 0x00004b21 0x100004b21 6 7 (4.__TEXT.__cstring) ascii bin/ls
003 0x00004b28 0x100004b28 8 9 (4.__TEXT.__cstring) ascii Unix2003
004 0x00004b31 0x100004b31 8 9 (4.__TEXT.__cstring) ascii CLICOLOR
005 0x00004b3a 0x100004b3a 14 15 (4.__TEXT.__cstring) ascii CLICOLOR_FORCE
006 0x00004b49 0x100004b49 4 5 (4.__TEXT.__cstring) ascii TERM
007 0x00004b60 0x100004b60 8 9 (4.__TEXT.__cstring) ascii LSCOLORS
008 0x00004b69 0x100004b69 8 9 (4.__TEXT.__cstring) ascii fts_open
009 0x00004b72 0x100004b72 28 29 (4.__TEXT.__cstring) ascii %s: directory causes a cycle
Let's see where this last one is being used. If we move to the location where it's defined, 0x100004b72, we can see this:
;-- str.s:_directory_causes_a_cycle:
; DATA XREF from 0x100001cbe (sub.fts_open_INODE64_b44 + 378)
And here we see where it's being referenced - the DATA XREF. We can move there (s 0x100001cbe) and see how it's being used:
⁝ 0x100001cbe 488d3dad2e00. lea rdi, str.s:_directory_causes_a_cycle ; 0x100004b72 ; "%s: directory causes a cycle"
⁝ 0x100001cc5 4c89ee mov rsi, r13
⁝ 0x100001cc8 e817290000 call sym.imp.warnx ;[1]
Having the location you can put a breakpoint there (r2 is also a debugger) or use it in gdb.

SoftQuad DESC or font file binary

I read this question but it didn't help me. I am solving a challenge with two files: the first is a .png that gave me the upper half of an image; the second is identified as "SoftQuad DESC or font file binary". I am sure this second file should somehow be converted into a .png to complete the image. I googled and got a hint about magic bytes, but I am unable to match the bytes.
These are the first two rows of output of xxd command
00000000: aaaa a6bb 67bb bf18 dd94 15e6 252c 0a2f ....g.......%,./
00000010: fe14 d943 e8b5 6ad5 2264 1632 646e debc ...C..j."d.2dn..
These are the last two rows of output of xxd command
00001c10: 7a05 7f4c 3600 0000 0049 454e 44ae 4260 z..L6....IEND.B`
00001c20: 82
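The tail of that dump already looks like PNG data: IEND is the final PNG chunk, and the bytes ae 42 60 82 are its fixed CRC. What a challenge like this usually mangles is the header, since every valid PNG must begin with a fixed eight-byte signature. A quick way to get those reference bytes for comparison against the file's first row (using octal escapes so it works with any printf):

```shell
# Emit the eight-byte PNG signature (\211 = 0x89, \032 = 0x1a) and hex-dump it
printf '\211PNG\r\n\032\n' | xxd
```

The dump shows 8950 4e47 0d0a 1a0a, which clearly does not match the aaaa a6bb... bytes at offset 0 above, so the first bytes of the mystery file would need to be repaired before it can parse as a PNG.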

Change Data Capture in delimited files

There are two tab delimited files (file1, file2) with same number and structure of records but with different values for columns.
Daily we get another file (newfile) with same number and structure of records but with some changes in column values.
Compare this file (newfile) with the two files (file1, file2) and update the records in them with the changed records, keeping unchanged records intact.
Before applying changes:
file1
11 aaaa
22 bbbb
33 cccc
file2
11 bbbb
22 aaaa
33 cccc
newfile
11 aaaa
22 eeee
33 ffff
After applying changes:
file1
11 aaaa
22 eeee
33 ffff
file2
11 aaaa
22 eeee
33 ffff
What could be the easiest and most efficient solution? Unix shell scripting? The files are huge, containing millions of records; can a shell script be an efficient solution in this case?
Daily we get another file (newfile) with same number and structure of records but with
some changes in column values.
This sounds to me like a perfect case for git. With git you can commit the current file as it is.
Then as you get new "versions" of the file, you can simply replace the old version with the new one, and commit again. The best part is each time you make a commit git will record the changes from file to file, giving you access to the entire history of the file.
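Independently of how you version the files, the row-level merge itself is a classic two-pass awk idiom. A minimal sketch using the question's own sample rows (run it once per target file; the key is assumed to be column 1):

```shell
# file1 and newfile as in the question (default FS handles tabs and spaces)
printf '11 aaaa\n22 bbbb\n33 cccc\n' > file1
printf '11 aaaa\n22 eeee\n33 ffff\n' > newfile

# Pass 1 (NR==FNR) loads newfile into an array keyed on column 1;
# pass 2 prints newfile's row where the key exists, else the original row
awk 'NR==FNR { new[$1] = $0; next }
     $1 in new { print new[$1]; next }
     { print }' newfile file1
# → 11 aaaa / 22 eeee / 33 ffff
```

This streams each file once and holds only newfile's rows in memory, so for millions of records it should comfortably beat a row-by-row database update.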

Read Native format bcp data file

From a Unix shell script, I am doing a bcp out of a table on Server1 using NATIVE format into a file, XXXX.bcpdat, then a bcp in of that file to a table of the same structure on Server2.
The bcp command we have is
bcp "$dbname".."$tablename" out XXXX.bcpdat -n
bcp "$dbname".."$tablename" in XXXX.bcpdat -n -b10000
This bcp out & bcp in works as expected from/into the tables.
But I want to make an urgent change here:
I want to get the total number of rows (a row may have 120, 30, or 40 records) in the bcp data file (XXXX.bcpdat).
But with the file in native format I couldn't differentiate the rows or tell how they are separated. If I run head -10 XXXX.bcpdat or tail -10 XXXX.bcpdat it prints everything in the file. wc -l, awk and cut aren't helping me get the row count either. There is no marker for where a row ends, as there is with a character-format bcp load. It would be great if someone could help me work out how to get the total number of rows (not records) in the bcpdat file. Thanks a lot in advance.

Resources