File handling in Batch programming? - windows

I have a text file with one number per line, all in ascending order.
The contents look like this:
1
13
25
37
49
97
109
121
I want to extract only those numbers whose difference from the previous number is greater than 12. I would like to use a batch program for this.
How can I do that?

I would have liked to see you make an attempt, but anyway, I had a go and this is the closest I could get:
c:\temp>type test.txt
1 line 1
10 line 1a
13 line 2
25 line 3
22 line 3a
37 line 4
49 line 5
97 line 6
109 line 7
121 line 8
c:\temp>test.bat
25 line 3
37 line 4
49 line 5
97 line 6
109 line 7
121 line 8
c:\temp>
using this code in test.bat:
@echo off
SETLOCAL ENABLEDELAYEDEXPANSION
set /a cur="0"
for /f "tokens=1,* delims= " %%a in ('type test.txt') do (
set line=%%a %%b
set /a num="%%a"
set /a dif="!num!-!cur!"
if !dif! geq 12 @echo !line!
set /a cur="%%a"
)
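
Note that the question asks for a difference strictly greater than 12, while the batch file above tests geq 12 (12 or more). As a cross-check of the logic only, here is a minimal sketch in POSIX shell with awk. This assumes a Unix-like environment (for example WSL or Git Bash), which is not part of cmd.exe; the file name numbers.txt and the function name filter_gaps are made up for illustration.

```shell
#!/bin/sh
# Sample data from the question, written to a hypothetical file name.
printf '%s\n' 1 13 25 37 49 97 109 121 > numbers.txt

# Print a line when its first field exceeds the previous line's first field
# by more than 12. The first line has no predecessor, so it is never printed.
filter_gaps() {
    awk 'NR > 1 && $1 - prev > 12 { print } { prev = $1 }' "$1"
}

filter_gaps numbers.txt   # prints 97 (the only gap larger than 12)
```

With the strict comparison only 97 qualifies, because every other step in the sample is exactly 12.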

Related

Windows Batch Sort Function doesn't produce output as expected

I have been trying to sort the text file using Windows batch Sort function. But the results are not as expected. The input file is something like this:
name2.txt
77
76
75
74
73
72
78
69
68
67
66
65
64
63
71
62
9
8
7
and the output that I get is as below:
sorted.txt
9
8
78
77
76
75
74
73
72
71
70
7
69
68
67
66
65
64
63
The code snippet is:
setlocal EnableDelayedExpansion
set "names="
for /L %%i in (1,1,9) do set "names=!names! C:\offsite_tlog\%%i*.tlg"
dir /B /A-D /O-D %names% > name1.txt
for /F "tokens=1 delims==." %%a in (name1.txt) do echo %%a >> name2.txt
powershell.exe -command " & {Get-Content "C:\offsite\name2.txt" | Sort-Object -Descending > sorted.txt}"
The built-in Windows sort command does not give the expected order either, so kindly assist me with sorting.
The expected output should be
7
8
9
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
You were close, but Get-Content returns each line as a string, not a number, so Sort-Object compares the lines as text.
[Array]$Output = @()
Get-Content -LiteralPath "C:\offsite\name2.txt" |
ForEach-Object { $Output += [Int]($_.Trim()) }
$Output | Sort-Object | Out-File -FilePath .\sorted.txt
This wrapped PowerShell script left-pads every number with zeros to 10 places and sorts on that virtual key (it sorts in descending order; drop -desc for ascending order).
powershell -NoP -C "gc 'C:\offsite\name2.txt'|Sort -desc {[Regex]::Replace($_,'\d+',{$args[0].Value.PadLeft(10,'0')})}|sc sorted.txt"
The simplest PowerShell solution is:
Get-Content 'C:\offsite\name2.txt' | Sort-Object { [int] $_ } > sorted.txt
Script block { [int] $_ } converts each input line ($_) to an integer for the purpose of sorting.
Note that > creates a "Unicode" (UTF-16LE) file; use Out-File or Set-Content with -Encoding to change the character encoding.
Invoked from cmd.exe:
powershell -c "Get-Content 'C:\offsite\name2.txt' | Sort-Object { [int] $_ } >sorted.txt"
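
To see why the original output looked wrong, it may help to compare text ordering with numeric ordering side by side. The sketch below uses POSIX sort rather than PowerShell, so it assumes a Unix-like shell; the five-line name2.txt is a hypothetical subset of the question's data.

```shell
#!/bin/sh
# Hypothetical subset of the question's data.
printf '%s\n' 9 78 8 70 7 > name2.txt

# Text (lexical) order compares character by character, so "70" sorts before "8".
LC_ALL=C sort name2.txt      # 7 70 78 8 9 (one per line)

# Numeric order compares the values themselves.
sort -n name2.txt            # 7 8 9 70 78 (one per line)
```

The { [int] $_ } script block in the PowerShell answers plays the same role as -n here: it changes the sort key from text to a number.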

Editing text file in command line and make a new file

I have a big file that looks like this example:
chr1:16872433-16872504 54 112622
chr1:16872433-16872504 55 112110
chr1:16872433-16872504 56 110996
chr1:16872433-16872504 57 110306
chr1:16861773-16861845 20 38808
chr1:16861773-16861845 21 39768
chr1:16861773-16861845 22 40344
chr1:16861773-16861845 23 40637
chr1:16861773-16861845 24 41311
chr2:7990338-7990408 8 0
chr2:7990338-7990408 9 0
chr2:7990338-7990408 10 0
chr2:7990338-7990408 11 0
chr2:7990338-7990408 12 0
I want to extract every line starting with "chr1:16872433-16872504" and write them to a new .txt file.
How can I do that in bash? I tried the grep command, but I do not know how to make it conditional.
grep -E 'chr1:16872433-16872504' your.txt > new.txt
gives you the following output
chr1:16872433-16872504 54 112622
chr1:16872433-16872504 55 112110
chr1:16872433-16872504 56 110996
chr1:16872433-16872504 57 110306
which matches your requirement ("chr1:16872433-16872504").
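
The grep pattern matches the string anywhere on the line. If the region must be the first field exactly, one possible variant is to compare the first field in awk. This is a sketch only: the function name extract_region is hypothetical, and the sample file is a shortened copy of the data in the question.

```shell
#!/bin/sh
# Shortened copy of the sample data.
cat > your.txt <<'EOF'
chr1:16872433-16872504 54 112622
chr1:16872433-16872504 55 112110
chr1:16861773-16861845 20 38808
chr2:7990338-7990408 8 0
EOF

# Keep only rows whose first whitespace-separated field equals the key exactly.
extract_region() {
    awk -v key="$1" '$1 == key' "$2"
}

extract_region 'chr1:16872433-16872504' your.txt > new.txt
cat new.txt   # prints the two chr1:16872433-16872504 rows
```

Because the key is compared as a whole field, a longer region name that merely contains the same text would not be picked up.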

Grab tokens only from lines containing a specific substring at unknown position within the first token

I have a log file that contains lines similar to:
1 3 2 4 5 6 4 3 2 4 6 6 53 54 5 5 7 4 35 52 234 234 423 26 6 2465 3
asdfj:C:kkl 4 5 6 5 4 3 2 3 4 5 6 7 6 5 45 6
1 3 2 4 5 6 4 3 2 4 6 6 53 54 5 5 7 4 35 52 234 234 423 26 6 2465 3
1 3 2 4 5 6 4 3 2 4 6 6 xdfj:C:asdfj 53 54 5 5 7 4 35 52 234 234 423 26 6 2465 3
jdfj:C:asdfj 4 5 6 5 4 3 2 3 4 5 6 7 6 5 789 6
asfgfj:C:asdfj 4 5 6 5 4 3 2 3 4 5 6 7 6 5 23 6
I need to grab the 1st and 16th tokens from all lines that BEGIN with strings containing the substring ":C:" (lines 2, 5, 6 in the example) and return these to an output file.
I'm using "FINDSTR" to grab these tokens, but I only know how to grab from all lines. How can I filter to grab from only lines beginning with the string/substring I want?
*Note: The substring ":C:" varies in its position within the string; if its position were constant, I would just match ":C:" at that position.
Current commands I'm using:
@echo off
setlocal enabledelayedexpansion
for /F "tokens=1,16" %%a in (print.log) do (
echo %%a %%b >> value.txt)
This batch:
@Echo off
SetLocal EnableDelayedExpansion
for /F "tokens=1,16" %%a in (
' findstr /R /C:"^[^ ]*:C:" print.log'
) do echo %%a %%b >> value.txt
returns this output in value.txt:
> type value.txt
asdfj:C:kkl 45
jdfj:C:asdfj 789
asfgfj:C:asdfj 23
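
The findstr pattern ^[^ ]*:C: reads as "from the start of the line, any run of non-space characters, then :C:", which confines the match to the first token. For comparison, the same filter can be sketched with awk in a POSIX shell. That assumes a Unix-like environment rather than cmd.exe; the function name first_and_16th is hypothetical, and print.log here is a shortened copy of the sample.

```shell
#!/bin/sh
# Shortened copy of the sample log.
cat > print.log <<'EOF'
1 3 2 4 5 6 4 3 2 4 6 6 53 54 5 5 7 4 35 52 234 234 423 26 6 2465 3
asdfj:C:kkl 4 5 6 5 4 3 2 3 4 5 6 7 6 5 45 6
1 3 2 4 5 6 4 3 2 4 6 6 xdfj:C:asdfj 53 54 5 5 7 4 35 52 234 234 423 26 6 2465 3
jdfj:C:asdfj 4 5 6 5 4 3 2 3 4 5 6 7 6 5 789 6
EOF

# Print fields 1 and 16 only when field 1 itself contains ":C:".
first_and_16th() {
    awk '$1 ~ /:C:/ { print $1, $16 }' "$1"
}

first_and_16th print.log
# asdfj:C:kkl 45
# jdfj:C:asdfj 789
```

The third line of the sample is skipped even though it contains ":C:", because the match is tested against the first field only.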

Remove rows that have a specific numeric value in a field

I have a very bulky file, about 1M lines, like this:
4001 168991 11191 74554 60123 37667 125750 28474
8 145 25 101 83 51 124 43
2985 136287 4424 62832 50788 26847 89132 19184
3 129 14 101 88 61 83 32 1 14 10 12 7 13 4
6136 158525 14054 100072 134506 78254 146543 41638
1 40 4 14 19 10 35 4
2981 112734 7708 54280 50701 33795 75774 19046
7762 339477 26805 148550 155464 119060 254938 59592
1 22 2 12 10 6 17 2
6 136 16 118 184 85 112 56 1 28 1 5 18 25 40 2
1 26 2 19 28 6 18 3
4071 122584 14031 69911 75930 52394 89733 30088
1 9 1 3 4 3 11 2 14 314 32 206 253 105 284 66
I want to remove rows that have a value less than 100 in the second column.
How to do this with sed?
I would use awk to do this. Example:
awk ' $2 >= 100 ' file.txt
This displays only the rows of file.txt whose second column ($2) is greater than or equal to 100, which removes the rows with a value less than 100.
Use the following approach:
sed '/^\w+\s+([0-9]{1,2}|[0][0-9]+)\b/d' -E /tmp/test.txt
(replace /tmp/test.txt with your current file path)
([0-9]{1,2}|[0][0-9]+) - matches either a number from 0 to 99 or a number with a leading zero (e.g. 012, 00982)
d - delete the pattern space;
-E(--regexp-extended) - Use extended regular expressions rather than basic regular expressions
To remove matched lines in place use -i option:
sed -i -E '/^\w+\s+([0-9]{1,2}|[0][0-9]+)\b/d' /tmp/test.txt
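
As a quick check of the boundary case, here is a small worked run of the awk filter. It is a sketch under stated assumptions: the function name keep_large is hypothetical, and the last sample row (second column exactly 100) is made up to show that it is kept; the other rows come from the question.

```shell
#!/bin/sh
cat > file.txt <<'EOF'
4001 168991 11191 74554 60123 37667 125750 28474
8 145 25 101 83 51 124 43
1 40 4 14 19 10 35 4
1 9 1 3 4 3 11 2 14 314 32 206 253 105 284 66
3 100 14 101 88 61 83 32
EOF

# Keep rows whose second field is numerically >= 100, i.e. drop rows below 100.
keep_large() {
    awk '$2 >= 100' "$1"
}

# awk does not edit in place here, so write to a new file (hypothetical name).
keep_large file.txt > filtered.txt
cat filtered.txt   # rows with 168991, 145 and 100 in column 2
```

The comparison is numeric, so there is no need to enumerate digit counts or leading zeros as the regular expression does.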

select a column from a text file using windows batch

I have a text file which has below data
162 y 1 0 518 home47 1
163 y 1 0 520 home41 1
164 y 1 0 522 home43 1
165 y 1 0 524 home45 1
166 y 1 0 526 home46 1
169 y 1 0 531 home50 1
170 y 1 0 533 home52 1
171 y 1 0 535 home54 1
172 y 1 0 537 home56 1
173 y 1 0 539 home58 1
I would like to copy the 6th column (home47 to home58) into another text file using a Windows batch file. How can I do that?
I have tried the command below, which is mentioned in another question, but it is not working for me:
CMD /f:off
FOR /f "tokens=6 delims= " %B in (TabFile.txt) do @echo %B >> 2ColFile.txt
CMD /f:on
@echo off
break>2ColFile.txt
for /f "tokens=6 delims= " %%c in (TabFile.txt) do (
echo %%c
)>>2ColFile.txt
EDIT: Keep in mind that the delimiters are delims=<tab><space>, and they could have been changed by the Stack Overflow formatter.
