Move columns in CSV with batch or powershell - windows

I'm using MediaInfo CLI version in Win 7 x64 to automatically make a CSV via template when a video file has finished encoding in StaxRip.
However, the CLI version is picky about how the output template is applied (long story short, its variables are grouped into sections (General, Video, Audio, Text), each section can only be used in one block, and you can't return to an earlier section further down the template), so one variable that I want elsewhere has to end up in the wrong spot for the automation to work at all.
Like this:
UTC 2015-05-21 18:04:06,Episode01.mp4,211 MiB,22mn 7s,29.970 fps,1 210 Kbps,High 10@L3,120 Kbps,AAC,Japanese
UTC 2015-05-21 19:16:18,Episode02.mp4,211 MiB,22mn 6s,29.970 fps,1 212 Kbps,High 10@L3,118 Kbps,AAC,Japanese
UTC 2015-05-21 20:24:57,Episode03.mp4,211 MiB,22mn 6s,29.970 fps,1 212 Kbps,High 10@L3,119 Kbps,AAC,Japanese
What I'm looking for is the timestamp portion (first column) to become the LAST column instead:
Episode01.mp4,211 MiB,22mn 7s,29.970 fps,1 210 Kbps,High 10@L3,120 Kbps,AAC,Japanese,UTC 2015-05-21 18:04:06
I would very much love to find a solution to this in a .bat or PowerShell script if possible, since these are already used in the aforementioned process, but I am open to small single-purpose applications. The crucial part is that it can be run from CMD or from a master .bat file.
Thank you for your time.

I tried this one out and it is working.
[string] $SourceFileFullPath = "C:\Projects\INT\CSV_ColumnSwap.csv"
[Array] $SourceFileContent = Get-Content $SourceFileFullPath
[int] $ArrayLength = $SourceFileContent.Length
$Splitter = ","
for ($i = 0; $i -lt $ArrayLength; $i++) {
    # split the line and peel the first field (the timestamp) off the front
    $LineData = $SourceFileContent[$i] -split $Splitter
    $DateTimeV, $LineData = $LineData
    # append the timestamp as the last field and write the line out
    $LineData += $DateTimeV
    ($LineData -join $Splitter) >> Result.csv
}
I am not particularly sure about the performance aspects. YMMV.
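For larger logs, a pipeline version should scale better, since >> reopens Result.csv once per line; a rough sketch using the same file names as above:
Get-Content "C:\Projects\INT\CSV_ColumnSwap.csv" |
    ForEach-Object {
        # split, move the first field (the timestamp) to the end, rejoin
        $fields = $_ -split ","
        ($fields[1..($fields.Count - 1)] + $fields[0]) -join ","
    } |
    Set-Content Result.csv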
Cheers

Related

win cmd: remove whitespace > not enough storage to process this command

I have a long (several million lines) data sheet in plain txt. Looks like this:
cellnumber x-coordinate y-coordinate z-coordinate temperature
1 -6.383637190E-01 2.408539131E-02 -5.244855285E-01 3.081549136E+02
2 -6.390314698E-01 2.286404185E-02 -5.245100260E-01 3.081547595E+02
3 -6.381718516E-01 2.373264730E-02 -5.236577392E-01 3.081547591E+02
4 -6.360489130E-01 2.259869128E-02 -5.245736241E-01 3.081547591E+02
5 -6.369081736E-01 2.253472991E-02 -5.236831307E-01 3.081547591E+02
6 -6.382256746E-01 2.215057984E-02 -5.237988830E-01 3.081547591E+02
7 -6.381900311E-01 2.126700431E-02 -5.245448947E-01 3.081547591E+02
8 -6.373924613E-01 2.117809094E-02 -5.238834023E-01 3.081547591E+02
I currently only have the Windows command line available and need to get rid of the whitespace at the beginning (its length is not constant; it shrinks as the cell number gets longer) so that I get
cellnumber x-coordinate y-coordinate z-coordinate temperature
1 -6.383637190E-01 2.408539131E-02 -5.244855285E-01 3.081549136E+02
2 -6.390314698E-01 2.286404185E-02 -5.245100260E-01 3.081547595E+02
3 -6.381718516E-01 2.373264730E-02 -5.236577392E-01 3.081547591E+02
4 -6.360489130E-01 2.259869128E-02 -5.245736241E-01 3.081547591E+02
5 -6.369081736E-01 2.253472991E-02 -5.236831307E-01 3.081547591E+02
6 -6.382256746E-01 2.215057984E-02 -5.237988830E-01 3.081547591E+02
7 -6.381900311E-01 2.126700431E-02 -5.245448947E-01 3.081547591E+02
8 -6.373924613E-01 2.117809094E-02 -5.238834023E-01 3.081547591E+02
May I ask for a solution? I don't have a clue and am not really experienced with this. Thanks!
I guess TrimStart may be my friend.
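(For what it's worth, TrimStart is the right instinct in PowerShell; a minimal sketch, file names borrowed from the batch attempt below, which streams line by line:)
Get-Content .\testJana.txt |
    ForEach-Object { $_.TrimStart() } |   # strip only the leading whitespace
    Set-Content .\testJana_edited.txt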
EDIT: I have put together this:
@ECHO OFF
set "victim=testJana.txt"
SETLOCAL
FOR /F "tokens=*" %%A IN (%victim%) DO (
    IF NOT "%%A"=="_" ECHO %%A>>%victim%_edited.txt
)
ENDLOCAL
pause
It works fine for smaller files but I'm getting the message
not enough storage to process this command
Any idea how to deal with this?
I would suggest using PowerShell:
First, second and third edit: to be executed in the powershell.exe shell, from the directory where the data.txt file is placed:
(Good point by @lit in the other post to add -ReadCount.)
Get-Content -ReadCount 500 -Path .\path_to_your_source\data.txt | % {$_ -replace " +", " "} | Set-Content -Path .\path_to_our_output\data_no_additional_spaces.txt
Why does -ReadCount make sense? Here it sends 500 lines at a time through the pipeline.
Here is the info from the Microsoft documentation:
-ReadCount
Specifies how many lines of content are sent through the pipeline at a
time. The default value is 1. A value of 0 (zero) sends all of the
content at one time.
This parameter does not change the content displayed, but it does
affect the time it takes to display the content. As the value of
ReadCount increases, the time it takes to return the first line
increases, but the total time for the operation decreases. This can
make a perceptible difference in very large items.
It reads the data, replaces the runs of spaces, and then saves the result into the output file.
This answer was meant for the powershell.exe shell, not the cmd.exe shell where you normally run your *.bat files. In PowerShell you have scripts called *.ps1.
If you store the above command in a trim_space.ps1 and launch it as follows (you need to have the script in the same directory as the data being transformed), you will see it executed:
powershell.exe -ExecutionPolicy Bypass &'C:\path_to_script\trim_space.ps1'
Fourth edit
To address your:
it works fine for smaller files but I'm getting the message not enough storage to process this command
Any idea how to deal with this?
You have to process the file in chunks, which your batch file is not doing right now. You get to the point where you exhaust the available memory and it naturally fails. You need an approach that limits how many lines are processed at once, like -ReadCount. With batch files I imagine it would be possible to call one batch file from another so that each processes only a limited part of the file.
Using PowerShell, you can limit how much data is processed at a time in the pipeline.
# with -ReadCount, $_ is an array of up to 1000 lines; -replace works on each element
Get-Content -Path .\testJana.txt -ReadCount 1000 |
    ForEach-Object { $_ -replace '^ +', '' } |
    Out-File -FilePath .\testJana_edited.txt -Encoding ASCII
If you want to run this from a cmd.exe shell, put the PowerShell code above into a file named sourcelimit.ps1 and use the following in a .bat script file.
powershell -NoProfile -File .\sourcelimit.ps1
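If even -ReadCount feels slow on a multi-million-line file, a plain .NET stream reader/writer pair keeps memory flat and skips the pipeline overhead entirely; a rough sketch (file names taken from the batch attempt above, the rest is an assumption):
# .NET resolves relative paths against the process directory, so build full paths first
$in  = Join-Path $PWD 'testJana.txt'
$out = Join-Path $PWD 'testJana_edited.txt'
$reader = New-Object System.IO.StreamReader($in)
$writer = New-Object System.IO.StreamWriter($out, $false, [System.Text.Encoding]::ASCII)
while ($null -ne ($line = $reader.ReadLine())) {
    $writer.WriteLine($line.TrimStart())   # strip only the leading whitespace
}
$reader.Close()
$writer.Close()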

Powershell: inserting text in the middle of large files (90 MB)

As part of our project we are downloading a huge batch of .eml files from a secure SFTP location. After downloading, we need to add a subtag to each downloaded file, which can be around 90 MB. I tried to add the subtag using a PowerShell script that I saw on another site and pasted below. It works fine for small files of 10 KB to 200 KB, but when I try to use the same script on huge files the script gets stuck. Can anyone please help me get through this?
(Get-Content F:\EmlProcessor\UnZipped\example.eml) |
    Foreach-Object {
        $_   # send the current line to output
        if ($_ -match "x-globalrelay-MsgType: ICECHAT")
        {
            # add lines after the selected pattern
            " X-Autonomy SubTag=GMAIL"
        }
    } | Set-Content F:\EmlProcessor\EmlProcessor\example2.txt
SAMPLE EML FILE
Date: Tue, 3 Oct 2017 07:44:32 +0000 (UTC)
From: XYZ
To: ABC
Message-ID: <1373565887.28221.1507075364517.JavaMail.tomcat@HKLVATAPP075>
Subject: Symphony: 2 users, 4 messages, duration 00:00
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_Part_28220_1999480254.1507075364517"
x-globalrelay-MsgType: GMAIL
x-symphony-StreamType: GMAIL
x-symphony-StreamID: RqN3HnR/ajgZvWOstxzLuH///qKcERyOdA==
x-symphony-ContentStartDateUTC: 1507016636610
x-symphony-ContentStopDateUTC: 1507016672387
x-symphony-FileGeneratedDateUTC: 1507075364516
------=_Part_28220_1999480254.1507075364517
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<!DOCTYPE html><html><body><p><font color=3D"grey">Message ID: Un/pfFrGvvVy=
T6quhMBKjX///qEezwdFdA=3D=3D</font><br>2017-10-03T07:43:56.610Z 0
----
------
-----
</HTML>
As shown in the above sample input file, I must add the text "X-Autonomy SubTag" above or below "x-globalrelay-MsgType".
I tried to add the subtag to a sample file of 90 MB and, as said, it got stuck. My actual requirement is to add it to nearly 2K files by looping through each one; I tried it on one file with the above code but was unsuccessful. I am very new to batch and Windows PowerShell scripting; any quick help is appreciated.
Are you sure it is stuck, or does it just take longer? Your code has to iterate through thousands of lines to find a match.
I did not have a large text file to test with, so I converted a large CSV (60 MB) to txt, and this worked for me pretty fast (10-15 sec).
Note: Since you are new and you realize the power of PowerShell, I am going to be really generous. Most people would expect you to put in some effort yourself, but I have faith that you will at least try to understand what the script is doing, because if you use the scripts you get here directly in your environment without testing, you could end up doing some serious damage. So, at least for the sake of testing, make sure you understand what each line does. I have edited the code to use functions for scalability. I could use multi-threading to speed up the process, but since this is a heavily CPU-oriented operation, I do not think it would do much good.
# Coz functions are the best
Function Insert-SubTag ($Path)
{
    $FileName = $Path | Split-Path -Leaf
    $File = Get-Content -Path $Path
    $Line = $File | Select-String -Pattern "x-globalrelay-MsgType"
    $LineNumber = $Line.LineNumber
    # LineNumber starts from 1 but the array index starts from 0
    $File[$LineNumber - 1] = "$Line
 X-Autonomy SubTag=GMAIL"
    # You can also pass the save folder to this function as a parameter, like $Path
    $SavePath = "F:\EmlProcessor\UnZipped2\$FileName"
    $File | Set-Content -Path $SavePath
}

# If you have the list of files in a text file, use this
$FileList = Get-Content C:\FileList.txt
# If you have a folder and want to iterate through each file, use this instead
$FileList = (Get-ChildItem -Path "F:\EmlProcessor\UnZipped").FullName

Foreach ($FilePath in $FileList)
{
    Insert-SubTag -Path $FilePath
}
Assuming that x-globalrelay-MsgType only appears once in the text file.
Do not forget to consider selecting this as the answer if it works for you.
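If Get-Content on a 90 MB file is still too slow for you, the same insertion can be done with .NET streams so the file is never loaded into memory; a rough sketch (the function name is made up, paths and the single-match assumption come from the question):
Function Insert-SubTagStreamed ($Path, $OutFolder)
{
    $reader = New-Object System.IO.StreamReader($Path)
    $writer = New-Object System.IO.StreamWriter((Join-Path $OutFolder ($Path | Split-Path -Leaf)))
    while ($null -ne ($line = $reader.ReadLine())) {
        $writer.WriteLine($line)
        if ($line -match "x-globalrelay-MsgType") {
            # add the subtag directly below the matched header
            $writer.WriteLine(" X-Autonomy SubTag=GMAIL")
        }
    }
    $reader.Close()
    $writer.Close()
}

Insert-SubTagStreamed -Path "F:\EmlProcessor\UnZipped\example.eml" -OutFolder "F:\EmlProcessor\UnZipped2"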

How to resume reading a file?

I'm trying to find the best and most efficient way to resume reading a file from a given point.
The given file is being written frequently (this is a log file).
This file is rotated on a daily basis.
In the log file I'm looking for the pattern 'slow transaction'. Such lines end with a number in parentheses, and I want the sum of those numbers.
Example of log line:
Jun 24 2015 10:00:00 slow transaction (5)
Jun 24 2015 10:00:06 slow transaction (1)
This is the easy part, which I could do with an awk command to get a total of 6 for the above example.
Now my challenge is that I want to get the values from this file on a regular basis. I have an external system that polls a custom OID using SNMP. When hitting this OID the Linux host runs a couple of basic commands.
I want this SNMP polling event to return only the number of events since the last poll. I don't want the grand total every time, just the total of the newly added lines.
Just to mention that only bash can be used, or basic commands such as awk, sed, tail, etc. No Perl or advanced programming languages.
I hope my description is clear enough. Apologies if this is a duplicate; I did some research before posting but did not find anything that precisely corresponds to my need.
Thank you for any assistance.
In addition to the methods in the comment link, you can also simply use dd and stat: read the logfile size, save it, sleep 300, then check the logfile size again. If the file size has changed, skip over the old information with dd and read only the new information.
Note: you can add a test to handle the case where the logfile is deleted and then restarted with 0 size (e.g. if ((newsize < size)), then read the whole file).
Here is a short example with 5-minute intervals:
#!/bin/bash
lfn=${1:-/path/to/logfile}
size=$(stat -c "%s" "$lfn")    ## save original log size
while :; do
    newsize=$(stat -c "%s" "$lfn")    ## get new log size
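    ## sketch of the note above (not in the original answer): if the file
    ## shrank, it was rotated, so read the whole file as new text
    if ((newsize < size)); then
        newtext=$(cat "$lfn")
        printf "\nnewtext:\n\n%s\n" "$newtext"
        size=$newsize
        sleep 300
        continue
    fi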
    if ((size != newsize)); then    ## if change, use new info
        ## use dd to skip over existing text to new text
        newtext=$(dd if="$lfn" bs="$size" skip=1 2>/dev/null)
        ## process newtext however you need
        printf "\nnewtext:\n\n%s\n" "$newtext"
        size=$((newsize))    ## update size to newsize
    fi
    sleep 300
done

Extract hostnames from Perfmon blg with Powershell

I'm writing a script which will automate the extraction of data from .blg Perfmon logs.
I've worked out the primary Import-Counter commands I will need to get the data out, but I'm trying to parametrise this so that I can do it for each machine in the log file and find out what each hostname is, without having to open the log in Perfmon, which can take 15 minutes or sometimes more and is the reason I'm writing this script.
The script I have does the job, but it still takes a minute to return the data I want, and I wondered if there was a simpler way to do this, as I'm not too familiar with PowerShell.
Here's what I have:
$counters = Import-Counter -Path $log_path$logfile -ListSet * | Select-Object paths -ExpandProperty paths
$svrs = @()
# for each line in the list of counters, extract the name of the server and add it to the array
foreach ($line in $counters) {
    $svrs += $line.split("\")[2]
}
# remove duplicates and sort the list of servers
$sorted_svrs = $svrs | sort -unique
foreach ($svr in $sorted_svrs) {
    Write-Host $svr
}
I'm just printing the names for the moment, but they'll go into an array in the proper script, and then I'll run my Import-Counter block with each of these hosts parametrised in.
Just wondered if there was a better way of doing this?
$sorted_svrs = Import-Counter "$log_path$logfile" -Counter "\\*\physicaldisk(_total)\% disk time" |
    % { $_.countersamples.path.split("\")[2] } |
    sort -Unique
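The same idea spelled out for readability (variable names assumed from the question); it reads samples for a single counter instead of enumerating every counter path with -ListSet *, which is presumably why it comes back so much faster:
$samples = Import-Counter "$log_path$logfile" -Counter "\\*\physicaldisk(_total)\% disk time"
# each sample's path looks like \\HOSTNAME\physicaldisk(_total)\% disk time
$sorted_svrs = $samples.CounterSamples.Path |
    ForEach-Object { $_.Split("\")[2] } |
    Sort-Object -Unique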

Powershell or other Windows method to copy datestamped html file to network share

I'm new to PowerShell, so a serious noob.
But I wanted to see if anyone could help with doing the following.
We have a folder on a server that has reports written to it every night.
The reports are named in the following format:
DiskSpaceReport_26102012.html
and location of C:\Powershell\WebReport\
I would like a PS script to copy one of these files from the folder using a date range of -8 days from the date the script runs; the script would be run as part of a Windows scheduled task or through a SQL Agent job.
So at present there are 8 files in the folder, dating from Friday 26 Oct back to Friday 19 Oct.
I would like the process to run today and copy the file from 8 days back from today's date.
So copy the file named DiskSpaceReport_19102012.html
And this process should repeat weekly on Friday, copying the file from 8 days before.
The copy is to a network share
\\Server01\Powershell\Webreports_Archive
And as I mentioned in the title, I don't mind if this is easier to do via robocopy in a batch file, for example.
I would prefer it via PS though.
The following will do what you want:
$pastdays = -8
$pastdate = [datetime]::Now.AddDays($pastdays)
# format as ddMMyyyy so single-digit days and months keep their leading zeros
$filename = "DiskSpaceReport_" + $pastdate.ToString("ddMMyyyy") + ".html"
Copy-Item -Path "C:\Powershell\WebReport\$($filename)" -Destination "\\Server01\Powershell\Webreports_Archive"
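To run it weekly from Task Scheduler, a SQL Agent job, or a master .bat file, save the snippet above as a .ps1 and call it like this (the script path here is just an example):
powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Powershell\CopyDiskSpaceReport.ps1"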
regards
Jon
