Merging of csv breaks diacritical characters - windows

I'm trying to merge some csv files. I do it on Windows with cmd, like type *.csv >> or with a batch file, containing
echo. > all.csv
for %%a in (*.csv) DO copy /b alle.csv+%%a all.csv
On one computer (Win7 x64) merging is no problem. But on another one (also Win7 x64) all diacritical characters (German: äüöß) are broken; instead of them there are only ´,,´.
The source files to be merged have intact diacritical characters: I open them with Notepad++ and Excel, as ANSI or Unicode, and everything is OK.
How can I adjust the file merging to preserve the diacritical characters?

I believe there are several issues contributing to the unexpected results:
You try to create an empty file with echo. > all.csv, but this actually produces a file containing a SPACE followed by a line break (CR + LF), ANSI-encoded. So you may end up with differently encoded files, which can cause trouble.
To truly create an empty file, use rem/ > all.csv, break > all.csv, type nul > all.csv or copy /Y nul all.csv.
When combining files with copy, it is problematic when the destination file is also one of the source files. When it is the first source file, the data of all other source files are appended to it; when it is not the first source file, an overwrite prompt may appear (unless you specify /Y) and data may be lost. Since you have given *.csv as the source, we do not know which file is enumerated first, so it may or may not be all.csv. To avoid such trouble, delete the destination file before copying (del all.csv) rather than creating an empty file.
Supposing you have Unicode (UTF-16 LE) files, they begin with a two-byte header (byte order mark) 0xFF + 0xFE. When combining such files using copy /B, you end up with several of these headers within the file. To overcome this, use copy /A, but within a Unicode cmd instance started with cmd /U:
cmd /U /C del all.csv ^& copy /A *.csv all.csv
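The size difference described in the first point can be checked directly. This is a minimal sketch, assuming the scratch names probe1.tmp and probe2.tmp (hypothetical) are free to use in the current directory; the %%~zF modifier expands to the file size in bytes:

```batch
@echo off
rem Compare a file made with "echo." against one made with "type nul".
rem probe1.tmp and probe2.tmp are hypothetical scratch names.
echo. > probe1.tmp
type nul > probe2.tmp
rem %%~zF expands to the size of the file in bytes.
for %%F in (probe1.tmp probe2.tmp) do echo %%F: %%~zF bytes
del probe1.tmp probe2.tmp
```

The second file should report 0 bytes; according to the explanation above, the first one should not.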

copy /b *.csv all.txt & ren all.txt all.csv
or
2>nul del all.csv & copy /b *.csv all.csv
The type command can make some changes that could interfere with the process. Better use copy /b (with or without the for), but ensure the file being generated is not present or selected, so that it is not included as a source in the process.
You should also ensure all your files have the same encoding. If some of them are Unicode/UTF-? with a BOM and some are not, then depending on which file is selected first, you could end up with badly formatted data.

Related

Batch combine new elements into subtitle files

I want to take a folder with several (15 in this first case) 'subtitle.srt' files (which, as I'm sure you're aware, are just text files with the extension ".srt" instead of ".txt") and modify each file in turn so that a new subtitle is added at the start of each file.
I want the new subtitle to be:-
0
00:00:00,001 --> 00:00:02,100
"Filename"
So, for example, if the first subtitle file in the folder is called "01. What Lies Beneath.srt" and it looks like this:-
1
00:00:02,120 --> 00:00:03,560
<font color="#FFFF00">Previously on Superman & Lois...</font>
2
00:00:03,560 --> 00:00:06,880
<font color="#00FFFF">I'm stepping down from active duty.</font>
<font color="#00FF00">You're going to be hard to replace.</font>
3
Etc., etc...
then after processing, I want it to look like this:-
0
00:00:00,001 --> 00:00:02,100
01. What Lies Beneath
1
00:00:02,120 --> 00:00:03,560
<font color="#FFFF00">Previously on Superman & Lois...</font>
2
00:00:03,560 --> 00:00:06,880
<font color="#00FFFF">I'm stepping down from active duty.</font>
<font color="#00FF00">You're going to be hard to replace.</font>
3
Etc., etc...
I'm rubbish at batch coding so I tried searching out possible ways to do it but nothing I tried worked!
Below are some attempts I made using different "routines" I found; each successive attempt separated (from last to first) by the PAUSE, EXIT commands:-
for %%a in (*.txt) do type append_ns0 >> %%a.srt
pause
exit
for %%a in (*.txt) do type append_ns0 >> %%a
for %%a in (*.txt) do type "%%~na" >> %%a
for %%a in (*.txt) do type append_spc >> %%a.srt
pause
exit
for %%I in (*.txt) do copy "C:\Users\wbcam\Desktop\G classroom\AddTitle.txt"+"%%~nI"+" "+"%%I" "%%~nI.srt"
pause
exit
for %X in (C:\Users\wbcam\Desktop\G classroom\Add Titles\*.txt) do type C:\Users\wbcam\Desktop\G classroom\AddTitles.txt >> %X
pause
exit
To use the COPY command I had to first rename the files from .srt to .txt (I'd rather NOT have to do that; I'm hoping someone can show me how to work on the .srt files without any intermediate stages). COPY also seemed to add a hex 1A character to the end of the new file and, of course, it couldn't handle the insertion of the Filename (a text string) into the new file, as it would only concatenate files, not strings (if I, eventually, understood its operation correctly, Doh!).
And attempts to use the ECHO or TYPE commands just seemed to overwrite everything in the original file leaving only:-
0
00:00:00,001 --> 00:00:02,100
and bugger all else!
Can anyone help out, please?
@ECHO OFF
SETLOCAL
rem The following setting for the source directory is a name
rem that I use for testing and deliberately includes spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
FOR /f "delims=" %%b IN (
'dir /b /a-d "%sourcedir%\*.srt" '
) DO (
(
ECHO 0
ECHO 00:00:00,001 --^> 00:00:02,100
ECHO "%%~nb"
ECHO.
TYPE "%sourcedir%\%%b"
)>"%sourcedir%\%%~nb.txt"
MOVE "%sourcedir%\%%~nb.txt" "%sourcedir%\%%b" >NUL
)
GOTO :EOF
Always verify against a test directory before applying to real data.
Perform a directory scan, assigning each filename that matches the mask to %%b.
Write the two required lines, a line containing the name part of the filename in quotes and an empty line to the output, then type the contents of the selected file. Note that the > character in the text needs to be escaped by a caret. The output of the echoes is gathered and redirected to a .txt file, and the .txt file is then moved over the original file. The 1 file(s) moved message is suppressed by the >nul.
If you prefer, you could replace the two echo lines that insert the fixed text with type somefilename where somefilename contains the fixed text required.
You could replace the move line for testing with
FC "%sourcedir%\%%~nb.txt" "%sourcedir%\%%b"
which will show the differences.
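The variant mentioned above, with the fixed text kept in a file, might look like the following. It is only a sketch: u:\header.txt is a hypothetical file holding the two fixed lines (and it must end with a line break), and the source directory is the same test name used in the script above.

```batch
@ECHO OFF
SETLOCAL
rem "u:\your files" is the test directory name from the script above.
rem "u:\header.txt" is a hypothetical file containing the two fixed lines:
rem   0
rem   00:00:00,001 --> 00:00:02,100
SET "sourcedir=u:\your files"
SET "header=u:\header.txt"
FOR /f "delims=" %%b IN ('dir /b /a-d "%sourcedir%\*.srt"') DO (
 (
  rem Fixed text first, then the quoted name part, a blank line, the original.
  TYPE "%header%"
  ECHO "%%~nb"
  ECHO.
  TYPE "%sourcedir%\%%b"
 )>"%sourcedir%\%%~nb.txt"
 MOVE /Y "%sourcedir%\%%~nb.txt" "%sourcedir%\%%b" >NUL
)
GOTO :EOF
```

Because the header lives in a file, no caret escaping of the > character is needed.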
copy adds the control-z following an archaic convention that ^Z marked end-of-file for text files.
--- Addition in response to comment
for /? from the prompt generates documentation about the for command - in general, commandname /? generates documentation for any command, although some commands require -? or -h or --h or --help. Utility designers generally follow the same convention and the details depend on the platform for which the software was originally designed.
for /f ... ('command....') do ... interprets the command-output as though it was a file.
The dir command reports the filenames and subdirectory names in a directory. By default, it produces a formatted report including file size and last-update date, but it has numerous options, including /b to produce a "bare" report (names only; no sizes, headers, summary, dates, etc.) and /a-d, which suppresses directory names. The /s option means 'and list subdirectories too'. It can also accept a filemask (or ambiguous filename), which applies equally to directory names. If the filename supplied contains a ?, this means any ONE character, and * means any number of any characters; any other characters are taken literally and will match regardless of case. Hence, *.srt means any number of any characters followed by .srt, which should select all file/directory names that end in .srt. The filemask may be preceded by directoryname\, which means scan this directory, but that directory name may not contain ? or *.
When for /f is supplied with a filename list in bare form, each filename in the list is assigned in turn to the metavariable (%%b in this example) and the do statements are executed using the assigned value in the metavariable. Again, there are many ways of having for /f interpret the value-string that is assigned to the metavariable. There are many, many examples of these on SO.
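To see what the metavariable and its modifiers contain on each pass, a small sketch like this can be run in a test directory (it assumes the current directory holds some .srt files and only echoes, changing nothing):

```batch
@echo off
rem List each .srt file in the current directory together with the parts
rem that the metavariable modifiers extract from it.
rem "delims=" keeps names containing spaces in one piece.
for /f "delims=" %%b in ('dir /b /a-d "*.srt" 2^>nul') do (
  echo full name : %%b
  echo name part : %%~nb
  echo extension : %%~xb
)
```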
In the days of CP/M, filesize was a number of 128-byte blocks. Since files were not often a multiple of 128 bytes long, convention was that a control-z marked the end-of-file. Backwards-compatibility is a big issue in computing as it's wasteful to revise existing systems to cater for a concept-revision. Hence, the control-Z convention is still recognised and observed for text files.
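If the trailing control-Z is unwanted, copy can be told to treat the files as binary. A minimal sketch, assuming two hypothetical files header.txt and episode.srt in the current directory:

```batch
@echo off
rem header.txt and episode.srt are hypothetical file names.
rem /b before the first source applies binary mode to every file that follows,
rem so no Ctrl-Z (0x1A) is appended to combined.srt.
copy /b header.txt + episode.srt combined.srt >nul
rem The combined size should equal the sum of the two source sizes.
for %%F in (header.txt episode.srt combined.srt) do echo %%F: %%~zF bytes
```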

Batch robocopy from text file containing path to folders and files to copy with special characters

I had a batch xcopy script, but got a lot of issues. After a lot of searching and reading I concluded that robocopy was a better tool for the copy.
(The script must run on Windows 10 computers without installing anything else and without internet access)
I'm trying to make a batch script that copies (with robocopy) some local network folders and files to a local custom directory. The aim is to be able to access the files when off the local network.
The path to folders and files are stored inside a txt file (each line = a path)
I want to keep the structure of folder I save locally.
For example the folder X:\Path\to some\local\network\folder\with\some & characters\ will result in C:\PathTolocalFolder\Path\to some\local\network\folder\with\some & characters\ (without the X:\ letter)
Based on many similar questions (but not all at the same time) I have done this :
@echo off
SETLOCAL EnableDelayedExpansion
cls
cd C:
chcp 28591 > nul
for /f "delims=*" %%a in ('type "X:\path with spaces & specials characters\List.txt"') do (
REM echo %%a
REM echo !%%a!
echo %%a to C:\PathTolocalFolder!%%a!
ROBOCOPY "%%~a" "C:\PathTolocalFolder!%%a!" /S /A+:RA /R:1 /W:5
)
It is partially a success, but :
As there are special characters everywhere in paths and files names, I got some issues. Specially with & characters. My double quotes doesn't solve the problem. How could I go better?
For some cases, I want to save some files but not the whole directory where they are. The full path to the file is inside the text file. But as robocopy needs to add a space between folder path and file filter I have do some manipulation. How can I detect and extract the file name when there is one to adapt the robocopy command?
I want to use an exclusion list like I was doing before with xcopy. But robocopy doesn't accept a file in input for exclusions. I tried this to extract the exclusion file:
for /f "usebackq tokens*" %%D in ("C:\path to exclusion file\exclusions.txt") do (
if NOT "!dirs!"=="" (
Set dirs=!dirs! "%%D"
else (
Set dirs ="%%D"
)
)
But I don't really know what I am doing or how to combine it with the first part.
Bonus questions: I'm using the robocopy log file functionality (removed from the script shown here). Is there a way to archive the previous log file (by adding the date to the name, for example) before creating the new one? Is it possible to remove the progress percentages from the log file but display them in the terminal instead? How can I use the "/np" option for the log file but not for the terminal display?
It's hard for me to understand how delayed variables work in batch files and how the different methods of reading a file or variable work.
Any help is welcome :)
Sorry for my bad English skills
thank for having read

Unconcatenating files using jeb's tricky method

EDIT: My essential question (without the specific setting for which I need a solution, as described in my original posting):
BinFile.bin is a file concatenated from binary files and a text file. The included text file consists only of lines beginning with a specific string, e.g. ;;;===,,,
With a batch file:
findstr /v "^;;;===,,," "BinFile.bin" > output.bin
an output bin file is generated in which the text file is completely removed.
How can findstr (or another DOS command) be used to remove not only all lines beginning with the specified string, but also the part of the bin file before the first such line (i.e. the complete binary part preceding the text file)?
>>> My original posting:
jeb invented a method to concatenate files using Windows native tools which can be unconcatenated (in a specific way) using native tools. His solution is just ingenious!
copy /a batchBin.bat + /b myBinaryFile.bin /b combined.bat
with batchBin.bat:
;;;===,,,@echo off
;;;===,,,echo line2
;;;===,,,findstr /v "^;;;===,,," "%~f0" > output.bin
;;;===,,,exit /b
"The key is the findstr command, it outputs all lines not beginning with ;;;===,,,.
And as each of them are standard batch delimiters, they can be prefix any command in a batch file in any combination."
So myBinaryFile.bin can be extracted from combined.bat, using only native tools!
My question:
In jeb's example the combined file is a batch file, because the first file in the copy command is a batch file. Could jeb's tricky method be used for the following task too, where the combined file would be combined.exe, an exe file?
copy /b aBat2ExeFile.exe + /a delimiter.bat + /b myBinaryFile.bin /b combined.exe
where delimiter.bat would be something like this:
;;;===,,,REM
and aBat2ExeFile.exe would be a batch file (aBat2ExeFile.bat) converted to exe, with a tricky use of findstr like in batchBin.bat, but with the result
[...] > output.exe
In aBat2ExeFile.bat, findstr should be used so that all lines of combined.exe before and including the line ';;;===,,,REM' are ignored and output.exe is equal to myBinaryFile.bin again.
I think the concept is correct. But how could this be implemented in aBat2ExeFile.bat?
EDIT: My question can be simplified (the frame described above is not essential):
How could the findstr method used by jeb be adapted to process a binary file in such a way that not only lines starting with ';;;===,,,' but also all lines preceding the first such line are "ignored"?

Convert list of paths to list of filenames

Background
I find myself often copying file paths to the clipboard, which is somewhat cumbersome to do from Windows Explorer.
So I wrote a little .bat file to put into the %APPDATA%\Microsoft\Windows\SendTo\ folder utilising the CLIP executable to copy a list of the selected file paths to the clipboard. This file consists only of a single line:
echo|set /p= "%*" | clip.exe
Which works quite nicely, I can select one or more filenames in Explorer, right-click on them and "Send To" the .bat file, which copies them to the clipboard. Each file path is complete and separated from the others by a space character.
Question
Sometimes, I don't want to copy a list of the full file paths, but would prefer to have a list of just the filenames with their extensions. I know how to do that conversion for single file paths, using the %~nx syntax as described here or here.
I tried different combinations of these but can't seem to find a workable solution for my list of paths. The following code echos the filenames correctly:
for %%F in (%*) do echo %%~nxF
...but how do I combine them to pass through to CLIP? Do I have to do string concatenation? Maybe in a subroutine to be called, or is there a more elegant solution?
The following will put each file name on a separate line within the clipboard:
@(for %%F in (%*) do @echo %%~nxF)|clip
If you prefer, the following will put a space-delimited list of file names on a single line, with quotes around each file name.
@(for %%F in (%*) do @<nul set /p =""%%~nxF" ")|clip
Couldn't you just:
echo|set /p= "%~nx*" | clip.exe

Copying all file contents into new files - Batch

What I need to do is (and I have some idea, just can't get this solidified):
Read all files in a directory (for loop)
In loop, I need to copy the current file in the loop's contents into a new file (x.txt into x2.txt) in the same directory. This can be done with a type >> command.
I'm more confused about the looping part of it. How do I do that?
Edit:
This is my current script:
FOR %%i IN (*)
Do type %%i >> %%i2
This will copy each .txt file to a backup file with extension .txt2:
for %i in (*.txt) do type "%i" > "%i2"
Note that in a batch file you need to double the %:
for %%i in (*.txt) do type "%%i" > "%%i2"
It's not clear from your question if you want all found files dumped into the same resulting text file or if you want one-to-one copies of each.
If you just want one results file, you could also use a variant of the copy command, i.e.:
copy *.txt output.dat
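If one-to-one copies named as in the question (x.txt into x2.txt) are wanted, a sketch along these lines may help. It assumes the script runs in the directory holding the files, and it takes the file list from dir first so that copies created during the loop are not picked up again:

```batch
@echo off
rem Snapshot the list of .txt files before copying, so that files created
rem inside the loop are not processed a second time.
for /f "delims=" %%i in ('dir /b /a-d "*.txt" 2^>nul') do (
  rem %%~ni is the name without extension, %%~xi the extension (.txt).
  copy /y "%%i" "%%~ni2%%~xi" >nul
)
```

Note that running it a second time would also copy the earlier copies (x2.txt into x22.txt), so it is best tried in a test directory first.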

Resources