I have a text file that contains NUL characters in random rows. In each row I want to find the first NUL character and delete the rest of the row from that character onward, as shown below:
Input:
1 2 3 4 20170821NUL20170821NULNULNULNUL 123 NULNULNUL
1 2 3 4 20170821 20170821 6 7 10 123 10 11 13
1 2 3 4 20170821NUL20170821NULNULNULNUL 123 NULNULNUL
1 2 3 4 20170821NUL20170821NULNULNULNUL 123 NULNULNUL
Output:
1 2 3 4 20170821
1 2 3 4 20170821 20170821 6 7 10 123 10 11 13
1 2 3 4 20170821
1 2 3 4 20170821
I have the following script, which reads the text file into a variable and then loops over the data to remove everything from a NUL onward:
sInfile = WScript.Arguments(1)
'Create file system object
Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oFS = oFSO.OpenTextFile(sInfile)
sData = oFS.ReadAll
oFS.Close
Set oFS = Nothing
MsgBox("File Read Completed")
'Remove Rest of the line from NULL
Do While InStr(sData, "\00.*") > 0
sData = Replace(sData, "\00.*", "")
Loop
'Cleanup and end
Set oFS = Nothing
WScript.Quit
The script ran without any errors, but I can't see any changes to the data.
EDIT 1:
Updated code:
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
If (WScript.Arguments.Count > 0) Then
sInfile = WScript.Arguments(0)
Else
WScript.Echo "No filename specified."
WScript.Quit
End If
If (WScript.Arguments.Count > 1) Then
sOutfile = WScript.Arguments(1)
Else
sOutfile = sInfile
End If
'Get the text file from cmd file
sInfile = Wscript.Arguments(1)
' Create file system object
Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oFS = oFSO.OpenTextFile(sInfile)
sData = oFS.ReadAll
oFS.Close
Set oFS = Nothing
' Remove Rest of the line from NULL
Set re = New RegExp
re.Pattern = Chr(0) & ".*"
re.Global = True
sData = re.Replace(sData, "")
Set oOutfile = oFSO.OpenTextFile(sOutfile, ForWriting, True)
oOutfile.Write(sData)
oOutfile.Close
Set oOutfile = Nothing
' Cleanup and end
Set oFS = Nothing
WScript.Quit
Here is the sample input I am giving:
I would like to see the output as below:
But I got the below output:
ਊਊਊਊਊਊਊਊਊਊ
EDIT 2:
I am not familiar with hex editors. Here is a hex dump of the sample input:
FF FE 4A 00 42 00 43 00 09 00 31 00 32 00 33 00 34 00 38 00 36 00 37 00 38
00 09 00 38 00 37 00 09 00 30 00 09 00 30 00 09 00 31 00 32 00 33 00 09 00
32 00 30 00 31 00 37 00 09 00 31 00 32 00 33 00 34 00 09 00 31 00 33 00 34
00 32 00 30 00 09 00 32 00 30 00 31 00 37 00 30 00 38 00 30 00 39 00 09 00
35 00 31 00 30 00 33 00 09 00 09 00 09 00 09 00 33 00 34 00 31 00 34 00 38
00 38 00 09 00 32 00 09 00 32 00 30 00 31 00 37 00 09 00 38 00 09 00 31 00
09 00 37 00 09 00 2D 00 32 00 36 00 34 00 30 00 09 00 2D 00 33 00 39 00 33
00 2E 00 31 00 36 00 31 00 33 00 37 00 35 00 09 00 2D 00 33 00 33 00 32 00
2E 00 34 00 36 00 38 00 35 00 37 00 39 00 09 00 41 00 30 00 31 00 31 00 32
00 35 00 38 00 39 00 2F 00 33 00 34 00 31 00 34 00 38 00 38 00 2F 00 09 00
09 00 09 00 09 00 09 00 09 00 09 00 09 00 32 00 09 00 09 00 09 00 32 00 31
00 37 00 38 00 31 00 09 00 58 00 59 00 5A 00 09 00 58 00 59 00 5A 00 09 00
58 00 59 00 5A 00 09 00 31 00 32 00 33 00 09 00 31 00 32 00 33 00 09 00 2D
00 32 00 36 00 34 00 09 00 58 00 59 00 5A 00 09 00 31 00 09 00 31 00 09 00
31 00 32 00 33 00 09 00 09 00 09 00 32 00 31 00 37 00 38 00 32 00 31 00 0D
00 0A 00 41 00 42 00 43 00 09 00 31 00 32 00 33 00 34 00 38 00 36 00 37 00
And here is the hex dump of the output I got: FF FE 4A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A
You are passing a regex pattern to the Replace() function, which won't work because Replace() only matches literal strings. In fact, you don't need a regex at all.
Here is a non-regex version:
With CreateObject("Scripting.FileSystemObject").OpenTextFile(WScript.Arguments(1), 1, False, 0)
sData = ""
If Not .AtEndOfStream Then sData = .ReadAll
.Close
End With
a = Split(sData, vbCrLf)
For i = 0 To UBound(a)
q = Instr(a(i), Chr(0))
If q > 0 Then a(i) = Mid(a(i), 1, q - 1)
Next
sData = Join(a, vbCrLf)
And here is a regex version:
With CreateObject("Scripting.FileSystemObject").OpenTextFile(WScript.Arguments(1), 1, False, 0)
sData = ""
If Not .AtEndOfStream Then sData = .ReadAll
.Close
End With
With CreateObject("VBScript.RegExp")
.Pattern = "^(.*?)\x00.*$"
.Global = True
.Multiline = True
sData = .Replace(sData, "$1")
End With
The Replace function doesn't do regular expression replacements, and VBScript also doesn't recognize \0 as the character NUL. For the former you need the Replace method of a regular expression object, for the latter you need the Chr function. Also, you don't need a loop, since you read the content of the file as a single string anyway.
However, your file is apparently UTF-16 LE encoded, which means that each character is represented by 2 bytes, one of which is zero for characters in the ASCII range. If you read such a file as an ANSI file, every second byte looks like a NUL, so your replacement removes everything after the first byte of each line. You need to set the 4th parameter of the OpenTextFile method to -1 in order to handle the file as a UTF-16 (commonly called "Unicode") file.
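To make the effect concrete, here is a minimal sketch in Python rather than VBScript, used only as an illustration. The two-line sample text is made up, and latin-1 stands in for "ANSI" as an assumption; the point is only that a UTF-16 LE file read byte by byte has a NUL after every ASCII character.

```python
import re

# Hypothetical two-line sample, encoded the way the hex dump suggests:
# a UTF-16 LE byte order mark (FF FE) followed by two bytes per character.
text = "JBC\t123\r\nABC\t456\r\n"
raw = b"\xff\xfe" + text.encode("utf-16-le")

# Misreading the bytes as a single-byte encoding (latin-1 is an assumed
# stand-in for ANSI) turns every second byte into a NUL character, so
# "NUL to end of line" eats nearly everything.
as_ansi = raw.decode("latin-1")
damaged = re.sub(r"\x00.*", "", as_ansi).encode("latin-1")

# Decoding as UTF-16 first leaves no stray NULs in ordinary text.
as_utf16 = raw.decode("utf-16")

print(damaged.hex(" "))
print("\x00" in as_utf16)
```

The damaged bytes have the same shape as the output in the question: the byte order mark, the first character, and then nothing but line feeds.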
Change this:
Set oFS = oFSO.OpenTextFile(sInfile)
sData = oFS.ReadAll
oFS.Close
Set oFS = Nothing
...
Do While InStr(sData, "\00.*") > 0
sData = Replace(sData, "\00.*", "")
Loop
...
Set oOutfile = oFSO.OpenTextFile(sOutfile, ForWriting, True)
oOutfile.Write(sData)
oOutfile.Close
Set oOutfile = Nothing
into this:
sData = oFSO.OpenTextFile(sInfile, 1, False, -1).ReadAll
Set re = New RegExp
re.Pattern = Chr(0) & "[^\r\n]*"
re.Global = True
sData = re.Replace(sData, "")
oFSO.OpenTextFile(sOutfile, 2, True, -1).Write sData
and the problem will disappear.
The pattern [^\r\n]* (any number of characters that are neither carriage-return nor line-feed) is used to keep Windows line breaks intact. Those consist of the two characters carriage-return and line-feed (CR-LF). The regular expression meta-character . does not match line-feeds, but it does match carriage-return, so those would be removed when using the pattern .*.
For clarity: the above code will remove a NUL character and the remainder of the line from each line containing a NUL character. Lines not containing NUL characters will not be affected.
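The difference between the two patterns can be checked with a small sketch. It uses Python's re module as a stand-in for the VBScript RegExp object (an assumption made for illustration; in both engines the dot matches a carriage return but not a line feed by default), and the sample rows are shortened versions of the ones in the question.

```python
import re

NUL = "\x00"
# Two CR-LF terminated rows; only the first one contains NUL characters.
data = (
    f"1 2 3 4 20170821{NUL}20170821{NUL}{NUL} 123 {NUL}\r\n"
    "1 2 3 4 20170821 20170821 6 7 10 123 10 11 13\r\n"
)

# Stops at the first CR or LF, so the line break survives intact.
keeps_crlf = re.sub(r"\x00[^\r\n]*", "", data)

# The dot also matches CR, so the first row ends in a bare LF afterwards.
loses_cr = re.sub(r"\x00.*", "", data)

print(repr(keeps_crlf))
print(repr(loses_cr))
```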
If you want the entire text after a NUL character removed (including subsequent lines) you could do it like this:
Set re = New RegExp
re.Pattern = Chr(0) & "[\s\S]*"
sData = re.Replace(sData, "")
I am working with a neural network to classify images.
I have some files generated by a CytoVision Platform. I would like to use the images in those files, but I need to extract them somehow.
These .slide files contain several images of apparently 16 KB each.
I have developed a program in C, currently running on Linux, that extracts each 16 KB chunk into its own file. I would still need to build a header in order to use those images.
I don't know what format they are in.
If I look at the entire file as a bitmap with FileAlyzer I can see this:
File as a bitmap
This link should allow anyone to download an example file:
https://ufile.io/2ibdq
This appears to be one image header:
42 4D 31 00 00 00 00 00 40 8F 40 05 00 9E 5F 98 D7 47 60 A1 40 01 04 4D 65 74 31 00 00 00 00 00 40 8F 40 05 00 64 31 2E 29 B5 46 DC 40 01 04 4D 65 74 32 00 00 00 00 00 40 8F 40 05 00 87 7D 26 70 88 C0 C5 40 01 04 4D 65 74 33 00 00 00 00 00 40 8F 40 05 00 C8 97 53 05 BB 0D 0F 41 01 04 54 65 78 31 00 00 00 00 00 00 D0 40 05 00 00 00 00 00 00 40 5C 40 07 04 54 65 78 32 00 00 00 00 00 00 D0 40 05 00 00 00 00 00 00 00 44 40 07 04 54 65 78 33 00 00 00 00 00 00 D0 40 05 00 00 00 00 00 00 90 76 40 07 04 54 65 78 34 00 00 00 00 00 00 D0 40 05 00 00 00 00 00 00 F4 CD 40 07 0A 43 68 72 6F 6D 73 41 72 65 61 00 00 00 00 00 4C BD 40 05 00 F3 76 84 D3 82 85 74 40 07 08 42 6F 75 6E 64 61 72 79 00 00 00 00 00 88 B3 40 05 00 D9 CE F7 53 E3 AD 7E 40 07 04 41 72 65 61 00 00 00 00 00 88 B3 40 05 00 20 EF 55 2B 13 0B 85 40 07 07 4F 62 6A 65 63 74 73 00 00 00 00 00 00 69 40 05 00 00 00 00 00 00 00 18 40 03 04 43 69 72 63 00 00 00 00 00 40 8F 40 05 00 9D E5 51 0E 5C 34 65 40 03 03 42 47 52 00 00 00 00 00 40 8F 40 05 00 7D 0C CE C7 E0 AC 86 40 03 04 54 65 78 35 00 00 00 00 00 00 D0 40 05 00 00 00 00 00 00 00 53 40 07 04 41 52 41 54 00 00 00 00 00 40 8F 40 05 00 86 89 F7 23 A7 79 7E 40 07 05 43 6C 61 73 73 00 00 00 00 00 00 F0 3F 05 00 00 00 00 00 00 00 F0 BF 00 01 00 00 00 01 00 00 00
In Notepad++ the same bytes look like this:
BM1 #? ??G`?Met1 #? d1.)??Met2 #? ?&p?bMet3 #? ?S?ATex1 ? #\#Tex2 ? D#Tex3 ? ?#Tex4 ? ??
ChromsArea L? ???t#Boundary ?# ???~#Area ?# ?bObjects i# #Circ #? ?Q\4e#BGR #? }??#Tex5 ? S#ARAT #? Ð??~#Class ?? ?? #
Hope someone can give me an idea about the format of the images and what info I can extract from the header.
I took sample_app from the smppcxx library and changed the settings to:
const std::string ipaddr = "194.228.174.1";
const Smpp::Uint16 port = 9111;
const Smpp::SystemId sysid("MaxiTipSMPP");
const Smpp::Password pass(<actual_password>);
const Smpp::SystemType systype("");
const Smpp::Uint8 infver = 0x34;
const Smpp::ServiceType servtype("");
const Smpp::Address srcaddr("234567");
const Smpp::Address dstaddr("420606752839");
const std::string msgtext = "Hello smsc";
The code called is:
Socket sd;
sd.connect(ipaddr.c_str(), port);
send_bind(sd);
read_bind_resp(sd);
//send_enquire_link(sd);
//read_enquire_link_resp(sd);
send_submit_sm(sd);
read_submit_sm_resp(sd);
Smpp::Uint32 seqnum = read_deliver_sm(sd);
send_deliver_sm_resp(sd, seqnum);
//send_data_sm(sd);
//read_data_sm_resp(sd);
//seqnum = read_deliver_sm(sd);
//send_deliver_sm_resp(sd, seqnum);
send_unbind(sd);
read_unbind_resp(sd);
and the problem happens in read_submit_sm_resp(sd) (or in read_enquire_link_resp(sd) if uncommented):
Buffer buf;
buf = read_smpp_pdu(sd, buf);
std::cout << "\nRead a submit sm resp\n";
Smpp::hex_dump(&buf[0], buf.size(), std::cout);
Smpp::SubmitSmResp pdu;
std::cout << "read_submit_sm_resp buf.size() is " << buf.size() << std::endl;
pdu.decode(&buf[0]);
std::string sid = pdu.message_id();
printf("response message_id: \"%s\"\n", sid.c_str());
on the line
pdu.decode(&buf[0]);
Why does the application crash there? I expected the code to work as is, but it doesn't.
Here is the output:
Sending a bind transceiver
00 00 00 2a 00 00 00 09 00 00 00 00 00 00 00 01 ...*............
4d 61 78 69 54 69 70 53 4d 50 50 00 MaxiTipSMPP.password
Read a bind response
00 00 00 15 80 00 00 09 00 00 00 00 00 00 00 01 ................
53 4d 53 43 00 SMSC.
read_bind_resp buf.size() is 21
response system_id: "SMSC"
Sending a submit sm
00 00 00 3d 00 00 00 04 00 00 00 00 00 00 00 01 ...=............
00 00 00 32 33 34 35 36 37 00 01 01 34 32 30 36 ...234567...4206
30 36 37 35 32 38 33 39 00 00 00 00 00 00 01 00 06752839........
00 00 0a 48 65 6c 6c 6f 20 73 6d 73 63 ...Hello smsc
Read a submit sm resp
00 00 00 a4 00 00 00 05 00 00 00 00 00 00 00 01 ................
00 01 01 39 39 39 30 33 30 00 01 01 34 32 30 36 ...999030...4206
30 36 37 35 32 38 33 39 00 04 00 00 00 00 00 00 06752839........
00 00 47 69 64 3a 66 62 32 37 37 66 62 34 33 66 ..Gid:fb277fb43f
63 31 34 36 66 30 39 61 39 31 37 37 32 63 37 63 c146f09a91772c7c
31 33 64 65 35 62 20 64 6f 6e 65 20 64 61 74 65 13de5b done date
3a 31 37 30 32 30 36 30 35 30 37 30 34 20 73 74 :170206050704 st
61 74 3a 55 4e 44 45 4c 49 56 00 1e 00 21 66 62 at:UNDELIV...!fb
32 37 37 66 62 34 33 66 63 31 34 36 66 30 39 61 277fb43fc146f09a
39 31 37 37 32 63 37 63 31 33 64 65 35 62 00 04 91772c7c13de5b..
27 00 01 05 '...
read_submit_sm_resp buf.size() is 164
SMPP error: Invalid command_length
I added a debug print: it reports a size of 164, and I can see 164 bytes in the dump. For the bind response, which works without problems, the size is 21 and I see 21 bytes. Should I fix the decode function somehow?
Smpp::SubmitSmResp::decode(const Smpp::Uint8* buff)
{
Response::decode(buff);
Smpp::Uint32 len = Response::command_length();
Smpp::Uint32 offset = 16;
const char* err = "Bad length in submit_sm_resp";
if(len < offset)
throw Error(err);
const Smpp::Char* sptr = reinterpret_cast<const Smpp::Char*>(buff);
message_id_ = &sptr[offset];
offset += message_id_.length() + 1;
if(len < offset)
throw Error(err);
Header::decode_tlvs(buff + offset, len - offset);
}
I still think the library should work as is, so maybe I just need to change some setting. Has anyone had the same problem? Any idea what to do? All I want is to send SMS messages, about 100 a day at most.
I managed to fix it.
1) Keep a single open connection instead of opening a new one for every SMS. The provider banned me because I opened too many connections.
2) The flow isn't send->read->send->read->...; responses have to be read asynchronously, so the client must parse each incoming PDU to find out what type it is.
3) The connection should be kept alive with send_enquire_link.
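Point 2 can be seen in the dump from the question: the PDU that the sample decoded as a submit_sm_resp has command_id 0x00000005, which in SMPP 3.4 is a deliver_sm. Here is a minimal sketch of looking at the header before choosing a decoder. It is written in Python for brevity, not with smppcxx, and the helper names parse_header and classify are made up for illustration.

```python
import struct

# command_id values from the SMPP 3.4 specification (subset).
COMMAND_NAMES = {
    0x00000005: "deliver_sm",
    0x80000004: "submit_sm_resp",
    0x80000006: "unbind_resp",
    0x80000009: "bind_transceiver_resp",
    0x80000015: "enquire_link_resp",
}

def parse_header(pdu):
    """Split the fixed 16-byte SMPP header into its four big-endian fields."""
    if len(pdu) < 16:
        raise ValueError("an SMPP header is 16 bytes long")
    length, command_id, status, sequence = struct.unpack(">IIII", pdu[:16])
    return length, command_id, status, sequence

def classify(pdu):
    """Name the PDU type so the caller can pick the matching decoder."""
    _, command_id, _, _ = parse_header(pdu)
    return COMMAND_NAMES.get(command_id, "unknown")

# First 16 bytes of the PDU that the sample tried to decode as submit_sm_resp.
header = bytes.fromhex("00 00 00 a4 00 00 00 05 00 00 00 00 00 00 00 01")
print(parse_header(header)[0], classify(header))
```

A read loop built this way would hand that PDU to a deliver_sm decoder (and answer it with a deliver_sm_resp) instead of passing it to SubmitSmResp::decode.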
Writing this SMS gateway was well beyond one man-day of work for someone like me who knew nothing about SMPP; it took me about three days, with a lot of overnight work. I mention this because my main problem was my approach: given the time assigned, I assumed there had to be a simple solution that would fit into one man-day.
For an app that has been around for many years and has stored classic Alias records in files, I would like to recreate Alias files that point to the same targets, without having to resolve the Alias first (because the destination may be unavailable at that moment).
Supposedly, the following should accomplish this:
CFDataRef aliasRecord = ... ; // contains the Alias Record data, see below for an example
CFURLRef url = ... ; // initialized with a file URL
CFDataRef bmData = CFURLCreateBookmarkDataFromAliasRecord (NULL, aliasRecord);
CFErrorRef error = NULL;
bool ok = CFURLWriteBookmarkDataToFile (bmData, url, 0, &error);
However, the write function fails, and the error says "The file couldn’t be saved."
If I instead create bookmark data using CreateBookmarkData, the write succeeds.
How do I make this work? I'd try writing an old style Alias file with the data in the resource fork if that wasn't so utterly deprecated.
Here's an example alias record I'd have in the aliasRecord object - I can resolve this using the classic Alias Manager FSResolveAlias function, so I know that it is indeed valid.
00 00 00 00 01 12 00 02 00 01 06 54 54 73 4D 42
50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 CC 31 2F 12 48 2B 00 00 01 A5
F3 9B 03 74 6D 70 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 01 AC 1C 67 D1 FE B7 D0 00 00 00 00 00 00
00 00 FF FF FF FF 00 00 09 20 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 07 70 72 69 76 61 74
65 00 00 10 00 08 00 00 CC 31 12 F2 00 00 00 11
00 08 00 00 D1 FE 9B B0 00 00 00 01 00 04 01 A5
F3 9B 00 02 00 13 54 54 73 4D 42 50 3A 70 72 69
76 61 74 65 3A 00 74 6D 70 00 00 0E 00 08 00 03
00 74 00 6D 00 70 00 0F 00 0E 00 06 00 54 00 54
00 73 00 4D 00 42 00 50 00 12 00 0B 70 72 69 76
61 74 65 2F 74 6D 70 00 00 13 00 01 2F 00 FF FF
00 00
CFURLCreateBookmarkDataFromAliasRecord() doesn't create the bookmark data with the kCFURLBookmarkCreationSuitableForBookmarkFile option required by CFURLWriteBookmarkDataToFile().
CFURLCreateBookmarkDataFromAliasRecord() was intended as a way to convert alias records stored in a program's own data files to bookmarks with no I/O.
Before CFURLWriteBookmarkDataToFile(), Finder Alias files (bookmark files) were created by the Finder. Those files contained an Alias resource (containing known properties that could be obtained from the Alias resource with FSCopyAliasInfo()) and icon resources. Apple needed the bookmark data in the files written by CFURLWriteBookmarkDataToFile() to provide the same properties. The kCFURLBookmarkCreationSuitableForBookmarkFile option enforces that requirement.
If you have an AliasHandle and want to create a new-style Alias file with bookmark data, you'll need to:
(1) resolve the AliasHandle to an FSRef, create a CFURLRef from the FSRef, and then create the bookmark data using the kCFURLBookmarkCreationSuitableForBookmarkFile option,
or
(2) you'll need to resolve the bookmark data created with CFURLCreateBookmarkDataFromAliasRecord(), and then create a new bookmark data using the kCFURLBookmarkCreationSuitableForBookmarkFile option.
However, you've indicated you'd like to handle this without resolving the AliasHandle, so the only solution is to create an old-style Finder Alias file. Although I know you already know how to accomplish that, it's described at How do I create a Finder alias within an application?.
The first time a user resolves/opens that old-style Alias file with the Finder, the Finder will detect the Alias file needs to be updated (i.e., CFURLCreateByResolvingBookmarkData() will return with isStale == true) and the Finder will create a new bookmark to the Alias file's target and re-write the Alias file. CFURLCreateBookmarkDataFromFile() will continue to support old-style Alias files as long as possible for backwards compatibility.
I have just started studying X86 Assembly Language.
My question:
When I use the DOS DEBUG program to look at a memory location, I get slightly different values when examining the same location through two different segment:offset addresses. For example:
Aren't D 40[0]:17 and D 41[0]:7 supposed to give exactly the same output, since segment + offset gives the same address in both cases: 400+17 = 410+7 = 417H?
The results I get (notice they are slightly different):
-D 40:17
0040:0010 00-00 00 1E 00 1E 00 0D 1C .........
0040:0020 44 20 20 39 34 05 34 05-3A 27 39 0A 0D 1C 44 20 D 94.4.:'9...D
0040:0030 20 39 34 05 30 0B 3A 27-31 02 37 08 0D 1C 00 00 94.0.:'1.7.....
0040:0040 93 00 C3 00 00 00 00 00-00 03 50 00 00 10 00 00 ..........P.....
0040:0050 00 18 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
0040:0060 0F 0C 00 D4 03 29 30 00-00 00 00 00 91 DA 10 00 .....)0.........
0040:0070 00 00 00 00 00 00 08 00-14 14 14 14 01 01 01 01 ................
0040:0080 1E 00 3E 00 18 10 00 60-F9 11 0B 00 50 01 00 00 ..>....`....P...
0040:0090 00 00 00 00 00 00 10 .......
-D 41:7
0041:0000 00-00 00 2C 00 2C 00 44 20 ...,.,.D
0041:0010 20 39 34 05 31 02 3A 27-37 08 0D 1C 0D 1C 44 20 94.1.:'7.....D
0041:0020 20 39 34 05 30 0B 3A 27-31 02 37 08 0D 1C 00 00 94.0.:'1.7.....
0041:0030 08 00 C3 00 00 00 00 00-00 03 50 00 00 10 00 00 ..........P.....
0041:0040 00 18 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
0041:0050 0F 0C 00 D4 03 29 30 00-00 00 00 00 1C DB 10 00 .....)0.........
0041:0060 00 00 00 00 00 00 08 00-14 14 14 14 01 01 01 01 ................
0041:0070 1E 00 3E 00 18 10 00 60-F9 11 0B 00 50 01 00 00 ..>....`....P...
0041:0080 00 00 00 00 00 00 10 .......
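You are looking at the BIOS data area, whose contents change over time, since it contains things like the state of the Shift/Ctrl/Alt keys, the read/write positions of the keyboard buffer, and the timer. Both commands do address the same physical memory (417H); the bytes simply changed between the two dumps.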
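The address arithmetic from the question can be checked with a short sketch (Python, used only as a calculator here). The function name physical_address is made up, and the 20-bit mask assumes a classic 8086-style address bus.

```python
def physical_address(segment, offset):
    """Real-mode address: segment shifted left by 4 bits, plus the offset."""
    # The mask models a 20-bit address bus (an assumption for this sketch).
    return ((segment << 4) + offset) & 0xFFFFF

first = physical_address(0x40, 0x17)
second = physical_address(0x41, 0x07)
print(hex(first), hex(second))
```

Both calls yield 0x417, so the two dumps start at the same byte, and any difference comes from the data changing between the two commands.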
I am using git 1.7.2.3 via cygwin on Windows 7 and seeing strange artifacts appearing in some of my source files when switching branches. git status reports everything as unchanged, yet the crazy characters are present. I've confirmed on GitHub that the files are as they should be in the repo.
My Copy:
⼀⼀⼀ 㰀猀甀洀洀愀爀礀㸀ഀഀ
/// Set up method.
⼀⼀⼀ 㰀⼀猀甀洀洀愀爀礀㸀ഀഀ
[SetUp]
瀀甀戀氀椀挀 漀瘀攀爀爀椀搀攀 瘀漀椀搀 匀攀琀甀瀀⠀⤀ഀഀ
{
琀栀椀猀⸀匀挀漀瀀攀 㴀 渀攀眀 吀爀愀渀猀愀挀琀椀漀渀匀挀漀瀀攀⠀⤀㬀ഀഀ
琀栀椀猀⸀琀攀猀琀䤀琀攀洀 㴀 渀攀眀 嘀椀攀眀䐀漀挀甀洀攀渀琀䠀椀猀琀漀爀礀⠀ ഀഀ
625016,
㔀㜀㤀㤀㘀Ⰰ ഀഀ
'T',
㌀㐀㠀㌀㔀㈀㤀Ⰰ ഀഀ
DateTime.Parse("2003-01-08 09:57:04.957"),
㌀Ⰰ ഀഀ
"Invoice (PG-PS) - SUPP(11/16/2008)",
∀䘀䤀一䄀一䌀䔀∀Ⰰ ഀഀ
DateTime.Parse("2008-04-11 11:15:07.770"),
䀀∀尀尀䐀伀匀䬀尀䌀䜀䐀伀䌀匀尀㌀㜀㐀㤀㐀尀㐀㘀 㐀㘀尀戀椀氀猀氀椀瀀开 㠀㘀㐀㠀⸀搀漀挀∀⤀㬀ഀഀ
}
Repo Copy:
/// <summary>
/// Set up method.
/// </summary>
[SetUp]
public override void Setup()
{
this.Scope = new TransactionScope();
this.testItem = new ViewDocumentHistory(
625016,
57996,
'T',
3483529,
DateTime.Parse("2003-01-08 09:57:04.957"),
3,
"Invoice (PG-PS) - SUPP(11/16/2008)",
"FINANCE",
DateTime.Parse("2008-04-11 11:15:07.770"),
@"\\DOSK\CGDOCS\374914\46046\bilslip_1081648.doc");
}
I'm also using a .gitattributes file to ensure line endings are correct since we are developing on Windows.
*.cs eol=crlf text
*.csproj eol=crlf text
*.sln eol=crlf text
*.xml eol=crlf text
The text attribute is something I added to try to fix the problem, because git diff was treating the file as binary when I modified it. It didn't have any effect.
This also occurs on fresh checkouts in 1.7.2.3 but not in 1.6.5.1 (msysgit) as far as I can tell. The caveat is that 1.6 doesn't support .gitattributes, which I need for working on Windows. This seems to be a fairly new bug, and I haven't changed any configuration.
Does anyone have any idea what could be causing this?
edit:
hexdump -C ViewDocumentHistoryTests.cs | sed -n "130,212p"
000008d0 00 20 00 20 00 2f 00 2f 00 2f 00 20 00 3c 00 73 |. . ./././. .<.s|
000008e0 00 75 00 6d 00 6d 00 61 00 72 00 79 00 3e 00 0d |.u.m.m.a.r.y.>..|
000008f0 00 0d 0a 00 20 00 20 00 20 00 20 00 20 00 20 00 |.... . . . . . .|
00000900 20 00 20 00 2f 00 2f 00 2f 00 20 00 53 00 65 00 | . ./././. .S.e.|
00000910 74 00 20 00 75 00 70 00 20 00 6d 00 65 00 74 00 |t. .u.p. .m.e.t.|
00000920 68 00 6f 00 64 00 2e 00 0d 00 0d 0a 00 20 00 20 |h.o.d........ . |
00000930 00 20 00 20 00 20 00 20 00 20 00 20 00 2f 00 2f |. . . . . . ././|
00000940 00 2f 00 20 00 3c 00 2f 00 73 00 75 00 6d 00 6d |./. .<./.s.u.m.m|
00000950 00 61 00 72 00 79 00 3e 00 0d 00 0d 0a 00 20 00 |.a.r.y.>...... .|
00000960 20 00 20 00 20 00 20 00 20 00 20 00 20 00 5b 00 | . . . . . . .[.|
00000970 53 00 65 00 74 00 55 00 70 00 5d 00 0d 00 0d 0a |S.e.t.U.p.].....|
00000980 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |. . . . . . . . |
00000990 00 70 00 75 00 62 00 6c 00 69 00 63 00 20 00 6f |.p.u.b.l.i.c. .o|
000009a0 00 76 00 65 00 72 00 72 00 69 00 64 00 65 00 20 |.v.e.r.r.i.d.e. |
000009b0 00 76 00 6f 00 69 00 64 00 20 00 53 00 65 00 74 |.v.o.i.d. .S.e.t|
000009c0 00 75 00 70 00 28 00 29 00 0d 00 0d 0a 00 20 00 |.u.p.(.)...... .|
000009d0 20 00 20 00 20 00 20 00 20 00 20 00 20 00 7b 00 | . . . . . . .{.|
000009e0 0d 00 0d 0a 00 20 00 20 00 20 00 20 00 20 00 20 |..... . . . . . |
000009f0 00 20 00 20 00 20 00 20 00 20 00 20 00 74 00 68 |. . . . . . .t.h|
00000a00 00 69 00 73 00 2e 00 53 00 63 00 6f 00 70 00 65 |.i.s...S.c.o.p.e|
00000a10 00 20 00 3d 00 20 00 6e 00 65 00 77 00 20 00 54 |. .=. .n.e.w. .T|
00000a20 00 72 00 61 00 6e 00 73 00 61 00 63 00 74 00 69 |.r.a.n.s.a.c.t.i|
00000a30 00 6f 00 6e 00 53 00 63 00 6f 00 70 00 65 00 28 |.o.n.S.c.o.p.e.(|
00000a40 00 29 00 3b 00 0d 00 0d 0a 00 0d 00 0d 0a 00 20 |.).;........... |
00000a50 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |. . . . . . . . |
00000a60 00 20 00 20 00 20 00 74 00 68 00 69 00 73 00 2e |. . . .t.h.i.s..|
00000a70 00 74 00 65 00 73 00 74 00 49 00 74 00 65 00 6d |.t.e.s.t.I.t.e.m|
00000a80 00 20 00 3d 00 20 00 6e 00 65 00 77 00 20 00 56 |. .=. .n.e.w. .V|
00000a90 00 69 00 65 00 77 00 44 00 6f 00 63 00 75 00 6d |.i.e.w.D.o.c.u.m|
00000aa0 00 65 00 6e 00 74 00 48 00 69 00 73 00 74 00 6f |.e.n.t.H.i.s.t.o|
00000ab0 00 72 00 79 00 28 00 20 00 0d 00 0d 0a 00 20 00 |.r.y.(. ...... .|
00000ac0 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 | . . . . . . . .|
00000ad0 20 00 20 00 20 00 20 00 20 00 20 00 20 00 36 00 | . . . . . . .6.|
00000ae0 32 00 35 00 30 00 31 00 36 00 2c 00 20 00 0d 00 |2.5.0.1.6.,. ...|
00000af0 0d 0a 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |... . . . . . . |
00000b00 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |. . . . . . . . |
00000b10 00 20 00 35 00 37 00 39 00 39 00 36 00 2c 00 20 |. .5.7.9.9.6.,. |
00000b20 00 0d 00 0d 0a 00 20 00 20 00 20 00 20 00 20 00 |...... . . . . .|
00000b30 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 | . . . . . . . .|
00000b40 20 00 20 00 20 00 27 00 54 00 27 00 2c 00 20 00 | . . .'.T.'.,. .|
00000b50 0d 00 0d 0a 00 20 00 20 00 20 00 20 00 20 00 20 |..... . . . . . |
00000b60 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |. . . . . . . . |
00000b70 00 20 00 20 00 33 00 34 00 38 00 33 00 35 00 32 |. . .3.4.8.3.5.2|
00000b80 00 39 00 2c 00 20 00 0d 00 0d 0a 00 20 00 20 00 |.9.,. ...... . .|
00000b90 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 | . . . . . . . .|
00000ba0 20 00 20 00 20 00 20 00 20 00 20 00 44 00 61 00 | . . . . . .D.a.|
00000bb0 74 00 65 00 54 00 69 00 6d 00 65 00 2e 00 50 00 |t.e.T.i.m.e...P.|
00000bc0 61 00 72 00 73 00 65 00 28 00 22 00 32 00 30 00 |a.r.s.e.(.".2.0.|
00000bd0 30 00 33 00 2d 00 30 00 31 00 2d 00 30 00 38 00 |0.3.-.0.1.-.0.8.|
00000be0 20 00 30 00 39 00 3a 00 35 00 37 00 3a 00 30 00 | .0.9.:.5.7.:.0.|
00000bf0 34 00 2e 00 39 00 35 00 37 00 22 00 29 00 2c 00 |4...9.5.7.".).,.|
00000c00 0d 00 0d 0a 00 20 00 20 00 20 00 20 00 20 00 20 |..... . . . . . |
00000c10 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |. . . . . . . . |
00000c20 00 20 00 20 00 33 00 2c 00 20 00 0d 00 0d 0a 00 |. . .3.,. ......|
00000c30 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 | . . . . . . . .|
*
00000c50 22 00 49 00 6e 00 76 00 6f 00 69 00 63 00 65 00 |".I.n.v.o.i.c.e.|
00000c60 20 00 28 00 50 00 47 00 2d 00 50 00 53 00 29 00 | .(.P.G.-.P.S.).|
00000c70 20 00 2d 00 20 00 53 00 55 00 50 00 50 00 28 00 | .-. .S.U.P.P.(.|
00000c80 31 00 31 00 2f 00 31 00 36 00 2f 00 32 00 30 00 |1.1./.1.6./.2.0.|
00000c90 30 00 38 00 29 00 22 00 2c 00 20 00 0d 00 0d 0a |0.8.).".,. .....|
00000ca0 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |. . . . . . . . |
*
00000cc0 00 22 00 46 00 49 00 4e 00 41 00 4e 00 43 00 45 |.".F.I.N.A.N.C.E|
00000cd0 00 22 00 2c 00 20 00 0d 00 0d 0a 00 20 00 20 00 |.".,. ...... . .|
00000ce0 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 | . . . . . . . .|
00000cf0 20 00 20 00 20 00 20 00 20 00 20 00 44 00 61 00 | . . . . . .D.a.|
00000d00 74 00 65 00 54 00 69 00 6d 00 65 00 2e 00 50 00 |t.e.T.i.m.e...P.|
00000d10 61 00 72 00 73 00 65 00 28 00 22 00 32 00 30 00 |a.r.s.e.(.".2.0.|
00000d20 30 00 38 00 2d 00 30 00 34 00 2d 00 31 00 31 00 |0.8.-.0.4.-.1.1.|
00000d30 20 00 31 00 31 00 3a 00 31 00 35 00 3a 00 30 00 | .1.1.:.1.5.:.0.|
00000d40 37 00 2e 00 37 00 37 00 30 00 22 00 29 00 2c 00 |7...7.7.0.".).,.|
00000d50 20 00 0d 00 0d 0a 00 20 00 20 00 20 00 20 00 20 | ...... . . . . |
00000d60 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 |. . . . . . . . |
00000d70 00 20 00 20 00 20 00 40 00 22 00 5c 00 5c 00 44 |. . . .@.".\.\.D|
00000d80 00 4f 00 53 00 4b 00 5c 00 43 00 47 00 44 00 4f |.O.S.K.\.C.G.D.O|
00000d90 00 43 00 53 00 5c 00 33 00 37 00 34 00 39 00 31 |.C.S.\.3.7.4.9.1|
00000da0 00 34 00 5c 00 34 00 36 00 30 00 34 00 36 00 5c |.4.\.4.6.0.4.6.\|
00000db0 00 62 00 69 00 6c 00 73 00 6c 00 69 00 70 00 5f |.b.i.l.s.l.i.p._|
00000dc0 00 31 00 30 00 38 00 31 00 36 00 34 00 38 00 2e |.1.0.8.1.6.4.8..|
00000dd0 00 64 00 6f 00 63 00 22 00 29 00 3b 00 0d 00 0d |.d.o.c.".).;....|
00000de0 0a 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 |.. . . . . . . .|
00000df0 20 00 7d 00 0d 00 0d 0a 00 0d 00 0d 0a 00 20 00 | .}........... .|
It appears this is some sort of encoding problem.
You're saving your files as UTF-16, the encoding that Windows text editors misleadingly call “Unicode”.
UTF-16 is not ASCII-compatible and so won't work properly with the diff tool used by git. What you're getting is a single-byte change to the input on every newline (presumably due to conversion between LF and Windows CRLF line endings). That puts the two-byte alignment of the UTF-16 code units out by one, so the low byte and high byte of each character end up swapped:
original text: < s u m m a r y >
representation in UTF-16LE: 3C 00 73 00 75 00 6D 00 6D 00 61 00 72 00 79 00 3E 00
accidentally misaligned: 00 3C 00 73 00 75 00 6D 00 6D 00 61 00 72 00 79 00 3E
decoded from misaligned: 㰀 猀 甀 洀 洀 愀 爀 礀 㸀
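The table above can be reproduced with a few lines of Python (a sketch for illustration only; the single extra zero byte stands in for whatever the line-ending conversion inserted before this text).

```python
original = "<summary>"
utf16le = original.encode("utf-16-le")

# Shift the stream by one byte, as a stray extra byte earlier in the file would.
misaligned = b"\x00" + utf16le[:-1]

# Each pair is now (high byte, low byte) of the original character, reversed.
garbled = misaligned.decode("utf-16-le")
print(garbled)
```

Every character ends up with its code point shifted left by eight bits, which is why plain ASCII text turns into CJK-looking characters.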
Save your files in an ASCII-compatible encoding and you'll not have this trouble. Preferably: UTF-8-without-BOM.