Golang: "compress/flate" module can't decompress valid deflate compressed HTTP body - go

This question continues the discussion started here. I found out that the HTTP response body can't be unmarshaled into a JSON object because it is compressed with deflate. Now I wonder how I can perform the decompression in Go. I will appreciate anyone who can show the errors in my code.
Input data
I've dumped the HTTP response body into the 'test' file. Here it is:
$ cat test
x��PAN�0�
;��NtJ�FӮdU�|"oVR�C%�f�����Z.�^Hs�dW뮑�'��DH�S�SFVC����r)G,�����<���z}�x_g�+�2��sl�r/�Oy>��J3\�G�9���N���#[5M�^v/�2Ҕ��|�h��[�~7�_崛<D*���/��i
Let's make sure that this file can be decompressed and even contains valid JSON:
$ zlib-flate -uncompress < test
{"timestamp":{"tv_sec":1428488670,"tv_usec":197041},"string_timestamp":"2015-04-08 10:24:30.197041","monitor_status":"enabled","commands":{"REVERSE_LOOKUP":{"cache":{"outside":{"successes":0,"failures":0,"size":0,"time":0},"internal":{"successes":0,"failures":0,"size":0,"time":0}},"disk":{"outside":{"successes":0,"failures":0,"size":0,"time":0},"internal":{"successes":13366,"failures":0,"size":0,"time":501808}},"total":{"storage":{"successes":0,"failures":0},"proxy":{"successes":13366,"failures":0}}},"clients":{}}}
$ zlib-flate -uncompress < test | python -m json.tool
{
    "commands": {
        "REVERSE_LOOKUP": {
            "cache": {
....
Source code
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io/ioutil"
)

func main() {
	fname := "./test"
	content, err := ioutil.ReadFile(fname)
	if err != nil {
		panic(err)
	}
	fmt.Println("File content:\n", content)
	enflated, err := ioutil.ReadAll(flate.NewReader(bytes.NewReader(content)))
	if err != nil {
		panic(err)
	}
	fmt.Println("Enflated:\n", enflated)
}
Error
$ go run uncompress.go
File content:
[120 156 181 80 65 78 195 48 16 252 10 242 57 69 118 226 166 38 247 156 64 42 42 130 107 100 156 165 88 196 118 149 93 35 160 234 223 89 183 61 112 42 226 192 109 118 118 102 103 180 123 65 62 0 146 13 59 209 237 5 189 15 8 78 116 74 215 70 27 211 174 100 85 184 124 34 111 86 82 171 67 37 144 102 31 183 195 15 167 168 165 90 46 164 94 72 115 165 100 87 235 174 145 215 39 189 168 68 72 209 83 154 7 22 83 70 86 67 180 207 19 140 188 114 41 4 27 71 44 225 155 254 169 223 60 244 195 221 122 125 251 120 95 24 103 221 43 20 144 50 161 31 143 16 179 115 128 8 108 225 114 47 214 79 121 62 15 232 191 224 8 74 51 6 92 213 71 130 57 218 233 175 78 182 142 30 223 254 35 91 53 77 219 94 118 47 165 50 210 148 18 148 232 124 128 31 104 183 151 91 176 126 55 167 143 207 95 3 15 229 180 155 60 68 42 159 231 241 27 47 165 167 25]
panic: flate: corrupt input before offset 5
goroutine 1 [running]:
runtime.panic(0x4a7180, 0x5)
/usr/lib/go/src/pkg/runtime/panic.c:266 +0xb6
main.main()
/home/isaev/side-projects/elliptics-manager/uncompress.go:20 +0x2a3
exit status 2
P.S. Ubuntu 14.10, Go 1.2.1

Your input is not a simple deflated block, it's a zlib stream.
According to the ZLIB Compressed Data Format Specification version 3.3 (RFC 1950), the first 2 bytes are:
-------------
| CMF | FLG |
-------------
The compression method and flags. Your input starts with [120, 156], which is 78 9C in hex. This is the default compression, and no preset dictionary follows, so the subsequent data is the compressed data.
Bits 0 to 3 of CMF are CM, the compression method, and bits 4 to 7 are CINFO, the compression info. In this case CINFO=7 indicates a 32K window size and CM=8 denotes the "deflate" compression method. FLG bit 5 (FDICT) tells whether a preset dictionary follows, which is not the case here. Details of FLG are also in the linked RFC 1950.
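To make that concrete, here is a minimal sketch (not from the original post) that decodes the two header bytes of this particular input:
package main

import "fmt"

func main() {
	cmf, flg := byte(120), byte(156) // 0x78, 0x9C: the first two bytes of the dump

	fmt.Println("CM     =", cmf&0x0f)      // 8: "deflate" compression method
	fmt.Println("CINFO  =", cmf>>4)        // 7: window size 2^(7+8) = 32K
	fmt.Println("FDICT  =", flg&0x20 != 0) // false: no preset dictionary
	fmt.Println("FLEVEL =", flg>>6)        // 2: default compression
}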
So your input basically tells us the rest of the data was compressed with the default compression, but the Go flate package does not decode this zlib header.
Change your decompression to omit the first 2 bytes like this and it will work:
enflated, err := ioutil.ReadAll(flate.NewReader(bytes.NewReader(content[2:])))
Try it on the Go Playground. But...
Use Proper ZLib decompression!
We got lucky this time because no preset dictionary was used; had FDICT been set, the compressed data would not start right after the first 2 bytes and you wouldn't be able to decode it using the flate package. Since the input is a zlib stream, you should use the compress/zlib package to properly decode it and not rely on luck (the zlib reader also verifies the Adler-32 checksum stored at the end of the stream):
r, err := zlib.NewReader(bytes.NewReader(content))
if err != nil {
	panic(err)
}
enflated, err := ioutil.ReadAll(r)
if err != nil {
	panic(err)
}
fmt.Println(string(enflated))
Try the zlib variant on the Go Playground.

Related

Why does Base64 buffer sizing make it larger than the length of the underlying text?

I am trying to encode a byte array as Base64 and running into two issues. I can size the destination buffer with base64.StdEncoding.EncodedLen(len(text)), but I'm worried that's costly, so I wanted to see if I could do it with just len(text). Here is the code (the functions are named "Marshal" because I'm using them as a field converter during JSON marshaling):
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

func main() {
	b := make([]byte, 60)
	_, _ = rand.Read(b)
	// Marshal Create Dst Buffer
	MarshalTextBuffer(b)
	// Marshal Convert to String
	MarshalTextStringWithBufferLen(b)
	// Marshal Convert to String
	MarshalTextStringWithDecodedLen(b)
}

func MarshalTextBuffer(text []byte) error {
	ba := base64.StdEncoding.EncodeToString(text)
	fmt.Println(ba)
	return nil
}

func MarshalTextStringWithBufferLen(text []byte) error {
	ba := make([]byte, len(text)+30) // Why does len(text) not suffice? Temporarily using '30' for now, just so it doesn't overrun.
	base64.StdEncoding.Encode(ba, text)
	fmt.Println(ba)
	return nil
}

func MarshalTextStringWithDecodedLen(text []byte) error {
	ba := make([]byte, base64.StdEncoding.EncodedLen(len(text)))
	base64.StdEncoding.Encode(ba, text)
	fmt.Println(ba)
	return nil
}
Here's the output:
IL5CW8T9WSgwU5Hyi9JsLLkU/EcydY6pG2fgLQJsMaXgxhSh74RTagzr6b9yDeZ8CP4Azc8xqq5/+Cgk
[73 76 53 67 87 56 84 57 87 83 103 119 85 53 72 121 105 57 74 115 76 76 107 85 47 69 99 121 100 89 54 112 71 50 102 103 76 81 74 115 77 97 88 103 120 104 83 104 55 52 82 84 97 103 122 114 54 98 57 121 68 101 90 56 67 80 52 65 122 99 56 120 113 113 53 47 43 67 103 107 0 0 0 0 0 0 0 0 0 0]
[73 76 53 67 87 56 84 57 87 83 103 119 85 53 72 121 105 57 74 115 76 76 107 85 47 69 99 121 100 89 54 112 71 50 102 103 76 81 74 115 77 97 88 103 120 104 83 104 55 52 82 84 97 103 122 114 54 98 57 121 68 101 90 56 67 80 52 65 122 99 56 120 113 113 53 47 43 67 103 107]
Why does the middle one, MarshalTextStringWithBufferLen, require extra padding?
Is base64.StdEncoding.EncodedLen a costly function? (I can solve the sizing with the bottom function, but I worry about the cost.)
Base64 encoding stores binary data (8 bits per byte) as text (6 bits per character), so every 3 bytes are encoded as 4 characters (3×8 = 4×6). So len(text)+30 in your code is wrong; it should be len(text)*4/3 (if len(text) is divisible by 3), but for readability and to avoid bugs you should be using base64.StdEncoding.EncodedLen() to get the length.
If you look at the source of base64.StdEncoding.EncodedLen you will see that it is as fast as doing the calculation yourself (especially as it will be inlined).
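For reference, here is a quick sketch (not from the original post) showing that for the padded StdEncoding the length is simply (n+2)/3*4, so your 60-byte input needs an 80-byte buffer:
package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	n := 60 // length of the input in the question
	fmt.Println(base64.StdEncoding.EncodedLen(n)) // 80
	fmt.Println((n + 2) / 3 * 4)                  // 80: round n up to a multiple of 3, scale by 4/3
}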

When CloudWatch Logs data is sent into a Kinesis data stream, what is its encoding format?

I'm trying to write a Go program to download data from an AWS Kinesis data stream. I read that Kinesis data streams encode the data with base64, so I first need to decode with base64. However, I can't figure out what encoding was used on the data as it is passed from CloudWatch Logs to the Kinesis data stream.
I've tried different decoding methods but none works. My unprocessed byte array downloaded from the Kinesis data stream is the following:
[31 139 8 0 0 0 0 0 0 0 53 206 65 11 130 64 16 134 225 191 178 204 89 130 178 34 246 22 97 30 178 130 12 58 68 196 166 147 14 233 174 236 140 69 68 255 61 204 58 190 204 7 243 188 160 70 102 83 224 254 217 32 104 88 108 55 251 221 54 57 175 163 52 157 199 17 4 224 30 22 125 119 169 92 155 63 140 100 101 226 10 134 0 42 87 196 222 181 13 104 232 43 21 143 166 238 147 219 11 103 158 26 33 103 151 84 9 122 6 125 60 125 119 209 29 173 116 249 2 202 251 185 80 141 44 166 110 64 15 167 227 201 48 28 79 166 225 108 20 6 127 94 7 56 36 234 199 83 63 158 86 139 18 179 27 217 66 149 104 42 41 149 187 170 28 89 200 154 238 179 90 145 69 38 86 252 165 13 224 125 122 127 0 234 141 66 79 242 0 0 0]
Can someone give me some tips on how to process this piece of data?
You can use a subscription filter with Kinesis, Lambda, or Kinesis Data Firehose. Logs that are sent to a receiving service through a subscription filter are base64 encoded and compressed with the gzip format.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html
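Note that your byte slice already starts with 31 139 (hex 1f 8b), the gzip magic bytes, so the base64 layer has evidently already been removed by the time you see the data; all that's left is to gunzip it. A minimal sketch (paste the full record bytes printed above into the slice):
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io/ioutil"
)

// gunzip decompresses one gzip-compressed record.
func gunzip(record []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(record))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return ioutil.ReadAll(zr)
}

func main() {
	record := []byte{31, 139, 8, 0 /* ...the rest of the bytes printed above... */}
	out, err := gunzip(record)
	if err != nil {
		panic(err) // fails on this truncated slice; use the full record
	}
	fmt.Println(string(out)) // the CloudWatch Logs JSON payload
}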

Why doesn't my GraphQL query return the right result when triggered from GO?

I'm new to both Hasura/GraphQL and Go. I've succeeded in writing a couple of Hasura actions which are supported by a Go server which in turn calls the Hasura server to run a query. These both work. For some reason the third one does not, and I cannot see why. I don't get any errors from my Go server; it just doesn't return the same result as running the query directly in Hasura GraphiQL. I would be much appreciative of any help someone can provide.
I've defined an Action as follows -
type Query {
  lookupCostCentreLocalDB (arg1: InputCostCentreAndId!): ValidCostCentreDtl
}
input InputCostCentreAndId {
  costCentre : String!
  altId : String!
}
type ValidCostCentreDtl {
  validCostCentre : String!
}
When I run the following query from the Hasura GraphiQL it works fine
query {GLW_GL_MAP (where: {_and: [{C3_Cost_Centre: {_eq: "8106"}},{idAlt: {_eq: 2}}]}){C3_Cost_Centre}}
and returns a cost centre
The action triggers my Go server, which builds the same query (inserting the parameters) and sends it to the Hasura URL
query lookingUpCCLocalDB_Test_True {
lookupCostCentreLocalDB (arg1: {costCentre: "8106" altId: "2"}) {ValidCostCentreDtl: validCostCentre}
}
but it returns a different result - in this case it does not return a cost centre. The call to Hasura is done as follows
//Prepare the HTTP Request
url := "http://localhost:8080/v1/graphql"
respRqst, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonQueryStr))
if err != nil {
	fmt.Println("Error from http.NewRequest - err =", err)
	panic(err)
}
fmt.Println("jsonQueryStr=", jsonQueryStr)
fmt.Println("respRqst=", respRqst)
respRqst.Header.Set("Content-Type", "application/json")
//Action the HTTP Request
client := http.Client{}
resp, err := client.Do(respRqst)
if err != nil {
	fmt.Println("Error from client.Do - err =", err)
	panic(err)
}
fmt.Println("resp=", resp)
//Close the HTTP Request when this function returns to ensure that it is always closed
defer resp.Body.Close()
if resp.StatusCode == http.StatusOK {
	bodyBytes, err = ioutil.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("Error ioutil.ReadAll(resp.Body)")
		panic(err)
	}
	return bodyBytes, nil
}
The result of the Println statements is below (the first item is a display of the []byte variable passed to the HTTP request, i.e. the query as []byte)
jsonQueryStr= [123 34 113 117 101 114 121 34 58 32 34 113 117 101 114 121 32 123 71 76 87 95 71 76 95 77
65 80 32 40 119 104 101 114 101 58 32 123 95 97 110 100 58 32 91 123 67 51 95 67 111 115 116 95 67 101
110 116 114 101 58 32 123 95 101 113 58 32 34 56 49 48 54 34 125 125 44 123 105 100 65 108 116 58 32 123 95
101 113 58 32 50 125 125 93 125 41 123 67 51 95 67 111 115 116 95 67 101 110 116 114 101 125 125 34 125]
respRqst= &{POST http://localhost:8080/v1/graphql HTTP/1.1 1 1 map[] {{"query": "query {GLW_GL_MAP (where: {_and: [{C3_Cost_Centre: {_eq: "8106"}},{idAlt: {_eq: "2"}}]}){C3_Cost_Centre}}"}} 0xb4b6c0 116 [] false localhost:8080 map[] map[] map[] 0xc0000120e0}
resp= &{200 OK 200 HTTP/1.1 1 1 map[Content-Type:[application/json; charset=utf-8] Date:[Wed, 04 Nov 2020 06:00:34 GMT] Server:[Warp/3.3.10]] 0xc000104340 -1 [chunked] false false map[] 0xc000124100 }
The following is a breakdown of the response
//Extract the data returned into the QueryResult data structure
var queryResult QueryLocalGLWCostCentre
err = json.Unmarshal(bodyBytes, &queryResult)
if err != nil {
	fmt.Println("Unmarshal queryResult failed")
	panic(err)
}
fmt.Println("bodyBytes =:", bodyBytes, "RESULT Length: ", len(queryResult.Data.GLWGLMAP), "queryResult=", queryResult, "queryResult.Data.GLWGLMAP=", queryResult.Data.GLWGLMAP)
The following is what was printed by the statements above
bodyBytes =: [123 34 101 114 114 111 114 115 34 58 91 123 34 101 120 116 101 110 115 105 111 110 115 34
58 123 34 112 97 116 104 34 58 34 36 34 44 34 99 111 100 101 34 58 34 105 110 118 97 108 105 100 45 106
115 111 110 34 125 44 34 109 101 115 115 97 103 101 34 58 34 69 114 114 111 114 32 105 110 32 36 58 32
70 97 105 108 101 100 32 114 101 97 100 105 110 103 58 32 115 97 116 105 115 102 121 46 32 69 120 112
101 99 116 105 110 103 32 39 44 39 32 111 114 32 39 125 39 32 97 116 32 39 56 49 48 54 125 125 44 123
105 100 65 108 116 58 123 95 101 113 58 50 125 125 93 125 41 123 67 51 95 67 111 115 116 95 67 101 110
116 114 101 125 125 125 39 34 125 93 125] RESULT Length: 0 queryResult= {{[]}}
queryResult.Data.GLWGLMAP= []
My Go struct for the response is as follows -
type QueryLookupGLWCostCentre struct {
	Data struct {
		GLWGLMAP []struct {
			C3CostCentre string `json:"C3_Cost_Centre"`
		} `json:"GLW_GL_MAP"`
	} `json:"data"`
}
Can anyone provide any clues on what my problem is?
Regards
I managed to resolve my problem. The response bytes above decode to a Hasura "invalid-json" error ("Expecting ',' or '}' at '8106}}..."): the double quotes inside the hand-built query string were not escaped, so the request body was not valid JSON. (Printing string(bodyBytes) instead of the raw slice makes such errors much easier to spot.) I fixed it by using the following process to build the required query for the Hasura server
//Set up the query template to use
fullQueryOriginalTemplate := `{"query": "query C3MS_query {GLW_GL_MAP (where: {_and: [{C3_Cost_Centre: {_eq: \":1\"}},{id: {_eq: \":2\"}}]}){C3_Cost_Centre}}"}`
//Replace the parameters with the supplied values
fullQueryInterim := strings.Replace(fullQueryOriginalTemplate, ":1", costCentre, -1)
fullQuery := strings.Replace(fullQueryInterim, ":2", strconv.Itoa(id), -1)
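An alternative sketch (buildQuery is a hypothetical helper, not from the original post): let encoding/json build the {"query": "..."} envelope so the embedded double quotes are escaped for you:
package main

import (
	"encoding/json"
	"fmt"
)

// buildQuery wraps a GraphQL document in a JSON envelope; json.Marshal
// escapes the inner double quotes that broke the hand-built request body.
func buildQuery(costCentre string, id int) ([]byte, error) {
	gql := fmt.Sprintf(
		`query {GLW_GL_MAP (where: {_and: [{C3_Cost_Centre: {_eq: %q}},{id: {_eq: %d}}]}){C3_Cost_Centre}}`,
		costCentre, id)
	return json.Marshal(map[string]string{"query": gql})
}

func main() {
	body, err := buildQuery("8106", 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}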

Create a file according to sorted content

I have a list of more than 100,000 records.
For example, if the values from 21 to 84 are continuous, they become 21-84; but if they are not continuous, as in the case of 84 and 87, they need to be written as 84,87, separated by a comma.
At the beginning of each line there will be the value 111111.
The values from the list will be placed in the column range 21 to 80, each followed by a comma.
The length of each row must be at most 80 characters.
Here is the input file:
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
87
85
86
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
108
111
109
112
110
113
115
114
117
116
118
124
125
120
122
123
126
132
127
133
128
130
131
135
136
137
138
139
140
141
142
143
144
145
146
148
147
149
150
151
152
153
154
155
156
158
157
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
184
183
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
214
Here is the desired output file:
111111 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
111111 116,118,124-125,120,122-123,126,132,127,133,128,130-131,
111111 135-146,148,147,149-156,158,157,159-182,184,183,185-212,214,
Thanks in advance.
Presented with only brief comments: check the man pages for the commands used and come back with questions:
awk '
# print the current range: a single value, or "start-last" if it spans more
function printrange() { print start (start == last ? "" : "-" last) }
NR == 1 {start=last=$1; next}      # the first record opens a range
$1 == last+1 {last=$1; next}       # a consecutive value extends the range
{printrange(); start=last=$1}      # a gap: emit the range, start a new one
END {printrange()}                 # emit the final range
' file | paste -sd" " | fold -sw 60 | tr ' ' ',' | sed 's/^/111111 /'
111111 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
111111 116,118,124-125,120,122-123,126,132,127,133,128,130-131,
111111 135-146,148,147,149-156,158,157,159-182,184,183,185-212,214

R compare all list elements for duplicates

I am looking at all possible paths through a graph. I have written a DFS algorithm that finds all these paths. I want to make sure that my algorithm works correctly and that no two paths are identical. My algorithm returns a list that looks as follows:
....
[[2770]]
[1] 1 2 3 52 53 54 55 56 57 58 59 60 12 11 10 9 8 78 79 80 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129
[38] 130 131 132 133 134 137 138 139 140 141 142 143 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166
[[2771]]
[1] 1 2 3 52 53 54 55 56 57 58 59 60 12 11 10 9 8 78 79 80 113 114 115 143 144 145 146 147 148 149 150 151 152 153 154 155 156
[38] 157 158 159 160 161 162 163 164 165 166
[[2772]]
[1] 1 2 3 52 53 54 55 56 57 58 59 60 12 11 10 9 8 78 79 80 113 114 115 143 150 151 152 153 154 155 156 157 158 159 160 161 162
[38] 163 164 165 166
As you can see, the list is 2772 elements long, which means there are 2772 paths through this graph. How can I easily compare all the list elements to make sure there are no duplicates? Just to be clear, the same set of numbers in a different ordering represents a different path and is not a duplicate!
Thank you for your help!
maybe something like
test <- list(1:2, 3:4, 5:7, 1:10, 3:4, 4:3)  # toy stand-in for your list of paths
dups <- duplicated(test)                     # TRUE for each element identical to an earlier one
idups <- seq_along(test)[dups]               # indices of the duplicated elements
Note that duplicated() compares the elements as ordered vectors, so the reversed path 4:3 is not flagged as a duplicate of 3:4, which matches your requirement that ordering matters.
