How do I SQL Query MinIO objects using the MinIO Client (mc)? - minio

I'm getting into MinIO and investigating a few commands.
If I run mc ls alias/bucket I get the expected output:
[2020-12-09 19:48:15 UTC] 10B Account-9.dta
[2020-12-09 19:48:22 UTC] 10B Account-90.dta
[2020-12-09 19:48:22 UTC] 11B Account-92.dta
So, I would expect some kind of output when I execute the following on the same connection:
mc sql --recursive --query "select * from s3object" alias/bucket
but instead it just returns to the prompt with no results. I suspect my "from" clause is wrong, but I have no idea what value to use other than "s3object".
How do I properly perform SQL queries on a local MinIO instance?
MinIO Version:
VERSION
2019-08-14T20:37:41Z
MEMORY
Used: 4.4 MB | Allocated: 3.6 GB | Used-Heap: 4.4 MB | Allocated-Heap: 65 MB
PLATFORM
Host: minio-66c9cd74c9-7m6lx | OS: linux | Arch: amd64
RUNTIME
Version: go1.12.8 | CPUs: 12
MinIO Client Version:
mc version RELEASE.2020-11-25T23-04-07Z

When you run mc sql against multiple objects (for example with --recursive), mc filters the objects by extension.
Objects stored with a .csv extension should be picked up, and similarly for .json files and gzip/bzip2-compressed variants of either. Your objects use a .dta extension, so they are skipped, which is why the query returns nothing.
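As a quick check, you could upload an object with a recognized extension and run the same query again (a sketch; test.csv is a hypothetical object, alias/bucket is from your example):
printf 'id,name\n1,alice\n' > test.csv
mc cp test.csv alias/bucket/test.csv
mc sql --recursive --query "select * from s3object" alias/bucket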

Related

AWS Lambda Chalice Layers Segmentation Fault

I am deploying a Python 3.7 Lambda function via Chalice. Because the code together with its environment requirements is larger than the 50 MB limit, I am using the "automatic_layer" feature of Chalice to generate a layer with the requirements, which is awswrangler.
Because the generated layer is > 50 MB, I upload the generated managed-layer-...-python3.7.zip manually to S3 and create a Lambda layer from it. Then I re-deploy with Chalice, removing the automatic_layer option and setting the layers to the ARN of the layer I created manually.
The function deployed this way worked fine a couple of times, then started failing occasionally with "Segmentation Fault". The error rate increased quickly and it is now failing 100% of the time.
Traceback:
> OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
> START RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc Version: $LATEST
> OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
> END RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc
> REPORT RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc Duration: 7165.04 ms Billed Duration: 7166 ms Memory Size: 128 MB Max Memory Used: 41 MB
> RequestId: 3b98bd4b-6cda-4d21-8090-1a49b17c06fc Error: Runtime exited with error: signal: segmentation fault (core dumped)
> Runtime.ExitError
As awswrangler itself requires boto3 and botocore, which are already present in the Lambda environment, I suspected there might be a conflict between different versions of boto. I tried the same flow while explicitly including boto3 and botocore in the requirements, but I am still getting the same segmentation fault.
Any help is much appreciated.
You could use AWS X-Ray to get more information on the problem: https://docs.aws.amazon.com/lambda/latest/dg/python-tracing.html
You could also analyze the core dump generated by executing your Lambda function's code in a bash shell:
ulimit -c unlimited
cd /tmp
# execute your Python code ...
You should find a file named /tmp/core..... that you can analyze with gdb after downloading it. The command man core may also help.
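For example, once you have downloaded the core file (the core file name and interpreter path below are illustrative):
gdb /usr/bin/python3 /tmp/core.12345
(gdb) bt    # print the backtrace of the thread that crashed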

rsyslog not escaping backslash in JSON

I have a rsyslogd instance running, producing the following JSON from syslog:
{"timegenerated":"2019-01-28T09:24:37.033990+00:00","type":"syslog","host":"REDACTED_HOSTNAME","host-ip":"REDACTED_IP","message":"<190>Jan 28 2019 10:24:35: %ASA-X-XXXXXX: Teardown TCP connection 82257709 for outside:REDACTED_IP\/REDACTED_PORT(LOCAL\ususername) to inside:REDACTED_IP\/REDACTED_PORT duration 0:01:52 bytes XXXX TCP FINs from outside (ususername)"}
This is invalid JSON, because the \u in LOCAL\ususername is interpreted as the start of a Unicode escape sequence. The backslash should have been escaped, giving \\ususername.
I noticed an open issue about this on GitHub (https://github.com/rsyslog/rsyslog/issues/1235), although it mentions another issue that resulted in a merged fix.
Here's some system info:
:~# rsyslogd -version
rsyslogd 8.24.0, compiled with:
PLATFORM: x86_64-pc-linux-gnu
PLATFORM (lsb_release -d):
FEATURE_REGEXP: Yes
GSSAPI Kerberos 5 support: Yes
FEATURE_DEBUG (debug build, slow code): No
32bit Atomic operations supported: Yes
64bit Atomic operations supported: Yes
memory allocator: system default
Runtime Instrumentation (slow code): No
uuid support: Yes
Number of Bits in RainerScript integers: 64
:~# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 9.4 (stretch)
Release: 9.4
Codename: stretch
Template used to create the JSON document is:
template(name="json_syslog"
type="list") {
constant(value="{")
constant(value="\"timegenerated\":\"") property(name="timegenerated" dateFormat="rfc3339")
constant(value="\",\"type\":\"syslograw")
constant(value="\",\"host\":\"") property(name="fromhost")
constant(value="\",\"host-ip\":\"") property(name="fromhost-ip")
constant(value="\",\"message\":\"") property(name="rawmsg" format="jsonr")
constant(value="\"}\n")
Is there any functionality in rsyslog that would allow me to fix this, or does it seem like an upstream bug?
I notice you are using format="jsonr" in the template for the message. There is a difference if you use json instead of jsonr, which the documentation describes (very briefly) as avoiding double escaping of the value. Using a template with
constant(value="\",\n\"json\":\"") property(name="rawmsg" format="json")
constant(value="\",\n\"jsonr\":\"") property(name="rawmsg" format="jsonr")
and providing input containing
LOCAL\ususer "abc"
produces the 2 lines
"json":"LOCAL\\ususer \"abc\",
"jsonr":"LOCAL\ususer \"abc\",
in which the json format has escaped the \u into \\u (tested with rsyslog-8.27.0).
If this is not right for you, you can always manipulate the message yourself, for example by adding the following before your action:
set $.msg2 = replace($rawmsg, "\\u", "\\\\u");
and in your template use
constant(value="\",\"message\":\"") property(name="$.msg2" format="jsonr")
The replace function does a global replace, so you may want to restrict it, for example with
set $.msg2 = replace($rawmsg, "LOCAL\\u", "LOCAL\\\\u");

Handbrake-CLI on Synology NAS

I installed Docker on my Synology NAS (DS415+) and tried to run the handbrake-cli (via this package) over ssh.
However, something seems to be broken. I get the following error message after a simple sudo docker run -d supercoder/docker-handbrake-cli -i ~/_inProgress/input/movie.mkv -o ~/_inProgress/output/test.mp4 (I shortened the error message for readability):
- hb_init: starting libhb thread
- HandBrake 0.10.1 (2015030800) - Linux x86_64 - https://handbrake.fr
- 4 CPUs detected
- Opening /var/services/homes/xxx/_inProgress/input/movie.mkv...
- CPU: Intel(R) Atom(TM) CPU C2538 # 2.40GHz
- Intel microarchitecture Silvermont
- logical processor count: 4
- OpenCL: library not available
- hb_scan: path=/var/services/homes/xxx/_inProgress/input/movie.mkv, title_index=1
- libbluray/bdnav/index_parse.c:162: indx_parse(): error opening /var/services/homes/xxx/_inProgress/input/movie.mkv/BDMV/index.bdmv
- libbluray/bdnav/index_parse.c:162: indx_parse(): error opening /var/services/homes/xxx/_inProgress/input/movie.mkv/BDMV/BACKUP/index.bdmv
- libbluray/bluray.c:2182: nav_get_title_list(/var/services/homes/xxx/_inProgress/input/movie.mkv) failed
- bd: not a bd - trying as a stream/file instead
- libdvdnav: Using dvdnav version 5.0.1
- libdvdread: Encrypted DVD support unavailable.
- libdvdread: Can't stat /var/services/homes/xxx/_inProgress/input/movie.mkv
- No such file or directory
- libdvdread: Could not open /var/services/homes/xxx/_inProgress/input/movie.mkv
- libdvdnav: vm: failed to open/read the DVD
- dvd: not a dvd - trying as a stream/file instead
- hb_stream_open: open /var/services/homes/xxx/_inProgress/input/movie.mkv failed
- scan: unrecognized file type
- libhb: scan thread found 0 valid title(s)
- No title found.
- HandBrake has exited.
I followed this blog post originally and got the same message there.
Executing the same thing on my desktop works without any problems.
Anyone got an idea?
When you run the Docker container, your input and output files do not exist inside the container. You first need to mount the input and output directories of your file system as volumes (as shown in the blog post you shared):
-v ~/_inProgress/output/:/output:rw
-v ~/_inProgress/input/:/input:ro
And then you use those paths in the options:
-i /input/<file>
-o /output/<file>
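Putting it together, the full command would look something like this (the image name and file names are taken from your question):
sudo docker run -d \
  -v ~/_inProgress/input/:/input:ro \
  -v ~/_inProgress/output/:/output:rw \
  supercoder/docker-handbrake-cli \
  -i /input/movie.mkv -o /output/test.mp4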
Good luck!

AWS EC2 User Data Not Decoding Correctly

I am experimenting with creating an EC2 instance to host a Perforce server. My instance is configured with the following user data:
#!/bin/bash
# Add a newline to the ec2-user prompt string
echo PS1=\"\\n\$PS1\" >> /home/ec2-user/.bashrc
# Update all packages
yum update –y
# Install Perforce packages
# The RHEL/7 part of the baseurl should be replaced with
# the latest RHEL version that both Amazon and Perforce support
rpm –import https://package.perforce.com/perforce.pubkey
cd /etc/yum.repos.d/
echo [perforce] > perforce.repo
echo name=Perforce >> perforce.repo
echo baseurl=http://package.perforce.com/yum/rhel/7/x86_64 >> perforce.repo
echo enabled=1 >> perforce.repo
echo gpgcheck=1 >> perforce.repo
yum install –y helix-p4d
# Make directories for the server, owned by new “perforce” user
cd /opt/perforce/servers/
mkdir danware
cd danware
mkdir danware-db danware-chkpts journal
chown –R perforce:perforce danware
I have tested each of the above commands, and know that they work when executed manually in this order. However, some aspect of Amazon's base64 encode/decode system seems to be getting in the way. When I go to "Actions > Instance Settings > View/Change User Data" from the EC2 Console after launching (and passing all system checks), I see the following user data. Note how almost every hyphen "-" has been replaced with some strange "a" character.
However, I'm not sure this is the issue, because the log file at /var/log/cloud-init-output.log gives me the following output (I replaced some repetitive text with [...] to save space). Note the line that says Failed running /var/lib/cloud/instance/scripts/part-001. I have verified that this part-001 file does contain the correctly displayed hyphen characters.
[...]
Cloud-init v. 0.7.6 running 'modules:final' at Fri, 09 Sep 2016 06:23:39 +0000. Up 86.66 seconds.
Loaded plugins: priorities, update-motd, upgrade-helper
No Match for argument: –y
No packages marked for update
RPM version 4.11.2
Copyright (C) 1998-2002 - Red Hat, Inc.
This program may be freely redistributed under the terms of the GNU GPL
Usage: rpm [-aKfgpqVcdLilsiv?] [-a|--all] [-f|--file] [-g|--group] [...]
Loaded plugins: priorities, update-motd, upgrade-helper
Resolving Dependencies
--> Running transaction check
---> [...]
Dependencies Resolved
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
helix-p4d x86_64 2016.1-1429894 perforce 24 k
Installing for dependencies:
helix-cli x86_64 2016.1-1429894 perforce 8.8 k
helix-cli-base x86_64 2016.1-1429894 perforce 1.4 M
helix-p4d-base x86_64 2016.1-1429894 perforce 3.1 k
helix-p4d-base-16.1 x86_64 2016.1-1429894 perforce 2.4 M
helix-p4dctl x86_64 2016.1-1429894 perforce 1.2 M
Transaction Summary
================================================================================
Install 1 Package (+5 Dependent packages)
Total download size: 5.0 M
Installed size: 13 M
Is this ok [y/d/N]: Exiting on user command
Your transaction was saved, rerun it with:
yum load-transaction /tmp/yum_save_tx.2016-09-09.06-23.dRP_r2.yumtx
/var/lib/cloud/instance/scripts/part-001: line 22: cd: /opt/perforce/servers/: No such file or directory
chown: invalid user: ‘–R’
Sep 09 06:23:41 cloud-init[2517]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [1]
Sep 09 06:23:41 cloud-init[2517]: cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
Sep 09 06:23:41 cloud-init[2517]: util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
Cloud-init v. 0.7.6 finished at Fri, 09 Sep 2016 06:23:41 +0000. Datasource DataSourceEc2. Up 88.53 seconds
Even more annoyingly, I assumed that the early No Match for argument: –y line in the log was referring to the yum update -y line in my user data. Sure enough, running the example user data script from the EC2 documentation page, which also uses yum update -y, gives me the same error/warning! Amazon's own example script doesn't work!? So can anyone explain A) why AWS is not displaying the user data correctly, and B) why my user data is producing the errors shown above? The help is much appreciated!
For lines such as
yum update –y
The character you are using is an EN DASH (U+2013), whereas the usual character for a hyphen is HYPHEN-MINUS (U+002D).
Fix your user data source to use hyphen-minus and have another go.
I checked the character codes by cutting and pasting into this online site: http://www.fileformat.info/info/unicode/char/search.htm?q=-&preview=entity
I don't know if you can see the difference, but this is your hyphen:
yum update –y
and this is a "hyphen minus"
yum update -y
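A quick way to find (and fix) any stray en dashes before pasting the script, assuming GNU grep/sed and that userdata.sh is a local copy of your user data:
grep -n -P '[^\x00-\x7F]' userdata.sh   # list lines containing non-ASCII characters
sed -i 's/–/-/g' userdata.sh            # replace literal en dashes with hyphen-minus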

Apache on Win2003 cannot find the path specified

A fresh installation of Apache 2.2 on Win2003.
The configuration validates with the Apache tool, yet when I attempt to access the site the browser displays an internal error.
Apache log shows:
[Mon Jul 16 13:36:38 2012] [error] [client 10.162.9.158] (OS 3)The
system cannot find the path specified. : couldn't spawn child
process: D:/Heatmap/Webapp/public/dispatch.cg
The file system shows:
D:\Heatmap\Webapp\public>dir dispatch*
 Volume in drive D is DATA
 Volume Serial Number is C482-3950

 Directory of D:\Heatmap\Webapp\public

05/02/2012  10:56 AM               445 dispatch.cgi
05/02/2012  10:56 AM               520 dispatch.fcgi
               2 File(s)            965 bytes
               0 Dir(s)   5,625,618,432 bytes free
Since I normally run Apache on Linux servers, I'm stymied as to the root cause here. The system cannot find a path that is clearly present.
Cluestick, please.
The bit of the message that says couldn't spawn child process caught my attention.
Research showed that on Windows the shebang line is actually used by Apache (rather than relying on the OS-level file-extension association with the Perl interpreter), and I needed to correct it in my .cgi.
Specifying the full path to Perl in the CGI's shebang line corrected the problem.
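For example, the first line of dispatch.cgi would change from a generic interpreter reference to the full Windows path of the Perl executable (the path below is only an illustration; use wherever Perl is installed on the server):
#!C:/Perl/bin/perl.exe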
