How to run db migration script in a kubernetes pod from bash?

I would like to run database migration scripts in an Ubuntu pod automatically.
This is how I do it manually:
$ kubectl run -i --tty ubuntu --image=ubuntu:focal -- bash
$ apt install -y postgresql-client
$ psql "hostaddr=addr port=5432 user=username password=pass dbname=dbname"
COPY persons(first_name, last_name, dob, email)
FROM './persons.csv'
DELIMITER ','
CSV HEADER;
$ exit
I would like to create a bash script for this purpose that I can run locally. Could you please advise how to script it? The first command connects to a remote bash session, and after that I am not able to execute the other commands. I am definitely doing something wrong.
Thank you.

Use here-documents. Drop --tty, because stdin is no longer a terminal, and run apt update first, because a fresh ubuntu image has no package lists.
#!/bin/bash
kubectl run -i ubuntu --image=ubuntu:focal -- bash <<EOF
apt update
apt install -y postgresql-client
psql "hostaddr=addr port=5432 user=username password=pass dbname=dbname" <<EOF2
COPY persons(first_name, last_name, dob, email)
FROM './persons.csv'
DELIMITER ','
CSV HEADER;
EOF2
EOF
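As a possible alternative to the nested here-documents, the CSV can be streamed from the local machine into psql running in a throwaway client pod. This is a minimal sketch, not a tested recipe: the pod name pgclient, the function names and the use of the postgres image are assumptions, and it relies on psql's \copy ... FROM pstdin, which reads rows from psql's standard input.

```shell
#!/bin/bash
# Sketch only: pgclient, copy_command and load_csv are hypothetical names.

# Build the psql meta-command. \copy runs on the client side, and
# "FROM pstdin" makes psql read the rows from its own standard input.
copy_command() {
  local table=$1 columns=$2
  printf '\\copy %s(%s) FROM pstdin WITH (FORMAT csv, HEADER)' "$table" "$columns"
}

# Stream a local CSV file into the table through "kubectl run -i" (no --tty).
load_csv() {
  local conn=$1 table=$2 columns=$3 csv=$4
  kubectl run pgclient -i --rm --restart=Never --image=postgres -- \
    psql "$conn" -c "$(copy_command "$table" "$columns")" < "$csv"
}

# Usage (not executed here):
# load_csv "hostaddr=addr port=5432 user=username password=pass dbname=dbname" \
#   persons "first_name, last_name, dob, email" ./persons.csv
```

With this shape the CSV never has to exist inside the pod, which the original COPY ... FROM './persons.csv' would require on the server side.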

Let's assume we have a command that is supposed to be run to execute some SQL query on a postgresql server in a Kubernetes cluster:
export pgcmd="PGPASSWORD=pass1234 psql -U username -d mydatabase -h addr -p port -c \"COPY persons(first_name, last_name, dob, email) FROM './persons.csv' DELIMITER ',' CSV HEADER;\" "
or by using URL syntax
export pgcmd="psql postgresql://username:pass@addr:5432/mydatabase -c \"COPY persons(first_name, last_name, dob, email) FROM './persons.csv' DELIMITER ',' CSV HEADER;\" "
Actually, it is more convenient to use the official postgres Docker image than to install the PostgreSQL client on an Ubuntu image:
(if I use the same image that was used to spin up the PostgreSQL server, I save some time on pulling the image from the repository)
# sh -c lets a shell inside the pod parse the quotes (and any PGPASSWORD=... prefix) stored in pgcmd
kubectl run -it --rm pgclient --image=postgres -- sh -c "$pgcmd"
Alternatively, you can run the command in the PostgreSQL pod itself
kubectl exec -it postgresql-server-pod-name -- sh -c "$pgcmd"
or forward a local port to the PostgreSQL server and execute the command locally
kubectl port-forward postgresql-server-pod-name 8888:5432 &
# or we can use the parent object to connect
# kubectl port-forward deployment/postgresql-server-deploy-name 8888:5432 &
# save the PID of the background process
proxyid=$!
# run the postgres command locally (pgcmd must point at localhost and port 8888 here)
eval "$pgcmd"
# switch off port forwarding and clean up environment variables
unset PGPASSWORD
kill $proxyid && unset proxyid
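Storing a whole command line in one string makes the embedded quotes fragile, because an unquoted expansion only splits on whitespace and never re-parses quotes. A minimal sketch of the same idea using a bash array, so each argument stays intact; the function names are hypothetical, and the connection values are the placeholders from above:

```shell
#!/bin/bash
# Sketch only: placeholder credentials and pod names, bash arrays required.
sql="COPY persons(first_name, last_name, dob, email) FROM './persons.csv' DELIMITER ',' CSV HEADER;"

# One array element per argument; the SQL stays a single argument.
pgcmd=(psql "postgresql://username:pass@addr:5432/mydatabase" -c "$sql")

# Run the command in a throwaway client pod.
run_in_client_pod() {
  kubectl run pgclient -it --rm --image=postgres -- "${pgcmd[@]}"
}

# Run the command in an existing pod, e.g. the server pod itself.
run_in_existing_pod() {
  local pod=$1
  kubectl exec -it "$pod" -- "${pgcmd[@]}"
}
```

Because kubectl receives the arguments already separated, no sh -c wrapper or escaped quotes are needed in this variant.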

Related

Passing shell variable to command executed via kubectl exec

I have a repetitive task that I do while testing which entails connecting to a cassandra pod and running a couple of CQL queries.
Here's the "manual" approach:
On cluster controller node, I exec a shell on the pod using kubectl:
kubectl exec pod/my-app-cassandra-pod-name -it --namespace myns -- /bin/bash
Once in the pod I execute cqlsh:
cqlsh $(hostname -i) -u myuser
and then enter password interactively
I execute my cql queries interactively
Now, I'd like to have a bash script to automate this. My intent is to run cqlsh directly, via kubectl exec.
The problem I have is that apparently I cannot use a shell variable within the "command" section of kubectl exec. And I will need shell variables to store a) the pod's IP, b) an id which is the input to my first query, and c) intermediate query results (the two latter ones are not added to script yet).
Here's what I have so far, using a dummy CQL query for now:
#!/bin/bash
CASS_IP=$(kubectl exec pod/my-app-cassandra-pod-name -it --namespace myns -- /usr/bin/hostname -i)
echo $CASS_IP # This prints out the IP address just fine, say 192.168.79.208
# The below does not work, errors provided below
kubectl exec pod/my-app-cassandra-pod-name -it --namespace myns -- /opt/cassandra/bin/cqlsh $CASS_IP -u myuser -p 'mypass' -e 'SELECT now() FROM system.local;'
# The below works just fine and returns the CQL query output
kubectl exec pod/my-app-cassandra-pod-name -it --namespace myns -- /opt/cassandra/bin/cqlsh 192.168.79.208 -u myuser -p 'mypass' -e 'SELECT now() FROM system.local;'
The output from the above is as follows, where IP is echoed, first exec'd cqlsh breaks, and second succeeds:
192.168.79.208
Warning: Timezone defined and 'pytz' module for timezone conversion not installed. Timestamps will be displayed in UTC timezone.
Traceback (most recent call last):
File "/opt/cassandra/bin/cqlsh.py", line 2357, in <module>
main(*read_options(sys.argv[1:], os.environ))
File "/opt/cassandra/bin/cqlsh.py", line 2326, in main
encoding=options.encoding)
File "/opt/cassandra/bin/cqlsh.py", line 463, in __init__
load_balancing_policy=WhiteListRoundRobinPolicy([self.hostname]),
File "/opt/cassandra/bin/../lib/cassandra-driver-internal-only-3.25.0.zip/cassandra-driver-3.25.0/cassandra/policies.py", line 425, in __init__
File "/opt/cassandra/bin/../lib/cassandra-driver-internal-only-3.25.0.zip/cassandra-driver-3.25.0/cassandra/policies.py", line 426, in <listcomp>
File "/usr/lib64/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
command terminated with exit code 1
Warning: Timezone defined and 'pytz' module for timezone conversion not installed. Timestamps will be displayed in UTC timezone.
system.now()
--------------------------------------
e78e75c0-0d3e-11ed-8825-1de1a1b1c128
(1 rows)
Any ideas how to get around this? I've been researching this for quite a while now, but I'm stuck...
This is a very frequently asked question: kubectl exec, as its name says, uses exec(3) rather than system(3). Even system(3) would not help in your case, because the $ in your kubectl exec command line is interpreted by your local shell, not the pod's shell.
Thankfully, the solution is the same for both problems: create your own system(3) by wrapping the command in an sh -c invocation (or bash -c if you have bash-isms and bash is available inside the pod):
kubectl exec pod/my-app-cassandra-pod-name -it --namespace myns -- sh -c '/opt/cassandra/bin/cqlsh $(hostname -i) -u myuser -p "mypass" -e "SELECT now() FROM system.local;"'
As always, be cognizant of the "outer" versus "inner" quoting, especially if your "mypass" or the -e statement contains shell metacharacters.
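Two related techniques can be sketched here. First, a local value can be handed to the inner sh -c script as a positional parameter instead of being spliced into the quoted string. Second, output captured from kubectl exec -t usually ends in a carriage return, because a TTY translates newlines to CR LF; that may be why the captured $CASS_IP in the question failed to resolve, although the thread does not confirm it. The function names below are hypothetical, and the pod name, namespace and credentials are the placeholders from the question.

```shell
# Sketch only: run_cql and strip_cr are hypothetical helper names.

# Pass the query as "$1" of the inner script, so the local shell never
# has to embed it inside the single-quoted sh -c string.
run_cql() {
  local query=$1
  kubectl exec pod/my-app-cassandra-pod-name --namespace myns -- \
    sh -c '/opt/cassandra/bin/cqlsh "$(hostname -i)" -u myuser -p "mypass" -e "$1"' sh "$query"
}

# Remove carriage returns that a TTY (-t) adds to captured output.
strip_cr() {
  tr -d '\r'
}

# Usage (not executed here):
# run_cql 'SELECT now() FROM system.local;'
# CASS_IP=$(kubectl exec pod/my-app-cassandra-pod-name -it --namespace myns -- hostname -i | strip_cr)
```

Omitting -t when capturing output in a script avoids the carriage return in the first place.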

Unable to issue long sql query to postgres pod in kubernetes via bash script

I am trying to execute a query on a postgres pod in k8s via a bash script, but I cannot get results when I select a large number of columns. Here is my query:
kubectl exec -it postgres-pod-dcd-wvd -- bash -c "psql -U postgres -c \"Select json_build_object('f_name',json_agg(f_name),'l_name',json_agg(l_name),'email',json_agg(email),'date_joined',json_agg(date_joined),'dep_name',json_agg(dep_name),'address',json_agg(address),'zip_code',json_agg(zip_code),'city',json_agg(city), 'country',json_agg(country)) from accounts WHERE last_name='ABC';\""
When I reduce the number of columns selected in the query, I get the results, but if I use all the column names, the query just hangs indefinitely. What could be wrong here?
Update:
I tried using the query as:
kubectl exec -it postgres-pod-dcd-wvd -- bash -c "psql -U postgres -c \"Select last_name,first_name,...(other column names).. row_to_json(accounts) from register_account WHERE last_name='ABC';\""
But this also hangs.
When I try from inside the pod, it works, but I need to execute it via a bash script.
That means it is almost certainly the results pagination: when you run exec -t, it sets up a TTY in the Pod, just as if you were connected interactively, so psql is likely waiting for you to press space or "n" for the next page.
You can disable the pagination with env PAGER=cat psql -c "select ..." or with --pset pager=off, as in psql --pset pager=off -c "Select ..."
Also, there is no need to run bash -c unless your .bashrc sets some variables or otherwise performs work in the Pod. Using exec -- psql should work just fine, all other things being equal. You will need the env command if you want to go with the PAGER=cat approach, because $ ENV=var some_command is shell syntax and thus cannot be fed directly into exec.
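Put together, the advice above might look like the following in a script. This is a minimal sketch: the function name query_pod is hypothetical, and it drops both -it and the bash -c wrapper, since a script has no terminal to attach.

```shell
# Sketch only: query_pod is a hypothetical helper name.
# No -t means no TTY in the Pod; --pset pager=off makes the intent explicit.
query_pod() {
  local pod=$1 sql=$2
  kubectl exec "$pod" -- psql -U postgres --pset pager=off -c "$sql"
}

# Usage (not executed here):
# query_pod postgres-pod-dcd-wvd "SELECT row_to_json(accounts) FROM accounts WHERE last_name='ABC';"
```

Passing the SQL as a single argument also removes one level of escaped quotes compared with the bash -c form.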
Because the resulting columns involve a lot of JSON processing, I think the two queries take different amounts of time to execute.
Maybe you can log in to the pod, execute the queries there, and compare.
kubectl exec -it postgres-pod-dcd-wvd -- bash
Now you are inside the pod and can execute the queries.
# psql -U postgres -c "Select json_build_object('f_name',json_agg(f_name),'l_name',json_agg(l_name),'email',json_agg(email),'date_joined',json_agg(date_joined),'dep_name',json_agg(dep_name),'address',json_agg(address),'zip_code',json_agg(zip_code),'city',json_agg(city), 'country',json_agg(country)) from accounts WHERE last_name='ABC';"
# psql -U postgres -c "Select last_name,first_name,...(other column names).. row_to_json(accounts) from register_account WHERE last_name='ABC';"
Now you will be able to see whether one query takes longer to execute.
Also, the kubectl exec command can be run with a request timeout (--request-timeout=5m) to see whether the query is merely slow.

pg_dump in Docker container

I am trying to run pg_dump in a Docker container via kubectl and save the output to my local machine.
Here's what I have so far:
kubectl exec -it MY_POD_NAME -- pg_dump -h DB_HOST -U USER_NAME SCHEMA_NAME > backup.sql
However this just hangs currently. I am fairly certain it's due to the -- ignoring the >
kubectl exec -it MY_POD_NAME -- pg_dump -h DB_HOST -U USER_NAME SCHEMA_NAME outputs to the console as expected.
Use kubectl port-forward POD_NAME 6000:5432 to forward your pod port (assumed to be exposed on 5432) onto localhost:6000.
Then run pg_dump directly, with localhost as the hostname and 6000 as the port:
$ pg_dump -h localhost -p 6000 -U USER_NAME SCHEMA_NAME > backup.sql
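The port-forward approach can be wrapped in one function that starts the forward, runs pg_dump locally and then stops the forward again. This is a minimal sketch under a few assumptions: the function name is hypothetical, local port 6000 is free, pg_dump is installed locally, and a fixed two-second sleep is a crude stand-in for waiting until the tunnel is ready.

```shell
# Sketch only: dump_via_port_forward is a hypothetical helper name.
dump_via_port_forward() {
  local pod=$1 user=$2 db=$3 out=$4 pf_pid status
  # Start the tunnel in the background and remember its PID.
  kubectl port-forward "$pod" 6000:5432 >/dev/null &
  pf_pid=$!
  sleep 2  # crude wait for the tunnel; an assumption, not a guarantee
  pg_dump -h localhost -p 6000 -U "$user" "$db" > "$out"
  status=$?
  # Stop the tunnel whether or not the dump succeeded.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}

# Usage (not executed here):
# dump_via_port_forward POD_NAME USER_NAME SCHEMA_NAME backup.sql
```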
Managed to solve myself - not the most elegant solution but it works.
First I open a shell on a pod in the cluster which has network access to the RDS instance:
kubectl exec -it psql-xxx-xxx -- sh
Once connected to the shell, run pg_dump to backup the database:
pg_dump -h db.internal.dns -U user schema_name > backup.sql
Once the backup completes, exit the container and copy the file from the pod to my local:
kubectl cp psql-xxx-xxx:/backup.sql ./backup.sql
Will continue searching for a streamlined way to do this.
If the database is indeed running as a docker image in remote kubernetes instance, I was successful with
kubectl exec -it POD_NAME -- pg_dump -h localhost -U DB_USER DB_NAME > backup.sql
Here localhost refers to localhost inside the remote instance. Run directly from a local terminal, this saves the database dump to my local machine, even though the database itself is running in the cloud.
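For use in a script, the same command can be run without -it. This is a sketch with a hypothetical function name; it assumes pg_dump can authenticate without an interactive prompt (for example through a .pgpass file or trust authentication in the pod). Without a TTY the dump is not rewritten with CR LF line endings, and the redirection is still handled by the local shell, so the file lands on the local machine.

```shell
# Sketch only: dump_db is a hypothetical helper name.
# The local shell applies "> $out"; kubectl only sees the pg_dump arguments.
dump_db() {
  local pod=$1 db_user=$2 db_name=$3 out=$4
  kubectl exec "$pod" -- pg_dump -h localhost -U "$db_user" "$db_name" > "$out"
}

# Usage (not executed here):
# dump_db POD_NAME DB_USER DB_NAME backup.sql
```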

Automatically enter only running docker container

In the cloud, I have multiple instances, each running a container with a different random name, e.g.:
CONTAINER ID   IMAGE                         COMMAND                CREATED       STATUS       PORTS                        NAMES
5dc97950d924   aws_beanstalk/my-app:latest   "/bin/sh -c 'python    3 hours ago   Up 3 hours   80/tcp, 5000/tcp, 8080/tcp   jolly_galileo
To enter them, I type:
sudo docker exec -it jolly_galileo /bin/bash
Is there a command or can you write a bash script to automatically execute the exec to enter the correct container?
"the correct container"?
To determine what is the "correct" container, your bash script would still need either the id or the name of that container.
For example, I have a function in my .bashrc:
deb() { docker exec -u git -it "$1" bash; }
That way, I would type:
deb jolly_galileo
(it uses the account git, but you don't have to)
Here's my final solution. It edits the instance's .bashrc if it hasn't been edited yet, prints out docker ps, defines the dock function, and enters the container. A user can then type "exit" if they want to access the raw instances, and "exit" again to quit ssh.
commands:
  bashrc:
    command: if ! grep -Fxq "sudo docker ps" /home/ec2-user/.bashrc; then echo -e "dock() { sudo docker exec -it \$(sudo docker ps -lq) bash; } \nsudo docker ps\ndock" >> /home/ec2-user/.bashrc; fi
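When each instance is expected to run exactly one container, the function can check that assumption before entering. This is a minimal sketch, not the author's code: it calls docker directly rather than through sudo, and it refuses to guess when zero or several containers are running.

```shell
# Sketch only: a stricter variant of the dock function above.
dock() {
  local ids count
  ids=$(docker ps -q)
  # Count non-empty lines, i.e. running container IDs.
  count=$(printf '%s\n' "$ids" | grep -c .)
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one running container, found $count" >&2
    return 1
  fi
  docker exec -it "$ids" bash
}
```

Unlike docker ps -lq, which returns the most recently created container even if it has exited, docker ps -q lists only running containers.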
As VonC indicated, usually you have to make some shell scripting of your own if you find yourself doing something repetitive. I made a tool myself here which works if you have Bash 4+.
Install
wget -qO- https://raw.githubusercontent.com/Pithikos/dockerint/master/docker_autoenter >> ~/.bashrc
Then you can enter a container by simply typing the first letters of the container.
$> docker ps
CONTAINER ID IMAGE ..
807b1e7eab7e ubuntu ..
18e953015fa9 ubuntu ..
19bd96389d54 ubuntu ..
$> 18
root@18e953015fa9:/#
This works by taking advantage of the function command_not_found_handle introduced in Bash 4. If a command is not found, the script will try and see if what you typed is a container and if it is, it will run docker exec <container> bash.

Bash / Docker exec: file redirection from inside a container

I can't figure out how to read content of a file from a Docker container. I want to execute content of a SQL file into my PGSQL container. I tried:
docker exec -it app_pgsql psql --host=127.0.0.1 --username=foo foo < /usr/src/app/migrations/*.sql
My application is mounted in /usr/src/app. But I got an error:
bash: /usr/src/app/migrations/*.sql: No such file or directory
It seems that Bash interprets this path as a host path, not a guest one. Indeed, executing the command in two steps works perfectly:
docker exec -it app_pgsql bash
psql --host=127.0.0.1 --username=foo foo < /usr/src/app/migrations/*.sql
I think that's more a Bash issue than a Docker one, but I'm still stuck! :)
Try and use a shell to execute that command
sh -c 'psql --host=127.0.0.1 --username=foo foo < /usr/src/app/migrations/*.sql'
The full command would be:
docker exec -it app_pgsql sh -c 'psql --host=127.0.0.1 --username=foo foo < /usr/src/app/migrations/*.sql'
Try with sh -c "your long command".
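A wildcard in a redirection target is fragile: bash reports an ambiguous redirect when it matches more than one file, and some sh implementations do not expand it at all. A hedged sketch of a loop that runs each migration file inside the container with psql -f instead; the names migration_loop and apply_migrations are hypothetical, and the connection options are the ones from the question.

```shell
# Sketch only: migration_loop and apply_migrations are hypothetical names.
# The loop runs inside the container; "$1" is the migrations directory there.
migration_loop='for f in "$1"/*.sql; do
  [ -e "$f" ] || continue
  psql --host=127.0.0.1 --username=foo foo -v ON_ERROR_STOP=1 -f "$f" || exit 1
done'

apply_migrations() {
  local container=$1 dir=$2
  docker exec "$container" sh -c "$migration_loop" sh "$dir"
}

# Usage (not executed here):
# apply_migrations app_pgsql /usr/src/app/migrations
```

The glob expands in the container's shell in sorted order, and ON_ERROR_STOP makes psql return a non-zero status on the first failing statement so the loop stops there.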
Also working when piping backup to the mysql command:
cat backup.sql | docker exec -i CONTAINER /usr/bin/mysql -u root --password=root DATABASE
You can use the database client to connect to your container and redirect the database file into it to perform the restore.
Here is an example with MySQL: a container running MySQL and using the host network stack. Because the container uses the host network stack (and provided there is no restriction on your MySQL or other database), you can connect via localhost and run the commands transparently:
mysql -h 127.0.0.1 -u user -pyour_passwd database_name < db_backup.sql
You can do the same with PostgreSQL (Restore a postgres backup file using the command line?):
pg_restore --host 127.0.0.1 --port 5432 --username "postgres" --dbname "mydatabase" --no-password --clean "/home/dinesh/db/mydb.backup"
It seems that "docker exec" does not support input redirection. I will verify this and maybe open an issue with the Docker community on GitHub, if applicable.
