I am playing around with Spark using the supplied spark-ec2 script:
./spark-ec2 \
--key-pair=pems \
--identity-file=/path/pems.pem \
--region=eu-west-1 \
-s 8 \
--instance-type c3.xlarge \
launch my-spark-cluster
After the cluster has fully installed, I ssh into the master node and start pyspark:
$ /root/spark/bin/pyspark --executor-memory 2G
This specifies (at least I think) that each executor (machine) gets 2 GB of memory. But when I browse the web UI at <masternode>:4040, I see that something didn't go right.
When I pass other values I get a similar result:
$ /root/spark/bin/pyspark --executor-memory 1G
The confusing part is that I specified c3.xlarge machines, which have ~7.5 GB of memory, so this shouldn't be a memory shortage. Does anyone have an idea?
The memory shown there is the memory allocated for caching, not the total executor memory.
It is controlled by the spark.storage.memoryFraction setting, whose default value is 0.6.
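As a back-of-the-envelope check (a sketch, assuming pre-1.6 Spark memory semantics): the storage memory shown in the UI is roughly executor memory * spark.storage.memoryFraction * spark.storage.safetyFraction (default 0.9), so 2 GB * 0.6 * 0.9 ≈ 1.1 GB per executor. If you want a larger share for caching, you can raise the fraction at launch:
$ /root/spark/bin/pyspark --executor-memory 2G --conf spark.storage.memoryFraction=0.8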
I used QEMU (qemu-system-aarch64 -M raspi3) to emulate a Raspberry Pi 3 with the kernel from a working image. Everything worked, but there was no networking.
qemu-system-aarch64 \
-kernel ./bootpart/kernel8.img \
-initrd ./bootpart/initrd.img-4.14.0-3-arm64 \
-dtb ./debian_bootpart/bcm2837-rpi-3-b.dtb \
-M raspi3 -m 1024 \
-nographic \
-serial mon:stdio \
-append "rw earlycon=pl011,0x3f201000 console=ttyAMA0 loglevel=8 root=/dev/mmcblk0p3 fsck.repair=yes net.ifnames=0 rootwait memtest=1" \
-drive file=./genpi64lite.img,format=raw,if=sd,id=hd-root \
-no-reboot
I tried to add these options:
-device virtio-blk-device,drive=hd-root \
-netdev user,id=net0,hostfwd=tcp::5555-:22 \
-device virtio-net-device,netdev=net0 \
But I got this error:
qemu-system-aarch64: -device virtio-blk-device,drive=hd-root: No 'virtio-bus' bus found for device 'virtio-blk-device'
Following some forum posts, I used the "virt" machine instead of raspi3 in order to emulate a virtio network device:
qemu-system-aarch64 \
-kernel ./bootpart/kernel8.img \
-initrd ./bootpart/initrd.img-4.14.0-3-arm64 \
-m 2048 \
-M virt \
-cpu cortex-a53 \
-smp 8 \
-nographic \
-serial mon:stdio \
-append "rw root=/dev/vda3 console=ttyAMA0 loglevel=8 rootwait fsck.repair=yes memtest=1" \
-drive file=./genpi64lite.img,format=raw,if=sd,id=hd-root \
-device virtio-blk-device,drive=hd-root \
-netdev user,id=net0,net=192.168.1.1/24,dhcpstart=192.168.1.234 \
-device virtio-net-device,netdev=net0 \
-no-reboot
Nothing is printed and the terminal hangs, which suggests the kernel does not work with the virt machine.
I decided to build my own custom kernel. Could anyone advise which options to build the kernel with so that it works with both QEMU and virtio?
Thanks in advance!
The latest versions of QEMU (5.1.0 and 5.0.1) have USB emulation for the raspi3 machine (qemu-system-aarch64 -M raspi3).
You can emulate networking and get SSH access if you pass -device usb-net,netdev=net0 -netdev user,id=net0,hostfwd=tcp::5555-:22 to QEMU.
I tested this configuration. [Screenshots: the USB network device and the Ethernet interface in the QEMU raspi3 guest]
Here is the full command and the options that I used:
qemu-system-aarch64 -m 1024 -M raspi3 -kernel kernel8.img \
-dtb bcm2710-rpi-3-b-plus.dtb -sd 2020-08-20-raspios-buster-armhf.img \
-append "console=ttyAMA0 root=/dev/mmcblk0p2 rw rootwait rootfstype=ext4" \
-nographic -device usb-net,netdev=net0 -netdev user,id=net0,hostfwd=tcp::5555-:22
The QEMU version used was 5.1.0.
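Once the guest has booted and sshd is running, you should be able to reach it from the host through the forwarded port; a minimal check, assuming the image's default pi user:
ssh -p 5555 pi@localhost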
I had the same problem as user #peterbabic: while I could see the gadget with lsusb, I could not see any net device.
So I tried manually inserting the appropriate module g_ether -- and it said that it could not find the driver.
It was then that I realized that the kernel8.img file I had downloaded and the Raspbian OS image I was booting were different versions, so the kernel could not find its modules because it looked for them in the wrong directory.
On the other hand, the Raspbian OS image had the correct kernel in its first partition (I could see it in /boot). The only problem was getting it out and using it to replace the wrong kernel8.img (I could not find the correct one online -- and anyway, the kernel from the Raspbian image is by definition the matching one).
So, I copied the Raspbian OS image on my Linux box, and mounted it with loop:
# fdisk raspbian.img
- command "p" lists partitions and tells me that P#1 starts at sector 2048
- command "q" exits without changes
# losetup -o $[ 2048 * 512 ] /dev/loop9 raspbian.img # because sectors are 512 bytes
# mkdir /mnt/raspi
# mount /dev/loop9 /mnt/raspi
- now "ls -la /mnt/raspi" shows the content of image partition 1, with kernels
# cp /mnt/raspi/kernel8.img .
# umount /mnt/raspi
# losetup -d /dev/loop9 # destroy loop device
# rmdir /mnt/raspi # remove temporary mount point
# rm raspbian.img
- I no longer need the raspbian.img copy so I delete it.
- now current directory holds "kernel8.img". I can just copy it back.
To be sure, I also modified /boot/cmdline.txt on the Raspbian image (before rebooting with the new kernel) so that it loads the dwc2 and g_ether modules (modules-load=dwc2,g_ether).
On boot, the gadget is now automatically recognized.
Your raspi3 command line has no networking because on a raspi3 the networking is via USB, and QEMU doesn't have a model of the USB controller for that board yet. Adding virtio-related options won't work, because the raspi3 has no PCI and so there's no way to plug in a pci virtio device.
Your command line option with virt looks basically right (at least enough so to boot; you probably want "if=none" rather than "if=sd" and I'm not sure if the network options are quite right, but if those parts are wrong they will result in errors from the guest kernel later rather than total lack of output). So your problem is likely that the kernel config is missing some important items.
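For illustration, a hedged sketch of that if=none wiring (untested; the rest of the command stays as in the question): the drive backend is named with if=none and handed to an explicit virtio-blk-device instead of the built-in SD slot.
-drive file=./genpi64lite.img,format=raw,if=none,id=hd-root \
-device virtio-blk-device,drive=hd-root \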
You can boot a stock Debian kernel on the virt board (instructions here: https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-board/), so one approach to finding the error in your kernel config is to compare your config with the one the Debian kernel has. The upstream kernel source's 'defconfig' should also work. I find that starting with a config that works and cutting it down is faster than building one up from nothing by trying to find all the obscure options that need to be present.
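For example, a sketch of such a comparison (the Debian config path and kernel version are assumptions; adjust them to your system):
# diff your .config against a known-good Debian config, focusing on virtio/PCI
cd linux
scripts/diffconfig /boot/config-4.14.0-3-arm64 .config | grep -i -e virtio -e pci
# for the virt board you would expect at least:
#   CONFIG_VIRTIO=y, CONFIG_VIRTIO_MMIO=y, CONFIG_VIRTIO_BLK=y,
#   CONFIG_VIRTIO_NET=y, CONFIG_PCI_HOST_GENERIC=y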
I've updated the steps needed to get this working with the April 4th 2022 Raspberry Pi OS release:
# wget https://downloads.raspberrypi.org/raspios_lite_arm64/images/raspios_lite_arm64-2022-04-07/2022-04-04-raspios-bullseye-arm64-lite.img.xz
# unxz 2022-04-04-raspios-bullseye-arm64-lite.img.xz
# mkdir boot
# mount -o loop,offset=4194304 2022-04-04-raspios-bullseye-arm64-lite.img boot
# cp boot/bcm2710-rpi-3-b-plus.dtb boot/kernel8.img .
# echo 'pi:$6$6jHfJHU59JxxUfOS$k9natRNnu0AaeS/S9/IeVgSkwkYAjwJfGuYfnwsUoBxlNocOn.5yIdLRdSeHRiw8EWbbfwNSgx9/vUhu0NqF50' > boot/userconf
# umount boot
# qemu-img convert -f raw -O qcow2 2022-04-04-raspios-bullseye-arm64-lite.img 2022-04-04-raspios-bullseye-arm64-lite.qcow2
# qemu-img resize 2022-04-04-raspios-bullseye-arm64-lite.qcow2 4g
Then run QEMU 7.1.0 this way:
# qemu-system-aarch64 -m 1024 -M raspi3b -kernel kernel8.img \
-dtb bcm2710-rpi-3-b-plus.dtb -sd 2022-04-04-raspios-bullseye-arm64-lite.qcow2 \
-append "console=ttyAMA0 root=/dev/mmcblk0p2 rw rootwait rootfstype=ext4" \
-nographic -device usb-net,netdev=net0 -netdev user,id=net0,hostfwd=tcp::5555-:22
Edit your /boot/cmdline.txt file to add modules-load=dwc2,g_ether after rootwait.
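After the edit, the kernel command line might look like this (a sketch assembled from the append string above, not a verbatim copy of the image's file):
console=ttyAMA0 root=/dev/mmcblk0p2 rw rootwait modules-load=dwc2,g_ether rootfstype=ext4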
I'm using Google Dataproc to initialize a Jupyter cluster.
At first I used the "dataproc-initialization-actions" available on GitHub, and it works like a charm.
This is the cluster creation call from the documentation:
gcloud dataproc clusters create my-dataproc-cluster \
--metadata "JUPYTER_PORT=8124" \
--initialization-actions \
gs://dataproc-initialization-actions/jupyter/jupyter.sh \
--bucket my-dataproc-bucket \
--num-workers 2 \
--properties spark:spark.executorEnv.PYTHONHASHSEED=0,spark:spark.yarn.am.memory=1024m \
--worker-machine-type=n1-standard-4 \
--master-machine-type=n1-standard-4
But I want to customize it, so I took the initialization file and saved it to my Google Cloud Storage bucket (under the same project where I'm trying to create the cluster). Then I changed the call to point to my script instead, like this:
gcloud dataproc clusters create my-dataproc-cluster \
--metadata "JUPYTER_PORT=8124" \
--initialization-actions \
gs://myjupyterbucketname/jupyter.sh \
--bucket my-dataproc-bucket \
--num-workers 2 \
--properties spark:spark.executorEnv.PYTHONHASHSEED=0,spark:spark.yarn.am.memory=1024m \
--worker-machine-type=n1-standard-4 \
--master-machine-type=n1-standard-4
But running this I got the following error:
Waiting on operation [projects/myprojectname/regions/global/operations/cf20466c-ccb1-4c0c-aae6-fac0b99c9a35].
Waiting for cluster creation operation...done.
ERROR: (gcloud.dataproc.clusters.create) Operation [projects/myprojectname/regions/global/operations/cf20466c-ccb1-4c0c-aae6-fac0b99c9a35] failed: Multiple Errors:
 - Google Cloud Dataproc Agent reports failure. If logs are available, they can be found in 'gs://myjupyterbucketname/google-cloud-dataproc-metainfo/231e5160-75f3-487c-9cc3-06a5918b77f5/my-dataproc-cluster-m'.
 - Google Cloud Dataproc Agent reports failure. If logs are available, they can be found in 'gs://myjupyterbucketname/google-cloud-dataproc-metainfo/231e5160-75f3-487c-9cc3-06a5918b77f5/my-dataproc-cluster-w-1'.
Well, the files were there, so I don't think it is an access permission problem. The file named "dataproc-initialization-script-0_output" has the following content:
/usr/bin/env: bash: No such file or directory
Any ideas?
Well, I found my answer here.
It turns out the script had Windows line endings instead of Unix line endings.
I made an online conversion using a dos2unix tool and now it runs fine.
With help from #tix I verified that the file was reachable over an SSH connection to the cluster (a successful "gsutil cat gs://myjupyterbucketname/jupyter.sh"),
and that the initialization file was correctly saved locally as "/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0".
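For reference, the fix can also be done locally from the command line (assuming the dos2unix package is installed and the same bucket path as above):
dos2unix jupyter.sh
gsutil cp jupyter.sh gs://myjupyterbucketname/jupyter.sh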
Using the ulimit command, I set the core file size:
ulimit -c unlimited
Then I compiled my C source code with gcc's -g option, which produced a.out.
After running
./a.out
there is a runtime error:
(core dumped)
but no core file was generated (e.g. core.294340).
How do I get a core file to be generated?
First make sure the container will write the cores to an existing location in the container filesystem. The core generation settings are set in the host, not in the container. Example:
echo '/cores/core.%e.%p' | sudo tee /proc/sys/kernel/core_pattern
will generate cores in the folder /cores.
In your Dockerfile, create that folder:
RUN mkdir /cores
You need to specify the core size limit; the ulimit shell command won't work here, because it only affects the current shell. You need to use the docker run option --ulimit, which sets a soft and hard limit. After building the Docker image, run the container with something like:
docker run --ulimit core=-1 --mount source=coredumps_volume,target=/cores ab3ca583c907 ./a.out
where coredumps_volume is a volume you created beforehand, in which the core will persist after the container is terminated, e.g.
docker volume create coredumps_volume
If you want to generate a core dump of an existing process, say using gcore, you need to start the container with --cap-add=SYS_PTRACE to allow a debugger running as root inside the container to attach to the process. (For core dumps on signals, see the other answer)
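A sketch of that gcore flow (the image ID is the one from the example above; gdb being installed in the image and PID 1234 are assumptions):
docker run --cap-add=SYS_PTRACE --ulimit core=-1 --mount source=coredumps_volume,target=/cores ab3ca583c907 ./a.out
# then, from another shell, attach to the running process and snapshot it
docker exec <container-id> gcore -o /cores/snapshot 1234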
I keep forgetting how to do it exactly and keep stumbling upon this question, which provides only marginal help.
All in all it is very simple:
Run the container with the extra params --ulimit core=-1 --privileged to allow core dumps:
docker run -it --rm \
--name something \
--ulimit core=-1 --privileged \
--security-opt seccomp=unconfined \
--entrypoint '/bin/bash' \
$IMAGE
Then, once in the container, set the core dump location and start your failing script:
sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t
myfailingscript.a
Enjoy your stack trace:
cd /tmp; gdb -c `ls -t /tmp | grep core | head -1`
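To actually read the trace, load the core together with the binary that produced it (assuming it was built with -g; the core file name here is just an example of what the pattern above would produce):
gdb /path/to/myfailingscript.a -c /tmp/core-myfailingscript.1234.myhost.1600000000
(gdb) bt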
Well, let's resurrect an ancient thread.
If you're running Docker on Linux, then all of this is controlled by /proc/sys/kernel/core_pattern on your bare metal host. That is, if you cat that file on the bare metal host and inside the container, they'll be the same. Note also that the file is tricky to update; you have to use the tee method from some of the other posts.
echo core | sudo tee /proc/sys/kernel/core_pattern
If you change it on bare metal, it gets changed in your container. So that also means the behavior depends on where you're running your containers.
My containers don't run apport, but my bare metal host did, so I wasn't getting cores. I did the above (I had already solved the ulimit -c thing), and suddenly I got core files in the current directory.
The key to this is understanding that it's your environment, your bare metal, that controls the contents of that file.
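A quick way to convince yourself of this (the ubuntu image is an arbitrary choice):
cat /proc/sys/kernel/core_pattern
docker run --rm ubuntu cat /proc/sys/kernel/core_pattern
# both commands should print the same pattern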
How do I cache a particular directory in RHEL/CentOS? Suppose I have a directory containing 10 GB of data and 48 GB of RAM. How can I cache all the data inside that directory (only that specific directory) in memory, either for a specific amount of time or indefinitely?
There's a standard memory-backed filesystem on every Linux system: /dev/shm.
When you run the mount command you will see:
tmpfs on /dev/shm type tmpfs (rw)
Generally it's about half the size of the system's memory, so if you have 48 GB of RAM, its size will be about 24 GB (you can check this by running df -h).
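If half of RAM is too small for your data set, tmpfs can be resized on the fly; a sketch, assuming root and that 40G suits your needs:
mount -o remount,size=40G /dev/shm
df -h /dev/shm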
You can use /dev/shm as if it were a normal hard drive; for example, you can copy a directory to it:
cp -r "YOUR DIRECTORY" /dev/shm/
That will do the trick.
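A variant of the same idea, if you'd rather not share /dev/shm: create a dedicated tmpfs sized to the data and copy the directory in (the mount point and size here are illustrative):
mkdir -p /mnt/ramcache
mount -t tmpfs -o size=12g tmpfs /mnt/ramcache
cp -r "YOUR DIRECTORY" /mnt/ramcache/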
Using a bootstrap action, I move some source files to the master node. While creating the jobflow through the elastic-mapreduce client, I pass a Pig script that launches embedded Python from the source files present on the master node.
I used the following command to create the jobflow:
./elastic-mapreduce --create --alive --name "AutoTest" \
--instance-group master --instance-type m1.small \
--instance-count 1 --bid-price 0.20 \
--instance-group core --instance-type m1.small \
--instance-count 2 --bid-price 0.20 \
--log-uri s3n://test/logs \
--bootstrap-action "s3://test/bootstrap-actions/download.sh" \
--pig-script \
--args s3://test/rollups.pig
rollups.pig contains the following line, which launches the embedded Python script via Pig:
sh pig automate.py
If I run rollups.pig on my local machine, it fires automate.py successfully, but when I try to run it on Amazon Elastic MapReduce, it does not work. Why?