Mesos resource information when executing a Docker image

I'm working on the Mesos code and have become confused about the resources needed to execute a Docker image.
In src/cli/execute.cpp, CommandScheduler::offers() pulls the resources out of the task and uses this resource information to decide whether to accept or decline the offer.
However, I don't see anywhere in CommandScheduler where the task's resources are set.
And in the main() function, where the CommandScheduler object is created, I only see a Docker image string used to build the TaskInfo, with still no explicit compute resource information.
I need to see explicitly, at the code level, where this resource information comes from. Could anyone help me understand this point?
I'm working on Mesos 1.2 right now.
Thanks

I figured it out. By default, the resources allocated are cpus:1;mem:128.
This comes from the default value of the resources flag:
add(&Flags::resources,
    "resources",
    "Resources for the command.",
    "cpus:1;mem:128");

Related

middy-ssm not picking up changes to the lambda's execution role

We're using middy-ssm to fetch & cache SSM parameter values during Lambda initialization. We ran into a situation where the execution role of the Lambda did not have access to perform SSM::GetParameters on the path that it attempted to fetch. We updated a policy on the role to allow access, but the Lambda function never seemed to pick up the permission changes; instead it kept failing due to missing permissions until the end of its lifecycle (closer to an hour, as requests kept coming in to it).
I then did a test where I fetched parameters using both the aws-lambda SDK directly and middy-ssm. Initially the Lambda role didn't have permissions and both methods failed. We updated the policy, and after a couple of minutes the code that used the SDK was able to retrieve the parameter, but the middy middleware kept failing.
I tried to read through the implementation of middy-ssm to figure out whether the error result is somehow cached or what is going on there, but couldn't really pinpoint the issue. Any insight and/or suggestions on how to overcome this are welcome! Thanks!
So as pointed out by Will in the comments, this turned out to be a bug.

GCP - creating a VM instance and extracting logs

I have a Java application in which I am using GCP to create VM instances from images.
In this application, I would like to allow the user to view the VM creation logs in order to stay updated on the status of the creation and to see failure points in detail.
I am sure such logs exist in GCP, but I have been unable to find specific APIs that let me see a specific action, for example the creation of instance "X".
Thanks for the help
When you create a VM, the response that you get back is a JobID (because the creation takes time and the Compute Engine API answers immediately). To know the status of the VM creation (and start), you have to poll this JobID regularly.
In the logs, you can also filter by this JobID to select and view only the entries you want on the Compute Engine API side (create/start errors).
If you want to see the logs of the VM itself, filter not by the JobID but by the name of the VM and its zone.
In Java, there are client libraries that help you achieve this.
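A minimal sketch of the polling part, assuming the google-cloud-compute v1 Java client (com.google.cloud.compute.v1); the project, zone, and operation name below are placeholders to replace with your own values:

import com.google.cloud.compute.v1.Operation;
import com.google.cloud.compute.v1.ZoneOperationsClient;

public class PollVmCreation {
    public static void main(String[] args) throws Exception {
        String project = "my-project";        // placeholder project id
        String zone = "europe-west1-b";       // placeholder zone of the new instance
        String operationName = args[0];       // the operation name (JobID) returned by the insert call

        try (ZoneOperationsClient operations = ZoneOperationsClient.create()) {
            Operation op = operations.get(project, zone, operationName);
            // Poll the operation until the Compute Engine API reports it as DONE.
            while (op.getStatus() != Operation.Status.DONE) {
                Thread.sleep(2000);
                op = operations.get(project, zone, operationName);
            }
            if (op.hasError()) {
                System.out.println("Instance creation failed: " + op.getError());
            } else {
                System.out.println("Instance created successfully");
            }
        }
    }
}

Once the operation is done, the same JobID (or the VM name and zone) can be used as the filter in Cloud Logging to pull the corresponding log entries, as described above.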

SELinux init_daemon_domain(avahi_t,avahi_exec_t) vs. files_types(avahi_t)

I am running into a problem with labeling. In order to lock down access to the file /etc/avahi/avahi-daemon.conf, I decided to label it as part of the avahi_t domain.
I am working on an embedded system. When I boot the system after a version update, the file system is relabeled (the .autorelabel flag is set).
Unfortunately, the file /etc/avahi/avahi-daemon.conf remains typed as unlabeled_t. Because the label is wrong, avahi cannot read the file and fails to initialize properly, with an AVC read denial on an unlabeled_t file. I want the label set correctly rather than modifying policy to allow reading an unlabeled file. I also want the file protected so the configuration cannot be modified.
I have properly labeled it in the .fc file with the following:
/etc/avahi/avahi-daemon.conf -- gen_context(system_u:object_r:avahi_t,s0)
When I run restorecon on the file system, it attempts to relabel the file but is blocked by SELinux with a relabelto AVC violation. Similarly, changing it with chcon -t fails. I do not wish to open up relabelto on an embedded system, since the file could then be relabeled and take down the avahi initialization. If I take out the SD card, relabel the file on a different system, and place it back into the target system, it is properly labeled and avahi operates correctly. So I am certain that the labeling is causing the problem.
In looking in the reference policy an init_daemon_domain(avahi_t,avahi_exec_t) is being performed.
In looking at the documentation for init_daemon_domain() it states the following:
"The types will be made usable as a domain and file, making calls to domain_type() and files_type() redundant."
This is unusual in that if I add files_type(avahi_t) to the .te file, the file is labeled properly after a version update.
I would really like to know more about this, but unfortunately my searches on the internet have been less than fruitful.
Is the documentation for SELinux wrong? Am I missing something about init_daemon_domain(), in that it only works with processes and not files?
Or is files_type(avahi_t) truly needed?
I know this comes off as a trivial issue since there is a path to where it works. However, I am hoping to get an explanation as to why files_type(avahi_t) is necessary.
Thanks

How to plug a process for identifying sensitive information into an ETL pipeline?

Hope you are doing well!
We have already developed an ETL pipeline using Apache NiFi, which gets triggered only when the client uploads a source data file from the portal. After that, the data in the source file goes through various layers, gets transformed, and is stored back to the warehouse (i.e. Hive).
Goal: identify sensitive information and mask it so that the end user won't see the actual data.
Identifying sensitive data & masking strategy: we will make use of open-source tools to achieve this goal as follows.
Data Steward Studio: this tool allows me to identify sensitive information and tag it properly.
Apache Atlas: once the data steward user has confirmed the tag, that tag will be pushed into Apache Atlas.
Apache Ranger: finally, we can define a tag-based masking policy using Apache Ranger that will allow or deny access to specific users.
For more details on the above solution, please visit this link:
https://www.youtube.com/watch?v=RzEfLwJaLsc
Problem: in order to feed the data to the DSS tool, it must first be loaded into a Hive table. That is fine. But we cannot stop the existing ETL flow in the middle and then start the identification of sensitive information. The above solution requires a manual process that I want to get rid of and automate; that is, it should be plugged in somewhere within the NiFi pipeline. But so far, as I understand it, DSS does not allow us to do something like that.
Manual process:
Create an asset collection
Accept/reject the suggested tags within DSS.
If we cannot plug the identification process into the pipeline, then the client's sensitive data will be exposed and visible to everyone on the team. I want a way to de-identify sensitive data before it actually gets loaded into HDFS or Hive tables.
Please share your response on this problem if anyone has already worked in this particular area.
  
I did not test it, but here are my thoughts on this challenge.
Set up the system such that data is NOT visible to everyone (or anyone) by default
Load the data into Hive
Let the profilers run and accept their suggestions
Open up the data to those who should have access (except for the things found by the profiler)
There are still some implementation details to work out (e.g. how to automate steps 3 and 4, and whether you can solve this with tags alone or whether the data needs to sit in a staging area first), but I hope this steers you in a good direction.
One idea might be to use NiFi's EncryptContent processor (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.5.0/org.apache.nifi.processors.standard.EncryptContent/). Then the values loaded into Hive will be encrypted in the first place and will not be visible to the stewards. Once the tagging has been done, then in the subsequent part of the pipeline (where I'm assuming you're using NiFi as well) you can decrypt the content back as required.

EC2 init.d script - what's the best practice

I'm creating an init.d script that will run a couple of tasks when the instance starts up.
It will create a new volume with our code repository and mount it if it doesn't already exist.
It will tag the instance.
Completing the tasks above is crucial for our site (i.e. without the code repository mounted, the site won't work). How can I make sure that the server doesn't end up being publicly visible before it's ready? Should I start my init.d script by de-registering the instance from the ELB (I'm not even sure it will be registered at that point), and then register it again when all the tasks have finished successfully?
What is the best practice?
Thanks!
You should have a health check on your ELB, so your server shouldn't be put into rotation unless it reports as healthy. And it shouldn't report healthy if the boot script errors out.
(Also, you should look into using cloud-init. That way you can change the boot script without making a new AMI.)
I suggest you use CloudFormation instead. You can bring up a full stack of your system by representing it in a JSON format template.
For example, you can create an autoscaling group whose instances have unique tags and another volume attached (which presumably has your code).
Here's a sample JSON template attaching an EBS volume to an instance:
https://s3.amazonaws.com/cloudformation-templates-us-east-1/EC2WithEBSSample.template
And here are many other JSON templates that you can use for guidance to deploy your specific stack and application:
http://aws.amazon.com/cloudformation/aws-cloudformation-templates/
Of course, you can accomplish the same thing using an init.d script or the rc.local file in your instance, but I believe CloudFormation is a cleaner solution from the outside (not inside your instance).
You could also write your own script that brings up your stack from the outside, but why reinvent the wheel?
Hope this helps.
