ContainerLaunchContext.setResource() missing in Hadoop YARN

http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
I am trying to make the example from the above link work, but I can't compile the code below:
Resource capability = Records.newRecord(Resource.class);
capability.setMemory(512);
amContainer.setResource(capability);
// Set the container launch content into the
// ApplicationSubmissionContext
appContext.setAMContainerSpec(amContainer);
amContainer is a ContainerLaunchContext, and my Hadoop version is 2.1.0-beta.
I did some investigation and found there is no method setResource() in ContainerLaunchContext.
I have 3 questions about this:
1) Has the method been removed?
2) If it has been removed, what should I use instead?
3) Is there any more detailed documentation for YARN? The documentation on the website is very thin, and I would like a manual or reference. For example, with
capability.setMemory(512);
I cannot tell from the comments in the code whether 512 means 512 KB or 512 MB.

This is actually the proper solution to the question; the previous answer might cause incorrect execution!
@Dyin, I couldn't fit this in a comment ;) Validated for 2.2.0 and 2.3.0.
The driver sets up the resources for the AppMaster:
ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
ApplicationId appId = appContext.getApplicationId();
appContext.setApplicationName(this.appName);
// Set up the container launch context for the application master
ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
Resource capability = Records.newRecord(Resource.class);
capability.setMemory(amMemory);
appContext.setResource(capability);
appContext.setAMContainerSpec(amContainer);
Priority pri = Records.newRecord(Priority.class);
pri.setPriority(amPriority);
appContext.setPriority(pri);
appContext.setQueue(amQueue);
// Submit the application to the applications manager
yarnClient.submitApplication(appContext); // this.yarnClient = YarnClient.createYarnClient();
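Note that the driver snippet above never gives amContainer a launch command or local resources; the real distributed-shell driver sets both before submitting. Below is a minimal sketch of that step, assuming the same objects and the usual java.util and org.apache.hadoop.yarn.api imports; the jar handling, main class and command are placeholders, not values from the original code.
// Local resources (e.g. the AppMaster jar uploaded to HDFS) that the
// NodeManager should localize into the AM container.
Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();
// ... add an entry for your AppMaster jar here ...
amContainer.setLocalResources(localResources);

// Environment for the AM process (classpath etc.); contents are application-specific.
Map<String, String> env = new HashMap<String, String>();
amContainer.setEnvironment(env);

// Command that starts the ApplicationMaster JVM inside the allocated container.
// Relies on JAVA_HOME being available in the container environment.
amContainer.setCommands(Collections.singletonList(
    "$JAVA_HOME/bin/java -Xmx" + amMemory + "m com.example.MyAppMaster"
    + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/AppMaster.stdout"
    + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/AppMaster.stderr"));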
In the ApplicationMaster, this is how you specify resources for the containers (workers):
private AMRMClient.ContainerRequest setupContainerAskForRM() {
// setup requirements for hosts
// using * as any host will do for the distributed shell app
// set the priority for the request
Priority pri = Records.newRecord(Priority.class);
pri.setPriority(requestPriority);
// Set up resource type requirements
// For now, only memory is supported so we set memory requirements
Resource capability = Records.newRecord(Resource.class);
capability.setMemory(containerMemory);
AMRMClient.ContainerRequest request = new AMRMClient.ContainerRequest(capability, null, null,
pri);
return request;
}
In some run() or main() method of your AppMaster:
AMRMClientAsync.CallbackHandler allocListener = new RMCallbackHandler();
resourceManager = AMRMClientAsync.createAMRMClientAsync(1000, allocListener);
resourceManager.init(conf);
resourceManager.start();
for (int i = 0; i < numTotalContainers; ++i) {
AMRMClient.ContainerRequest containerAsk = setupContainerAskForRM();
resourceManager.addContainerRequest(containerAsk);
}
Launching containers
You can use the original answer's solution (the java command) for the launch itself; it's just the cherry on top and should work either way.
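For the launch itself, once the RM hands allocated containers to your RMCallbackHandler.onContainersAllocated(), you start each one through an NMClient. Here is a rough sketch against the same 2.2.0/2.3.0 client API; the worker command is a placeholder, not part of the original answer.
// Create once, e.g. next to the AMRMClientAsync setup:
NMClient nmClient = NMClient.createNMClient();
nmClient.init(conf);
nmClient.start();

// Then, inside RMCallbackHandler.onContainersAllocated(List<Container> containers):
for (Container container : containers) {
    ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
    // Placeholder worker command; redirect stdout/stderr into the container log dir.
    ctx.setCommands(Collections.singletonList(
        "/bin/date"
        + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
        + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));
    try {
        nmClient.startContainer(container, ctx);
    } catch (YarnException | IOException e) {
        // Handle/log launch failures per container.
        e.printStackTrace();
    }
}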

You can set the memory available to the ApplicationMaster via the launch command, like this:
// Set the necessary command to execute the application master
Vector<CharSequence> vargs = new Vector<CharSequence>(30);
...
vargs.add("-Xmx" + amMemory + "m"); // notice "m" indicating megabytes, you can use also -Xms combined with -Xmx
... // transform vargs to String commands
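// A minimal sketch of that elided step, modeled on the distributed-shell
// example: join the argument vector into a single command string.
StringBuilder command = new StringBuilder();
for (CharSequence str : vargs) {
    command.append(str).append(" ");
}
List<String> commands = new ArrayList<String>();
commands.add(command.toString());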
amContainer.setCommands(commands);
This should solve your problem. As for the three questions: YARN is rapidly evolving software, so my advice is to forget the documentation, get the source code and read it; that will answer a lot of your questions. (For the concrete example: Resource.setMemory() takes megabytes, so setMemory(512) means 512 MB.)

Related

Elastic-Cloud Not Receiving Data from Serilog Sink

I set up Elastic Cloud to offload my local Elasticsearch config (as one does), but for reasons unknown to me I can't get it to show any logs in Elastic Cloud, despite it working fine locally.
The code I have (modified for privacy reasons):
//var uri = new Uri("http://localhost:9200"); // old one
var uri = new Uri("https://my-server.kb.eastus2.azure.elastic-cloud.com:9243");
var sinkOptions = new ElasticsearchSinkOptions(uri)
{
AutoRegisterTemplate = true,
ModifyConnectionSettings = x => x.BasicAuthentication("elastic", "the password I was given"),
IndexFormat = $"test-logs-{env.EnvironmentName?.ToLower().Replace('.', '-')}-{DateTime.Now:yyyy-MM}",
};
Log.Logger = new LoggerConfiguration()
.ReadFrom.Configuration(config)
.Enrich.FromLogContext()
.Enrich.WithMachineName()
.WriteTo.Console()
.WriteTo.Elasticsearch(sinkOptions)
.Enrich.WithProperty("Environment", env.EnvironmentName)
.CreateLogger();
There are two possible reasons I can think of for this not working:
The credentials are wrong
The URI is wrong
Every solution I've been given so far has provided the data in this fashion, and nowhere does it say what the URI I'm supposed to use looks like.
I get no errors.
I get no warnings.
I get no logs.
What am I doing wrong here?
The issue was an incorrect URI. I wrote
my-server.kb.eastus2.azure.elastic-cloud.com:9243 rather than
my-server.es.eastus2.azure.elastic-cloud.com:9243.
Note the very small difference: kb (the Kibana endpoint) vs es (the Elasticsearch endpoint) in the URL.

Why does RBD snap id start from 4?

I'm a new Ceph developer, and I've recently been reading the snapshot code. From
pg_pool_t::add_unmanaged_snap, it looks like the first RBD snapshot id should
start from 2, but in reality it starts from 4. I wonder whether there is some mechanism
in RBD snapshots that increments snap_seq further; could anyone help me?
Thanks in advance!
Below is the code of pg_pool_t::add_unmanaged_snap:
void pg_pool_t::add_unmanaged_snap(uint64_t& snapid)
{
ceph_assert(!is_pool_snaps_mode());
if (snap_seq == 0) {
// kludge for pre-mimic tracking of pool vs selfmanaged snaps. after
// mimic this field is not decoded but our flag is set; pre-mimic, we
// have a non-empty removed_snaps to signifiy a non-pool-snaps pool.
removed_snaps.insert(snapid_t(1));
snap_seq = 1;
}
flags |= FLAG_SELFMANAGED_SNAPS;
snapid = snap_seq = snap_seq + 1;
}
The following screenshot shows rbd snapshot creation on a totally new RBD pool; the snapshot id clearly starts from 4.
(Screenshot: rbd snapshot creation on a totally new rbd pool)

SWUpdate multiple bootenv sections

I use SWUpdate to update different hardware revisions of the same device with a double-copy strategy.
The bootloader environment of all of them looks very similar. However, I have to set the mmc partition to boot from depending on the active copy, and the boot_file depending on the hardware revision.
To keep the sw-description file comprehensible and easy to maintain, I would like to set a "basic" boot environment for all devices in a first step, and in a second step overwrite some variables depending on hardware revision and active copy:
software =
{
version = "1.1";
hardware-compatibility = ["0.1","1.0"];
device1=
{
copy-1:
{
images:
(
{
filename = "rootfs.ext3.gz";
device = "/dev/mmcblk0p3";
compressed = true;
},
{
filename = "u-boot-env-base"; #basic boot environment
type = "uboot";
}
);
bootenv: # device-specific boot variables
(
{
name = "boot_file"
value = "uImage1"
},
{
name = "mmcpart";
value = "3";
}
);
}
}
}
While parsing, both bootloader environments are reported, but only one is applied (or both are, in the wrong order), because when checking via fw_printenv the "u-boot-env-base" values are there unaltered.
I am using
SWUpdate v2018.11.0 and
U-Boot 2018.09.
I feel that I had this working in an older setup (SWUpdate 2016).
I asked this question on the mailing list. Stefano Babic, SWUpdate developer and maintainer, answered it; I am just summarizing his answer here.
What I described is the intended behaviour: it is not foreseen to set bootloader variables twice during an update. The U-Boot variables defined in a file take priority over the U-Boot name-value pairs in the bootenv section, because the file is processed at the very end of the update. The solution in my case is to set the variables only in the bootenv section.

Boost event logger

http://boost-log.sourceforge.net/libs/log/doc/html/log/detailed/sink_backends.html
That page has sample code to initialize the Boost.Log Windows event log backend, but when I run it, it gives a memory error on the first line.
void init_logging()
{
// Create an event log sink
boost::shared_ptr< sink_t > sink(new sink_t());
sink->set_formatter
(
expr::format("%1%: [%2%] - %3%")
% expr::attr< unsigned int >("LineID")
% expr::attr< boost::posix_time::ptime >("TimeStamp")
% expr::smessage
);
// We'll have to map our custom levels to the event log event types
sinks::event_log::custom_event_type_mapping< severity_level > mapping("Severity");
mapping[normal] = sinks::event_log::info;
mapping[warning] = sinks::event_log::warning;
mapping[error] = sinks::event_log::error;
sink->locked_backend()->set_event_type_mapper(mapping);
// Add the sink to the core
logging::core::get()->add_sink(sink);
}
Here it fails to create the sink_t object:
boost::shared_ptr< sink_t > sink(new sink_t());
Any idea what the problem is and how I can solve it?
Also, if you know any other sources for learning Boost event logging, please share them.
There was no answer yet, but I found a solution in a blog post by Timo Geusch:
http://www.lonecpluspluscoder.com/2011/01/boost-log-preventing-the-unhandled-exception-in-windows-7-when-attempting-to-log-to-the-event-log/
The reason for this problem is that the registry key under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\EventLog\ that the application needs access to must already exist (if you're not an administrator, you don't have the privileges to create it), and the user who runs the application must also be able to both read and write it.

Monitering Hadoop multi node cluster by Ganglia

I want to monitor a Hadoop (version 0.20.2) multi-node cluster using Ganglia. My Hadoop is working properly. I have installed Ganglia after reading the following blogs:
http://hakunamapdata.com/ganglia-configuration-for-a-small-hadoop-cluster-and-some-troubleshooting/
http://hokamblogs.blogspot.in/2013/06/ganglia-overview-and-installation-on.html
I have also studied Monitoring with Ganglia.pdf (Appendix B,
Ganglia and Hadoop/HBase).
I have modified only the following lines in **Hadoop-metrics.properties** (the same on all Hadoop nodes):
// Configuration of the "dfs" context for ganglia
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=192.168.1.182:8649
// Configuration of the "mapred" context for ganglia
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=192.168.1.182:8649:8649
// Configuration of the "jvm" context for ganglia
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=192.168.1.182:8649
**gmetad.conf** (Only on Hadoop master Node )
data_source "Hadoop-slaves" 5 192.168.1.182:8649
RRAs "RRA:AVERAGE:0.5:1:302400" //Because i want to analyse one week data.
**gmond.conf** (on all the Hadoop Slave nodes and Hadoop Master)
globals {
daemonize = yes
setuid = yes
user = ganglia
debug_level = 0
max_udp_msg_len = 1472
mute = no
deaf = no
allow_extra_data = yes
host_dmax = 0 /*secs */
cleanup_threshold = 300 /*secs */
gexec = no
send_metadata_interval = 0
}
cluster {
name = "Hadoop-slaves"
owner = "Sandeep Priyank"
latlong = "unspecified"
url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
location = "CASL"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
host = 192.168.1.182
port = 8649
ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
port = 8649
}
/* You can specify as many tcp_accept_channels as you like to share
an xml description of the state of the cluster */
tcp_accept_channel {
port = 8649
}
Now Ganglia is only giving system metrics (memory, disk, etc.) for all the nodes, but it is not showing the Hadoop metrics (jvm, mapred, etc.) on the web interface. How can I fix this problem?
I run Hadoop with Ganglia, and yes, I see a lot of Hadoop metrics in Ganglia (containers, map tasks, vmem). In fact, Hadoop reports more than a hundred specific metrics to Ganglia.
The hokamblogs post was enough for this.
I edited hadoop-metrics2.properties on the master node; the content is:
namenode.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
namenode.sink.ganglia.period=10
namenode.sink.ganglia.servers=gmetad_hostname_or_ip:8649
resourcemanager.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
resourcemanager.sink.ganglia.period=10
resourcemanager.sink.ganglia.servers=gmetad_hostname_or_ip:8649
and I also edited the same file on the slaves:
datanode.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
datanode.sink.ganglia.period=10
datanode.sink.ganglia.servers=gmetad_hostname_or_ip:8649
nodemanager.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
nodemanager.sink.ganglia.period=10
nodemanager.sink.ganglia.servers=gmetad_hostname_or_ip:8649
Remember to restart Hadoop and Ganglia after changing the files.
I hope this helps.
Thanks to everyone. If you are using an older version of Hadoop, then take the following files from a newer version of Hadoop:
GangliaContext31.java
GangliaContext.java
and put them in the path hadoop/src/core/org/apache/hadoop/metrics/ganglia.
Then compile your Hadoop using ant (and set a proper proxy while compiling).
If you get an error such as a missing function definition, copy that function definition (from the newer version) into the proper Java file and compile Hadoop again. It will work.
