ZooKeeper node gets deleted after system reboot - Windows

I'm trying to create a new node in ZooKeeper via Groovy, but after the system reboots the node disappears as if it was never created.
This code is part of a Jenkins pipeline that runs in a post-build Groovy script against a Windows VM. The pipeline then restarts this machine, and the node vanishes.
I've seen this occur on several other Windows VMs as well.
Here's my code:
@Grab('org.apache.zookeeper:zookeeper:3.4.6')
import org.apache.zookeeper.*
import static org.apache.zookeeper.ZooKeeper.States.*
import org.apache.zookeeper.ZooDefs.Ids

final int TIMEOUT_MSEC = 5000
final int RETRY_MSEC = 100
def num_retries = 0
PATH_TO_NODE = "/some/path/to/node"

noOpWatcher = { event -> } as Watcher
zk = new ZooKeeper('myIP', TIMEOUT_MSEC, noOpWatcher)

def addNode() {
    if (zk.exists(PATH_TO_NODE, true) == null) {
        /* create node */
        println("creating nodes")
        zk.create(PATH_TO_NODE, new byte[0], Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT)
        zk.create(PATH_TO_NODE + "myNode", "someData".getBytes(), Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT)
    } else {
        println("node already exists")
    }
}

while (zk.state != CONNECTED && num_retries < (TIMEOUT_MSEC / RETRY_MSEC)) {
    Thread.sleep(RETRY_MSEC)
    num_retries++
}

if (zk.state != CONNECTED) {
    println("could not connect to zookeeper")
    System.exit(1)
} else {
    addNode()
}

zk.close()
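For reference, a quick way to check what the server actually stored is to read the znode back and inspect its Stat after reconnecting: a znode created with CreateMode.PERSISTENT reports ephemeralOwner == 0 and survives client restarts (whether it survives a server reboot depends on what the server keeps in its dataDir). Below is a minimal Java sketch against the same client API; the connect string "myIP:2181" and the path are placeholders taken from the question, not verified values.

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ZnodeCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connect string and path; adjust to your environment.
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("myIP:2181", 5000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        Stat stat = zk.exists("/some/path/to/node", false);
        if (stat == null) {
            System.out.println("znode does not exist (never written, or server-side data was lost)");
        } else {
            // ephemeralOwner == 0 means the znode really was created as persistent.
            System.out.println("znode exists, ephemeralOwner=" + stat.getEphemeralOwner());
        }
        zk.close();
    }
}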

AWS Lambda logging through Serilog UDP sink and logstash silently fails

We have a .NET Core 2.1 AWS Lambda that I'm trying to hook into our existing logging system.
I'm trying to log through Serilog using a UDP sink to our logstash instance, for ingestion into our ElasticSearch logging database that is hosted on a private VPC. Running locally through a console, it logs fine, both to the console itself and through UDP into Elastic. However, when it runs as a lambda, it only logs to the console (i.e. CloudWatch) and doesn't output anything indicating that anything is wrong. Possibly because UDP is stateless?
NuGet packages and versions:
Serilog 2.7.1
Serilog.Sinks.Udp 5.0.1
Here is the logging code we're using:
public static void Configure(string udpHost, int udpPort, string environment)
{
    var udpFormatter = new JsonFormatter(renderMessage: true);

    var loggerConfig = new LoggerConfiguration()
        .Enrich.FromLogContext()
        .MinimumLevel.Information()
        .Enrich.WithProperty("applicationName", Assembly.GetExecutingAssembly().GetName().Name)
        .Enrich.WithProperty("applicationVersion", Assembly.GetExecutingAssembly().GetName().Version.ToString())
        .Enrich.WithProperty("tags", environment);

    loggerConfig
        .WriteTo.Console(outputTemplate: "[{Level:u}]: {Message}{NewLine}{Exception}")
        .WriteTo.Udp(udpHost, udpPort, udpFormatter);

    var logger = loggerConfig.CreateLogger();
    Serilog.Log.Logger = logger;

    Serilog.Debugging.SelfLog.Enable(Console.Error);
}
// this is output in the console from the lambda, but doesn't appear in the Database from the lambda
// when run locally, appears in both
Serilog.Log.Logger.Information("Hello from Serilog!");
...
// at end of lambda
Serilog.Log.CloseAndFlush();
And here is our UDP input on logstash:
udp {
    port => 5000
    tags => [ 'systest', 'serilog-nested' ]
    codec => json
}
Does anyone know how I might go about resolving this, or even just how to see what specifically is wrong so that I can start to find a solution?
Things tried so far include:
Pinging logstash from the lambda - impossible, the lambda doesn't have ICMP
Various attempts at getting the UDP sink to output errors, as seen above. Even putting in a completely fake address yields no error, though
Adding the lambda to a VPC from which I know logging is possible
Sleeping at the end of the lambda so that the logs have time to go through before the lambda exits
Checking the logstash logs to see if anything looks odd. It doesn't really, and the fact that local runs get through fine makes me think it's not that.
Using UDP directly. It doesn't seem to reach the server. I'm not sure if that's a connectivity issue or just UDP itself from a lambda.
Lots of cursing and swearing
In line with my comment above, you can create a log subscription and stream to ES like so. I'm aware that this is NodeJS, so it's not quite the right answer, but you might be able to figure it out from here:
/* eslint-disable */
// Eslint disabled as this is adapted AWS code.
const zlib = require('zlib')
const { Client } = require('@elastic/elasticsearch')

const esClient = new Client({ ES_CLUSTER_DETAILS })

/**
 * This is an example function to stream CloudWatch logs to ElasticSearch.
 * @param event
 * @param context
 * @param callback
 */
export default (event, context, callback) => {
  context.callbackWaitsForEmptyEventLoop = true
  const payload = Buffer.from(event.awslogs.data, 'base64')

  zlib.gunzip(payload, (err, result) => {
    if (err) {
      return callback(err)
    }

    const logObject = JSON.parse(result.toString('utf8'))
    const elasticsearchBulkData = transform(logObject)

    const params = { body: [] }
    params.body.push(elasticsearchBulkData)

    esClient.bulk(params, (err, resp) => {
      if (err) {
        return callback(err)
      }
      callback(null, 'success')
    })
  })
}
function transform(payload) {
  if (payload.messageType === 'CONTROL_MESSAGE') {
    return null
  }

  let bulkRequestBody = ''

  payload.logEvents.forEach((logEvent) => {
    const timestamp = new Date(1 * logEvent.timestamp)

    // index name format: cwl-YYYY.MM.DD
    const indexName = [
      `cwl-${process.env.NODE_ENV}-${timestamp.getUTCFullYear()}`, // year
      (`0${timestamp.getUTCMonth() + 1}`).slice(-2), // month
      (`0${timestamp.getUTCDate()}`).slice(-2), // day
    ].join('.')

    const source = buildSource(logEvent.message, logEvent.extractedFields)
    source['@id'] = logEvent.id
    source['@timestamp'] = new Date(1 * logEvent.timestamp).toISOString()
    source['@message'] = logEvent.message
    source['@owner'] = payload.owner
    source['@log_group'] = payload.logGroup
    source['@log_stream'] = payload.logStream

    const action = { index: {} }
    action.index._index = indexName
    action.index._type = 'lambdaLogs'
    action.index._id = logEvent.id

    bulkRequestBody += `${[
      JSON.stringify(action),
      JSON.stringify(source),
    ].join('\n')}\n`
  })

  return bulkRequestBody
}
function buildSource(message, extractedFields) {
  if (extractedFields) {
    const source = {}

    for (const key in extractedFields) {
      if (extractedFields.hasOwnProperty(key) && extractedFields[key]) {
        const value = extractedFields[key]

        if (isNumeric(value)) {
          source[key] = 1 * value
          continue
        }

        const jsonSubString = extractJson(value)
        if (jsonSubString !== null) {
          source[`$${key}`] = JSON.parse(jsonSubString)
        }

        source[key] = value
      }
    }
    return source
  }

  const jsonSubString = extractJson(message)
  if (jsonSubString !== null) {
    return JSON.parse(jsonSubString)
  }

  return {}
}

function extractJson(message) {
  const jsonStart = message.indexOf('{')
  if (jsonStart < 0) return null
  const jsonSubString = message.substring(jsonStart)
  return isValidJson(jsonSubString) ? jsonSubString : null
}

function isValidJson(message) {
  try {
    JSON.parse(message)
  } catch (e) { return false }
  return true
}

function isNumeric(n) {
  return !isNaN(parseFloat(n)) && isFinite(n)
}
One of my colleagues helped me get most of the way there, and then I managed to figure out the last bit.
I updated Serilog.Sinks.Udp to 6.0.0
I updated the UDP setup code to use the AddressFamily.InterNetwork specifier, which I don't believe was available in 5.0.1.
I removed the "tags" enrichment from our log messages, since I believe its presence on the UDP endpoint somehow caused some kind of clash; I've seen it stop logging without a trace before.
And voila!
Here's the new logging setup code:
loggerConfig
    .WriteTo.Udp(udpHost, udpPort, AddressFamily.InterNetwork, udpFormatter)
    .WriteTo.Console(outputTemplate: "[{Level:u}]: {Message}{NewLine}{Exception}");

Xamarin.Forms Sip: 'Internal server error 500' When Placing Outgoing Calls

I am using the Linphone SDK in a Xamarin.Forms project for SIP calling. I am able to make the connection using the following code:
var authInfo = Factory.Instance.CreateAuthInfo(username.Text,
    null, password.Text, null, null, domain.Text);
LinphoneCore.AddAuthInfo(authInfo);

String proxyAddress = "sip:" + username.Text + "@192.168.1.180:5160";
var identity = Factory.Instance.CreateAddress(proxyAddress);
var proxyConfig = LinphoneCore.CreateProxyConfig();

identity.Username = username.Text;
identity.Domain = domain.Text;
identity.Transport = TransportType.Udp;

proxyConfig.Edit();
proxyConfig.IdentityAddress = identity;
proxyConfig.ServerAddr = domain.Text + ":5160;transport=udp";
proxyConfig.Route = domain.Text;
proxyConfig.RegisterEnabled = true;
proxyConfig.Done();

LinphoneCore.AddProxyConfig(proxyConfig);
LinphoneCore.DefaultProxyConfig = proxyConfig;
LinphoneCore.RefreshRegisters();
After a successful connection, I am using the following code for placing the call:
if (LinphoneCore.CallsNb == 0)
{
    string phoneCall = "sip:" + address.Text + "@192.168.1.180:5160";
    var addr = LinphoneCore.InterpretUrl(phoneCall);
    LinphoneCore.InviteAddress(addr);
}
else
{
    Call call = LinphoneCore.CurrentCall;
    if (call.State == CallState.IncomingReceived)
    {
        LinphoneCore.AcceptCall(call);
    }
    else
    {
        LinphoneCore.TerminateAllCalls();
    }
}
And the listener for the call state changed event is:
private void OnCall(Core lc, Call lcall, CallState state, string message)
{
    call_status.Text = "Call state changed: " + state;

    if (lc.CallsNb > 0)
    {
        if (state == CallState.IncomingReceived)
        {
            call.Text = "Answer Call (" + lcall.RemoteAddressAsString + ")";
        }
        else
        {
            call.Text = "Terminate Call";
        }

        if (lcall.CurrentParams.VideoEnabled)
        {
            video.Text = "Stop Video";
        }
        else
        {
            video.Text = "Start Video";
        }
    }
    else
    {
        call.Text = "Start Call";
        call_stats.Text = "";
    }
}
The call status is giving 'Internal Server Error'. I am able to receive calls in my code using Linphone or the X-Lite softphone, but I am not able to place calls. I don't know whether this issue is related to the server or to my code. Please suggest.
Internal Server Error (status code 500, which SIP borrows from HTTP) means that an unexpected error occurred on the server. So I would suspect the problem is there rather than in your app's code.
500 - A generic error message, given when an unexpected condition was encountered and no more specific message is suitable.
It could be that your request doesn't satisfy the expectations of the endpoint you are calling, but even then, the server should respond with a more meaningful error rather than failing with a 500.

Jenkins declarative pipeline job - how to distribute parallel steps across slaves?

I am running a Declarative Pipeline where one of the steps runs a (very long) integration test. I'm trying to split my test into several smaller ones and run them in parallel over several nodes. I have 8 of these smaller tests and I have 8 nodes (under a label), so I'd like to have each test run on a separate node. Unfortunately, two tests — when run on the same node — interfere with each other, and so both fail.
I need to be able to first get the list of available nodes, and then run the smaller tests in parallel, one on each node; if there are not enough nodes, a smaller test needs to wait until a node becomes free.
However, what happens is that when asking for a node by label, two of the smaller tests usually get the same node, and so both fail. Nodes are configured to run up to 3 executors, otherwise the whole system halts, so I can't change that.
My current configuration for the smaller test is:
stage('Integration Tests') {
    when {
        expression { params.TESTS_INTEGRATION }
    }
    parallel {
        stage('Test1') {
            agent { node { label 'my_builder' } }
            steps {
                script {
                    def shell_script = getShellScript("Test1")
                    sh "${shell_script}"
                }
            }
        }
        // ... Test2 through Test8 are defined the same way ...
    }
}
I am able to get the list of available slaves from a label like this:
pipeline {
    stages {
        // ... other stages here ...
        stage('NodeList') {
            steps {
                script {
                    def nodes = getNodeNames('my_builder')
                    free_nodes = []
                    for (def element = 0; element < nodes.size(); element++) {
                        usenode = nodes[element]
                        try {
                            // Give it 5 seconds to run the nodetest function
                            timeout(time: 5, unit: 'SECONDS') {
                                node(usenode) {
                                    nodetest()
                                    free_nodes += usenode
                                }
                            }
                        } catch (err) {
                        }
                    }
                    println free_nodes
                }
            }
        }
    }
}
Where
def getNodeNames(String label) {
    def lgroup = Jenkins.instance.getLabel(label)
    def nodes = lgroup.getNodes()
    def result = []
    if (nodes.size() > 0) {
        for (def element = 0; element < nodes.size(); element++) {
            result += nodes[element].getNodeName()
        }
    }
    return result
}

def nodetest() {
    sh('echo alive on \$(hostname)')
}
How can I get the node name programmatically out of the free_nodes array and direct the stage to use that?
I've figured it out, so for the people from the future:
It turns out you can run a Scripted Pipeline inside a Declarative Pipeline, like this:
pipeline {
    agent any
    stages {
        stage('SomeStage') {
            steps {
                script {
                    // ... your scripted pipeline here
                }
            }
        }
    }
}
The script can do anything, and that includes... running a pipeline!
Here is the script:
script {
    def builders = [:]
    def nodes = getNodeNames('my_label')

    // let's find the free nodes
    String[] free_nodes = []
    for (def element = 0; element < nodes.size(); element++) {
        usenode = nodes[element]
        try {
            // Give it 5 seconds to run the nodetest function
            timeout(time: 5, unit: 'SECONDS') {
                node(usenode) {
                    nodetest()
                    free_nodes += usenode
                }
            }
        } catch (err) {
            // do nothing
        }
    }
    println free_nodes

    def tests = params.TESTS_LIST.split(',')
    for (int i = 0; i < tests.length; i++) {
        // select the test to run
        def the_test = tests[i]
        // select on which node to run it
        def the_node = free_nodes[i % free_nodes.length]

        // here comes the scripted pipeline: prepare steps
        builders[the_test] = {
            // run on the selected node
            node(the_node) {
                // lock the resource with the name of the node so two tests can't run there at the same time
                lock(the_node) {
                    // name the stage
                    stage(the_test) {
                        println "Running on ${NODE_NAME}"
                        def shell_script = getShellScript("${the_test}")
                        sh "${shell_script}"
                    }
                }
            }
        }
    }

    // run the steps in parallel
    parallel builders
}

How to solve SchemaValidationFailedException: Child is not present in schema

I'm trying to use the MD-SAL DataBroker to save a list of data. After modifying the YANG file and the InstanceIdentifier many times, I still keep hitting a similar validation issue, for example:
java.util.concurrent.ExecutionException: TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaValidationFailedException: Child /(urn:opendaylight:params:xml:ns:yang:testDataBroker?revision=2015-01-05)service-datas is not present in schema tree.]]}
at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.wrapInExecutionExc
My goal is to use the save-device-info RPC to get data from REST, then use the DataBroker API to save the data in memory, and finally test whether the data can be successfully replicated to the other cluster nodes.
YANG file:
module testDataBroker {
    yang-version 1;
    namespace "urn:opendaylight:params:xml:ns:yang:testDataBroker";
    prefix "testDataBroker";

    revision "2015-01-05" {
        description "Initial revision of testDataBroker model";
    }

    container service-datas {
        list service-data {
            key "service-id";
            uses service-id;
            uses device-info;
        }
    }

    grouping device-info {
        container device-info {
            leaf device-name {
                type string;
                config false;
            }
            leaf device-description {
                type string;
                config false;
            }
        }
    }

    grouping service-id {
        leaf service-id {
            type string;
            mandatory true;
        }
    }

    rpc save-device-info {
        input {
            uses service-id;
            uses device-info;
        }
        output {
            uses device-info;
        }
    }

    rpc get-device-info {
        output {
            uses device-info;
        }
    }
}
Java Code
@Override
public Future<RpcResult<SaveDeviceInfoOutput>> saveDeviceInfo(SaveDeviceInfoInput input) {
    String name = input.getDeviceInfo().getDeviceName();
    String description = input.getDeviceInfo().getDeviceDescription();
    String serviceId = input.getServiceId();

    WriteTransaction writeTransaction = dataBroker.newWriteOnlyTransaction();
    DeviceInfo deviceInfo = new DeviceInfoBuilder().setDeviceDescription(description).setDeviceName(name).build();
    ServiceData serviceData = new ServiceDataBuilder().setServiceId(serviceId).setDeviceInfo(deviceInfo).build();
    InstanceIdentifier<ServiceData> instanceIdentifier =
            InstanceIdentifier.builder(ServiceDatas.class).child(ServiceData.class, serviceData.getKey()).build();
    writeTransaction.put(LogicalDatastoreType.CONFIGURATION, instanceIdentifier, serviceData, true);

    boolean isFailed = false;
    try {
        writeTransaction.submit().get();
        log.info("Create containers succeeded!");
    } catch (InterruptedException | ExecutionException e) {
        log.error("Create containers failed: ", e);
        isFailed = true;
    }

    return isFailed ?
            RpcResultBuilder.success(new SaveDeviceInfoOutputBuilder())
                    .withError(RpcError.ErrorType.RPC, "Create container failed").buildFuture() :
            RpcResultBuilder.success(new SaveDeviceInfoOutputBuilder().setDeviceInfo(input.getDeviceInfo()))
                    .buildFuture();
}
Really need your help. Thanks.
Update:
With the same version of the MD-SAL bundles, I installed the odl-toaster feature on only one ODL instance instead of on the cluster nodes. The RPC from odl-toaster seems to work properly on a single node.
I didn't realize that RPCs are also clustered. Sometimes the RPC request hits other nodes which didn't have the same bundles deployed. The problem was solved once the bundle was distributed to every node.
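For context, here is a minimal sketch of how such an RPC implementation is typically registered with the (pre-Fluorine) MD-SAL RpcProviderRegistry. TestDataBrokerService is the interface the binding generator would produce for the testDataBroker module above, and TestDataBrokerImpl stands for the class containing the saveDeviceInfo method; both names are assumptions, not code from the question. Because the registration is global, any cluster member may be asked to serve the RPC, which is exactly why the bundle has to be present on every node:

// Sketch only: assumes the binding-generated TestDataBrokerService interface
// and the pre-Fluorine controller MD-SAL APIs.
import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.sal.binding.api.BindingAwareBroker.RpcRegistration;
import org.opendaylight.controller.sal.binding.api.RpcProviderRegistry;

public class TestDataBrokerProvider implements AutoCloseable {

    private final RpcRegistration<TestDataBrokerService> rpcRegistration;

    public TestDataBrokerProvider(DataBroker dataBroker, RpcProviderRegistry rpcRegistry) {
        // Registers a *global* RPC implementation: in a cluster, the invocation may be
        // routed to any member, so every node needs this bundle installed.
        this.rpcRegistration =
                rpcRegistry.addRpcImplementation(TestDataBrokerService.class,
                        new TestDataBrokerImpl(dataBroker));
    }

    @Override
    public void close() throws Exception {
        // Unregister the RPC when the provider is torn down.
        rpcRegistration.close();
    }
}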

Spark on Windows - What exactly is winutils and why do we need it?

I'm curious! To my knowledge, HDFS needs DataNode processes to run, which is why it only works on servers. Spark can run locally though, but it needs winutils.exe, which is a component of Hadoop. What exactly does it do? How is it that I cannot run Hadoop on Windows, but I can run Spark, which is built on Hadoop?
I know of at least one usage: it is for running shell commands on Windows. You can find it in org.apache.hadoop.util.Shell; other modules depend on this class and use its methods, for example the getGetPermissionCommand() method:
static final String WINUTILS_EXE = "winutils.exe";
...
static {
    IOException ioe = null;
    String path = null;
    File file = null;
    // invariant: either there's a valid file and path,
    // or there is a cached IO exception.
    if (WINDOWS) {
        try {
            file = getQualifiedBin(WINUTILS_EXE);
            path = file.getCanonicalPath();
            ioe = null;
        } catch (IOException e) {
            LOG.warn("Did not find {}: {}", WINUTILS_EXE, e);
            // stack trace comes at debug level
            LOG.debug("Failed to find " + WINUTILS_EXE, e);
            file = null;
            path = null;
            ioe = e;
        }
    } else {
        // on a non-windows system, the invariant is kept
        // by adding an explicit exception.
        ioe = new FileNotFoundException(E_NOT_A_WINDOWS_SYSTEM);
    }
    WINUTILS_PATH = path;
    WINUTILS_FILE = file;
    WINUTILS = path;
    WINUTILS_FAILURE = ioe;
}
...
public static String getWinUtilsPath() {
    if (WINUTILS_FAILURE == null) {
        return WINUTILS_PATH;
    } else {
        throw new RuntimeException(WINUTILS_FAILURE.toString(),
                WINUTILS_FAILURE);
    }
}
...
public static String[] getGetPermissionCommand() {
    return (WINDOWS) ? new String[] { getWinUtilsPath(), "ls", "-F" }
                     : new String[] { "/bin/ls", "-ld" };
}
Though Max's answer covers the actual place where it's referenced, let me give a brief background on why it is needed on Windows.
From Hadoop's Confluence page itself:
Hadoop requires native libraries on Windows to work properly - that includes accessing the file:// filesystem, where Hadoop uses some Windows APIs to implement posix-like file access permissions.
This is implemented in HADOOP.DLL and WINUTILS.EXE.
In particular, %HADOOP_HOME%\BIN\WINUTILS.EXE must be locatable.
And I think you should be able to run both Spark and Hadoop on Windows.
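To illustrate how %HADOOP_HOME%\bin\winutils.exe is usually made locatable for a local Spark job, here is a minimal Java sketch. The C:\hadoop path is an assumption; the only requirement is that winutils.exe sits in the bin subfolder of whatever directory hadoop.home.dir (or the HADOOP_HOME environment variable) points to.

import org.apache.spark.sql.SparkSession;

public class LocalSparkOnWindows {
    public static void main(String[] args) {
        // Assumed layout: C:\hadoop\bin\winutils.exe (with hadoop.dll next to it).
        // Setting hadoop.home.dir is equivalent to exporting HADOOP_HOME; Hadoop's
        // Shell class uses it to locate winutils.exe at startup.
        System.setProperty("hadoop.home.dir", "C:\\hadoop");

        SparkSession spark = SparkSession.builder()
                .appName("winutils-demo")
                .master("local[*]")
                .getOrCreate();

        // A simple local action; local filesystem permission handling is where
        // winutils.exe gets involved on Windows.
        spark.range(10).show();

        spark.stop();
    }
}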
