job information not found in JobContext - hadoop

I am running a Java program on a remote computer and trying to read the split data using a RecordReader object, but instead I get:
Exception in thread "main" java.io.IOException: job information not found in JobContext. HCatInputFormat.setInput() not called?
I already have called the following:
_hcatInputFmt = HCatInputFormat.setInput(_myJob, db,tbl);
and then creating the RecordReader object as:
_hcatInputFmt.createRecordReader(hSplit, taskContext)
On debugging it fails while searching for the value of the key: HCAT_KEY_JOB_INFO in job configuration object, while trying to create a RecordReader object.
How do I set this value? Any pointers will be helpful.
Thanks.

We have to use getConfiguration() method to get the configuration from the job object. The configuration object used in creating the job object won't do it.

I had the same problem. You should use:
HCatInputFormat.setInput(job, dbName, inputTableName);
HCatSchema inputschema = HCatBaseInputFormat.getTableSchema(job.getConfiguration());
not
HCatInputFormat.setInput(job, dbName, inputTableName);
HCatSchema inputschema = HCatBaseInputFormat.getTableSchema(getConf());
This is because Job.getInstance(conf) copies the conf, so values that setInput() stores in the job's configuration never appear in the original conf object. Here is the Configuration copy constructor:
/**
* A new configuration with the same settings cloned from another.
*
* @param other the configuration from which to clone settings.
*/
@SuppressWarnings("unchecked")
public Configuration(Configuration other) {
this.resources = (ArrayList<Resource>) other.resources.clone();
synchronized(other) {
if (other.properties != null) {
this.properties = (Properties)other.properties.clone();
}
if (other.overlay!=null) {
this.overlay = (Properties)other.overlay.clone();
}
this.updatingResource = new ConcurrentHashMap<String, String[]>(
other.updatingResource);
this.finalParameters = Collections.newSetFromMap(
new ConcurrentHashMap<String, Boolean>());
this.finalParameters.addAll(other.finalParameters);
}
synchronized(Configuration.class) {
REGISTRY.put(this, null);
}
this.classLoader = other.classLoader;
this.loadDefaults = other.loadDefaults;
setQuietMode(other.getQuietMode());
}
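The effect of that copy can be sketched without Hadoop on the classpath. In the minimal example below, `FakeJob` and `setInput` are hypothetical stand-ins for `Job` and `HCatInputFormat.setInput`, `java.util.Properties` stands in for `Configuration`, and the key name `job.info` is made up; only the copy-on-construction behaviour is modelled.

```java
import java.util.Properties;

public class ConfCopyDemo {
    /** Hypothetical stand-in for Hadoop's Job: it clones the settings it is given. */
    static final class FakeJob {
        private final Properties conf;

        FakeJob(Properties original) {
            // Like the copy constructor above, the job works on its own clone.
            this.conf = (Properties) original.clone();
        }

        Properties getConfiguration() {
            return conf;
        }
    }

    /** Hypothetical stand-in for setInput: writes into the job's copy only. */
    static void setInput(FakeJob job, String db, String table) {
        job.getConfiguration().setProperty("job.info", db + "." + table);
    }

    public static void main(String[] args) {
        Properties original = new Properties();
        FakeJob job = new FakeJob(original);
        setInput(job, "mydb", "mytable");

        // The original object never sees the key; the job's copy does.
        System.out.println("original: " + original.getProperty("job.info"));
        System.out.println("job copy: " + job.getConfiguration().getProperty("job.info"));
    }
}
```

Reading the key back through the original object returns null, which corresponds to the "job information not found" error in the question.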


"A registration already exists for URI" when using HttpSelfHostServer

We are having an issue with unit tests failing because previous tests haven't closed their HttpSelfHostServer session.
So the second time we try to open a connection to a server we get this message:
System.InvalidOperationException : A registration already exists for URI 'http://localhost:1337/'.
This test forces the issue (as an example):
[TestFixture]
public class DuplicatHostIssue
{
public HttpSelfHostServer _server;
[Test]
public void please_work()
{
var config = new HttpSelfHostConfiguration("http://localhost:1337/");
_server = new HttpSelfHostServer(config);
_server.OpenAsync().Wait();
config = new HttpSelfHostConfiguration("http://localhost:1337/");
_server = new HttpSelfHostServer(config);
_server.OpenAsync().Wait();
}
}
So newing up a new instance of the server doesn't seem to kill the previous session. Any idea how to force the disposal of the previous session?
Full exception, if it helps:
System.AggregateException : One or more errors occurred. ----> System.InvalidOperationException : A registration already exists for URI 'http://localhost:1337/'.
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.Wait()
at ANW.API.Tests.Acceptance.DuplicatHostIssue.please_work() in DuplicatHostIssue.cs: line 32
--InvalidOperationException
at System.Runtime.AsyncResult.End(IAsyncResult result)
at System.ServiceModel.Channels.CommunicationObject.EndOpen(IAsyncResult result)
at System.Web.Http.SelfHost.HttpSelfHostServer.OpenListenerComplete(IAsyncResult result)
You might want to write a Dispose method like the one below and call it before opening a new server (for example in the test teardown) to avoid this issue:
private static void HttpSelfHostServerDispose()
{
if (_server != null)
{
_server.CloseAsync().Wait();
_server.Dispose();
_server = null;
}
}
This will clear the URI registration.

ycsb load run elasticsearch

I'm trying to run the benchmark software YCSB on ElasticSearch.
The problem I'm having is that after the load, the data seems to get removed during cleanup.
I'm struggling to understand what is supposed to happen.
If I comment out the cleanup, it still fails because it cannot find the index during the "run" phase.
Can someone please explain what is supposed to happen in YCSB?
I would expect it to have:
1. load phase: load say 1,000,000 records
2. run phase: query the records loaded during the "load phase"
Thanks,
Okay I have discovered by running Couchbase in YCSB that the data shouldn't be removed.
Looking at cleanup() for ElasticSearchClient I see no reason why the files would be deleted (?)
@Override
public void cleanup() throws DBException {
if (!node.isClosed()) {
client.close();
node.stop();
node.close();
}
}
The init is as follows: any reason this would not persist on the filesystem?
public void init() throws DBException {
// initialize OrientDB driver
Properties props = getProperties();
this.indexKey = props.getProperty("es.index.key", DEFAULT_INDEX_KEY);
String clusterName = props.getProperty("cluster.name", DEFAULT_CLUSTER_NAME);
Boolean newdb = Boolean.parseBoolean(props.getProperty("elasticsearch.newdb", "false"));
Builder settings = settingsBuilder()
.put("node.local", "true")
.put("path.data", System.getProperty("java.io.tmpdir") + "/esdata")
.put("discovery.zen.ping.multicast.enabled", "false")
.put("index.mapping._id.indexed", "true")
.put("index.gateway.type", "none")
.put("gateway.type", "none")
.put("index.number_of_shards", "1")
.put("index.number_of_replicas", "0");
//if properties file contains elasticsearch user defined properties
//add it to the settings file (will overwrite the defaults).
settings.put(props);
System.out.println("ElasticSearch starting node = " + settings.get("cluster.name"));
System.out.println("ElasticSearch node data path = " + settings.get("path.data"));
node = nodeBuilder().clusterName(clusterName).settings(settings).node();
node.start();
client = node.client();
if (newdb) {
client.admin().indices().prepareDelete(indexKey).execute().actionGet();
client.admin().indices().prepareCreate(indexKey).execute().actionGet();
} else {
boolean exists = client.admin().indices().exists(Requests.indicesExistsRequest(indexKey)).actionGet().isExists();
if (!exists) {
client.admin().indices().prepareCreate(indexKey).execute().actionGet();
}
}
}
Thanks,
Okay what I am finding is as follows
(any help from ElasticSearch-ers much appreciated!!!!
because I'm obviously doing something wrong )
Even when the load shuts down leaving the data behind, the "run" still cannot find the data on startup
ElasticSearch node data path = C:\Users\Pl_2\AppData\Local\Temp\/esdata
org.elasticsearch.action.NoShardAvailableActionException: [es.ycsb][0] No shard available for [[es.ycsb][usertable][user4283669858964623926]: routing [null]]
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.perform(TransportShardSingleOperationAction.java:140)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:125)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:72)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:47)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:83)
The github README has been updated.
It looks like you need to specify using:
-p path.home=<path to folder to persist data>

How to create SubCommunities using the Social Business Toolkit Java API?

In the SDK Javadoc, the Community class does not have a "setParentCommunity" method, but the CommunityList class does have a getSubCommunities method, so there must be a programmatic way to set a parent Community's Uuid when creating a new Community. The REST API mentions a "rel="http://www.ibm.com/xmlns/prod/sn/parentcommunity" element". While looking for clues, I checked the nodes of an existing Subcommunity's XmlDataHandler and found a link element. I tried getting the XmlDataHandler for a newly created Community and adding a link node with href, rel and type nodes similar to those in the existing Community, but when trying to update or re-save the Community I got a bad request error. In fact, even when I called dataHandler.setData(n), where n was set as Node n=dataHandler.getData(); without any changes, and then called updateCommunity or save, I got the same error, so it appears that manipulating the dataHandler XML is not valid.
What is the recommended way to specify a parent Community when creating a new Community so that it is created as a SubCommunity?
The correct way to create a sub-community programmatically is to modify the POST request body for community creation. Here is the link to the Connections 4.5 infocenter: http://www-10.lotus.com/ldd/appdevwiki.nsf/xpDocViewer.xsp?lookupName=IBM+Connections+4.5+API+Documentation#action=openDocument&res_title=Creating_subcommunities_programmatically_ic45&content=pdcontent
We do not have support in the SBT SDK to do this using CommunityService APIs. We need to use low level Java APIs using Endpoint and ClientService classes to directly call the REST APIs with the appropriate request body.
I'd extend the class CommunityService and add a method based on createCommunity, which is at line 605 of:
https://github.com/OpenNTF/SocialSDK/blob/master/src/eclipse/plugins/com.ibm.sbt.core/src/com/ibm/sbt/services/client/connections/communities/CommunityService.java
public String createCommunity(Community community) throws CommunityServiceException {
if (null == community){
throw new CommunityServiceException(null, Messages.NullCommunityObjectException);
}
try {
Object communityPayload;
try {
communityPayload = community.constructCreateRequestBody();
} catch (TransformerException e) {
throw new CommunityServiceException(e, Messages.CreateCommunityPayloadException);
}
String communityPostUrl = resolveCommunityUrl(CommunityEntity.COMMUNITIES.getCommunityEntityType(),CommunityType.MY.getCommunityType());
Response requestData = createData(communityPostUrl, null, communityPayload,ClientService.FORMAT_CONNECTIONS_OUTPUT);
community.clearFieldsMap();
return extractCommunityIdFromHeaders(requestData);
} catch (ClientServicesException e) {
throw new CommunityServiceException(e, Messages.CreateCommunityException);
} catch (IOException e) {
throw new CommunityServiceException(e, Messages.CreateCommunityException);
}
}
You'll want to change your communityPostUrl to match:
https://greenhouse.lotus.com/communities/service/atom/community/subcommunities?communityUuid=2fba29fd-adfa-4d28-98cc-05cab12a7c43
where the Uuid in the URL is the parent community's Uuid.
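As a rough illustration of building that POST URL for an arbitrary parent, here is a small sketch. The class and method names are hypothetical and not part of the SBT SDK; the base path is simply copied from the URL quoted above, so another deployment would substitute its own host.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SubCommunityUrl {
    // Base path copied from the example URL above; adjust the host for your server.
    static final String SUBCOMMUNITIES_BASE =
            "https://greenhouse.lotus.com/communities/service/atom/community/subcommunities";

    /** Builds the sub-community POST URL for the given parent community Uuid. */
    static String forParent(String parentUuid) {
        try {
            // Encode the value defensively in case it ever contains reserved characters.
            return SUBCOMMUNITIES_BASE + "?communityUuid=" + URLEncoder.encode(parentUuid, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(forParent("2fba29fd-adfa-4d28-98cc-05cab12a7c43"));
    }
}
```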
I followed @PaulBastide's recommendation and created a SubCommunityService class, currently only containing a method for creation. It wraps the CommunityService rather than subclassing it, since I found that preferable. Here's the code in case you want to reuse it:
public class SubCommunityService {
private final CommunityService communityService;
public SubCommunityService(CommunityService communityService) {
this.communityService = communityService;
}
public Community createCommunity(Community community, String superCommunityId) throws ClientServicesException {
Object constructCreateRequestBody = community.constructCreateRequestBody();
ClientService clientService = communityService.getEndpoint().getClientService();
String entityType = CommunityEntity.COMMUNITY.getCommunityEntityType();
Map<String, String> params = new HashMap<>();
params.put("communityUuid", superCommunityId);
String postUrl = communityService.resolveCommunityUrl(entityType,
CommunityType.SUBCOMMUNITIES.getCommunityType(), params);
String newCommunityUrl = (String) clientService.post(postUrl, null, constructCreateRequestBody,
ClientService.FORMAT_CONNECTIONS_OUTPUT);
String communityId = newCommunityUrl.substring(newCommunityUrl.indexOf("communityUuid=")
+ "communityUuid=".length());
community.setCommunityUuid(communityId);
return community;
}
}
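The substring call in createCommunity above keeps everything after "communityUuid=", which would include any further query parameters if the server ever appended them. A slightly more defensive parse, as a sketch only (the class name is hypothetical, and it assumes the returned URL has the form ...?communityUuid=<id> optionally followed by &other=params):

```java
public class CommunityIdParser {
    private static final String KEY = "communityUuid=";

    /** Returns the value of the communityUuid parameter, stopping at the next '&'. */
    static String extractCommunityUuid(String url) {
        int start = url.indexOf(KEY);
        if (start < 0) {
            throw new IllegalArgumentException("No communityUuid parameter in: " + url);
        }
        start += KEY.length();
        int end = url.indexOf('&', start);
        return end < 0 ? url.substring(start) : url.substring(start, end);
    }

    public static void main(String[] args) {
        // Example URLs are made up for illustration.
        System.out.println(extractCommunityUuid(
                "https://example.invalid/community/instance?communityUuid=abc-123"));
        System.out.println(extractCommunityUuid(
                "https://example.invalid/community/instance?communityUuid=abc-123&lang=en"));
    }
}
```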

Is there a way to create multiple instances of CacheManager in Microsoft Enterprise Library programmatically, without depending on the configuration file?

We are trying to migrate to the Microsoft Enterprise Library Caching block. However, cache manager initialization seems to be closely tied to the config file entries, and our application creates in-memory "containers" on the fly. Is there any way to instantiate a cache manager on the fly using a pre-configured set of values (in-memory only)?
Enterprise Library 5 has a fluent configuration which makes it easy to programmatically configure the blocks. For example:
var builder = new ConfigurationSourceBuilder();
builder.ConfigureCaching()
.ForCacheManagerNamed("MyCache")
.WithOptions
.UseAsDefaultCache()
.StoreInIsolatedStorage("MyStore")
.EncryptUsing.SymmetricEncryptionProviderNamed("MySymmetric");
var configSource = new DictionaryConfigurationSource();
builder.UpdateConfigurationWithReplace(configSource);
EnterpriseLibraryContainer.Current
= EnterpriseLibraryContainer.CreateDefaultContainer(configSource);
Unfortunately, it looks like you need to configure the entire block at once so you wouldn't be able to add CacheManagers on the fly. (When I call ConfigureCaching() twice on the same builder an exception is thrown.) You can create a new ConfigurationSource but then you lose your previous configuration. Perhaps there is a way to retrieve the existing configuration, modify it (e.g. add a new CacheManager) and then replace it? I haven't been able to find a way.
Another approach is to use the Caching classes directly.
The following example uses the Caching classes to instantiate two CacheManager instances and stores them in a static Dictionary. No configuration required since it's not using the container. I'm not sure it's a great idea -- it feels a bit wrong to me. It's pretty rudimentary but hopefully helps.
public static Dictionary<string, CacheManager> caches = new Dictionary<string, CacheManager>();
static void Main(string[] args)
{
IBackingStore backingStore = new NullBackingStore();
ICachingInstrumentationProvider instrProv = new CachingInstrumentationProvider("myInstance", false, false,
new NoPrefixNameFormatter());
Cache cache = new Cache(backingStore, instrProv);
BackgroundScheduler bgScheduler = new BackgroundScheduler(new ExpirationTask(null, instrProv), new ScavengerTask(0,
int.MaxValue, new NullCacheOperation(), instrProv), instrProv);
CacheManager cacheManager = new CacheManager(cache, bgScheduler, new ExpirationPollTimer(int.MaxValue));
cacheManager.Add("test1", "value1");
caches.Add("cache1", cacheManager);
cacheManager = new CacheManager(new Cache(backingStore, instrProv), bgScheduler, new ExpirationPollTimer(int.MaxValue));
cacheManager.Add("test2", "value2");
caches.Add("cache2", cacheManager);
Console.WriteLine(caches["cache1"].GetData("test1"));
Console.WriteLine(caches["cache2"].GetData("test2"));
}
public class NullCacheOperation : ICacheOperations
{
public int Count { get { return 0; } }
public Hashtable CurrentCacheState { get { return new System.Collections.Hashtable(); } }
public void RemoveItemFromCache(string key, CacheItemRemovedReason removalReason) {}
}
If expiration and scavenging policies are the same perhaps it might be better to create one CacheManager and then use some intelligent key names to represent the different "containers". E.g. the key name could be in the format "{container name}:{item key}" (assuming that a colon will not appear in a container or key name).
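The key-naming idea is independent of Enterprise Library, so here is a minimal sketch of it in Java rather than C#. A plain HashMap stands in for the single CacheManager, and the class and method names are hypothetical. Rejecting a colon in the container name is enough to keep composite keys unambiguous, because the first colon then always marks the boundary.

```java
import java.util.HashMap;
import java.util.Map;

public class NamespacedCache {
    // Stand-in for one shared cache manager.
    private final Map<String, Object> store = new HashMap<>();

    /** Builds "{container name}:{item key}"; the container name must not contain ':'. */
    private static String compositeKey(String container, String key) {
        if (container.indexOf(':') >= 0) {
            throw new IllegalArgumentException("Container name must not contain ':'");
        }
        return container + ":" + key;
    }

    void add(String container, String key, Object value) {
        store.put(compositeKey(container, key), value);
    }

    Object getData(String container, String key) {
        return store.get(compositeKey(container, key));
    }

    public static void main(String[] args) {
        NamespacedCache cache = new NamespacedCache();
        // The same item key in two logical containers does not collide.
        cache.add("cache1", "test", "value1");
        cache.add("cache2", "test", "value2");
        System.out.println(cache.getData("cache1", "test"));
        System.out.println(cache.getData("cache2", "test"));
    }
}
```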
You can use UnityContainer:
IUnityContainer unityContainer = new UnityContainer();
IContainerConfigurator configurator = new UnityContainerConfigurator(unityContainer);
configurator.ConfigureCache("MyCache1");
IContainerConfigurator configurator2 = new UnityContainerConfigurator(unityContainer);
configurator2.ConfigureCache("MyCache2");
// here you can access both MyCache1 and MyCache2:
var cache1 = unityContainer.Resolve<ICacheManager>("MyCache1");
var cache2 = unityContainer.Resolve<ICacheManager>("MyCache2");
And this is the extension method for IContainerConfigurator (it must be declared inside a static class):
public static void ConfigureCache(this IContainerConfigurator configurator, string configKey)
{
ConfigurationSourceBuilder builder = new ConfigurationSourceBuilder();
DictionaryConfigurationSource configSource = new DictionaryConfigurationSource();
// simple inmemory cache configuration
builder.ConfigureCaching().ForCacheManagerNamed(configKey).WithOptions.StoreInMemory();
builder.UpdateConfigurationWithReplace(configSource);
EnterpriseLibraryContainer.ConfigureContainer(configurator, configSource);
}
With this approach you should keep a static IUnityContainer object; you can then add new caches, as well as reconfigure existing cache settings, anywhere you want.

Framework for generating BPEL at runtime?

I need to generate BPEL XML code at runtime. The only way I can do it now is to create an XML document by hand using the DOM API. But there must be a framework that eases such work by providing some kind of object model.
I guess it should look something like this:
BPELProcessFactory.CreateProcess().addSequence
Do you know any?
The Eclipse BPEL designer project provides an EMF model for BPEL 2.0. The generated code can be used to programmatically create BPEL code with a convenient API.
In case anyone stumbles upon this.
Yes this can be done using the BPEL Model.
Here is a sample piece of code which generates a quite trivial BPEL file:
public Process createBPEL()
{
Process process = null;
BPELFactory factory = BPELFactory.eINSTANCE;
try
{
ResourceSet rSet = new ResourceSetImpl();
rSet.getResourceFactoryRegistry().getExtensionToFactoryMap()
.put("bpel", new BPELResourceFactoryImpl());
File file = new File("myfile.bpel");
file.createNewFile();
String filePath = file.getAbsolutePath();
System.out.println(filePath);
AdapterRegistry.INSTANCE.registerAdapterFactory( BPELPackage.eINSTANCE, BasicBPELAdapterFactory.INSTANCE );
Resource resource = rSet.createResource(URI.createFileURI(filePath));
process = factory.createProcess();
process.setName("FirstBPEL");
Sequence seq = factory.createSequence();
seq.setName("MainSequence");
Receive recieve = factory.createReceive();
PortType portType = new PortTypeProxy(URI.createURI("http://baseuri"), new QName("qname"));
Operation operation = new OperationProxy(URI.createURI("http://localhost"), portType , "operation_name");
recieve.setOperation(operation);
Invoke invoke = factory.createInvoke();
invoke.setOperation(operation);
While whiles = factory.createWhile();
If if_st = factory.createIf();
List<Activity> activs = new ArrayList<Activity>();
activs.add(recieve);
activs.add(invoke);
activs.add(if_st);
activs.add(whiles);
seq.getActivities().addAll(activs);
process.setActivity(seq);
resource.getContents().add(process);
Map<String,String> map = new HashMap<String, String>();
map.put("bpel", "http://docs.oasis-open.org/wsbpel/2.0/process/executable");
map.put("xsd", "http://www.w3.org/2001/XMLSchema");
resource.save(map);
}
catch(Exception e)
{
e.printStackTrace();
}
return process;
}
To satisfy the dependencies, add the following jars from the plugins folder of the Eclipse installation directory to the project's build path:
org.eclipse.bpel.model_*.jar
org.eclipse.wst.wsdl_*.jar
org.eclipse.emf.common_*.jar
org.eclipse.emf.ecore_*.jar
org.eclipse.emf.ecore.xmi_*.jar
javax.wsdl_*.jar
org.apache.xerces_*.jar
org.eclipse.bpel.common.model_*.jar
org.eclipse.xsd_*.jar
org.eclipse.core.resources_*.jar
org.eclipse.osgi_*.jar
org.eclipse.core.runtime_*.jar
org.eclipse.equinox.common_*.jar
org.eclipse.core.jobs_*.jar
org.eclipse.core.runtime.compatibility_*.jar
