Can we read Hadoop's built-in counters for individual tasks periodically (say every 500 ms or 1 sec) and record them in a file? If so, how?
And how can we get the PIDs of the individual tasks?


Spark Job gets stuck at 99.7%

I'm trying to perform a simple join operation using Talend and Spark. The input data set has a few million records and the lookup data set has around 100 records (we might also need to join against a lookup data set with millions of records).
When I just read the input data and generate a flat file with the following memory settings, the job works fine and finishes quickly. But when I perform the join described above, the job gets stuck at 99.7%.
ExecutorMemory = 20g
Cores Per Executor = 4
Yarn resources allocation = Fixed
Num of executors = 100
spark.yarn.executor.memoryOverhead=6000 (From some preliminary research I found that this should be 10% of the executor memory, but that didn't help either.)
After a while (30-40 minutes) the job logs "Lost executor xx on". This is probably because the executor is kept waiting for too long and gets killed.
Has anyone run into this issue, where a Spark job gets stuck at 99.7% on a simple operation? Also, what are the recommended tuning properties for such a scenario?
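To put numbers on the settings listed above, here is a small arithmetic sketch in Java. It assumes that the default overhead is the larger of 384 MB and 10% of executor memory (the 10% figure comes from the question; the 384 MB floor is an assumption and may differ between Spark versions), and that each YARN container is sized as executor memory plus overhead. The class and method names are made up for illustration.

```java
public class ExecutorMemory {
    // Assumed floor for the default overhead, in MB (version dependent).
    static final long MIN_OVERHEAD_MB = 384;

    // Default overhead under the 10% rule quoted in the question.
    public static long defaultOverheadMb(long executorMemoryMb) {
        return Math.max(MIN_OVERHEAD_MB, executorMemoryMb / 10);
    }

    // Memory requested per executor container: heap plus overhead.
    public static long containerMb(long executorMemoryMb, long overheadMb) {
        return executorMemoryMb + overheadMb;
    }

    public static void main(String[] args) {
        long executorMb = 20 * 1024;      // ExecutorMemory = 20g
        long configuredOverheadMb = 6000; // spark.yarn.executor.memoryOverhead
        int executors = 100;              // Num of executors

        long perContainer = containerMb(executorMb, configuredOverheadMb);
        System.out.println("10% rule overhead (MB): " + defaultOverheadMb(executorMb));
        System.out.println("Per-container request (MB): " + perContainer);
        System.out.println("Cluster-wide request (MB): " + perContainer * executors);
    }
}
```

Under these assumptions the configured 6000 MB overhead is already about three times what the 10% rule would give, so raising it further is unlikely to be the fix on its own.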

How to see all Hadoop counters when running Pig

I run my Pig script from the command line, and I want to see all Hadoop counters after the run has finished.
I have written a UDF that writes to a Hadoop counter, based on this blog, and I want to test it. When the Pig script starts I can see logs from the constructor, but later I see no logs.
Currently all I see is simple statistics, see below:
Total records written : 3487
Total bytes written : 38078
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 101
Total records proactively spilled: 12464701
A Pig job is actually a MapReduce job, so you can see the status of the job and its complete list of counters on the JobTracker page (if using MR1) or the Application Master page (if using YARN).
A single Pig script may create multiple jobs, depending on its complexity. You can query all the counters for each job from the command line by running
mapred job -status <job-id>
If you know which counter you are interested in, you can retrieve it individually with
mapred job -counter <job-id> <group-name> <counter-name>
Of course, you need to know the job-id(s) - those should be available in the original pig output following the line 'Job DAG:'
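As a rough illustration of that last step, the Java sketch below pulls job IDs out of saved Pig output and builds the matching status commands. The sample text is a made-up excerpt, and the class and method names are hypothetical; the only assumption about the format is that job IDs look like `job_<digits>_<digits>`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PigJobIds {
    // Assumed shape of a Hadoop job ID: job_<digits>_<digits>.
    private static final Pattern JOB_ID = Pattern.compile("job_\\d+_\\d+");

    // Collects the distinct job IDs in the order they first appear.
    public static List<String> extractJobIds(String pigOutput) {
        Set<String> ids = new LinkedHashSet<>();
        Matcher m = JOB_ID.matcher(pigOutput);
        while (m.find()) {
            ids.add(m.group());
        }
        return new ArrayList<>(ids);
    }

    // Builds the status command from the answer for one job ID.
    public static String statusCommand(String jobId) {
        return "mapred job -status " + jobId;
    }

    public static void main(String[] args) {
        // Hypothetical excerpt of Pig's end-of-run output.
        String sample = "Job DAG:\n"
                + "job_201309151655_0001\t->\tjob_201309151655_0002,\n"
                + "job_201309151655_0002\n";
        for (String id : extractJobIds(sample)) {
            System.out.println(statusCommand(id));
        }
    }
}
```

Each printed line can then be run in a shell to dump the counters of one job.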

Limit number of processed records in Hadoop MapReduce

I have a huge file (a Hive table with over 20 billion records)
and I need to run a MapReduce job that processes only the first 10k records.
Is there an effective way to limit the number of records processed by Hadoop MapReduce?
You can use LIMIT with a task specification. However, if you have to do this repeatedly, a better automated solution is to use Oozie (a workflow scheduler for Hadoop), which can create partitions in Hive for your data.
You may use LIMIT:
But it returns 10k arbitrary records. Because MapReduce processes data blocks independently, you can't say which record is first and which is last.
Here is a trick to get what you want in case you know the order of records:
SET mapred.reduce.tasks = 1
You'll still have to process all 20 billion records.

Is there an Alternative for HBaseStorage in PIG

I am using HBaseStorage with -caching option in pig script as follows
HBaseStorage('countDetails:ansCount countDetails:divCount countDetails:unansCount countDetails:engCount countDetails:ineffCount countDetails:totalCount', '-caching 1000');
I can see this reflected in my job.xml,
but it makes no difference to the run time. I am processing 10 million records and storing around 160 MB of data into HBase.
When I store the result in HDFS the job takes 3 minutes, but the same job takes 30 minutes when storing into HBase.
I even tried by setting
SET hbase.client.scanner.caching 1000;
Please let me know how I can reduce the time.
Is there any alternative to HBaseStorage?
The above blog says that I have to set hbase.client.scanner.caching in the bootstrap script.
I don't know how to do that.
Will it be enough if I set it in the HBase conf?
Please help me out with this.
hbase.client.scanner.caching is the number of rows that will be fetched when calling next on a scanner if the call is not served from (local, client) memory.
Higher caching values enable faster scanners but use more memory, and some calls to next may take longer and longer when the cache is empty. Do not set this value so high that the time between invocations exceeds the scanner timeout.
That timeout is 1 minute by default. Clients must report in within this period or they are considered dead.
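The trade-off between caching size and the scanner timeout can be checked with simple arithmetic. The Java sketch below assumes the client spends a fixed time on each row before it asks the server for the next batch; the per-row times are invented for illustration and the class name is hypothetical.

```java
public class ScannerCachingBudget {
    // Rough time between two scanner calls to the server: the client works
    // through one cached batch of rows before it has to fetch again.
    public static long millisBetweenFetches(int cachingRows, long perRowMillis) {
        return cachingRows * perRowMillis;
    }

    // True if one batch is consumed before the scanner timeout would expire.
    public static boolean fitsWithinTimeout(int cachingRows, long perRowMillis, long timeoutMillis) {
        return millisBetweenFetches(cachingRows, perRowMillis) < timeoutMillis;
    }

    public static void main(String[] args) {
        long timeoutMillis = 60_000; // the 1 minute default quoted above
        int caching = 1000;          // the -caching value from the question
        System.out.println(fitsWithinTimeout(caching, 10, timeoutMillis));
        System.out.println(fitsWithinTimeout(caching, 100, timeoutMillis));
    }
}
```

With a caching value of 1000, the client has to get through each row in under 60 ms on average to stay inside a 1 minute timeout.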
In my experience HBase doesn't perform very well with Pig. If you don't have a requirement for random lookups, use only HDFS; otherwise an HBase MR job would be a better option. You can also connect to HBase from within a Hadoop MR job (this option gave me the best performance).

Reduce job pending in HFileOutputFormat

I am using
Hbase:0.92.1-cdh4.1.2, and
I have a MapReduce program that loads data from HDFS into HBase using HFileOutputFormat in cluster mode.
In that program I'm using HFileOutputFormat.configureIncrementalLoad() to bulk load an 800000-record data set of 7.3 GB, and it runs fine, but it does not run for a 900000-record data set of 8.3 GB.
With the 8.3 GB data set my MapReduce program has 133 maps and one reducer, and all maps complete successfully. My reducer stays in Pending status for a long time. There is nothing wrong with the cluster, since other jobs are running fine and this job also runs fine with up to 7.3 GB of data.
What could I be doing wrong?
How do I fix this issue?
I ran into the same problem. Looking at the JobTracker logs, I noticed there was not enough free space for the single reducer to run on any of my nodes:
2013-09-15 16:55:19,385 WARN org.apache.hadoop.mapred.JobInProgress: No room for reduce task. Node has 503,777,017,856 bytes free; but we expect reduce input to take 978136413988
This 503 GB refers to the free space available on one of the hard drives of that particular slave node, so the reducer apparently needs to copy all the data to a single drive.
The reason this happens is your table only has one region when it is brand new. As data is inserted into that region, it'll eventually split on its own.
A solution to this is to pre-create your regions when creating your table. The Bulk Loading Chapter in the HBase book discusses this, and presents two options for doing this. This can also be done via the HBase shell (see create's SPLITS argument I think). The challenge though is defining your splits such that the regions get an even distribution of keys. I've yet to solve this problem perfectly, but here's what I'm doing currently:
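// Assumes `admin` is an existing HBaseAdmin; "my_table" is a placeholder table name.
HTableDescriptor desc = new HTableDescriptor("my_table");
desc.addFamily(new HColumnDescriptor("my_col_fam"));
// Pre-create 100 regions, with split points spread between the start and end keys.
admin.createTable(desc, Bytes.toBytes(0), Bytes.toBytes(2147483647), 100);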
An alternative solution would be to not use configureIncrementalLoad, and instead: 1) generate your HFiles via MapReduce with no reducers; 2) use the completebulkload feature in hbase.jar to import your records into HBase. Of course, I think this runs into the same problem with regions, so you'll want to create the regions ahead of time too (I think).
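To show what "an even distribution of keys" could look like, here is a minimal Java sketch that computes evenly spaced split points over an integer key range and encodes them as 4-byte big-endian keys. It uses only the JDK and is not necessarily how HBase computes its own splits. It also assumes that row keys really are non-negative 4-byte integers spread uniformly over the range, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

public class EvenSplits {
    // Returns the (numRegions - 1) boundaries that cut [start, end] into
    // numRegions ranges of near-equal width.
    public static int[] splitPoints(int start, int end, int numRegions) {
        int[] points = new int[numRegions - 1];
        long width = (long) end - start; // long arithmetic avoids int overflow
        for (int i = 1; i < numRegions; i++) {
            points[i - 1] = (int) (start + width * i / numRegions);
        }
        return points;
    }

    // Encodes an int as 4 big-endian bytes, so that byte-wise ordering
    // matches numeric ordering for non-negative values.
    public static byte[] toKey(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static void main(String[] args) {
        int[] points = splitPoints(0, Integer.MAX_VALUE, 100);
        System.out.println("split count: " + points.length);
        System.out.println("first: " + points[0] + ", last: " + points[points.length - 1]);
    }
}
```

The resulting byte arrays could be collected into a byte[][] and passed as explicit split keys when creating the table. If the real keys are skewed (for example, string prefixes or timestamps), the split points would need to come from a sample of the data instead.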
Your job is running with a single reducer, which means 7 GB of data is being processed by a single task.
The main reason is that HFileOutputFormat starts reducers that sort and merge the data to be loaded into the HBase table.
Here, number of reducers = number of regions in the HBase table.
Increase the number of regions and you will get parallelism in the reducers. :)
You can get more details here:
