HBase bulkload fails to load Hfiles - hadoop

I'm trying to load HFiles into an HBase Docker container after an upgrade to HBase 2.2.3.
My HFile loading function looks as follows:
public void loadHfiles(String hfilesPath) throws IOException {
    Path hfiles = new Path(hfilesPath);
    Configuration conf = DataContext.getConfig();
    BulkLoadHFiles loader = BulkLoadHFiles.create(conf);
    loader.bulkLoad(HbaseDataIndex.dataContext.getTableName(), hfiles);
}
The HFile load fails with the following warnings:
[org.apache.hadoop.hbase.tool.LoadIncrementalHFiles] [WARN ] [main] Skipping non-directory file:/hfiles/14-02-2023-19-01-43/rHfiles/R/f0994365fa064243b81db927c18600ac
[org.apache.hadoop.hbase.tool.LoadIncrementalHFiles] [WARN ] [main] Bulk load operation did not find any files to load in directory /hfiles/14-02-2023-19-01-43/rHfiles/R. Does it contain files in subdirectories that correspond to column family names?
My HBase table contains the following schema:
COLUMN FAMILIES DESCRIPTION
{NAME => 'BL', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE',
CACHE_DATA_ON_WRITE => 'false' , DATA_BLOCK_ENCODING => 'NONE', TTL =>
'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER
=> 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false',
COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'R', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE',
CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL =>
'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER
=> 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false',
COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
I checked, and the HFile exists in the HBase container at /hfiles/14-02-2023-19-01-43/rHfiles/R/f0994365fa064243b81db927c18600ac with full r/w/x permissions.
With my previous HBase version, 2.1.0, my HFile loading function looked as follows:
Connection conn = ConnectionFactory.createConnection(DataContext.getConfig());
Admin admin = conn.getAdmin();
Table table = conn.getTable(HbaseDataIndex.dataContext.getTableName());
RegionLocator regionLocator = conn.getRegionLocator(HbaseDataIndex.dataContext.getTableName());
LoadIncrementalHFiles load = new LoadIncrementalHFiles(HbaseDataIndex.dataContext.getConfig());
load.doBulkLoad(new Path(hfilesPath), admin, table, regionLocator);
It worked fine. In the new version, I saw that LoadIncrementalHFiles is deprecated, so I switched to the BulkLoadHFiles implementation above.
What am I doing wrong here?
Must HFiles be on HDFS/S3 with the new BulkLoadHFiles implementation?
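One thing that may be worth checking first: the second warning asks whether the directory contains "files in subdirectories that correspond to column family names", and the directory being scanned, .../rHfiles/R, is itself named after the column family R. That suggests (as an assumption, not something confirmed in this thread) that the loader expects the parent of the family directories rather than a family directory. The sketch below does not call HBase at all. It is a plain java.nio check of a local directory, and the names HFileLayoutCheck and familyDirs are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

/** Hypothetical helper, not part of HBase: inspects a local bulk-load directory. */
final class HFileLayoutCheck {

    /**
     * Maps each immediate subdirectory of root (assumed to be a column family
     * directory) to the names of the regular files directly inside it.
     * Regular files sitting directly under root are skipped, which mirrors
     * the "Skipping non-directory" warning in the log.
     */
    static Map<String, List<String>> familyDirs(Path root) throws IOException {
        Map<String, List<String>> result = new TreeMap<>();
        try (Stream<Path> children = Files.list(root)) {
            for (Path child : (Iterable<Path>) children::iterator) {
                if (!Files.isDirectory(child)) {
                    continue; // a plain file at this level is not a family directory
                }
                List<String> files = new ArrayList<>();
                try (Stream<Path> inner = Files.list(child)) {
                    inner.filter(Files::isRegularFile)
                         .forEach(p -> files.add(p.getFileName().toString()));
                }
                Collections.sort(files);
                result.put(child.getFileName().toString(), files);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Pass the directory you intend to hand to the bulk load.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        Map<String, List<String>> layout = familyDirs(root);
        if (layout.isEmpty()) {
            System.out.println("No subdirectories under " + root
                    + "; is this a family directory rather than its parent?");
        }
        layout.forEach((family, files) -> System.out.println(family + " -> " + files));
    }
}
```

Given the file location described above, running this against /hfiles/14-02-2023-19-01-43/rHfiles would be expected to list R together with its HFile, while running it against .../rHfiles/R would report no subdirectories. Whether passing the parent directory is enough to make the load succeed on 2.2.3 still needs to be verified against the cluster.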

Related

How to speed up Magento 2 with Redis?

I am using Magento 2, but its page load time is more than 4 s.
I have already configured Varnish.
After configuring Redis, there are still a huge number of SQL queries on the catalog page. Why? How can I speed up this page?
The Redis config in app/etc/env.php is:
'cache' => array(
    'frontend' => array(
        'default' => array(
            'backend' => 'Cm_Cache_Backend_Redis',
            'backend_options' => array(
                'server' => '127.0.0.1',
                'database' => '0',
                'port' => '6379',
            ),
        ),
        'page_cache' => array(
            'backend' => 'Cm_Cache_Backend_Redis',
            'backend_options' => array(
                'server' => '127.0.0.1',
                'port' => '6379',
                'database' => '1',
                'compress_data' => '0',
            ),
        ),
    ),
),
=======================
The catalog page query is: (screenshot not included)
To enable the Magento 2 profiler, follow the steps below:
Set the environment variable MAGE_PROFILER = html.
Enable Magento 2 developer mode:
php bin/magento deploy:mode:set developer
Clear the cache, or disable the cache for the time being.
Open the website in a browser incognito (private) window. Browse to the slow page and check the bottom of it: you will see the profiler output showing the calls and the time taken.

Creating a Phoenix table view with existing Hbase tables of common names

I've cloned a table and I'm trying to create a Phoenix view for it according to https://phoenix.apache.org/faq.html#How_I_map_Phoenix_table_to_an_existing_HBase_table.
Suppose I have two HBase tables below.
hbase(main):008:0> describe 'USERINFO'
Table USERINFO is ENABLED
USERINFO, {TABLE_ATTRIBUTES => {coprocessor$1 => '|org.apache.phoenix.coprocessor.ScanRegionObserver|805306366|', coprocessor$2 => '|org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver|805306366|', coprocessor$3 => '|org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver|805306366|', coprocessor$4 => '|org.apache.phoenix.coprocessor.ServerCachingEndpointImpl|805306366|', coprocessor$5 => '|org.apache.phoenix.hbase.index.Indexer|805306366|index.builder=org.apache.phoenix.index.PhoenixIndexBuilder,org.apache.hadoop.hbase.index.codec.class=org.apache.phoenix.index.PhoenixIndexCodec', coprocessor$6 => '|org.apache.hadoop.hbase.regionserver.LocalIndexSplitter|805306366|'}
COLUMN FAMILIES DESCRIPTION
{NAME => '0', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
hbase(main):006:0> describe 'USERPREFERENCE'
Table USERPREFERENCE is ENABLED
USERPREFERENCE, {TABLE_ATTRIBUTES => {coprocessor$1 => '|org.apache.phoenix.coprocessor.ScanRegionObserver|805306366|', coprocessor$2 => '|org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver|805306366|', coprocessor$3 => '|org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver|805306366|', coprocessor$4 => '|org.apache.phoenix.coprocessor.ServerCachingEndpointImpl|805306366|', coprocessor$5 => '|org.apache.phoenix.hbase.index.Indexer|805306366|index.builder=org.apache.phoenix.index.PhoenixIndexBuilder,org.apache.hadoop.hbase.index.codec.class=org.apache.phoenix.index.PhoenixIndexCodec', coprocessor$6 => '|org.apache.hadoop.hbase.regionserver.LocalIndexSplitter|805306366|'}
COLUMN FAMILIES DESCRIPTION
{NAME => '0', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1', TTL => 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
Would I run the following commands to create the views in Phoenix?
CREATE VIEW USERINFO ( pk VARCHAR PRIMARY KEY, "0".team VARCHAR, "0".firstname, "0".lastname )
CREATE VIEW USERPREFERENCE ( pk VARCHAR PRIMARY KEY, "0".firstname VARCHAR, "0".lastname )
This seems incorrect. How do I create the views in this situation?
I believe what you are missing is that you need to specify the data type of each column. That is, you need to state the type of the firstname column, and so on.
I tried and tested this code. First I fired up my HBase shell (don't worry about DCDR, it's just my namespace; our DBAs have this pretty much locked down):
create 'DCDR:USERINFOV2', {NAME => '0', VERSIONS => 5}
create 'DCDR:USERPREFERENCEV2', {NAME => '0', VERSIONS => 5}
Then I fired up my Phoenix shell:
CREATE VIEW "DCDR:USERINFOV2" ( pk VARCHAR PRIMARY KEY, "0".team VARCHAR, "0".firstname VARCHAR, "0".lastname VARCHAR);
CREATE VIEW "DCDR:USERPREFERENCEV2" ( pk VARCHAR PRIMARY KEY, "0".firstname VARCHAR, "0".lastname VARCHAR);
This gives me the views. Hopefully this works for you. Let me know if not.

How to configure RealURL 2.1.x to prevent unwanted entries in tx_realurl_urldata

We have a multi-language setup for our homepage:
en.html is mapped to index.php?L=0
and
de.html is mapped to index.php?L=1
Rootpage ID = 76
RealURL 1.x.x worked fine for five years, but since upgrading to 2.1.x I have been getting some strange effects. For example:
Today the following entry was stored in tx_realurl_urldata:
original_url= L=1%27A%3D0&id=76
speaking_url = de.html
request_variables = {"id":"76","L":"1'A=0"}
and de.html showed the content of the English version. Deleting that row fixes the problem, but I am sure this entry will reappear.
What should be done in the RealURL configuration so that only allowed languages are stored in the request variables?
My (custom-made) RealURL configuration file looks like this:
$TYPO3_CONF_VARS['EXTCONF']['realurl'] = array(
    '_DEFAULT' => array(
        'init' => array(
            'enableCHashCache' => 1,
            'appendMissingSlash' => 'ifNotFile',
            'enableUrlDecodeCache' => 1,
            'enableUrlEncodeCache' => 1,
            'emptyUrlReturnValue' => '/',
            'postVarSet_failureMode' => '',
        ),
        'cache' => array('banUrlsRegExp' => '/ContactLeadId=|gclid=|type=|(?:^|\?|&)q=/'),
        'redirects' => array(),
        'preVars' => array(
            array(
                'GETvar' => 'L',
                'valueMap' => array(
                    // disable all languages that are not going live after all
                    //'en' => 0, // international (needs no URL part because it's the default language)
                    'de' => 1, // Germany
                    'it' => 2, // Italy
                    'cz' => 3, // Czech Republic
                    'fr' => 4, // France
                    'ch_de' => 6, // Switzerland - German
                    'at' => 7, // Austria
                    'es' => 18, // Spain
                    'ch_fr' => 19, // Switzerland - French
                ),
                'noMatch' => 'bypass',
            ),
        ),
        'pagePath' => array(
            'type' => 'user',
            'userFunc' => 'EXT:realurl/class.tx_realurl_advanced.php:&tx_realurl_advanced->main',
            'spaceCharacter' => '-',
            'languageGetVar' => 'L',
            'expireDays' => 7,
            'rootpage_id' => 76,
        ),
After that come only fixedPostVars, filenames, postVarSets, and some definitions for different domains, but I don't think those settings matter for the "L=" problem.

Symfony 3: query_builder search by boolean field

I have a search form and I'd like to filter data by a boolean field. The problem is that if the selected choice has the false value (0), the query returns all data, but if the selected choice has the true value (1), the query is correct.
In the FormType:
->add('publier', ChoiceType::class, array(
    'required' => false,
    'label' => 'Publier',
    'choices' => array('oui' => '1', 'non' => '0'),
    'multiple' => false,
    'expanded' => false,
    'attr' => array('class' => 'form-control')
));
and in the query_builder
if (!empty($publier)) {
    $qb->andWhere('a.publier = :publier')
        ->setParameter('publier', $publier);
}
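If I remove the test if (!empty($publier)) { and select the false choice, the returned data is correct, but I can't remove this test.
I have changed
if (!empty($publier))
to
if (null !== $publier)
and it works fine now. PHP's empty() treats the string '0' as empty, so the original test skipped the filter whenever "non" was selected.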
I'm not sure I understand your question clearly, but if it's a boolean you want, you should try this:
->add('publier', ChoiceType::class, array(
    'required' => false,
    'label' => 'Publier',
    'choices' => array(
        'oui' => true,
        'non' => false
    ),
    'multiple' => false,
    'expanded' => false,
    'attr' => array('class' => 'form-control')
));
Not sure if it will work. The way you have it with '1' and '0', those are strings (not integers).

Using and Configuring Zend Session and Zend Cache Memcached - Zend Framework 2.3

Currently, I'm using the "standard" session manager config:
http://framework.zend.com/manual/current/en/modules/zend.session.manager.html
I want to use a cache and save my session data in the server's cache (memcached) to improve performance and scalability.
I set php.ini like this (memcached on localhost):
session.save_handler=memcached
session.save_path= "tcp://127.0.0.1"
and it shows this error:
Warning: session_start(): Cannot find save handler 'memcached' - session startup failed in C:\Program Files (x86)\xampp\htdocs\Zend-application\vendor\zendframework\zendframework\library\Zend\Session\SessionManager.php on line 98
So I don't understand how to configure my config/autoload/global.php and module/Application/Module.php. This is the first time I have tried to implement memcached, and caching in general. Thanks so much!
I tried to modify module/Application/Module.php like this:
// --- add session and cache ---
use Zend\Session\Config\SessionConfig;
use Zend\Session\Container;
use Zend\Cache\StorageFactory;
use Zend\Session\SaveHandler\Cache;
use Zend\Session\SessionManager;
use Zend\Session\Validator\HttpUserAgent;
use Zend\Session\Validator\RemoteAddr;
// --- end session and cache ---
public function onBootstrap($e)
{
    $eventManager = $e->getApplication()->getEventManager();
    $moduleRouteListener = new ModuleRouteListener();
    $moduleRouteListener->attach($eventManager);
    $this->initSession(array(
        'remember_me_seconds' => 180,
        'use_cookies' => true,
        'cookie_httponly' => true,
        'validators' => array(
            'Zend\Session\Validator\RemoteAddr',
            'Zend\Session\Validator\HttpUserAgent',
            'phpSaveHandler' => 'memcached',
            'savePath' => 'tcp://127.0.0.1',
        )
    ));
}
public function initSession($config)
{
    $sessionConfig = new SessionConfig();
    $sessionConfig->setOptions($config);
    $sessionManager = new SessionManager($sessionConfig);
    $sessionManager->getValidatorChain()
        ->attach(
            'session.validate',
            array(new HttpUserAgent(), 'isValid')
        )
        ->attach(
            'session.validate',
            array(new RemoteAddr(), 'isValid')
        );
    $cache = StorageFactory::factory(array(
        'adapter' => array(
            'name' => 'memcached',
            'options' => array(
                'server' => '127.0.0.1',
            ),
        )
    ));
    $saveHandler = new Cache($cache);
    $sessionManager->setSaveHandler($saveHandler);
    $sessionManager->start();
    Container::setDefaultManager($sessionManager);
}
but it shows these errors:
Warning: ini_set() expects parameter 2 to be string, array given in C:\Program Files (x86)\xampp\htdocs\Zend-application\vendor\zendframework\zendframework\library\Zend\Session\Config\SessionConfig.php on line 88
Fatal error: Call to undefined method Zend\Stdlib\CallbackHandler::attach() in C:\Program Files (x86)\xampp\htdocs\Zend-application\module\Application\Module.php on line 68
This is my config/autoload/global.php:
return array(
    'db' => array(
        'driver' => 'Pdo_Mysql',
        'charset' => 'utf-8',
        'dsn' => 'mysql:dbname=mydb;host=localhost',
        'driver_options' => array(
            PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES \'UTF8\''
        ),
    ),
    'service_manager' => array(
        'factories' => array(
            'Zend\Db\Adapter\Adapter' => 'Zend\Db\Adapter\AdapterServiceFactory',
        ),
    ),
    'session' => array(
        'config' => array(
            'class' => 'Zend\Session\Config\SessionConfig',
            'options' => array(
                'name' => 'zend-application',
            ),
        ),
        'storage' => 'Zend\Session\Storage\SessionArrayStorage',
        'validators' => array(
            'Zend\Session\Validator\RemoteAddr',
            'Zend\Session\Validator\HttpUserAgent',
        ),
    ),
);
Hoping it will help someone: I resolved my issue. I'm working in a Windows 7 environment and memcached doesn't work on it! I changed:
session.save_handler=memcached
session.save_path= "tcp://127.0.0.1"
to
session.save_handler=memcache
session.save_path= "tcp://127.0.0.1:11211"
I restored the "standard" session manager config and memcache works correctly. When I transfer the entire site to the Apache server, I'll change php.ini to use memcached.
http://framework.zend.com/manual/current/en/modules/zend.session.manager.html
