Set Default Block Size for Specific Path - alluxio

How can I set the default block size for a specific path? I don't know which property to set with the './alluxio fsadmin pathConf' command.

You can set the default block size for Alluxio files with the alluxio.user.block.size.bytes.default property; the default size is 64MB.
You can read more about this property and other related properties here: https://docs.alluxio.io/os/user/stable/en/reference/Properties-List.html#alluxio.user.block.size.bytes.default
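If you need that default only under a particular path rather than cluster-wide, the property can be attached to the path with fsadmin pathConf. A sketch, assuming a recent Alluxio 2.x install and using /data/tmp and 128MB purely as illustrative values:
./bin/alluxio fsadmin pathConf add --property alluxio.user.block.size.bytes.default=128MB /data/tmp
./bin/alluxio fsadmin pathConf show /data/tmp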

Related

dynamic filename and setting of variable

I am dynamically generating a filename in Informatica with the filename option in the target structure, and I am also setting the value of this dynamic filename to a mapping variable: SETVARIABLE($$m_FILENAME,FILENAME). But what I see is that the file is generated with one name and the variable is set with a different name.
target file - E1_ONBOARDING_0705016055915.txt
variable - E1_ONBOARDING_0705016054509.txt
I do not understand why there is a few seconds' difference in the timestamps.
When I debug it, it shows the same value.
Please help
Can you please try 'E1_ONBOARDING_'||TO_CHAR(SESSSTARTTIME,'MMDDYYYHHMISS')||'.txt', replacing SYSDATE with SESSSTARTTIME?
SESSSTARTTIME takes the time at which the session starts and remains the same for the whole session, whereas SYSDATE varies within the session.
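If it helps, here is a rough sketch of how the ports in the expression transformation might be wired so the target FileName column and the mapping variable receive exactly the same string (the port names v_FILENAME, o_FILENAME, and o_SETVAR are only illustrative):
v_FILENAME (variable port): 'E1_ONBOARDING_' || TO_CHAR(SESSSTARTTIME, 'MMDDYYYHHMISS') || '.txt'
o_FILENAME (output port, connected to the target FileName column): v_FILENAME
o_SETVAR (output port): SETVARIABLE($$m_FILENAME, v_FILENAME)
Because SESSSTARTTIME is evaluated once per session, both outputs carry an identical timestamp.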

Setting variables to the same value as other variables in cmd

I would like to give a new variable the same value as an old variable, as shown below:
set %NEWLPATH%=%OLDLPATH%
So the variable %NEWLPATH% needs to get the same value as %OLDLPATH%. The code shown above does not seem to work. Can someone help?
Don't use % on the left hand side of an assignment
set NEWLPATH=%OLDLPATH%
The % is only needed when retrieving the value of a variable.
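For example, a small self-contained snippet you could paste into a .bat file (the paths are placeholders):
@echo off
rem give the old variable a value
set OLDLPATH=C:\old\path
rem copy it into the new variable; no % on the left, % only when reading
set NEWLPATH=%OLDLPATH%
echo NEWLPATH is now %NEWLPATH%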

setting up VFP IDE environment for debugging

I am trying to set up the VFP environment for an app. I have tried SET DEFAULT and SET PATH TO, and I have also tried using Environment Manager to add all the directories of the project, but when I run the program I still have to use the Locate dialog to find the files that the program needs. The main program sets the environment, I think; the code looks like this:
CLOSE DATABASES ALL
CLOSE TABLES ALL
SET SYSMENU OFF
SET STATUS OFF
SET STATUS BAR OFF
_VFP.autoyield = .F.
IF FILE("c:\pb1\photobooth\photographer.exe")
SET DEFAULT TO c:\pb1\photobooth
ELSE
ON ERROR DO FORM FORMS\errorfrm WITH ERROR( ), MESSAGE( ), MESSAGE(1), PROGRAM( ), LINENO( )
ENDIF
SET PATH TO ..\CommandBars\Redistr,..\wwclient\,..\sfquery,..\classes,..\wwclient\classes, c:\sdt\sdt\source,c:\sdt\sdt\,..\xfrx,..\xfrx\xfrxlib
SET CLASSLIB TO (HOME()+"ffc\_reportlistener")
SET PROCEDURE TO PROGS\procfile ADDITIVE
SET PROCEDURE TO ..\xfrx\utilityreportlistener.prg ADDITIVE
SET PROCEDURE TO wwUtils ADDITIVE
SET PROCEDURE TO wwEval ADDITIVE
SET PROCEDURE TO CodeBlockClass ADDITIVE <-----
SET CLASSLIB TO wwIPStuff ADDITIVE
SET CLASSLIB TO wwXML ADDITIVE
SET PROCEDURE TO wwHTTP ADDITIVE
SET PROCEDURE TO WWPOP3 ADDITIVE
SET STATUS BAR ON
SET DATE BRITISH
SET DELETED ON
SET SAFETY OFF
SET MULTILOCKS ON
ON KEY LABEL SHIFT+F1 gl_diag=!gl_diag
I am looking for a way to run the program without errors so that I can find out why the app is not parsing all the data into an XML file. Tamar has provided a good guide to debugging; I just need to run the program to the point where the XML gets generated. The errors start at the point indicated by the arrow.
If the main program is setting the environment, you will likely be overwriting some of the settings by not using the ADDITIVE keyword. In your example, it looks like this is the case for SET PATH and SET CLASSLIB.
Example One - without ADDITIVE
*--- Main program
SET PATH TO "C:\VFP9"
*--- Debug setup
SET PATH TO "D:\Debug"
?set('Path')
Output: D:\Debug
Example Two - with ADDITIVE
*--- Main program
SET PATH TO "C:\VFP9"
*--- Debug setup
SET PATH TO "D:\Debug" ADDITIVE
?set('Path')
Output: D:\Debug;C:\VFP9
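The same ADDITIVE keyword works for SET CLASSLIB and SET PROCEDURE, so a debug setup can preserve whatever the main program establishes. A small sketch (the paths and names are only illustrative):
SET PATH TO "D:\Debug" ADDITIVE
SET CLASSLIB TO "D:\Debug\debugclasses" ADDITIVE
SET PROCEDURE TO "D:\Debug\debugprocs" ADDITIVE
? SET("PATH")       && shows the combined path list
? SET("CLASSLIB")   && shows the combined class libraries
? SET("PROCEDURE")  && shows the combined procedure files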

Pig: Force one mapper per input line/row

I have a Pig Streaming job where the number of mappers should equal the number of rows/lines in the input file. I know that setting
set mapred.min.split.size 16
set mapred.max.split.size 16
set pig.noSplitCombination true
will ensure that each split is 16 bytes. But how do I ensure that each map job has exactly one line as input? The lines are of variable length, so using a constant value for mapred.min.split.size and mapred.max.split.size is not the best solution.
Here is the code I intend to use:
input = load 'hdfs://cluster/tmp/input';
DEFINE CMD `/usr/bin/python script.py`;
OP = stream input through CMD;
dump OP;
SOLVED! Thanks to zsxwing
And, in case anyone else runs into this weird nonsense, know this:
To ensure that Pig creates one mapper for each input file you must set
set pig.splitCombination false
and not
set pig.noSplitCombination true
Why this is the case, I have no idea!
Following your clue, I browsed the Pig source code to find out the answer.
Setting pig.noSplitCombination in the Pig script doesn't work. In a Pig script you need to use pig.splitCombination; Pig then sets pig.noSplitCombination in the JobConf according to the value of pig.splitCombination.
If you want to set pig.noSplitCombination directly, you need to use the command line. For example,
pig -Dpig.noSplitCombination=true -f foo.pig
The difference between these two ways is that a set instruction in the Pig script is stored in the Pig properties, while a -D option is stored in the Hadoop Configuration.
If you use set pig.noSplitCombination true, then (pig.noSplitCombination, true) is stored in the Pig properties. But when Pig initializes a JobConf, it fetches the value of pig.splitCombination from the Pig properties, so your setting has no effect. Here is the source code. The correct way is set pig.splitCombination false, as you mentioned.
If you use -Dpig.noSplitCombination=true, (pig.noSplitCombination, true) is stored in the Hadoop Configuration. Since the JobConf is copied from the Configuration, the value given with -D is passed directly to the JobConf.
Finally, PigInputFormat reads pig.noSplitCombination from the JobConf to decide whether to combine splits. Here is the source code.
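Putting the two approaches together, here is a sketch of the corrected script and an equivalent command-line invocation. The input path, split sizes, and streaming command are taken from the question; the relation is renamed to raw_lines to avoid clashing with input, which is a reserved keyword in Pig Latin:
-- foo.pig
set pig.splitCombination false;
set mapred.min.split.size 16;
set mapred.max.split.size 16;
raw_lines = load 'hdfs://cluster/tmp/input';
DEFINE CMD `/usr/bin/python script.py`;
OP = stream raw_lines through CMD;
dump OP;
Or, leaving the script untouched and setting the Hadoop-level property directly on the command line:
pig -Dpig.noSplitCombination=true -f foo.pig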

uiimport does not save variables to the base workspace

I tried using uiimport to load a file into the base workspace. It worked the first time, but after trying again a while later, I wasn't seeing the variable in the base workspace.
I used the default variable names given by uiimport.
This was the command I used:
uiimport(filename)
Two variables were created by default, "data" and "textdata" (which is the header), but now when I run it, they are no longer saved in the base workspace.
I do not want to assign a variable to the uiimport like so...
K = uiimport(filename)
assignin('base','green',K)
I do not want to do that because
my dataset has a text header plus the data itself, and doing this would assign both "textdata" and "data" to the "green" variable.
How would I be able to get the dimensions of ONLY the "data" in green, and how would I pass only "data" (remember, the green variable holds both "data" and "textdata") to another function?
I was able to do all this when uiimport automatically saved the variables in the base workspace, but somehow now it doesn't.
I would appreciate any help or tips on this matter
One thing to note about UIIMPORT is that it will save variables to the workspace from which it is called. If you call it from the command window, the variables will be saved to the base workspace. However, if you call it from within a function, the variables will be saved in the workspace of the function. This may explain why you are not seeing the variables appear in the base workspace.
One solution would be to do the following, using the function ASSIGNIN:
K = uiimport(filename); %# Load your data into a structure K
assignin('base','green',K.data); %# Get the "data" field from K and assign
%# it to variable "green" in the base
%# workspace
Use
K = uiimport(filename);
green=[K.data];
to get only numerical data in your green variable.
uiimport returns the file data as a structure containing the fields data, textdata, and colheaders. To keep only the data field, assign another variable from K.data, or simply reassign K = K.data if you don't want the rest of the information contained in the file.
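For example, a minimal sketch run from the command window (the file name is a placeholder, and myFunction stands in for whatever function you want to pass the data to):
K = uiimport('mydata.txt');    % struct with fields data, textdata, colheaders
green = K.data;                % keep only the numeric block
[nRows, nCols] = size(green)   % dimensions of the numeric data alone
myFunction(green);             % pass just the numeric matrix to your own function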
