JavaFX: ObservableMap keySet as an ObservableSet - java-8

I want to transform an ObservableMap's keySet into a read-only ObservableSet. I don't want to copy the values; any modification to the ObservableMap must be reflected in the observable key set, so that if I bind another set to the observable key set's content, its content is updated automatically.
This is what I would like to write:
ObservableMap<String, Object> map = FXCollections.observableHashMap();
ObservableSet<String> keySet = FXCollections.observableKeySet(map);
Set<String> boundSet = new HashSet<String>();
Bindings.bindContent(boundSet, keySet);
map.put("v", new Object());
assert boundSet.contains("v");
Does this functionality exist in the JavaFX SDK?

The feature you request does not need a special ObservableSet. It’s already part of the Map interface contract:
ObservableMap<String, Object> map = FXCollections.observableHashMap();
Set<String> keySet = map.keySet();
map.put("v", new Object());
assert keySet.contains("v");
A Map’s keyset always reflects the changes made to the backing map.
http://docs.oracle.com/javase/8/docs/api/java/util/Map.html#keySet--
Returns a Set view of the keys contained in this map. The set is backed by the map, so changes to the map are reflected in the set, and vice-versa.
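And the view works in both directions, as the quoted Javadoc says; a quick sketch:
ObservableMap<String, Object> map = FXCollections.observableHashMap();
Set<String> keySet = map.keySet();
map.put("v", new Object());
assert keySet.contains("v");
keySet.remove("v");              // removing from the key set view...
assert !map.containsKey("v");    // ...also removes the entry from the backing map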

As far as I know, there's no built-in way to do it.
Here's a utility method that I made to handle this:
/**
* Builds and returns an observable keyset of the specified map. Note that the resulting
* keyset is a new set that is guaranteed to have the same items as the map's true
* keyset, but does not make any guarantees about iteration order or implementation
* details of the keyset (for example, if the Map is a SortedMap, the keyset
* will not necessarily maintain keys in sorted order).
* @param <K> Map's key type
* @param <V> Map's value type
* @param map the ObservableMap to which the set should be bound
* @return a new observable set reflecting the map's keys
*/
public static <K, V> ObservableSet<K> getObservableKeySet(ObservableMap<K, V> map) {
ObservableSet<K> set = FXCollections.observableSet(new HashSet<>());
map.addListener(
(MapChangeListener<K, V>)(
change -> {
if(change.wasAdded() && !change.wasRemoved()) {
set.add(change.getKey());
}
if(change.wasRemoved() && !change.wasAdded()) {
set.remove(change.getKey());
}
//Note that if change was added and removed, that means that
//the key was unchanged and a value was just replaced. That
//shouldn't affect the keyset so we do nothing
})
);
return set;
}
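Usage then mirrors the code from the question; a quick sketch, assuming the utility method above is in scope:
ObservableMap<String, Object> map = FXCollections.observableHashMap();
ObservableSet<String> keySet = getObservableKeySet(map);
Set<String> boundSet = new HashSet<>();
Bindings.bindContent(boundSet, keySet);
map.put("v", new Object());
assert boundSet.contains("v");   // the bound set sees the addition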
Update:
I decided that I didn't like how the iteration order of the resulting set wouldn't match the original map's iteration order, so instead I made a new class that acts as an observable wrapper around a map's keyset. This is more like Map's built-in keySet() method where it returns a view of the actual set. This just adds in the listeners that make it observable:
/**
* Observable view of an ObservableMap's keyset
*/
public class ObservableKeySet<K, V> implements ObservableSet<K> {
/**
* The map's actual keyset object that gets wrapped
*/
private Set<K> wrappedSet;
/**
* Invalidation listeners to be notified when the set changes. Note that we end
* up calling these more than we should since invalidation listeners should only
* be called if the observable value is observed between changes and we're going
* to call them on every change. However, the ObservableSet returned from
* FXCollections.observableSet actually also does that too, so I feel
* like we can get away with it.
*/
private Collection<InvalidationListener> invalidationListeners = new ArrayList<>();
/**
* Change listeners to be notified when the set changes
*/
private Collection<SetChangeListener<? super K>> changeListeners = new ArrayList<>();
/**
* Creates an Observable Set view of the specified map's keyset
* @param map ObservableMap
*/
public ObservableKeySet(ObservableMap<K, V> map) {
this.wrappedSet = map.keySet();
map.addListener((MapChangeListener<K,V>)this::onMapChange);
}
/**
* Code to be executed on any map change. It determines whether there is a resulting
* set change and triggers listeners as appropriate.
* @param change the map change that occurred
*/
private void onMapChange(MapChangeListener.Change<? extends K, ? extends V> change) {
SetChangeListener.Change<K> setChange = null;
//Note that if the map change says that there was an add and removal, then
//that means a value was getting replaced, which doesn't result in a keySet
//change
if(change.wasAdded() && ! change.wasRemoved()) {
setChange = new BasicSetChange(true, change.getKey());
}
else if(change.wasRemoved() && ! change.wasAdded()) {
setChange = new BasicSetChange(false, change.getKey());
}
if(setChange != null) {
invalidationListeners.forEach(listener -> listener.invalidated(this));
final SetChangeListener.Change<K> finalChange = setChange;
changeListeners.forEach(listener -> listener.onChanged(finalChange));
}
}
@Override
public void addListener(InvalidationListener listener) {
invalidationListeners.add(listener);
}
@Override
public void removeListener(InvalidationListener listener) {
invalidationListeners.remove(listener);
}
@Override
public void addListener(SetChangeListener<? super K> listener) {
changeListeners.add(listener);
}
@Override
public void removeListener(SetChangeListener<? super K> listener) {
changeListeners.remove(listener);
}
//Simple wrapper methods that either pass through to the wrapped set or
//throw an UnsupportedOperationException
@Override public int size() {return wrappedSet.size();}
@Override public boolean isEmpty() {return wrappedSet.isEmpty();}
@Override public boolean contains(Object o) {return wrappedSet.contains(o);}
@Override public Iterator<K> iterator() {return wrappedSet.iterator();}
@Override public Object[] toArray() {return wrappedSet.toArray();}
@Override public <T> T[] toArray(T[] a) {return wrappedSet.toArray(a);}
@Override public boolean containsAll(Collection<?> c) {return wrappedSet.containsAll(c);}
@Override public boolean add(K e) {throw new UnsupportedOperationException();}
@Override public boolean remove(Object o) {throw new UnsupportedOperationException();}
@Override public boolean addAll(Collection<? extends K> c) {throw new UnsupportedOperationException();}
@Override public boolean retainAll(Collection<?> c) {throw new UnsupportedOperationException();}
@Override public boolean removeAll(Collection<?> c) {throw new UnsupportedOperationException();}
@Override public void clear() {throw new UnsupportedOperationException();}
/**
* Simple implementation of {@link SetChangeListener.Change}
*/
private class BasicSetChange extends SetChangeListener.Change<K> {
/** If true, it is an add change, otherwise it is a remove change*/
private final boolean isAdd;
/** Value that was added or removed */
private final K value;
/**
* @param isAdd {@link #isAdd}
* @param value {@link #value}
*/
public BasicSetChange(boolean isAdd, K value) {
super(ObservableKeySet.this);
this.isAdd = isAdd;
this.value = value;
}
@Override
public boolean wasAdded() {
return isAdd;
}
@Override
public boolean wasRemoved() {
return !isAdd;
}
@Override
public K getElementAdded() {
return isAdd ? value : null;
}
@Override
public K getElementRemoved() {
return isAdd ? null : value;
}
}
}
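Usage is the same idea; a small sketch:
ObservableMap<String, Object> map = FXCollections.observableHashMap();
ObservableSet<String> keys = new ObservableKeySet<>(map);
keys.addListener((SetChangeListener<String>) change ->
System.out.println(change.wasAdded()
? "added " + change.getElementAdded()
: "removed " + change.getElementRemoved()));
map.put("v", new Object());   // prints "added v"
map.remove("v");              // prints "removed v"
Bindings.bindContent should also accept this set, since it implements ObservableSet and fires the corresponding change events.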

Related

Edit next row on tab

When I put all the code in an SSCCE, it works as expected, i.e. the first and third cells are editable, and pressing Tab on the last column takes you to the next row.
import java.text.NumberFormat;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.ArrayList;
import java.util.List;
import javafx.application.Application;
import static javafx.application.Application.launch;
import javafx.application.Platform;
import javafx.beans.property.ListProperty;
import javafx.beans.property.SimpleListProperty;
import javafx.beans.value.ChangeListener;
import javafx.beans.value.ObservableValue;
import javafx.collections.FXCollections;
import javafx.collections.ObservableList;
import javafx.event.ActionEvent;
import javafx.event.EventHandler;
import javafx.geometry.Insets;
import javafx.scene.Group;
import javafx.scene.Scene;
import javafx.scene.control.Button;
import javafx.scene.control.ContentDisplay;
import javafx.scene.control.TableCell;
import javafx.scene.control.TableColumn;
import javafx.scene.control.TableColumn.CellEditEvent;
import javafx.scene.control.TablePosition;
import javafx.scene.control.TableView;
import javafx.scene.control.TextField;
import javafx.scene.control.cell.PropertyValueFactory;
import javafx.scene.input.KeyCode;
import javafx.scene.input.KeyEvent;
import javafx.scene.layout.VBox;
import javafx.stage.Stage;
import javafx.util.Callback;
/*
* To change this license header, choose License Headers in Project Properties.
* To change this template file, choose Tools | Templates
* and open the template in the editor.
*/
/**
*
* @author Yunus
*/
public class CollectionForm extends Application{
private TableView table = new TableView();
private ObservableList<Collection> collectionList = FXCollections.<Collection>observableArrayList();
ListProperty<Collection> collectionListProperty = new SimpleListProperty<>();
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
launch(args);
}
@Override
public void start(Stage stage) {
// single cell selection mode
table.getSelectionModel().setCellSelectionEnabled(true);
//Create a custom cell factory so that cells can support editing.
Callback<TableColumn, TableCell> editableFactory = new Callback<TableColumn, TableCell>() {
@Override
public TableCell call(TableColumn p) {
return new EditableTableCell();
}
};
//A custom cell factory that creates cells that only accept numerical input.
Callback<TableColumn, TableCell> numericFactory = new Callback<TableColumn, TableCell>() {
@Override
public TableCell call(TableColumn p) {
return new NumericEditableTableCell();
}
};
Button b = createSaveCollectionBtn();
//Create columns
TableColumn colMNO = createMNOColumn(editableFactory);
TableColumn colName = createNameColumn(editableFactory);
TableColumn colQty = createQuantityColumn(numericFactory);
table.getColumns().addAll(colMNO, colName, colQty);
//Make the table editable
table.setEditable(true);
collectionListProperty.set(collectionList);
table.itemsProperty().bindBidirectional(collectionListProperty);
collectionList.add(new Collection());
collectionList.add(new Collection());
Scene scene = new Scene(new Group());
stage.setTitle("Table View Sample");
final VBox vbox = new VBox();
vbox.setSpacing(5);
vbox.getChildren().addAll(b, table);
vbox.setPadding(new Insets(10, 0, 0, 10));
((Group) scene.getRoot()).getChildren().addAll(vbox);
stage.setScene(scene);
stage.show();
}
private void handleCollection(ActionEvent event){
for (Collection collection : collectionList) {
System.out.println("MNO: "+collection.getMno()+" Quantity: "+collection.getQuantity());
}
}
private Button createSaveCollectionBtn(){
Button btn = new Button("Save Collection");
btn.setId("btnSaveCollection");
btn.setOnAction(this::handleCollection);
return btn;
}
private TableColumn createQuantityColumn(Callback<TableColumn, TableCell> editableFactory) {
TableColumn colQty = new TableColumn("Quantity");
colQty.setMinWidth(25);
colQty.setId("colQty");
colQty.setCellValueFactory(new PropertyValueFactory("quantity"));
colQty.setCellFactory(editableFactory);
colQty.setOnEditCommit(new EventHandler<CellEditEvent<Collection, Long>>() {
@Override
public void handle(CellEditEvent<Collection, Long> t) {
((Collection) t.getTableView().getItems().get(t.getTablePosition().getRow())).setQuantity(t.getNewValue());
}
});
return colQty;
}
private TableColumn createMNOColumn(Callback<TableColumn, TableCell> editableFactory) {
TableColumn colMno = new TableColumn("M/NO");
colMno.setMinWidth(25);
colMno.setId("colMNO");
colMno.setCellValueFactory(new PropertyValueFactory("mno"));
colMno.setCellFactory(editableFactory);
colMno.setOnEditCommit(new EventHandler<CellEditEvent<Collection, String>>() {
@Override
public void handle(CellEditEvent<Collection, String> t) {
((Collection) t.getTableView().getItems().get(t.getTablePosition().getRow())).setMno(t.getNewValue());
}
});
return colMno;
}
private TableColumn createNameColumn(Callback<TableColumn, TableCell> editableFactory) {
TableColumn colName = new TableColumn("Name");
colName.setEditable(false);
colName.setMinWidth(100);
colName.setId("colName");
colName.setCellValueFactory(new PropertyValueFactory<Collection, String>("name"));
colName.setCellFactory(editableFactory);
//Modifying the firstName property
colName.setOnEditCommit(new EventHandler<CellEditEvent<Collection, String>>() {
@Override
public void handle(CellEditEvent<Collection, String> t) {
((Collection) t.getTableView().getItems().get(t.getTablePosition().getRow())).setName(t.getNewValue());
}
});
return colName;
}
/**
*
* @author Graham Smith
*/
public class EditableTableCell<S extends Object, T extends String> extends AbstractEditableTableCell<S, T> {
public EditableTableCell() {
}
@Override
protected String getString() {
return getItem() == null ? "" : getItem().toString();
}
@Override
protected void commitHelper( boolean losingFocus ) {
commitEdit(((T) textField.getText()));
}
}
/**
*
* @author Graham Smith
*/
public class NumericEditableTableCell<S extends Object, T extends Number> extends AbstractEditableTableCell<S, T> {
private final NumberFormat format;
private boolean emptyZero;
private boolean completeParse;
/**
* Creates a new {@code NumericEditableTableCell} which treats empty strings as zero,
* will parse integers only and will fail if it can't parse the whole string.
*/
public NumericEditableTableCell() {
this( NumberFormat.getInstance(), true, true, true );
}
/**
* The integerOnly and completeParse settings have a complex relationship and care needs
* to be taken to get the correct result.
* <ul>
* <li>If you want to accept only integers and you want to parse the whole string then
* set both integerOnly and completeParse to true. Strings such as 1.5 will be rejected
* as invalid. A string such as 1000 will be accepted as the number 1000.</li>
* <li>If you only want integers but don't care about parsing the whole string set
* integerOnly to true and completeParse to false. This will parse a string such as
* 1.5 and provide the number 1. The downside of this combination is that it will accept
* the string 1x and return the number 1 also.</li>
* <li>If you want to accept decimals and want to parse the whole string set integerOnly
* to false and completeParse to true. This will accept a string like 1.5 and return
* the number 1.5. A string such as 1.5x will be rejected.</li>
* <li>If you want to accept decimals and don't care about parsing the whole string set
* both integerOnly and completeParse to false. This will accept a string like 1.5x and
* return the number 1.5. A string like x1.5 will be rejected because it doesn't start
* with a number. The downside of this combination is that a string like 1.5x3 will
* provide the number 1.5.</li>
* </ul>
*
* @param format the {@code NumberFormat} to use to format this cell.
* @param emptyZero if true an empty cell will be treated as zero.
* @param integerOnly if true only the integer part of the string is parsed.
* @param completeParse if true an exception will be thrown if the whole string given can't be parsed.
*/
public NumericEditableTableCell( NumberFormat format, boolean emptyZero, boolean integerOnly, boolean completeParse ) {
this.format = format;
this.emptyZero = emptyZero;
this.completeParse = completeParse;
format.setParseIntegerOnly(integerOnly);
}
@Override
protected String getString() {
return getItem() == null ? "" : format.format(getItem());
}
/**
* Parses the value of the text field and if matches the set format
* commits the edit; otherwise it returns the cell to its previous value.
*/
@Override
protected void commitHelper( boolean losingFocus ) {
if( textField == null ) {
return;
}
try {
String input = textField.getText();
if (input == null || input.length() == 0) {
if(emptyZero) {
setText( format.format(0) );
commitEdit( (T)new Integer( 0 ));
}
return;
}
int startIndex = 0;
ParsePosition position = new ParsePosition(startIndex);
Number parsedNumber = format.parse(input, position);
if (completeParse && position.getIndex() != input.length()) {
throw new ParseException("Failed to parse complete string: " + input, position.getIndex());
}
if (position.getIndex() == startIndex ) {
throw new ParseException("Failed to parse a number from the string: " + input, position.getIndex());
}
commitEdit( (T)parsedNumber );
} catch (ParseException ex) {
//Most of the time we don't mind if there is a parse exception as it
//indicates duff user data but in the case where we are losing focus
//it means the user has clicked away with bad data in the cell. In that
//situation we want to just cancel the editing and show them the old
//value.
if( losingFocus ) {
cancelEdit();
}
}
}
}
/**
* Provides the basis for an editable table cell using a text field. Sub-classes can provide formatters for display and a
* commitHelper to control when editing is committed.
*
* @author Graham Smith
*/
public abstract class AbstractEditableTableCell<S, T> extends TableCell<S, T> {
protected TextField textField;
public AbstractEditableTableCell() {
}
/**
* Any action attempting to commit an edit should call this method rather than commit the edit directly itself. This
* method will perform any validation and conversion required on the value. For text values that normally means this
* method just commits the edit but for numeric values, for example, it may first parse the given input. <p> The only
* situation that needs to be treated specially is when the field is losing focus. If your user hits enter to commit the
* cell with bad data we can happily cancel the commit and force them to enter a real value. If they click away from the
* cell though we want to give them their old value back.
*
* @param losingFocus true if the reason for the call was because the field is losing focus.
*/
protected abstract void commitHelper(boolean losingFocus);
/**
* Provides the string representation of the value of this cell when the cell is not being edited.
*/
protected abstract String getString();
@Override
public void startEdit() {
super.startEdit();
if (textField == null) {
createTextField();
}
setGraphic(textField);
setContentDisplay(ContentDisplay.GRAPHIC_ONLY);
Platform.runLater(new Runnable() {
@Override
public void run() {
textField.selectAll();
textField.requestFocus();
}
});
}
@Override
public void cancelEdit() {
super.cancelEdit();
setText(getString());
setContentDisplay(ContentDisplay.TEXT_ONLY);
//Once the edit has been cancelled we no longer need the text field
//so we mark it for cleanup here. Note though that you have to handle
//this situation in the focus listener which gets fired at the end
//of the editing.
textField = null;
}
@Override
public void updateItem(T item, boolean empty) {
super.updateItem(item, empty);
if (empty) {
setText(null);
setGraphic(null);
} else {
if (isEditing()) {
if (textField != null) {
textField.setText(getString());
}
setGraphic(textField);
setContentDisplay(ContentDisplay.GRAPHIC_ONLY);
} else {
setText(getString());
setContentDisplay(ContentDisplay.TEXT_ONLY);
}
}
}
private void createTextField() {
textField = new TextField(getString());
textField.setMinWidth(this.getWidth() - this.getGraphicTextGap() * 2);
textField.setOnKeyPressed(new EventHandler<KeyEvent>() {
@Override
public void handle(KeyEvent t) {
if (t.getCode() == KeyCode.ENTER) {
commitHelper(false);
} else if (t.getCode() == KeyCode.ESCAPE) {
cancelEdit();
} else if (t.getCode() == KeyCode.TAB) {
commitHelper(false);
TableColumn nextColumn = getNextColumn(!t.isShiftDown());
TablePosition focusedCellPosition = getTableView().getFocusModel().getFocusedCell();
if (nextColumn != null) {
//if( focusedCellPosition.getColumn() ){}focusedCellPosition.getTableColumn()
System.out.println("Column: "+focusedCellPosition.getColumn());
System.out.println("nextColumn.getId();: "+nextColumn.getId());
if( nextColumn.getId().equals("colMNO") ){
collectionList.add(new Collection());
getTableView().edit((getTableRow().getIndex())+1,getTableView().getColumns().get(0) );
getTableView().layout();
} else {
getTableView().edit(getTableRow().getIndex(), nextColumn);
}
}else{
getTableView().edit((getTableRow().getIndex())+1,getTableView().getColumns().get(0) );
}
}
}
});
textField.focusedProperty().addListener(new ChangeListener<Boolean>() {
@Override
public void changed(ObservableValue<? extends Boolean> observable, Boolean oldValue, Boolean newValue) {
//This focus listener fires at the end of cell editing when focus is lost
//and when enter is pressed (because that causes the text field to lose focus).
//The problem is that if enter is pressed then cancelEdit is called before this
//listener runs and therefore the text field has been cleaned up. If the
//text field is null we don't commit the edit. This has the useful side effect
//of stopping the double commit.
if (!newValue && textField != null) {
commitHelper(true);
}
}
});
}
/**
*
* @param forward true gets the column to the right, false the column to the left of the current column
* @return the next editable column, or null if no other editable column exists
*/
private TableColumn<S, ?> getNextColumn(boolean forward) {
List<TableColumn<S, ?>> columns = new ArrayList<>();
for (TableColumn<S, ?> column : getTableView().getColumns()) {
columns.addAll(getLeaves(column));
}
//There is no other column that supports editing.
if (columns.size() < 2) {
return null;
}
int currentIndex = columns.indexOf(getTableColumn());
int nextIndex = currentIndex;
if (forward) {
nextIndex++;
if (nextIndex > columns.size() - 1) {
nextIndex = 0;
}
} else {
nextIndex--;
if (nextIndex < 0) {
nextIndex = columns.size() - 1;
}
}
return columns.get(nextIndex);
}
private List<TableColumn<S, ?>> getLeaves(TableColumn<S, ?> root) {
List<TableColumn<S, ?>> columns = new ArrayList<>();
if (root.getColumns().isEmpty()) {
//We only want the leaves that are editable.
if (root.isEditable()) {
columns.add(root);
}
return columns;
} else {
for (TableColumn<S, ?> column : root.getColumns()) {
columns.addAll(getLeaves(column));
}
return columns;
}
}
}
public class Collection {
private int id;
private String mno;
private String name;
private float quantity;
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getMno() {
return mno;
}
public void setMno(String mno) {
this.mno = mno;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public float getQuantity() {
return quantity;
}
public void setQuantity(float quantity) {
this.quantity = quantity;
}
}
}
The problem is that when I move the same code into a controller and add the table programmatically, it does not work as before: it jumps over the next row and goes to the third.
Before asking the TableView to edit the cell it's important to make sure that it has focus, that the cell in question is in view, and that the view layout is up to date. This is probably because of the way TableView uses virtual cells.
Add these three lines before any call to TableView#edit:
getTableView().requestFocus();
getTableView().scrollTo(rowToEdit);
getTableView().layout();
// getTableView().edit goes here.
This solved the problem for me.
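For illustration, the three calls plus the edit can be bundled into one small helper (a sketch; the method name is made up, not part of the original code):
private <S> void editCell(TableView<S> table, int row, TableColumn<S, ?> column) {
    table.requestFocus();   // the table must have focus
    table.scrollTo(row);    // make sure the (virtual) row is actually rendered
    table.layout();         // flush any pending layout so the cell exists
    table.edit(row, column);
}
The calls to getTableView().edit(...) in the TAB handler above could then be replaced with editCell(getTableView(), rowIndex, column).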

Custom object as value for Mapper output

I have an object constructed as follows:
class ObjExample {
String s;
Object[] objArray; // each element of this array can be a primitive type or an array of primitive types
}
I know that to use it as an output type for a mapper or reducer, we have to implement WritableComparable for it.
But I am really confused about how to write readFields(), write(), and compareTo() for this kind of class.
You can wrap the field s in a Text and objArray in an ArrayWritable. Each element of objArray would itself be an array (also an ArrayWritable) of primitives. Here is a possible implementation:
public static final class ObjExample implements WritableComparable<ObjExample> {
public final Text s = new Text(); // wrapped String
public final ArrayOfArrays objArray = new ArrayOfArrays();
@Override
public int compareTo(ObjExample o) {
// your logic here, example:
return s.compareTo(o.s);
}
@Override
public void write(DataOutput dataOutput) throws IOException {
s.write(dataOutput);
objArray.write(dataOutput);
}
@Override
public void readFields(DataInput dataInput) throws IOException {
s.readFields(dataInput);
objArray.readFields(dataInput);
}
// set size of the objArray
public void setSize(int n) {
objArray.set(new IntArray[n]);
}
// set i-th element of the objArray to an array of elements
public void setElement(int i, IntWritable... elements) {
IntArray subArr = new IntArray();
subArr.set(elements);
objArray.get()[i] = subArr;
}
}
You will need two more classes to make it work:
// array of primitives
public static final class IntArray extends ArrayWritable {
public IntArray() {
// you can specify any other primitive wrapper (DoubleWritable, Text, ...)
super(IntWritable.class);
}
}
// array of arrays
public static final class ArrayOfArrays extends ArrayWritable {
public ArrayOfArrays() {
super(IntArray.class);
}
}
Example of construction of the object:
ObjExample o = new ObjExample();
o.s.set("hello");
o.setSize(2);
o.setElement(0, new IntWritable(0)); // single primitive
o.setElement(1, new IntWritable(1), new IntWritable(2)); // array of primitives
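To show how the class plugs into a job, here is a minimal mapper sketch that emits ObjExample as the map output value. The input types and the field values are assumptions for the example, and the driver would also need job.setMapOutputValueClass(ObjExample.class):
// assumes the usual imports: java.io.IOException, org.apache.hadoop.io.*,
// org.apache.hadoop.mapreduce.Mapper
public static class ExampleMapper extends Mapper<LongWritable, Text, Text, ObjExample> {
    private final Text outKey = new Text();
    private final ObjExample outValue = new ObjExample();
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        outKey.set(value);                                           // reuse the input line as the key
        outValue.s.set(value.toString());                            // the String field
        outValue.setSize(1);                                         // one element in objArray
        outValue.setElement(0, new IntWritable(value.getLength()));  // a single "primitive"
        context.write(outKey, outValue);
    }
}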

Part of key changes when iterating through values when using composite key - Hadoop

I have implemented Secondary sort on Hadoop and I don't really understand the behavior of the framework.
I have created a composite key which contains the original key and the part of the value that is used for sorting.
To achieve this I have implemented my own partitioner:
public class CustomPartitioner extends Partitioner<CoupleAsKey, LongWritable> {
@Override
public int getPartition(CoupleAsKey couple, LongWritable value, int numPartitions) {
return Long.hashCode(couple.getKey1()) % numPartitions;
}
}
My own group comparator
public class GroupComparator extends WritableComparator {
protected GroupComparator()
{
super(CoupleAsKey.class, true);
}
@Override
public int compare(WritableComparable w1, WritableComparable w2) {
CoupleAsKey c1 = (CoupleAsKey)w1;
CoupleAsKey c2 = (CoupleAsKey)w2;
return Long.compare(c1.getKey1(), c2.getKey1());
}
}
And defined the couple in the following way
public class CoupleAsKey implements WritableComparable<CoupleAsKey>{
private long key1;
private long key2;
public CoupleAsKey() {
}
public CoupleAsKey(long key1, long key2) {
this.key1 = key1;
this.key2 = key2;
}
public long getKey1() {
return key1;
}
public void setKey1(long key1) {
this.key1 = key1;
}
public long getKey2() {
return key2;
}
public void setKey2(long key2) {
this.key2 = key2;
}
@Override
public void write(DataOutput output) throws IOException {
output.writeLong(key1);
output.writeLong(key2);
}
@Override
public void readFields(DataInput input) throws IOException {
key1 = input.readLong();
key2 = input.readLong();
}
@Override
public int compareTo(CoupleAsKey o2) {
int cmp = Long.compare(key1, o2.getKey1());
if(cmp != 0)
return cmp;
return Long.compare(key2, o2.getKey2());
}
@Override
public String toString() {
return key1 + "," + key2 + ",";
}
}
And here is the driver
Configuration conf = new Configuration();
Job job = new Job(conf);
job.setJarByClass(SSDriver.class);
job.setMapperClass(SSMapper.class);
job.setReducerClass(SSReducer.class);
job.setMapOutputKeyClass(CoupleAsKey.class);
job.setMapOutputValueClass(LongWritable.class);
job.setPartitionerClass(CustomPartitioner.class);
job.setGroupingComparatorClass(GroupComparator.class);
FileInputFormat.addInputPath(job, new Path("/home/marko/WORK/Whirlpool/input.csv"));
FileOutputFormat.setOutputPath(job, new Path("/home/marko/WORK/Whirlpool/output"));
job.waitForCompletion(true);
Now, this works, but what is really strange is that while iterating over the values for a key in the reducer, the second part of the key (the value part) changes on each iteration. Why, and how?
@Override
protected void reduce(CoupleAsKey key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
for (LongWritable value : values) {
//key.key2 changes during iterations, why?
context.write(key, value);
}
}
The definition says that "if you want all your relevant rows within a partition of data sent to a single reducer you must implement a grouping comparator". This only ensures that the whole set of such keys is sent to a single reduce call, not that the key will change from the composite key (or whatever it is) to something that contains only the part of the key on which grouping was done.
However, as you iterate over the values, the corresponding key also changes. We normally do not observe this happening because, by default, values are grouped on the same (non-composite) key, so even when the value changes, the contents of the key stay the same.
You can try printing the object reference of the key, and you will notice that with every iteration the object reference of the key also changes, like this:
IntWritable#1235ft
IntWritable#6635gh
IntWritable#9804as
Alternatively, you can try applying a group comparator to an IntWritable in the following way (you will have to write your own logic to do so):
Group1:
1 a
1 b
2 c
Group2:
3 c
3 d
4 a
and you will see that with every iteration over the values, the key also changes.
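If you need to hold on to the composite keys across iterations (for example, to collect them in a list), a common workaround is to copy the key's fields for each value instead of keeping a reference to the key object. A sketch, not from the original answer (assumes java.util.List/ArrayList and the usual reducer imports):
@Override
protected void reduce(CoupleAsKey key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
    List<CoupleAsKey> seenKeys = new ArrayList<>();
    for (LongWritable value : values) {
        // The key's contents change as the iterator advances, so store a copy,
        // not the key instance itself.
        seenKeys.add(new CoupleAsKey(key.getKey1(), key.getKey2()));
        context.write(key, value);
    }
    // seenKeys now holds each (key1, key2) pair in sort order.
}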

Serializing a long string in Hadoop

I have a class which implements the WritableComparable interface in Hadoop. This class has two String fields, one short and one very long. I use writeChars to write these fields and readLine to read them, but I seem to get some sort of error. What is the best way to serialize such a long String in Hadoop?
I think you can use BytesWritable to make it more efficient. Check the custom key below, which has a BytesWritable field callId.
public class CustomMRKey implements WritableComparable<CustomMRKey> {
private BytesWritable callId;
private IntWritable mapperType;
/**
* Default constructor
*/
public CustomMRKey() {
set(new BytesWritable(), new IntWritable());
}
/**
* Constructor
*
* @param callId
* @param mapperType
*/
public CustomMRKey(BytesWritable callId, IntWritable mapperType) {
set(callId, mapperType);
}
/**
* sets the call id and mapper type
*
* @param callId
* @param mapperType
*/
public void set(BytesWritable callId, IntWritable mapperType) {
this.callId = callId;
this.mapperType = mapperType;
}
/**
* This method returns the callId
*
* @return callId
*/
public BytesWritable getCallId() {
return callId;
}
/**
* This method sets the callId given a callId
*
* @param callId
*/
public void setCallId(BytesWritable callId) {
this.callId = callId;
}
/**
* This method returns the mapper type
*
*
* @return
*/
public IntWritable getMapperType() {
return mapperType;
}
/**
* This method is set to store the mapper type
*
* @param mapperType
*/
public void setMapperType(IntWritable mapperType) {
this.mapperType = mapperType;
}
@Override
public void readFields(DataInput in) throws IOException {
callId.readFields(in);
mapperType.readFields(in);
}
@Override
public void write(DataOutput out) throws IOException {
callId.write(out);
mapperType.write(out);
}
@Override
public boolean equals(Object obj) {
if (obj instanceof CustomMRKey) {
CustomMRKey key = (CustomMRKey) obj;
return callId.equals(key.callId)
&& mapperType.equals(key.mapperType);
}
return false;
}
@Override
public int compareTo(CustomMRKey key) {
int cmp = callId.compareTo(key.getCallId());
if (cmp != 0) {
return cmp;
}
return mapperType.compareTo(key.getMapperType());
}
}
To use it in, say, mapper code, you can build the key in BytesWritable form with something like the following:
CustomMRKey customKey=new CustomMRKey(new BytesWritable(),new IntWritable());
customKey.setCallId(makeKey(value, this.resultKey));
customKey.setMapperType(this.mapTypeIndicator);
The makeKey method is then something like this:
public BytesWritable makeKey(Text value, BytesWritable key) throws IOException {
try {
ByteArrayOutputStream byteKey = new ByteArrayOutputStream(Constants.MR_DEFAULT_KEY_SIZE);
for (String field : keyFields) {
byte[] bytes = value.getString(field).getBytes();
byteKey.write(bytes,0,bytes.length);
}
if(key==null){
return new BytesWritable(byteKey.toByteArray());
}else{
key.set(byteKey.toByteArray(), 0, byteKey.size());
return key;
}
} catch (Exception ex) {
throw new IOException("Could not generate key", ex);
}
}
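For the original question's long String specifically, the idea is simply to carry its UTF-8 bytes in a BytesWritable instead of using writeChars/readLine. A minimal sketch, separate from the key class above (longString is a placeholder variable; uses java.nio.charset.StandardCharsets):
// wrap a (possibly very long) String
BytesWritable wrapped = new BytesWritable(longString.getBytes(StandardCharsets.UTF_8));
// turn it back into a String; use getLength(), because the backing array
// returned by getBytes() may be padded beyond the valid bytes
String restored = new String(wrapped.getBytes(), 0, wrapped.getLength(), StandardCharsets.UTF_8);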
Hope this may help.

Custom WritableComparable displays object reference as output

I am new to Hadoop and Java, and I feel there is something obvious I am just missing. I am using Hadoop 1.0.3 if that means anything.
My goal in using Hadoop is to take a bunch of files and parse them one file at a time (as opposed to line by line). Each file will produce multiple key-value pairs, but the context of the other lines is important. The key and value are multi-value/composite, so I have implemented WritableComparable for the key and Writable for the value. Because the processing of each file takes a bit of CPU, I want to save the output of the mapper, then run multiple reducers later on.
For the composite keys, I followed http://stackoverflow.com/questions/12427090/hadoop-composite-key
The problem is, the output is just Java object references as opposed to the composite key and value. Example:
LinkKeyWritable#bd2f9730 LinkValueWritable#8752408c
I am not sure whether the problem is related to not reducing the data at all, or something else.
Here is my main class:
public static void main(String[] args) throws Exception {
JobConf conf = new JobConf(Parser.class);
conf.setJobName("raw_parser");
conf.setOutputKeyClass(LinkKeyWritable.class);
conf.setOutputValueClass(LinkValueWritable.class);
conf.setMapperClass(RawMap.class);
conf.setNumMapTasks(0);
conf.setInputFormat(PerFileInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
PerFileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
JobClient.runJob(conf);
}
And my Mapper class:
public class RawMap extends MapReduceBase implements
Mapper {
public void map(NullWritable key, Text value,
OutputCollector<LinkKeyWritable, LinkValueWritable> output,
Reporter reporter) throws IOException {
String json = value.toString();
SerpyReader reader = new SerpyReader(json);
GoogleParser parser = new GoogleParser(reader);
for (String page : reader.getPages()) {
String content = reader.readPageContent(page);
parser.addPage(content);
}
for (Link link : parser.getLinks()) {
LinkKeyWritable linkKey = new LinkKeyWritable(link);
LinkValueWritable linkValue = new LinkValueWritable(link);
output.collect(linkKey, linkValue);
}
}
}
Link is basically a struct of various information that gets split between LinkKeyWritable and LinkValueWritable.
LinkKeyWritable:
public class LinkKeyWritable implements WritableComparable<LinkKeyWritable>{
protected Link link;
public LinkKeyWritable() {
super();
link = new Link();
}
public LinkKeyWritable(Link link) {
super();
this.link = link;
}
@Override
public void readFields(DataInput in) throws IOException {
link.batchDay = in.readLong();
link.source = in.readUTF();
link.domain = in.readUTF();
link.path = in.readUTF();
}
@Override
public void write(DataOutput out) throws IOException {
out.writeLong(link.batchDay);
out.writeUTF(link.source);
out.writeUTF(link.domain);
out.writeUTF(link.path);
}
@Override
public int compareTo(LinkKeyWritable o) {
return ComparisonChain.start().
compare(link.batchDay, o.link.batchDay).
compare(link.domain, o.link.domain).
compare(link.path, o.link.path).
result();
}
@Override
public int hashCode() {
return Objects.hashCode(link.batchDay, link.source, link.domain, link.path);
}
@Override
public boolean equals(final Object obj){
if(obj instanceof LinkKeyWritable) {
final LinkKeyWritable o = (LinkKeyWritable)obj;
return Objects.equal(link.batchDay, o.link.batchDay)
&& Objects.equal(link.source, o.link.source)
&& Objects.equal(link.domain, o.link.domain)
&& Objects.equal(link.path, o.link.path);
}
return false;
}
}
LinkValueWritable:
public class LinkValueWritable implements Writable{
protected Link link;
public LinkValueWritable() {
link = new Link();
}
public LinkValueWritable(Link link) {
this.link = new Link();
this.link.type = link.type;
this.link.description = link.description;
}
@Override
public void readFields(DataInput in) throws IOException {
link.type = in.readUTF();
link.description = in.readUTF();
}
@Override
public void write(DataOutput out) throws IOException {
out.writeUTF(link.type);
out.writeUTF(link.description);
}
@Override
public int hashCode() {
return Objects.hashCode(link.type, link.description);
}
@Override
public boolean equals(final Object obj){
if(obj instanceof LinkKeyWritable) {
final LinkKeyWritable o = (LinkKeyWritable)obj;
return Objects.equal(link.type, o.link.type)
&& Objects.equal(link.description, o.link.description);
}
return false;
}
}
I think the answer is in the implementation of the TextOutputFormat. Specifically, the LineRecordWriter's writeObject method:
/**
* Write the object to the byte stream, handling Text as a special
* case.
* @param o the object to print
* @throws IOException if the write throws, we pass it on
*/
private void writeObject(Object o) throws IOException {
if (o instanceof Text) {
Text to = (Text) o;
out.write(to.getBytes(), 0, to.getLength());
} else {
out.write(o.toString().getBytes(utf8));
}
}
As you can see, if your key or value is not a Text object, it calls the toString method on it and writes that out. Since you've left toString unimplemented in your key and value, it's using the Object class's implementation, which is writing out the reference.
I'd say that you should try writing an appropriate toString function or using a different OutputFormat.
It looks like you have a list of objects just like you wanted. You need to implement toString() on your Writable if you want a human-readable version printed out instead of an ugly Java reference.
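For example, toString() implementations along these lines (the separator and field order are just an illustration) would make TextOutputFormat print the actual data instead of object references:
// in LinkKeyWritable
@Override
public String toString() {
    return link.batchDay + "," + link.source + "," + link.domain + "," + link.path;
}
// in LinkValueWritable
@Override
public String toString() {
    return link.type + "," + link.description;
}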
