public DoublyLinkedList() {
header = new Node<>(null,null,trailer);
trailer = new Node<>(null,header,null);
}
public DoublyLinkedList() {
header = new Node<>(null, null, null); // create header
trailer = new Node<>(null, header, null); // trailer is preceded by header
header.setNext(trailer); // header is followed by trailer
}
Above are the two different constructors of DoublyLinkedList. The first one is my idea and the second is what is written in the textbook. I wonder if my code works out.
Unfortunately, your code won't work: at the time header is created, trailer has not been assigned yet, so it is still null. The header's next reference therefore ends up null instead of pointing to the trailer.
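A minimal sketch of the difference, assuming a simplified Node class with public fields (the textbook's Node uses setters such as setNext; the names SentinelDemo, broken and fixed are made up for illustration):

```java
class SentinelDemo {
    // Simplified, hypothetical stand-in for the textbook's Node class.
    static class Node<E> {
        E element;
        Node<E> prev;
        Node<E> next;

        Node(E element, Node<E> prev, Node<E> next) {
            this.element = element;
            this.prev = prev;
            this.next = next;
        }
    }

    Node<String> header;
    Node<String> trailer;

    // Same order as the first constructor in the question:
    // trailer is still null when it is passed to the header node.
    static SentinelDemo broken() {
        SentinelDemo list = new SentinelDemo();
        list.header = new Node<>(null, null, list.trailer); // list.trailer == null here
        list.trailer = new Node<>(null, list.header, null);
        return list;
    }

    // Same order as the textbook: link header to trailer once both exist.
    static SentinelDemo fixed() {
        SentinelDemo list = new SentinelDemo();
        list.header = new Node<>(null, null, null);
        list.trailer = new Node<>(null, list.header, null);
        list.header.next = list.trailer;
        return list;
    }

    public static void main(String[] args) {
        System.out.println(broken().header.next == null); // true
        SentinelDemo ok = fixed();
        System.out.println(ok.header.next == ok.trailer); // true
    }
}
```

Java passes the current value of the reference, so assigning trailer afterwards does not update the null that was already stored in the header node.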
My restful service accepts a file argument, the contents of which is an Apache SS (XSSF) Workbook. The method iterates through the cells of the Workbook and takes various actions, based on the values of the cells.
I've conceived a JUnit test where I instantiate and populate an Apache SS Workbook, add it to a 'file' object and then pass the 'file' object to the method under test. The primary problem that I'm having is how to instantiate the 'file' object using the Workbook object.
Following is a sample of the method under test (omitting details unrelated to the Workbook), followed by a pseudo-code example of what I'm trying to do:
/* Method under test */
public Object workbookProcessing(HttpServletRequest request) throws IOException, ServletException {
// getParts() returns a Collection<Part>, so copy it into a List
List<Part> workbookParts = new ArrayList<>(request.getParts());
for (Part workbookPart : workbookParts) {
InputStream workbookContent = workbookPart.getInputStream();
Workbook workbook = WorkbookFactory.create(workbookContent);
// ...
}
// ...
}
/* Junit test pseudo-code */
public void testWorkbookProcessing() {
RestfulService rs = new RestfulService();
Workbook wb = WorkbookFactory.create(true); // Create XSSF workbook
CreationHelper createHelper = wb.getCreationHelper();
Sheet sheet = wb.createSheet("Sheet 1");
Row row = sheet.createRow(0);
row.createCell(0).setCellValue(createHelper.createRichTextString("Row 1 Cell 1"));
row.createCell(1).setCellValue(12345678);
// ...
HttpServletRequest request = new HttpServletRequest(); // Create HttpServletRequest
// Create InputStream, using above Workbook <- Help!
// Create Part object from InputStream <- Help!
request.addPart(inputStream); // Add Part object to request
ResponseEntity re = rs.workbookProcessing(request);
assertEquals(200, re.getStatusCodeValue());
}
Alternatively, if there is a way to mock the workbook and its cell values, I'm happy to do that as well.
After quite a bit of research, trial and error and combining techniques, here is how I was able to write a working Junit test to submit a multi-part HTTP request which contained an Apache SS workbook object.
// Working Junit test code:
@Test
public void testWorkbookProcessing() throws IOException { // wb.write(...) can throw IOException
RestfulService rs = new RestfulService();
Workbook wb = WorkbookFactory.create(true); // Create XSSF workbook
CreationHelper createHelper = wb.getCreationHelper();
Sheet sheet = wb.createSheet("Sheet 1");
Row row = sheet.createRow(0); // First row of sheet
row.createCell(0).setCellValue(createHelper.createRichTextString("Row 1 Cell 1"));
row.createCell(1).setCellValue(12345678);
// ...
// This is the code to create the multi-part content
ByteArrayOutputStream baos = new ByteArrayOutputStream();
wb.write(baos); // Write workbook to output stream
byte[] bytes = baos.toByteArray(); // Convert output stream to byte array
MockPart part = new MockPart("file", "test_filename.xlsx", bytes); // Add byte array to mocked Part object
MockMultipartHttpServletRequest request = new MockMultipartHttpServletRequest();
request.setMethod(RequestMethod.POST.name());
request.addHeader("Content-Type", "multipart/form-data");
request.addParameter("id", "12345678"); // optional
request.addPart(part); // Add Part to request
ResponseEntity re = rs.workbookProcessing(request);
assertEquals(202, re.getStatusCodeValue());
}
Refinements/comments/suggestions appreciated.
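The core of the approach above is independent of Spring and POI: write the object into a ByteArrayOutputStream, then wrap the resulting bytes so they can be read back as an InputStream. A minimal standard-library sketch of that round trip follows; StreamWritable and toInputStream are hypothetical names standing in for anything with a write(OutputStream) method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class InMemoryStreamDemo {
    // Hypothetical stand-in for any object that can serialize itself to a stream.
    interface StreamWritable {
        void write(OutputStream out) throws IOException;
    }

    // Serialize into memory, then expose the same bytes as an InputStream.
    static InputStream toInputStream(StreamWritable source) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        source.write(baos);
        return new ByteArrayInputStream(baos.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        // A fake "workbook" that just writes some text.
        StreamWritable fake = out -> out.write("Row 1 Cell 1".getBytes(StandardCharsets.UTF_8));
        try (InputStream in = toInputStream(fake)) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

Since Workbook.write(OutputStream) has the same shape, something like toInputStream(wb::write) should work for the workbook in the test, though that is untested here.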
I need to use PushStreamContent because of the source of my data (effectively have to concatenate blobs), but I also have to support requests for multiple byte ranges (arbitrary ranges not aligned to the stored blobs). What is not clear to me is if I can use PushStreamContent to generate a multipart/byteranges response, if each range needs to be separated in the response, and if so, how to do it, and how it relates to the chunked transfer encoding which PushStreamContent invokes.
You can do it using MultipartContent like this:
public class MyRangeController : ApiController
{
[HttpGet]
public HttpResponseMessage Get()
{
// Create a multi-part content object for the response; note that per RFC spec, subtype must be "byteranges"
// Note that the content type of the over-all response will be "multipart/byteranges"
// We choose to use a GUID string for the separator; it could be anything suitable.
var multipartContent = new MultipartContent("byteranges", Guid.NewGuid().ToString("D"));
// Create the response object and set its content
var response = new HttpResponseMessage(HttpStatusCode.PartialContent) { Content = multipartContent };
foreach (var rangeItemHeaderValue in Request.Headers.Range.Ranges)
{
// Create PushStreamContent object for our current byte range...
var pushStreamContent = new PushStreamContent((stream1, content, arg3) =>
{
// Write to stream1
stream1.Close();
});
// We need to add certain headers to each part of the response
pushStreamContent.Headers.ContentRange = new ContentRangeHeaderValue(rangeItemHeaderValue.From.Value, rangeItemHeaderValue.To.Value, /* total size of the resource */);
pushStreamContent.Headers.ContentType = new MediaTypeHeaderValue(/* Set a content type for each part of the response */);
// Add the part to the multi-part content response
multipartContent.Add(pushStreamContent);
}
return response;
}
}
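On the question of whether each range needs to be separated: in a multipart/byteranges body every range is its own part, introduced by the boundary line and carrying its own Content-Type and Content-Range headers, with a closing boundary at the end. Chunked transfer encoding is a separate framing layer applied to the whole body, so it does not change this layout. The following plain-Java sketch (not Web API code; ByteRangesDemo and buildBody are made-up names) builds such a body by hand to show the wire format:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ByteRangesDemo {
    // Builds a multipart/byteranges body for inclusive {from, to} ranges of data.
    static byte[] buildBody(byte[] data, int[][] ranges, String boundary, String partType) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int[] range : ranges) {
            int from = range[0];
            int to = range[1];
            // Each part starts with the boundary line and its own headers.
            String head = "--" + boundary + "\r\n"
                    + "Content-Type: " + partType + "\r\n"
                    + "Content-Range: bytes " + from + "-" + to + "/" + data.length + "\r\n"
                    + "\r\n";
            byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
            out.write(headBytes, 0, headBytes.length);
            out.write(data, from, to - from + 1); // the requested bytes only
            out.write('\r');
            out.write('\n');
        }
        // The closing delimiter carries two extra dashes.
        byte[] tail = ("--" + boundary + "--\r\n").getBytes(StandardCharsets.US_ASCII);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "abcdefghijklmnopqrstuvwxyz".getBytes(StandardCharsets.US_ASCII);
        byte[] body = buildBody(data, new int[][] {{0, 4}, {10, 12}}, "SEP", "text/plain");
        System.out.print(new String(body, StandardCharsets.US_ASCII));
    }
}
```

The overall response would carry Content-Type: multipart/byteranges; boundary=SEP, which is the part MultipartContent takes care of in the C# answer.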
I'm experimenting with Stanford NLP's TokensRegex and trying to find dimensions (e.g. 100x120) in a text. My plan is to first retokenize the input to further split these tokens (using the example provided in retokenize.rules.txt) and then to search for the new pattern.
After doing the retokenization, however, only null-values are left that replace the original string:
The top level annotation
[Text=100x120 Tokens=[null-1, null-2, null-3] Sentences=[100x120]]
The retokenization seems to work fine (3 tokens in result), but the values are lost. What can I do to maintain the original values in the tokens list?
My retokenize.rules.txt file is (as in the demo):
tokens = { type: "CLASS", value:"edu.stanford.nlp.ling.CoreAnnotations$TokensAnnotation" }
options.matchedExpressionsAnnotationKey = tokens;
options.extractWithTokens = TRUE;
options.flatten = TRUE;
ENV.defaults["ruleType"] = "tokens"
ENV.defaultStringPatternFlags = 2
ENV.defaultResultAnnotationKey = tokens
{ pattern: ( /\d+(x|X)\d+/ ), result: Split($0[0], /x|X/, TRUE) }
The main method:
public static void main(String[] args) throws IOException {
//...
text = "100x120";
Properties properties = new Properties();
properties.setProperty("tokenize.language", "de");
properties.setProperty("annotators", "tokenize,retokenize,ssplit,pos,lemma,ner");
properties.setProperty("customAnnotatorClass.retokenize", "edu.stanford.nlp.pipeline.TokensRegexAnnotator");
properties.setProperty("retokenize.rules", "retokenize.rules.txt");
StanfordCoreNLP stanfordPipeline = new StanfordCoreNLP(properties);
runPipeline(stanfordPipeline, text);
}
And the pipeline:
public static void runPipeline(StanfordCoreNLP pipeline, String text) {
Annotation annotation = new Annotation(text);
pipeline.annotate(annotation);
out.println();
out.println("The top level annotation");
out.println(annotation.toShorterString());
//...
}
Thanks for letting us know. The CoreAnnotations.ValueAnnotation is not being populated and we'll update TokensRegex to populate the field.
Regardless, you should be able to use TokensRegex to retokenize as you have planned. Most of the pipeline does not depend on the ValueAnnotation and uses the CoreAnnotations.TextAnnotation instead. You can use the CoreAnnotations.TextAnnotation to get the text for the new tokens (each token is a CoreLabel, so you can access it using token.word() as well).
See TokensRegexRetokenizeDemo for example code on how to get the different annotations out.
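For reference, the token texts you should expect to read back for "100x120" are 100, x and 120. The following plain-Java sketch only illustrates that expected split with java.util.regex; it does not use TokensRegex or CoreNLP, and DimensionSplitDemo is a made-up name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DimensionSplitDemo {
    // Zero-width split points directly before and after an x/X, so the delimiter is kept.
    private static final Pattern AROUND_X = Pattern.compile("(?<=[xX])|(?=[xX])");

    static List<String> split(String token) {
        return Arrays.asList(AROUND_X.split(token));
    }

    public static void main(String[] args) {
        System.out.println(split("100x120")); // [100, x, 120]
    }
}
```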
I need to parse a big csv file (2gb). The values have to be validated, the rows containing "bad" fields must be dropped and a new file containing only valid rows ought to be output.
I've selected uniVocity parser library to do that. Please help me to understand whether this library is well-suited for the task and what approach should be used.
Given the file size, what is the best way to organize read->validate->write in uniVocity? Should I read in all rows at once or use an iterator style? Where should parsed and validated rows be stored before they are written to the file?
Is there a way in uniVocity to access a row's values by index? Something like row.getValue(3)?
I'm the author of this library, let me try to help you out:
First, do not try to read all rows at once as you will fill your memory with LOTS of data.
You can get the row values by index.
The fastest approach to read/validate/write is to use a RowProcessor that holds a CsvWriter and decides whether to write or skip each row. I think the following code will help you a bit:
Define the output:
private CsvWriter createCsvWriter(File output, String encoding){
CsvWriterSettings settings = new CsvWriterSettings();
//configure the writer ...
try {
return new CsvWriter(new OutputStreamWriter(new FileOutputStream(output), encoding), settings);
} catch (IOException e) {
throw new IllegalArgumentException("Error writing to " + output.getAbsolutePath(), e);
}
}
Redirect the input:
//this creates a row processor for our parser. It validates each row and sends them to the csv writer.
private RowProcessor createRowProcessor(File output, String encoding){
final CsvWriter writer = createCsvWriter(output, encoding);
return new AbstractRowProcessor() {
@Override
public void rowProcessed(String[] row, ParsingContext context) {
if (shouldWriteRow(row)) {
writer.writeRow(row);
} else {
//skip row
}
}
private boolean shouldWriteRow(String[] row) {
//your validation here
return true;
}
@Override
public void processEnded(ParsingContext context) {
writer.close();
}
};
}
Configure the parser:
public void readAndWrite(File input, File output, String encoding) {
CsvParserSettings settings = new CsvParserSettings();
//configure the parser here
//tells the parser to send each row to the custom processor, which will validate and redirect all rows to the CsvWriter
settings.setRowProcessor(createRowProcessor(output, encoding));
CsvParser parser = new CsvParser(settings);
try {
parser.parse(new InputStreamReader(new FileInputStream(input), encoding));
} catch (IOException e) {
throw new IllegalStateException("Unable to open input file " + input.getAbsolutePath(), e);
}
}
For better performance you can also wrap the row processor in a ConcurrentRowProcessor.
settings.setRowProcessor(new ConcurrentRowProcessor(createRowProcessor(output, encoding)));
With this, the writing of rows will be performed in a separate thread.
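Because rowProcessed receives each row as a String[], values are read by index with ordinary array access. As a sketch of what the "your validation here" placeholder could contain, here is a standalone version of shouldWriteRow with a purely hypothetical rule: the row needs at least four columns and the value at index 3 must be a non-negative integer. Replace the rule with your own checks.

```java
class RowValidationDemo {
    // Hypothetical rule for illustration only:
    // keep a row when it has at least 4 columns and column 3 is a non-negative integer.
    static boolean shouldWriteRow(String[] row) {
        if (row == null || row.length < 4) {
            return false;
        }
        String value = row[3]; // access by index
        if (value == null || value.trim().isEmpty()) {
            return false;
        }
        try {
            return Integer.parseInt(value.trim()) >= 0;
        } catch (NumberFormatException e) {
            return false; // not a number: drop the row
        }
    }

    public static void main(String[] args) {
        System.out.println(shouldWriteRow(new String[] {"a", "b", "c", "42"}));  // true
        System.out.println(shouldWriteRow(new String[] {"a", "b", "c", "bad"})); // false
    }
}
```

The null check on the value is there because a parser may hand back null for an empty field, depending on its settings.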
Can someone help me convert the following from Java to C# (Xamarin)?
I tried a couple of different ways, but I cannot get it to work.
The code is:
HttpPost post = new HttpPost(url);
// Break out all extra HTTP header lines and add it to the HttpPost object
for (String line : contentType.replace("\r", "\n").split("\n")) {
if (line.length() > 0 && line.contains(":")) {
String[] parts = line.split(":", 2);
if (parts.length == 2) {
post.addHeader(parts[0].trim(), parts[1].trim());
}
}
}
// Create a byte array entity for the POST data, the content
// type here is only used for the postEntity object
ByteArrayEntity postEntity = new ByteArrayEntity(challenge);
postEntity.setContentType("application/octet-stream");
post.setEntity(postEntity);
// Create a HttpClient and execute the HttpPost filled out above
HttpClient client = new DefaultHttpClient();
HttpResponse httpResponse = client.execute(post);
// Get the response entity out of the response
HttpEntity entity = httpResponse.getEntity();
If you are stuck with
post.SetEntity(postEntity);
then it converts to:
ByteArrayEntity postEntity = new ByteArrayEntity(challenge);
postEntity.SetContentType("application/octet-stream");
post.Entity = postEntity;
When converting from Java to C#, you mostly have to change the member names to start with an uppercase letter, and some Java setters become properties (as with post.Entity above). If you get stuck on a particular object, check the Xamarin API docs for that class, for example HttpPost.