How to add metadata to a document using the MarkLogic mapreduce connector API - hadoop

I want to write a document to the MarkLogic database using the MarkLogic mapreduce API; let's say this is the example. I want to add metadata to the document that I am writing back to the MarkLogic database in the reducer -
context.write(outputURI, result);
Please let me know if adding metadata to the document with the MarkLogic mapreduce API is possible.

By metadata, I am assuming you mean the document properties fragment. For background on document properties, please see here: https://docs.marklogic.com/guide/app-dev/properties#id_19516
For use in MarkLogic mapreduce, please see here (the output classes):
https://docs.marklogic.com/guide/mapreduce/output#id_76625
I believe you need to extend/modify your example to also write content to the properties fragment using the PropertyOutputFormat class.
One of the sample applications in the same documentation saves content in the properties fragment. If, however, you would like to fast-track yourself by looking at source code, see the examples at https://gist.github.com/evanlenz/2484318 - specifically LinkCountInProperty.java, which writes to a document property fragment.
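To make that concrete, here is a minimal, untested sketch in the spirit of the LinkCountInProperty sample. It assumes the connector's DocumentURI, MarkLogicNode, and PropertyOutputFormat classes; the ref-count element name and the job wiring are illustrative, so adapt them to your own metadata and existing job setup:

    import java.io.IOException;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.w3c.dom.Element;
    import com.marklogic.mapreduce.DocumentURI;
    import com.marklogic.mapreduce.MarkLogicNode;
    import com.marklogic.mapreduce.PropertyOutputFormat;

    public class PropertyMetadataSketch {

        // Reducer that emits (document URI, XML fragment); with
        // PropertyOutputFormat the fragment lands in the document's
        // properties fragment rather than in the document content.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, DocumentURI, MarkLogicNode> {
            private final DocumentURI uri = new DocumentURI();

            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                try {
                    // Build the property element, e.g. <ref-count>42</ref-count>.
                    org.w3c.dom.Document d = DocumentBuilderFactory.newInstance()
                            .newDocumentBuilder().newDocument();
                    Element elem = d.createElement("ref-count"); // illustrative name
                    elem.setTextContent(Integer.toString(sum));
                    uri.setUri(key.toString()); // key assumed to be the document URI
                    context.write(uri, new MarkLogicNode(elem));
                } catch (Exception e) {
                    throw new IOException(e);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Connector connection settings (mapreduce.marklogic.output.host,
            // .user, .password, etc.) would be loaded here, as in the samples.
            Job job = Job.getInstance(conf, "write metadata to properties");
            job.setJarByClass(PropertyMetadataSketch.class);
            // The key line: route reducer output to the properties fragment.
            job.setOutputFormatClass(PropertyOutputFormat.class);
            job.setOutputKeyClass(DocumentURI.class);
            job.setOutputValueClass(MarkLogicNode.class);
            job.setReducerClass(IntSumReducer.class);
            // ... mapper and input format omitted for brevity.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }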

Used the property mapreduce.marklogic.output.content.collection in the configuration XML. Adding this property put the inserted documents into that collection.
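For anyone landing here later, a sketch of the relevant stanza in the job configuration XML (the collection name here is made up):

    <property>
        <name>mapreduce.marklogic.output.content.collection</name>
        <value>my-collection</value>
    </property>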

Related

Python Sphinx Validate and update schema

I'm using Sphinx to maintain docs on a project, and I am generating a jsonschema document from a tool where all the properties of the objects are listed.
Those object properties are documented in rst files, and I need to:
a) Check that all properties are documented. This is almost done; I can mark the json properties as found and then check the json at the end.
b) Copy the description retrieved from the doctree object into the json (adding a property to the json). The format I need is markup, so I have to figure out how to convert a doctree node(set) to markup. The url links should also be working at this stage. If markdown is not possible, converting the fragment to html and then to markdown might be easier.
I've managed to read the rst files in the doctree-resolved event and match them with the json properties, but I'm not sure this is the best approach.
I don't know if I'm on the right path, or should I write a builder instead?
Thanks

Is there any way to do mathematical operations on some values in files with Apache NiFi?

I am getting some numerical data from an API at a URL, and I am looking for a way to do some mathematical operations on it in Apache NiFi before putting the data into a file directory. Thanks in advance.
By the way, I am using the InvokeHTTP processor to get the data and the PutFile processor to write the file somewhere. I searched some related websites but could not find a working approach.
Try using the QueryRecord processor and define Record Reader/Writer controller services to read/write the flowfile.
Add a new property to the QueryRecord processor containing an Apache Calcite SQL query with your mathematical operations on the flowfile, as sketched below.
The results of the SQL query will be added to the outgoing flowfile in your desired format.
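For instance, a dynamic property on QueryRecord (named, say, converted) could hold a Calcite query like the one below; FLOWFILE is the table name QueryRecord exposes, and the field names here are hypothetical:

    -- Derive a Celsius column from a hypothetical temperature_f field.
    SELECT id,
           (temperature_f - 32) * 5 / 9 AS temperature_c
    FROM FLOWFILE

Route the property's relationship onward to PutFile; the writer controller service decides whether the result comes out as JSON, CSV, Avro, and so on.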
Ultimately the answer depends on whether the data you're working with is in the content of the FlowFile or in the attributes. If the data is small enough and it's only a couple of operations, the suggested approach would be to work with the data as attributes and use NiFi's expression language to do the transformations; a small example follows the references below.
There is a section on mathematical operations[1] in the Apache NiFi documentation[2]. The operations range from simple operators like plus/minus to exposing the java.lang.Math static methods.
[1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#numbers
[2] https://nifi.apache.org/docs.html
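As a small illustration (assuming the number has already been extracted into a hypothetical attribute named price), an UpdateAttribute property could compute a new attribute with:

    ${price:toNumber():multiply(100):plus(5)}

and the guide's math() function exposes the java.lang.Math static methods, along the lines of ${price:toDecimal():math("sqrt")}.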
You can try ExecuteStreamCommand if you want to take in the whole file and then run operations on it. Alternatively, you can fiddle around with the variables (attributes) on the flowfile, depending on how large your operation is.
For example, if you have some initial values, you can include them in the name of your file and then extract them, run the operations within the attributes of the flowfile, and then append the result to the bottom of the original file.

JMeter: Script to compare response kept in an external parameterised file

I have the following requirement:
1. Keep responses in an external xml file.
2. Hit the API and compare the response with the external response (kept in the xml file).
3. While comparing, ignore dynamic components like , etc.
4. Also ignore the sequence of parameters.
Can you please suggest any utility/program to do this in JMeter?
Thanks in advance
Regards
Vishal Pachpute
I believe it makes more sense to use the XML Schema Assertion. This way you will validate your XML response syntax and structure (elements and attributes, the number and order of attributes, data types, etc.), but the assertion won't care in the slightest about the content.
You can ask the developers for the .xsd schema; most likely they have it. If not, the majority of IDEs can generate one, and there are even online services.
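For illustration, a trimmed sketch of what such an .xsd might look like for a hypothetical <order> response; point the XML Schema Assertion's File Name field at it:

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="order">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="id" type="xs:string"/>
            <xs:element name="amount" type="xs:decimal"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>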
References:
XML Schema Tutorial
How to Use JMeter Assertions in Three Easy Steps

How to update the "Replacement Value" in the ReplaceText processor using the REST API?

I need to know how to update the values in NiFi processors using the REST API.
https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
For example: I have used the below processor structure
GetFile>SplitText>ExtractText>ReplaceText>ConvertJSONToSQL>PutSQL.
I have passed the following inputs for the above processors:
FileLocation (GetFile).
validation (ExtractText).
ReplacementValue (ReplaceText).
DBCP ConnectionPool, username and pwd for SQL.
I just need to use a NiFi REST API client to write the above inputs into the processors.
For example: if I give the processor name and an input file in the REST API client, it will write them into the processor.
Please stop me if I'm doing anything wrong.
Help appreciated. Are there any other possible ways?
Mahen,
You can issue a PUT request to /processors/{id} and provide the new value of the "Replacement Value" property. You'll need to provide a JSON body in the request to do this; you can see the structure by expanding the endpoint noted above at the documentation link you provided, then clicking ProcessorEntity > ProcessorDTO > ProcessorConfigDTO to see the pop-up dialogs with the element listing and examples. You can also quickly get the current values of the processor by issuing a GET request to /processors/{id}.
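As a sketch (host, processor id, and revision version are placeholders; the version must match what the GET returned):

    curl -X PUT http://localhost:8080/nifi-api/processors/015c1e91-0000-1000-0000-000000000000 \
      -H 'Content-Type: application/json' \
      -d '{
        "revision": { "version": 3 },
        "component": {
          "id": "015c1e91-0000-1000-0000-000000000000",
          "config": {
            "properties": {
              "Replacement Value": "my new value"
            }
          }
        }
      }'

The component.id must repeat the processor id from the URL, and only the properties you include in the body are changed.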

Why does Elasticsearch favor JSON?

I'm a beginner with Elasticsearch. One feature I found is that Elasticsearch documents are expressed in JSON. I googled for a while but could not find any reason for that.
Can someone help explain why JSON and not XML or another format?
It is because a json document has a key-value structure, and that helps Elasticsearch index on the basis of keys. With XML, a lot of effort would be required just to parse the data, whereas with json Elasticsearch can directly index the required data according to its keys.
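For example, in a document like this (hypothetical fields), every value already hangs off a key that Elasticsearch can map straight to a field in its index:

    {
      "title": "Elasticsearch basics",
      "views": 1024,
      "tags": ["search", "json"]
    }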
Basically, there are two main standard ways to transport data between a server and a client: XML and JSON. Older services use XML as well as JSON, since most of their long-time consumers are tied to XML parsers, but recent services use JSON as the standard, mainly because of the simplicity that comes with it. JSON parsers are easy to build and use, while XML parsers need to be customized per field. Although there are some great libraries for parsing an XML response, like the SAX parser in Java, it is still not that straightforward. Also, JSON can be used directly in JavaScript. I hope I have answered your question.
