I would like to understand whether it is possible to load a GML file into GeoServer and serve it over WFS.
Thank you in advance
It is not possible to load a GML file directly into GeoServer. GML is a transport format, not a storage format: while it compresses easily, it has no indexing, and as a text format it is very hard to search efficiently.
You can use ogr2ogr to convert your GML into a number of formats that GeoServer can serve as WFS layers. The most efficient would be to store the data in a well-indexed PostGIS database.
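For example, a minimal sketch that drives ogr2ogr from Python; the connection parameters, file name, and target table name are placeholders to replace with your own:

    # Convert a GML file into a PostGIS table that GeoServer can publish as a WFS layer.
    # Assumes ogr2ogr (GDAL) is installed and a PostGIS database already exists;
    # all connection details below are placeholders.
    import subprocess

    subprocess.run(
        [
            "ogr2ogr",
            "-f", "PostgreSQL",                        # output driver
            "PG:host=localhost dbname=gis user=gis password=secret",
            "my_data.gml",                             # input GML file
            "-nln", "my_layer",                        # target table name
            "-lco", "GEOMETRY_NAME=geom",              # layer creation option
        ],
        check=True,
    )

Once the data is in PostGIS, add the store in GeoServer, publish the layer, and it will be served over WFS.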
Related
I want to store an image in a Hive table and then retrieve it to display on a dashboard. Can I do this without using any Java coding? I have successfully created the Hive table and loaded the image file into a column with a binary datatype, but the file in HDFS looks like this:
(unreadable binary JPEG content: JFIF/Exif markers, a "CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90" comment, followed by raw binary data)
Can anyone kindly help me with how to retrieve the image from the Hive table?
Interesting. You say that you have already stored it in a table using the binary data type. Can you share the details of how you did it? The question that remains open is how you visualize it in a dashboard.
Hive exposes a HiveServer connection through which you can query the Hive table from your web page. Essentially it is a SELECT * statement, and what you do with the data after that is up to you.
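As a rough sketch in Python with PyHive (the host, table, and column names are assumptions, and this presumes the column really holds the raw image bytes):

    # Query the image bytes from Hive over HiveServer2 and base64-encode them
    # so a web dashboard can embed them as a data URI.
    # Host, table, and column names are placeholders.
    import base64
    from pyhive import hive

    conn = hive.Connection(host="hive-server-host", port=10000, username="hadoop")
    cursor = conn.cursor()
    cursor.execute("SELECT image_data FROM images LIMIT 1")
    (image_bytes,) = cursor.fetchone()

    # Depending on the Hive/driver versions, the value may need converting to bytes first.
    data_uri = "data:image/jpeg;base64," + base64.b64encode(image_bytes).decode("ascii")

A data URI like this can be dropped straight into an <img> tag on the dashboard.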
I think the problem is that you are loading the data using LOAD DATA INPATH, but your Hive table expects the data to be provided in a binary format. This is an issue of Hive not understanding the data: when the JPEG is viewed from the dashboard, Hive assumes the .jpeg file is already in its expected binary format and decodes it accordingly. I would recommend using some method (Java or other) to convert the .jpeg file to a suitable binary/encoded format before running your LOAD DATA INPATH command.
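For example, a small Python sketch (instead of Java) that base64-encodes the image into a text file; that file can then be put into HDFS, loaded with LOAD DATA INPATH, and decoded inside Hive with unbase64(). File names are placeholders.

    # Encode a JPEG as base64 text so it can be loaded into a Hive column
    # with LOAD DATA INPATH and decoded in Hive with unbase64().
    # File names are placeholders.
    import base64

    with open("photo.jpg", "rb") as f:
        encoded = base64.b64encode(f.read())

    with open("photo_b64.txt", "wb") as out:
        out.write(encoded)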
This is not a question about code. I need to extract some BLOB data from an Oracle database using a Python script. My question is: what are the steps for dealing with BLOB data, and how do I read it back as images, videos, and text? Since I have no access to the database itself, is it possible to know the type of the stored BLOBs, i.e. whether they are pictures, videos, or text? Do I need encoding or decoding in order to transfer these BLOBs into .jpg, .avi, or .txt files? These are very basic questions, but I am new to programming and need some help finding a starting point :)
If you have a pure BLOB in the database, as opposed to, say, an ORDImage that happens to be stored in a BLOB under the covers, the BLOB itself has no idea what sort of binary data it contains. Normally, when the table was designed, a column would be added that would store the data type and/or the file name.
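If no such column exists, one pragmatic option is to fetch the BLOB and inspect its first few "magic" bytes. A rough Python sketch with cx_Oracle (the connection string, table, and column names are made up for illustration):

    # Fetch a BLOB from Oracle and guess its type from the leading "magic" bytes.
    # Connection string, table, and column names are placeholders.
    import cx_Oracle

    conn = cx_Oracle.connect("user/password@host:1521/service")
    cursor = conn.cursor()
    cursor.execute("SELECT my_blob FROM my_table WHERE id = :id", id=1)
    lob, = cursor.fetchone()
    data = lob.read()

    if data[:3] == b"\xff\xd8\xff":      # JPEG signature
        filename = "output.jpg"
    elif data[:4] == b"RIFF":            # RIFF container (e.g. AVI)
        filename = "output.avi"
    else:                                # fall back to treating it as text/unknown
        filename = "output.txt"

    with open(filename, "wb") as f:
        f.write(data)

This is only a heuristic, but note that once you know the right extension, the raw BLOB bytes can be written straight to a file without any extra encoding or decoding.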
I am trying to figure out a similar case in Hadoop.
What is the best file format, Avro or SequenceFile, for storing images in HDFS and processing them afterwards with Python?
SequenceFiles are key-value oriented, so I think Avro files will work better?
I use SequenceFile to store images in HDFS and it works well. Both Avro and SequenceFile are binary file formats, hence they can store images efficiently. As keys in the SequenceFile I usually use the original image file names.
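For example, a rough PySpark sketch that packs a directory of small images into a SequenceFile keyed by the original file names (the paths are placeholders, and this assumes Spark is available on the Hadoop cluster):

    # Pack small image files from HDFS into one SequenceFile:
    # key = original file path, value = raw image bytes.
    # Paths are placeholders; assumes PySpark on a Hadoop cluster.
    from pyspark import SparkContext

    sc = SparkContext(appName="images-to-sequencefile")

    images = sc.binaryFiles("hdfs:///data/images/*.jpg")        # (path, bytes) pairs
    images.mapValues(bytearray).saveAsSequenceFile("hdfs:///data/images_seq")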
SequenceFiles are used in many image processing products, such as OpenIMAJ. You can use existing tools for working with images in SequenceFiles, for example the OpenIMAJ SequenceFileTool.
In addition, you can take a look at HipiImageBundle. This is a special format provided by HIPI (Hadoop Image Processing Interface). In my experience, HipiImageBundle has better performance than SequenceFile, but it can be used only by HIPI.
If you don't have a large number of files (less than 1M), you can try storing them without packaging them into one big file and use CombineFileInputFormat to speed up processing.
I have never used Avro to store images, and I don't know of any project that does.
I have a huge amount of JSON files, >100TB in total; each file is 10GB bzipped, each line contains a JSON object, and they are stored on S3.
If I want to transform the JSON into CSV (also stored on S3) so I can import it into Redshift directly, is writing custom code using Hadoop the only choice?
Would it be possible to do ad-hoc queries on the JSON files without transforming the data into another format? (I don't want to convert them first every time I need to run a query, since the source keeps growing.)
The quickest and easiest way would be to launch an EMR cluster loaded with Hive to do the heavy lifting. By using the JsonSerde, you can easily transform the data into CSV format; this only requires inserting the data from the JSON-formatted table into a CSV-formatted table.
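As a rough illustration, here are the two Hive statements submitted from Python through PyHive (you could just as well paste them into the Hive CLI on the master node). The bucket paths, columns, and the openx JsonSerDe class are assumptions; use whichever JSON SerDe jar you actually have on the cluster:

    # Define an external table over the bzipped JSON on S3 using a JSON SerDe,
    # then write it back out as CSV with a plain INSERT ... SELECT.
    # Bucket names, columns, and the SerDe class are placeholders.
    # Hive/Hadoop decompress the .bz2 input transparently.
    from pyhive import hive

    conn = hive.Connection(host="emr-master-node", port=10000, username="hadoop")
    cursor = conn.cursor()

    cursor.execute("""
        CREATE EXTERNAL TABLE IF NOT EXISTS events_json (user_id STRING, ts STRING)
        ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
        LOCATION 's3://my-bucket/json-input/'
    """)

    cursor.execute("""
        CREATE EXTERNAL TABLE IF NOT EXISTS events_csv (user_id STRING, ts STRING)
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION 's3://my-bucket/csv-output/'
    """)

    cursor.execute("INSERT OVERWRITE TABLE events_csv SELECT user_id, ts FROM events_json")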
A good tutorial for handling the JsonSerde can be found here:
http://aws.amazon.com/articles/2855
Also, a good library for the CSV format is:
https://github.com/ogrodnek/csv-serde
The EMR cluster can be short-lived and is only needed for that one job; it can also run on low-cost spot instances.
Once you have the CSV format, the Redshift COPY documentation should suffice.
http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
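For completeness, a minimal sketch of the COPY step driven from Python with psycopg2 (the cluster endpoint, credentials, IAM role, and table name are placeholders):

    # Load the CSV files produced above from S3 into Redshift with COPY.
    # Endpoint, credentials, role ARN, and table name are placeholders.
    import psycopg2

    conn = psycopg2.connect(
        host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439, dbname="mydb", user="admin", password="secret",
    )
    with conn, conn.cursor() as cur:
        cur.execute("""
            COPY events
            FROM 's3://my-bucket/csv-output/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
            CSV
        """)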
How to store/upload satellite imagery (*.TIFF, *.GeoTIFF, *.JPEG formats) into HDFS?
How to break the stored satellite imagery into tiles?
How to store those tiles in the Hive metastore?
How to perform simple querying of the stored data using Pig or HBase?
How to perform simple image processing of the stored satellite imagery using a MapReduce program?
Hadoop provides SequenceFiles as an alternative for handling small files. For handling images, please check this "Processing images" link and also a Cloudera post.
Edit:
HIPI: a library for Hadoop's MapReduce framework that provides an API for performing image processing tasks.
What I would do is to treat the image as a matrix.
I would generate a flat file with tuples following this format:
(x coord, y coord, value)
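For example, a small Python sketch that dumps a grayscale image into that tuple format (file names are placeholders):

    # Turn an image into a flat file of (x, y, value) tuples, one per line,
    # so it can be pushed to HDFS and processed with Pig.
    # File names are placeholders; the image is converted to grayscale.
    import numpy as np
    from PIL import Image

    pixels = np.array(Image.open("input.png").convert("L"))

    with open("image_tuples.txt", "w") as out:
        for (y, x), value in np.ndenumerate(pixels):
            out.write(f"{x},{y},{value}\n")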
This way you can apply many image manipulations (rotate, subtract two images, identify connected components, do some border detection, ...).
As for the technology, I would start by using flat files in HDFS and playing with Pig.
Here is an example of matrix multiplication using this format:
http://importantfish.com/one-step-matrix-multiplication-with-hadoop/