Upload a large folder of JSON files to Elasticsearch

I have a folder that contains 20,000 JSON files.
I want to import these into an index in Elasticsearch. I know about the bulk API, but I'm not sure how to loop over my folder to index each JSON file.
I've seen examples that use --data-binary @file.json, but that seems to work only for individual files, not an entire folder. Maybe I am just not understanding correctly.
The files are GeoJSON and I am on a Windows machine. Any help is appreciated, thanks!
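
A minimal sketch of one way to do that loop, assuming Python with the official elasticsearch package (8.x client) and a local cluster at http://localhost:9200; the folder path and index name below are placeholders, not from the question:

```python
# pip install elasticsearch
import json
from pathlib import Path

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def generate_actions(folder, index_name):
    """Yield one bulk action per .json file in the folder."""
    for path in Path(folder).glob("*.json"):
        with open(path, encoding="utf-8") as f:
            yield {"_index": index_name, "_source": json.load(f)}

# Raw string so the Windows path's backslashes survive (hypothetical path).
ok, errors = bulk(es, generate_actions(r"C:\data\geojson", "geojson-idx"),
                  raise_on_error=False)
print(f"indexed {ok} documents, {len(errors)} errors")
```

helpers.bulk streams the generator in chunks (500 actions per request by default), so the 20,000 files are never assembled into one giant request body.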

Related

In the Elasticsearch data folder, in what scenario does the folder nodes/1 get created?

We are facing an issue with the folder Elasticsearch uses for data. Data used to come from the data/elasticsearch/nodes/0 folder; now it has changed to data/elasticsearch/nodes/1.
The old data is still there in data/elasticsearch/nodes/0; we were able to verify that by copying the data from nodes/0 to nodes/1.
Please help me understand why it is behaving like this and how to reproduce this issue, so we can avoid the same in the future.
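
For what it's worth, one scenario known to produce this (an assumption here, since the question gives no details): a second Elasticsearch process starts against the same data path while nodes/0 is still locked by the first, so the newcomer claims nodes/1. On 2.x-7.x this can be blocked in elasticsearch.yml; the setting was removed in 8.0:

```yaml
# elasticsearch.yml (2.x-7.x; removed in 8.0)
# Allow at most one node per data path, so an accidental second process
# fails to start instead of silently creating data/elasticsearch/nodes/1.
node.max_local_storage_nodes: 1
```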

Elasticsearch index a file automatically

I am new to Elasticsearch, and this question might look weird, but is it possible to index a file automatically (i.e. given a file path, Elasticsearch should index its contents automatically)? I have found open-source tools like elasticdump and tried using them for this purpose, but I would prefer an Elasticsearch plugin that supports almost all Elasticsearch versions. Can anyone suggest one?

Run Elasticsearch on PDFs and PPTs

I am new to Elasticsearch. I have read its tutorials, but I need guidance on my problem:
I have a collection of PDF documents and PowerPoint files on my system. I need to build a system using Elasticsearch where I can retrieve these files based on the keywords they contain. Can someone please guide me on how to proceed and index my documents? Do I need to parse my PDFs and convert them to JSON using Tika or FSCrawler and then feed that to Elasticsearch?
Thank you.
You should set up FSCrawler; it will do the parsing and make the files' content searchable.
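
For illustration, a minimal FSCrawler job settings sketch, assuming FSCrawler 2.7+ and a local cluster; the job name "docs" and the folder path are hypothetical. The file lives at ~/.fscrawler/docs/_settings.yaml and the job is started with: fscrawler docs

```yaml
name: "docs"
fs:
  url: "/path/to/collection"   # folder holding the PDF/PPT files (placeholder)
  update_rate: "15m"           # rescan the folder every 15 minutes
elasticsearch:
  nodes:
    - url: "http://localhost:9200"   # assumed local cluster
```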

Indexing files automatically with Elasticsearch

I am a newbie with Elasticsearch, so please forgive me if my question sounds weird :D
I want to index files in some directories with Elasticsearch automatically (for example: if I add a file to a certain directory, Elasticsearch should index that file immediately), but I don't know how to configure Elasticsearch to solve that problem.
Can anyone suggest something?
Thanks in advance.
I don't think you can have Elasticsearch watch a directory (and I wouldn't think that is a good thing to do in most cases).
Instead, have a client wrapper that implements a FileWatcher and push changes to Elasticsearch via this client; see the sketch after this answer.
You could use the PathHierarchyTokenizer to preserve the file-system hierarchy in your index, allowing you to drill down through your directory structure.
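
A minimal sketch of the FileWatcher idea, assuming Python with the watchdog package and the 8.x elasticsearch client; the index name "files" and the watched directory are placeholders:

```python
# pip install watchdog elasticsearch
import time
from pathlib import Path

from elasticsearch import Elasticsearch
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

class IndexOnCreate(FileSystemEventHandler):
    """Index every newly created file into the 'files' index."""
    def on_created(self, event):
        if event.is_directory:
            return
        path = Path(event.src_path)
        es.index(index="files", document={
            "path": str(path),  # keep the full path so you can filter on it
            "content": path.read_text(errors="ignore"),
        })

observer = Observer()
observer.schedule(IndexOnCreate(), "/path/to/watched", recursive=True)  # placeholder directory
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()
```

Keeping the watcher outside Elasticsearch, as the answer suggests, also means a crashed watcher never takes the cluster down with it.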

GAE/J file store

I am working on a GAE/J based project. One of the requirements is to let the users upload files (doc, ppt, pdf, xls, etc.).
What options do I have for storing files besides http://code.google.com/p/gae-filestore/
Is it possible to make these files "searchable"?
Blobstore service. It stores files up to 2 GB, and its API was intended for user-uploaded files. See: http://code.google.com/appengine/docs/java/blobstore/overview.html It isn't indexed, but I believe that some people have been able to use Map/Reduce with it: http://ikaisays.com/2010/08/11/using-the-app-engine-mapper-for-bulk-data-import/
Datastore. You can store up to 1 MB as a "blob property".
