Database Crawling in GSA - google-search-appliance

I can see there are two ways to index database records in GSA:
Content Sources > Databases
Using DB connector
As per my understanding, Content Sources > Databases does not support automatic recrawl; we have to manually sync after any changes occur in the DB records. Is that correct?
Also, would using DB connectors help with automatic recrawl?
I would like to check the DB every 15 minutes for changes and update the index accordingly. Please suggest a viable approach to achieve this.
Thanks in advance.

You are correct that Content Sources > Databases does not support any sort of automated recrawl.
Both the 3.x Connector and the 4.x Adaptor for Databases support automatic recrawls. If you are only looking to index the rows of databases, and not using it to feed a list of URLs to index, then I would go with the 4.x Database Adaptor as it is newer.

The Content Sources > Databases approach is good for data that doesn't change often, where a manual sync is acceptable. That said, it's easy enough to write a simple client that logs in to the admin console and hits the 'Sync' link periodically (see the sketch below).
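For illustration, a minimal sketch of such a client in Python, assuming entirely hypothetical endpoints and form-field names (capture the real login form fields and the 'Sync' link target from your own admin console with your browser's developer tools):

    # Hypothetical sketch: periodically trigger the 'Sync' link in the GSA admin console.
    # ADMIN_URL, the login form fields and SYNC_PATH are placeholders -- capture the
    # real values from your own admin console with browser dev tools.
    import time
    import requests

    ADMIN_URL = "https://gsa.example.com:8443"      # placeholder admin console base URL
    USERNAME = "admin"
    PASSWORD = "secret"
    SYNC_PATH = "/path-copied-from-sync-link"       # placeholder 'Sync' link target

    def sync_once():
        with requests.Session() as session:
            # Log in; form field names are assumptions -- inspect the real login form.
            session.post(ADMIN_URL + "/login",
                         data={"userName": USERNAME, "password": PASSWORD},
                         verify=False)
            # Hit the 'Sync' link for the database source.
            resp = session.get(ADMIN_URL + SYNC_PATH, verify=False)
            resp.raise_for_status()

    if __name__ == "__main__":
        while True:
            sync_once()
            time.sleep(15 * 60)   # every 15 minutes, as asked above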
However, if you want frequent updates like every 15m I'd definitely go with the 4.x plexi-based adaptor, not because it's newer but because it's better. Older versions of the 3.x connector were a bit flaky (although the most recent versions are much better).
What flavour DB are you looking to index?

Related

Data model migration on Kafka connectors

I'm using Debezium and JDBCSinkConnector to copy data from multiple databases into another DB. I would like the ability to upgrade some of the models from time to time, and not necessarily all DBs together: say, upgrade the sink DB first and, some time later, the source DBs as well. Assume I have a version column in the tables, or an environment variable, available for reconfiguring the connectors. I've considered creating a combination of SMTs for upgrading from each version and running them depending on the source version, using some predicates, but I'm not sure this is good practice or that it will work at all. I haven't been able to find another solution for this.
What is the best way to implement the required migrations (such as added/removed columns, value manipulations, etc.) "on the fly" via Kafka connectors?
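For what it's worth, a minimal sketch of the predicate-gated SMT idea, registering a JDBC sink through the Kafka Connect REST API; the hosts, topics, column names and the version-based routing are placeholders, not a claim that this is the right design, and connector predicates require Kafka Connect 2.6+:

    # Sketch: a JDBC sink whose rename SMT only runs for topics from the
    # not-yet-upgraded source DB, selected by a TopicNameMatches predicate.
    # All hosts, topics and field names below are placeholders.
    import json
    import requests

    connect_url = "http://localhost:8083/connectors"   # default Connect REST endpoint

    connector = {
        "name": "orders-sink-v2",                       # placeholder connector name
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "connection.url": "jdbc:postgresql://sink-db:5432/app",   # placeholder
            "connection.user": "app",
            "connection.password": "secret",
            "topics.regex": "source\\..*\\.orders",

            # SMT that renames a column for records still in the old (v1) shape.
            "transforms": "renameV1",
            "transforms.renameV1.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
            "transforms.renameV1.renames": "customer:customer_id",    # placeholder mapping
            "transforms.renameV1.predicate": "isV1Source",

            # Only apply the SMT to topics coming from the legacy (v1) source DB.
            "predicates": "isV1Source",
            "predicates.isV1Source.type": "org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
            "predicates.isV1Source.pattern": "source\\.legacy\\..*",
        },
    }

    resp = requests.post(connect_url, json=connector)
    resp.raise_for_status()
    print(json.dumps(resp.json(), indent=2))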

Using Saiku-ui without saiku-server/mondrian?

Is it possible to use the saiku-ui component with a different jolap provider than mondrian, or with a different server backend than the saiku-server component?
I have been looking but I have not found an architecture description of how these pieces fit together and what interfaces they use to communicate. Can anyone point me towards an understanding of what the saiku-ui wants to speak with and what the saiku-server is providing?
The reason for my interest is that I have a set of data spread across hundreds of csv files that I would like to query with a pivot and charting tool. It looks like the standard way to use this with saiku would be to have an ETL process to load it into an RDBMS. However, this would not be a simple process, because the files, their content, and the way the files relate to each other vary, so the ETL would have to do a lot of inspection of the data sources to figure it out.
Given this, it seems to me that I have three options for how to use saiku:
1) write a complex ETL to load into an rdbms, and then use a standard jdbc driver to provide the data to mondrian. A side function of the ETL would be to analyze the inputs and write the mondrian schema file describing the cubes.
2) write a jdbc driver to access the data natively. This driver would parse sql and provide access to the underlying tables. Essentially this would be a custom r/o dbms written on top of the csv files. The jdbc connection would be used by mondrian to access the data. A side function of this custom dbms would be to produce the mondrian schema file.
3) write a tool that provides a jolap interface to the native data (accepts discovery and mdx queries). This would bypass mondrian entirely and interface with the ui.
I may be a bit naive here, but I consider each of the three options to be feasible. Option #1 is my least preferred because of the likelihood of the data in the rdbms becoming out of sync with the csv files. Option #3 is most preferred because the data are simple, so not much aggregating is required, and I suspect that mdx will be easier to parse than sql.
So, if I could produce my own jolap data source, would it be possible to hook the saiku-ui tools up to it? Where would I look to find out the interface configuration details?
Many years ago, #ronaldbouman created xmondrian - a set of tools with an OLAP server and web UI tools for XMLA browsing and visualisation. But that project is no longer updated and has no source code available.
I have just updated the OLAP server and libraries to the latest versions.
You can get it here and build it yourself:
https://github.com/Muritiku/xmondrian-build
You can use the web package as an example. The mondrian server works with the saiku-ui.
IMHO,
I would not be as confident as you are, because it took Julian Hyde more than a decade to build Mondrian (MDX->SQL) and Calcite (SQL), which correspond to your last two proposals.
You might simply consider using Calcite, or even better, Dremio. Dremio has a JDBC interface and can query directories of CSV files in SQL (a sketch follows after this answer). I tested Saiku over Dremio successfully (with a schema based on two separate RDBMSs). Just be careful to set up the tables' schemas accordingly in the Mondrian v4 schema.
Best regards,
Fabrice Etanchaud
Dremio
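For reference, a minimal sketch of querying Dremio from Python over JDBC with jaydebeapi; the host, port, credentials, driver jar path and dataset path are all placeholders, so check the Dremio documentation for the exact connection string for your version:

    # Sketch: query a folder of CSV files promoted as a Dremio dataset, over JDBC.
    # Connection details and the dataset path are placeholders.
    import jaydebeapi

    conn = jaydebeapi.connect(
        "com.dremio.jdbc.Driver",                   # Dremio JDBC driver class
        "jdbc:dremio:direct=dremio-host:31010",     # placeholder host:port
        ["dremio_user", "dremio_password"],         # placeholder credentials
        "/path/to/dremio-jdbc-driver.jar",          # placeholder driver jar location
    )
    try:
        cur = conn.cursor()
        # "csv_source"."sales" stands in for the promoted CSV directory.
        cur.execute('SELECT region, SUM(amount) FROM "csv_source"."sales" GROUP BY region')
        for row in cur.fetchall():
            print(row)
    finally:
        conn.close()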

Ruby: Mongoid Criteria Selector to SQL Query

As part of my bachelor's thesis I'm building a microservice using Postgres which would in part replace an existing part of an application using MongoDB. Now, to change as little as possible on the client side for the moment, I was wondering if there is an easy way to translate a Mongoid::Criteria to an SQL query (assuming all fields are named the same, of course) without having to write a complete parser myself. Are there any gems out there that might support this?
Any input is highly appreciated.
Maybe you're looking for this: https://github.com/stripe/mosql
I haven't dug into it, but it seems to do what you need:
"MoSQL imports the contents of your MongoDB database cluster into a PostgreSQL instance, using an oplog tailer to keep the SQL mirror live up-to-date. This lets you run production services against a MongoDB database, and then run offline analytics or reporting using the full power of SQL."

Is it possible to migrate projects on SONAR from the in-memory DB to an Oracle DB?

I am currently setting up SONAR with the in-memory db for an evaluation. Should we wish to use the tool, I would like to then migrate the analysis results onto an Oracle db to use going forward. Is this possible?
No tool is provided to do such a migration, and I advise you not to try to do so.
However, be aware that you do have the option of replaying the history of your analyses: you can check out old versions of your code and launch an analysis on each one, using the "sonar.projectDate" parameter to change the analysis date.
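A rough sketch of that replay loop, assuming a Git repository with tagged releases and a Maven build (the tags, dates and the use of "mvn sonar:sonar" are assumptions; adapt them to your project):

    # Sketch: replay analysis history into the new database by checking out old
    # tags and back-dating each run with sonar.projectDate (oldest first).
    # Tag names and dates are placeholders.
    import subprocess

    versions = [
        ("v1.0", "2011-06-01"),   # (git tag, analysis date) -- placeholders
        ("v1.1", "2011-09-01"),
        ("v2.0", "2012-01-01"),
    ]

    for tag, date in versions:
        subprocess.run(["git", "checkout", tag], check=True)
        subprocess.run(["mvn", "sonar:sonar", "-Dsonar.projectDate=" + date], check=True)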

Is there a way for the Oracle Data Integrator to extract data from MongoDB

I'm trying to move snapshots of data from our MongoDB into our Oracle BI data store.
From the BI team I've been asked to make the data available for ODI, but I haven't been able to find an example of that being done.
Is it possible and what do I need to implement it?
If there is a more generic way of getting MongoDB data into Oracle then I'm happy to propose that as well.
Versions
MongoDB: 2.0.1
ODI: 11.1.1.5
Oracle: 11.2g
Edit:
This is something that will be queried once a day, maybe twice, but at this stage the BI report granularity is daily.
In ODI, under the Topology tab and Physical Architecture sub-tab, you can see all technologies that are supported out of the box. MongoDB is not one of them. There are also no Knowledge Modules available for importing/exporting from/to MongoDB.
ODI supports implementing your own technologies and your own Knowledge Modules.
This manual will get you started with developing your own Knowledge Module, and in one of the other manuals I'm sure you can find an explanation of how to implement your own technologies. (Ctrl-F for "Data integrator")
If you're lucky, you might find someone else who has already implemented it. Your best places to look would be The Oracle Technology Network Forum, or a forum related to MongoDB.
Instead of creating a direct link, you could also take an easier workaround: export the data from MongoDB to a format that ODI supports and that MongoDB can extract to - CSV or XML maybe? Then load the data through ODI into the Oracle database. I think that will be the best option, unless you have to do this frequently.
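For the export step, a minimal sketch using pymongo to dump a collection to a flat CSV file that ODI can then load (database, collection and field names are placeholders, and with a MongoDB 2.0.1 server you may need an older pymongo release):

    # Sketch: dump a MongoDB collection to CSV for ODI to pick up.
    # Database, collection and field names are placeholders.
    import csv
    from pymongo import MongoClient

    FIELDS = ["_id", "customer", "amount", "created_at"]   # placeholder columns

    client = MongoClient("mongodb://mongo-host:27017/")
    collection = client["reporting"]["daily_snapshot"]

    with open("daily_snapshot.csv", "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        for doc in collection.find({}, dict.fromkeys(FIELDS, 1)):
            writer.writerow({field: doc.get(field, "") for field in FIELDS})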
Look at the blog post below for another option:
https://blogs.oracle.com/dataintegration/entry/odi_mongodb_and_a_java
Cheers
David
