I'm working on an EDM migration project and I would like to know if anyone has information on document storage limits in Liferay 6.2 CE.
We need to import millions of documents to a Liferay instance.
I would like to know if someone has done this with the Community Edition, and whether I need a cluster of Liferay instances to keep an acceptable response time for end users.
Thanks for your advice!
Julien
I found an answer in the Liferay documentation. https://dev.liferay.com/fr/discover/deployment/-/knowledge_base/7-0/document-repository-configuration
The advanced file system store overcomes this limitation by programmatically creating a structure that can expand to millions of files, by alphabetically nesting the files in folders. This not only allows for more files to be stored, but also improves performance as there are fewer files stored per folder.
So it seems that with this advanced file system store, Liferay can handle millions of documents and work around the file system's per-folder limits.
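For reference, the store is selected in portal-ext.properties. A minimal sketch for 6.2, assuming the default property names (worth double-checking against the portal.properties shipped with your exact version):

# portal-ext.properties: switch the document library to the advanced file system store
dl.store.impl=com.liferay.portlet.documentlibrary.store.AdvancedFileSystemStore
# optional: point the store at a dedicated volume (path is just an example)
dl.store.file.system.root.dir=/opt/liferay/data/document_library

On an existing instance the documents already on disk need to be migrated to the new structure after switching stores; if I remember correctly there is a Data Migration option under Server Administration for that.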
Hope this helps others.
Julien
I was having trouble bulk-loading records faster than cursor.executemany allows. I hoped the bulk operations documented for regular MonetDB here might work, so I tried an export as a test, e.g. cursor.execute("COPY SELECT * FROM foo INTO '/file/path.csv'"). This doesn't raise an error unless the file already exists, but the resulting file is always 0 bytes. I tried the same with STDOUT as the file and it prints nothing.
Are these COPY commands meant to work on the embedded version?
Note: This is my first use of anything related to MonetDB. As a fan of SQLite and a not-super-impressed user of Amazon Redshift, this seemed like a neat project. Not sure if MonetDB/e is the same as MonetDBLite - the former seems more active lately?
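For completeness, this is roughly the script I'm testing with (monetdbe's sqlite3-style API; the table name and output path are just placeholders):

import monetdbe

# in-memory embedded database; an on-disk path behaves the same way
con = monetdbe.connect(':memory:')
cur = con.cursor()

cur.execute("CREATE TABLE foo (i INTEGER, s VARCHAR(32))")
cur.executemany("INSERT INTO foo VALUES (?, ?)", [(1, 'a'), (2, 'b')])
con.commit()

# completes without an error, but /tmp/foo.csv ends up as a 0-byte file
cur.execute("COPY SELECT * FROM foo INTO '/tmp/foo.csv'")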
Exporting data through a COPY INTO command should be possible in MonetDB/e, yes.
However, this feature is currently not working. I was able to reproduce your problem: the COPY INTO creates the file the data should be exported to, but doesn't write the data. This does not happen with regular MonetDB.
Our team has been notified of this issue, and we're looking into it. Thanks for the heads up!
PS: Regarding your doubt about MonetDB/e vs MonetDBLite: our team no longer develops or maintains MonetDBLite. Both are embedded databases that use MonetDB as the core engine, but MonetDBLite is deprecated. Having learnt some do's and don'ts with MonetDBLite, our team is developing our next generation of embedded databases.
So for your embedded database needs, you should follow what's coming out of our MonetDB/e projects.
I've created a test for it at: https://github.com/MonetDBSolutions/monetdbe-examples/blob/CI/C/copy_into.c
Also filed a bug report over on GitHub: https://github.com/MonetDB/MonetDB/issues/7058
We're currently looking into this issue.
I am currently working with a team that works on search engines, especially HP IDOL.
The main goal of my work is to find a new search engine that is open source, so I started working with Elasticsearch, but I still have some problems I could not find solutions for.
I need to index documents into Elasticsearch from the following servers:
SharePoint
Documentum
Alfresco
From my searches on the web, I found:
Talend (cannot use it because the team does not want to pay)
Apache ManifoldCF (open source, but lots of problems with it)
Having seen those problems, I keep looking for new solutions.
Can you please tell me whether it is possible to put all the files from these sources into HDFS and then index them all into Elasticsearch with Apache Spark?
I would also appreciate any other techniques I have not thought of.
Thanks in advance
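To make the idea concrete, here is a rough sketch of what I have in mind, assuming the documents have already been copied to HDFS as text files and that the plain elasticsearch Python client is installed on the workers (host, index name and paths are made up):

from pyspark.sql import SparkSession
from elasticsearch import Elasticsearch, helpers

spark = SparkSession.builder.appName("hdfs-to-es").getOrCreate()

# (path, content) pairs for every file under the HDFS directory
files = spark.sparkContext.wholeTextFiles("hdfs:///ingest/documents/")

def index_partition(records):
    # one client per partition, bulk-indexing that partition's documents
    es = Elasticsearch("http://es-host:9200")
    actions = ({"_index": "documents",
                "_source": {"path": path, "content": content}}
               for path, content in records)
    helpers.bulk(es, actions)

files.foreachPartition(index_partition)

The elasticsearch-hadoop connector would be another way to write from Spark to Elasticsearch; either way, binary formats coming out of SharePoint or Documentum would probably still need text extraction before this step.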
I'm developing an ADF Fusion Web Application in JDeveloper 12. After creating the project I took a look at the file system, and a bunch of directories had been created.
Can anyone tell me what the .adf folder is good for? I can't find anything about it in the Oracle Docs. I'm developing with git and I'd like to know if I have to version this directory, too.
Thanks in advance!
Inside the above-mentioned folder you can find two files: adf-config.xml and connections.xml. For an overview of their usage you can take a look at these links:
Oracle ADF XML File Appendix and WebCenter. Both state that application-level settings are stored there during design time and can be used later during the deployment process (so it seems quite important :) ). Even if you delete that folder it should be recreated when you make changes and redeploy the application, BUT, if it is there, it means it should be there (typical Oracle politics ;) ). And since you may really need its settings later (for example to modify connection details to point to production server instances), it should be versioned as well.
I'm using svn, and it does version it automatically.
Hope this helps.
ADF creates several files and folders that are needed by the project.
It creates them when the project uses a functionality that needs those files.
In .adf/META-INF/ you can find adf-config.xml and connections.xml, which hold application-level settings.
But for example src/META-INF/jazn-data.xml doesn't exist until you enable Security on your application. This file is also needed and should be on SVN/Git.
ADF also creates some temporary files and folders that shouldn't be on Git/SVN, for example .data/.
Depending on what technologies you use from the ADF stack (ADF BC, ADF Model, ADF Controller, ADF Faces), you should understand what files and folders are created.
If you had searched for .adf/ in the official documentation, you would have found your answers.
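As a rough rule of thumb for Git, the split described above can be captured with something like this (exact folder names may vary with your JDeveloper version):

# generated at build/run time; safe to keep out of version control
.data/

# do NOT ignore these, they carry application-level and security settings:
#   .adf/META-INF/adf-config.xml
#   .adf/META-INF/connections.xml
#   src/META-INF/jazn-data.xml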
By default, ADF creates the .adf and .data folders. The .adf folder holds various information related to your workspace in the IDE, i.e. the connections it has with the database and the META-INF information used for customization purposes.
The .data folder supports the MDS functionality.
We can always delete them, but JDeveloper will recreate them automatically whenever we rebuild the application.
I am maintaining a Joomla 2.5 based magazine website with 3-4 new, long articles every day.
Smart Search was enabled by default and now I've got a few "finder" tables full of indexed phrases and terms.
I wonder if there are any disadvantages if I:
Disable the Smart Search plugin
Remove these 'finder' tables completely
Oh, and we're using a search field, which works fine, but I'm not sure what's going to happen if I disable the plugin and remove these tables. Will it then search for phrases in the Joomla content tables, or simply break without the 'finder' tables?
Has anyone tried this before?
I have my doubts about having Smart Search enabled by default. Smart Search is one of the coolest (and sometimes overlooked) features in J2.5, but it's not for all websites. You just need to test it!
Here is what you can do:
You can certainly disable the Smart Search plugin, IF you are not using Smart Search (or you think you don't need it). The old search will still be there to use.
I would strongly advise against deleting the tables (just empty their content if you really want to; this can be done from the Smart Search management GUI); you may (or will) have problems later when you update Joomla.
Most important, don't do this kind of stuff on your live server. Nowadays it's about as easy as it gets to run Joomla! on a XAMPP or WAMP installation where you can test everything.
Smart Search is a much more powerful search than the old search, since it indexes your content, returns it in a sensible order, and lets you do filtering and maps.
If you really want to disable it, you would disable the finder content plugin. Then you can clear the tables in the Smart Search manager (I think it says 'purge index' or something like that). I wouldn't uninstall it, because you may want to go back.
I was thinking of deleting or emptying the FINDER tables, but the best solution for big websites is to:
Disable the CONTENT SEARCH plugin and
Use the classic search module.
My user experience remained the same.
I am using the embedded version of RavenDb and have put the physical database in the App_Data folder, based on this article http://msdn.microsoft.com/en-us/magazine/hh547101.aspx. My first question is, what portions of the db need to be committed to the SCM repo?
The second question is: my workflow is such that I'll also use web publishing directly from my laptop; are there any concerns with this methodology?
Thank you,
Stephen
There's no need to put your database under source control, since your documents have no particular schema. They will be created on the fly when serialized into JSON. So as long as you check in your C# classes, you're all fine.
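If the embedded data files live under App_Data as in that article, you can keep them out of the repository with an ignore rule, for example (the subfolder name here is only illustrative; use whatever you configured as the DataDirectory):

# ignore the embedded RavenDB data files, keep the rest of App_Data
App_Data/RavenDB/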
First, are you aware that RavenDB uses the AGPL license? This license requires that you publish your project as open source if you are not paying for a commercial license.
They do offer free licensing in some cases but you must contact them and get a license. Check their licensing page for more details.
Second, you probably shouldn't check your database into your SCM. Databases change frequently, and SCM is designed for files that are not constantly changing. You might want to check in your database schema as it changes... but not the database itself.
Regarding your second question, I'm not sure what concerns you're talking about. Can you be more clear about what your concerns are?