Document Based App with Core Data vs plain Core Data app? - cocoa

I am trying to understand the key differences between these two types of Core Data application templates.
My understanding is that with a document-based Core Data app you get access to NSDocument instances and a lot of document-based behaviors for free (save dialogs, undo, etc.).
Assuming I want to create an application that is more "Project" based and not necessarily focused on creating individual savable documents, should I still use Core Data with documents?
To be more concrete, I am trying to build a simple CMS application that uses Core Data and outputs HTML pages in a structured way. The app would be focused on "sites" that are really projects and not single documents. The projects contain a consolidated model for various posts, pages, sidebar content, and whatever other content might need to go into a website. But the app doesn't save individual pages as documents in the traditional sense. I want a unified model of all the project data, plus export functionality where the entire application model is expressed as a set of HTML documents in a specified project folder.
This is both a learning exercise and something I want to try and build for myself.
Any tips on specific documentation to read? In particular, information about "Project" based Cocoa apps, and useful samples and tutorials.
It is conceivable that the CMS data model could be stored in a single Core Data document but that doesn't necessarily seem right from an architectural point of view.

A project can be a document and a document doesn't have to be a single file. Read the NSDocument related documentation and decide if it offers any functionality you might be interested in.

I look at the difference as being that a Document based application lets the user have multiple sets of information stored separately.
The best example of the opposite, from a functionality perspective, is iTunes. Apple doesn't let you have multiple libraries; it's all or nothing, one "database" for the entire application.
A simple document based application would be something like TextEdit.
I don't think what you are proposing is too different from a single-document-based application, though. You just need to remember that the web pages you produce are an OUTPUT, not a part of the project, in the same way that you don't think of printed output from TextEdit as part of the document, or of object/executable files in Xcode as part of the project.


Apache ACE - Simple folder based deployment / provisioning?

I have many bundles (let's say hundreds) and it is quite difficult to specify the relations between bundles, features and distributions in the UI. Imagine that I first define all relations between bundles, features and distributions. Then I want to update some bundles... it is almost impossible to find them in the current implementation of the UI. They are not grouped, and a single list of all bundles without any search bar is really hard to work with.
Is there any support for a file-based solution? For example, Apache ACE would watch a certain folder containing a distribution's bundles. Whenever I make a change there, it would propagate it to all targets.
There is currently no file based solution that matches what you describe, however, I think there are still a couple of solutions that might help you:
There are two types of associations between artifacts and features in ACE: static and dynamic ones. The latter can be of help to you, as they always automatically bind to the highest version of a bundle. So, once you've made all your associations, you can simply upload a set of newer bundles and the associations will adapt.
There is also a REST API you can use to programmatically talk to ACE. You can use that to further automate your process.
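To make the idea behind dynamic associations more concrete, here is a minimal sketch of "always bind to the highest version". It is not ACE's implementation and does not use the ACE REST API; the bundle names, the `(name, version)` tuples and the `highest_versions` helper are assumptions made for illustration, and versions are assumed to be plain `major.minor.micro` strings without qualifiers.

```python
from typing import Dict, Iterable, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.2.10' into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def highest_versions(artifacts: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each bundle name to the highest version seen for it."""
    best: Dict[str, str] = {}
    for name, version in artifacts:
        current = best.get(name)
        if current is None or parse_version(version) > parse_version(current):
            best[name] = version
    return best


# Hypothetical uploads: one bundle exists in three versions, another in one.
uploaded = [
    ("com.example.core", "1.0.0"),
    ("com.example.core", "1.10.0"),
    ("com.example.core", "1.2.0"),
    ("com.example.ui", "2.0.1"),
]
resolved = highest_versions(uploaded)
```

Comparing the parsed tuples rather than the raw strings matters here: as strings, "1.2.0" would sort above "1.10.0".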
That said, you have a valid point that it is difficult to keep an overview when there are a lot of artifacts in the first column. I would advise you to watch, or even contribute to the following issues that were all created to improve this situation:

File type misery - Cocoa

So we recently shipped a document based application with an unfortunate oversight: the UTI for our main document type was left blank. We had a name for it, but the identifier was straight up empty.
Everything still worked great, but then we went to add another file type to the mix. The new file type is simply xml (conforms to public.xml). We set that up and dropped it into the document. This is when we caught our oversight on the first document type's UTI.
Now, if we so much as touch this document type, BOOM. The application can't read any files it has created of that type. We really want to clean this up, so what's the best way to do so?
My question is essentially:
How do you migrate your main document type in a document based application?
First, it's very difficult to debug this type of problem on the machine you're using to cut builds. The dynamic UTI system gets confused as to which app owns which files. To solve this issue, there is a command you can run in terminal to clear out the file associations on your system.
Next, we tackled the actual document types of our application. Ultimately, we want to support just two document types, our custom type and the xml type. However, we had to keep that empty, dynamically generated UTI that was shipped. In "Document types", we have three: the two we actually want to support and the legacy one we no longer want. For the first two, our application is an "Editor". For the legacy one, we changed it to "Reader".
Another thing that really helped our system out is using exported and imported UTIs. We told the system our application imports the XML type, and exports the two others.
We've done some pretty significant testing, including deployment, and this configuration works like a charm.

Sharing Models, Views, and Controllers Between ExtJS 4 Applications

Right now, I’m working on a legacy web application that is made up of multiple screens, each one performing a separate function. I’m in the process of converting several of the screens to EXTJS 4 using the MVC approach. In order to isolate the impact of my changes and because we don’t have time to convert the entire app at once, I’ve converted two of the screens into two separate EXTJS 4 apps. Each screen now has its own folder in which I’ve set up an app using the appropriate file structure and app.js file.
My question is this: as I continue developing, I may want to use models from one app (screen) in another app. How do you share models, views and controllers between applications? What’s the best approach?
FYI, I’m using autoloading to pull everything in.
I would not use autoload in production, because it generates too many HTTP requests to get all the files, which slows down page loading. This is well documented in Google's Page Speed and Yahoo's Best Practices for Speeding Up Your Web Site.
The best practice is to preprocess the resources when the application is deployed and generate a single JavaScript file with everything in it, which is sent in a single (GZIP) compressed response. There are several tools for this job, but the choice depends heavily on your toolchain. You can, for example, have a look at the SO question Best JavaScript compressor to get recommendations for various compressors (I use Jammit).
When you have a flexible configurable JavaScript compressor in your toolchain, you can set up a shared folder where you have your common files, like model, stores and some libs. These are now included in the builds for the different projects.
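As a rough illustration of the shared-folder build, the sketch below concatenates the files from a common folder and from one app's folder into a single file. It is a stand-in for a real compressor such as Jammit, not a replacement: it does no minification and no dependency resolution, and the folder layout and the `build_bundle` helper are assumptions made for this example.

```python
from pathlib import Path
from typing import Iterable


def build_bundle(source_dirs: Iterable[Path], output: Path) -> int:
    """Concatenate every .js file found under source_dirs into one file.

    Directories are processed in the order given, and files are sorted by
    path within each directory, so a shared folder listed first ends up
    ahead of the app code that uses it. Returns the number of files bundled.
    Keep the output file outside the source directories, otherwise a later
    run would bundle the previous build as well.
    """
    parts = []
    for directory in source_dirs:
        for js_file in sorted(directory.rglob("*.js")):
            body = js_file.read_text(encoding="utf-8")
            parts.append("// ---- " + js_file.name + " ----\n" + body)
    # A lone ';' between files guards against a missing trailing semicolon.
    output.write_text("\n;\n".join(parts), encoding="utf-8")
    return len(parts)
```

Each screen would then get its own build, for example `build_bundle([Path("shared"), Path("screen1/app")], Path("build/screen1.js"))`, with the same shared folder listed first every time.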
If you have a good reason to serve individual JavaScript files, you can use a good version control system like git and make use of submodules. With this approach you'll have a separate repository for the common files. The downsides are slower page loads and a little overhead from updating the submodules.
As a last solution, you can use a symbolic link on the file system to link the common folder into each of the other projects.
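The symbolic-link option can be scripted so that every app folder gets the same link. This is a minimal sketch; the folder names and the `link_shared_folder` helper are hypothetical, and on Windows creating symbolic links may require extra privileges.

```python
import os
from pathlib import Path


def link_shared_folder(shared: Path, project: Path, link_name: str = "common") -> Path:
    """Create project/<link_name> as a symbolic link to the shared folder."""
    link = project / link_name
    if link.is_symlink():
        link.unlink()  # replace a stale link so the function can be re-run
    os.symlink(shared.resolve(), link, target_is_directory=True)
    return link
```

For example, `link_shared_folder(Path("shared"), Path("screen1"))` makes the shared files reachable as `screen1/common/...`, so the autoloader in each app sees them as if they lived inside that app.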
Here's what Saki said to me on the Sencha Forums:
The multiple applications on one page, or sub-applications of Ext MVC
are not supported yet, however, developers are working on this
functionality, AFAIK. Such implementation would most likely solve also
the problem of re-using models, views and controllers among (sub)
applications, I hope.
More specifically regarding linking multiple applications:
I would just soft-link files of MVC components in this case. There's
no logical or functional connection among them now, only I wanna reuse
already written file, right?

Libraries/Tools for Website Parsing

I would like to start working with parsing large numbers of raw HTML pages into semantic data structures.
Just interested in the community opinion on various available tools for such a task, particularly various useful libraries in any language.
So far, planning on using Hadoop to manage a lot of the processing, but curious about alternatives.
First you need to download your page source and then create a DOM tree.
If you are coding in C#, you can use the following tools to create your DOM tree:
The first one is easy to use, but the second one is much faster and more memory friendly, so I suggest using the second one if you want to build a robust application.
Then you can extract useful content from the web page using:
You can find many other articles on extracting content from web pages by Googling "extract main content from web page".
Hope it helps
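The "page source to DOM tree" step described above can be sketched without any third-party library. The example below uses Python's standard `html.parser` rather than the C# tools mentioned in the answer; the `Node` and `TreeBuilder` classes and the sample page are assumptions made for illustration, and a production parser would handle malformed markup far more carefully.

```python
from html.parser import HTMLParser
from typing import List, Optional

# Elements that never have a closing tag, so they must not stay "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    def __init__(self, tag: str, attrs=None, parent: Optional["Node"] = None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children: List["Node"] = []
        self.text_parts: List[str] = []

    def find_all(self, tag: str) -> List["Node"]:
        """Return every descendant with the given tag, in document order."""
        found = []
        for child in self.children:
            if child.tag == tag:
                found.append(child)
            found.extend(child.find_all(tag))
        return found

    def text(self) -> str:
        """Text directly inside this element (child elements excluded)."""
        return " ".join(self.text_parts)


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag; ignore strays.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.text_parts.append(data.strip())


def parse_html(source: str) -> Node:
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


page = ("<html><body><h1>Title</h1><p>First <a href='/x'>link</a></p>"
        "<br><p>Second</p></body></html>")
tree = parse_html(page)
paragraphs = [p.text() for p in tree.find_all("p")]
links = [a.attrs["href"] for a in tree.find_all("a")]
```

Once the tree exists, content extraction becomes a matter of walking it, as `find_all` does here for paragraphs and links.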

Visual Studio Solution Shared Data Source for Report Projects?

Simple question... I have a VS 2005 solution that encompasses several reporting services projects. Currently, each project has its own shared data source, which makes changing the database target very tedious.
Is there a way to share the data source across the entire solution (i.e. all the projects in the solution will use the data source defined in one place?).
I thought I could create a project that just held one data source item and then make all of the other projects dependent upon that one; however, the shared data source in the new project does not appear in the other projects for me to select.
Help! I have looked around the web for info, but not much available. There must be a simple solution to this.
I am sorry I somehow overlooked your question when I posted the same.
Nonetheless, a technique I am using is described in an answer to it. It feels a little shady and underhanded but seems to be working so far:
1. Make a new report project to hold your shared data source. I called mine Data Source.
2. Copy your shared data source (let's pretend it's called My Shared Data Source) to that new project.
3. If necessary, copy My Shared Data Source to each actual report project and link things up the way you want. But probably you're already set up like this.
4. Close Visual Studio to make sure all changes are saved in the filesystem and to make sure it doesn't end up clobbering some of our next, "backstage" edits.
5. In plain old Windows Explorer (or whatever), delete the My Shared Data Source.rds file from every project folder except Data Source's.
6. Using a text editor or XML-file editor, edit each project's .rptproj file to change the text of the Project.DataSources.ProjectItem.FullPath element from My Shared Data Source.rds to ..\Data Source\My Shared Data Source.rds.
Now each project still has its own reference to a data source, but all those references happen to point to the same underlying physical file, and thus they all share one data source specification.
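With many projects, the .rptproj edit described above could be scripted instead of done by hand. The sketch below is only an illustration: it assumes the element nesting named in the answer (Project, DataSources, ProjectItem, FullPath) with no XML namespace, and the trimmed sample project and the `repoint_data_source` helper are hypothetical. ElementTree may not preserve the original declaration and formatting, so work on a backup copy.

```python
import xml.etree.ElementTree as ET


def repoint_data_source(rptproj_xml: str, rds_name: str, shared_dir: str) -> str:
    """Point the FullPath of the named data source at the shared project folder."""
    root = ET.fromstring(rptproj_xml)
    for item in root.findall("./DataSources/ProjectItem"):
        full_path = item.find("FullPath")
        # Only touch entries that still reference the local copy of the file.
        if full_path is not None and full_path.text == rds_name:
            full_path.text = shared_dir + "\\" + rds_name
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed-down project file; real files contain more elements.
sample = """<Project>
  <DataSources>
    <ProjectItem>
      <Name>My Shared Data Source.rds</Name>
      <FullPath>My Shared Data Source.rds</FullPath>
    </ProjectItem>
  </DataSources>
</Project>"""

updated = repoint_data_source(sample, "My Shared Data Source.rds", "..\\Data Source")
```

Only the FullPath text changes; the Name element keeps the plain file name, which matches the manual edit in the steps above.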
According to this post by Paul Turley, it appears as if this is not possible. You'll have to copy the data source into each project. The good news is that if you deploy them to the same location, only one data source should exist on the server.
This may not be what you're thinking, but when I'm writing an app consisting of several distinct applications accessing the same data, I usually take one of two approaches:
1. Write all of my data access logic into a Class Library project and reference it from the other projects.
2. Write my data access logic into a Web Service library and add a web reference.
I usually go for option 2 if the data I am accessing is likely to be used in future development, such as accessing company-wide customer lists, etc.
