Magento customer/order transferring

Most of you coders most likely already have the habit of working on different platforms (development, staging, production). The company I work for also has these platforms, with a Magento Enterprise Edition (v1.9.0.0) instance deployed.
About two months ago our team took a database backup of the production version to start working from (a rather large development project concerning content, i.e. product images, descriptions and so on, plus automatic product loading).
Currently all the modifications have been deployed on the staging platform, which therefore only contains information for orders placed two months ago at the latest.
After buying a (badly coded and bug-ridden) extension for exporting and importing orders (including order information, quotes, shipping info and customer info) which does not work properly, I decided to simply copy the following tables from the production site:
All tables starting with customer_
All tables starting with s_
All tables starting with sales_
I imported them on my development platform (just to try it out) and it works! :O
All order, shipping, credit-memo and customer information is maintained and seems to be fully working and correct.
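For reference, the copy itself was nothing fancier than statement pairs of the following kind (a minimal sketch, assuming both databases live on the same MySQL server; prod_db and dev_db are placeholder schema names, and the pair is repeated, or generated from information_schema, for every customer_, s_ and sales_ table):

-- Create an empty copy of the table structure, then copy the rows.
CREATE TABLE dev_db.customer_entity LIKE prod_db.customer_entity;
INSERT INTO dev_db.customer_entity SELECT * FROM prod_db.customer_entity;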
Here comes the actual question:
Is there any chance of a conflict with something order/customer-related in the future by doing this? As far as I know, orders only carry relations to customers and customer addresses, and not to actual products (at least I think they are linked by SKU rather than by product entity_id, which is how most things in Magento are linked).
This is supported by the fact that if you remove all products from your Magento instance, all the order and customer information is maintained and fully working.
Edit: This actually worked ;)

This is probably a bit late, but I have come across this situation several times recently. This is an excellent question, and it's important to be aware of the issue to avoid future problems with the customer/sales relationship.
At the time of writing the most current Magento version is 1.7.0.2, and yes, working with separate production and development sites is common, so transferring new sales and customers during development is an important step to take. It shouldn't require extensions if one has even a little experience with the DB, so here it goes:
It is correct that transferring the customer_ and sales_ tables will transfer all the data correctly and safely.
Thereafter, the only thing one MUST update is the increment_last_id column of the eav_entity_store table, for each row.
This last step ensures that new orders, invoices, shipments and credit memos do not collide with the increment IDs already used by the transferred documents, and that new increment IDs continue from where the transferred orders left off.
It may be a bit confusing, but it's a very easy step. This article explains it in more detail, in case you need it.
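As an illustration only (my own sketch, not code from the article; the entity types, store IDs and the value have to be checked against your own installation, and '100000999' stands for the highest increment_id found in the copied sales tables):

-- Bump the order counter for store 1 so new increment IDs continue
-- after the transferred orders; repeat for invoice, shipment and creditmemo.
UPDATE eav_entity_store AS s
JOIN eav_entity_type AS t ON t.entity_type_id = s.entity_type_id
SET s.increment_last_id = '100000999'
WHERE t.entity_type_code = 'order'
  AND s.store_id = 1;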

There's this script:
https://github.com/iateadonut/magento_copy_customer
It grabs a customer and all his orders through the single command:
$mg->copy_customer(1234);
where 1234 is the customer_entity.entity_id. You can take a look at the source code to see how the table constraints were queried to make sure all related rows were grabbed.
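The general idea behind following those constraints can be illustrated with a query like this (my own sketch, not code taken from the repository), which lists every column holding a foreign key that points at customer_entity:

-- Tables/columns whose foreign keys reference customer_entity,
-- i.e. the rows that have to be copied along with the customer.
SELECT table_name, column_name, referenced_table_name, referenced_column_name
FROM information_schema.key_column_usage
WHERE referenced_table_name = 'customer_entity'
  AND table_schema = DATABASE();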

You are approaching this the wrong way around. If you have EE, then it bundles content staging procedures, and you should use those for your content changes.
And yes, it most certainly can cause issues: relations to other order-related content such as sent invoices, and all object attributes, might simply get new entity_ids, and this will eventually end in a mess somewhere down the road.
If you add attribute sets and attributes to a large installation, it's always recommended to implement them as extension setup routines, so you can move your codebase and all changes are automatically populated to whatever database you might connect to in the future.

Related

How to handle multiple customers with different SQL databases

Summary
I have a project with multiple existing MSSQL databases. I have already created an Azure Analysis Services instance and deployed my first Tabular Cube to it, and I have tested access to the Analysis Services instance, which worked perfectly.
Eventually I have to duplicate the setup described above for ~90 databases (90 different customers).
I'm unsure how to organize this project and I'm not sure about the possibilities I have.
What I did
I have already browsed the Internet for information, but I only found a single source where somebody asked a similar question. The first reply there is what I was already thinking about, as described below.
I don't really understand the last reply: what does he mean by one solution? Is there another hierarchy level above the project?
Question
One possibility would be to import each database as a source in the same project, but I think this means I have to import each table from every source, which comes to 5 * 90 = 450 tables; I think this would quickly get out of control.
I also thought about duplicating the whole Visual Studio project folder ~90 times, once per customer, but at the moment I can't track down all the references where the name has to be changed; I don't think this would be too hard, though.
Is there an easier way to achieve my goal? Especially regarding maintainability.
Solution
I will create a completely new database with all the needed tables. Into those tables I copy the data from all customer databases, adding a new customerId column. I will transfer the data with a recurring job (periodicity still to be defined) and handle updates to already existing rows in the customer databases with a trigger.
For this the best approach would be to create a staging database and import the data from the other databases, so your Tabular Model can read the data from it.
Doing 90+ databases is going to be a massive admin overhead, and getting the cube to load them effectively is going to be problematic. Move the data using SSIS/Data Factory, as you'll be able to better orchestrate the data movement and incremental loads that way. Then, if you need to add/remove/update data sources, it is not done in the cube; it's all done at the database/Data Factory level.
Just use one database for all the customers and differentiate each customer with a customer_id column.
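A minimal sketch of what such a consolidated table could look like (table and column names are made up for illustration; in practice there is one such table per source table, loaded by the SSIS/Data Factory job):

-- One staging table holds the rows of every customer database,
-- tagged with the customer they came from.
CREATE TABLE staging_sales_order (
    customer_id  INT            NOT NULL,  -- identifies the source customer database
    order_id     INT            NOT NULL,  -- key from the source system
    order_date   DATE           NOT NULL,
    amount       DECIMAL(18,2)  NOT NULL,
    PRIMARY KEY (customer_id, order_id)    -- the same source key may repeat across customers
);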

Multi tenancy with tenant sharing data

I'm currently in the process of making a web app that sells subscriptions as a multi-tenant app. The tech I'm using is Rails.
However, it will not just be isolated tenants using the app.
Each tenant creates products and publishes them on their personal instance of the app. Each tenant has its own user base.
The problematic specification is that a tenant may share its products with other tenants, so they can resell them.
Explanation:
FruitShop sells apples, oranges and tomatoes.
VegetableShop sells radishes and bell peppers.
FruitShop shares tomatoes with other shops.
VegetableShop decides to get tomatoes from the available list of shared items and adds them to its inventory.
Now a customer browsing VegetableShop will see radishes, bell peppers and tomatoes.
As you can guess, a plain select products where tenant_id = 'vegetableshop_id' will not work.
I was thinking of a many-to-many relation with some kind of tenant_to_product table that would have tenant_id, product_id, price_id and even publication begin/end dates. Products would then be a "half-tenanted" table where the tenant ID is replaced by a tenant_creator_id, to know who the original owner is.
To me it seems cumbersome: adding it would mean complex queries, even for shops selling only their own products. Getting the sold products would be complicated:
select tenant_to_products.*
from tenant_to_products
where tenant_to_products.tenant_id = 'current tenant'
and (tenant_to_products.product matches the publication constraints)

for each tenant_to_product do
    # this will trigger a lot of DB calls
    display tenant_to_product.product with tenant_to_product.price
Un-sharing a product would also mean a complex update, modifying all tenant_to_products rows that reference the original product.
I'm not sure it would be a good idea to implement the constraint like this. What do you suggest I do? Am I planning to do something stupid, or is it a not-so-bad idea?
You are going to need a more complicated subscription to product mechanism, as you have already worked out. It sounds like you are on the right track.
Abstract the information as much as possible. For example, don't call the table 'tenant_to_product'; instead call it 'tenant_relationships', and have the product ID as a column in this table.
Then, when a tenant wants to offer services, you can simply add a 'service ID' column to this table without having to add a whole extra table.
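A minimal sketch of what that table could look like (column names are illustrative, not prescriptive):

-- Generic relationship table: one row per tenant/item pairing.
CREATE TABLE tenant_relationships (
    id             INT PRIMARY KEY,
    tenant_id      INT NOT NULL,   -- the shop offering the item
    product_id     INT,            -- set when the relationship points at a product
    service_id     INT,            -- added later if services are introduced
    price_id       INT,            -- tenant-specific price
    publish_start  DATE,           -- optional publication window
    publish_end    DATE,
    active         BOOLEAN NOT NULL DEFAULT TRUE
);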
For performance, you can have a read-only database server with tenant relationships that is updated on a slight delay. Azure or similar cloud services would make this easy to spin up. However, that probably isn't needed unless you're in the order of 1 million+ users.
I would suggest you consider:
Active/Inactive (VegetableShop may prefer to temporarily stop selling tomatoes, as they are quite faulty at the moment, until the grower stops including bugs with them).
Server-side notification services, such as a 'productRemoved' service. These services can batch up changes, providing faster feedback to the user.
Don't delete information; instead set columns such as 'delete_date' and 'delete_user_id'.
Full auditing history of changes to products, tenants, relationships, etc. This table will grow quite large, so avoid reading from it and ensure updates are asynchronous so that the caller isn't blocked waiting for the table to update. But it will probably be very useful from a business perspective.
EDIT:
This related question may be useful if you haven't already seen it: How to create a multi-tenant database with shared table structures?
Multi-tenancy does seem the obvious answer, as you are providing your system to multiple clients.
However, as an alternative, perhaps consider a reseller 'service layer'; this would enable a degree of isolation whilst still offering integration, taking inspiration from how reseller accounts work with third parties like Amazon.
I realise this is very abstract reasoning but perhaps considering the integration at a higher tier than the data layer could be of benefit.
From experience of strictly enforcing multi-tenancy at the data layer, we have found that tenancy sometimes has to be manipulated at the business layer (like your reseller idea) to such a degree that tenancy becomes a tricky concept. So considering alternatives early on could help.

Handling passive deletion updates (i.e. archiving instead of deleting)

We are developing an application based on DDD principles. We have encountered a couple of problems so far that we can't answer nor can we find the answers on the Internet.
Our application is intended to be a cloud application for multiple companies.
One of the demands is that there are no physical deletions from the database; we only do passive deletion, by setting the Active property of an entity to false. That takes care of Select, Insert and Delete operations, but we don't know how to handle Update operations.
An update means changing property values, but it also means the past values are lost, and there are many reasons we don't want that. One of the primary reasons is accounting.
If we implement every update as "archive the old values" and then "create the new values", we will accumulate a great number of duplicates. For example, a Company has Branches, and Company is the aggregate root for Branches. If I change a Company's phone number, I would have to archive the old Company and all of its Branches and create a completely new Company with Branches, just for one property. This may look acceptable at first, but over time these copies can clog up the database. A phone number is perhaps a trivial property, but changing the Address (say the street is renamed while the company stays in the same physical location) is a far more serious case.
Currently we are using ASP.NET MVC with EF Code First for the repository, but one of the demands is that we can easily switch to, or add, another technology like WPF or WCF. We use AutoMapper to map DTOs to domain entities and vice versa, and the DTOs are the primary source for views, i.e. we have no view models. The application is layered according to DDD principles, and mapping occurs in the service layer.
Another demand is that we mustn't create an initial entity in the database and then fill in the values; an entire aggregate should be stored as a whole.
Any comments or suggestions are appreciated.
We also welcome changes to the demands (as this is an internal project, not one for a customer) and to the architecture, but only if absolutely necessary.
Thank you.
Have you ever come across event sourcing? Sounds like it could be of use if you're interested in tracking the complete history of aggregates.
To be honest, I would create another table that serves as a change log, inserting the old and deleted records into it before updating the live data. Yes, you are creating a lot of records, but you are keeping this data separate from the live records and keeping the live data as lean as possible.
Also, when it comes to clean-up and backup, you have your live data and your changed/deleted data separately, so you can routinely back up and trim the old changed/deleted data and reduce its size, depending on how long you have agreed to keep changed/deleted data available with the supplier or business you are working with.
I think this would be the best way to go, as your core functionality will be working on a leaner dataset, and I'm assuming your users won't want to check revisions and deletions of records all the time. By separating the data you access it only when it is needed, instead of all the time because everything is intermingled.
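A minimal sketch of the shape such a change-log table could take (names and the snapshot format are assumptions, not a prescription):

-- Change log kept separate from the live tables.
CREATE TABLE company_change_log (
    log_id       INT PRIMARY KEY,        -- normally auto-generated
    company_id   INT NOT NULL,
    change_type  VARCHAR(10) NOT NULL,   -- 'UPDATE' or 'DELETE'
    old_values   VARCHAR(4000) NOT NULL, -- snapshot of the row before the change, e.g. as JSON
    changed_by   INT NOT NULL,
    changed_at   DATETIME NOT NULL
);

-- Archive the current row, then apply the change to the live table.
INSERT INTO company_change_log (log_id, company_id, change_type, old_values, changed_by, changed_at)
SELECT 1, company_id, 'UPDATE', CONCAT('phone=', phone), 42, CURRENT_TIMESTAMP
FROM company WHERE company_id = 10;

UPDATE company SET phone = '555-0100' WHERE company_id = 10;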

Magento - Migrate products by copying all database tables with catalog_ prefix

In order to migrate only the products and categories, I manually copied all the database tables with the catalog_ prefix from one DB to another, and it seems to have worked rather well... so far.
But does anyone know if there is anything potentially bad in doing this?
It might be bad if you have custom EAV attributes. Also, even core EAV attribute IDs can mismatch between different Magento instances (if you installed different Magento versions).
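One way to check for such mismatches (a sketch only; source_db and target_db are placeholder schema names for the two instances on the same server):

-- Attributes whose IDs differ between the two instances; copied catalog_
-- rows reference attribute_id, so any hit here is a potential problem.
SELECT s.attribute_code,
       s.attribute_id AS source_id,
       t.attribute_id AS target_id
FROM source_db.eav_attribute s
JOIN source_db.eav_entity_type se ON se.entity_type_id = s.entity_type_id
JOIN target_db.eav_entity_type te ON te.entity_type_code = se.entity_type_code
JOIN target_db.eav_attribute t ON t.attribute_code = s.attribute_code
                              AND t.entity_type_id = te.entity_type_id
WHERE s.attribute_id <> t.attribute_id;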
Time will tell. The tables in Magento are pretty much all relational, so if you've missed something with a foreign-key dependency, you're bound to run into issues.
What about your custom attributes, attribute sets, historic orders that relate to a certain entity ID, and so on?
You would be better off exporting and re-importing your catalogue for a "cleaner" approach, although it will take some time if you have a large catalogue (100k+ products).
Have a look at Unirgy RapidFlow; it supports the features you're looking for, and we recommend it to a lot of clients as a drop-in replacement for Dataflow.
Thanks for the answers, guys.
In case anyone is thinking of trying this, some issues did creep in. When creating new products through the admin, I suddenly found I couldn't get them to show up in the front-end.
Also, (this may or may not have been related) I noticed the image upload buttons seemed to have vanished in the Add Product screen.
In the end the paranoia got too much and I was attributing every glitch to the potentially ropey db migration. I scrapped it and took a totally different approach.

Strategy for updating data in databases (Oracle)

We have a product using Oracle, with about 5000 objects in the database (tables and packages). The product is divided into two parts: the first is the hard part, i.e. the client, packages and database schema; the second consists basically of soft data representing processes (workflows) that can be configured to run on our product.
The basic processes (workflows) are delivered as part of the product, and our customers can change these processes and adapt them to their needs. The problem arises when upgrading to a newer version of the product: when we try to update the data in the database records, there are problems with records that were deleted or modified by our customers.
Is there a strategy to handle this problem?
It is common for a software product to be composed not just of client and schema objects, but of data as well; typically this is called "static data", i.e. data that should only be modified by the software developer and is usually not modifiable by end users.
If the end users bypass your security controls and modify/delete the static data, then you need to either:
write code that detects, and compensates for, any modifications the end user may have done; e.g. wipe the tables and repopulate with "known good" data;
get samples of modifications from your customers so you can hand-code customised update scripts for them, without affecting their customisations; or
don't allow modifications of static data (i.e. if they customise the product by changing data they shouldn't, you say "sorry, you modified the product, we don't support you").
From your description, however, it looks like your product is designed to allow customers to customise it by changing data in these tables; in which case, your code just needs to be able to adapt to whatever changes they may have made. That needs to be a fundamental consideration in the design of the upgrade. The strategy is to enumerate all the types of changes that users may have made (or are likely to have made), and cater for them. The only viable alternative is #1 above, which removes all customisations.
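As an illustration of how the upgrade could first detect what a customer has changed before deciding how to cater for it, a comparison of the shipped defaults against the installed data could look roughly like this (shipped_workflow_defaults and customer_workflow are assumed table names, not part of the product described above):

-- Classify each shipped workflow row before the upgrade touches it.
SELECT d.workflow_id,
       CASE
         WHEN c.workflow_id IS NULL        THEN 'DELETED BY CUSTOMER'
         WHEN c.definition <> d.definition THEN 'MODIFIED BY CUSTOMER'
         ELSE 'UNCHANGED'
       END AS status
FROM shipped_workflow_defaults d
LEFT JOIN customer_workflow c ON c.workflow_id = d.workflow_id;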
