There are several techniques that Premier International utilizes on every implementation to reduce data migration risk. One technique critical to the success of Oracle, SAP, Workday, and other complex implementations is a central data repository. We have been leveraging this technique and optimizing our software’s data migration repository as one of the key tenets of our data migration process for many years. The following eight benefits highlight the importance of the repository and are routinely realized on our implementations.

1. One Location = Improved Analysis

One of the biggest benefits of having a central data repository is that it places the entire data landscape in one location. With the data in a single place, analysis reports that cover each individual source as well as span multiple sources can be more easily implemented. These reports provide a 360-degree view of the data landscape and identify conditions that cause issues such as duplicates, missing/invalid values, and other integrity problems. Without a data repository, cross-system analysis is difficult, leaving many issues undiscovered and resulting in bad, duplicate, or orphan data ending up in the target system.
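As a minimal sketch of this idea, the snippet below combines two hypothetical source extracts (plain Python lists standing in for repository tables) so that duplicates and missing values can be found across systems, not just within one. All source names, fields, and values here are illustrative, not taken from any actual implementation.

```python
# Two hypothetical source extracts loaded into one repository structure.
erp_suppliers = [
    {"source": "ERP", "id": "S100", "tax_id": "11-111", "name": "Acme Co"},
    {"source": "ERP", "id": "S101", "tax_id": None,     "name": "Beta LLC"},
]
crm_suppliers = [
    {"source": "CRM", "id": "C7",   "tax_id": "11-111", "name": "Acme Corp"},
]

repository = erp_suppliers + crm_suppliers  # one location for all sources

# Cross-system duplicate check: the same tax_id appearing more than once.
by_tax_id = {}
for rec in repository:
    if rec["tax_id"]:
        by_tax_id.setdefault(rec["tax_id"], []).append(rec)
duplicates = {k: v for k, v in by_tax_id.items() if len(v) > 1}

# Missing-value check across the whole landscape.
missing_tax_id = [r for r in repository if not r["tax_id"]]

print(sorted(duplicates))       # ['11-111']  (same supplier in ERP and CRM)
print([r["id"] for r in missing_tax_id])  # ['S101']
```

The same checks run against live systems would require queries in each environment plus a way to join the results; with everything in one repository, a single pass surfaces both classes of issue.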

2. No Impact to Production

The ability to create and run analysis reports across the entire landscape adds an incredible amount of value to the project. Without the data repository, analysis would need to be performed on test or production environments. If the analysis is done in test environments, the data will be out of date and out of sync, and if there is a data cleansing effort, monitoring the status of data quality issues becomes difficult. If the analysis is done on production, impact to ongoing processing is a concern and activities must take place within limited windows. By extracting data as needed, the repository sidesteps both problems, as extraction activities have minimal impact on production databases.

3. Predict Conversion Results

Predicting conversion results without actually populating the new application is critical for knowing the data is go-live ready before cut-over. A centralized data repository facilitates this effort. With a repository, it is possible to import configuration data, table structures, and metadata from the target system and run mini conversion tests that ensure there are no missing cross-reference values, setups, or other conversion errors. This predictability becomes even more important in Cloud implementations, where control of the environments is limited and the ability to restore backups or back out data is difficult or impossible. Since these mini test cycles are a frequent occurrence, issues are identified early and pulled forward in the project. By the time of an official test cycle or go-live, the project team has high confidence in the data, as it has already been run through the wringer.
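The kind of mini conversion test described above can be sketched as a simple check of legacy values against target configuration that has been imported into the repository. Everything in this example (the payment-terms field, the cross-reference map, the values) is hypothetical and only illustrates the pattern.

```python
# Target configuration imported into the repository: valid values in the
# new system, plus the legacy-to-target cross-reference map.
target_payment_terms = {"NET30", "NET60", "IMMEDIATE"}
xref = {"N30": "NET30", "N60": "NET60"}  # legacy code -> target value

legacy_invoices = [
    {"id": 1, "terms": "N30"},
    {"id": 2, "terms": "N90"},   # no cross-reference defined yet
]

# Mini conversion test: flag records that would fail before any load runs.
errors = []
for inv in legacy_invoices:
    mapped = xref.get(inv["terms"])
    if mapped is None:
        errors.append((inv["id"], "missing cross reference", inv["terms"]))
    elif mapped not in target_payment_terms:
        errors.append((inv["id"], "invalid target value", mapped))

print(errors)  # [(2, 'missing cross reference', 'N90')]
```

Run repeatedly as configuration and cross-references evolve, a check like this pulls load failures forward to the point where they are cheapest to fix.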

4. Clean Independent of Production

Data quality is the number one complaint on projects, and meeting the data cleansing requirements means incorporating additional data sources and cleansing spreadsheets into the process. Often these sources are filled out by the business to override data that needs to be cleansed. A data repository provides a method to validate these additional cleansing sources and incorporate them into the data conversion and cleansing processes.

5. Consistently Transform and Enrich Across Multiple Data Sources

The legacy and target applications were developed by different companies, with different business requirements, in different eras of technology. A byproduct of this is that the new system will need data that doesn’t currently exist in order to function, and that new data has to come from somewhere. Sometimes it can be calculated by the conversion programs. Perhaps a county code is needed in the new system but doesn’t exist in the legacy systems; the conversion programs can derive the county value during their transformations.

However, a field like a primary contact indicator might not exist anywhere, yet be something the business wants to utilize going forward. The easiest way to handle this request is for the business to fill out a spreadsheet indicating who the primary contact should be for each supplier. This spreadsheet can then be imported into the data repository and utilized by the data conversion programs during the migration.
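A minimal sketch of that enrichment step follows. The business spreadsheet is represented here as a simple dict (in practice it might be loaded with `csv.DictReader` or similar); the supplier and contact names are purely illustrative.

```python
# Legacy contact records extracted into the repository; the legacy system
# has no notion of a "primary" contact.
supplier_contacts = [
    {"supplier": "S100", "contact": "Ann"},
    {"supplier": "S100", "contact": "Bob"},
]

# Rows from the business-maintained spreadsheet, imported into the
# repository: which contact should be primary for each supplier.
primary_overrides = {("S100", "Ann"): True}

# The conversion program applies the enrichment consistently to every row.
for row in supplier_contacts:
    key = (row["supplier"], row["contact"])
    row["primary"] = primary_overrides.get(key, False)

print([(r["contact"], r["primary"]) for r in supplier_contacts])
# [('Ann', True), ('Bob', False)]
```

Because the spreadsheet lands in the repository first, it can also be validated there (for example, flagging supplier IDs that don’t exist in any source) before the conversion consumes it.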

6. Streamline Data Migration Testing

Data is constantly changing, and that constant state of change makes it difficult to validate and test data conversion programs. Since data in the repository is captured at a point in time, enhancements can be reviewed and tested without worrying about timing issues. Without a data repository, validating record counts and tracking down data issues becomes an exercise in futility, as those metrics change from one minute to the next.

7. Simplify Post Conversion Reconciliation

A critical piece of all data conversion projects is the post conversion reconciliation. There are several ways to approach the reconciliation and the repository accelerates the effort. After the data is migrated, it can be extracted into the repository alongside the original source data. Since the repository has the data at the exact point in time of the conversion, it is possible to definitively prove out the migration process without having to worry about post conversion updates.
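A reconciliation of this kind can be sketched as a comparison of record counts and control totals between the frozen source snapshot and the post-conversion target extract. The field names and amounts below are hypothetical, chosen only to show the shape of the check.

```python
# Source data frozen in the repository at the moment of conversion.
source_snapshot = [
    {"invoice": 1, "amount": 100.0},
    {"invoice": 2, "amount": 250.5},
]

# Data extracted back out of the target system after the load.
target_extract = [
    {"invoice": 1, "amount": 100.0},
    {"invoice": 2, "amount": 250.5},
]

def control_totals(rows):
    """Record count plus a summed control total over the amount field."""
    return (len(rows), round(sum(r["amount"] for r in rows), 2))

src, tgt = control_totals(source_snapshot), control_totals(target_extract)
result = "reconciled" if src == tgt else f"mismatch: {src} vs {tgt}"
print(result)  # reconciled
```

Because the snapshot is fixed, a mismatch here can only mean a conversion problem, never a post-conversion update that happened to land between the load and the check.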

8. Zero In on Exceptions

A powerful validation technique is the identification of exceptions: the differences that occur from test cycle to test cycle. A central data repository allows data to be captured at any point in time. Once a snapshot is in place, any piece of data that differs from another point in time can be identified. This brings focus only to the differences, so no time is spent re-validating or re-reviewing unchanged data. This exception-based reporting technique accelerates validation and spares a lot of the personal frustration of repeatedly checking the same data.
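The exception report described above can be sketched as a field-level diff between two point-in-time snapshots keyed by record ID. The snapshots below are hypothetical; for simplicity this sketch assumes the same record IDs exist in both cycles.

```python
# Two repository snapshots of the same supplier data, one per test cycle.
snapshot_cycle1 = {"S100": {"name": "Acme Co",  "terms": "NET30"},
                   "S101": {"name": "Beta LLC", "terms": "NET60"}}
snapshot_cycle2 = {"S100": {"name": "Acme Co",  "terms": "NET45"},  # changed
                   "S101": {"name": "Beta LLC", "terms": "NET60"}}  # unchanged

# Exception report: only records that changed, and within them only the
# fields that differ, shown as (old, new) pairs.
exceptions = {
    key: {f: (snapshot_cycle1[key][f], v)
          for f, v in row.items() if snapshot_cycle1[key][f] != v}
    for key, row in snapshot_cycle2.items()
    if snapshot_cycle1.get(key) != row
}

print(exceptions)  # {'S100': {'terms': ('NET30', 'NET45')}}
```

Reviewers then look only at S100’s changed payment terms; S101 never appears, so nothing unchanged is re-checked.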

There are many other techniques and processes involved in the data migration process that further reduce risk on implementations, but the central data repository is an important one. If you have any questions regarding data migration processes, issues, and ways to reduce data migration risk, email me at steve_novak@premierintl.com or call me at 773.549.6945.

For more information about data migration repositories, check out this post by my colleague Rachel Sweeney.