Migration

Migration Workflow

Migrate SQL Jobs to Hive SQL

SQL jobs, and the procedures behind them, have to be rewritten in Hive SQL if the same tasks are to be performed in the Big Data ecosystem. Plain SQL statements are rewritten in Hive SQL, while SQL procedures are converted into Hive UDFs. Many other tasks have to be performed to make such a migration feasible and working.
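As a minimal sketch, the snippet below shows a typical relational SQL job step rewritten in Hive SQL; the table and column names (staging.orders, report.daily_sales, amount, order_date) are illustrative assumptions, not taken from any specific job.

```sql
-- Original relational SQL job step (e.g., T-SQL), shown for reference:
--   TRUNCATE TABLE report.daily_sales;
--   INSERT INTO report.daily_sales
--   SELECT order_date, SUM(amount) FROM staging.orders GROUP BY order_date;

-- Hive SQL rewrite: INSERT OVERWRITE replaces TRUNCATE + INSERT and writes
-- each day into its own partition of the target Hive table.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE report.daily_sales PARTITION (order_date)
SELECT SUM(amount) AS total_amount,
       order_date              -- partition column must come last in Hive
FROM   staging.orders
GROUP  BY order_date;
```

Procedural constructs in the original job (loops, variables, error handling) have no direct Hive SQL equivalent and are typically moved into Hive UDFs or an orchestration layer.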

Migrate User Metastore from one environment to another

Sometimes there is a requirement to migrate the user Metastore from one environment to another. The process starts with dumping the Metastore database. The dump is copied into the other environment and restored, and the Metastore database is then configured. Finally, new or impacted Hive engines are directed to use the migrated Metastore.
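As a rough sketch, assuming a MySQL-backed Metastore with the standard Hive schema, statements like the ones below can be run against the restored Metastore database to verify the dump and to rewrite HDFS locations for the new cluster; the NameNode URIs are hypothetical.

```sql
-- Sanity check: databases and tables arrived with the restored dump
-- (DBS, TBLS and SDS are standard Hive metastore tables).
SELECT NAME, DB_LOCATION_URI FROM DBS;
SELECT COUNT(*) AS table_count FROM TBLS;

-- If the target cluster uses a different NameNode or warehouse path,
-- rewrite the stored locations before pointing Hive engines at this metastore.
UPDATE DBS SET DB_LOCATION_URI =
       REPLACE(DB_LOCATION_URI, 'hdfs://old-nn:8020', 'hdfs://new-nn:8020');
UPDATE SDS SET LOCATION =
       REPLACE(LOCATION, 'hdfs://old-nn:8020', 'hdfs://new-nn:8020');
```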

Migrate Database from one environment to another

Migrating a database from one environment to another starts with creating a snapshot of its current state. The snapshot is dumped and exported, and the prerequisite configuration is created in the target environment. The exported snapshot is then downloaded and imported into the target environment. Exporting the data itself is optional; where it is required, the data is exported and loaded into the tables of the newly created database in the target environment.
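When the database being moved is a Hive database, one way to perform the export and import is with Hive's own EXPORT and IMPORT statements, sketched below; the database, table, and HDFS paths are illustrative.

```sql
-- In the source environment: export table data and metadata to an HDFS path.
EXPORT TABLE sales.orders TO '/tmp/export/sales_orders';

-- Copy the export directory to the target cluster (e.g., with distcp),
-- then in the target environment recreate the database and import the table.
CREATE DATABASE IF NOT EXISTS sales;
USE sales;
IMPORT TABLE orders FROM '/tmp/export/sales_orders';
```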

Upgrade Database to higher versions

Upgrading a database to a higher version starts with creating a snapshot and a read replica. All connections pointing to the database are routed to the read replica, so there is no disruption while the current version remains in use. Once these tasks are complete, the database is upgraded. The upgrade process depends on the source of the binaries: if the database is provided through a managed service such as AWS, moving to a higher version is simply a matter of selecting it; otherwise, the higher-version binaries have to be downloaded and installed, following the vendor's steps, on the supported infrastructure (Linux, Windows, or a specific distribution such as Ubuntu).

Once the move to the higher version is complete, the upgraded database is configured as the system and applications require. Before cutover there is a critical need to check the compatibility and feasibility of the applications that must keep working with the upgraded database. A range of versions should be supported for consistent and seamless operation of the applications and the system; this range falls within N-2 to N+2 if the current version is N. Once the upgraded database works correctly with the applications during testing, all connections are routed to the new database running the upgraded version.
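The checks below are a minimal pre- and post-cutover sketch, assuming a MySQL-compatible database; equivalent statements exist for other engines, and the application table name is hypothetical.

```sql
-- Before cutover: confirm the read replica has caught up with the primary.
SHOW REPLICA STATUS;   -- MySQL 8.0.22+; older versions use SHOW SLAVE STATUS

-- After the upgrade: confirm the server reports the expected version
-- and run a representative application query as a smoke test.
SELECT VERSION();
SELECT COUNT(*)
FROM   app.orders
WHERE  order_date >= CURRENT_DATE - INTERVAL 1 DAY;
```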

Upgrade engines (Hadoop, Hive, Spark, Airflow, or others) to higher versions

Upgrading Big Data engines to a higher version is a complex task in an organization, completed in coordination with the different dependent projects for applications and infrastructure. The work starts with checking the compatibility and feasibility of the applications that must keep working with the upgraded engines. The upgrade itself begins with taking a backup of the existing setup, or creating a parallel project and repository. The newer version of the engine binaries is downloaded, installed, and configured as per the current system and application requirements. Once the new version is tested and found to be working, it is made available through the usual deployment and delivery methodologies, keeping in mind that the older version will remain alongside the new one. Again, a range of versions should be supported for consistent and seamless working of the applications and system; this range falls within N-2 to N+2 if the current version is N, and versions beyond it are generally deprecated.
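For a Hive engine upgrade specifically, a simple smoke test of the kind sketched below can be run on the new version while the old one still serves production; the query and table names are illustrative.

```sql
-- Confirm which build and execution engine the new deployment is running.
SELECT VERSION();            -- built-in UDF available in Hive 2.1+
SET hive.execution.engine;   -- prints the configured engine (mr, tez, or spark)

-- Re-run a representative production query and compare the result with the
-- figures produced by the old engine before routing jobs to the new version.
SELECT order_date, COUNT(*) AS order_count
FROM   sales.orders
WHERE  order_date >= '2024-01-01'
GROUP  BY order_date;
```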
