
Binoy Rappai

Senior Consultant
Dacula, GA

Summary

Accomplished IT professional with over 18 years of experience in the industry. Expertise in leading successful cloud migration projects and overseeing data governance and data lake initiatives. Proven track record in designing and implementing ETL solutions for complex data environments. Uses advanced ETL tools and techniques to optimize data processes and enhance system efficiency. Applies meticulous process control and rigorous quality checks to ensure data integrity and accuracy.

Overview

19 years of professional experience
5 years of post-secondary education
1 Certification

Work History

Senior ETL Consultant

Tek Systems, Bank of America
Atlanta, GA
04.2023 - 11.2024
  • Created runbooks and other supporting materials that provide instructions on how to use and maintain the data governance tools and processes
  • Performed ongoing monitoring of data governance policies and procedures to ensure they are followed and to make adjustments as needed
  • Provided big data and data warehouse consulting services to major financial, insurance, and automotive clients
  • Involved in architecting the data lake using the MapR Hadoop cluster
  • Designed high-level data flow charts as part of documenting the data lake
  • Implemented best practices to build and maintain the data lake for unstructured, semi-structured, and structured data
  • Worked on the ELT process using Informatica DEI
  • Worked on Informatica DEQ to perform data profiling on the raw data set in the new data lake and to estimate data usage and value
  • Implemented an archiving mechanism for Informatica DEI using GitHub
  • Performed data loading into the MapR Hadoop cluster using the Informatica Spark engine
  • Worked on Hive table performance tuning and HDFS
  • Designed and implemented an audit-balance framework for the ETL jobs
  • Worked on Informatica pushdown optimization for long-running ETL jobs
  • Gathered big data business requirements and produced ETL mapping documents and high-level documentation
  • Defined data modeling and naming standards and best practices for the modeling team as well as for the DDLs and DMLs
  • Hands-on experience in end-to-end warehouse implementation, including analysis, design, coding, and testing
  • Designed and coded application components in an Agile environment using a test-driven development approach
  • Implemented data profiling, created scorecards and reference tables, and documented data quality metrics/dimensions such as accuracy, completeness, duplication, validity, and consistency
  • Bank of America is implementing a data virtualization platform across its complex application systems
  • The platform lets business teams analyze data without ETL integration or database access requests
  • It also helps users trace data lineage quickly and explore other applications' data sets more easily
  • Created data documentation for the source systems connected to the data virtualization platform
  • Created Starburst catalogs for Teradata, Oracle, and SQL Server (see the sketch following this role)
  • Produced artifacts in support of reference architecture advocacy and implementation, including authoring documentation, white papers, and presentations/diagrams for dissemination to technical and business audiences
  • In-depth knowledge of SQL in the context of the Oracle banking application and Teradata application databases
  • Able to create effective DDLs and DMLs tailored for the Oracle and Teradata catalogs
  • Acting as a senior architect, provided technical and process leadership for projects, defining and documenting information integrations between systems and aligning project goals with the reference architecture
  • Led an end-to-end Starburst implementation in a large enterprise environment, integrating with multiple legacy applications on heterogeneous technologies (Teradata, Oracle, SQL Server)
  • Configured and tuned production and development Starburst environments
  • Supported the QA team in running their test queries against Starburst.
  • Enhanced data quality and accuracy by implementing stringent validation checks and monitoring procedures.
  • Provided expert guidance on database design principles, helping optimize schema structures for efficient querying and reporting capabilities.
  • Resolved complex technical issues through careful analysis and debugging, ensuring minimal impact on business operations.
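
The Starburst catalog work above enables federated queries across the registered sources. Below is a minimal sketch of that pattern using the open-source trino Python client; the host, authentication, and catalog/schema/table names are hypothetical placeholders, not the actual Bank of America configuration.

    # Minimal sketch: federated query across Starburst catalogs (hypothetical names).
    from trino.dbapi import connect

    # Connection details are placeholders; a real deployment would use LDAP/Kerberos auth.
    conn = connect(host="starburst.example.com", port=443,
                   user="analyst", http_scheme="https")
    cur = conn.cursor()

    # Join data virtualized from two source systems without any ETL copy:
    # an Oracle catalog and a Teradata catalog registered in Starburst.
    cur.execute("""
        SELECT o.account_id, o.balance, t.segment
        FROM oracle_cat.core.accounts o
        JOIN teradata_cat.crm.customer_segments t
          ON o.customer_id = t.customer_id
        LIMIT 10
    """)
    for row in cur.fetchall():
        print(row)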

Data Architect

Informatica LLC
Atlanta, GA
07.2020 - 01.2022
  • This project is a cloud migration journey, converting all on-premises systems to the AWS cloud platform
  • The client wanted to use the Hadoop ecosystem in AWS; I designed the solution for converting the on-premises Hadoop system to the cloud
  • Created S3 connections, Snowflake connections, etc., for the ETL migration to the cloud (see the sketch following this role)
  • Designed shell scripts for handling error scenarios
  • Designed and developed high-performance, robust, and secure solutions based on the Cloudera ecosystem
  • Skillful in optimizing data queries for improved performance and efficiency
  • Capable of enhancing SQL queries to achieve optimal results in Oracle, Hive, and Impala environments
  • Assessed EMR-related knowledge across AWS sales, solutions architecture, and business development teams and developed plans to address gaps
  • Provided MS Azure IaaS/PaaS expertise on Hortonworks stacks
  • Provided expertise on batch and stream analytics with HDFS, Kafka, Spark, WebHDFS, and the Hortonworks stack
  • Stayed current with emerging tools and technologies and recommended adoption where it would provide competitive advantage and development/delivery efficiencies
  • Developed technical presentations and proposals and delivered customer presentations
  • Evaluated new technologies, executed proofs of concept, and developed specialized algorithms.
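
As an illustration of the S3 and Snowflake connection work mentioned above, here is a hedged sketch of a common load pattern (stage an extract to S3, then COPY it into Snowflake); the bucket, stage, table, and account names are assumptions for the example, and on the project these connections were configured through the ETL tooling rather than hand-written scripts.

    # Sketch: stage an extract file to S3, then load it into Snowflake.
    # All names (bucket, stage, table, account) are hypothetical placeholders.
    import boto3
    import snowflake.connector

    # Upload the on-prem extract to S3 (AWS credentials resolved from the environment).
    s3 = boto3.client("s3")
    s3.upload_file("/data/extracts/orders_20210101.csv",
                   "etl-landing-bucket", "orders/orders_20210101.csv")

    # Load the staged file into Snowflake via an external stage pointing at the bucket.
    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="***",
        warehouse="ETL_WH", database="RAW_DB", schema="LANDING",
    )
    conn.cursor().execute("""
        COPY INTO landing.orders
        FROM @etl_s3_stage/orders/orders_20210101.csv
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """)
    conn.close()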

Data Architect

Informatica LLC, Client - Ford Motors LLC
08.2019 - 06.2020
  • The project scope is to perform data migration, data quality, and ETL code migration
  • The newly designed data lake has been used to store all legacy data and other system application data from the production line, sales, raw material procurement, etc.
  • Informatica DEI/DEQ is used here as the ELT tool
  • Implemented the metadata scanning strategy and data governance integration with metadata resources
  • As the lead data quality developer on this team, initiated the data profiling process by profiling different formats of data from different sources
  • Created BDM mapping designs for the corresponding Alteryx jobs
  • Validation, standardization, and cleansing of data were performed as part of implementing the business rules
  • Most of the data, belonging to various members and providers, was handled throughout development
  • Used data profiling to verify the final data output in Hive
  • Analyzed the data's dimensions to determine its actual structure and the rules implemented as part of standardization
  • Created catalog resources and identified the Data lineage
  • Extensively worked on Informatica DEQ/IDQ
  • Designed high-level data flow charts as part of documenting the data lake
  • Implemented best practices to build and maintain the data lake for unstructured, semi-structured, and structured data
  • Performed data loading into the MapR Hadoop cluster using the Informatica Spark engine
  • Strong knowledge of all the data quality transformations used throughout development
  • Involved in massive data profiling using IDQ (Analyst Tool) prior to data staging (see the sketch following this role)
  • Helped the customer create facets and data governance implementations
  • Very familiar with the Sorter, Filter, Expression, Consolidation, Match, Exception, Association, and Address Validator transformations
  • Used Address Doctor to automatically check and correct addresses in the CRM system.
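
The profiling described above was done with Informatica DEQ/IDQ; the pandas snippet below is only an illustrative stand-in for the kinds of statistics a profile or scorecard reports (completeness, cardinality, duplication), using a hypothetical raw extract.

    # Illustrative profiling checks in pandas (completeness, cardinality, duplication);
    # the project's actual profiling was performed in Informatica DEQ/IDQ.
    import pandas as pd

    df = pd.read_csv("members_raw.csv")  # hypothetical raw extract

    profile = pd.DataFrame({
        "null_pct": df.isna().mean().round(3),   # completeness per column
        "distinct_values": df.nunique(),         # cardinality per column
        "dtype": df.dtypes.astype(str),
    })
    print(profile)
    print("duplicate rows:", int(df.duplicated().sum()))  # duplication across the data set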

Bigdata Architect

Calvin Klein
10.2018 - 07.2019
  • PVH is home to many famous clothing brands, such as Calvin Klein and Van Heusen
  • Building a data lake is the primary goal, along with improving forecast accuracy for inventory and days of supply
  • On top of this data lake, PVH wants to build prediction models and run them on Google Cloud Platform
  • Contributions:
  • Involved in architecting the data lake using the MapR Hadoop cluster
  • Designed high-level data flow charts as part of documenting the data lake
  • Implemented best practices to build and maintain the data lake for unstructured, semi-structured, and structured data
  • Worked on Sqoop to migrate data from the Oracle DB to HDFS
  • Designed the Hive database, tables, and partitions (see the sketch following this role)
  • Designed the Tableau report layout based on the business requirements
  • Designed forecast accuracy dashboards, such as bullseye and tabular formats
  • Designed the data archiving mechanism for efficient and cheap data storage in GCP
  • Worked on ELT process using Informatica BDM
  • Worked on Informatica BDQ to perform data profiling on the raw data set in the new data lake and to estimate data usage and value
  • Designed a prediction model using Python 2.7 and TensorFlow and implemented it in the GCP ML module as part of a POC
  • Implemented an archiving mechanism using GitHub
  • Performed data loading into the MapR Hadoop cluster using the Informatica Spark engine
  • Experience in Informatica performance tuning and end-to-end deployments in various environments
  • Designed and implemented Spark jobs for the audit-balance framework
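
A minimal PySpark sketch of the partitioned Hive load pattern used in this data lake; the database, table, and partition column names are assumptions for illustration, and the production loads ran through Informatica BDM's Spark engine.

    # Sketch: load a daily sales extract into a partitioned Hive table with Spark.
    # Database/table/column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("sales-to-hive")
             .enableHiveSupport()
             .getOrCreate())

    sales = spark.read.option("header", True).csv("/landing/sales/2019-01-01/")

    (sales.write
          .mode("overwrite")
          .partitionBy("sale_dt")   # partition column drives pruning in downstream queries
          .format("parquet")
          .saveAsTable("retail_lake.daily_sales"))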

Big Data Architect

Highmark Health Solutions
04.2018 - 10.2018
  • Highmark is undertaking a significant business initiative to architect and engineer a next generation Enterprise Data Hub for Analytics (EDHA)
  • This platform will drive critical business decisions across Highmark Health, Highmark Inc., Allegheny Health System and will serve as an analytic platform option for HM Health Solutions customers
  • Designed and built a true big data ELT platform that supported the data science platform for optimizing patient experience
  • Patient and hospital premise data, along with social media, surveys, and health care data, is available in the data lake
  • Because of the complexity of the data semantics, this data was not being valued and insights were not being extracted properly
  • The data platform design follows three patterns: before visit, during visit, and after visit
  • The communication channels where patients interact were modeled holistically in order to generate insights to act upon
  • Data is classified into billing/insurance data (more structured and reliable), cleanliness standards, noise/wait time, staff behavior (respect/empathy, etc.), and business unit data (surgical, clinical, etc.)
  • The Hadoop cluster was fine-tuned for advanced data science applications
  • Since most of the advanced AI libraries (Keras, TensorFlow, etc.) are sequential, special care must be taken to run containers on Hadoop
  • Contributions:
  • Created strategy and planning for the cluster migrations
  • Designed high-level data flow diagrams for the claims data
  • Designed and built the claims life cycle and incremental data load
  • Proficient in the creation of comprehensive data sets
  • Well-versed in handling PHI (Protected Health Information) data
  • Familiar with managing PII (Personally Identifiable Information) data
  • Experienced in ensuring data privacy and security measures
  • Implemented the health care coding standards and best practices
  • Implemented data rules up to the Healthcare industry standards
  • Designed the data archiving mechanism for efficient and cheap data storage
  • Performed Informatica BDQ data profiling on the raw data set in the data lake and identified the useful data sets that could be used for the analytical platform
  • Designed and defined the Raw, Refine, and Publish data layers
  • Imposed data quality rules in the Refine area and ensured granular data was extracted and ready for the Publish data layer (see the sketch following this role)
  • Performed data loading into Hadoop cluster using Informatica Blaze Engine
  • Experience in Informatica performance tuning and end to end deployments in various environments
  • Proficient in utilizing SQL for Oracle database operations
  • Competent in crafting DDL and DML statements specific to Oracle
  • Identified the unstructured and semi-structured raw data for ingestion into the Raw data layer and performed data analysis, data profiling, etc.
  • Implemented the Informatica BDM audit-balance-control framework for job execution and ensured streamlined data feeds from the Raw to the Refine data layer as well as source data ingestion
  • Hands-on with application upgrades such as HDP 2.5 to 2.6 and Informatica BDM 10.1 to 10.2.
  • Mentored junior team members, fostering a culture of continuous learning and professional growth within the organization.
  • Planned migration strategies for legacy systems transition to modernized architectures without compromising operational continuity or end-user experience during critical transformation phases.
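
A hedged sketch of how a Refine-layer quality rule can route valid records forward and failures to a rejects table; the claim columns and rules are illustrative assumptions, and the actual rules were implemented in Informatica BDM.

    # Sketch: apply Refine-layer data quality rules to Raw claims data,
    # routing failures to a rejects table. Column names and rules are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    raw = spark.table("raw.claims")

    quality_rule = (F.col("member_id").isNotNull()
                    & F.col("claim_amount").cast("double").isNotNull()
                    & (F.col("claim_amount") >= 0))

    raw.filter(quality_rule).write.mode("overwrite").saveAsTable("refine.claims")
    raw.filter(~quality_rule).write.mode("append").saveAsTable("refine.claims_rejects")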

ETL Senior Consultant

State of Ohio, DAS
09.2017 - 04.2018
  • DAS has programs such as SNAP Cash and ORR that provide accurate statistics to the state agencies
  • These programs require data integration with other systems to build reports in tabular format, Webi reports, and other report types
  • This helps IT users in DAS drill through the various reports in these programs
  • Experienced in IVR projects such as IVR discontinuance letter design, IVR reminder letter design, etc.
  • Contributions:
  • Designed and worked on Informatica mappings required for the SNAP and ORR programs
  • Worked with Cognos team for the report data validations
  • Involved in UAT process and assisting UAT users to validate data scenarios
  • Created Informatica workflows and scheduled the tasks
  • Designed the error handling in the Informatica mappings for the reject data
  • Worked on the high- and low-level design documents
  • Implemented Informatica standards and rules up to the industry standards
  • Worked on code migration for data availability in higher regions for testing team and UAT
  • Worked on ETL integration Informatica designs
  • Helped the production support team with urgent fixes
  • Wrote PL/SQL code for the data integration between Informatica and other source systems
  • Wrote IVR logic for the mail notifications.

Guidewire ETL Consultant

Farmers Insurance
11.2016 - 09.2017
  • Farmers has embarked upon a multi-year journey to implement Guidewire's PolicyCenter and ClaimCenter solutions
  • The PolicyCenter Guidewire Operational Data Store (GW ODS) will provide a common repository for policy and claims that is optimized for business usage
  • This repository allows a broader group to report, analyze, and distribute data without impacting Guidewire's day-to-day operations
  • The store establishes a common language for users that enables faster decision-making and more targeted actions to be taken directly in support of PolicyCenter and ClaimCenter operations
  • Contributions:
  • Developed and maintained the data architecture roadmap for Data as a Service; actively involved in data analysis of the legacy system
  • Extensively optimized Informatica sources, targets, mappings, and sessions by finding bottlenecks in different areas, and debugged existing mappings using the Debugger to test and fix them
  • Provided technical inputs during project solution design, development, deployments and maintenance phases
  • Implemented best practices to configure and tune Big Data environments, application and services
  • Designed ODS tables in Hive data warehouse for the analytical platform for the policy system
  • Worked on the high- and low-level design documents for the offshore team
  • Analyzed the Guidewire XML hierarchy and designed the Hive table structures (see the sketch following this role)
  • Implemented data rules up to the insurance industry standards
  • Worked on GW data dictionary and documentation
  • Used Git/SVN for code movement from the lower regions through to Prod
  • Worked on code migration for data availability in higher regions for testing team and UAT
  • Worked on GW ETL integration Informatica designs
  • Designed the job data flow to load GW data into the ODS
  • Involved in designing, installing, and configuring the enterprise big data Hadoop platform: Dev, QA, and Prod clusters
  • Solution architect responsible for designing and maintaining the ETL solution using Informatica and Oracle PL/SQL, including application data and dimension modeling, as well as project management for various Information Management projects
  • Designed and built relational database models and defined data requirements to meet the business requirements
  • Worked on policy conversion from the legacy system to Guidewire
  • Defined and implemented the ETL tool configuration
  • Developed and communicated enterprise data technology standards and policies.
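
A simplified sketch of flattening a Guidewire-style policy XML hierarchy into flat rows suitable for a Hive/ODS table; the element names below are hypothetical and do not reflect the actual Guidewire schema.

    # Sketch: flatten a nested policy XML document into flat rows for a Hive/ODS table.
    # Element names are hypothetical, not the real Guidewire schema.
    import csv
    import xml.etree.ElementTree as ET

    tree = ET.parse("policy_export.xml")   # hypothetical Guidewire export file
    root = tree.getroot()

    rows = []
    for policy in root.findall("Policy"):
        policy_id = policy.findtext("PolicyNumber")
        for line in policy.findall("./Lines/Line"):
            line_code = line.findtext("LineCode")
            for cov in line.findall("./Coverages/Coverage"):
                rows.append({
                    "policy_id": policy_id,
                    "line_code": line_code,
                    "coverage_code": cov.findtext("Code"),
                    "coverage_limit": cov.findtext("Limit"),
                })

    with open("policy_coverages.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["policy_id", "line_code",
                                               "coverage_code", "coverage_limit"])
        writer.writeheader()
        writer.writerows(rows)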

ETL Lead

ICC Ohio, Nationwide Insurance
02.2013 - 11.2016
  • Nationwide's recent mergers with other insurance companies, such as Harleysville and Titan, led them to decide to maintain MDM for all agents across all the different systems
  • ETL integration would have taken much longer than data virtualization; for this reason the AIMS program chose the Denodo Platform to pull all source data into a unified virtual layer and present it as a virtual source that fed the ETL processes into the IBM MDM database
  • Creation and maintenance of Logical and Physical Data Model diagrams
  • Understand business and technical concepts to create optimal data design solutions
  • Worked on design, development, and data modeling in business intelligence, analytics, and data warehousing environments
  • Extensively used Informatica Power Center to load data from Flat Files to DB2, Flat Files to SQL Server, DB2 to XML files, Flat Files to Oracle
  • Developed several complex mappings in Mapping Designer using a variety of PowerCenter transformations, mapping parameters, mapping variables, mapplets, and parameter files
  • Provided post-production implementation support and worked with the operations team to give them knowledge transfer for continuous application support and maintenance
  • Facilitated performance tuning activities for Informatica workflows
  • Hands on interaction, technical evaluation, project support, and the establishment of patterns and standards with the various and multiple emerging data products and technologies
  • Mentored junior associates
  • Created Informatica mappings as per technical specifications
  • Analyzed and documented defects and incidents
  • Worked on database environment setup for system test, performance test, etc.
  • Responsible for on-call support and the on-call rotation
  • Coordinated production deployments.

ETL Technical Lead

Accenture, Pfizer PACE
07.2011 - 01.2013
  • Pfizer Customer Business Unit Contract Strategy & Management, Global Financial Services, Established Products, and Business Technology are working collectively to consolidate the Pfizer and Wyeth contracting operations business functions as well as the corresponding contracting applications by the end of 2011
  • The objective of the PACE Program (defined as the Requirements, Design, Build, Test and Deploy phases for the six work streams defined above) is to harmonize Pfizer and Wyeth Contracting business functions as well as integrate all work streams (excluding CDW) on one common technology platform, I-Many's ContractSphere
  • Contribution: As a Team lead:
  • Responsible for offshore coordination, requirement gathering and point of contact for project deliverables
  • Created high-level & low-level ETL flow design and involved in the detailed technical design
  • Worked on the gRebates technical design for rebate engine calculation
  • Assisting UAT team by providing technical support
  • Worked on database environment setup for system test, UAT, performance test, etc.
  • As a Developer:
  • Worked on the Tricare ETL interface and testing
  • Worked on complex mappings and workflow designs with UNIX scripts
  • Developed error handling mechanism for gRebates interface
  • Created views and materialized views for other downstream applications
  • Designed and implemented a reconciliation strategy for every ETL interface (see the sketch following this role)
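
A minimal sketch of the interface-level reconciliation idea (compare a row count and a control total between source and target after each load); the connections, tables, and control column are hypothetical, and sqlite3 stands in for the actual source and warehouse databases.

    # Sketch: reconcile an ETL interface by comparing a row count and an amount
    # control total between source and target. Names are hypothetical.
    import sqlite3  # stand-in for the actual source/warehouse connections

    def control_totals(conn, table):
        """Return (row_count, sum_of_amount) for the given table."""
        cur = conn.execute(f"SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM {table}")
        return cur.fetchone()

    source = sqlite3.connect("source.db")
    target = sqlite3.connect("warehouse.db")

    src = control_totals(source, "rebate_transactions")
    tgt = control_totals(target, "stg_rebate_transactions")

    if src != tgt:
        raise RuntimeError(f"Reconciliation failed: source={src} target={tgt}")
    print("Reconciliation passed:", src)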

ETL Technical Lead, Developer

Accenture, Nomura Securities
12.2010 - 07.2011
  • The Nomura Group is a financial services group comprising Nomura Holdings and its subsidiaries in Japan and overseas
  • The client wanted to develop a global finance architecture (GFA) for its financial systems; to this end they planned to migrate PeopleSoft 8.6 to 9.1, which required several ETL interfaces to feed PeopleSoft
  • Contribution: As a Team lead:
  • Responsible for offshore coordination, requirement gathering and point of contact for project deliverables
  • Worked in the migration plan, documentation and contingency plans
  • Created high-level & low-level ETL flow design and involved in the detailed technical design
  • Involved in data migration for testing purposes from real-time systems using the CLECT tool
  • Analyzed the dependency of jobs and scheduled jobs by using Autosys scheduler
  • As a developer, worked on the UAT and QA cycles
  • Analyzed the results against the real data
  • Involved in complex mappings and workflow designs with UNIX script
  • Developed a customized email task for error handling using scripts
  • Developed an error handling mechanism for the GFA system
  • Mentored junior developers through regular 1-on-1 meetings, providing guidance on best practices, coding standards, and career growth opportunities.
  • Coordinated with cross-department teams like QA, DevOps, and Support to ensure seamless end-to-end software delivery process.

ETL Technical Lead

Farmers Insurance
10.2009 - 10.2010
  • The Farmers Group of Companies is the country's third-largest insurer of both private Personal Lines passenger automobile and homeowners insurance, and also provides a wide range of other insurance and financial services products
  • In this project I worked as the ETL team lead while also being responsible for onsite coordination
  • The project goal was to develop an HOI data mart from FDR
  • Contribution:
  • As a Team lead:
  • Gathered the Functional Requirement Specifications from business users
  • Responsible for designing the ETL Strategy & Architecture of the Project
  • Point person on the ETL team for other teams such as Reporting, Testing, QA and Project Management for updates on Project Status and issues
  • Created high- and low-level ETL flow designs and was involved in the detailed technical design
  • Involved in conducting reviews of Informatica Code, Unit Test Cases & Results
  • Organized daily technical discussions with the onsite team and the individual offshore work stream leads, and set expectations for offshore delivery
  • As a Developer:
  • Created mappings employing various transformations, filters, joiners, lookup, SQL overrides etc
  • Created Sessions and Workflows to schedule these mappings
  • Created Unit Test cases and Integration test cases
  • Performed unit testing of Workflows from source to staging and from staging to warehouse
  • Performed reconciliation and evidence creation
  • Performed analysis and fix of QA defects
  • Involved in Unit testing and UAT defect fixing
  • Involved in sessions partitioning and mappings performance tuning
  • Involved in the QA and production deployment
  • Environment: Informatica 8.6, UNIX shell scripts, DB2, Oracle 10g, Toad

ETL Senior Developer

Ness Tech, Aerospace
11.2008 - 05.2009
  • This project is about a raw material management system for sound purchasing from vendors and quality assurance of raw material purchases, etc.
  • Wrote technical specifications and test cases for all the modules and reviewed other teammates' work; implemented best practices in Informatica and Oracle for the best performance
  • Prepared High Level Design and Low Level Design (Technical Design) documents
  • Was responsible for managing tasks and deadlines for the ETL teams both Onsite and Offshore
  • Performed requirement gathering and requirement analysis
  • Interacted with the client on various forums to discuss the status of the project
  • Gave the inputs & shared the knowledge with other team members
  • Developed mappings for source to staging database and staging to target database
  • Created Informatica sessions and workflows
  • Worked on Performance tuning
  • Migrated the code to production
  • Designed and implemented high-complexity mapplets that were reused across all the mappings to load error tables
  • Created complex mappings to load the major facts and dimensions
  • Environment: Informatica 8.1.1, Business Objects, Oracle, UNIX

ETL Senior Developer

Ness Tech
03.2008 - 11.2008
  • In this project, we added five columns to the existing BO report; to achieve this we performed the analysis, reviewed the business fundamentals, and understood the requirements for proper placement of the columns in the respective tables
  • We edited two main mappings that have similar functionality with these five columns
  • The main challenge was the impact analysis; we found that a few database objects, such as views and tables, would be impacted
  • Was responsible for managing tasks and deadlines for the onsite and offshore ETL teams
  • Performed requirement gathering and requirement analysis
  • Interacted with the client on various forums to discuss the status of the project
  • Gave the inputs & shared the knowledge with other team members
  • Edited the respective mappings that were impacted by the changes
  • Altered the DB tables and views for adding the five columns
  • Tested the edited work with proper test cases and verified the impacted areas
  • Prepared unit test cases for reports
  • Involved in testing and reviewing of BO reports
  • Created reports in different formats (pdf, HTML) as per the Business requirement
  • Environment: Informatica 8.1.1, Business Objects, Oracle, UNIX

ETL Developer

Cisco Systems
04.2006 - 02.2008
  • Built a reporting platform to create reports on various WPR systems in order to help the WPR business with planning, analysis, and forecasting
  • Source systems used: all core WPR systems (eProjects, Dashboard)
  • Performed high-level and low-level ETL design
  • Shared data warehousing concepts among the team members
  • Deployed 7 quarterly production releases successfully and effectively
  • Created mappings such as SCD Type 2, simple pass-through, etc. (see the sketch following this role)
  • Created UNIX batch scripts that execute all the Informatica mappings
  • Supported system test, UAT, and post-production fixes and issues
  • Involved in requirement and gap analysis for CRs
  • Implemented transformations such as the unconnected Stored Procedure, Lookup, Update, and Expression transformations
  • Developed Mappings and unit tested them
  • Created workflows containing tasks such as session, email, command, and decision tasks
  • Involved in creating the initial setup of Informatica, the folder architecture, and mapping naming standards
  • Environment: Informatica 8.1.1, Oracle, Business Objects
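
A compact pandas sketch of the SCD Type 2 pattern named above (expire the changed current rows, insert new current versions); on the project this logic lived in Informatica mappings, and the column names here are assumptions.

    # Sketch of SCD Type 2: expire changed current rows and insert new current versions.
    # Column names are hypothetical; the project implemented this in Informatica mappings.
    import pandas as pd

    TODAY = pd.Timestamp("2008-01-01")
    HIGH_DATE = pd.Timestamp("9999-12-31")

    dim = pd.DataFrame({   # existing dimension; current rows have end_date = HIGH_DATE
        "emp_id": [1, 2], "dept": ["HR", "ENG"],
        "start_date": [pd.Timestamp("2007-01-01")] * 2, "end_date": [HIGH_DATE] * 2,
    })
    incoming = pd.DataFrame({"emp_id": [1, 2, 3], "dept": ["HR", "OPS", "FIN"]})

    current = dim[dim["end_date"] == HIGH_DATE]
    merged = incoming.merge(current[["emp_id", "dept"]], on="emp_id", how="left",
                            suffixes=("", "_old"), indicator=True)

    changed = merged[(merged["_merge"] == "both") & (merged["dept"] != merged["dept_old"])]
    new = merged[merged["_merge"] == "left_only"]

    # Expire the current versions of changed keys, then append the new current rows.
    dim.loc[dim["emp_id"].isin(changed["emp_id"]) & (dim["end_date"] == HIGH_DATE),
            "end_date"] = TODAY
    inserts = pd.concat([changed, new])[["emp_id", "dept"]].assign(
        start_date=TODAY, end_date=HIGH_DATE)
    dim = pd.concat([dim, inserts], ignore_index=True)
    print(dim.sort_values(["emp_id", "start_date"]))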

Education

MBA - Information Management

Indian Institute of Management
Bangalore, India
01.2007 - 12.2007

Bachelor's Degree - Electronics and Communications

MVJCE
Bangalore
10.1997 - 03.2002

Skills

Informatica cloud

Cloudera/Hortonworks

Azure

AWS

GCP

Java

Python

MS SQL Server DB

Teradata

Oracle

DB2

Snowflake

Databricks

Starburst

SQL programming

Data quality management

Project management

Certification

Certified Cloud Data Integration Practitioner

Training

  • Data Integration & BI Tools: Attended various training courses in Informatica PowerCenter, PowerExchange, MDM, Business Objects, Teradata, Snowflake, and Databricks.
  • Data Science and Big Data: Apache Spark, Scala, Kafka, and various other tools.

Timeline

Senior ETL Consultant

Tek Systems, Bank of America
04.2023 - 11.2024

Data Architect

Informatica LLC
07.2020 - 01.2022

Data Architect

Informatica LLC, Client - Ford Motors LLC
08.2019 - 06.2020

Bigdata Architect

Calvin Klein
10.2018 - 07.2019

Big Data Architect

Highmark Health Solutions
04.2018 - 10.2018

ETL Senior Consultant

State of Ohio, DAS
09.2017 - 04.2018

Guidewire ETL Consultant

Farmers Insurance
11.2016 - 09.2017

ETL Lead

ICC Ohio, Nationwide Insurance
02.2013 - 11.2016

ETL Technical Lead

Accenture, Pfizer PACE
07.2011 - 01.2013

ETL Technical Lead, Developer

Accenture, Nomura Securities
12.2010 - 07.2011

ETL Technical Lead

Farmers Insurance
10.2009 - 10.2010

ETL Senior Developer

Ness Tech, Aerospace
11.2008 - 05.2009

ETL Senior Developer

Ness Tech
03.2008 - 11.2008

MBA - Information Management

Indian Institute of Management
01.2007 - 12.2007

ETL Developer

Cisco Systems
04.2006 - 02.2008

Bachelor's Degree - Electronics and Communications

MVJCE
10.1997 - 03.2002