Big Data Hadoop Architect with over 15.8 years of overall experience, including 8 years in Big Data and Hadoop, and a strong background in distributed file systems in a big-data arena. Understands the complex processing needs of big data and has experience developing code and modules to address those needs. Brings a Master's degree in Computer Science along with certification as an Apache Hadoop developer. Professional experience covers the design, development and implementation of business applications, including more than 5 years of SAP NetWeaver experience with SAP BI, HR, MM and SD as a techno-functional consultant. Demonstrated ability to rapidly acquire technical knowledge and skills within a short period of time, and keeps up to date with industry standards and trends through continuing professional development.

Work History



Apr 2013 - Present

Responsibilities :

  • Developed MapReduce programs in Hive and Pig to validate and cleanse data in HDFS obtained from heterogeneous data sources, making it suitable for analysis (a hedged streaming sketch follows this list).
  • Worked on streaming data into HDFS from web servers using Flume.
  • Designed and implemented Hive and Pig UDFs for evaluating, filtering, loading and storing data.
  • Created Hive tables as internal or external tables per requirement, defined with appropriate static and dynamic partitions for efficiency.
  • Wrote scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
  • Implemented Lateral View in conjunction with UDTFs in Hive.
  • Worked extensively with Sqoop for importing and exporting data between MySQL, HDFS and Hive.
  • Performed complex joins on tables in Hive.
  • Loaded and transformed large sets of structured and semi-structured data using Hive and Impala.
  • Connected Hive and Impala to the Tableau reporting tool and generated graphical reports.
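As a hedged illustration of the data-cleansing step above, the sketch below is a Hadoop Streaming mapper in Python; the delimiter, field count and NULL marker are assumptions made for the example, not details from the original project.

```python
#!/usr/bin/env python
# clean_mapper.py - Hadoop Streaming mapper that validates and cleanses
# delimited records before they are loaded into Hive (illustrative layout).
import sys

EXPECTED_FIELDS = 5          # assumed record width
DELIMITER = "|"              # assumed source delimiter

for line in sys.stdin:
    fields = line.rstrip("\n").split(DELIMITER)
    # Drop malformed rows so downstream Hive queries see a clean schema.
    if len(fields) != EXPECTED_FIELDS:
        continue
    # Trim whitespace and normalise empty strings to Hive's NULL marker.
    cleaned = [f.strip() or r"\N" for f in fields]
    # Re-emit as tab-separated output, the default Streaming format.
    print("\t".join(cleaned))
```

A job like this would typically be submitted through the Hadoop Streaming jar with -mapper pointing at the script; the exact jar path and options depend on the cluster and distribution.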

Deliverable :

  • Installing, configuring, and administering Hadoop clusters of the major Hadoop distributions.
  • Hands-on experience in writing MapReduce jobs in Java, Pig and Python.
  • Working with Java, C++ and C.
  • Extensive experience in designing analytical / OLAP and transactional / OLTP databases.
  • Hadoop applications, including administration, configuration management, debugging, and performance tuning.
  • Deploying applications in heterogeneous application servers.
  • Creating web-based applications using ActiveX Controls, JSP and Servlets
  • Creating web pages using HTML, DHTML, JavaScript, VBScript and CSS


Jul 2012 - Mar 2013

Designed and developed servers and integrated application components. Prepared automation systems and tested server hardware. Supported administration and design of server infrastructure. Utilized Spiceworks to provide helpdesk support services. Analyzed and resolved software bugs with hardware manufacturers. Created architecture components with cloud and visualization methodologies. Evaluated and documented source systems from RDBMS and other data sources. Developed process frameworks and supported data migration on Hadoop systems.

Responsibilities :

  • Installed and configured MapReduce, HIVE and the HDFS; implemented CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios (a hedged client sketch follows this list).
  • Supported code/design analysis, strategy development and project planning.
  • Created reports for the BI team, using Sqoop to bring data into HDFS and Hive.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Assisted with data capacity planning and node forecasting.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Administered Pig, Hive and HBase, installing updates, patches and upgrades.
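As a hedged illustration of loading records into the HBase tables described above, the sketch below uses the happybase Python client over an HBase Thrift gateway; the host, table name and column family are assumptions for the example, not details from the original project.

```python
# Illustrative HBase load using the happybase client (assumes an HBase Thrift
# gateway is running; table and column-family names are made up for the sketch).
import happybase

connection = happybase.Connection("hbase-thrift-host")  # assumed gateway host
table = connection.table("customer_events")             # assumed table name

def load_event(row_key, event):
    """Write one semi-structured event record into a single column family."""
    table.put(row_key.encode(), {
        b"cf:source": event["source"].encode(),
        b"cf:payload": event["payload"].encode(),
    })

load_event("unix-2013-0001", {"source": "unix", "payload": "raw log line"})
connection.close()
```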


Mar 2006 - Feb 2012

Responsibilities :

  • Implemented cross-selling analytics initiatives that showed a 32 percent CAGR increase in revenues for specific products.
  • Developed a contributed-value and risk-adjusted marketing offer optimization framework with a yearly impact of $7 MM.
  • Managed delivery of cross-sell, upsell and retention marketing strategies and models across clients in multiple domains.
  • Proposed the big data solution "Skills Tracker", a potential solution for collecting data and reporting on individual employees.
  • Developed analytics and strategy to integrate debt-collection analytics into outbound calling operations, and implemented analytics delivery on a cloud-based visualization platform.
  • Structured a value-added reseller agreement with a BI platform vendor to provide a domain-specific mobile analytics solution.
  • Played the role of data architecture steward across a global implementation of CRM Analytics data.
  • Delivered 10x impact from high-value analytics consulting engagements for multiple clients.

Deliverable :

  • Joins: map-side and reduce-side, with use of secondary sorting; importance of the Writable and WritableComparable APIs; use of compression techniques such as Snappy, LZO and zip (a hedged join sketch follows this list).
  • Job initialization, task assignment, task execution, progress and status updates, and job completion.
  • Task failure, TaskTracker failure and JobTracker failure; job scheduling; shuffle and sort in depth.
  • Deep dives into shuffle and sort, input splits, buffer concepts, configuration tuning and task execution.
  • Migrated from Amazon AWS to a co-location data center.
  • Ran benchmark tools to test cluster performance.
  • Configured Hadoop properties to achieve high performance.
  • Wrote shell scripts to dump data from MySQL to HDFS.
  • Set up the Ganglia monitoring tool to monitor both Hadoop-specific metrics and system metrics.
  • Wrote custom Nagios scripts to monitor the NameNode, DataNode, Secondary NameNode, JobTracker and TaskTracker daemons, and set up an alerting system.
  • Experimented with dumping data from MySQL to HDFS using Sqoop.
  • Upgraded the Hadoop cluster from CDH3u3 to CDH3u4, and wrote a cron job to back up metadata.
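A minimal sketch of the reduce-side join with secondary sorting noted above, written as a Hadoop Streaming reducer in Python. It assumes the job is configured to partition on the join key and to sort so that each key's dimension record (tagged "D") arrives before its fact records (tagged "F"); the tags and field layout are illustrative assumptions.

```python
#!/usr/bin/env python
# join_reducer.py - reduce-side join for Hadoop Streaming.
# Input lines are "key <TAB> tag <TAB> value"; secondary sorting guarantees
# the dimension record (tag D) for each key is seen before its facts (tag F).
import sys

current_key = None
dimension = None

for line in sys.stdin:
    key, tag, value = line.rstrip("\n").split("\t", 2)
    if key != current_key:
        current_key, dimension = key, None
    if tag == "D":
        dimension = value               # remember the dimension side
    elif tag == "F" and dimension is not None:
        # Emit the joined record: key, dimension attributes, fact attributes.
        print("\t".join([key, dimension, value]))
```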

Tech Mahindra Ltd.

Feb 2005 - Mar 2006

Responsibilities :

Defined the overall strategy for the Big Data roadmap, and the design, development and implementation of the Enterprise Data Warehouse (EDW) and its associated data stores such as data marts and ODS, using appropriate data modeling techniques.

Deliverable :

  • Involved in setting up Hadoop along with MapReduce, Hive and Pig. Worked with HiveQL on big data from logs to perform trend analysis of user behavior on various online modules. Wrote MapReduce programs for refined queries on big data.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Developed simple to complex MapReduce jobs using Hive.
  • Analyzed the data by performing Hive queries and running Pig scripts to know user behavior.
  • Optimized Map/Reduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Created partitioned tables in Hive.
  • Extensively used Pig for data cleansing.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Migrated from Amazon AWS to a co-location data center.
  • Ran benchmark tools to test cluster performance.
  • Configured Hadoop properties to achieve high performance.
  • Wrote shell scripts to dump data from MySQL to HDFS.
  • Set up the Ganglia monitoring tool to monitor both Hadoop-specific metrics and system metrics.
  • Wrote custom Nagios scripts to monitor the NameNode, DataNode, Secondary NameNode, JobTracker and TaskTracker daemons, and set up an alerting system (a hedged check-script sketch follows this list).
  • Experimented with dumping data from MySQL to HDFS using Sqoop.
  • Upgraded the Hadoop cluster from CDH3u3 to CDH3u4.
  • Wrote a cron job to back up metadata.
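The custom Nagios checks mentioned above can be pictured with a minimal sketch like the one below, which probes the NameNode's JMX servlet and returns Nagios-style exit codes; the host, port and behavior are assumptions for the example rather than the original scripts.

```python
#!/usr/bin/env python
# check_namenode.py - Nagios-style check: OK (0) if the NameNode JMX servlet
# answers, CRITICAL (2) otherwise. Host and port are placeholder assumptions.
import sys
import urllib.request

NAMENODE_JMX = "http://namenode-host:50070/jmx"   # assumed NameNode web UI port

try:
    with urllib.request.urlopen(NAMENODE_JMX, timeout=10) as resp:
        if resp.status == 200:
            print("OK - NameNode JMX reachable")
            sys.exit(0)
    print("WARNING - unexpected response from NameNode JMX")
    sys.exit(1)
except Exception as exc:
    print("CRITICAL - NameNode unreachable: %s" % exc)
    sys.exit(2)
```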

Product Technical Consultant - Telecom 

Nov 2000 - Mar 2005


The GAMA MMSC ECCN Mediation Server (GAMA MEMS) system development mainly concentrates on communication between the Motorola MMSC and the ECCN Intelligent Network (IN)-based prepaid billing system. The following are some of the major functions performed by the GAMA MEMS system:

Deliverable :

  • Coding
  • Testing
  • Review & Analysis
  • Identifies resources needed and assigns individual responsibilities.
  • Manages day-to-day operational aspects of a project and scope.
  • Reviews deliverables prepared by team before passing to client.
  • Effectively applies our methodology and enforces project standards.
  • Prepares for engagement reviews and quality assurance procedures.
  • Minimizes our exposure and risk on project.
  • Ensures project documents are complete, current, and stored appropriately
  • Conversion of HTTP-based requests into a form the IN server understands (a hedged sketch follows this list).
  • Support for the HTTP accounting requests DebitAmountRequest and KeepAliveRequest.
  • Performing statistical calculations such as billing, charging and error checks.
  • Support for the standard Diameter messages CER/CEA, DWR/DWA and DPR/DPA towards the ECCN.
  • Logging of all transactions into the database.
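A heavily hedged illustration of the HTTP-to-IN translation and transaction logging listed above: the field names, query parameters and logging schema below are hypothetical stand-ins, since the actual GAMA MEMS interfaces are proprietary and not given here.

```python
# Hypothetical sketch: translate an HTTP DebitAmountRequest into a simplified
# internal IN-style message and log it. All names and schema are assumptions.
from urllib.parse import parse_qs

def parse_debit_amount_request(query_string):
    """Map HTTP query parameters onto a simplified internal message dict."""
    params = parse_qs(query_string)
    return {
        "message_type": "DebitAmountRequest",        # hypothetical internal type
        "subscriber": params.get("msisdn", [""])[0],
        "amount": int(params.get("amount", ["0"])[0]),
        "session_id": params.get("session", [""])[0],
    }

def log_transaction(db_cursor, message):
    """Persist each translated transaction for audit (schema is illustrative)."""
    db_cursor.execute(
        "INSERT INTO transactions (msg_type, subscriber, amount, session_id) "
        "VALUES (?, ?, ?, ?)",
        (message["message_type"], message["subscriber"],
         message["amount"], message["session_id"]),
    )
```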





Hons. Diploma in System Management  

City Computer Communication



HRD (Training & Development, IR, Marketing), System






  • Good experience with Python, Pig, Sqoop, Oozie, Hadoop Streaming and Hive
  • Solid understanding of the Hadoop Distributed File System (HDFS)
  • Extensive knowledge of ETL, including Ab Initio and Informatica
  • Excellent oral and written communication skills
  • Collaborates well across technology groups
  • Vast experience with Java, Puppet, Chef, Linux, Perl and Python
  • In-depth understanding of MapReduce and the Hadoop Infrastructure
  • Focuses on the big picture with problem-solving

  • SAP NetWeaver Application Server (AS-JAVA)
  • SAP NetWeaver Enterprise Portal 7.0, 7.2
  • CE Portal 7.3
  • ABAP
  • SAP Solution Manager 7.0 - Process & Configuration
  • SAP Change Request Management
  • SAP Business Process & Interface Monitoring
  • SAP NetWeaver Portal Implementation & Operation
  • SAP NetWeaver Portal Development
  • Development of Knowledge Management & Collaboration Applications
  • Managing Enterprise Portal Content
  • Enterprise Portal system Administration
  • Knowledge Management
  • SAP Enterprise Portal Implementation
  • Advanced Web Dynpro for Java
  • SAP Java Open Integration Technologies
  • NetWeaver Development Infrastructure
  • Java on SAP NetWeaver 7.1 Technology
  • Web Dynpro JAVA on SAP NetWeaver 7.1
  • Migrating a J2EE Application to SAP NetWeaver Basics
  • SAP CRM E-Commerce (ISA-B2B) 5.0
  • SAP ECC 5.0, SAP CRM 5.0, ECC 6.0
  • Apache Tomcat
  • Business Intelligence 7.0
  • SAP NetWeaver Developer Studio (NWDS),
  • SAP NetWeaver Development Infrastructure (NWDI - DTR)
  • MDM
  • JAVA API 7.0
  • SAP BW 3.5b
  • BI 7.0
  • Web Application Designer
  • SAP CRM 3.0
  • SAP Solution Manager
  • Struts Framework 1.1, Data Access Object
  • Core JAVA
  • J2EE
  • DB2/SQL
  • EJB
  • C, C++
  • Java, PL/SQL
  • Developer 2000
  • J2EE Server (Sun)
  • WebLogic Server 8.0 (BEA)
  • Jakarta Tomcat 4.0, Struts Framework 1.1
  • XI, External Facing Portal, EJB 2.0
  • Java Servlet
  • JSP, JavaScript, HTML

  MS Access 97, SQL Server 7.0



Hadoop Fundamentals (Hadoop, Hive, Pig, Jaql, HBase, MapReduce, ZooKeeper) from Big Data University


SAP BI 7.0 online training completed in December 2010

Completed a certificate course for the Diploma in Java Technologies from WebWeavers, New Delhi, 15th February 2000 to 24th April 2000.

Completed a certificate course on Web Development from Software Technology Park of India, Bhubaneswar, 15th January 2000 to 15th February 2000.

Oracle 8.0 and Developer 2000 training from Oracle India at SQL Star International Public Ltd, Bhubaneswar, June 1999 to 19th November 1999.

SCJP Standard Edition 5.0



SAP recently announced the release of SAP HANA's latest version, SPS10, geared towards offering big data capabilities. This new version of HANA focuses on mission-critical applications by processing big data and connecting the Internet of Things (IoT) to help enterprises stay a step ahead in innovating next-generation big data applications. Enterprises can continue to harness the power of big data by exploiting the new data integration capabilities of SAP HANA and the latest enterprise Hadoop distributions provided by Hortonworks or Cloudera. The new HANA version SPS10 has a user interface that uses Apache Ambari for combined SAP HANA and Hadoop cluster administration, and it uses Apache Spark SQL for fast data transfer.
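The Spark SQL transfer path mentioned above can be pictured with a minimal PySpark sketch; the table names, JDBC URL and credentials below are illustrative assumptions, and a real deployment would typically use SAP's supported HANA/Spark connectors rather than this bare JDBC write.

```python
# Minimal PySpark sketch: read a Hive table on the Hadoop cluster and push it
# to SAP HANA over JDBC. Connection details below are placeholder assumptions.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hadoop-to-hana-transfer")
         .enableHiveSupport()          # read Hive-managed tables directly
         .getOrCreate())

# Source: a Hive table in the Hadoop cluster (name is illustrative).
sales = spark.table("analytics.sales_facts")

# Sink: a HANA table reached over JDBC (URL, table and credentials assumed).
(sales.write
      .format("jdbc")
      .option("url", "jdbc:sap://hana-host:30015")
      .option("driver", "com.sap.db.jdbc.Driver")
      .option("dbtable", "ANALYTICS.SALES_FACTS")
      .option("user", "SPARK_USER")
      .option("password", "***")
      .mode("append")
      .save())
```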

SAP has also partnered with Cloudera to provide solutions in which SAP HANA and Apache Hadoop work together. SAP's main motive for embracing Hadoop is easy connectivity to data, regardless of whether it comes from SAP software or from any other vendor.

Large organizations that run SAP Analytics through SAP BI Platform, SAP Predictive Analytics and SAP Lumira can directly connect to an enterprise data hub based on Apache Hadoop to store huge volumes of data reliably and cost-effectively, via a direct connection to Cloudera Impala, a leading interactive analytic database for Apache Hadoop.