Sumit Pawar

Big Data Analytics & Business Analytics

  • New Delhi, Delhi
  • 8930050050
  • sumitpwr7@gmail.com

Summary

I am a Big Data Analytics and Business Analytics consultant with around 2 years of IT experience.

  • Sound knowledge of analytics, Big Data and Hadoop, software engineering, and database management.
  • Understanding of programming languages and operating systems.

Career Objective

I am looking for a challenging and rewarding opportunity with a reputable organization that recognizes and utilizes my true potential while nurturing my analytical skills in the field of Big Data Analytics and Business Analytics.

Work History

Big Data Hadoop Developer

Dec 2013 - Present
Primologic System Pvt. Ltd.
  • Hands-on experience implementing a Hadoop cluster and various ecosystem tools for analysis on the Hadoop framework.
  • Moved all crawl-data flat files generated from various retailers to HDFS for further processing.
  • Developed MapReduce programs via Hive and Pig to validate and cleanse data in HDFS, obtained from heterogeneous data sources, to make it suitable for analysis.
  • Created Hive tables, internal or external as required, defined with appropriate static and dynamic partitions for efficiency.
  • Worked extensively with Sqoop to import and export data between MySQL and HDFS/Hive (a sample import command appears after this list).
  • Loaded and transformed large sets of structured and semi-structured data using Hive.
  • Connected Hive to the Tableau reporting tool and generated graphical reports.
  • Wrote complex Apache Pig scripts to process HDFS data.
  • Deployed and configured Flume agents to stream log events into HDFS for analysis.
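
A minimal sketch of the kind of Sqoop import used on this project; the connection string, database, and table names are illustrative assumptions:

    # Hypothetical host/db/table names -- for illustration only.
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/retail_db \
      --username retail_user -P \
      --table crawl_data \
      --hive-import --hive-table crawl_data \
      --num-mappers 4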

Project: Online Shopping

Dec 2013 - Present
Tools: Hadoop-2.7.0, HDFS, HIVE, SQOOP, PIG

Project: Data Analysis of Dummy Sales Data

Nov 2013 - Feb 2014
Tools: Hadoop-2.7.0, HDFS, HIVE, SQOOP, PIG
Client: In-house

Description: The project analyzed a large volume of sales data and developed a high-throughput method to process it and obtain the expected results.

Responsibilities:
  • Set up a Hadoop cluster with HA using the hadoop-2.7.0 release.
  • Installed and configured Hive with a remote metastore via the JDBC connector and MySQL.
  • Uploaded the data to HDFS so it could be accessed by both Pig and Hive.
  • Wrote Hive queries to count the total number of customers and accounts, and to extract the monthly sales of each product (see the query sketch after this entry).
  • Wrote Pig scripts to list customers whose monthly shopping is up to 10 thousand, and to analyze product sales.
  • Created Hive tables, then loaded and analyzed data using Hive queries.
  • Retrieved customer names grouped by location.
  • Performed data analytics on the sales data, developing MapReduce code or equivalent Pig/Hive code.
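
A sketch of the Hive queries described above; the table and column names are assumptions, not the project's actual schema:

    # Hypothetical sales schema -- customer/account counts and monthly sales per product.
    hive -e "
      SELECT COUNT(DISTINCT customer_id) AS total_customers,
             COUNT(DISTINCT account_id)  AS total_accounts
      FROM sales;

      SELECT product, SUBSTR(sale_date, 1, 7) AS sale_month, SUM(amount) AS monthly_sales
      FROM sales
      GROUP BY product, SUBSTR(sale_date, 1, 7);
    "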

Education

Bachelor of Technology (Information Technology)

Aug 2009 - Sep 2013
Arravali College of Engineering and Management

During my technical education I studied basic and advanced concepts of software engineering such as data structures & algorithms, OOPS, DBMS, Internet fundamentals, software testing, RAD, and the C language.

  • My final project was Online Examination, which contains different modules for organizing exams. The application conducts online exams for performance testing of students; new questions can be added, but only by the admin. On completing a test, the result shows the number of right answers, wrong answers, and unattempted questions.
  • My final-year presentation was on Blu-ray Disc.

B.Tech.

2008 - 2009
Maharishi Dayanand University

Intermediate

May 2007 - Feb 2008
Government Senior Secondary School, Bahadurgarh


Skills

Hadoop-2.7.0

  • Configured the Hadoop cluster in Local (Standalone), Pseudo-Distributed, and Fully-Distributed modes using the Apache and Cloudera distributions (with Cloudera Manager) on Ubuntu 14.04.
  • Configured different HDFS properties according to cluster requirements.
  • Commissioned and decommissioned nodes to scale the cluster out (a decommissioning sketch follows this list).
  • Hands-on experience in configuring, developing, monitoring, debugging, and performance-tuning Big Data applications using Hadoop.
  • Configured high availability of the cluster using QJM and ZooKeeper.
  • Wrote MapReduce jobs to analyze HDFS data.
  • Worked with multiple input formats such as TextInputFormat, KeyValueTextInputFormat, SequenceFileInputFormat, and NLineInputFormat.
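
A minimal sketch of node decommissioning as described above; the hostname and exclude-file path are assumptions:

    # Assumes dfs.hosts.exclude in hdfs-site.xml already points at this file.
    echo "datanode3.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes    # NameNode begins replicating blocks off the node
    hdfs dfsadmin -report          # verify the node reports as decommissioning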

Hive

  • Set up the HiveQL engine on the Hadoop cluster for storing and analyzing data in tabular form.
  • Created tables, internal as well as external, according to the importance of the data.
  • Loaded data into tables from the local filesystem or from HDFS.
  • Configured an RDBMS as the metastore for the Hive framework.
  • Created static as well as dynamic partitions in tables to increase the performance of Hive queries (see the DDL sketch after this list).
  • Created bucketing in tables on columns with unique values.
  • Performed extensive ETL loading on structured data using Hive.
  • Experience performing different types of JOINs in Hive.
  • Worked with complex data types (Array, Map, and Struct) in Hive.
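
A sketch of the partitioned, bucketed table DDL described above; all table, column, and path names are illustrative assumptions:

    # Hypothetical schema -- demonstrates external tables, dynamic partitions, and bucketing.
    hive -e "
      CREATE EXTERNAL TABLE sales (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
      )
      PARTITIONED BY (sale_date STRING)
      CLUSTERED BY (customer_id) INTO 16 BUCKETS
      STORED AS ORC
      LOCATION '/warehouse/sales';

      SET hive.exec.dynamic.partition.mode=nonstrict;
      INSERT INTO TABLE sales PARTITION (sale_date)
      SELECT order_id, customer_id, amount, sale_date FROM staging_sales;
    "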

PIG

  • Set up the Pig execution engine on the Hadoop cluster; Pig runs its scripts internally as MapReduce jobs.
  • Wrote Apache Pig scripts for data analysis on top of HDFS files.
  • Parsed JSON and XML files in Pig using Pig loader functions, and extracted meaningful information from Pig relations by providing a regex to Pig's built-in functions.
  • Extracted data from various log files and analyzed them using Pig's complex data types (a script sketch follows this list).
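
A minimal Pig sketch of the log analysis described above; the input path and field layout are assumptions:

    # Counts ERROR lines per day from tab-separated logs (hypothetical layout).
    pig -e "
      logs   = LOAD '/data/logs' USING PigStorage('\t')
               AS (ts:chararray, level:chararray, msg:chararray);
      errors = FILTER logs BY level == 'ERROR';
      by_day = GROUP errors BY SUBSTRING(ts, 0, 10);
      counts = FOREACH by_day GENERATE group AS day, COUNT(errors) AS n;
      DUMP counts;
    "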

Sqoop

  • Extracted data from RDBMSs (Oracle, MySQL, etc.) into Hadoop using Sqoop.
  • Exported data back from HDFS to the RDBMS after filtering it according to client requirements (a sample command follows this list).
  • Executed different operations against the RDBMS using Sqoop.
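
A sketch of the Sqoop export described above, with assumed connection, table, and directory names:

    # Hypothetical names -- pushes filtered HDFS output back to MySQL.
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/reports \
      --username report_user -P \
      --table monthly_sales \
      --export-dir /warehouse/monthly_sales \
      --input-fields-terminated-by '\t'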

Flume

  • Configured the required files on the Hadoop framework for executing Flume agents.
  • Wrote agent configurations in Flume to control the flow of data (an example configuration follows this list).
  • Configured file-based source types for Flume agents.
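
A sketch of a Flume agent configuration that tails an application log into HDFS; the agent name, file paths, and log location are assumptions:

    # tail-agent.conf -- hypothetical single agent: exec source -> memory channel -> HDFS sink
    a1.sources  = r1
    a1.channels = c1
    a1.sinks    = k1
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /var/log/app/app.log
    a1.channels.c1.type = memory
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
    a1.sinks.k1.hdfs.useLocalTimeStamp = true
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

    # Start the agent (shell):
    flume-ng agent --name a1 --conf-file tail-agent.conf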

Hortonworks Sandbox 2.2

  • Good exposure to the Hortonworks Sandbox using Oracle VirtualBox.
  • Analyzed large data files using the Hortonworks Sandbox (HCatalog, Pig, Hive, Flume).

Certifications

Big Data Analytics as Hadoop Developer

Aug 2013 - Dec 2013
Croma Campus
  • During my Big Data analytics course I learned different concepts of Big Data and also completed my internship as a Hadoop developer.
  • During my internship I worked on a sales-data project as a dummy project and performed different operations on it as on a real project.

National Cadet Corps

Apr 2005 - Nov 2006
NCC
  • I successfully completed my National Cadet Corps training during my schooling.

Personal Details

Sex                                                           Male

Nationality                                              Indian

Notice period                                         15 days

Languages known                              English, Hindi

Passport                                                  Yes