Apr 2013 - Present
Developed MapReduce programs in Hive and Pig to validate and cleanse data in HDFS, obtained from heterogeneous data sources, to make it suitable for analysis. Streamed data into HDFS from web servers using Flume. Designed and implemented Hive and Pig UDFs for evaluating, filtering, loading, and storing data. Created Hive tables as required, either internal or external, defined with appropriate static and dynamic partitions for efficiency. Wrote scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS. Implemented Lateral View in conjunction with UDTFs in Hive. Worked extensively with Sqoop to import and export data between MySQL and HDFS/Hive. Performed complex joins on Hive tables. Loaded and transformed large sets of structured and semi-structured data using Hive and Impala. Connected Hive and Impala to the Tableau reporting tool and generated graphical reports.
- Installing, configuring, and administering Hadoop clusters on major Hadoop distributions.
- Hands-on experience in writing MapReduce jobs in Java, Pig and Python.
- Working with Java, C++ and C.
- Extensive experience in designing analytical / OLAP and transactional / OLTP databases.
- Hadoop applications, including administration, configuration management, debugging, and performance tuning.
- Deploying applications in heterogeneous application servers.
- Creating web-based applications using ActiveX controls, JSP, and Servlets.
- Creating web pages using HTML, DHTML, JavaScript, VBScript, and CSS.