

Hi, I currently work as a Data Scientist at Elsevier, Amsterdam. Before that, I worked at Accenture Labs, Dublin, from February 2017 to December 2017 as a Research Associate in financial technology, on problems such as anti-money laundering and fraud detection. I finished my PhD at the School of Computing, Dublin City University (DCU), Dublin, Ireland. I was also a Senior Research Engineer on the CLIA project at Jadavpur University. My areas of interest are Natural Language Processing (NLP) and Machine Learning (ML). As an NLP and AI enthusiast, I always find it extremely interesting to apply AI/ML to solve real-world problems.

Work History

2018 Aug - Present

Data Scientist

Elsevier, Amsterdam

Projects: Biomedical Entity Detection, Auto-Structuring of Research Papers

Description: Applying state-of-the-art ML concepts to various domains and problems to produce scalable ML models. Creating and handling large datasets using Apache Spark and Databricks.

2018 Jan - Aug

Research Assistant

University College Dublin

Project: Intelligent Next Generation Anti-Money Laundering (AML) Research

Description: Using NLP and deep learning to help Accenture AI Labs and its bank clients develop automated anti-money laundering solutions that can increase detection accuracy and reduce human effort in AML investigations.

2017 Feb - Dec

Research Associate

Accenture Labs, Dublin

Project: Intelligent Anti-Money Laundering

Description: This project is a combination of decentralized NLP and Machine Learning (ML) modules that gather information and intelligence for financial organizations (banks/insurance) when targeted at existing and/or new clients. The information obtained indicates the risk/threat associated with an entity, i.e. the involvement or connection of that entity in financial scams, fraud, or money laundering.


Research Engineer

Jadavpur University, India

Designation: Senior Research Engineer

Project: Sandhan (Cross Lingual Information Access System) Phase - II, Sponsored By: Ministry of Communications & Information Technology, Government of India

Description: The project was a multilingual search engine for the tourism domain, based on Apache Lucene. My team was responsible for automatically generating query-focused summaries and snippets for web pages.




Dublin City University

Thesis: Processing of Code-mixed/language-mixed Social Media User Generated Content
Description: The aim of the research is to develop automatic natural language processing (NLP) software (e.g. a word-level language identifier and a part-of-speech (POS) tagger) for noisy multilingual social media user-generated content with the help of state-of-the-art machine learning algorithms.



Jadavpur University

Dissertation: A search engine prototype for topic-based sentimental tweets. The prototype consists of a keyword-based tweet crawler, a Conditional Random Field (CRF) based sentiment classifier and a Lucene-based search engine. The prototype also provides a user interface to summarize the search results.



West Bengal University of Technology

Final Year Project: Automated Class Schedule Management System
Description: This is class schedule management software for college administration. It takes information about teachers and subjects and generates a class schedule based on criteria (e.g. availability) provided by a user. The key technologies used to write the software are Java Swing and MySQL.

Selected Publications

Jingguang Han, Utsab Barman, Jeremiah Hayes, Jinhua Du, Edward Burgin, Dadong Wan, 2018. NextGen AML: Distributed Deep Learning based Language Technologies to Augment Anti Money Laundering Investigation. ACL 2018, System Demo.

Barman, U., Wagner, J. and Foster, J., 2016. Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline, Stacking and Joint Modelling. EMNLP 2016, p.30.

Barman, U., Das, A., Wagner, J. and Foster, J., 2014, October. Code mixing: A challenge for language identification in the language of social media. In Proceedings of the First Workshop on Computational Approaches to Code Switching (pp. 13-23).

Barman, U., Wagner, J., Chrupała, G. and Foster, J., 2014, October. DCU-UVT: Word-level language classification with code-mixed data. In Proceedings of the First Workshop on Computational Approaches to Code Switching (pp. 127-132).

Wagner, J., Arora, P., Cortes, S., Barman, U., Bogdanova, D., Foster, J. and Tounsi, L., 2014. DCU: Aspect-based polarity classification for SemEval task 4.

Pakray, P., Barman, U., Bandyopadhyay, S. and Gelbukh, A., 2012. Semantic answer validation using universal networking language. International Journal of Computer Science and Information Technologies, 3(4), pp.4927-4932.

Pakray, P., Barman, U., Bandyopadhyay, S. and Gelbukh, A., 2011. A statistics-based semantic textual entailment system. Advances in Artificial Intelligence, pp.267-276.

Das, A., Burman, U., Balamurali, A.R. and Bandyopadhyay, S., 2013. NER from Tweets: SRI-JU System @ MSM 2013. Making Sense of Microposts (#MSM2013).