Diptesh Kanojia

Latest

On Bashing Scripts to Libraries
Garlic Fever
Optimizing Large Language Models for Low-resource Quality Estimation
Findings of the WMT25 Shared Task on Automated Translation Evaluation Systems: Linguistic Diversity is Challenging and References Still Help
The Tabby Autocrat
ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models
Reference-Less Evaluation of Machine Translation: Navigating Through the Resource-Scarce Scenarios
Cyberbullying Detection via Aggression-Enhanced Prompting
The Mind's Eye: A Multi-Faceted Reward Framework for Guiding Visual Metaphor Generation
BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English
NEAR²: A Nested Embedding Approach to Efficient Product Retrieval and Ranking
Prompt-based Explainable Quality Estimation for English-Malayalam
From Search Relevance to Content Safety
Towards a Robust Framework for Multimodal Hate Detection: A Study on Video vs. Image-based Content
Automatically Generating Chinese Homophone Words to Probe Machine Translation Estimation Systems
Giving the Old a Fresh Spin: Quality Estimation-Assisted Constrained Decoding for Automatic Post-Editing
CAMU: Context Augmentation for Meme Understanding
Unsupervised Audio-Visual Segmentation with Modality Alignment
DGFM: Full Body Dance Generation Driven by Music Foundation Models
GCDance: Genre-Controlled 3D Full Body Dance Generation Driven By Music
Natural Language Processing for Dialects of a Language: A Survey
Connecting Ideas in Lower-Resource Scenarios: NLP for National Varieties, Creoles, and Other Low-resource Languages
Refer to the Reference: Reference-focused Synthetic Automatic Post-Editing Data Generation
When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages
PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation
AV-GS: Learning Material and Geometry aware Priors for Novel View Acoustic Synthesis
StableTalk: Advancing Audio-to-Talking Face Generation with Stable Diffusion and Vision Transformer
A Multi-task Learning Framework for Evaluating Machine Translation of Emotion-loaded User-generated Content
Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content?
Findings of the Quality Estimation Shared Task at WMT 2024: Are LLMs Closing the Gap in QE?
Centrality-aware Product Retrieval and Ranking
Together We Can: Multilingual Automatic Post-Editing for Low-Resource Languages
What do Large Language Models Need for Machine Translation Evaluation?
Product Retrieval and Ranking for Alphanumeric Queries
Experiences from Creating a Benchmark for Sentiment Classification for Varieties of English
Sampling Strategies for Creation of a Benchmark for Dialectal Sentiment Classification
Edit Distances and Their Applications to Downstream Tasks in Research and Commercial Contexts
Connecting Ideas in 'Lower-Resource' Scenarios: NLP for National Varieties, Creoles and Other Low-resource Scenarios
A Survey of Multimodal Sarcasm Detection
Optimizing Quality Estimation for Low-Resource Language Translations: Exploring the Role of Language Relatedness
Evaluating Machine Translation for Emotion-loaded User Generated Content (TransEval4Emo-UGC)
Decoding Cyberbullying on Social Media: A Machine Learning Exploration
Using character-level models for efficient abbreviation and long-form detection
DiffSED: Sound Event Detection with Denoising Diffusion
Google Translate Error Analysis for Mental Healthcare Information: Evaluating Accuracy, Comprehensibility, and Implications for Multilingual Healthcare Communication
Generative AI and Natural Language Processing
Airavata: Introducing Hindi Instruction-tuned LLM
CreoleVal: Multilingual Multitask Benchmarks for Creoles
APE-then-QE: Correcting then Filtering Pseudo Parallel Corpora for MT Training Data Creation
Findings of the WMT 2023 Shared Task on Automatic Post-Editing
Findings of the WMT 2023 Shared Task on Quality Estimation
Predict and Use: Harnessing Predicted Gaze to Improve Multimodal Sarcasm Detection
Quality Estimation-Assisted Automatic Post-Editing
SurreyAI 2023 Submission for the Quality Estimation Shared Task
Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection
Efficient and Explainable Information Retrieval
Challenges of Human vs Machine Translation of Emotion-Loaded Chinese Microblog Texts
Leveraging Foundation Models for Unsupervised Audio-Visual Segmentation
Modelling Political Aggression on Social Media Platforms
Towards Safer Communities: Detecting Aggression and Offensive Language in Code-Mixed Tweets to Combat Cyberbullying
A Multi-task Learning Framework for Quality Estimation
Evaluation of Chinese-English Machine Translation of Emotion-Loaded Microblog Texts: A Human Annotated Dataset for the Quality Assessment of Emotion Translation
Advances in Natural Language Processing @ Surrey
Workshop: Teaching Neural Machine Translation to Translators
Quality Estimation for Machine Translation
Applications and Challenges of Sentiment Analysis in Real-life Scenarios
Findings of the WMT 2022 Shared Task on Automatic Post-Editing
Findings of the WMT 2022 Shared Task on Quality Estimation
Utilizing Weak Supervision to Create S3D: A Sarcasm Annotated Dataset
Cognitive Natural Language Processing
Quality Estimation
Dialogue Processing: The role of NLP and building blocks of a chatbot
Harnessing Abstractive Summarization for Fact-Checked Claim Detection
Transformers: Perspectives from NLP
HiNER: A Large Hindi Named Entity Recognition Dataset
PLOD: An Abbreviation Detection Dataset for Scientific Documents
SURREY-CTS-NLP at WASSA2022: An Experiment of Discourse and Sentiment Analysis for the Prediction of Empathy, Distress and Emotion
Introducing Neural Machine Translation (NMT) to Translators
An Ensemble Approach to Acronym Extraction using Transformers
Quality Estimation for Machine Translation
Automated Evidence Collection for Fake News Detection
Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation
'So You Think You’re Funny?': Rating the Humour Quotient in Standup Comedy
FrameNet-assisted Noun Compound Interpretation
Cognition-aware Cognate Detection
Unsupervised Neural Machine Translation
Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages
A Survey on Using Gaze Behaviour for Natural Language Processing
Cognitively Aided Zero-Shot Automatic Essay Grading
Happy Are Those Who Grade without Seeing: A Multi-Task Learning Approach to Grade Essays Using Gaze Behaviour
Investigations into the use of Distributed Semantics for Cognate Detection and Computational Phylogenetics
"A Passage to India": Pre-trained Word Embeddings for Indian Languages
Challenge Datasets of Cognate and False Friend Pairs for Indian Languages
Recommendation Chart of Domains for Cross-Domain Sentiment Analysis: Findings of A 20 Domain Study
Strategies of Effective Digitization of Commentaries and Sub-commentaries: Towards the Construction of Textual History
"Keep Your Dimensions on a Leash": True Cognate Detection using Siamese Deep Neural Networks
Harnessing Deep Cross-lingual Word Embeddings to Infer Accurate Phylogenetic Trees
Utilizing Word Embeddings based Features for Phylogenetic Tree Generation of Sanskrit Texts
An Introduction to the Textual History Tool
Utilizing Wordnets for Cognate Detection among Indian Languages
Cognate Identification to improve Phylogenetic trees for Indian Languages
Some Strategies to Capture Karaka-Yogyata with Special Reference to apadana
Natural Language Processing and its intersection with Data Science
Eyes are the Windows to the Soul: Predicting the Rating of Text Quality Using Gaze Behaviour
Hindi Wordnet for Language Teaching: Experiences and Lessons Learnt
Indian Language Wordnets and their Linkages with Princeton WordNet
New Vistas to study Bhartṛhari: Cognitive NLP
pyiwn: A Python-based API to access Indian Language WordNets
Semi-automatic WordNet Linking using Word Embeddings
Synthesizing Audio for Hindi Wordnet
Is your Statement Purposeless? Predicting Computer Science Graduation Admission Acceptance based on Statement Of Purpose
Sarcasm Suite: A browser-based engine for sarcasm detection and generation
Scanpath Complexity: Modeling Reading Effort using Gaze Information
NLP Fundamentals at VIVA IET
A picture is worth a thousand words: Using OpenClipArt library for enriching IndoWordNet
Civique: Using Social Media to detect Urban Emergencies
Harnessing Cognitive Features for Sarcasm Detection
Leveraging Cognitive Features for Sentiment Analysis
Mapping it differently: A solution to the linking challenges
Predicting Readers' Sarcasm Understandability by Modeling Gaze Behavior
SlangNet: A WordNet like resource for English Slang
Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets
That’ll do fine!: A coarse lexical resource for English-Hindi MT, using polylingual topic models
TransChat: Cross-Lingual Instant Messaging for Indian Languages
Using Multilingual Topic Models for Improved Alignment in English-Hindi MT
World WordNet database structure: an efficient schema for storing information of WordNets of the world
Do not do processing, when you can look up: Towards a Discrimination Net for WSD
PaCMan: Parallel Corpus Management Workbench
PanchBhoota: Hierarchical phrase based machine translation systems for five Indian languages
More than meets the eye: Study of Human Cognition in Sense Annotation
A Study of the Sense Annotation Process: Man v/s Machine
Discrimination-net for Hindi