Dr Félix do Carmo
About
Biography
I am a Senior Lecturer in Translation and Natural Language Processing. I work on the application of natural language processing, machine learning, machine translation and assisted translation technologies in translation research, teaching and practice. Before joining the University of Surrey, I spent over 20 years as a translator, reviser, translation company owner, translator trainer, university lecturer and translation conference organiser in Porto, Portugal. In 2017, I moved to Ireland to work for two years as a full-time post-doctoral researcher at the ADAPT Centre at Dublin City University, where I carried out research on translation technology.
Research
Research interests
My research is framed by Applied Translation Studies, with a special focus on the automation of translation processes. My interests encompass translation process research, the complexity of translation data and human and economic factors of the translation industry, both in theoretical and empirical approaches.
Research projects
KAITER – Knowledge-Assisted Interactive Translation, Editing and Revising
KAITER aims at studying the development of a translation editor that supports the work of translators by estimating and suggesting their editing actions, integrated into a system that helps them manage their specialisation. This research is focused on the interconnection between tools used by translators and machine translation, a technology that tries to emulate their work. The objective of the project comes from the notion that, for us to leverage all the capabilities of this technology, it should be employed as a support tool for translators.
Supervision
Postgraduate research supervision
I am interested in supporting research projects that open up Translation Studies to challenging new areas, often crossing boundaries between disciplines, with a special focus on improving the interaction between people and technologies in multilingual contexts. I see my role as incentivising students and researchers to work on research that is applicable to real use cases, by developing acute observation and critical thinking skills, supported by strong theoretical knowledge and solid research methodologies.
I am currently co-supervising three PhD projects:
- Gustavo Zomer's PhD project: Translating science: developing a context-aware real-time writing assistant to support Brazilian researchers publishing in English (main supervisor: Dr Ana Frankenberg-Garcia)
- Eleanor Taylor-Stilgoe's PhD project: Use and Impact of Machine Translation in Healthcare Settings (main supervisor: Prof Constantin Orasan)
- Shenbin Qian's PhD project: Sentiment preservation in neural machine translation (main supervisor: Prof Constantin Orasan)
I supervised two MA dissertations in 2019/2020, one of which was awarded the prize for best dissertation in the MA Translation that year, and three MA dissertations in 2020/2021.
Teaching
Current year (2021-2022):
- Module Leader: Translation as Human-Computer Interaction (TRAM476) - 1st semester
- Module Leader: Professional Translation Practice II (TRAM494) - 2nd semester
I also collaborate in the following modules:
- Principles and Challenges of Translation and Interpreting (TRAM495) - 1st semester
- Business and Management in Translation (TRAM499) - 2nd semester
- Smart Technologies for Translation (TRAM502) - 2nd semester
Previous years:
- Specialist Translation I - Portuguese (TRAM488)
- Specialist Translation II - Portuguese (TRAM489)
- Translation Technologies (TRAM428)
- Business and Industry Aspects of the Translation Profession (TRAM476)
Publications
Highlights
Chapters in books
do Carmo, Félix, and Belinda Maia. 2015. “Sleeping with the enemy? Or should translators work with Google Translate?” In Pilar Sánchez-Gijón, Olga Torres-Hostench, Bartolomé Mesa-Lao (eds). Conducting Research in Translation Technologies. New Trends in Translation Studies. vol. 13. Peter Lang.
Articles in peer-reviewed journals
Conference proceedings
Shterionov, Dimitar, Félix do Carmo, and Joachim Wagner. 2019. “APE through Neural and Statistical MT with Augmented Data - ADAPT/DCU Submission to the WMT 2019 APE Shared Task.” In Proceedings of ACL 2019 - WMT Shared Task on Automatic Post-Editing. Firenze, Italy.
Shterionov, Dimitar, Félix do Carmo, Joss Moorkens, Eric Pacquin, Dag Schmidtke, Declan Groves, and Andy Way. 2019. “When Less Is More in Neural Quality Estimation of Machine Translation - an Industry Case.” In Proceedings of the MT Summit XVII. Dublin.
do Carmo, Félix. 2019. “Edit distances do not describe editing, but they can be useful for translation process research.” In Carl, M. and Hansen-Schirra, S. (eds). Proceedings of the 2nd MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production. Dublin, Ireland. pp. 1–2. (Abstract)
do Carmo, Félix. 2018. “Does Machine Translation Really Produce Translations?” In Proceedings of the 21st Annual Conference of the European Association for Machine Translation - Translator’s Track, edited by Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Miquel Esplà-Gomis, Maja Popović, Célia Rico Pérez, André Martins, Joachim Van den Bogaert, and Mikel L. Forcada. Alicante, Spain: EAMT. p. 323. (Abstract)
do Carmo, Félix, Luís Trigo, and Belinda Maia. 2016. “From CATs to KATs.” In Proceedings of the 38th Conference Translating and the Computer. London, UK: Editions Tradulex, Geneva. pp. 149–158.
do Carmo, Félix, and Belinda Maia. 2016. “A Description of Post-Editing, from Translation Studies to Machine Learning.” In Tradumàtica Research Group (eds.). Translators and Machine Translation: Book of presentations. Barcelona, Spain. pp. 126-152.
This dataset includes the data collected as part of the project JAS, a study on the effects of automation of a job allocation system in a translation services company. The dataset includes counts of answers to a questionnaire completed by 38 participants. Answers are classified according to closed classes in closed questions, and thematic codes in answers to open questions. See the readme file for more details and read the article "From responsibilities to responsibility: a study of the effects of translation workflow automation", due to be published in JoSTrans, Issue 40 (July 2023).
This article uses a multi-faceted approach to discuss the relation between time, money and different perspectives that help define the value of professional translation. It challenges the narratives created by the translation industry on post-editing as a revision of pre-translated content, confronting them with the detailed description of the task in industry standards and with the reality of translators' work. The article also addresses the different roles that time plays as an instrument of analysis and evaluation of translation, and as a fundamental factor in the definition of labour relations in the translation market. The main claim of the article is that translation is increasingly specialised high-value work, requiring translators that are able to make complex and efficient decisions, especially when they are expected to work under time restrictions, with the support of content that has been previously processed by machine translation.
Translation Revision and Post-editing looks at the apparently dissolving boundary between correcting translations generated by human brains and those generated by machines. It presents new research on post-editing and revision in government and corporate translation departments, translation agencies, the literary publishing sector and the volunteer sector, as well as on training in both types of translation checking work. This collection includes empirical studies based on surveys, interviews and keystroke logging, as well as more theoretical contributions questioning such traditional distinctions as translating versus editing. The chapters discuss revision and post-editing involving eight languages: Afrikaans, Catalan, Dutch, English, Finnish, French, German and Spanish. Among the topics covered are translator/reviser relations and revising/post-editing by non-professionals. The book is key reading for researchers, instructors and advanced students in Translation Studies as well as for professional translators with a special interest in checking translations.
This paper summarises the submissions our team, SURREY-CTS-NLP, has made for the WASSA 2022 Shared Task for the prediction of empathy, distress and emotion. In this work, we tested different learning strategies, like ensemble learning and multi-task learning, as well as several large language models, but our primary focus was on analysing and extracting emotion-intensive features from both the essays in the training data and the news articles, to better predict empathy and distress scores from the perspective of discourse and sentiment analysis. We propose several text feature extraction schemes to compensate for the small number of training examples available for fine-tuning pretrained language models, including methods based on Rhetorical Structure Theory (RST) parsing, cosine similarity and sentiment score. Our best submissions achieve an average Pearson correlation score of 0.518 for the empathy prediction task and an F1 score of 0.571 for the emotion prediction task, indicating that using these schemes to extract emotion-intensive information can help improve model performance.
Linguistik International 2020. Although emotions are universal concepts, transferring the different shades of emotion from one language to another may not always be straightforward for human translators, let alone for machine translation systems. Moreover, the cognitive states are established by verbal explanations of experience which is shaped by both the verbal and cultural contexts. There are a number of verbal contexts where expression of emotions constitutes the pivotal component of the message. This is particularly true for User-Generated Content (UGC) which can be in the form of a review of a product or a service, a tweet, or a social media post. Recently, it has become common practice for multilingual websites such as Twitter to provide an automatic translation of UGC to reach out to their linguistically diverse users. In such scenarios, the process of translating the user's emotion is entirely automatic with no human intervention, either for post-editing or for accuracy checking. In this research, we assess whether automatic translation tools can be a successful real-life utility in transferring emotion in user-generated multilingual data such as tweets. We show that there are linguistic phenomena specific to Twitter data that pose a challenge in translation of emotions in different languages. We summarise these challenges in a list of linguistic features and show how frequent these features are in different language pairs. We also assess the capacity of commonly used methods for evaluating the performance of an MT system with respect to the preservation of emotion in the source text.
The literature on translation and technology has generally taken two forms: general overviews, in which the tools are described, and functional descriptions of how such tools and technologies are implemented in specific projects, often with a view to improving the quality of translator training. There has been far less development of the deeper implications of technology in its cultural, ethical, political and social dimensions. In an attempt to address this imbalance, the present volume offers a collection of articles, written by leading experts in the field, that explore some of the current communicational and informational trends that are defining our contemporary world and impinging on the translation profession. The contributions have been divided into three main areas in which translation and technology come together: (1) social spheres, (2) education and training and (3) research. This volume represents a bold attempt at contextualizing translation technologies and their applications within a broader cultural landscape and encourages intellectual reflection on the crucial role played by technology in the translation profession.
This article presents a review of the evolution of automatic post-editing, a term that describes methods to improve the output of machine translation systems, based on knowledge extracted from datasets that include post-edited content. The article describes the specificity of automatic post-editing in comparison with other tasks in machine translation, and it discusses how it may function as a complement to them. Particular detail is given in the article to the five-year period that covers the shared tasks presented in WMT conferences (2015–2019). In this period, discussion of automatic post-editing evolved from the definition of its main parameters to an announced demise, associated with the difficulties in improving output obtained by neural methods, which was then followed by renewed interest. The article debates the role and relevance of automatic post-editing, both as an academic endeavour and as a useful application in commercial workflows.
In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional, statistical, ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data makes it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE "roadmap" to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly-proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT.
The amazing capacities of machine translation are supported by very rigorous and powerful research. However, science is also discourse, and sometimes scientific discourse creates myths, beliefs that are based on how terms and concepts may be used in scientific publications with no proper debate or understanding. In this lecture, I will present a critical view of three of the most influential papers from machine translation research, not criticising their scientific validity, but highlighting how their use of terms and concepts helped create myths around the power of machine translation. My perspective is that translation is much more complex than what common discourses about machine translation convey, and that we are losing sight of that complexity when we focus on the scientific achievements. My objective is to contribute to real convergence between machine translation research and translation studies by presenting a view that aims at solving current limitations of discussions about translation. I believe that real convergence can only be fruitful if translation studies contributes to the debate, bringing with it the power of a rich legacy of theories and practices that help us all understand the complexity of translation.
Additional publications
In Portuguese:
do Carmo, Félix. 2004. Saberes e Criatividade em acção na Tradução Técnica (Knowledge and Creativity in Technical Translation). in Revista Génesis – revista científica do ISAI. (2004_3). Porto: ISAI.
do Carmo, Félix. 2002. De Formando a Formador, passando por ‘Formado em Tradução’ (From trainee to trainer, via ‘someone with translation training’) in Actas do V Seminário de Tradução Científica e Técnica em Língua Portuguesa - Novos Empregos para os Tradutores?. Lisboa: União Latina.
do Carmo, Félix. 2000. “Onde está a Saída? - Perspectivas Profissionais do Curso de Tradução da FLUP” (Where is the exit? Professional Perspectives for the Translation degree of FLUP) in Actas do II Seminário de Tradução Científica e Técnica em Língua Portuguesa – A tradução científica e técnica em língua portuguesa. Lisboa: União Latina.