د. هدى بوعمر


Houda Bouamor, Ph.D.


Visiting Assistant Professor
Computer Science Department
Carnegie Mellon University, Qatar


Research Interests

Natural Language Processing (NLP)
Paraphrase Acquisition
Monolingual Alignment
Paraphrase Validation in Context
Contextual Targeted Paraphrasing
Machine Learning


Carnegie Mellon University
Qatar Foundation
Doha, State of Qatar

hbouamor [at] cmu.edu



I am a Visiting Assistant Professor in the Computer Science Department at CMU-Q. I am also part of the Natural Language Processing Lab, where I work on Arabic NLP with Kemal Oflazer. I received my Ph.D. from Paris-Sud University, France, where I worked on paraphrasing within computational linguistics, advised by Anne Vilnat and Aurélien Max. I hold an M.Sc. in Computer Science from Paris-Est Marne-La-Vallée University and a bachelor's degree in Computer Science from the University of Manouba, Tunisia. I have worked on several projects addressing a range of NLP problems. My main research interest revolves around Statistical Machine Translation.

Curriculum Vitae » Research Statement »
Dissertation » Abstract »


Projects

QALB

QALB (Qatar Arabic Language Bank) is a joint project between our group and Nizar Habash and his colleagues at Columbia University. The project aims to build a large corpus of manually corrected Arabic text that can be used to build automatic correction tools for Arabic. It also includes research on statistical techniques for automatic correction of Arabic text.

Kemal Oflazer, Behrang Mohit, Houda Bouamor, Wajdi Zaghouani, Ossama Obeid

Wiki TEA

Wiki TEA (Wikipedia Translation - English to Arabic) is a project focused on creating techniques, tools, and resources for enhancing and expanding the Arabic Wikipedia through statistical machine translation.

Behrang Mohit, Houda Bouamor, Mahmoud Azab, Kemal Oflazer, Ossama Obeid, Wajdi Zaghouani


OptDiac

OptDiac: An Optimal Diacritization Scheme for Arabic Orthographic Representation. The overarching objective is to improve Arabic NLP in general and to improve readability and comprehension rates for Arabic text, thereby potentially having an impact on literacy in the Arab world and creating principled writing standards that extend to the dialects. We believe that some form of partial diacritization can achieve these two goals. However, we do not hypothesize that the same partial diacritization scheme will be maximally useful for both areas.

Mona Diab, Kemal Oflazer, Houda Bouamor, Wajdi Zaghouani, Zeinab Ibrahim


English-Arabic SMT

Learning from Comparable Corpora for Improved English-Arabic Statistical Machine Translation, funded by QNRF (12/2010 – 11/2013).


Upcoming Project:
MADAR: Multi-Arabic Dialect Applications and Resources

Our main objective in this project is to improve Dialectal Arabic NLP in general. Specifically, we aim to develop a suite of four novel multi-dialectal resources, which will be used to conduct original research on two applications that are valuable enabling technologies for future research in Arabic NLP.



The question of whether machines can think is about as relevant as the question of whether submarines can swim.

— Edsger Dijkstra, 1984


Publications

Google Scholar »

...


Do not be ashamed to ask about something you do not know; it is better to be ignorant once than to remain ignorant for your whole life.

– Yusuf al-Sibai


© Copyright 2014 Houda Bouamor, Ph.D.