We do research in all areas related to speech technology: speech synthesis and recognition, speech coding, speech pre- and post-processing, emotion recognition from speech, and unsupervised speech segmentation and analysis.
Other interests relate to natural language processing and multimedia analysis.
We are constantly looking for witty, curious people to join our group. Our faculty offers full-time and part-time doctoral studies in the domain of Electronics and Telecommunications Engineering. Funding can be obtained either from the Ministry of National Education or from other research projects.
To do a PhD in Speech Processing, get in touch with us!
The Technical University of Cluj-Napoca has several Masters programs that might interest you. More info here: Postgraduate studies.
To do a Masters in Electronics and Telecommunications, you can get more info here: Master Studies-ETTI*.
The Multimedia Technologies Masters includes two speech processing related topics: 1) Speech Coding Techniques and 2) Speech Analysis, Synthesis and Recognition.
*So far, all current Masters programs are taught in Romanian.
SINTERO Project Started (March, 2018)
The overall objective of SINTERO is to create a Romanian text-to-speech synthesis system that allows prosody (the intonation of speech) to be modelled and controlled in a way that is close to natural speech. Alongside this objective, the project aims to create as many synthesised Romanian voices as possible (at least 10 voices in this project), so that they can be used by a wide community, including in commercial applications. WEBPAGE.
SWARA Corpus Released (June, 2017)
The SWARA Corpus is a result of the SWARA Project, funded by the Romanian Ministry of Education under grant agreement PN-II-PT-PCCA-2013-4 No 6/2014. The corpus contains over 21 hours of high-quality recordings from 17 different speakers. The data is segmented into 19,279 utterances and includes their orthographic transcripts and semi-automatic phone-level alignments. WEBPAGE.
MaRePhoR Lexicon Released (June, 2017)
An Open Access Machine-Readable Phonetic Dictionary for Romanian: The dictionary consists of 72,375 words and 591,570 letters. Its entries are words from the Romanian Scrabble Association's official word list, together with the entries of a 15,517-word dictionary developed according to the SpeechDat specifications. The phonetic transcriptions are in SAMPA format. WEBPAGE.
ALISA Tool Released (June, 2017)
ALISA uses a two-step approach for the task of aligning speech with imperfect transcripts: 1) sentence-level speech segmentation and 2) sentence-level speech and text alignment. Both processes are fully automated and require as little as 10 minutes of manually labelled speech: inter-sentence silence segments for the segmentation, and orthographic transcripts of these sentences for the aligner. The tool can be applied to any language with an alphabetic writing system and can align up to 75% of the original data with a sentence error rate of less than 8% and a word error rate of less than 1%. WEBPAGE.
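For readers unfamiliar with the two error rates quoted above, the following is a minimal sketch of how they are commonly defined. It is not taken from ALISA itself: the function names are hypothetical, the sample sentences are made up, and the exact scoring used in the ALISA evaluation may differ (for example in text normalisation).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(refs, hyps):
    """Total word-level edits divided by the number of reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words


def sentence_error_rate(refs, hyps):
    """Fraction of sentences that contain at least one word error."""
    wrong = sum(r.split() != h.split() for r, h in zip(refs, hyps))
    return wrong / len(refs)


# Made-up example: the second aligned transcript is missing one word.
refs = ["the cat sat", "a dog barked loudly"]
hyps = ["the cat sat", "a dog barked"]
print(word_error_rate(refs, hyps), sentence_error_rate(refs, hyps))
```

In this toy example one of the seven reference words is wrong, and one of the two sentences contains an error, which shows why the sentence error rate is usually much higher than the word error rate.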
MARA Corpus Released (April, 2013)
Mr. Mihai Nae from Cartea Sonora has kindly released a complete professional audiobook recording for use in speech processing research for Romanian. You can download it from here: WEBPAGE.
Congratulations to Dr. Mihai ORDEAN (November, 2012)
Mihai Ordean successfully defended his PhD thesis, entitled "Secure Authentication using One-Time Visual Passwords", in a public defense. WEBPAGE.
2020
Beáta Lőrincz, Maria Nutu, Adriana Stan, Mircea Giurgiu, "An Evaluation of Postfiltering for Deep Learning Based Speech Synthesis with Limited Data", IEEE 10th International Conference on Intelligent Systems (IS), Bulgaria, 2020 [pdf]
Beáta Lőrincz, "Concurrent phonetic transcription, lexical stress assignment and syllabification with deep neural networks", Proceedings of the 24th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems KES2020, 2020 [pdf]
Adriana Stan, "RECOApy: Data Recording, Pre-Processing and Phonetic Transcription for End-to-End Speech-Based Applications", In Proceedings of the Interspeech, Shanghai, China, 2020 [pdf]
Kristen M Scott, Simone Ashby, Adriana Stan, "Designing a Synthesized Content Feed System for Community Radio", Proceedings of the 11th Nordic Conference on Human-Computer Interaction: Shaping Experiences, Shaping Society, Estonia, 2020 [pdf]
2019
Adriana Stan, "Input Encoding for Sequence-to-Sequence Learning of Romanian Grapheme-to-Phoneme Conversion", In Proceedings of the 10th IEEE International Conference on Speech Technology and Human-Computer Dialogue (SpeD), Timisoara, Romania, 2019. [bib] [pdf]
Beata Lorincz, Maria Nutu, Adriana Stan, "Romanian Part of Speech Tagging using LSTM Networks", In Proceedings of the IEEE 15th International Conference on Intelligent Computer Communication and Processing, Cluj-Napoca, Romania, 2019. [bib] [pdf]
Maria Nutu, Beata Lorincz, Adriana Stan, "Deep Learning for Automatic Diacritics Restoration in Romanian", In Proceedings of the IEEE 15th International Conference on Intelligent Computer Communication and Processing, Cluj-Napoca, Romania, 2019. [bib] [pdf]
David A. Braude, Matthew P. Aylett, Caoimhin Laoide-Kemp, Simone Ashby, Kristen M. Scott, Brian O Raghallaigh, Anna Braudo, Alex Brouwer, Adriana Stan, "All Together Now: The Living Audio Dataset", In Proceedings of Interspeech, Graz, Austria, 2019. [bib] [pdf]
2018
Adriana Stan, Mircea Giurgiu, "A Comparison Between Traditional Machine Learning Approaches And Deep Neural Networks For Text Processing In Romanian", In Proceedings of the 13th International Conference on Linguistic Resources and Tools for Processing Romanian Language (ConsILR), Jassy, Romania, 2018. [bib] [pdf]
2017
Adriana Stan, Florina Dinescu, Cristina Tiple, Serban Meza, Bogdan Orza, Magdalena Chirila, Mircea Giurgiu, "The SWARA Speech Corpus: A Large Parallel Romanian Read Speech Dataset", In Proceedings of the 9th Conference on Speech Technology and Human-Computer Dialogue (SpeD), Bucharest, Romania, 2017. [bib] [pdf]
Stefan-Adrian Toma, Adriana Stan, Mihai-Lica Pura, Traian Barsan, "MaRePhoR - An Open Access Machine-Readable Phonetic Dictionary for Romanian", In Proceedings of the 9th Conference on Speech Technology and Human-Computer Dialogue (SpeD), Bucharest, Romania, 2017. [bib] [pdf]
2016
Adriana Stan, Cassia Valentini-Botinhao, Bogdan Orza, Mircea Giurgiu, "Blind Speech Segmentation using Spectrogram-image Based Features and Mel Cepstral Coefficients", In Proc. IEEE Workshop on Spoken Language Technology, San Diego, USA, 2016. [bib] [pdf]
Alexandru Moldovan, Adriana Stan, Mircea Giurgiu, "Improving Sentence-level Alignment of Speech with Imperfect Transcripts using Utterance Concatenation and VAD", In Proc. of IEEE ICCP, Cluj-Napoca, Romania, 2016. [bib] [pdf]
Adriana Stan, Yoshitaka Mamiya, Junichi Yamagishi, Peter Bell, Oliver Watts, Rob Clark, Simon King, "ALISA: An automatic lightly supervised speech segmentation and alignment tool", In Computer Speech and Language, vol. 35, pp. 116-133, 2016. [bib] [pdf] [doi]
2015
Adriana Stan, Cassia Valentini-Botinhao, Mircea Giurgiu, Simon King, "Phonetic Segmentation of Speech using STEP and t-SNE", In Proc. of the 8th International Conference on Speech Technology and Human-Computer Dialogue (SpeD), Bucuresti, Romania, 2015. [bib] [pdf]
2014
Jószef Domokos, Adriana Stan, Mircea Giurgiu, "An Approach to Lexical Stress Detection from Transcribed Continuous Speech Using Acoustic Features", In Proc. 22nd Telecommunications Forum, Belgrade, Serbia, 2014. [bib] [pdf]
Dhananjaya Gowda, Heikki Kallasjoki, Reima Karhila, Cristian Contan, Jalle Palomaki, Mircea Giurgiu, Mikko Kurimo, "On the Role of Missing Data Imputation and NMF Feature Enhancement in Building Synthetic Voices Using Reverberant Speech", In Proc. Interspeech, Singapore, 2014. [bib]
O. Watts, S. Gangireddy, J. Yamagishi, S. King, S. Renals, A. Stan, M. Giurgiu, "Neural Net Word Representations for Phrase-Break Prediction Without a Part of Speech Tagger", In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, 2014. [bib] [pdf]
Tiberiu Boroș, Adriana Stan, Oliver Watts, Stefan Daniel Dumitrescu, "RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus", In Proc. The 9th edition of the Language Resources and Evaluation Conference, Reykjavik, Iceland, 2014. [bib] [pdf]
2013
Adriana Stan, Peter Bell, Junichi Yamagishi, Simon King, "Lightly Supervised Discriminative Training of Grapheme Models for Improved Sentence-level Alignment of Speech and Text Data", In Proc. Interspeech, 2013. [bib]
Y. Mamiya, A. Stan, J. Yamagishi, P. Bell, O. Watts, R.A.J. Clark, S. King, "Using Adaptation to Improve Speech Transcription Alignment in Noisy and Reverberant Environments", In Proc. SSW8, 2013. [bib]
O. Watts, A. Stan, R. Clark, Y. Mamiya, M. Giurgiu, J. Yamagishi, S. King, "Unsupervised and lightly-supervised learning for rapid construction of TTS systems in multiple languages from ‘found’ data: evaluation and analysis", In Proc. SSW8, 2013. [bib]
O. Watts, A. Stan, Y. Mamiya, A. Suni, M. Burgos, J.M. Montero, "The Simple4All entry to the Blizzard Challenge 2013", In Proc. Blizzard Challenge 2013, 2013. [bib]
A. Stan, O. Watts, Y. Mamiya, M. Giurgiu, R. A. J. Clark, J. Yamagishi, S. King, "TUNDRA: A Multilingual Corpus of Found Data for TTS Research Created with Light Supervision", In Proc. Interspeech, 2013. [bib]
Yoshitaka Mamiya, Junichi Yamagishi, Oliver Watts, Robert A.J. Clark, Simon King, Adriana Stan, "Lightly Supervised GMM VAD to use Audiobook for Speech Synthesiser", In Proc. ICASSP, 2013. [bib]
Ioana Muresan, Adriana Stan, Mircea Giurgiu, Rodica Potolea, "Evaluation of Sentiment Polarity Prediction using a Dimensional and a Categorical Approach", In Proc. SPED, 2013. [bib]
2012
Adriana Stan, Peter Bell, Simon King, "A Grapheme-based Method for Automatic Alignment of Speech and Text Data", In Proc. IEEE Workshop on Spoken Language Technology, Miami, Florida, USA, 2012. [bib]
M. Giurgiu, A. Kabir, "Automatic transcription and speech recognition of Romanian corpus RO-GRID", In Telecommunications and Signal Processing (TSP), 2012 35th Intl Conf on, pp. 465-468, 2012. [bib] [doi]
M. Ordean, M. Giurgiu, "Towards securing client-server connections against man-in-the-middle attacks", In Electronics and Telecommunications (ISETC), 2012 10th International Symposium on, pp. 127-130, 2012. [bib] [doi]
2011
M. Giurgiu, A. Kabir, "Improving automatic speech recognition in noise by energy normalization and signal resynthesis", In Intelligent Computer Communication and Processing (ICCP), 2011 IEEE Intl Conference on, pp. 311-314, 2011. [bib] [doi]
M. Giurgiu, A. Kabir, "Comparison of Vocal Tract Length Normalization technique applied for clean and noisy speech", In Telecommunications and Signal Processing (TSP), 2011 34th Intl Conference on, pp. 351-354, 2011. [bib] [doi]
Z.I. Kiss, Z.A. Polgar, M. Giurgiu, V. Dobrota, "Resource efficient network coding based congestion control for streaming applications", In Telecommunications and Signal Processing (TSP), 2011 34th Intl Conference on, pp. 85-90, 2011. [bib] [doi]
Adriana STAN, "Romanian HMM-based Text-to-Speech Synthesis with Interactive Intonation Optimisation", PhD thesis, Technical University of Cluj-Napoca, 2011. [bib]
Adriana Stan, Junichi Yamagishi, Simon King, Matthew Aylett, "The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate", In Speech Communication, vol. 53, no. 3, pp. 442-450, 2011. [bib] [pdf] [doi]
Adriana Stan, Mircea Giurgiu, "A Superpositional Model Applied to F0 Parametrisation using DCT for Text-to-Speech Synthesis", In Proceedings of the 6th Conference on Speech Technology and Human-Computer Dialogue, Brasov, Romania, 2011. [bib]
Adriana Stan, Florin-Claudiu Pop, Marcel Cremene, Mircea Giurgiu, Denis Pallez, "Interactive Intonation Optimisation Using CMA-ES and DCT Parametrisation of the F0 Contour for Speech Synthesis", In Proceedings of the 5th Workshop on Nature Inspired Cooperative Strategies for Optimisation, Springer, vol. 387, pp. 57-71, 2011. [bib]
2010
A. Kabir, M. Giurgiu, J. Barker, "Robust automatic transcription of English speech corpora", In Communications (COMM), 2010 8th International Conference on, pp. 79-82, 2010. [bib] [doi]
C.F.M. Veja, G. Hagedorn, G. Weber, M. Giurgiu, "Metadata repository management using the MediaWiki interoperability framework a case study: The KeyToNature project", In eChallenges, 2010, pp. 1-9, 2010. [bib]
C. Veja, M. Giurgiu, G. Hagedorn, G. Weber, "Semantic MediaWiki interoperability framework from a semantic social software perspective", In , pp. 403-406, 2010. [bib] [doi]
C. Veja, M. Giurgiu, G. Weber, G. Hagedorn, "MediaWiki interoperability framework for multimedia digital resources", In Intelligent Computer Communication and Processing (ICCP), 2010 IEEE International Conference on, pp. 329-335, 2010. [bib] [doi]
M. Ordean, M. Giurgiu, "Implementation of a security layer for the SSL/TLS protocol", In Electronics and Telecommunications (ISETC), 2010 9th International Symposium on, pp. 209-212, 2010. [bib] [doi]
Adriana Stan, Mircea Giurgiu, "Romanian language statistics and resources for text-to-speech systems", In Proceedings of the 9th Edition of the International Symposium on Electronics and Telecommunications, Timisoara, Romania, 2010. [bib]
2009
Adriana Stan, "Linear Interpolation of Spectrotemporal Excitation Pattern Representations for Automatic Speech Recognition in the Presence of Noise", In Proceedings of the 5th Conference on Speech Technology and Human-Computer Dialogue, Constanta, Romania, 2009. [bib]
26-28 George Barițiu Street, room S2.3
400027, Cluj-Napoca, România
+40-264-202452
http://speech.utcluj.ro