Low Bit-rate Speech Coding With VQ-VAE and a WaveNet Decoder

“Low Bit-rate Speech Coding With VQ-VAE and a WaveNet Decoder” by Cristina Garbacea, Aaron van den Oord, Yazhe Li, Felicia S C Lim, Alejandro Luebs, Oriol Vinyals and Thomas C Walters has been accepted at ICASSP 2019 and will be presented this week at the conference in Brighton, UK. The work was carried out during my internship at Google DeepMind. I am posting the abstract of the paper below:

In order to efficiently transmit and store speech signals, speech codecs create a minimally redundant representation of the input signal which is then decoded at the receiver with the best possible perceptual quality. In this work we demonstrate that a neural network architecture based on VQ-VAE with a WaveNet decoder can be used to perform very low bit-rate speech coding with high reconstruction quality. A prosody-transparent and speaker-independent model trained on the LibriSpeech corpus coding audio at 1.6 kbps exhibits perceptual quality which is around halfway between the MELP codec at 2.4 kbps and AMR-WB codec at 23.05 kbps. In addition, when training on high-quality recorded speech with the test speaker included in the training set, a model coding speech at 1.6 kbps produces output of similar perceptual quality to that generated by AMR-WB at 23.05 kbps.

For more details please check the paper.

Judge the Judges: A Large-Scale Evaluation Study of Neural Language Models for Online Review Generation

“Judge the Judges: A Large-Scale Evaluation Study of Neural Language Models for Online Review Generation” by Cristina Garbacea, Samuel Carton, Shiyan Yan and Qiaozhu Mei is now available online at this location.

Recent advances in deep learning have resulted in a resurgence in the popularity of natural language generation (NLG). Many deep learning based models, including recurrent neural networks and generative adversarial networks, have been proposed and applied to generating various types of text. Despite the fast development of methods, how to better evaluate the quality of these natural language generators remains a significant challenge. We conduct an in-depth empirical study to evaluate the existing evaluation methods for natural language generation. We compare human-based evaluators with a variety of automated evaluation procedures, including discriminative evaluators that measure how well the generated text can be distinguished from human-written text, as well as text overlap metrics that measure how similar the generated text is to human-written references. We measure to what extent these different evaluators agree on the ranking of a dozen of state-of-the-art generators for online product reviews. We find that human evaluators do not correlate well with discriminative evaluators, leaving a bigger question of whether adversarial accuracy is the correct objective for natural language generation. In general, distinguishing machine-generated text is a challenging task even for human evaluators, and their decisions tend to correlate better with text overlap metrics. We also find that diversity is an intriguing metric that is indicative of the assessments of different evaluators.

For more details please check the paper.

Google Student Research Summit

I have been invited by Google Research to attend the Machine Intelligence track of the first Google Student Research Summit. The event will be taking place September 20th – 22nd at the YouTube Headquarters in San Bruno, CA, and will consist of technical talks from Google researchers that deal with Machine Intelligence research at Google. In addition, there will be presentations of cutting-edge research problems that the Google researchers hope to tackle in the field.

Thank you, Google, I am honoured to have this chance!

KDD 2017 Student Travel Award

I was awarded a student travel grant to attend the 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining. The conference brings together researchers from data science, data mining, knowledge discovery, large-scale data analytics and big data. It will be held August 13 – 17, 2017 in Halifax, Nova Scotia, Canada.

Looking forward to KDD 2017!


#GHC17 Student Scholarship

I am one of the lucky winners of a student scholarship to attend the 2017 Grace Hopper Celebration of Women in Computing (GHC) in Orlando, Florida, October 4-6, 2017. This event is the world’s largest technical conference for women in computing, and is sponsored by the Anita Borg Institute for Women in Technology and the Association for Computing Machinery (ACM). According to the GHC Scholarship Committee: “we had a stellar group of applicants this year and you should be very proud that you were selected”. Thank you, #GHC17! Orlando, see you this fall!

More information on this event can be found on the Grace Hopper website.

ECIR 2017 paper is now online!

Our paper “A Systematic Analysis of Sentence Update Detection for Temporal Summarization” with Evangelos Kanoulas is now online. You can read the abstract of the paper below:

“Temporal summarization algorithms filter large volumes of streaming documents and emit sentences that constitute salient event updates. Systems developed typically combine in an ad-hoc fashion traditional retrieval and document summarization algorithms to filter sentences inside documents. Retrieval and summarization algorithms however have been developed to operate on static document collections. Therefore, a deep understanding of the limitations of these approaches when applied to a temporal summarization task is necessary. In this work we present a systematic analysis of the methods used for retrieval of update sentences in temporal summarization, and demonstrate the limitations and potentials of these methods by examining the retrievability and the centrality of event updates, as well as the existence of intrinsic inherent characteristics in update versus non-update sentences.”

The full paper is available here.

Our paper “A Systematic Analysis of Sentence Update Detection for Temporal Summarization” with Evangelos Kanoulas (@ekanou) has been accepted as a full paper at ECIR2017!

Our paper “A Systematic Analysis of Sentence Update Detection for Temporal Summarization” with Evangelos Kanoulas (@ekanou) has been accepted as a full paper at ECIR2017! As in previous years, ECIR attracted a large number of submissions. The organizers received 135 valid long papers out of which 36 were accepted, leading to an acceptance rate of 27%. All submissions have been reviewed by at least three members of the Programme Committee. Subsequently, the discussion phase was guided by a senior reviewer and often led to quite extensive debates in case of conflicting reviews. All reviews and discussions were reviewed by the PC chairs.

The ECIR conference will be held in Aberdeen, UK, from 8th to 13th April 2017. I will post a pre-print version of our paper here soon.