Zhou Yu: One on One

silkthyme
11 min read · May 10, 2020

An Interview With Assistant Professor and Davis NLP Lab Director Dr. Zhou Yu

Hello!

Hello!

How are you doing during quarantine? Are things going okay?

Yeah, they are. We had to do a lot of work to make the [ECS 189L] course into an online version. I think I’m more worried about my graduate students, and if they are actually okay working at home by themselves. It definitely puts a lot of stress on me in terms of classes and mentoring students, making sure they are physically and mentally healthy.

Are you referring to graduate students you are teaching or those in your research lab?

I’m mostly talking about the Ph.D. students I mentor and those in my lab.

What has your experience of mentoring graduate students been like?

I think mentoring is a very rewarding experience. You really do see them start from not knowing very much. They learn how to conduct research, how to write a paper, and how to present their work in front of colleagues. It’s certainly nice to see them grow every year.

Are you teaching the 189L course right now?

Yes, it’s an NLP class; it’s called “Special Topics in AI”.

How is that going for you and your students?

We are doing well; we are slowly adapting to online teaching, quizzing, and mentoring. I try to learn more about the difficulties students face in the class, so we’re going to offer more support through office hours and things like that.

One of the things I’ve noticed is that you studied linguistics as an undergraduate. I was wondering: did you want to pursue this field very early on? Was it something you knew you wanted, and what drove you?

I think it all started in high school: I liked literature and the arts, but I was also very interested in science and engineering. When I went to college, I was on the border and didn’t know what direction to take in my life, so I took classes from both areas. I did a lot of engineering, so I knew how to program early on. I ended up doing computer science as my major and linguistics as my second major. In general, I liked reading, I liked writing, and I liked to understand language, so I tried to do both together. Later in my undergrad, when I started to work on research, I had taken machine learning classes and so on. Then I started one of my undergrad theses, which was in machine translation (English to Chinese and Chinese to English). It tied in very well with my background in both linguistics and computer science. I started to do more research on that, and after undergrad, I applied to grad school and got into a program at CMU that is very focused on computational linguistics and natural language processing. I started from there and went deeper and deeper in this direction. Many of my colleagues in natural language processing have similar backgrounds and did both computer science and linguistics as undergraduates.

Is that what made you continue a Ph.D. later?

Yes, that was part of it. In undergrad, I found doing research very interesting; it’s like discovering new things. I felt like I wasn’t learning enough, like there were more things that I could do. That’s why I applied for a Ph.D. After I got into the Ph.D. program, I learned much more about what research really is. My undergrad research was very different from what I do now; it evolves over time.

How were your undergraduate and current research different?

In my undergrad, I did have some mentoring from professors, but I still wasn’t able to grasp, “Why do I need to work on these problems?” I knew what I was working on, but the problems were given to me by other people. Later, in grad school, you get more flexibility; you get more experience with what is going on in the field, knowing what is important and what is worth doing.

Is that why you decided to focus more on dialog systems later on?

Yes, and it’s also largely a matter of my interests. I worked on language and vision a little bit in my undergrad, and I wanted to see if I could combine them. Back then, there were people doing vision plus language, like automatic image captioning. But I’m also interested in interaction: how people and machines interact with each other. So I started to work on this interdisciplinary area through dialog systems, where you can take into account language, facial affect, behaviors, and gestures. That’s why I work on this kind of interaction with dialog systems and multimodality.

I was looking up what multimodal systems were. I don’t know if this definition is correct, but I was thinking it means using different modes of user input to interact with the system.

You can think about information being captured by various devices. For example, you have a microphone that accepts audio and a camera that accepts images, so information comes from all these different devices, is processed in different time frames, and is converted into different forms. The question is how we combine the information from these different devices and signals to understand people and situations. That’s multimodal machine learning: mostly combining different streams of information from different modes.
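[To make this concrete, here is a minimal sketch of late fusion; the feature dimensions, stream lengths, and nearest-neighbor resampling are illustrative assumptions, not details from Dr. Yu’s systems.]

```python
import numpy as np

# Hypothetical example: align two feature streams captured at different
# rates (e.g., audio frames vs. video frames) to one common time axis,
# then concatenate them per time step ("late fusion").

def resample_to(features, target_len):
    """Nearest-neighbor resampling onto a common time axis."""
    idx = np.linspace(0, len(features) - 1, target_len).round().astype(int)
    return features[idx]

audio = np.random.randn(500, 40)    # 500 audio frames, 40-dim features
video = np.random.randn(150, 128)   # 150 video frames, 128-dim features

T = 100  # shared time axis length
fused = np.concatenate([resample_to(audio, T), resample_to(video, T)], axis=1)
print(fused.shape)  # (100, 168): one fused feature vector per time step
```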

Why do you find this particular area of natural language processing interesting?

Intrinsically, I just find language very interesting. Automating tasks such as translation and assistive writing is very powerful and can improve everyday people’s lives. Dialog systems are one level above that: you can complete tasks with the machine. It is ultimately one way to automate things, so you can reduce human effort and tedious work and actually assist humans better. For example, dialog systems can provide training systems for people. You can talk to a robot to practice your second language, interview skills, or leadership skills. Natural language is the most natural way people communicate with each other, and people learn from other people. So can robots play certain roles in this? You can think of robots playing the roles of teachers, learners, and providers of services. This is fascinating in a lot of ways.

I also wanted to ask you about Gunrock. Congrats on last year’s Alexa Prize.

Yes, we are attending the competition this year as well. The final is actually next month.

That’s coming up soon.

Yes, we are pretty busy with that. We’re all wrapped up in it.

I hope working on that remotely isn’t too difficult.

We can’t do a lot of in-person user studies at the moment, so we had to run online user studies where people interact with Gunrock remotely.

As for this year’s competition, what different approaches are you taking?

We are adding more features. We are doing more user adaptation: based on the user’s behavior, the system adapts to them, and users get a more personalized experience. For example, if you are a person who likes to talk more, then the system will do more listening. If you are the listening type, then it does more of the talking.
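[A minimal sketch of the kind of adaptation described above; the word-count threshold and canned responses are hypothetical, not Gunrock’s actual logic.]

```python
# Hypothetical rule: verbose users get a listening bot, quiet users get
# a more talkative bot.

def choose_style(user_turn_lengths, threshold=15):
    """Return 'listener' for talkative users, 'talker' for quiet ones."""
    avg_words = sum(user_turn_lengths) / max(len(user_turn_lengths), 1)
    return "listener" if avg_words > threshold else "talker"

history = [22, 31, 18]  # words per user turn so far
if choose_style(history) == "listener":
    response = "That sounds great, tell me more!"  # short prompt, yield the floor
else:
    response = "Speaking of movies, I watched an interesting one recently."  # carry the conversation
print(response)
```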

Oh, that’s interesting, so you adapt more to users’ personalities, I’m assuming?

Yes, you can roughly think about it that way.

That does make the dialogue flow better.

Last year, a lot of people complained that the bot was too talkative and “full of itself”. We found that some people just wanted to express themselves. They wanted to tell you about their relationships, their dogs, and so on. They just wanted to talk and have someone listen.

What are you most proud of about last year’s journey of creating Gunrock?

I was proud because it was my first year as [part of the] faculty, and all my students were first-years. I was amazed that, after less than a year of effort, they went from people who didn’t know that much about NLP to building a really good system that was the top one in the world. I was really happy that we won the challenge because my own advisor back at CMU, Alex Rudnicky, was in the challenge too, so we were able to beat him, and that was kind of nice.

How do you think your project furthers human-machine interaction? Do you think there were particular advancements made in the field?

Because of all these machine learning and neural network advancements, we can do much better at understanding and responding to people. In general, we are doing much better; it’s been a huge leap within five years. If you understand people better, you can plan better in terms of how to respond to them.

I also noticed that you used to do research at companies such as Microsoft and ETS. I was wondering how that differs from conducting research in a more academic setting?

The two companies I interned for are both very driven by academic research; they also write papers and publish, so they are very similar to our academic setting. But of course, these companies serve a bigger purpose: hopefully the research will eventually materialize into a product, and they definitely have that in mind when picking research topics. In academia, if I want to pivot to a new topic, I just need to write a grant for it. So it’s a bit more flexible in academia.

How do you feel about having more of this flexibility as you do right now?

Recently, the entire society, [including] companies and the government, has been very supportive of AI. So we’re doing pretty well on funding, and that’s how we get the flexibility of choosing which projects to work on.

What projects do you plan on working on in the near future?

We are still expanding a couple of projects that we have right now. One is to persuade people to do physical exercise. This is a collaboration with UCSF, because a lot of people, especially those with type 2 diabetes, need to do more physical exercise; they have a metabolic deficiency, so doing more physical exercise can save their lives. We are big advocates of preventative medical care. Having a conversation with a system that reminds them what they should do and provides them with factual information is very important. We added a persuasive touch, so we can utilize different persuasive strategies and tailor them to individual people with different personalities. For example, if the user is risk-averse, you can tell them, “If you don’t do this, something bad is going to happen.” If they are open-minded and exploring, you can tell them, “Oh, here’s a new routine you can try out.”
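[A minimal sketch of that strategy selection; the personality labels and messages are hypothetical, not the actual UCSF system.]

```python
# Hypothetical mapping from a user's personality to a persuasive strategy.

STRATEGIES = {
    "risk_averse": "If you skip exercise this week, your blood sugar "
                   "control may get worse.",  # loss framing
    "open_minded": "Here's a new 10-minute routine you could try today!",  # novelty framing
}

def persuade(personality):
    """Pick a message matched to the user's personality, with a neutral fallback."""
    return STRATEGIES.get(personality, "Regular exercise helps manage type 2 diabetes.")

print(persuade("risk_averse"))
print(persuade("open_minded"))
```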

That’s cool; I wish I had that to persuade myself to do more physical exercise.

You know, a lot of people want to do more exercise, but people get lazy over time. It’s exactly something I’ve been very passionate about. Especially in this pandemic situation, everyone is having trouble finding a good level of exercise and [a] healthy diet, so [we want to see] how we can make sure people form good habits.

How did you encounter this project?

We had another project before, which persuades people to donate to charities. It was sort of a natural next step; one UCSF faculty member reached out to me during a conference and asked if we could work together on chatbots for healthcare.

Is there something about your current or past research that you wish others asked you more about, or that you don’t have as much opportunity to talk about?

There was one piece of research we did on multimodal translation. If you go on websites like IKEA’s, you see images of the products alongside their descriptions. International companies that want to sell products need to sell not only in the U.S. but also in France, Germany, etcetera, so they need to adapt their websites to each particular language. It’s expensive to have people translate that manually, so we translate it automatically. The idea is that you have an image that is correlated with the text, so we can condition on that picture to better translate English to another language, or vice versa.

Are you saying that you’re translating images from a website to words?

It’s a website that has both images and words. We condition on the image, utilizing it to inform the translation, so that the translation is more accurate.

How did you do that?

We take these languages and images and project them into a joint space so that we can do operations in the same space. Different languages can all map onto the same space, so we can optimize them in the joint space. That really helped a lot, especially because the images give you extra context. One of the biggest problems in translating between languages is prepositions. Prepositions [don’t] have a one-to-one translation in a lot of languages; they can have a lot of meanings. For example, in English, the word “in” can have various meanings, and when you translate it to a different language like German, there are a lot [of] words it could translate into. So it really is situational. If you have an image of the product, a person could be using it, or a person could be sitting on it. This kind of positional information in the image can help in translating to a different language.
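[A minimal sketch of a joint embedding space; the random projection weights and toy dimensions are illustrative stand-ins for trained encoders, not the paper’s actual model.]

```python
import numpy as np

# Hypothetical example: project text and image features into one joint
# space where a dot product measures how well they agree. Training would
# pull matching text-image pairs together, so the image can disambiguate
# words (such as prepositions) during translation.

rng = np.random.default_rng(0)
W_text = rng.standard_normal((300, 64))    # text features  -> joint space
W_image = rng.standard_normal((2048, 64))  # image features -> joint space

def project(features, W):
    """Linear projection followed by L2 normalization."""
    z = features @ W
    return z / np.linalg.norm(z)

text_feat = rng.standard_normal(300)    # e.g., a sentence embedding
image_feat = rng.standard_normal(2048)  # e.g., a CNN image embedding

similarity = project(text_feat, W_text) @ project(image_feat, W_image)
print(similarity)  # would be high for matching pairs after training
```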

Do you have any advice you would give to your 20-year-old self?

One thing I would say to my 20-year-old self is to make more friends. Interact with people more; at the end of the day, when you’re 30 or older, old friends are very important in your life and may provide connections for you in the future. If you want to go into business or another career, you need more connections.

How do you incorporate [your linguistics work] into your current research and work?

A lot of linguistic theories and observations about human conversation can be incorporated into computational models so that we can do better. A lot of people work on predicting syntax and semantics.

Do you have any recommendations for classes to take if someone wants to enter your area of expertise?

I recommend machine learning. I do have a class right now, Special Topics in AI, which is an introduction to natural language processing, but we want to make sure you take the probability course first. And of course the basics: introduction to programming, algorithms, and data structures. You can also take Coursera courses; Andrew Ng has a nice course on machine learning.

Thank you so much for taking the time to speak with me; I know you must be busy.

I hope this is helpful for you.

It was; I really enjoyed learning about your background and the work that you do, especially some of your new projects. I think they’re really exciting!

Thank you, it was nice talking to you.

Interviewee: Dr. Zhou Yu, U.C. Davis Assistant Professor [http://zhouyu.cs.ucdavis.edu/]

Interviewer: Celine Liang, WiCS Davis Resource Director [celiang@ucdavis.edu]

Edited by: Christina Huang, WiCS Editor [chjhuang@ucdavis.edu]
