Public Speaking in Thai

This weekend Thailand hosted its first-ever PyCon: a conference dedicated to the Python programming language. This was a great opportunity to meet fellow developers in the region and learn more about topics like Deep Learning, Natural Language Processing, Graph Theory and more. I even contributed a talk on Teaching and Learning with Python.

Another fun part of the conference is the lightning talk session. Lightning talks are 5-minute talks that anyone can sign up for, held at the end of each day in the main hall. It’s a chance for people to dip into speaking at a conference, to test the water for new ideas or just to share something cool they’ve been working on. While I already have some experience in that arena, I have zero experience speaking in public in Thai, so I decided to take the risk of trying my hand at it. I’d say half or more of the audience understood Thai, but a large part definitely did not, so I made slides in English and Thai and set a goal of only speaking Thai during the presentation. There were a couple of times when I couldn’t think of the word I wanted to say in Thai and was tempted to just say it in English, but I didn’t: I either found another way to express myself or just left the comment out. Below are the slides I created and used for the talk:

I already started realizing how hard translating this all would be by slide 1. For the first word should I use เรียน (riian – to study at the elementary level), เรียนรู้ (riian rúu – to undertake to study; learn; study), ศึกษา (sʉ̀k sǎa – to study; to be educated; to receive education; to go to school; to learn (at higher levels such as college)) or something else? And the connecting word, am I learning/studying with/by/through programming? What’s the most Thai way to express it? And it seems I was so focused on getting the Thai correct that I forgot to capitalize the ‘p’ in Programming for my title in English.

When I actually gave the talk, I was thinking, “Should I explain in English what I’m doing, that I’m learning Thai and want to practice speaking, or should I just start speaking in Thai? I’m sure it won’t be very hard for them to figure out I’m just learning…” I jumped right in with an unsure “สวัสดีครับ… ทุกคน ยินดีต้อนรับ” (sà wát dii kráp… túk kon, yin dii dtɔ̂ɔn ráp – “Hello… everyone. Welcome.”)

After the initial awkwardness, I felt a little more comfortable. Sure, I’m speaking a new language and I might mess up but there are slides to help people figure out what’s going on even if I mispronounce something. I push my students who are English Language Learners to take risks and make an attempt. It’s more about pushing yourself out of your comfort zone and learning and getting your point across than delivering a perfect speech.

“วันนี้ผมจะพูดเกี่ยวกับ…” (wan níi pǒm jà pûut gìao gàp – Today I will speak about…)

Now for introductions. Pretty standard for a presentation, but it did feel rather like day 1 in a language class.

ประมาณ (bprà maan – approximately) was a new word for me. I’ve heard it before but I’ve never actually used it in conversation. I think the experience of using it in a talk in front of a large audience will help it stick in my memory pretty well.

Got my first laugh here. There’s a term for people who are half Thai, ลูกครึ่ง (lûuk krʉ̂ng – half child).

“ไม่ใช่ลูกครึ่ง เป็นลูกครึ่งครึ่ง” (mâi châi lûuk krʉ̂ng bpen lûuk krʉ̂ng krʉ̂ng) – “I’m not half Thai, I’m half half Thai”

The term for people like me who have 1 Thai grandparent and 3 non-Thai grandparents is “ลูกเสี้ยว” (lûuk sîao – crescent child) a reference to the crescent moon.

An interesting tidbit about the Thai language is that there are different words for maternal and paternal grandparents, so by using the word คุณย่า (kun yâa – paternal grandmother) instead of คุณยาย (kun yaai – maternal grandmother) it can be inferred that it’s my dad’s mom who is Thai, not my mom’s mom without me needing to elaborate.

After introductions I showed a few of the programs I made to help me learn Thai and explained briefly what they did. The first one was my program to help tell the time. This came from one of my first posts on this blog, Telling Time in Thai.

Second program, my Days of the Week quiz, also about something I made for the blog. I did say the English words “Saturday, Sunday, Friday” here because I was explaining that you have to select the correct English word in this example. Though my childlike enthusiasm when saying “ถูก!” (tùuk – correct, can also mean inexpensive) got me another laugh from the crowd.

This is the only example I shared that isn’t from the blog, JamDai, the Vocabulary Card Matching Game I made. Instead of ถูก, this one ends with a supportive เก่งมาก (gèng mâak – very good, clever, skillful, superbly performed).

The final example I shared was the Thai Chat Bot I made recently. Got a couple more laughs here excitedly reading the chat between myself and the bot and explaining that the bot is male since he uses the polite term “ครับ” (kráp) instead of “ค่ะ” (kâ).

Though, if I end up adding text-to-speech that may change since all the existing Thai text-to-speech tools I can find only have a female voice. I have noticed that general service messages, or posted announcements tend to be either gender neutral or use female terms. Another interesting difference between Thai and English is that there’s no difference between she and he, it’s the same word (เขา – kǎo) so no need to worry about misgendering someone because you don’t need to refer to people by their gender. Though you do gender yourself by the self-referential pronouns you use, ผม (pǒm – I (male)) and ฉัน (chǎn – I (female)). There are instances when speakers will use the opposite gender terms such as a male using female terms with close family members or intimate partners to show softness/gentleness (it’s common for male singers to use ฉัน in love songs for example) or women might use male terms to show harshness or to be stern. Reveals some of the cultural connotations surrounding gender.

I’m very glad I took this risk even though it was scary, and I’m happy with how it went. I hope to continue growing and using my Thai language skills. It would be great to be able to speak directly to the parents of my students who speak Thai instead of relying on a translator (though the Thai staff who help us with that are awesome!). And of course, the most rewarding way in which I use this skill is getting to connect more with my family members on this side of the globe. Even though I barely knew any Thai initially, they’ve been so kind, welcoming and warm to me.

I have to give a special shout-out to my wonderful girlfriend, Mild, who took a look over my slides for me and offered suggestions to improve them. In general, she’s been a huge factor in helping me learn and has pushed me to take chances like this.

Making a Thai Chat Bot

An assignment I’ve seen several Computer Science teachers give their students is to write a chat bot. I thought that could be cool to do with my students, and I always run through assignments myself before assigning them to a class. So, for my take on the assignment, I decided to make a chat bot that speaks Thai. For the code, check it out on CodePen. Note that the bot only knows the Thai script; phonetic transcription won’t work. To chat with it visit this page or try it out below:


There are lots of Natural Language Processing (NLP) tools out there for English but there aren’t as many in Thai. I took a pretty naive approach to my implementation since there are several challenges in NLP unique to Thai (even tokenizing words is nontrivial since there aren’t spaces between words in the Thai script).
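To see why tokenizing is hard, note that splitting on spaces does nothing useful for a Thai sentence. As a rough sketch (not something the bot uses), modern JavaScript engines ship `Intl.Segmenter`, which can attempt word breaks for Thai. The helper name `segmentThaiWords` is made up for this example, and the exact word boundaries depend on the language data built into the engine.

```javascript
// Hypothetical helper: split Thai text into word-like chunks.
function segmentThaiWords(text) {
  // Intl.Segmenter is missing in older engines, so fall back to the whole string.
  if (typeof Intl === 'undefined' || typeof Intl.Segmenter !== 'function') {
    return [text];
  }
  const segmenter = new Intl.Segmenter('th', { granularity: 'word' });
  // Each segment object carries the matched substring in its `segment` property.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

const sentence = 'ผมชื่อบอท';
console.log(sentence.split(' '));        // a single chunk: there are no spaces to split on
console.log(segmentThaiWords(sentence)); // word-like chunks; boundaries vary by engine
```

Whatever the engine decides, the chunks always join back into the original string, so nothing is lost by segmenting.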

The relevant function in the code is the one that determines what the bot’s response will be to what the user types. To start, let’s just have it respond with ไม่เข้าใจ (mai kao jai – I don’t understand) to anything said. Our bot just arrived in Thailand and doesn’t know any phrases other than this.


function chatbotResponse() {
  botMessage = "ไม่เข้าใจ";
}

Alright, we have something going. Next, let’s teach our bot some basic greetings. How do we know if someone is greeting us? We could check to see if the user includes “สวัสดี” or “หวัดดี” anywhere in their message. That covers both formal and informal greetings and whatever particles someone may add at the end. It would catch user messages like “หวัดดี” (wat dee – hi) and “สวัสดีค่ะ/ครับ” (sawatdee ka/khrab – Hello) or, as my first student tester entered, “สวัสดีจ้าาา” (sawatdee jaaaa – a more colloquial way of saying hello in chat). Let’s respond with a random greeting such as สวัสดี (sa wat dee – hello) or สวัสดีครับ (sa wat dee khrab – hello). I’ve chosen to make my bot male so I’ll use particles like ครับ instead of ค่ะ. I’ll remember that going forward to stay consistent.


if (lastUserMessage.includes('สวัสดี') || lastUserMessage.includes('หวัดดี')) {
  /* randomElement is a custom function to pick one of the words in the given list */
  botMessage = randomElement(['สวัสดี','หวัดดีครับ','สวัสดีครับ']);
}
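The `randomElement` helper isn’t shown here. One way such a helper could be written, as a minimal sketch rather than the exact code from the pen:

```javascript
// Sketch of a helper that returns one item of `list`, chosen uniformly at random.
function randomElement(list) {
  // Math.random() is in [0, 1), so the index is always between 0 and list.length - 1.
  const index = Math.floor(Math.random() * list.length);
  return list[index];
}

const greetings = ['สวัสดี', 'สวัสดีครับ'];
console.log(randomElement(greetings)); // one of the two greetings
```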

Cool, now maybe we should give our bot a name. “Bot” seems appropriate, but let’s write it in Thai:


botName = 'บอท';
if (lastUserMessage.includes('ชื่อ')) {
  botMessage = 'ผมชื่อ' + botName;
}

Again, บอท is male, so we used ผม (pom) for I instead of ฉัน (chan). Next, we should check if the user is asking how we are. Since Bot is a pretty chill guy let’s have him always give a positive response.


  if (lastUserMessage.includes('เป็นอย่างไรบ้าง') || lastUserMessage.includes('สบายดีไหม') || lastUserMessage.includes('สบายดีมั้ย')) {
    botMessage = 'สบายดีมากครับ';
  }

Alright, we’re beefing up Bot’s vocabulary. How about another easy one, “Thank you” and “You’re welcome”. In Thai we might say thank you with either ขอบคุณ ครับ/ค่ะ (kop khun khrap/ka) or the more casual ขอบใจ (kop jai). We could respond with ยินดีครับ (yin dee khrap – you’re welcome) or ไม่เป็นไร (mai bpen rai – no problem/no worries), and of course we could always throw a ครับ (khrap) at the end to add some politeness.


if (lastUserMessage.includes('ขอบคุณ') || lastUserMessage.includes('ขอบใจ')) {
  botMessage = randomElement([
    'ยินดีครับ',
    'ไม่เป็นไร',
    'ไม่เป็นไรครับ'
  ]);
}

Let’s give our bot a useful feature. How about telling you the time if you ask? กี่โมง (gee mong (long ‘o’ sound)) is how to ask what time it is so let’s check if the user writes that. And if so, we’ll print out the current time.


/* what time is it? */
if (lastUserMessage.includes('กี่โมง')) {
  botMessage = new Date().toLocaleTimeString();
}
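Called with no arguments, `toLocaleTimeString()` formats the time for whatever locale the browser is set to, so the reply may come back in a 12-hour English style. If Bot should always answer in the formal Thai written style (24-hour clock followed by น., short for นาฬิกา), one option is to build the string by hand. `formatThaiTime` is a hypothetical helper, not part of the original bot.

```javascript
// Hypothetical helper: format a Date as HH:MM followed by the Thai abbreviation น.
function formatThaiTime(date) {
  // Pad to two digits so 9:05 becomes 09:05.
  const hours = String(date.getHours()).padStart(2, '0');
  const minutes = String(date.getMinutes()).padStart(2, '0');
  return `${hours}:${minutes} น.`;
}

console.log(formatThaiTime(new Date())); // the current local time
```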

Now, how about a sense of humor? When chatting in Thai, it’s common to see the number 5 (pronounced ‘ha’ in Thai) used for laughter. Maybe 555 or even more 5’s if it’s really funny. So, if we see the word ตลก (talok – funny) in the user’s message let’s output a string of 5’s (anywhere from 3 to 9) to indicate Bot’s amusement.


if (lastUserMessage.includes('ตลก')) {
  /* 0 to 6 extra fives, so the laugh is 3 to 9 fives long */
  const extraFives = Math.floor(Math.random() * 7);
  botMessage = '555' + '5'.repeat(extraFives);
}

Alright, let’s try something a bit more complicated. Let’s try to detect if the user is asking a question and respond either positively or negatively. ไหม (mai, also written as มั้ย) is a particle added to the end of a statement to make it a question, e.g.

เอาไหม (ow mai – do you want it?)
or
ไปไหม (pai mai – do you want to go?)

To respond positively we just chop off the question particle and use the verb, e.g.

เอา (ow – I want it)
or
ไป (pai – let’s go)

To respond negatively we still chop off the particle but also add a negation (ไม่ – mai, with a falling tone) in front, e.g.

ไม่เอา (mai ow – I don’t want it) or
ไม่ไป (mai pai – let’s not go).


/* ไหม and มั้ย are two spellings of the same question particle */
for (const particle of ['ไหม', 'มั้ย']) {
  const i = lastUserMessage.indexOf(particle);
  if (i !== -1) {
    /* keep only the text before the particle, so เอาไหม becomes เอา */
    botMessage = lastUserMessage.slice(0, i);
    /* flip a coin: on 1, negate the answer by putting ไม่ in front */
    const coinflip = Math.floor(Math.random() * 2);
    if (coinflip) {
      botMessage = 'ไม่' + botMessage;
    }
  }
}
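If the checks are plain `if` statements run one after another, a later rule overwrites an earlier one when several match. Here is a sketch of how a few of the pieces from this post could sit together in one function. It takes the message as a parameter and returns the reply instead of using the globals `lastUserMessage` and `botMessage`, which is a simplification made for this example, and the name `chatbotReply` is hypothetical.

```javascript
// Sketch only: a pure version of the response logic built from checks in this post.
function chatbotReply(message) {
  // Default reply when nothing else matches: "I don't understand".
  let reply = 'ไม่เข้าใจ';

  if (message.includes('สวัสดี') || message.includes('หวัดดี')) {
    reply = 'สวัสดีครับ';
  }
  if (message.includes('ชื่อ')) {
    reply = 'ผมชื่อบอท';
  }
  if (message.includes('ขอบคุณ') || message.includes('ขอบใจ')) {
    reply = 'ไม่เป็นไรครับ';
  }
  // Later checks overwrite earlier ones, so the order decides which rule wins.
  return reply;
}

console.log(chatbotReply('สวัสดีครับ'));      // สวัสดีครับ
console.log(chatbotReply('คุณชื่ออะไร'));     // ผมชื่อบอท
console.log(chatbotReply('วันนี้อากาศร้อน')); // ไม่เข้าใจ
```

Returning the reply rather than setting a global also makes each rule easy to check on its own.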

I won’t list every single thing I put into the program here but I’ve added more stuff to it. Feel free to chat with บอท to find more messages I’ve added. Or peek at the code. If you’ve got more suggestions for what to teach him, let me know!

Machine Learning Lesson

A very important part of my job as a K-12 computer science teacher is to take concepts that may seem impossibly complex to some and come up with ways to make them more accessible to all of my students. In this post, I’d like to share an activity I did with my students in grades 7-10 to engage them in machine learning.

There are many aspects to machine learning. The process I guided my students through is predictive modelling, which consists of 5 steps:

  1. Obtaining data
  2. Correctly formatting the data
  3. Training a model with the data
  4. Testing your model
  5. Improving your model

Like many processes in engineering/design, this is not a process you step through once. It is a cycle you perform multiple iterations of and there is no set number of times you should go through the cycle.

I was inspired to tackle this subject with my students after watching a video on YouTube by LearningCode.academy: “Machine Learning Tutorial for Beginners – USING JAVASCRIPT!”

There were several things I liked about this video:

  • It was short (less than 12 minutes)
  • I could jump right into the provided example code and start playing around with it with no setup required
  • The problem presented was easy to understand

All these things told me I could make a lesson plan out of it. If I stand in front of my students and lecture for more than 10-15 minutes I will lose most of their attention. I need to get them engaged in a meaningful task for the most learning to occur. Having tedious setup to go through on their computers would also be detrimental, which is why I like online tools like CodePen for in-class activities.

I did feel like I needed a different problem from the one in the video, one that could encourage both collaboration and individual effort. I thought of a well-known beginner project in machine learning: building a model to recognize handwritten digits in TensorFlow. That project, as it is, is a bit out of scope for a 1-hour class with middle school students, so I created my own version. I like the notion of recognizing digits: there are 10 distinct items we want to train our model to recognize and there are many ways to collaborate here. Even if students are tempted to split up the individual digits and have specific people get the model working for specific digits, they’ll still have to put their data together, test it out, find test cases that fail and work together to improve the model.

Here’s the program I made. I used two libraries. First, brain.js (Neural Networks implemented in JavaScript) and second, p5.js (a JavaScript library similar to Processing, made to make programming more accessible).

I used p5.js since it’s something we’ve used in our class many times. In this program, I used it to draw a 6×4 grid of cells (6 rows of 4) that start off empty but that the user can fill by clicking on them. Whenever the user “draws” in this way, the program attempts to guess what digit was drawn. I only provided training data for the digits 0-3. The students had to do the rest of the work to make it work for all digits 0-9.

I wanted to make the representation of this grid as simple as possible, so I made a 1-dimensional array of 24 1’s and 0’s: a 1 is a cell that’s filled in and a 0 is a cell that’s empty. They go from left-to-right, top-to-bottom, the same way we read. So, as a class, we looked at an example like the ‘2’ that I drew above and produced an array that would represent it: [1,1,1,1, 1,0,0,1, 0,0,0,1, 1,1,1,1, 1,0,0,0, 1,1,1,1]. The whitespace between groups of 4 helps us see that each group of 4 represents a row in our grid. The training function in the code is where we have to insert our training data. A line might look like this:

{ input: [1,1,1,1, 1,0,0,1, 0,0,0,1, 1,1,1,1, 1,0,0,0, 1,1,1,1], output: { 2: 1 } },
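Typing 24 digits by hand is error-prone. As an optional aid (not part of the original program), a small helper could turn a picture drawn with characters into the flat array. The name `gridToInput` and the `#`/`.` notation are made up for this sketch.

```javascript
// Hypothetical helper: convert rows of '#' (filled) and '.' (empty)
// into a flat array of 1s and 0s.
function gridToInput(rows) {
  return rows
    .join('') // read the rows left-to-right, top-to-bottom
    .split('')
    .map((cell) => (cell === '#' ? 1 : 0));
}

const two = gridToInput([
  '####',
  '#..#',
  '...#',
  '####',
  '#...',
  '####',
]);
console.log(two.length); // 24
console.log({ input: two, output: { 2: 1 } }); // same shape as a line of training data
```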

We input the grid, and the output says that the probability that the input we provided is a “2” is 1 (i.e. we’re 100% confident this is a “2”). By this point, the students wanted to jump in and start adding their own data.
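When a network trained this way is run on a drawing, brain.js returns an object with one score per label it was trained on. Picking the guess is then a matter of finding the key with the highest score. A minimal sketch, with `mostLikelyDigit` as a hypothetical helper name and scores invented for illustration; it is not necessarily how the original program picks its answer.

```javascript
// Hypothetical helper: return the label with the highest score in an output object.
function mostLikelyDigit(output) {
  let bestLabel = null;
  let bestScore = -Infinity;
  for (const [label, score] of Object.entries(output)) {
    if (score > bestScore) {
      bestLabel = label;
      bestScore = score;
    }
  }
  // Object keys are strings, so the result is '2' rather than the number 2.
  return bestLabel;
}

// Scores made up for illustration.
console.log(mostLikelyDigit({ 0: 0.01, 2: 0.93, 3: 0.05 })); // logs 2 (as a string)
```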

At the beginning of the task many of the questions are technical (maybe they’re getting syntax errors because they forgot a bracket or a curly brace), but once everyone figures out how to successfully add new test cases there are process questions like “How many test cases do I need to add?” I don’t give any guidance on this front; I just let them know that we’ll be testing out their implementations together. This encourages them to test out their own model beforehand. I also ask them questions as I’m circulating the room like “How many different ways can you think of to draw a 4?” or “Are there any numbers that could look similar to one another?”

Once we move onto testing, we quickly see that no one’s program works all the time. There’s a good chance we’ll be able to easily find some cases that are obviously wrong. So I might end up with something like this:

I can see why our program might think this is an 8: if we filled in 3 or 4 cells on the left border it would definitely look like an 8. But, as humans, we can look at this and agree that it’s most likely a 3. To improve our program we can train it with this case. In our program it would look like this:

{ input: [1,1,1,1, 0,0,0,1, 0,0,0,1, 0,1,1,1, 0,0,0,1, 1,1,1,1], output: { 3: 1 } },

And, after we re-run our program and try again, it gets it right:

We had good discussion both before and after the activity. We talked about how this process is different than the ways we’ve implemented algorithms in class before, where we give the program a step-by-step procedure to follow that we can easily trace. In this task, we give the program examples and it comes up with the rules. It’s messy and not as predictable, but it’s quite powerful. Trying to write our own algorithm to take a list of 24 1’s and 0’s and determine what digit it most closely resembles would be very difficult and the edge cases would take a long time to account for. Here, if we find a case that doesn’t work, we just throw another line of training data at it.

And if we look back at those 5 steps of predictive modelling, we’ve done each one:

  • Obtaining data – coming up with different ways to represent the digits
  • Correctly formatting the data – putting them into lines of code that syntactically work in our program
  • Training a model with the data – running the program after we’ve input the new lines
  • Testing your model – trying it out, drawing digits and seeing if it gets them right
  • Improving your model – when we find failed cases, we add more training data (which sends us back to step 1)

 

What’s next?

Teaching at the secondary level, we tend to go broad but not too deep. I expose the students to all kinds of topics in computer science but the main learning goal is to build their skills in engaging in the processes rather than remembering all the details of the content. But, every now and then, some students will really latch on to a particular topic we covered which could help inform future projects that they either complete for my class or on their own. There are lots of directions to go from this simple lesson.

Bigger Test Cases

We just looked at a 6×4 grid. The original data set I referenced, MNIST, represented hand drawn digits as 28×28 grids. Or if you want to move into image recognition, a single image could have thousands or even millions of pixels with different color data in each.

More Training Data

At the end of the activity, we might have generated dozens or, depending on your class size, hundreds of test cases, but in the world of Big Data, that’s minuscule. MNIST has 60,000 training cases and 10,000 for testing. And think about how much data you’d want to give to something like a self-driving car where lives are on the line. You will have to develop more sophisticated methods for collecting and using data if you want to start working with large quantities of it.

Offline Training

The program I wrote trains the model in real time. Once you start using much larger sets of data, the time it takes to train the model increases dramatically. To accommodate this, you will need to be able to train the model offline and generate a pre-trained network that you could plop into a website if you wanted people to use it without having to wait for it to re-train every time.

Resources

Between the time I ran this lesson and writing this post, a new JavaScript library was released by Google’s AI team: TensorFlow.js. But if you really wanted to dig into machine learning, JavaScript is not the best way to go due to performance. I like it for the classroom for the speed with which we can implement and test our programs. Using TensorFlow with Python would be a good route to explore.