[MUSIC PLAYING] SPEAKER: Well, hello, one and all, and welcome to our short on capture groups. Now, those of you who have made international calls might be familiar with what's called a country calling code. And I have here three of them up above in a dictionary called locations. I have plus 1, which often involves numbers from the United States and Canada. I have plus 62, which often involves numbers from Indonesia, and plus 505, which involves numbers from Nicaragua. And I have down below here, in main, a program that in this case validates phone numbers internationally. So I have here a pattern that I'm going to look for within each of these phone numbers I'm going to enter into my program down below on line 7. In this case, notice what I'm expecting. I'm expecting the literal, actual character plus here. I've escaped it with this backslash because plus has other meanings within regular expressions. I'm then expecting any kind of digit, in this case, 0 through 9, between one and three times. So notice here, the country code for the US and Canada, that's plus 1. So only one digit here. For Indonesia, it's two, 62. And for Nicaragua, it's three, 505. So between, in this case, one and three digits following some plus. Thereafter, there will hopefully be a space for this to be a valid number, and then there will be exactly three digits, a dash, again, exactly three digits, followed by a dash again, and then exactly four digits. So this is the pattern we are looking for. And down below, on lines 9 through 13, well, this is the code doing that work for us. We've stored, within number, the phone number that the user has entered, and we're going to check, using re.search, if we found a match for our pattern within the number string. If a match is returned to us, we'll print valid. If not, we'll print invalid. Let me go ahead, down below here, and type "main" to ensure I call main when I run this program. 
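The validation step described here could be sketched roughly as follows. The pattern is reconstructed from the spoken description, so its exact spelling is an assumption, as are the helper name `check` and the capitalization of the printed messages. The program in the short reads the number from the user; this sketch passes it in as an argument so it can run without input.

```python
import re

# Reconstructed from the description: a literal "+", one to three digits,
# a space, then three digits, a dash, three digits, a dash, four digits.
PATTERN = r"\+\d{1,3} \d{3}-\d{3}-\d{4}"


def check(number):
    """Hypothetical helper: report whether the pattern appears in number."""
    match = re.search(PATTERN, number)
    if match:
        return "Valid"
    return "Invalid"


print(check("+1 617-495-1000"))
print(check("+62 617-495-1000"))
print(check("+505 617-495-1000"))
print(check("617-495-1000"))  # no country calling code, so no match
```

Because `re.search` looks for the pattern anywhere in the string, extra text before or after the number would still count as a match in this sketch.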
And I'll go ahead and run Python of groups.py. And if I hit Enter, now I should be able to enter a number. I'll test one here, plus 1, and I'll type in 617-495-1000. I'll hit Enter here, and we'll see that is valid. Maybe I'll try it, too, for Indonesia. I'll do plus 62, and I'll enter in my phone number again. And I'll hit Enter, and I'll see that's valid as well. And just for good measure, I'll test Nicaragua. I'll do plus 505, and I'll type in this number, and we'll see that is valid as well. So it seems like our pattern is working, but there's more we could do with this program, I think, thanks to this feature called a capture group. Well, maybe what I want to do is not just share if this number is valid or invalid, but maybe tell somebody what country, in this case, this number is calling from. You can think of your phone. When it receives some call from an unknown number, it might at least tell you the location or the area that number is calling from. What if we could write the same thing here, where people call us internationally and we show the user, in this case, what country they are calling from? Well, in this case, we don't want to just test to see if we find the pattern within our phone numbers here. We also want to extract some portion of it, in this case, the very first portion, the country calling code-- plus 505 for Nicaragua, plus 62 for Indonesia, or plus 1 for the US and Canada. But we run into a problem here if we were to use maybe simple string manipulation. If I enter in some number and was trying to extract, in this case, the country calling code, well, I wouldn't immediately know whether I should extract, in this case, the first two characters, plus 1, the first three characters, plus 62, or, in this case, the first four characters, plus 505. But thankfully, I can actually use regular expressions and capture groups to dynamically capture the portion of the content I'm looking for. 
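To make the slicing problem concrete, here is a small illustration. It is not code from the short; the helper `naive_code` and its fixed width of three characters are made up for this example.

```python
def naive_code(number, width=3):
    """Hypothetical helper: always take a fixed number of leading characters."""
    return number[:width]


# A width of three only fits two-digit country calling codes.
print(repr(naive_code("+1 617-495-1000")))    # '+1 '  (picks up the space)
print(repr(naive_code("+62 617-495-1000")))   # '+62'
print(repr(naive_code("+505 617-495-1000")))  # '+50'  (drops the last digit)
```

No single width works for all three codes, which is the motivation for letting a pattern decide where the code ends.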
Now, the way I can make a capture group is by using parentheses inside of a regular expression. So really I want to capture or extract, in this case, the country calling code, which we said the pattern exists for right here, a literal plus sign followed by one to three digits. Now, I can encase this inside of parentheses, and this becomes my own capture group. But how could I maybe find the information I capture? Well, if I find a match here, turns out that this match object in Python comes with a method called group. If I were to do-- let me do match.group. Well, this would help me find the capture groups I've actually implemented in my regular expression and extract them from this match. Because, let's say, this is the first capture group we have, I could go ahead and type 1 here. Capture groups, at least in Python, in this particular case, are one-indexed. So I'm saying here, if there is a match, go ahead and give me, in this case, the result that I found within the first capture group. I'll go ahead and store this in a variable called country_code. And I could-- instead of printing valid here, maybe I'll go ahead and print something like country_code and see what we can find. Well, I'll go ahead and run Python of groups.py, and I'll go ahead and do plus 1, followed by my phone number, hit Enter. And now we'll see plus 1. So it seems like we extracted, in this case, the portion of our content that matched the pattern within these parentheses. I'll try it with another one here. I'll run Python of groups.py and enter plus 62. And we'll see plus 62. So it is dynamic. And it's not looking for the first two characters all the time or the first three characters all the time. It's looking for this pattern. And when it finds it, it's returning it to us as appropriate. Now, what else could we do with this? Well, country_code literally is a string here. 
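A minimal sketch of the capture group described above, assuming the same reconstructed pattern with parentheses around the country calling code. The function name `extract_country_code` is hypothetical, and the number is passed in rather than typed by the user.

```python
import re

# Parentheses around the "+" and the one to three digits form capture group 1.
PATTERN = r"(\+\d{1,3}) \d{3}-\d{3}-\d{4}"


def extract_country_code(number):
    """Hypothetical helper: return the captured code, or None without a match."""
    match = re.search(PATTERN, number)
    if match:
        # group(1) is the first capture group; group(0) would be the whole match.
        return match.group(1)
    return None


print(extract_country_code("+1 617-495-1000"))    # +1
print(extract_country_code("+62 617-495-1000"))   # +62
print(extract_country_code("+505 617-495-1000"))  # +505
```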
So if I wanted to, in this case, find the country somebody is calling from based on their country calling code, well, I could perhaps use country_code as the key for this dictionary here. I could type locations bracket country_code. And because each of these country calling codes is a key in my dictionary, I should hopefully find, in this case, the actual location they are calling from. Let's try this out. I'll run Python of groups.py, and I'll now type-- oops. I'll now type-- let's do plus 1 again, followed by the number, hit Enter. And now we'll see United States and Canada. I could do this again. I could try, let's say, a plus 62 and number again-- Indonesia. I'll now try plus 505, and I'll see Nicaragua. So the capture group here is doing the work of finding the portion of our content that matches some pattern we were looking for. Well, I think we've really seen a lot of what this can do for us, but there is one more feature to take a look at. Here, notice how on line 11 I am really using indices, indexes, to find the capture group I'm looking for. But a more complex regular expression might involve more than one capture group, could involve, I don't know, two, three, four, could get up to 10. However many it is, it can be helpful to have a better way to refer to these capture groups. So if this capture group has some particular meaning to it, I could actually give it a name within the regular expression to refer to later on. And the way I do this is with the following syntax. Within my capture group, after the first parenthesis, I can type question mark, capital P, and then angle brackets, that is, a less-than sign and a greater-than sign, with some name for this capture group between them. I could call this country_code just like this. So now this pattern here and the capture group has a name I can refer to later to extract it with. 
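The dictionary lookup from the first half of this passage might look something like the sketch below, still using the numbered group. The dictionary values are taken from the output shown in the short; the pattern spelling, the function name `find_location`, and the "Invalid" return value are assumptions.

```python
import re

locations = {
    "+1": "United States and Canada",
    "+62": "Indonesia",
    "+505": "Nicaragua",
}

PATTERN = r"(\+\d{1,3}) \d{3}-\d{3}-\d{4}"


def find_location(number):
    """Hypothetical helper: map a valid number to its location."""
    match = re.search(PATTERN, number)
    if match:
        country_code = match.group(1)
        # The captured string is used directly as the dictionary key.
        return locations[country_code]
    return "Invalid"


print(find_location("+1 617-495-1000"))    # United States and Canada
print(find_location("+62 617-495-1000"))   # Indonesia
print(find_location("+505 617-495-1000"))  # Nicaragua
```

A number with a code that is not in the dictionary, such as plus 44, would match the pattern but raise a KeyError at the lookup; `locations.get` with a default is one way to handle that.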
Down here on line 11, I could, in this case, use 1, but now I could actually make use of country_code, the name I gave for this particular capture group. And I could type in country_code, in quotes, just like this, which will say, find for me, in this case, the capture group that I named country_code and use that instead. I'll type Python of groups.py. I'll go ahead and type plus 1, same number here. And now we'll see United States and Canada. Seems to work, but it's now a little more readable, since we've named something that we might later hope to capture in our programs. So this was our brief foray into capture groups. And this was our short. We'll see you next time.
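For reference, a sketch of the named capture group, under the same assumptions as before: the pattern is reconstructed from the spoken description, and `find_location` is a hypothetical helper rather than the program from the short.

```python
import re

locations = {
    "+1": "United States and Canada",
    "+62": "Indonesia",
    "+505": "Nicaragua",
}

# (?P<country_code>...) gives the capture group a name.
PATTERN = r"(?P<country_code>\+\d{1,3}) \d{3}-\d{3}-\d{4}"


def find_location(number):
    """Hypothetical helper: look up the location by the named group."""
    match = re.search(PATTERN, number)
    if match:
        country_code = match.group("country_code")
        return locations[country_code]
    return "Invalid"


# A named group can still be reached by its index as well.
match = re.search(PATTERN, "+62 617-495-1000")
print(match.group(1) == match.group("country_code"))  # True
print(find_location("+1 617-495-1000"))               # United States and Canada
```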