Lazy/Greedy Disney Writers: A Regex Story

Nick Stebbs
4 min read · Jun 12, 2019


Disclaimer: The story I’m about to tell you may or may not have happened, but for reasons of libel (and since I’m not one to take on a giant multinational so early in my career), just to confirm, all dream content herein has no basis in reality…

As part of my studies during the first module of the Launch School curriculum, we were asked to start a blog and write about the things we’ve learnt. There’s a concept that has had me scratching my head whenever I encountered it in a piece of code: Regular expressions.

Not one to shy away from a challenge, one afternoon I poured a big filter coffee and popped my head under the bonnet of this useful pattern matching tool. I was delighted to find that after many hours of dissecting the various components of each expression, some of it no longer looked like gibberish.

If you’ve never encountered a regex before, check out this one, which matches a phone number:

\A(?:\+?\d{1,3}\s*-?)?\(?(?:\d{3})?\)?[- ]?\d{3}[- ]?\d{4}\z
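To see it in action, here is a minimal Ruby sketch that tries the pattern against a few inputs. The sample numbers and the constant name PHONE are made up for illustration, and match? assumes Ruby 2.4 or later.

```ruby
# The phone number pattern from above, stored in a constant
# (the name PHONE is hypothetical, chosen just for this example).
PHONE = /\A(?:\+?\d{1,3}\s*-?)?\(?(?:\d{3})?\)?[- ]?\d{3}[- ]?\d{4}\z/

# Made-up sample inputs, not real phone numbers.
samples = ["555-1234", "(555) 123-4567", "+1 555-123-4567", "12345"]

# Pair each sample with true/false: does the whole string match?
results = samples.map { |number| [number, PHONE.match?(number)] }.to_h

results.each { |number, ok| puts "#{number.inspect} => #{ok}" }
```

The first three samples should match and the last should not, since the pattern insists on at least seven digits (three, then four) before the end of the string.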

Given enough time with Rubular, and reference to the Launch School guidebook, I’m now confident I could prise it apart. If you feel differently, don’t worry. I’m going to concentrate this post on only one concept, though it was one that initially left me a little puzzled. Or was I just Drowsy?

It was time for another coffee. It’d help me understand greediness, I thought; but before I had time to refuel I had already nodded off. Fortunately for me, the understanding came to me in a dream…

Setting: A Disney writing room. 1993. I’m on a team that’s on the brink of finishing what will go on to be a huge hit: Snow White and the Seven Dwarfs. We’re all set to sign off on the script, but there’s a problem: the name of the final dwarf. With a list of 50+ possible names, at 4pm on a Friday afternoon, how will we break the deadlock? Regex to the rescue!

Head writer:
OK, so far we’ve got Doc, Grumpy, Happy, Sleepy, Bashful, Dopey, and Sneezy. Let’s finish this thing so I can get dinner. There’s an all you can eat buffet round the corner that I’ve been dying to try out!

Me:
Believe me - I’m ready. But I’m also not about to sift through this entire list, are you?

Silly, Sappy, Scrappy, Snappy, Snoopy, Goopy, Gloomy, Gaspy, Gabby, Blabby, Flabby, Crabby, Cranky, Lazy, Dizzy, Dippy, Dumpy, Dirty, Deafy, Daffy, Doleful, Woeful, Wistful, Soulful, Helpful, Awful, Graceful, Tearful, Tubby, Weepy, Wheezy, Sneezy-Wheezy, Sniffy, Puffy, Stuffy, Strutty, Shorty, Shifty, Thrifty, Nifty, Neurtsy, Hotsy, Hungry, Hickey, Hoppy, Jumpy, Jaunty, Chesty, Busy, Burpy, Baldy, Biggy-Wiggy, Biggo-Ego, Sleezy

Head writer:
No way! Well then let’s narrow it down. It should end in a ‘y’, I feel. We’re not about to have another ‘Doc’ scenario.

Staff writer:
Mmhmm… and I like the ones starting with ‘S’, they roll off the tongue.

Me:
Me too. I’m also a fan of the long ‘e’ sound. There’s a nice rhyme to it when it ends with a ‘y’ too. So what have we got? I’ll use a regex to sift out the matches. (At this point I start jotting on a notepad)

Head writer:
Regex? What are you talking about? It’s 1993! Come on, man! The early bird special finishes at 5.30!

Me:
Alright… I’m done. (I show them the notepad:)

S.+ee.y

Staff writer (staring blankly):
What on earth does that mean?

Me:
Obviously it’s just an ‘S’, 1 or more other characters, two ‘e’s, another character and then a ‘y’. That should narrow it down. Let me run it through the list (having internalised Ruby Regular Expressions by this point, it takes me but a moment).

Well, it’s become easier… in a sense. We’re down to two matches:

(Sneezy-Wheezy, Sleezy)

Staff writer:
‘Sneezy-Wheezy’? That’s ridiculous, but… I want to get out of here… This is a kids’ film, so that rules out the other option. I mean, ‘Sneezy-Wheezy’ doesn’t sound that bad…

Me:
Wait one minute. I think I can fix this with a well placed question mark! (I go back to the notepad to tweak it. Now it reads:)

S.+?ee.y

I’ve changed it. And now… we get ‘Sneezy’!

Head writer:
I think I love you. Let’s eat!

At this point, I woke up from my dream. A shame I didn’t make it to the weekend, but at least I learnt something:

Quantifiers are modifiers to a pattern in a regular expression, defining how many times that pattern may repeat in a match. Here that’s the ‘+’, which lets us have one or more wildcard characters (.) between the ‘S’ and the ‘ee’. However, the quantifier was being greedy. What does that mean? It’s not just another dwarf name!

The quantifier was grabbing as many characters as it could while still letting the rest of the pattern match. That’s why ‘Sneezy-Wheezy’ came back in the match data: ‘Sneezy’ on its own fits the pattern just as well, but since the quantifier is greedy by default, the match stretches to the last place the ‘ee.y’ can fit, giving us the longest string possible.
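Here is a small Ruby sketch of that greedy behaviour. It uses only a handful of names from the list rather than all of them, and the variable names are my own for illustration.

```ruby
# A handful of names from the writers' list (not the full list).
names = ["Silly", "Snoopy", "Wheezy", "Sneezy-Wheezy", "Sniffy", "Sleezy"]

greedy = /S.+ee.y/

# grep keeps every name that contains a match somewhere inside it.
shortlist = names.grep(greedy)

# String#[] with a regex returns just the matched text (or nil if none).
greedy_text = "Sneezy-Wheezy"[greedy]

puts shortlist.inspect  # ["Sneezy-Wheezy", "Sleezy"]
puts greedy_text        # Sneezy-Wheezy
```

The greedy ‘.+’ first runs to the end of the name, then backs up only as far as it needs to for ‘ee.y’ to fit, so the match ends at the last possible ‘ee.y’ rather than the first.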

Placing a ‘?’ after the quantifier makes it lazy, meaning it will match as few characters as it can while still letting the rest of the pattern match. That’s how we finally narrowed it down!
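A quick Ruby sketch of the difference, comparing the two notepad patterns on the same name (the variable names are just for illustration):

```ruby
name = "Sneezy-Wheezy"

greedy_text = name[/S.+ee.y/]   # .+  takes as many characters as it can
lazy_text   = name[/S.+?ee.y/]  # .+? takes as few characters as it can

puts greedy_text  # Sneezy-Wheezy
puts lazy_text    # Sneezy

# A name with only one possible match comes out the same either way.
sleezy_text = "Sleezy"[/S.+?ee.y/]
puts sleezy_text  # Sleezy
```

One thing worth noting: laziness changes how much text a match covers, not which names contain a match, so ‘Sleezy’ would still need ruling out by the writers themselves.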
