A regular expression (commonly known as a “regex”) is a string or a sequence of characters that specifies a pattern. Think of it as a search string, but with superpowers!

A plain old search in a text editor or word processor allows you to find simple matches. A regular expression can also perform these simple searches, but it takes things a step further and lets you search for patterns, such as two digits followed by a letter, or three letters followed by a hyphen.

This pattern matching allows you to do useful things like validate fields (phone numbers, email addresses), check user input, perform advanced text manipulation and much, much more.

Use the Download Materials button at the top or bottom of this tutorial to download a Regular Expressions Cheat Sheet PDF and a Swift playground to practice with. You can print out the Cheat Sheet and use it as a reference as you’re developing. Use the Swift playground, which contains examples, to try out lots of different regular expressions. All of the examples of regular expressions that appear, both in this tutorial and the successor, have live examples in that playground, so be sure to check them out.

/The (Basics|)/

If you are new to regular expressions and are wondering what all the hype is about, here’s a simple explanation: regular expressions provide a way to search a given text document for matches to a specific pattern, and they may alter the text based on those matches. There are many awesome books and tutorials written about regular expressions — you’ll find a short list of them at the end of this tutorial.

Regular Expressions Playground

In this tutorial, you’ll create a lot of regular expressions. If you want to try them out visually as you’re working with them, then a Swift playground is an excellent way to do so!

The playground in the materials you’ve downloaded contains a number of functions at the top to highlight the search results from a regular expression within a piece of text, display a list of matches or groups in the results pane of the playground, and replace text. Don’t worry about the implementation of these methods for now though; you can learn about them in the next tutorial. Instead, scroll down to the Basic Examples and Cheat Sheet sections and follow along with the examples.

In the results sidebar of the playground, you’ll see a list of matches alongside each example. For “highlight” examples, you can hover over the result and click the eye or the empty circle icons to display the highlighted matches in the search text.

Viewing results in the playground

You’ll learn how to create NSRegularExpressions later, but for now you can use this playground to get a feeling for how various regular expressions work, and to try out your own patterns.

Examples

Let’s start with a few brief examples to show you what regular expressions look like.

Here’s an example of a regular expression that matches the word “jump”:


jump

That’s about as simple as regular expressions get. You can use some APIs that are available in iOS to search a string of text for any part that matches this regular expression — and once you find a match, you can find where it is or replace the text.

Here’s a slightly more complicated example — this one matches either of the words “jump” or “jumps”:


jump(s)?

This is an example of using some special characters that are available in regular expressions. The parentheses create a group, and the question mark says “match the previous element (the group in this case) 0 or 1 times”.
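
As a quick preview, here is a minimal sketch of running this pattern from Swift. It assumes Foundation's NSRegularExpression; the allMatches(of:in:) helper and the sample sentence are made up for illustration and are not part of the playground.

```swift
import Foundation

// Hypothetical helper for illustration; it is not one of the playground's functions.
// It returns every substring of `text` that `pattern` matches.
func allMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { result in
        Range(result.range, in: text).map { String(text[$0]) }
    }
}

let found = allMatches(of: "jump(s)?", in: "He jumps; they jump; we jumped.")
print(found) // ["jumps", "jump", "jump"]
```

Note that “jumped” still produces a match, because the pattern only describes “jump” plus an optional “s” and says nothing about what follows.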

Now for a really complex example. This one matches a pair of opening and closing HTML tags and the content in between.


<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>

Wow, looks complicated, eh? :] Don’t worry, you’ll be learning about all the special characters in this regular expression in the rest of this tutorial and, by the time you’re done, you’ll understand how this works! :]

If you want more details about the previous regular expression, check out this discussion for an explanation.
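
Here is a rough sketch of the HTML tag pattern at work in Swift. It assumes Swift 5 or later, so that a raw string literal can hold the backslashes in \b and \1 without extra escaping; the sample HTML and the variable names are invented for illustration.

```swift
import Foundation

// Raw string literal (Swift 5+), so \b and \1 need no extra escaping.
let tagPattern = #"<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>"#
let html = "<b>bold</b> and <i>italic</i>"

var tagName = ""
var content = ""
if let regex = try? NSRegularExpression(pattern: tagPattern),
   let match = regex.firstMatch(in: html, range: NSRange(html.startIndex..., in: html)),
   let nameRange = Range(match.range(at: 1), in: html),
   let contentRange = Range(match.range(at: 2), in: html) {
    tagName = String(html[nameRange])       // group 1: the tag name
    content = String(html[contentRange])    // group 2: the text between the tags
}
print(tagName, content) // b bold
```

The \1 at the end refers back to whatever the first group captured, which is how the pattern insists that the closing tag has the same name as the opening tag.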

Overall Concepts

Before you go any further, it’s important to understand a few core concepts about regular expressions.

Literal characters are the simplest kind of regular expression. They’re similar to a “find” operation in a word processor or text editor. For example, the single-character regular expression t will find all occurrences of the letter “t”, and the regular expression jump will find all appearances of “jump”. Pretty straightforward!

Just like a programming language, there are some reserved characters in regular expression syntax, as follows:

  • [
  • ( and )
  • \ (backslash)
  • *
  • +
  • ?
  • { and }
  • ^
  • $
  • .
  • | (pipe)
  • /

These characters are used for advanced pattern matching. If you want to search for one of these characters, you need to escape it with a backslash. For example, to search for all periods in a block of text, the pattern is not . but rather \. (a backslash followed by a period).

Each environment, be it Python, Perl, Java, C#, Ruby or whatever, has special nuances in its implementation of regular expressions. And Swift is no exception!

Both Objective-C and Swift require you to escape special characters in literal strings (i.e., precede them by a backslash character). One such special character is the backslash itself! Since the patterns used to create a regular expression are also strings, this creates an added complication in that you need to escape the backslash character when working with String and NSRegularExpression.

That means the standard regular expression \. will appear as \\. in your Swift (or Objective-C) code.

To clarify the above concept in point form:

  • The literal "\\." defines a string that looks like this: \.
  • The regular expression \. will then match a single period character.
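
The double escaping can be checked with a small sketch like the one below. It assumes Foundation's NSRegularExpression and, for the raw string literal, Swift 5 or later; the matchCount(of:in:) helper and the sample text are hypothetical.

```swift
import Foundation

// Hypothetical helper for illustration: counts how often `pattern` matches in `text`.
func matchCount(of pattern: String, in text: String) -> Int {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return 0 }
    return regex.numberOfMatches(in: text, range: NSRange(text.startIndex..., in: text))
}

let text = "Wait. What? Go."

let escaped = "\\."   // the Swift literal "\\." holds the two characters \.
let raw = #"\."#      // raw string literal (Swift 5+): the same two characters

print(escaped == raw)                    // true
print(matchCount(of: escaped, in: text)) // 2, only the literal periods
print(matchCount(of: ".", in: text))     // 15, every character in the text
```

Raw string literals are one way to avoid doubling every backslash, which can make longer patterns easier to read.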

Capturing parentheses are used to group part of a pattern. For example, 3 (pm|am) would match the text “3 pm” as well as the text “3 am”. The pipe character here (|) acts like an OR operator. You can include as many pipe characters in your regular expression as you would like. As an example, (Tom|Dick|Harry) is a valid pattern that matches any of those three names.

Grouping with parentheses comes in handy when you need to optionally match a certain text string. Say you are looking for “November” in some text, but it’s possible the user abbreviated the month as “Nov”. You can define the pattern as Nov(ember)? where the question mark after the capturing parentheses means that whatever is inside the parentheses is optional.
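
The grouping patterns above can be tried out with a short sketch such as this one. It assumes Foundation's NSRegularExpression; the allMatches(of:in:) helper and the sample strings are made up for illustration.

```swift
import Foundation

// Hypothetical helper for illustration: returns every substring that `pattern` matches.
func allMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { result in
        Range(result.range, in: text).map { String(text[$0]) }
    }
}

// The pipe picks one alternative inside the group.
let times = allMatches(of: "3 (pm|am)", in: "3 pm, 3 am, 3 xm")
print(times)  // ["3 pm", "3 am"]

// The question mark makes the whole group optional.
let months = allMatches(of: "Nov(ember)?", in: "Nov 5 and November 12")
print(months) // ["Nov", "November"]

// Any number of alternatives can be listed.
let names = allMatches(of: "(Tom|Dick|Harry)", in: "Tom met Sally and Harry")
print(names)  // ["Tom", "Harry"]
```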

These parentheses are called “capturing” because they capture the matched content and allow you to reference it in other places in your regular expression.

As an example, assume you have the string “Say hi to Harry”. If you created a search-and-replace regular expression to replace any occurrences of (Tom|Dick|Harry) with that guy $1, the result would be “Say hi to that guy Harry”. The $1 allows you to reference the first captured group of the preceding rule.
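
One way to reproduce this replacement in Swift is sketched below. It assumes Foundation's replacingOccurrences(of:with:options:) with the .regularExpression option, which treats $1 in the replacement string as a reference to the first captured group; the variable names are illustrative.

```swift
import Foundation

let greeting = "Say hi to Harry"

// With .regularExpression, the first argument is a pattern and
// $1 in the replacement stands for the first captured group.
let replaced = greeting.replacingOccurrences(
    of: "(Tom|Dick|Harry)",
    with: "that guy $1",
    options: .regularExpression
)
print(replaced) // Say hi to that guy Harry
```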

Capturing and non-capturing groups are somewhat advanced topics. You’ll encounter examples of capturing and non-capturing groups in the follow-up tutorial.

Character classes represent a set of possible single-character matches. Character classes appear between square brackets ([ and ]).

As an example, the regular expression t[aeiou] will match “ta”, “te”, “ti”, “to”, or “tu”. You can have as many character possibilities inside the square brackets as you like, but remember that any single character in the set will match. [aeiou] looks like five characters, but it actually means “a” or “e” or “i” or “o” or “u”.

You can also define a range in a character class if the characters appear consecutively. For example, to search for a number between 100 and 109, the pattern would be 10[0-9]. This returns the same results as 10[0123456789], but using ranges makes your regular expressions much cleaner and easier to understand.

But character classes aren’t limited to numbers — you can do the same thing with characters. For instance, [a-f] will match “a”, “b”, “c”, “d”, “e”, or “f”.

Character classes usually contain the characters you want to match, but what if you want to explicitly not match a character? You can also define negated character classes, which start with the ^ character. For example, the pattern t[^o] will match any combination of “t” and one other character except for the single instance of “to”.
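
The three kinds of character class described above (a set, a range and a negated set) can be compared in a sketch like this. It assumes Foundation's NSRegularExpression; the allMatches(of:in:) helper and the sample strings are hypothetical.

```swift
import Foundation

// Hypothetical helper for illustration: returns every substring that `pattern` matches.
func allMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { result in
        Range(result.range, in: text).map { String(text[$0]) }
    }
}

// A set: "t" followed by exactly one vowel.
let vowels = allMatches(of: "t[aeiou]", in: "ta te ti to tu tz")
print(vowels)  // ["ta", "te", "ti", "to", "tu"]

// A range: 100 through 109.
let numbers = allMatches(of: "10[0-9]", in: "99 100 105 109 110")
print(numbers) // ["100", "105", "109"]

// A negated set: "t" followed by any one character except "o".
let notO = allMatches(of: "t[^o]", in: "to ta tx")
print(notO)    // ["ta", "tx"]
```

Keep in mind that a negated class still has to match one character, so t[^o] would also match a “t” followed by a space.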

NSRegularExpressions Cheat Sheet

Regular expressions are a great example of a simple syntax that can end up with some very complicated arrangements! Even the best regular expression wranglers keep a cheat sheet handy for those odd corner cases.

The official raywenderlich.com Regular Expressions Cheat Sheet PDF is included in the download materials available via the Download Materials button at the top or bottom of this tutorial.

In addition, here’s an abbreviated form of the cheat sheet below with some additional explanations to get you started:

  • . matches any character. p.p matches pop, pup, pmp, p@p, and so on.
  • \w matches any “word-like” character which includes the set of numbers, letters, and underscore, but does not match punctuation or other symbols. hello\w will match “hello_” and “hello9” and “helloo” but not “hello!”
  • \d matches a numeric digit, which in most cases means [0-9]. \d\d?:\d\d will match strings in time format, such as “9:30” and “12:45”.
  • \b matches a word boundary, that is, the position between a word-like character and a non-word character such as a space or punctuation mark. to\b will match the “to” in “to the moon” and “to!”, but it will not match “tomorrow”. \b is handy for “whole word” type matching.
  • \s matches whitespace characters such as spaces, tabs, and newlines. hello\s will match “hello ” in “Well, hello there!”.
  • ^ matches at the beginning of a line. Note that this particular ^ is different from ^ inside of the square brackets! For example, ^Hello will match against the string “Hello there”, but not “He said Hello”.
  • $ matches at the end of a line. For example, the end$ will match against “It was the end” but not “the end was near”.
  • * matches the previous element 0 or more times. 12*3 will match 13, 123, 1223, 122223, and 1222222223.
  • + matches the previous element 1 or more times. 12+3 will match 123, 1223, 122223, 1222222223, but not 13.
  • Curly braces {} contain the minimum and maximum number of matches. For example, 10{1,2}1 will match both “101” and “1001” but not “10001” as the minimum number of matches is one and the maximum number of matches is two. He[Ll]{2,}o will match “HeLLo” and “HellLLLllo” and any such silly variation of “hello” with lots of L’s, since the minimum number of matches is 2 but the maximum number of matches is not set — and therefore unlimited!
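
Several of the cheat sheet entries can be checked with a sketch along these lines. It assumes Foundation's NSRegularExpression and Swift 5 raw string literals (so each backslash is written once); the allMatches(of:in:) helper and the sample strings are made up for illustration.

```swift
import Foundation

// Hypothetical helper for illustration: returns every substring that `pattern` matches.
func allMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { result in
        Range(result.range, in: text).map { String(text[$0]) }
    }
}

let wordLike = allMatches(of: #"hello\w"#, in: "hello_ hello9 hello!")
print(wordLike)  // ["hello_", "hello9"]

let clock = allMatches(of: #"\d\d?:\d\d"#, in: "Meet at 9:30 or 12:45")
print(clock)     // ["9:30", "12:45"]

let wholeWord = allMatches(of: #"to\b"#, in: "to the moon, tomorrow")
print(wholeWord) // ["to"]

let star = allMatches(of: "12*3", in: "13 123 1223")
print(star)      // ["13", "123", "1223"]

let plus = allMatches(of: "12+3", in: "13 123 1223")
print(plus)      // ["123", "1223"]

let braces = allMatches(of: "10{1,2}1", in: "101 1001 10001")
print(braces)    // ["101", "1001"]

// Anchors: ^Hello only matches when the text starts with "Hello".
let startsWithHello = "Hello there".range(of: "^Hello", options: .regularExpression) != nil
let saidHello = "He said Hello".range(of: "^Hello", options: .regularExpression) != nil
print(startsWithHello, saidHello) // true false
```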

That’s enough to get you started!

It’s time to start experimenting with these examples yourself, as they’re all included in the playground mentioned above.

Where to Go From Here?

Here is a short list of some useful resources about regular expressions:

Make sure to download the Regular Expressions Cheat Sheet PDF and practice playground by using the Download Materials button at the top or bottom of this tutorial.

Head over to our NSRegularExpression Tutorial to learn how to use regular expressions in your Swift code! :]



Source link https://www.raywenderlich.com/5767-an-introduction-to-regular-expressions
