- What is RegEx?
- What are RegEx Meta Characters?
- What are RegEx Replacement Characters?
- What are RegEx Character Classes?
A regular expression (regex) is a text string made up of characters and symbols that together form a pattern used to match, find, or manipulate text. In other words, a regular expression lets us describe the phrase we are looking for as a pattern and then locate it easily, even in long texts.
Many data science professionals, analysts, and programmers have to deal with regular expressions at some point.
RegEx is widely supported and can be applied to almost any data that is represented as text. Various data analysis platforms and programming languages such as SQL, Python, R, Tableau, Java, and .NET support it.
For example, if you are checking keywords one by one to separate brand terms from non-brand terms within a campaign, a single regular expression can make the analysis much faster.
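As a rough illustration of the brand-term case, the sketch below separates search queries that mention a brand from those that do not. The brand spellings ("acme", "akme") and the sample queries are invented for this example, and the helper name split_brand_queries is hypothetical.

```python
import re

# Hypothetical brand spellings, joined with "|" so that either one counts as a hit.
BRAND_PATTERN = re.compile(r"acme|akme", re.IGNORECASE)

def split_brand_queries(queries):
    """Return (brand, non_brand) lists depending on whether a brand term appears."""
    brand, non_brand = [], []
    for query in queries:
        if BRAND_PATTERN.search(query):
            brand.append(query)
        else:
            non_brand.append(query)
    return brand, non_brand

# Invented sample data.
queries = ["Acme running shoes", "cheap running shoes", "akme store hours", "shoe repair"]
brand, non_brand = split_brand_queries(queries)
print(brand)      # ['Acme running shoes', 'akme store hours']
print(non_brand)  # ['cheap running shoes', 'shoe repair']
```

One pattern replaces a separate check for every spelling, which is where the time saving comes from.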
RegEx Meta Characters
Vertical Line “|”: Means “or”; it matches the expression on either side of it. The vertical line is used very often when working with Analytics. Many regular expressions can be replaced with the default options offered by Analytics, but using the vertical line still saves time.
Dot “.”: Represents any single character; any character can appear in its place. Example: .at matches kat, sat, etc.
Asterisk “*”: Matches the preceding element zero or more times, so the element may be absent or repeated.
Plus “+”: Matches the preceding element one or more times. Example: hello+ matches hello, helloo, hellooo, etc.
Backslash “\”: The backslash is a very important and useful sign. We use it to turn a RegEx meta character into a normal character with no special meaning. For example, 216\.335\.128\.120 matches only the IP 216.335.128.120, because each escaped dot matches a literal dot instead of any character.
Caret “^”: Refers to the beginning of a line. It means “starts with”.
Dollar Sign “$”: Indicates the end of a line. It means “ends with”.
Question Mark “?”: Makes the preceding character optional, so the pattern matches whether or not that character is present. It is often added to catch spelling variants and misspellings.
Parentheses “( )”: The content of the parentheses is grouped and treated as a single element.
Square Brackets “[ ]”: Matches any one of the characters inside the square brackets.
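The meta characters above can be tried out with a short Python sketch. The helper name matches and all sample strings are assumptions made for illustration; Python's re module is used here, and other tools such as Analytics may differ in small details.

```python
import re

def matches(pattern, text):
    """Return every non-overlapping match of pattern in text, as a list of strings."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# Vertical line: either alternative may match.
alternation = matches(r"cat|dog", "cat, dog, bird")            # ['cat', 'dog']
# Dot: any single character (a newline is excluded by default).
dot = matches(r".at", "kat sat")                               # ['kat', 'sat']
# Asterisk: zero or more of the preceding element.
star = matches(r"ab*", "a ab abbb")                            # ['a', 'ab', 'abbb']
# Plus: one or more of the preceding element.
plus = matches(r"hello+", "hell hello hellooo")                # ['hello', 'hellooo']
# Backslash: the escaped dot matches only a literal dot.
ip_text = "216.335.128.120 216x335x128x120"
escaped = matches(r"216\.335\.128\.120", ip_text)              # one match
unescaped = matches(r"216.335.128.120", ip_text)               # two matches
# Caret and dollar sign: anchor to the start and the end.
starts_with = re.search(r"^shoes", "shoes for sale") is not None   # True
ends_with = re.search(r"shoes$", "running shoes") is not None      # True
# Question mark: the preceding character is optional.
optional = matches(r"colou?r", "color colour")                 # ['color', 'colour']
# Parentheses: the group is repeated as one element.
grouped = matches(r"(ha)+", "hahaha ha")                       # ['hahaha', 'ha']
# Square brackets: exactly one of the listed characters.
bracket = matches(r"gr[ae]y", "gray grey groy")                # ['gray', 'grey']

print(alternation, dot, star, plus, optional, grouped, bracket)
```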
RegEx Replacement Characters
“$number” : Inserts the last substring matched by the group with the given decimal number.
“${name}” : Inserts the last substring matched by the group with the given name.
“$$” : Inserts a literal “$” sign.
“$&” : Inserts a copy of the entire match.
“$`” : Inserts all of the input string before the match.
“$'” : Inserts all of the input string after the match.
“$+” : Inserts the last group that was captured.
“$_” : Inserts the entire input string.
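The replacement characters listed above use the .NET notation. As a hedged sketch of the same ideas in Python, the example below uses the notation of Python's re module, which differs: \g<1> stands in for $1, \g<name> for ${name}, and \g<0> for $&. The text before and after the match and the whole input have no replacement token in Python, so they are read from the match object in a small function. The sample strings and the name describe are invented for illustration.

```python
import re

date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# Numbered groups: \g<1> in Python plays the role of $1.
by_number = date_pattern.sub(r"\g<3>/\g<2>/\g<1>", "2024-05-17")        # '17/05/2024'

# Named groups: \g<name> plays the role of ${name}.
by_name = date_pattern.sub(r"\g<day>.\g<month>.\g<year>", "2024-05-17")  # '17.05.2024'

# Whole match: \g<0> plays the role of $&.
whole = re.sub(r"\d+", r"[\g<0>]", "order 42")                           # 'order [42]'

# A literal dollar sign needs no escaping in a Python replacement string.
price = re.sub(r"(\d+) USD", r"$\g<1>", "costs 15 USD")                  # 'costs $15'

def describe(match):
    """Build a replacement from the text around the match."""
    before = match.string[:match.start()]   # text before the match
    after = match.string[match.end():]      # text after the match
    whole_input = match.string              # the entire input string
    return f"<{before}|{after}|{whole_input}>"

context = re.sub(r"-", describe, "ab-cd", count=1)                       # 'ab<ab|cd|ab-cd>cd'

print(by_number, by_name, whole, price, context)
```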
RegEx Character Classes
“\p{name}”: Matches any single character in the “Unicode” category or block given by name. For example, name can be “Ll, Lu, Nd, Z, IsGreek”.
“\P{name}”: Matches any single character that is not in the named “Unicode” category or block.
“\w”: Matches any word character. It is equivalent to the following “Unicode” categories: “[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}\p{Lm}]”. When ECMAScript-compliant behavior is selected, it corresponds to: “[a-zA-Z_0-9]”.
“\W”: Matches any character that is not a word character. It is equivalent to the following “Unicode” categories: “[^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}\p{Lm}]”. When ECMAScript-compliant behavior is selected, it corresponds to: “[^a-zA-Z_0-9]”.
“\d” : Matches any decimal digit. It is equivalent to the “Unicode” category “\p{Nd}”. When ECMAScript-compliant behavior is selected, it corresponds to “[0-9]”.
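To see the word, non-word, and digit classes in action, here is a small Python sketch. Note the assumptions: Python's built-in re module does not accept \p{name}, so the standard unicodedata module is used to look up each character's Unicode category instead, and the re.ASCII flag stands in for the ASCII-only behavior described above. The sample string is invented.

```python
import re
import unicodedata

# Invented sample: letters, underscore, digit, space, accented letter, punctuation.
sample = "Ab_9 \u00e9!"

word_chars = re.findall(r"\w", sample)       # word characters, Unicode-aware by default
non_word_chars = re.findall(r"\W", sample)   # everything that is not a word character
digits = re.findall(r"\d", sample)           # decimal digits

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so the accented letter drops out.
ascii_word_chars = re.findall(r"\w", sample, flags=re.ASCII)

# Unicode general category of each character, e.g. Lu, Ll, Pc, Nd, Zs, Po.
categories = [unicodedata.category(ch) for ch in sample]
uppercase_letters = [ch for ch in sample if unicodedata.category(ch) == "Lu"]

print(word_chars)
print(non_word_chars)
print(digits)
print(ascii_word_chars)
print(categories)
print(uppercase_letters)
```

The category codes printed at the end are the same short names (Ll, Lu, Nd, and so on) that \p{name} accepts in engines that support it.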
In C#, placing the “@” sign in front of a string literal makes it a verbatim string, so the backslashes in the special characters described above can be written without doubling them. To use regular expressions in .NET, the “System.Text.RegularExpressions” namespace is used. Regular expression objects are created with the “Regex” class in this namespace.
Seven commonly used methods of the “Regex” class are:
Escape: Escapes the meta characters in a string so that they are treated as normal characters.
Unescape: Converts the characters escaped by the “Escape” method back to their unescaped form.
IsMatch: Returns true or false (boolean) depending on whether a regular expression matches a string.
Match: Returns the first match of the regular expression in a string.
Matches: Returns a collection of all matches of the regular expression in a string.
Replace: Replaces the substrings matched by the regular expression with a given replacement string.
Split: Splits a string into an array of substrings at the positions matched by the regular expression.
Regular expressions are often used to extract items such as an email address, phone number, or date from a text. Here are simple example patterns for each of these:
Email pattern: “^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$”
Phone number pattern: “0([0-9]{3})-[0-9]{3}-[0-9]{2}-[0-9]{2}”
Date pattern: “^\d{1,2}/\d{1,2}/\d{4}$”
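The sketch below tries these three patterns in Python, using rough counterparts of the methods listed earlier: search for IsMatch and Match, finditer for Matches, sub for Replace, split for Split, and escape for Escape. These pairings are approximations, not exact equivalents of the .NET API. One adjustment is an assumption of this example: Python rejects the bracket expression [\w-\.], so the hyphen is moved to the end as [\w.-]. All sample strings are invented.

```python
import re

# Email pattern, with the hyphen moved to the end of the class for Python.
EMAIL = re.compile(r"^[\w.-]+@([\w-]+\.)+[\w-]{2,4}$")
PHONE = re.compile(r"0([0-9]{3})-[0-9]{3}-[0-9]{2}-[0-9]{2}")
DATE = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$")

# IsMatch-style checks: does the pattern match at all?
email_ok = EMAIL.search("jane.doe@example.com") is not None    # True
email_bad = EMAIL.search("jane.doe@example") is not None       # False
date_ok = DATE.search("7/4/2024") is not None                  # True
date_bad = DATE.search("2024-07-04") is not None               # False

# Match-style: first match and its captured group.
message = "call 0212-555-12-34 or 0216-444-56-78 today"
first = PHONE.search(message)
first_number = first.group(0)    # '0212-555-12-34'
first_prefix = first.group(1)    # '212'

# Matches-style: every match in the text.
all_numbers = [m.group(0) for m in PHONE.finditer(message)]

# Replace-style: mask each phone number.
masked = PHONE.sub("[phone]", message)

# Split-style: cut the text at commas or semicolons.
parts = re.split(r"[,;]\s*", "a, b;c")                         # ['a', 'b', 'c']

# Escape-style: make user text safe to use as a literal pattern.
literal = re.escape("1+1=2")
literal_ok = re.fullmatch(literal, "1+1=2") is not None        # True

print(email_ok, email_bad, date_ok, date_bad)
print(first_number, first_prefix, all_numbers)
print(masked)
print(parts, literal_ok)
```

These patterns are deliberately simple; they check the general shape of the text rather than whether an address, number, or date is actually valid.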