
Selecting Variations Of Phone Numbers Using Regex

import re
s = 'so the 1234 2-1-1919 215.777.9839 1333331234 20-20-2000 A1234567 (515)2331129 7654321B (511)231-1134 512-333-1134 7777777 a7727373 there 1-22-2001 *1831 5647

Solution 1:

To put my two cents in, you could use a regex/parser combination as in:

from parsimonious.grammar import Grammar
from parsimonious.exceptions import IncompleteParseError, ParseError
import re

junk = """so the 1234 2-1-1919 215.777.9839 1333331234 20-20-2000 A1234567 (515)2331129 7654321B 
(511)231-1134 512-333-1134 7777777 a7727373 there 1-22-2001 *1831 5647 and !2783"""

rx = re.compile(r'[-()\d]+')
grammar = Grammar(
    r"""
    phone       = area part part
    area        = (lpar digits rpar) / digits
    part        = dash? digits

    lpar        = "("
    rpar        = ")"
    dash        = "-"
    digits      = ~"\d{3,4}"
    """
)

for match in rx.finditer(junk):
    possible_number = match.group(0)
    try:
        tree = grammar.parse(possible_number)
        print(possible_number)
    except (ParseError, IncompleteParseError):
        pass

This yields

(515)2331129
(511)231-1134
512-333-1134

The idea here is to first match possible candidates which are then checked with the parser grammar.
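
If pulling in a parser library is not an option, the same two-stage idea can be sketched with the standard library alone. This is only an illustration: `strict_rx` and `find_phones` are hypothetical names, and the strict pattern is an assumption that stands in for the grammar. It is deliberately narrower than the grammar (fixed 3-3-4 digit groups), so the two are not interchangeable.

```python
import re

# Stage 1: loose candidate pattern, the same one used in the answer above.
candidate_rx = re.compile(r'[-()\d]+')

# Stage 2 (assumption): a strict pattern standing in for the grammar.
# Accepts (ddd)ddddddd, (ddd)ddd-dddd and ddd-ddd-dddd only.
strict_rx = re.compile(r'\(\d{3}\)\d{3}-?\d{4}|\d{3}-\d{3}-\d{4}')


def find_phones(text):
    """Return the candidates that match the strict pattern in full."""
    return [
        m.group(0)
        for m in candidate_rx.finditer(text)
        if strict_rx.fullmatch(m.group(0))
    ]


junk = """so the 1234 2-1-1919 215.777.9839 1333331234 20-20-2000 A1234567 (515)2331129 7654321B
(511)231-1134 512-333-1134 7777777 a7727373 there 1-22-2001 *1831 5647 and !2783"""

print(find_phones(junk))
# ['(515)2331129', '(511)231-1134', '512-333-1134']
```

Using `fullmatch` in the second stage matters: it rejects candidates such as `1333331234` or `20-20-2000` that merely contain digits and dashes without having the expected shape.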

Solution 2:

One option is an alternation with one branch per format you expect:

\d{3}-\d{3}-\d{4}|\(\s*\d{3}\s*\)\d{7}|\(\s*\d{3}\s*\)\s*\d{3}-\d{4}

If necessary, we can also add lookarounds so that a match has to be delimited by whitespace or by the start or end of the string:

(?<!\S)(?:\d{3}-\d{3}-\d{4}|\(\s*\d{3}\s*\)\d{7}|\(\s*\d{3}\s*\)\s*\d{3}-\d{4})(?!\S)
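
To see what the boundaries change, here is a small comparison of the two expressions. The sample text and the names `plain` and `bounded` are made up for illustration.

```python
import re

core = r"\d{3}-\d{3}-\d{4}|\(\s*\d{3}\s*\)\d{7}|\(\s*\d{3}\s*\)\s*\d{3}-\d{4}"

plain = re.compile(core)
# Same alternation, but it must not touch any non-whitespace on either side.
bounded = re.compile(r"(?<!\S)(?:" + core + r")(?!\S)")

# Hypothetical sample: one clean number, one embedded in a longer token,
# and one followed directly by a full stop.
sample = "call 512-333-1134 or id:512-333-11345 or (511)231-1134."

print(plain.findall(sample))
# ['512-333-1134', '512-333-1134', '(511)231-1134']
print(bounded.findall(sample))
# ['512-333-1134']
```

Note that the bounded version also drops the last number because of the trailing full stop. If punctuation directly after a number should be tolerated, the `(?!\S)` lookahead would have to be relaxed, for example to `(?!\w)`.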

Test

import re

expression = r"\d{3}-\d{3}-\d{4}|\(\s*\d{3}\s*\)\d{7}|\(\s*\d{3}\s*\)\s*\d{3}-\d{4}"

string = """
so the 1234 2-1-1919 215.777.9839 1333331234 20-20-2000 A1234567 (515)2331129 7654321B (511)231-1134 512-333-1134 7777777 a7727373 there 1-22-2001 *1831 5647 and !2783 (511) 231-1134 ( 511)231-1134 (511 ) 231-1134
511-2311134

"""

print(re.findall(expression, string))

Output

['(515)2331129', '(511)231-1134', '512-333-1134', '(511) 231-1134', '( 511)231-1134', '(511 ) 231-1134']

If you wish to explore, simplify, or modify the expression, regex101.com explains it in its top-right panel. You can also watch there how it would match against some sample inputs.
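
As one possible simplification, the two parenthesized branches share the same area-code prefix and can be merged. The sketch below writes that variant with `re.VERBOSE` so each part can carry a comment; `phone_rx` and the sample string are illustrative names, not part of the original answer.

```python
import re

phone_rx = re.compile(
    r"""
      \d{3}-\d{3}-\d{4}        # plain dashed form: 512-333-1134
    |
      \(\s*\d{3}\s*\)          # area code in parentheses, inner spaces allowed
      (?:
          \d{7}                # seven digits right after: (515)2331129
        | \s*\d{3}-\d{4}       # or optional spaces, then 231-1134
      )
    """,
    re.VERBOSE,
)

sample = (
    "(515)2331129 (511)231-1134 512-333-1134 (511) 231-1134 "
    "( 511)231-1134 (511 ) 231-1134 511-2311134 1333331234"
)

print(phone_rx.findall(sample))
# ['(515)2331129', '(511)231-1134', '512-333-1134', '(511) 231-1134', '( 511)231-1134', '(511 ) 231-1134']
```

The merged pattern accepts the same strings as the three-branch alternation; it only avoids repeating the `\(\s*\d{3}\s*\)` prefix.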


RegEx Circuit

jex.im visualizes regular expressions.
