
Remove Content Between Parentheses Using Python Regex

I have a text file containing text like {[a] abc (b(c)d)}. I want to remove the content between the brackets [] and the nested parentheses (()), so the output should be abc. I removed the content between parenthes

Solution 1:

IMO this is not as easy as it might first look. You would very likely need a balanced (recursive) approach, which can be achieved with the newer regex module (a third-party package, installed separately from the standard re):

import regex as re

string = "some lorem ipsum {[a] abc (b(c)d)} some other lorem ipsum {defg}"

rx_part = re.compile(r'{(.*?)}')
rx_nested_parentheses = re.compile(r'\((?:[^()]*|(?R))*\)')
rx_nested_brackets = re.compile(r'\[(?:[^\[\]]*|(?R))*\]')

for match in rx_part.finditer(string):
    part = rx_nested_brackets.sub('', 
        rx_nested_parentheses.sub('', 
            match.group(1))).strip()
    print(part)

This yields:

abc
defg


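The pattern for nested parentheses, broken down:
\(         # opening parenthesis
(?:        # non-capturing group
    [^()]* # anything that is neither ( nor )
    |      # or
    (?R)   # recurse into the whole pattern
)*         # repeat the group zero or more times
\)         # closing parenthesis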
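
To make it more concrete what the recursive pattern matches, here is a minimal sketch of the same balanced removal done with a depth counter in plain Python, without the regex module. This is a swapped-in technique, not the method used in the answer. The name strip_balanced and its parameters are made up for illustration, and the sketch assumes the input is balanced: a stray closing character is kept as-is, and everything after an unclosed opener is dropped.

```python
def strip_balanced(text, opener="(", closer=")"):
    """Drop every balanced opener...closer group, including nested ones.

    Hypothetical helper for illustration; not part of re or regex.
    """
    out = []
    depth = 0
    for ch in text:
        if ch == opener:
            depth += 1              # entering a (possibly nested) group
        elif ch == closer and depth:
            depth -= 1              # leaving the innermost open group
        elif depth == 0:
            out.append(ch)          # keep only characters outside any group
    return "".join(out)


part = "[a] abc (b(c)d)"
cleaned = strip_balanced(strip_balanced(part, "(", ")"), "[", "]").strip()
print(cleaned)  # abc
```

The counter plays the role of (?R): each opener goes one level deeper, and text is kept only at depth zero.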

Solution 2:

You can check whether the string contains {, }, a (...) group with no parentheses inside, or a [...] group with no brackets inside, and keep removing these substrings while there is a match.

import re                                    # Use standard re
s='{[a] abc (b(c)d)}'
rx = re.compile(r'\([^()]*\)|\[[^][]*]|[{}]')
while rx.search(s):                          # While regex matches the string
    s = rx.sub('', s)                        # Remove the matches
print(s.strip())                             # Strip whitespace and show the result
# => abc


It also works with paired, nested (...) and [...], because each pass of the loop removes the innermost groups first.

Pattern details

  • \([^()]*\) - (, then any 0+ chars other than ( and ), and then )
  • | - or
  • \[[^][]*] - [, then any 0+ chars other than [ and ], and then ]
  • | - or
  • [{}] - a character class matching { or }.
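
The loop works from the inside out: a single pass can only delete groups that have no further ( ) or [ ] inside them, so nested input needs several passes. A small sketch that records the string after each pass makes this visible. The helper name remove_enclosed and the steps list are illustrative additions, not part of the answer. re.subn is the standard-library call that returns the new string together with the number of replacements made.

```python
import re

RX = re.compile(r'\([^()]*\)|\[[^][]*]|[{}]')


def remove_enclosed(s):
    """Repeatedly delete innermost (...) / [...] groups and any braces.

    Returns the cleaned string and the intermediate result of each pass.
    """
    steps = []
    while True:
        s, count = RX.subn('', s)
        if count == 0:          # nothing left to delete
            break
        steps.append(s)
    return s.strip(), steps


result, steps = remove_enclosed('{[a] abc (b(c)d)}')
print(steps)    # [' abc (bd)', ' abc ']
print(result)   # abc
```

The first pass removes {, [a], (c) and }, which turns (b(c)d) into (bd). The second pass can then remove (bd).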

Solution 3:

I tried this and got your desired output. I hope I understood you correctly.

import re

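with open('aa.txt') as f:
    text = f.read()                          # avoid shadowing the built-in input()
    # Drop the curly braces
    line = text.replace("{", "")
    line = line.replace("}", "")
    # Greedy .* runs from the first opening to the last closing delimiter on a line
    output = re.sub(r'\[.*\]', "", line)
    output = re.sub(r'\(.*\)', "", output)
    print(output)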
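
One caveat worth illustrating: .* is greedy, so each re.sub call above removes everything from the first opening delimiter to the last closing one on a line. That is fine for the sample input, but on a line with more than one group it also removes the text in between. The sketch below applies the same substitutions to a single string. The function name strip_greedy and the second sample string are made up for illustration.

```python
import re


def strip_greedy(line):
    # Same substitutions as in the answer, applied to one string
    line = line.replace("{", "").replace("}", "")
    line = re.sub(r'\[.*\]', "", line)
    return re.sub(r'\(.*\)', "", line).strip()


print(strip_greedy("{[a] abc (b(c)d)}"))   # abc
print(strip_greedy("{[a] abc [b] def}"))   # def  ("abc" is removed as well)
```

If several groups can appear on one line, a negated character class such as [^][]* or [^()]* in place of .* avoids this over-matching.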
