Remove Content Between Parentheses Using Python Regex
I have a text file with content like {[a] abc (b(c)d)}. I want to remove the content between the brackets [] and the nested parentheses (()), so the output should be: abc. I removed the content between parenthes
Solution 1:
IMO this is not as easy as it might first look. You'd very likely need a balanced (recursive) approach, which can be achieved with the newer regex module:
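import regex as re  # third-party module (pip install regex); supports recursion via (?R)

string = "some lorem ipsum {[a] abc (b(c)d)} some other lorem ipsum {defg}"

rx_part = re.compile(r'{(.*?)}')                              # content of each {...} block
rx_nested_parentheses = re.compile(r'\((?:[^()]*|(?R))*\)')   # balanced (...) groups
rx_nested_brackets = re.compile(r'\[(?:[^\[\]]*|(?R))*\]')    # balanced [...] groups

for match in rx_part.finditer(string):
    # Strip parentheses first, then brackets, then surrounding whitespace
    part = rx_nested_brackets.sub('',
        rx_nested_parentheses.sub('',
            match.group(1))).strip()
    print(part)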
Which would yield
abc
defg
The pattern is
\(              # opening parenthesis
(?:             # non-capturing group
    [^()]*      # not ( nor )
    |           # or
    (?R)        # repeat the pattern
)*              # zero or more times
\)              # closing parenthesis
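The recursive pattern needs the third-party regex module, since the standard re module does not support (?R). As a rough alternative using only built-in Python, the same balanced removal can be sketched with a stack of expected closing characters. This is a swapped-in technique, not the answer's method; the name strip_balanced and its pairs parameter are made up for illustration, and unbalanced input is handled only crudely (an unclosed opener swallows the rest of the string).

```python
def strip_balanced(text, pairs=(("(", ")"), ("[", "]"))):
    """Drop every bracketed group, nested ones included (hypothetical helper)."""
    openers = {opener: closer for opener, closer in pairs}
    out = []
    stack = []  # closing characters we are still waiting for
    for ch in text:
        if ch in openers:
            stack.append(openers[ch])   # entering a (possibly nested) group
        elif stack and ch == stack[-1]:
            stack.pop()                 # leaving the innermost open group
        elif not stack:
            out.append(ch)              # outside all groups: keep the character
        # otherwise we are inside a group and the character is dropped
    return "".join(out)


cleaned = strip_balanced("[a] abc (b(c)d)").strip()
print(cleaned)
```

The curly braces are not handled here; as in the answer above, they would be stripped in a separate step.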
Solution 2:
You may check if a string contains {, }, (<no_parentheses_here>) or [no_brackets_here] substrings and remove them while there is a match.
import re  # Use standard re

s = '{[a] abc (b(c)d)}'
rx = re.compile(r'\([^()]*\)|\[[^][]*]|[{}]')
while rx.search(s):      # While regex matches the string
    s = rx.sub('', s)    # Remove the matches
print(s.strip())         # Strip whitespace and show the result
# => abc
It will also work with paired nested (...) and [...], too.
Pattern details
\([^()]*\) - (, then any 0+ chars other than ( and ), and then )
| - or
\[[^][]*] - [, then any 0+ chars other than [ and ], and then ]
| - or
[{}] - a character class matching { or }.
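To see why the loop copes with nesting, it can help to record the string after each pass: every pass deletes only the innermost groups, so the enclosing group becomes innermost on the next pass. The sketch below reuses the same pattern; the function name remove_groups and the passes list are illustrative additions, and re.subn is used so that a single call both substitutes and reports how many matches were removed.

```python
import re

INNERMOST = re.compile(r'\([^()]*\)|\[[^][]*]|[{}]')


def remove_groups(text):
    """Delete innermost (...) / [...] groups and braces until none remain."""
    passes = []
    while True:
        text, count = INNERMOST.subn('', text)
        if count == 0:          # nothing left to remove
            break
        passes.append(text)     # keep the intermediate result for inspection
    return text.strip(), passes


result, passes = remove_groups('{[a] abc (b(c)d)}')
print(passes)   # [' abc (bd)', ' abc ']
print(result)   # abc
```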
Solution 3:
I tried this and got your desired output. I hope I understood you correctly.
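import re

with open('aa.txt') as f:
    text = f.read()  # avoid shadowing the built-in input()

# Drop the curly braces, then everything between [...] and (...)
line = text.replace("{", "")
line = line.replace("}", "")
output = re.sub(r'\[.*\]', "", line)
output = re.sub(r'\(.*\)', "", output)
print(output)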
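One caveat with the greedy .* used here: it matches from the first opening character to the last closing character on a line, so text sitting between two separate groups is removed as well. The sketch below contrasts that with an innermost-first loop on a made-up sample line; the sample string and variable names are assumptions for illustration only.

```python
import re

sample = "[a] abc (b(c)d) def [e] ghi"

# Greedy version: \[.*\] spans from the first "[" to the last "]",
# taking " abc (b(c)d) def " with it.
greedy = re.sub(r'\(.*\)', "", re.sub(r'\[.*\]', "", sample))

# Innermost-first loop: only text inside a group is removed.
innermost = re.compile(r'\([^()]*\)|\[[^][]*]')
cleaned = sample
while innermost.search(cleaned):
    cleaned = innermost.sub("", cleaned)
cleaned = " ".join(cleaned.split())   # collapse leftover runs of whitespace

print(repr(greedy))    # ' ghi'
print(repr(cleaned))   # 'abc def ghi'
```

For input with a single group of each kind per line, as in the question, both versions give the same result.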