Regular Expression Extracting Number Dimension
I'm using Python regular expressions to extract dimensional information from a database. The entries in that column look like this:
23 cm
43 1/2 cm
20cm
15 cm x 30 cm
What I need
Solution 1:
Try the regex below. It captures the first run of digits, plus an optional fraction that follows it, before the first 'cm'.
import re

# Option 1: lazy match; this works for all of your example data
regex = re.compile(r'(\d+.*?)\s?cm')
# Option 2: asserts that whatever follows the first digit group must be a fraction
regex = re.compile(r'(\d+(?:\s+\d+/\d+)?)\s?cm')

>>> regex.match('23 cm').group(1)
'23'
>>> regex.match('43 1/2 cm').group(1)
'43 1/2'
>>> regex.match('20cm').group(1)
'20'
>>> regex.match('15 cm x 30 cm').group(1)
'15'
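If the captured text needs to become a number, one possible follow-up is sketched here. It assumes the stricter second pattern and the standard-library fractions module; the helper name first_cm_value is hypothetical and not part of the original answer.

```python
import re
from fractions import Fraction

# Stricter pattern from Solution 1: digits plus an optional "n/d" fraction.
regex = re.compile(r'(\d+(?:\s+\d+/\d+)?)\s?cm')

def first_cm_value(entry):
    """Hypothetical helper: first 'cm' measurement as a Fraction, or None."""
    m = regex.match(entry)
    if m is None:
        return None
    # '43 1/2'.split() gives ['43', '1/2']; Fraction parses both forms.
    return sum(Fraction(part) for part in m.group(1).split())

print(float(first_cm_value('43 1/2 cm')))  # 43.5
```

Returning a Fraction keeps values such as 43 1/2 exact; call float() on the result only where a decimal is needed.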
Solution 2:
This regex should work:
^(\d+)(?:\s*cm\s+[xX])
Explanation
^(\d+) - capture at least one digit at the beginning of the line
(?: - start non-capturing group
\s* - followed by zero or more whitespace characters
cm - followed by a literal c and m
\s+ - followed by at least one whitespace character
[xX] - followed by a literal x or X
) - end non-capturing group
You shouldn't need to bother matching the rest of the line.
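A minimal sketch of how this pattern behaves in Python against the sample entries from the question. The helper name first_dimension is an assumption made for illustration. Note that the pattern only matches entries with a second dimension after an x, so the single-dimension entries return None.

```python
import re

# Pattern from Solution 2: leading digits, then "cm", whitespace, and x or X.
pattern = re.compile(r'^(\d+)(?:\s*cm\s+[xX])')

# Sample entries taken from the question.
entries = ['23 cm', '43 1/2 cm', '20cm', '15 cm x 30 cm']

def first_dimension(entry):
    """Hypothetical helper: leading number of an 'A cm x B cm' entry, or None."""
    m = pattern.match(entry)
    return m.group(1) if m else None

results = [first_dimension(e) for e in entries]
print(results)  # [None, None, None, '15']
```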
Solution 3:
Here's a sample of how to do it from a text file. It works for the provided data.
# Print the text before the first 'x' on each line that contains one
with open("textfile.txt", "r") as f:
    for line in f:
        if 'x' in line:
            iposition = line.find('x')
            print(line[:iposition])