
Regular Expression Extracting Number Dimension

I'm using Python regular expressions to extract dimensional information from a database. The entries in that column look like this:

23 cm
43 1/2 cm
20cm
15 cm x 30 cm

What I need

Solution 1:

Try the regex below. It captures the leading digits, plus any optional fraction that follows them, before the first 'cm'.

import re
# Works for all of your example data:
regex = re.compile(r'(\d+.*?)\s?cm')
# Or, stricter: anything after the first digit group must be a fraction:
regex = re.compile(r'(\d+(?:\s+\d+\/\d+)?)\s?cm')


>>> regex.match('23 cm').group(1)
'23'
>>> regex.match('43 1/2 cm').group(1)
'43 1/2'
>>> regex.match('20cm').group(1)
'20'
>>> regex.match('15 cm x 30 cm').group(1)
'15'
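
As a minimal sketch, the stricter pattern can be run over all four sample entries in one pass. The `first_dimension` helper and the `samples` list are names introduced here for illustration only; they are not part of the original answer.

```python
import re

# Stricter pattern from this answer: digits, an optional " n/m" fraction, then "cm".
DIMENSION = re.compile(r'(\d+(?:\s+\d+/\d+)?)\s?cm')

def first_dimension(text):
    """Return the first dimension in text as a string, or None if there is no match."""
    m = DIMENSION.match(text)
    return m.group(1) if m else None

# Sample entries taken from the question.
samples = ['23 cm', '43 1/2 cm', '20cm', '15 cm x 30 cm']
results = [first_dimension(s) for s in samples]
print(results)  # ['23', '43 1/2', '20', '15']
```

Checking for `None` before calling `.group(1)` avoids an `AttributeError` on rows that do not start with a dimension.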


Solution 2:

This regex should work (Live Demo)

^(\d+)(?:\s*cm\s+[xX])

Explanation

  • ^(\d+) - capture at least one digit at the beginning of the line
  • (?: - start non-capturing group
  • \s* - followed by zero or more whitespace characters
  • cm - followed by a literal c and m
  • \s+ - followed by at least one whitespace character
  • [xX] - followed by a literal x or X
  • ) - end non-capturing group

You shouldn't need to bother matching the rest of the line.
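
For reference, here is a small sketch of how this pattern might be used from Python. The `leading_width` helper is a hypothetical name, not part of the answer. Note that, as written, the pattern only matches entries of the form "A cm x B cm" and captures whole numbers only.

```python
import re

# Pattern from this answer: leading integer, then "cm", whitespace, and x or X.
PATTERN = re.compile(r'^(\d+)(?:\s*cm\s+[xX])')

def leading_width(text):
    """Return the leading integer of an 'A cm x B cm' entry, or None otherwise."""
    m = PATTERN.match(text)
    return m.group(1) if m else None

print(leading_width('15 cm x 30 cm'))  # 15
print(leading_width('23 cm'))          # None (no " x" after "cm")
```

Single-dimension entries such as '23 cm' and fractional entries such as '43 1/2 cm' return `None` here, so they would need separate handling.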

Solution 3:

Here's a sample of how to do it from a text file. It works for the provided data.

with open("textfile.txt", "r") as f:
    for line in f:
        # Print everything before the first 'x' on lines that contain one
        if 'x' in line:
            iposition = line.find('x')
            print(line[:iposition])
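
The same idea can be tried without a file, as a rough sketch on an in-memory list. The `before_x` helper is a hypothetical name, and stripping the trailing whitespace is an addition that the original snippet does not do.

```python
def before_x(line):
    """Return the text before the first 'x', or None if the line has no 'x'."""
    head, sep, _ = line.partition('x')
    # sep is empty when 'x' does not occur in the line.
    return head.strip() if sep else None

lines = ['23 cm', '15 cm x 30 cm']
for line in lines:
    print(before_x(line))
```

Like the file-based version, this splits on any 'x' in the line, so it assumes 'x' appears only as the dimension separator.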
