Convert Decimal To Roman Numerals
Solution 1:
Sorry, I didn't notice that you're not merely updating the field but actually want to replace a number at the end. Even so, it's much better to properly convert the number to Roman numerals than to map every possible occurrence (what would happen with your code if there is a number larger than 25?). So, here's one way to do it:
ROMAN_MAP = [(1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'), (100, 'C'), (90, 'XC'),
             (50, 'L'), (40, 'XL'), (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I')]

def romanize(data):
    if not data or not isinstance(data, str):  # we know how to work with strings only
        return data
    data = data.rstrip()  # remove potential extra whitespace at the end
    space_pos = data.rfind(" ")  # find the last space before the number
    if space_pos != -1:
        try:
            number = int(data[space_pos + 1:])  # get the number at the end
            roman_number = ""
            for i, r in ROMAN_MAP:  # loop-reduce substitution based on the ROMAN_MAP
                while number >= i:
                    roman_number += r
                    number -= i
            return data[:space_pos + 1] + roman_number  # put everything back together
        except (TypeError, ValueError):
            pass  # couldn't extract a number
    return data
So now if we create your data frame as:
import pandas as pd

HSP_OLD = pd.DataFrame({"tryl": ["SAF/HSP: Secondary diagnosis E code 1",
                                 None,
                                 "SAF/HSP: Secondary diagnosis E code 11",
                                 "Something else without a number at the end"]})
We can now easily apply our function over the whole column with:
HSP_OLD['tryl'] = HSP_OLD['tryl'].apply(romanize)
Which results in:
                                         tryl
0       SAF/HSP: Secondary diagnosis E code I
1                                        None
2      SAF/HSP: Secondary diagnosis E code XI
3  Something else without a number at the end
Of course, you can adapt the romanize() function to your needs to search for any number within your string and turn it into Roman numerals - this is just an example of how to quickly find the number at the end of the string.
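One possible adaptation, sketched here with a regular expression, converts every standalone run of digits in a string rather than only the trailing one. The names to_roman and romanize_all are hypothetical, and the 1 to 3999 limit is an assumption based on the conventional range of Roman numerals, not something the answer above specifies:

```python
import re

ROMAN_MAP = [(1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'), (100, 'C'), (90, 'XC'),
             (50, 'L'), (40, 'XL'), (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I')]


def to_roman(number):
    """Greedy conversion using ROMAN_MAP (hypothetical helper name)."""
    parts = []
    for value, symbol in ROMAN_MAP:
        count, number = divmod(number, value)  # how many times this symbol fits
        parts.append(symbol * count)
    return "".join(parts)


def romanize_all(text):
    """Replace every standalone number in text (hypothetical helper name)."""
    def convert(match):
        number = int(match.group())
        # Assumed limit: leave 0 and anything above 3999 untouched.
        return to_roman(number) if 1 <= number <= 3999 else match.group()

    # \b keeps digits glued to letters (e.g. "E11") from being converted.
    return re.sub(r"\b\d+\b", convert, text)


print(romanize_all("Chapter 4 of 12"))  # Chapter IV of XII
```

It can be applied to a column the same way as romanize(), but unlike romanize() it does not guard against None or other non-string values, so those would need to be handled first.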
Solution 2:
You need to keep the order of the items, and start searching with the longest substring.
You may use an OrderedDict here. To initialize it, use a list of tuples. You can reverse the list right away when initializing, or do it later.
import collections
import pandas as pd
# My test data
HSP_OLD = pd.DataFrame({'tryl':['1. Text', '11. New Text', '25. More here']})
d_hsp_lst = [("1","I"),("2","II"),("3","III"),("4","IV"),("5","V"),("6","VI"),("7","VII"),
             ("8","VIII"),("9","IX"),("10","X"),("11","XI"),("12","XII"),("13","XIII"),
             ("14","XIV"),("15","XV"),("16","XVI"),("17","XVII"),("18","XVIII"),("19","XIX"),
             ("20","XX"),("21","XXI"),("22","XXII"),("23","XXIII"),("24","XXIV"),("25","XXV")]
d_hsp = collections.OrderedDict(d_hsp_lst) # Creating the OrderedDict
d_hsp = collections.OrderedDict(reversed(d_hsp.items())) # Here, reversing
>>> HSP_OLD['tryl'] = HSP_OLD['tryl'].replace(d_hsp, regex=True)
>>> HSP_OLD
             tryl
0         I. Text
1    XI. New Text
2  XXV. More here
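Why the longest keys have to come first can be seen with plain string replacement. This is a simplified illustration in plain Python, not a description of how pandas applies the mapping internally, and replace_in_order is a hypothetical helper:

```python
def replace_in_order(text, pairs):
    """Apply substring replacements one after another (hypothetical helper)."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


pairs = [("1", "I"), ("11", "XI")]

# Shortest key first: both digits of "11" are consumed by the "1" rule.
print(replace_in_order("11. New Text", pairs))            # II. New Text
# Longest key first: "11" is matched as a whole before "1" is tried.
print(replace_in_order("11. New Text", reversed(pairs)))  # XI. New Text
```

The first result reads as 2 rather than 11, which is the kind of error the reversed order avoids.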