Check Pandas Dataframe Column For String Type
I have a fairly large pandas dataframe (11k rows and 20 columns). One column has a mixed data type, mostly numeric (float) with a handful of strings scattered throughout. I subset
Solution 1:
This is one way. I'm not sure it can be vectorised.
import pandas as pd
df = pd.DataFrame({'A': [1, None, 'hello', True, 'world', 'mystr', 34.11]})
df['stringy'] = [isinstance(x, str) for x in df.A]
#        A  stringy
# 0      1    False
# 1   None    False
# 2  hello     True
# 3   True    False
# 4  world     True
# 5  mystr     True
# 6  34.11    False
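If the goal is to subset on that check, the same isinstance test can be written with Series.map and used directly as a boolean mask. This is only a sketch on the toy frame from this answer; the names is_str, strings_only and non_strings are made up for illustration, and it still calls a Python function per element, so it is not truly vectorised.

```python
import pandas as pd

# Same toy data as in the answer above: a column with mixed Python types.
df = pd.DataFrame({'A': [1, None, 'hello', True, 'world', 'mystr', 34.11]})

# Element-wise type check; map applies the function to every value,
# including None, and returns a boolean Series aligned with df.
is_str = df['A'].map(lambda x: isinstance(x, str))

# Use the mask to pull out the string rows, or invert it to drop them.
strings_only = df[is_str]
non_strings = df[~is_str]

print(strings_only)
print(non_strings)
```

Unlike the numeric-conversion approach, this keeps None and booleans in non_strings, since only actual str objects are flagged.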
Solution 2:
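Here's a different way. It converts the values of column A to numeric without raising on errors: with errors='coerce', anything that cannot be parsed as a number (the strings here) is replaced by NaN. The notnull() call then builds a boolean mask that is False for those NaN values, so indexing with it drops the corresponding rows.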
df = df[pd.to_numeric(df.A, errors='coerce').notnull()]
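However, any values that were already NA in the column will be removed too. Note also that strings which parse as numbers, such as '12', convert successfully and are therefore kept.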
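If the pre-existing NAs should survive, one possible workaround is to OR the mask with the column's own isnull() result. This is a minimal sketch on made-up data, not something from the original answer; the names numeric, keep and filtered are illustrative.

```python
import pandas as pd

# Hypothetical mixed column: floats, an int, one missing value, two strings.
df = pd.DataFrame({'A': [1.5, None, 'hello', 2, 'world', 34.11]})

# Strings that cannot be parsed become NaN; the existing None is NaN as well.
numeric = pd.to_numeric(df['A'], errors='coerce')

# Keep rows that converted cleanly OR were already missing in the original column.
keep = numeric.notnull() | df['A'].isnull()
filtered = df[keep]

print(filtered)
```

Rows 2 and 4 (the strings) are dropped, while the row that was missing to begin with stays in place.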
See also: Select row from a DataFrame based on the type of the object(i.e. str)