Python Dataframe Query With Spaces In Column Name

I want to filter a dataframe using query:

ExcludeData = [1, 3, 4, 5]
dfResult.query('Column A in @ExcludeData')

How do I use Column A in the query without renaming it?

Solution 1:

Starting with pandas 0.25, you can refer to columns whose names contain spaces by enclosing the column name in backticks within the query.

Using Pandas 0.25.2:

>>> import pandas as pd
>>> df = pd.DataFrame({'a a': [1, 0], 'b b': [1, 1]})
>>> df
   a a  b b
0    1    1
1    0    1
>>> df.query('`a a`==`b b`')
   a a  b b
0    1    1

From the API docs: https://pandas.pydata.org/pandas-docs/version/0.25/reference/api/pandas.DataFrame.query.html

In your case, the usage would be:

dfResult.query('`Column A` in @ExcludeData')
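
As a self-contained sketch of the same call, the backtick form can be checked as follows on pandas 0.25 or later. The sample values and the 'Other' column are made up for illustration; only the names dfResult, ExcludeData and 'Column A' come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Column A' comes from the question.
dfResult = pd.DataFrame({'Column A': [1, 2, 3, 6], 'Other': ['w', 'x', 'y', 'z']})
ExcludeData = [1, 3, 4, 5]

# Backticks quote the column name; @ refers to a Python variable in scope.
matching = dfResult.query('`Column A` in @ExcludeData')

# 'not in' selects the complementary rows, if the goal is to drop the listed values.
remaining = dfResult.query('`Column A` not in @ExcludeData')
```

Here matching holds the rows whose 'Column A' value appears in ExcludeData, and remaining holds the rest.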

Solution 2:

I wouldn't use the query function here. I would use square-bracket indexing with isin instead:

dfResult = dfResult[dfResult['Column A'].isin(ExcludeData)]
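
A minimal sketch of this approach, using made-up sample values. Note that isin keeps the rows whose value is in the list; if the intent behind the name ExcludeData is to drop those rows instead, the boolean mask can be negated with ~.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Column A' comes from the question.
dfResult = pd.DataFrame({'Column A': [1, 2, 3, 6], 'Other': ['w', 'x', 'y', 'z']})
ExcludeData = [1, 3, 4, 5]

# Boolean mask: True where the value in 'Column A' appears in ExcludeData.
mask = dfResult['Column A'].isin(ExcludeData)

matching = dfResult[mask]     # rows whose value is in the list
remaining = dfResult[~mask]   # rows whose value is not in the list
```

Because the column is accessed by its string key, spaces in the name need no special handling.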

Solution 3:

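As pointed out by @ayhan, query did not support this at the time of writing (backtick quoting only arrived in pandas 0.25, as Solution 1 notes). However, you can make sure your column names are read in without the stray space.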

In [51]: df
Out[51]: 
    A  B
0   1  2
1   3  5
2   4  8
3   5  5
4   4  4
5   5  2
6   9  8
7   8  9
8   4  6
9   2  3

In [52]: df.columns
Out[52]: Index([u' A', u'B'], dtype='object')

In [53]: pd.read_csv(pd.io.common.StringIO(df.to_csv(index=False)), sep=r'\s*,', engine='python').query('A in [2,3]')
Out[53]: 
   A  B
1  3  5
9  2  3
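
If the stray whitespace is only at the start or end of the column names, as with ' A' above, another option is to strip it from the existing frame rather than round-tripping through CSV. This is a sketch with a small made-up frame, not the answerer's method, and it does not help when the space is in the middle of a name.

```python
import pandas as pd

# Hypothetical frame with a leading space in the first column name.
df = pd.DataFrame({' A': [1, 3, 2], 'B': [2, 5, 3]})

# Remove leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# The cleaned name can now be used in query without quoting.
result = df.query('A in [2, 3]')
```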
