Pandas & Glob - Excel File Format Cannot Be Determined, You Must Specify An Engine Manually
Solution 1:
Found it. When an Excel file is open, for example in MS Excel, a hidden temporary file is created in the same directory:
~$datasheet.xlsx
So when I run the code to read all the files from the folder, it picks up that temporary file too and gives me the error:
Excel file format cannot be determined, you must specify an engine manually.
When all files are closed and there are no hidden temporary ~$filename.xlsx files in the same directory, the code works perfectly.
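If closing every workbook first isn't practical, another option is to skip the temporary files when collecting paths. Here is a minimal sketch of that idea; the function name list_excel_files is my own, and it assumes the lock files always start with ~$ as described above:

```python
from pathlib import Path


def list_excel_files(folder):
    """Return the .xlsx paths in `folder`, skipping Office lock files.

    Assumption: lock files are named like '~$datasheet.xlsx', i.e. they
    start with '~$'. `list_excel_files` is a hypothetical helper name.
    """
    return sorted(
        p for p in Path(folder).glob("*.xlsx")
        if not p.name.startswith("~$")
    )
```

Each returned path can then be passed to pd.read_excel() as usual, so the loop no longer trips over a workbook that happens to be open.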
Solution 2:
Also make sure you're using the correct pd.read_* method. I ran into this error when attempting to open a .csv file with read_excel() instead of read_csv(). I found this handy snippet to automatically select the correct method by file type:
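    import pandas as pd

    # `file` is an open binary file object; `file_extension` has no leading dot.
    # Pass the file object itself: read_csv() does not accept raw bytes from file.read().
    if file_extension == 'xlsx':
        df = pd.read_excel(file, engine='openpyxl')
    elif file_extension == 'xls':
        df = pd.read_excel(file)
    elif file_extension == 'csv':
        df = pd.read_csv(file)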
Solution 3:
I also got the 'Excel file format...' error after I manually changed a file's suffix from 'CSV' to 'XLS'. Renaming the file doesn't convert its contents, so it was still a CSV inside. All I had to do was open it in Excel and save it in the format I wanted.
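To check whether a suffix is lying about the contents, you can look at the first few bytes of the file. This is only a rough sketch: sniff_format is a name I made up, and the check relies on .xlsx files being ZIP archives (starting with PK\x03\x04) and legacy .xls files being OLE2 compound files. Other ZIP-based formats will also report "zip", so treat the result as a hint.

```python
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP container, used by .xlsx
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE2 container, used by legacy .xls


def sniff_format(path):
    """Guess the container format of `path` from its leading bytes.

    Returns "zip" (possibly .xlsx), "ole2" (possibly .xls), or "other"
    (for example a CSV that was renamed). Hypothetical helper, not a
    pandas function.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    return "other"
```

If a file named something.xls comes back as "other", it is probably plain text and worth trying with pd.read_csv() instead.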
Solution 4:
Looks like an easy fix for this one. Open your Excel file, whether it is .xls, .xlsx, or any other extension, and choose "Save As" from the File menu. When prompted with the format options, save it as CSV UTF-8 (Comma delimited) (*.csv), then read it with pd.read_csv().
Solution 5:
This answer explains how to remove hidden files: https://stackoverflow.com/a/32241271/17411729
On a Mac, open the folder and press Cmd + Shift + . to show hidden files, delete the hidden temporary file, then run the code again.
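The same cleanup can be scripted. The sketch below is an assumption-laden example, not taken from the linked answer: remove_lock_files is a hypothetical helper, it treats any file whose name starts with ~$ as a leftover lock file, and it defaults to a dry run so nothing is deleted until you ask for it. Close the workbooks first, since a lock file for a workbook that is still open is in use.

```python
from pathlib import Path


def remove_lock_files(folder, dry_run=True):
    """Find '~$*' files under `folder` (recursively) and optionally delete them.

    With dry_run=True (the default) the files are only listed.
    Returns the list of matching paths either way.
    """
    found = sorted(p for p in Path(folder).rglob("~$*") if p.is_file())
    if not dry_run:
        for p in found:
            p.unlink()  # delete the leftover lock file
    return found
```

Calling remove_lock_files("data") first shows what would go; remove_lock_files("data", dry_run=False) actually deletes the files. The folder name "data" is just a placeholder.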