How to prevent scientific notation in pandas to_csv for mixed data types?

CreativeArtist88 · April 15, 2025, 3:26pm

I’m working with a pandas DataFrame that has both float64 and string columns. When I use to_csv to save it, big numbers show up in scientific notation. For instance, 1344154454156.992676 becomes 1.344154e+12 in the file.

I want to keep the full numbers without scientific notation. I tried using float_format, but it didn’t work because of the string columns. Here’s a simple example:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'text': ['apple', 'banana', 'cherry'],
    'numbers': np.random.rand(3) * 1e14
})

df.to_csv('output.csv')

This outputs scientific notation for the ‘numbers’ column. How can I make it show the full numbers instead? I’d like the output to look something like this:

     text                numbers
0   apple  94184321380806.796875
1  banana  22383735919307.046875
2  cherry  99180119890642.859375

Any ideas on how to achieve this without breaking the string columns?

elizabeths · April 25, 2025, 3:50pm

I’ve dealt with this problem in my data analysis work. A straightforward solution is to use the ‘float_format’ parameter with a custom format string. Try this:

df.to_csv('output.csv', float_format='%.6f')

This will format all float columns to 6 decimal places without scientific notation. It won’t affect your string columns, so you don’t need to worry about those.
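
For what it's worth, here is a minimal sketch of that call using the three values from the question instead of random ones. It writes to a string rather than a file (calling to_csv with no path returns the CSV text) so the result is easy to inspect; the variable names are just for illustration.

```python
import pandas as pd

# Values copied from the expected output in the question
df = pd.DataFrame({
    'text': ['apple', 'banana', 'cherry'],
    'numbers': [94184321380806.796875,
                22383735919307.046875,
                99180119890642.859375],
})

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = df.to_csv(float_format='%.6f')
lines = csv_text.splitlines()
print(csv_text)
```

The float column comes out in plain fixed-point form and the text column is untouched.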

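If you need more control per column, note that float_format takes one format for all float columns, not a dictionary. You can instead format each float column yourself before writing:

formats = {col: '%.6f' for col in df.select_dtypes(include='float64').columns}
out = df.copy()
for col, fmt in formats.items():
    # Convert the float column to fixed-point strings using its own format
    out[col] = out[col].map(lambda x, fmt=fmt: fmt % x)
out.to_csv('output.csv')

This lets you handle each float column individually and leaves the string columns alone. Just adjust the format strings in the dictionary as needed for your specific requirements.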

SpinningGalaxy · April 23, 2025, 10:06pm

hey, i’ve run into this too. one thing that worked for me was using pandas’ to_string() method first, then writing that to a file. something like:

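with open('output.csv', 'w') as f:
    f.write(df.to_string(index=False, float_format=lambda x: f'{x:.6f}'))

this keeps the full numbers without scientific notation as long as you pass float_format (otherwise to_string can still abbreviate big floats). just remember to set index=False if you don’t want the index column in your output, and note that the result is space-aligned text rather than comma-separated.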
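
a quick sketch to check what to_string gives for big floats. as far as i can tell, the default rendering still abbreviates them, so float_format is passed explicitly in the second call; the sample values are taken from the question and the variable names are made up.

```python
import pandas as pd

df = pd.DataFrame({
    'text': ['apple', 'banana', 'cherry'],
    'numbers': [94184321380806.796875,
                22383735919307.046875,
                99180119890642.859375],
})

# Default rendering: large floats are shown in exponent form
default_text = df.to_string(index=False)

# float_format takes a one-argument function applied to each float value
fixed_text = df.to_string(index=False, float_format=lambda x: f'{x:.6f}')

print(default_text)
print(fixed_text)
```

the columns are padded with spaces, so what you get is a fixed-width text file rather than a real CSV.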

RunningTiger · April 23, 2025, 12:57pm

I’ve encountered this issue before when working with large datasets containing mixed data types. One approach that worked for me is applying a custom formatter function to the float column before writing. Note that to_csv has no formatters parameter (that one belongs to to_string), so the formatting has to happen first. Here’s a solution I found effective:

def format_float(x):
    # Fixed-point with 6 decimals for floats; anything else passes through unchanged
    return f'{x:.6f}' if isinstance(x, float) else x

out = df.copy()
out['numbers'] = out['numbers'].map(format_float)
out.to_csv('output.csv')

This method writes your float values in fixed-point form with 6 decimal places and no scientific notation, while the string columns are written unchanged. The format_float function checks if the value is a float and formats it accordingly; otherwise it leaves the value as is.
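
To check how much precision survives a '%.6f' write, here is a small sketch that writes the question's three values and reads them back. It assumes read_csv's float_precision='round_trip' option so that parsing itself does not introduce rounding. Six decimals happen to be enough for these particular values, but that is not guaranteed for every float64 (much smaller numbers carry more fractional digits).

```python
import io
import pandas as pd

df = pd.DataFrame({
    'text': ['apple', 'banana', 'cherry'],
    'numbers': [94184321380806.796875,
                22383735919307.046875,
                99180119890642.859375],
})

# Write to an in-memory string with fixed-point formatting
csv_text = df.to_csv(index=False, float_format='%.6f')

# Parse with the round-trip converter so parsing adds no rounding error
restored = pd.read_csv(io.StringIO(csv_text), float_precision='round_trip')

# True only if every value survived the write/read cycle exactly
round_trips = bool((restored['numbers'] == df['numbers']).all())
print(round_trips)
```

If the comparison comes back False for your own data, increasing the number of decimals in the format string is the first thing to try.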

Keep in mind that this approach might slightly increase the file size due to the full representation of large numbers. If file size is a concern, you might want to consider alternative storage formats like parquet or HDF5 for more efficient handling of mixed data types.