Measuring the memory usage of a Pandas DataFrame
How much memory is your Pandas DataFrame or Series using? Pandas provides an API for measuring this, but a variety of implementation details mean the results can be confusing or misleading. Consider the following example:

>>> import pandas as pd
>>> series = pd.Series(["abcdefhjiklmnopqrstuvwxyz" * 10
...                     for i in range(1_000_000)])
>>> series.memory_usage()
8000128
>>> series.memory_usage(deep=True)
307000128

Which is correct: is memory usage 8MB or 300MB? Neither! In this special case, it’s actually 67MB, at least with the default […]
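To see what the two numbers from `memory_usage()` are made of, here is a minimal sketch on a much smaller Series. It assumes an object-dtype Series (set explicitly with `dtype=object`, since newer Pandas versions may pick a dedicated string dtype instead) and passes `index=False` so the index does not enter the figures. The variable names are illustrative only.

```python
import sys

import pandas as pd

# Assumption: force object dtype so each element is a pointer to a Python str.
series = pd.Series(["abc" * 10 for _ in range(1_000)], dtype=object)

# Shallow measurement: only the array of pointers, index excluded.
shallow = series.memory_usage(index=False)

# Deep measurement: the pointers plus the size of each element they refer to.
deep = series.memory_usage(index=False, deep=True)

# Size of the string objects, counted once per element of the Series.
per_element = sum(sys.getsizeof(s) for s in series)

print("shallow:", shallow)
print("deep:", deep)
print("deep - shallow equals per-element sum:", deep - shallow == per_element)
```

In this sketch the deep figure is a per-element sum, so an object that is referenced by many elements is counted once for every reference rather than once overall. That is one way the reported number can drift from what the process actually holds in memory.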