I'm trying to work with data stored in a Snowflake database using polars in Python. I see I can access the data using pl.read_database_uri with the adbc engine, and I'm wondering how to do this efficiently for larger-than-memory datasets.
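For reference, this is roughly what I'm doing now (the URI and table name are placeholders):

```python
import polars as pl

# Placeholder URI; the real one carries my Snowflake account/credentials.
uri = "snowflake://user:password@account/database/schema?warehouse=wh"

df = pl.read_database_uri(
    query="SELECT * FROM my_large_table",  # hypothetical table
    uri=uri,
    engine="adbc",
)  # materializes the full result in memory
```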
Is it possible to stream the results using polars' lazy API, or any other method?
Is it possible to batch the results as pl.read_database can? Or is it possible to partition the results, as the docs say is possible with connectorx?
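By batching I mean something like the iter_batches option on pl.read_database, sketched here with a hypothetical ADBC connection (the connection URI is a placeholder, and whether batch_size is honored depends on the driver):

```python
import adbc_driver_snowflake.dbapi
import polars as pl

# Hypothetical connection; the URI format depends on your Snowflake setup.
conn = adbc_driver_snowflake.dbapi.connect("user:password@account/database/schema")

for batch in pl.read_database(
    query="SELECT * FROM my_large_table",
    connection=conn,
    iter_batches=True,
    batch_size=100_000,
):
    ...  # each batch is a polars DataFrame
```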
Are there any other ways I might use polars to help work with larger-than-memory datasets in this instance? Or do I need to do my processing in SQL so that the data comes into python in a manageable size?
Thanks!
1 Answer
As of polars==1.25.2 there's not an easy way to do this.
One way I've approached this problem is to use the Snowflake Connector for Python to retrieve batches of a query result iteratively and process each batch with polars.
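In outline, something like this (connection parameters and the query are placeholders):

```python
import polars as pl
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",  # placeholders
    user="my_user",
    password="my_password",
)
cur = conn.cursor()
cur.execute("SELECT * FROM my_large_table")

for arrow_table in cur.fetch_arrow_batches():
    # Each batch arrives as a pyarrow.Table; convert and process it with
    # polars, then let it go out of scope before fetching the next one.
    df = pl.from_arrow(arrow_table)
    ...  # per-batch processing
```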
But I encountered some surprising Snowflake Connector behavior when doing this:
- Queries that produce zero rows return None rather than an empty table. This can be worked around with fetch_arrow_all(..., force_return_table=True) (see the connector docs).
- When using fetch_arrow_batches(), column datatypes can vary among batches, so each batch may need to be cast to a common schema before concatenating (see the sketch below).
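Here's a sketch of how both issues can be handled; the target schema is illustrative (in practice, derive it from the table definition or a representative batch), and cur is the cursor from the sketch above:

```python
import polars as pl

# Illustrative target schema; derive the real one from the table's DDL.
SCHEMA = {"id": pl.Int64, "amount": pl.Float64}

cur.execute("SELECT id, amount FROM my_large_table")
for arrow_table in cur.fetch_arrow_batches():
    # Cast every batch to the same dtypes so downstream code (and any
    # eventual pl.concat) sees a consistent schema.
    df = pl.from_arrow(arrow_table).cast(SCHEMA)
    ...  # per-batch processing

# Zero-row queries: force an empty Arrow table instead of None.
cur.execute("SELECT id, amount FROM my_large_table WHERE 1 = 0")
empty = pl.from_arrow(cur.fetch_arrow_all(force_return_table=True))
```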