I was recently confronted with this situation:
- an S3 bucket contains a zip file
- the zip file contains a CSV
- I want to take the content of that CSV, transform it, and push it to a Postgres database

Without streaming

Without streaming, the steps to do that are the following:
- download the zip archive locally (maybe it is huge!)
- unzip the archive
- open the CSV file and load its content in memory (maybe it is huge!)...
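The non-streaming steps above could be sketched roughly like this in Elixir. This is only an illustration of the shape of the problem: `S3.download/3`, `CSVParser.parse/1`, `transform/1`, and `Repo` are hypothetical names, not real APIs from the post (only `:zip.unzip/2` and `File.read!/1` are standard):

```elixir
defmodule NaiveImport do
  # Hypothetical, non-streaming pipeline: every step materializes
  # the whole dataset on disk or in memory before the next one runs.
  def import_without_streaming do
    # 1. download the whole archive locally (maybe it is huge!)
    :ok = S3.download("my-bucket", "data.zip", "/tmp/data.zip")

    # 2. unzip the whole archive (Erlang's built-in :zip module)
    {:ok, _files} = :zip.unzip(~c"/tmp/data.zip", cwd: ~c"/tmp")

    # 3. load the whole CSV content in memory (maybe it is huge!)
    rows =
      "/tmp/data.csv"
      |> File.read!()
      |> CSVParser.parse()

    # 4. transform and push everything to Postgres in one go
    Repo.insert_all("my_table", Enum.map(rows, &transform/1))
  end
end
```

Each step holds the full payload at once, which is exactly what streaming avoids.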
Make your Elixir streams come true

Last time we talked about simply streaming a paginated API with Elixir. It was fun, but somewhat pointless, because we ended up writing this:
```elixir
datasets =
  stream_api.("https://data.gouv.fr/api/1/datasets/")
  |> Stream.take(50)
  |> Enum.to_list()
```

Streams in Elixir are lazy, meaning that no work will be done by the stream until necessary. If you just write:
```elixir
datasets =
  stream_api.("https://data.gouv.fr/api/1/datasets/")
  |> Stream.take(50)
```

You create a stream and say that you are only interested in the first 50 elements....
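This laziness is easy to observe with a self-contained example (using `Stream.iterate/2` over an infinite sequence instead of the API call above): the side effect in `Stream.map/2` only runs once an `Enum` function demands elements.

```elixir
# Build a lazy pipeline over an infinite sequence: nothing runs yet.
numbers =
  Stream.iterate(0, &(&1 + 1))
  |> Stream.map(fn n ->
    # This side effect only happens when an element is actually demanded.
    IO.puts("producing #{n}")
    n * 2
  end)
  |> Stream.take(3)

# Up to this point, nothing has been printed.

# Only an Enum function forces the stream to run:
Enum.to_list(numbers)
# prints "producing 0", "producing 1", "producing 2" and returns [0, 2, 4]
```

Because `Stream.take(3)` bounds the demand, only three elements of the infinite sequence are ever produced.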