Stream S3 > zip > CSV > Postgres with Elixir

I was recently confronted with this situation: an S3 bucket contains a zip file, the zip file contains a CSV, and I want to take the content of that CSV, transform it, and push it to a Postgres database. Without streaming, the steps to do that are the following: download the zip archive locally (maybe it is huge!), unzip the archive, open the CSV file and load its content in memory (maybe it is huge!...

March 25, 2022 · 3 min · Francis Chabouis

Stream data from an API to your database with Elixir

Make your Elixir streams come true. Last time, we talked about simply streaming a paginated API with Elixir. It was fun, but somewhat pointless, because we ended up writing this: datasets = stream_api.("https://data.gouv.fr/api/1/datasets/") |> Stream.take(50) |> Enum.to_list(). Streams in Elixir are lazy, meaning that the stream does no work until it has to. If you just write: datasets = stream_api.("https://data.gouv.fr/api/1/datasets/") |> Stream.take(50), you create a stream and say that you are only interested in the first 50 elements....
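
The laziness described in this excerpt can be illustrated with a minimal sketch. It does not call the real API: noisy_stream is a hypothetical stand-in for stream_api.(...), built with Stream.iterate/2, that prints a message whenever an element is actually produced.

```elixir
# Hypothetical stand-in for the paginated API stream (an assumption,
# not the post's stream_api): an infinite stream that prints a message
# each time an element is actually produced.
noisy_stream =
  Stream.iterate(1, &(&1 + 1))
  |> Stream.map(fn n ->
    IO.puts("fetching element #{n}")
    n
  end)

# Building the pipeline does no work: nothing is printed at this point.
first_three = Stream.take(noisy_stream, 3)

# Elements are produced only when the stream is consumed, here by
# Enum.to_list/1, which forces the stream and collects the elements.
result = Enum.to_list(first_three)
```

Until Enum.to_list/1 runs, first_three is only a description of the work to do, which is why a stream that is never consumed fetches nothing.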

October 8, 2021 · 9 min · Francis Chabouis