CSV Parsing: The CSV Parser
Explore how to implement a CSV parser in Elixir using property-based testing with PropEr. Understand encoding and decoding strategies, manage edge cases in CSV data, and combine property-based tests with example-based unit tests to ensure robustness and reliability.
The CSV parser
We can now move on to implementing a CSV parser. Here is a possible implementation:
First, there’s the public interface with two functions: encode/1 and decode/1.
The functions are fairly straightforward, delegating the more complex operations to private helper functions. Let’s start by looking at those helping with encoding:
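A minimal sketch of what such encoding helpers might look like is given here. The module name CsvEncodeSketch, the input shape (a list of maps with string keys and values), the CRLF row separator, and the exact set of characters that trigger escaping are assumptions made for illustration, not necessarily the implementation discussed in this lesson.

```elixir
defmodule CsvEncodeSketch do
  # Hypothetical module; assumes every map has the same string keys
  # and that all values are strings.

  def encode([]), do: ""

  def encode([first | _] = maps) do
    # The header row is taken from the keys of the first map.
    keys = Map.keys(first)
    header = Enum.map_join(keys, ",", &escape/1)

    rows =
      for map <- maps do
        Enum.map_join(keys, ",", fn key -> escape(Map.fetch!(map, key)) end)
      end

    # Assumed row separator: CRLF, with no trailing line break.
    Enum.join([header | rows], "\r\n")
  end

  # Wrap the field in double quotes and double every embedded quote.
  defp escape(field) do
    if escapable(field) do
      "\"" <> String.replace(field, "\"", "\"\"") <> "\""
    else
      field
    end
  end

  # Assumed rule: quotes, commas, and line breaks force quoting.
  defp escapable(field) do
    String.contains?(field, ["\"", ",", "\r", "\n"])
  end
end
```

Under these assumptions, a field such as `Ann, B.` is written as `"Ann, B."`, while a field with no special characters is written unchanged.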
If a string is judged to need escaping (according to escapable/1), then the string is wrapped in double quotes (") and all double quotes inside of it are escaped with another double quote. With this, encoding is covered.
Next, there are decoding’s private functions:
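A minimal sketch of how such decoding helpers could be structured is given here. It reads the header line first, then every remaining row, and treats each value as either a quoted or an unquoted string. The module and function names are hypothetical, and the sketch makes several simplifying assumptions: rows are separated by CRLF, the input is well formed, and a single row parser is reused for both the header and the data rows.

```elixir
defmodule CsvDecodeSketch do
  # Hypothetical module; malformed input (for example an unterminated
  # quoted field) raises a FunctionClauseError instead of being handled.

  def decode(""), do: []

  def decode(csv) do
    # The first line holds the column names; the rest are data rows.
    {headers, rest} = decode_row(csv, [])
    rows = decode_rows(rest)
    for row <- rows, do: Map.new(Enum.zip(headers, row))
  end

  defp decode_rows(""), do: []

  defp decode_rows(csv) do
    {row, rest} = decode_row(csv, [])
    [row | decode_rows(rest)]
  end

  # Read one field at a time until the line (or the input) ends.
  defp decode_row(csv, acc) do
    case decode_field(csv) do
      {:more, field, rest} -> decode_row(rest, [field | acc])
      {:done, field, rest} -> {Enum.reverse([field | acc]), rest}
    end
  end

  # A field is quoted if it starts with a double quote, unquoted otherwise.
  defp decode_field("\"" <> rest), do: decode_quoted(rest, "")
  defp decode_field(csv), do: decode_unquoted(csv, "")

  # Inside quotes, a doubled quote is a literal quote; a single quote
  # closes the field and must be followed by "," or CRLF or end of input.
  defp decode_quoted("\"\"" <> rest, acc), do: decode_quoted(rest, acc <> "\"")
  defp decode_quoted("\"," <> rest, acc), do: {:more, acc, rest}
  defp decode_quoted("\"\r\n" <> rest, acc), do: {:done, acc, rest}
  defp decode_quoted("\"", acc), do: {:done, acc, ""}

  defp decode_quoted(<<byte, rest::binary>>, acc),
    do: decode_quoted(rest, <<acc::binary, byte>>)

  # Outside quotes, "," ends the field and CRLF ends the row.
  defp decode_unquoted("," <> rest, acc), do: {:more, acc, rest}
  defp decode_unquoted("\r\n" <> rest, acc), do: {:done, acc, rest}
  defp decode_unquoted("", acc), do: {:done, acc, ""}

  defp decode_unquoted(<<byte, rest::binary>>, acc),
    do: decode_unquoted(rest, <<acc::binary, byte>>)
end
```

Copying one byte at a time is safe for UTF-8 input in this sketch because every delimiter it looks for is a single ASCII byte, which never appears inside a multi-byte character.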
Decoding is done by fetching the headers, then fetching all of the rows. A header line is parsed by reading each column name one at a time, and a row is parsed by reading each field one at a time. At the end we can see that both fields and names are actually implemented as quoted or unquoted ...