Reading CSV Files in F# with Type Providers

16 Oct 2019, CPOL, 2 min read
A simple example of using type providers in F#

Introduction

This tip gives a very brief overview of type providers in F# and shows how to use them to explore CSV files.

Background

If you have never heard of type providers in F#, they are worth a try, since they are quite powerful.

Given a data source, a type provider generates a data type that matches the structure of that source. The type is generated at compile time, so the F# compiler and your editor know the shape of the data before the program runs.

Several type providers exist for different kinds of data source (XML, SQLProvider, etc.). To keep things simple, this tip uses the CSV type provider.
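
As a minimal sketch of this idea, the snippet below passes a small inline CSV sample to the CSV type provider from the FSharp.Data package. The column names Name and Age and the sample values are hypothetical, chosen only for illustration. The type annotations compile only because the provider inferred a string column and an integer column from the sample.

```fsharp
open FSharp.Data

// Hypothetical inline sample (not from this tip): a header row and one data row.
type People = CsvProvider<"Name,Age\nAda,36">

let first = People.GetSample().Rows |> Seq.head

// These annotations are checked by the compiler against the generated type.
let name : string = first.Name
let age : int = first.Age

printfn "%s is %d years old" name age
```

Renaming a column in the sample would turn first.Name or first.Age into a compile-time error, which is the main benefit over looking columns up by string at run time.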

Using the Code

Let us suppose we have a CSV file called myFile.csv with this content (the first row is the header):

C1,C2,C3
4,5,6
7,2,8
9,12,3

and we want to explore it using the CSV type provider in F#.

To start, create an F# console application and add the NuGet package FSharp.Data (version 3.1.1 at the time of writing). This package can handle the following formats: CSV, HTML, JSON and XML.

Let us suppose we want to find the maximum value in column C1. The F# code for that is:

F#
open FSharp.Data

// 1) CSV type provider definition for file "myFile.csv"
type myCsvTypeProvider = CsvProvider<"myFile.csv", HasHeaders=true>

[<EntryPoint>]
let main argv =
    // 2) create instance of CSV type provider
    let myCsv = myCsvTypeProvider.GetSample()
    let maxC1 =
        // 3) iterate over the rows
        myCsv.Rows
        // 4) for each row extract the value at column C1
        |> Seq.map(fun row -> row.C1)
        // 5) extract the max of all values at column C1
        |> Seq.max
    printfn "the max value in C1 is %A" maxC1
    // wait for a key press; the parentheses are needed to actually call ReadKey
    System.Console.ReadKey() |> ignore
    0

A particularly interesting piece of code is the line:

F#
|> Seq.map(fun row -> row.C1)

Notice that row has a C1 property, named after the first column of the file. In the same way, you could use row.C2 or row.C3. In other words, the myCsvTypeProvider type carries the structure of the CSV file: three columns named C1, C2 and C3, whose values are inferred to be integers. The F# compiler takes care of extracting this structure from the file for you.
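
As a small sketch of using the other columns, the snippet below reuses the same provider definition and assumes the same myFile.csv is available. The names sumC2, minC3 and rowsAboveFive are introduced here only for illustration.

```fsharp
open FSharp.Data

// Assumption: the same myFile.csv as above is available to the compiler and at run time.
type myCsvTypeProvider = CsvProvider<"myFile.csv", HasHeaders=true>

let myCsv = myCsvTypeProvider.GetSample()

// Sum of all values in column C2
let sumC2 = myCsv.Rows |> Seq.sumBy (fun row -> row.C2)

// Minimum of all values in column C3
let minC3 = myCsv.Rows |> Seq.map (fun row -> row.C3) |> Seq.min

// Rows whose C1 value is greater than 5
let rowsAboveFive = myCsv.Rows |> Seq.filter (fun row -> row.C1 > 5) |> Seq.toList

printfn "sum of C2 = %d, min of C3 = %d, rows with C1 > 5 = %d" sumC2 minC3 rowsAboveFive.Length
```

A misspelled column such as row.C4 would be rejected by the compiler, because the generated row type has no such property.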

As a result, you can access the content of the CSV file without parsing the rows yourself, since the type provider already does this for you.
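
The same holds for data that only becomes available at run time, as long as it has the same columns as the sample. The sketch below uses the Parse method that the CSV type provider generates for in-memory text (Load does the same for a file path or URL). The CSV text in it is made up for illustration.

```fsharp
open FSharp.Data

// Assumption: myFile.csv is the same sample file as above; it defines the columns.
type myCsvTypeProvider = CsvProvider<"myFile.csv", HasHeaders=true>

// Hypothetical run-time data with the same three columns as the sample.
let otherCsv = myCsvTypeProvider.Parse("C1,C2,C3\n10,20,30\n40,50,60\n")

let maxC1 =
    otherCsv.Rows
    |> Seq.map (fun row -> row.C1)
    |> Seq.max

printfn "the max value in C1 is %d" maxC1
```

The sample file is used only to infer the structure. The rows processed here come from the text passed to Parse.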

If you wanted to find the maximum value in column C1 in C#, it would be less direct. You would have to:

  1. parse each line
  2. get the column values for each line
  3. convert each value to an integer

Type providers in F# make all of these steps unnecessary, since they perform them automatically.
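
For comparison, here is a minimal sketch of those three manual steps. It is written in F# rather than C# to stay in one language, and it uses only the standard library. The function name maxOfFirstColumn is hypothetical, and the input is an in-memory copy of the sample file. To read the real file, you could pass System.IO.File.ReadLines "myFile.csv" instead.

```fsharp
open System

// Manual approach without a type provider:
// 1) parse each line, 2) get the column values, 3) convert a value to an integer.
let maxOfFirstColumn (lines: seq<string>) : int =
    lines
    |> Seq.skip 1                                                     // skip the header row
    |> Seq.filter (fun line -> not (String.IsNullOrWhiteSpace(line))) // ignore blank lines
    |> Seq.map (fun line -> line.Split(','))                          // split into column values
    |> Seq.map (fun columns -> int (columns.[0]))                     // convert the first value
    |> Seq.max

// In-memory copy of the sample file shown earlier in this tip.
let sampleLines = [ "C1,C2,C3"; "4,5,6"; "7,2,8"; "9,12,3" ]

printfn "the max value in C1 is %d" (maxOfFirstColumn sampleLines)
```

Here the column is addressed by position (columns.[0]) and the conversion can fail at run time. With the type provider, the column has a name and a type that the compiler checks.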

Points of Interest

This tip is a short introduction to type providers in F#. As a simple example, we explore a CSV file with the CSV type provider and show how it makes the file content easy to access.

History

  • 16th October, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
