Why I like F#.

Recently Dr. James McCaffery, posted Why he doesn't like the F# language. Quite a few of his points are subjective. People have different preferences and it seems like F# and more generally functional programming takes him outside of this comfort zone. This is fine, and I have absolutly no objections about views like this. I have a similar feeling when I'm in C# or Java. I don't feel safe, or comfortable, again it is just a preference thing.

However, there are a few points raised in the blog post that I don't really agree with. I'll tackle each one seperately not to loose any context.

  1. F# has a tiny user base.

I did a quick search on the day I wrote this post at a job aggregation site and found 109 job listings that mentioned F#. There were over 34,000 job listings that mentioned C#. And at MSDN Magazine, where I'm the Senior Contributing Editor, our F# articles get very few reads. The tiny user base means there is relatively weak community technical support on sites like Stack Overflow, compared to mainstream languages. Additionally, unlike other languages with relatively few users (such as R), there`s no real motivation for me to adopt F# from a career point of view, because of the very limited job opportunities.

While I somewhat agree, that F# adoption in industry has been slow. I think alot of this is to do with the fact that in the early days F# wasn't pushed as a general purpose programming language. This was obviously a marketing decision made in Microsoft, for reasons that are unknown to me. This decision caused an elitist view of F# in the early days with the preception that you need a advanced degree in a mathematical subject to use it, categorising it as only being good for Data Science, Finance, Mathematics and Research. Thus these areas were the early adoptors. In fact a quick browse of the testimonials page on FSharp.org backs this up. With testimonails coming from one of these areas. There are of course some exceptions most notably the design of Louvre in Abu Dhabi and it's use at GameSys.

However this metric is only one dimension and just because there are currently only a few jobs in a language doesn't mean you should not learn it. I'm currently learning langauges like Haskell, Coq and Idris. For the latter I doubt there is a single role in this country (although I'm willing to be proved wrong on this). Why do I do this? I hear you ask. Well I believe by learning different langauges and paradigms pushes me slightly out of my comfort zone and makes me a better programmer, in which-ever language I ultimately end up coding a commercial product in.

With the commercial prospects aside a conslusion is drawn that a small user base => a weak technical community. I don't know about other languages but I can categorically say that with F# this is simply not true. In fact, as I started writing this blog post, I raised an issue in the Paket project on github and within an hour, I had a fix presented too me.

For other sites like Stack Overflow, I can't really comment on the experience as I don't tend to use it much myself. However we can use F# to do some data munging to see how the community it doing. i.e. What is the average time for questions with an accepted answer to have got that answer?

To acheive this we can download the first 10000 questions with the F# tag, and write the result of each request out to a set of files.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
let baseUri = "https://api.stackexchange.com/2.2/"
let [<Literal>] dataPath = __SOURCE_DIRECTORY__ + "/data/stackoverflow/"

let dataDir =
    let path = new DirectoryInfo(dataPath)
    if not(path.Exists)
    then path.Create()
    path
    
let getQuestions(page) =
    let outputPath = new FileInfo(Path.Combine(dataDir.FullName, sprintf "questions_%d.questions" page))
    if(not <| outputPath.Exists)
    then
        let results =

            Http.RequestString(baseUri + sprintf "search?page=%d" page + "&pagesize=100&order=desc&sort=creation&tagged=f%23&site=stackoverflow")
        File.WriteAllText(outputPath.FullName, results)

let writeQuestions() =
    [1 .. 100] |> List.iter getQuestions

Next we can merge all of these questions using the Json type provider into a single list,

1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
let [<Literal>] questionPath = dataPath + "questions.json"
type Questions = JsonProvider<questionPath>

let questions =
    [
       for file in  dataDir.EnumerateFiles("*.questions") do
           yield! Questions.Load(file.FullName).Items
    ]

Next up is getting the accepted answers. Firstly we build a map of the accepted answersId against the questions so we can relate them again later, then we use getAcceptedAnswers to chunk the requests and write the results out to a file. Once we have the results we again use the Json type provider to merge the results up into a single list.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
26: 
27: 
28: 
29: 
30: 
31: 
32: 
33: 
34: 
35: 
36: 
37: 
let questionAnswerMap = 
    questions
    |> Seq.fold (fun state question ->
        match question.AcceptedAnswerId with
        | Some answerId -> (answerId, question) :: state  
        | None -> state
    ) []  
    |> Map.ofSeq
    
let getAcceptedAnswers() =
    let answerIds =
        questionAnswerMap
        |> Map.toSeq
        |> Seq.map (fun (answerId,_) -> answerId.ToString())

    let counter = ref 1
    for answers in chunkBySize 100 answerIds do
        let outputPath = new FileInfo(Path.Combine(dataDir.FullName, sprintf "answers_%d.answers" !counter))
        if (not <| outputPath.Exists)
        then
            let answersStr = String.Join(";", answers)
            let answers =
                Http.RequestString(
                     baseUri + sprintf "answers/%s?order=desc&sort=creation&site=stackoverflow" answersStr
                )
            printfn "Writing answers %s" outputPath.FullName
            File.WriteAllText(outputPath.FullName, answers)
            incr(counter)

let [<Literal>] answersPath = dataPath + "answers.json"
type Answers = JsonProvider<answersPath>

let answers =
    [
       for file in  dataDir.EnumerateFiles("*.answers") do
           yield! Answers.Load(file.FullName).Items
    ]

Next up we pair the questions with the accepted answers.

1: 
2: 
3: 
4: 
5: 
6: 
7: 
let mergeQuestionAnswers =
    [
        for answer in answers do
            match questionAnswerMap.TryFind answer.AnswerId with
            | Some question -> yield question, answer
            | None -> ()            
    ]

And we are now at a point where we can compute some statistics around the questions.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
let getTimeToClose (question : Questions.Item, answer : Answers.Item) =
    (unixToDateTime answer.CreationDate).Subtract(unixToDateTime question.CreationDate).TotalHours

let statsByYear =
    mergeQuestionAnswers
    |> Seq.groupBy (fun (q,a) -> (unixToDateTime q.CreationDate).Year)
    |> Seq.map (fun (year, data) ->
         let timeToClose = 
             data |> Seq.map getTimeToClose
         let average = timeToClose |> Seq.average
         let median = timeToClose |> median
         year, average, median
    )
    |> Seq.sortBy (fun (y, _, _) -> y)
    |> Seq.toArray

This gives the following results.

[|(2008, 2345.916019, 1.900555556); (2009, 333.4175912, 0.4058333333);
  (2010, 96.91721817, 0.5219444444); (2011, 48.4008104, 0.5786111111);
  (2012, 165.2300729, 0.66); (2013, 36.97864167, 0.74);
  (2014, 42.91575397, 0.6572222222); (2015, 25.93529412, 0.6344444444)|]

Which we can then plot using

1: 
2: 
3: 
4: 
(Chart.Combine [
    Chart.Line(statsByYear |> Seq.map (fun (y, a, _) -> y, a), Name = "Average")
    Chart.Line(statsByYear |> Seq.map (fun (y, _, m) -> y, m), Name = "Median")
]).WithLegend(true)

And actually we see that in 2008 when FSharp first appeared it took a long time for questions to get closed. This is the year F# was introduced and I suspect there was only a handful of people outside of Microsoft Research that actually where able to answer these questions. However as time has progressed we see an exponential improvement it the time for questions to get answered, which typically bottoms out with an average of 25 hours and a median of about 30 mins. This is clearly a sign of a responsive community, that is indeed growing. Whats more, I still don't think that an average of 25 hours is actually representative. In my experience I rarely use Stack Overflow for F# questions, instead I direct my questions to the fsharp github repository previously on codeplex, the repository of the project I am using, or finally twitter with the #fsharp tag, and wait for the plethora of responses to come in from the help and very active community members. And in these domains the response time is typically around ~5 minutes.

In fact as I write this I'm wondering whether the comment

few irritatingly vocal people in the F# community implicitly try to guilt non-believers into F# adoption.

has been spurred by the willingness to help in the community. Yes there is a certain amount of advertisment that goes on for features specific to F#, but in general it is just sound fundamental programming advice. I'm fairly sure every single one of those people would offer examples in C#, Haskell or VB if asked. Anyway I digress.

The second comment that stood out for me in the post was,

  1. F# has no compelling technical advantage. Some overly zealous F# fans claim that F# can do things that general purpose languages like Perl and C# can't. This all depends on how you define features of languages. But more reasonable F# proponents usually state, correctly, that F# isn't intended to replace languages like C# and that F# doesn't have any unique, magic capabilities. So, there's no technical reason for me to use F#, and the cost of context switching between the primarily procedural C# and the primarily functional F# is a huge price to pay.

I think you only have to look at type providers, which are used to analyse the Stack Overflow questions above are certainly a nice and almost unique feature. That to my knowledge only one other language has Idris. Sure you can do the analysis I have done above in C# but there will be alot more typing and additionally a lot less safety, since you will ultimately loose the strongly typed data access, that type providers offer. Moreover it is this safety that F# and statically typed functional programming languages in general offers you and makes it worth the context switch.

Since I adopted F# and functional programming it has completely changed the way I think about coding problems in all of the other languages I use, most notable C#. It has made me a better developer.

namespace Microsoft.FSharp.Data
namespace System
namespace System.IO
Multiple items
val seq : sequence:seq<'T> -> seq<'T>

Full name: Microsoft.FSharp.Core.Operators.seq

--------------------
type seq<'T> = System.Collections.Generic.IEnumerable<'T>

Full name: Microsoft.FSharp.Collections.seq<_>
module Array

from Microsoft.FSharp.Collections
val zeroCreate : count:int -> 'T []

Full name: Microsoft.FSharp.Collections.Array.zeroCreate
Multiple items
val ref : value:'T -> 'T ref

Full name: Microsoft.FSharp.Core.Operators.ref

--------------------
type 'T ref = Ref<'T>

Full name: Microsoft.FSharp.Core.ref<_>
val sub : array:'T [] -> startIndex:int -> count:int -> 'T []

Full name: Microsoft.FSharp.Collections.Array.sub
Multiple items
val int : value:'T -> int (requires member op_Explicit)

Full name: Microsoft.FSharp.Core.Operators.int

--------------------
type int = int32

Full name: Microsoft.FSharp.Core.int

--------------------
type int<'Measure> = int

Full name: Microsoft.FSharp.Core.int<_>
type DateTimeKind =
  | Unspecified = 0
  | Utc = 1
  | Local = 2

Full name: System.DateTimeKind
field System.DateTimeKind.Utc = 1
Multiple items
val float : value:'T -> float (requires member op_Explicit)

Full name: Microsoft.FSharp.Core.Operators.float

--------------------
type float = System.Double

Full name: Microsoft.FSharp.Core.float

--------------------
type float<'Measure> = float

Full name: Microsoft.FSharp.Core.float<_>
module Seq

from Microsoft.FSharp.Collections
val toArray : source:seq<'T> -> 'T []

Full name: Microsoft.FSharp.Collections.Seq.toArray
val sort : array:'T [] -> 'T [] (requires comparison)

Full name: Microsoft.FSharp.Collections.Array.sort
val floor : value:'T -> 'T (requires member Floor)

Full name: Microsoft.FSharp.Core.Operators.floor
val ceil : value:'T -> 'T (requires member Ceiling)

Full name: Microsoft.FSharp.Core.Operators.ceil
Multiple items
type LiteralAttribute =
  inherit Attribute
  new : unit -> LiteralAttribute

Full name: Microsoft.FSharp.Core.LiteralAttribute

--------------------
new : unit -> LiteralAttribute
val not : value:bool -> bool

Full name: Microsoft.FSharp.Core.Operators.not
val sprintf : format:Printf.StringFormat<'T> -> 'T

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.sprintf
Multiple items
module List

from Microsoft.FSharp.Collections

--------------------
type List<'T> =
  | ( [] )
  | ( :: ) of Head: 'T * Tail: 'T list
  interface IEnumerable
  interface IEnumerable<'T>
  member GetSlice : startIndex:int option * endIndex:int option -> 'T list
  member Head : 'T
  member IsEmpty : bool
  member Item : index:int -> 'T with get
  member Length : int
  member Tail : 'T list
  static member Cons : head:'T * tail:'T list -> 'T list
  static member Empty : 'T list

Full name: Microsoft.FSharp.Collections.List<_>
val iter : action:('T -> unit) -> list:'T list -> unit

Full name: Microsoft.FSharp.Collections.List.iter
val fold : folder:('State -> 'T -> 'State) -> state:'State -> source:seq<'T> -> 'State

Full name: Microsoft.FSharp.Collections.Seq.fold
union case Option.Some: Value: 'T -> Option<'T>
union case Option.None: Option<'T>
Multiple items
module Map

from Microsoft.FSharp.Collections

--------------------
type Map<'Key,'Value (requires comparison)> =
  interface IEnumerable
  interface IComparable
  interface IEnumerable<KeyValuePair<'Key,'Value>>
  interface ICollection<KeyValuePair<'Key,'Value>>
  interface IDictionary<'Key,'Value>
  new : elements:seq<'Key * 'Value> -> Map<'Key,'Value>
  member Add : key:'Key * value:'Value -> Map<'Key,'Value>
  member ContainsKey : key:'Key -> bool
  override Equals : obj -> bool
  member Remove : key:'Key -> Map<'Key,'Value>
  ...

Full name: Microsoft.FSharp.Collections.Map<_,_>

--------------------
new : elements:seq<'Key * 'Value> -> Map<'Key,'Value>
val ofSeq : elements:seq<'Key * 'T> -> Map<'Key,'T> (requires comparison)

Full name: Microsoft.FSharp.Collections.Map.ofSeq
val toSeq : table:Map<'Key,'T> -> seq<'Key * 'T> (requires comparison)

Full name: Microsoft.FSharp.Collections.Map.toSeq
val map : mapping:('T -> 'U) -> source:seq<'T> -> seq<'U>

Full name: Microsoft.FSharp.Collections.Seq.map
module String

from Microsoft.FSharp.Core
val printfn : format:Printf.TextWriterFormat<'T> -> 'T

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.printfn
val incr : cell:int ref -> unit

Full name: Microsoft.FSharp.Core.Operators.incr
val groupBy : projection:('T -> 'Key) -> source:seq<'T> -> seq<'Key * seq<'T>> (requires equality)

Full name: Microsoft.FSharp.Collections.Seq.groupBy
val average : source:seq<'T> -> 'T (requires member ( + ) and member DivideByInt and member get_Zero)

Full name: Microsoft.FSharp.Collections.Seq.average
val sortBy : projection:('T -> 'Key) -> source:seq<'T> -> seq<'T> (requires comparison)

Full name: Microsoft.FSharp.Collections.Seq.sortBy
tweet-share