<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://andyfriesen.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://andyfriesen.com/" rel="alternate" type="text/html" /><updated>2025-01-22T05:24:59+00:00</updated><id>https://andyfriesen.com/feed.xml</id><title type="html">Andy Friesen</title><subtitle>Interblag</subtitle><entry><title type="html">Crux - Records</title><link href="https://andyfriesen.com/2016/06/08/crux-records.html" rel="alternate" type="text/html" title="Crux - Records" /><published>2016-06-08T00:00:00+00:00</published><updated>2016-06-08T00:00:00+00:00</updated><id>https://andyfriesen.com/2016/06/08/crux-records</id><content type="html" xml:base="https://andyfriesen.com/2016/06/08/crux-records.html"><![CDATA[<p>This is my second post about Crux.  You may want to read the <a href="/2016/05/19/crux.html">first one</a> first.</p>

<p>Records are a pretty fundamental idea in every programming language.  It’s important to get them right.</p>

<p>We have some requirements:</p>

<ul>
  <li>Records have to be easy to understand</li>
  <li>Crux must be a delightful programming environment on the web too, so records must be easy to use when working with foreign JavaScript APIs.</li>
  <li>Mutability needs to be convenient and predictable</li>
</ul>

<p>I think we’ve come up with something that’s both novel and hits all the sweet spots.</p>

<h1 id="row-polymorphism">Row Polymorphism</h1>

<p>First off, records in Crux are what we call <em>row polymorphic</em>.  This means that a record is no more and no less than the set of fields it has.  If two records have the same fields, and those fields have the same types, then they have the same record type.  This is also called <em>structural typing</em>.  OCaml and TypeScript also make use of this idea.</p>

<p>This is in stark contrast to languages like C# and Java, where a type declaration adds a sort of identity to the data type.  This is what we call <em>nominal typing</em>, and Crux supports this as well. (I’ll get to this another day.)</p>

<p>In a structural type system, we don’t care so much about exact matches.  Instead, we just care that a value has the properties that a particular function needs.  For instance, we might write a hypotenuse function for points:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fun hypot(point) {
    sqrt(point.x * point.x + point.y * point.y)
}
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">point</code> parameter of this function clearly needs to have an <code class="language-plaintext highlighter-rouge">x</code> and a <code class="language-plaintext highlighter-rouge">y</code>, but we haven’t said anything about what other properties it might have.  In Crux, as in TypeScript, it doesn’t matter:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let named_point = {
    name: "My House",
    x: 122.4194,
    y: 37.7749
}
let h = hypot(named_point)
</code></pre></div></div>

<p>This is ok.  As long as the argument satisfies the required properties, additional properties are allowed.</p>

<h1 id="mutability">Mutability</h1>

<p>Immutable values are fantastic things to have around.  They’re so much easier to reason about.  We’ve done a lot of work in environments where things are mutable by default and in environments where they are immutable by default, and the latter is quite a lot better in the long term.</p>

<p>We’ve also worked in environments where mutability is a fair bit less convenient to get at, and we’d really prefer to be on the other side of that fence.</p>

<p>To that end, we wanted Crux to afford easy access to immutable data, but with a convenient way to strip that off and start changing things.</p>

<p>We use type inference to sort all of this out.</p>

<p>You can mutate a record field just like you think you should:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let named_point = {
    name: "My House",
    x: 122.4194,
    y: 37.7749
}
named_point.name = "This name is much better"
</code></pre></div></div>

<p>One thing you can do in Crux is to explicitly declare record fields to be mutable or immutable.  Presently, we do this with a type annotation.  We might add syntax to make this easier.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Define a little type alias, for brevity
type NamedPoint = {
    const name: String,
    mutable x: Number,
    mutable y: Number
}

let named_point : NamedPoint = {
    name: "The Greatest Point",
    x: 999,
    y: 999
}
</code></pre></div></div>

<p>These annotations are optional, and if you don’t specify one, the type inference engine will figure it out.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fun zero_out(point) {
    point.x = 0
    point.y = 0
}
let my_point = { x: 3, y: 2 } // x and y must be mutable
zero_out(my_point)
</code></pre></div></div>

<p>So far we’ve talked about mutable and immutable record fields, but there is actually a third state which we haven’t figured out a name for yet.  It is a record field that isn’t mutated in the current scope, but may or may not be mutable in other scopes.</p>

<p>The reason is that we can easily prove that a function <em>requires</em> a mutable field, but we can never prove that a mutable field is <em>forbidden</em>.  Consider our first example:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fun hypot(point) {
    sqrt(point.x * point.x + point.y * point.y)
}
</code></pre></div></div>

<p>Either a mutable or an immutable <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> will work just fine.  Every record field is thus one of the following:</p>

<ul>
  <li>Mutable,</li>
  <li>Immutable, or</li>
  <li>An immutable view into a value which might be mutable.  This is very similar to <code class="language-plaintext highlighter-rouge">const</code> in C++.</li>
</ul>

<p>The really nice thing about this scheme is that the type inference engine will generally stay out of your way until you put a type annotation on a record field.</p>

<h1 id="javascript">JavaScript</h1>

<p>Lastly, Crux will not be delightful to use if it’s difficult to talk to JavaScript code.  To make this easy, we promise that Crux will obey two rules:</p>

<ul>
  <li>A Crux record maps exactly to a JavaScript object, and</li>
  <li>Calling a function on a Crux record always generates the code for a JS method call</li>
</ul>

<p>Let’s look at a simple example.  Say we want to run this function:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fun main() {
    document.body.insertBefore(
        document.createTextNode("Hello!"),
        document.body.firstChild
    )
}
</code></pre></div></div>

<p>First off, Crux doesn’t yet know anything about browser APIs.  We’ll add this to the standard library someday, but for now, we need to <em>build</em> our standard library. :)</p>

<p>The <code class="language-plaintext highlighter-rouge">document</code> object is always in scope on a web page, so we’ll use the <code class="language-plaintext highlighter-rouge">declare</code> construct to tell the compiler that it exists.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>declare document : Document
</code></pre></div></div>

<p>No code is generated from this declaration.  It’s just a promise to the compiler.</p>

<p>Next, we need to define the <code class="language-plaintext highlighter-rouge">Document</code> type:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Document = {
    createTextNode: (String) -&gt; Node,
    body: {
        insertBefore: (Node, Node) -&gt; Node,
        firstChild: Node
    }
}

data Node {}
</code></pre></div></div>

<p>Astute readers might ask about what this means in relation to JS prototypes and method dispatch, and the answer is quite simple: Crux has no awareness whatsoever of these things.  We promise that <code class="language-plaintext highlighter-rouge">document.createTextNode("Hello!")</code> in Crux will generate the JS <code class="language-plaintext highlighter-rouge">document.createTextNode("Hello!")</code>, but how that JS statement will be executed is left up to the JS engine.</p>

<p>Note here that we also defined a <code class="language-plaintext highlighter-rouge">Node</code> type, but didn’t say anything at all about its composition.  This is an easy way to make a data type that has no user-inspectable parts.  You can think of it as an inscrutable baton that gets passed around.</p>

<p>You can try it yourself in our <a href="http://cruxlang.org/try">online playground</a>.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[This is my second post about Crux. You may want to read the first one first.]]></summary></entry><entry><title type="html">Crux - A Programming Language for People</title><link href="https://andyfriesen.com/2016/05/19/crux.html" rel="alternate" type="text/html" title="Crux - A Programming Language for People" /><published>2016-05-19T00:00:00+00:00</published><updated>2016-05-19T00:00:00+00:00</updated><id>https://andyfriesen.com/2016/05/19/crux</id><content type="html" xml:base="https://andyfriesen.com/2016/05/19/crux.html"><![CDATA[<p><a href="https://chadaustin.me/">Chad Austin</a> and I have been working on a programming language for the past 6 months or so.  It is still not sufficiently stable that I’d recommend it for actual production use, but we’ve done enough that we think it might be interesting to language nerds.</p>

<p>Crux arose from a lot of research and personal experience on both our parts.</p>

<h1 id="javascript">JavaScript</h1>

<p>To start, we both have a lot of experience dealing with large, old code bases written in dynamic languages.  We had the privilege of working with some tremendously smart, motivated people who all wanted to do the right thing, but we were nevertheless left feeling <a href="https://chadaustin.me/2015/04/the-long-term-problem-with-dynamically-typed-languages/">unsatisfied</a> with the amount of work it takes to get good reliability and agility out of dynamic languages.</p>

<p>JavaScript is absolutely not the language we’d like to build our web applications in.</p>

<h1 id="haskell">Haskell</h1>

<p>Secondly, we’ve also got a lot of experience on the opposite extreme: <a href="https://engineering.imvu.com/2014/03/24/what-its-like-to-use-haskell/">we have both written quite a lot of production Haskell</a>.  We love the fidelity of Haskell’s type system and how it helps real humans write good software that can still change even when it is large and old, but we found the human factors to leave something to be desired:</p>

<ul>
  <li>In order to do anything with any data type, you have to move your cursor to the top of the file and add an <code class="language-plaintext highlighter-rouge">import</code> statement.  Larger modules require dozens of imports.  We’ve seen over a hundred imports in a single source file.</li>
  <li>Haskell is lazy.  Because of this, a lot more is required of the compiler to get reasonable code, and even then, it’s easy for a well-meaning person to write code that allocates far more memory than expected (a “space leak”).</li>
  <li>There is a JS backend for Haskell, but the code it generates is very large (960kb for Hello World!) and is almost impossible for a human being to understand.</li>
</ul>

<p>Haskell is great (we’re using it to author the compiler!), but it’s far from perfect, and we can’t use it on the web anyway.</p>

<h1 id="ocaml">OCaml</h1>

<p>There exists a <em>spectacular</em> JS backend for OCaml called <a href="http://ocsigen.org/js_of_ocaml/">js_of_ocaml</a>.  It generates fast, somewhat readable JS, and the OCaml language itself is remarkably well thought out.</p>

<p>The problem is that (and I must stress that I think this is regrettable) OCaml will never become a popular mainstream language, and it has nothing to do with OCaml’s theoretical soundness.</p>

<p>OCaml is <em>culturally tone-deaf</em>:</p>

<ul>
  <li>Arrays use the syntax <code class="language-plaintext highlighter-rouge">[| 1 ; 2 ; 3 ; 4 |]</code>.  Linked lists use the syntax <code class="language-plaintext highlighter-rouge">[1 ; 2 ; 3 ; 4]</code>.</li>
  <li>Tuples use <code class="language-plaintext highlighter-rouge">,</code> and do not require parens.  The expression <code class="language-plaintext highlighter-rouge">[1 , 2]</code> is actually a list containing a single tuple.</li>
  <li>OCaml has no overloading.  Adding integers is done with the <code class="language-plaintext highlighter-rouge">+</code> operator; to add floats, the <code class="language-plaintext highlighter-rouge">+.</code> operator must instead be used.</li>
  <li>OCaml has objects, but the method access operator is <code class="language-plaintext highlighter-rouge">#</code>, not <code class="language-plaintext highlighter-rouge">.</code> or <code class="language-plaintext highlighter-rouge">-&gt;</code>, e.g. <code class="language-plaintext highlighter-rouge">document#createElement "text"</code>.</li>
  <li>Mutable data is created with the <code class="language-plaintext highlighter-rouge">ref</code> function.  This looks great.  Assignment, however, uses <code class="language-plaintext highlighter-rouge">:=</code>.  Reading a ref cell requires using the <code class="language-plaintext highlighter-rouge">!</code> operator, much like you use the <code class="language-plaintext highlighter-rouge">*</code> operator in C to dereference a pointer, e.g. <code class="language-plaintext highlighter-rouge">x := !x + 1</code>.</li>
</ul>

<p>Crucially, there are very good historical and technical reasons why all of these things are the way they are, but contemporary programmers don’t look at that.  We see <code class="language-plaintext highlighter-rouge">let a = [|1; 2; 3|];;</code> and we’re <em>done</em>.  No further justification is necessary.</p>

<p>OCaml is a surprisingly adept language for the web, but it can never be more than a tiny niche.</p>

<h1 id="typescript">TypeScript</h1>

<p>Lastly, we looked at TypeScript.</p>

<p>TypeScript looks as though it is purpose-made to be a success among JavaScript developers.</p>

<p>Almost all of its syntax is instantly recognizable to people coming from JS, C#, or Java, and it has a stellar story for working with untyped JS: the TypeScript compiler is designed from the start to do very little more than perform extra type checking.  If you strip the types from TypeScript, you get JavaScript.</p>

<p>Unfortunately, preexisting JS doesn’t necessarily map to any kind of sane static type system, so TypeScript is intentionally <em>unsound</em>.  By this I mean that it is possible to write a valid TypeScript program that incorrectly uses a value of one type as though it has some other (unrelated) type.</p>

<p>TypeScript also wound up repeating the <a href="https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions">Billion Dollar Mistake</a>.</p>

<p>Now, it’s certainly the case that TypeScript is a killer solution if you specifically have a preexisting JS application that you need to improve incrementally, but I think unsoundness and pervasive nullability fatally compromise a system’s resilience to change.</p>

<p>TypeScript is what I want to move my aging JS codebase <em>to</em>, but it’s not where I want to start, if I have any choice.</p>

<h1 id="crux">Crux</h1>

<p>From these, we arrive at Crux’s key pillars:</p>

<ul>
  <li>Crux helps you write programs that are still easy to change when they are old and large</li>
  <li>Compiled Crux is small, fast, and has predictable performance</li>
  <li>Crux looks like contemporary programmers expect</li>
</ul>

<p>I’ll go into more detail about what this means in upcoming posts.</p>

<p><a href="https://github.com/cruxlang/crux">Crux</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[Chad Austin and I have been working on a programming language for the past 6 months or so. It is still not sufficiently stable that I’d recommend it for actual production use, but we’ve done enough that we think it might be interesting to language nerds.]]></summary></entry><entry><title type="html">Haskell Basics: How to Loop</title><link href="https://andyfriesen.com/2015/12/18/haskell-basics-how-to-loop.html" rel="alternate" type="text/html" title="Haskell Basics: How to Loop" /><published>2015-12-18T00:00:00+00:00</published><updated>2015-12-18T00:00:00+00:00</updated><id>https://andyfriesen.com/2015/12/18/haskell-basics-how-to-loop</id><content type="html" xml:base="https://andyfriesen.com/2015/12/18/haskell-basics-how-to-loop.html"><![CDATA[<p>One of the things that really gets newcomers to Haskell is that it’s got a vision of flow control that’s completely foreign.  OCaml is arguably Haskell’s nearest popular cousin, and even it has basic things like while and for loops.</p>

<p>Throw in all this business with <a href="http://stackoverflow.com/questions/3870088/a-monad-is-just-a-monoid-in-the-category-of-endofunctors-whats-the-problem">endofunctors</a> and <a href="https://byorgey.wordpress.com/2009/01/12/abstraction-intuition-and-the-monad-tutorial-fallacy/">burritos</a> and it’s pretty clear that a lot of newcomers get frustrated because all this theoretical stuff gets in the way of writing algorithms that they already know how to write.  In other languages, these newcomers are experts and they are not at all used to feeling lost.</p>

<p>As a preface, I’m not going to explain how monads work, and I’m not going to explain any of the historical anecdotes that explain why these things are the way they are.  This territory is incredibly well-trod by others.</p>

<p>Additionally, many of the things that I’ll describe here are non-idiomatic Haskell, but none create design-wrecking maintenance or performance problems.  I think it’s better that newcomers write “ugly” code that works than it is that they learn all of functional programming all at once. <code class="language-plaintext highlighter-rouge">:)</code></p>

<h1 id="pure-loops">Pure Loops</h1>

<p>If your loop doesn’t require side effects, the thing you’re actually after is some kind of transform.  You want to turn a sequence into something else by walking it.</p>

<h2 id="transforming-elements">Transforming Elements</h2>

<p>If you just want to transform each element of a collection, but you don’t want to change the type (or length!) of the collection at all, you probably want a map.  The map function is called <code class="language-plaintext highlighter-rouge">map</code> and has this signature:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">map</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">[</span><span class="n">b</span><span class="p">]</span>
</code></pre></div></div>

<p>If you don’t have a list, but instead have a Vector, Map, deque or whatever, you can use its more general cousin <code class="language-plaintext highlighter-rouge">fmap</code>:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fmap</span> <span class="o">::</span> <span class="kt">Functor</span> <span class="n">f</span> <span class="o">=&gt;</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">f</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">f</span> <span class="n">b</span>
</code></pre></div></div>
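
<p>As a quick sketch of the difference, here is <code class="language-plaintext highlighter-rouge">map</code> on a list next to <code class="language-plaintext highlighter-rouge">fmap</code> on a <code class="language-plaintext highlighter-rouge">Maybe</code> and a <code class="language-plaintext highlighter-rouge">Map</code>.  The values are made up for illustration, and I’m assuming the <code class="language-plaintext highlighter-rouge">containers</code> package (which ships with GHC) for <code class="language-plaintext highlighter-rouge">Data.Map</code>.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import qualified Data.Map as Map

main :: IO ()
main = do
    -- map only works on lists
    print (map (* 2) [1, 2, 3])
    -- fmap works on any Functor, such as Maybe...
    print (fmap (+ 1) (Just 41))
    -- ...or Map, where it transforms the values and leaves the keys alone
    print (fmap length (Map.fromList [(1, "one"), (2, "three")]))
</code></pre></div></div>

<p>In each case the shape of the container is untouched.  Only the elements change.</p>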

<h2 id="accumulating-aka-folding">Accumulating (aka folding)</h2>

<p>Consider this simple JS:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">count</span><span class="p">(</span><span class="nx">anArray</span><span class="p">)</span> <span class="p">{</span>
    <span class="kd">var</span> <span class="nx">result</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="kd">var</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">anArray</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span> <span class="o">++</span><span class="nx">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">result</span> <span class="o">+=</span> <span class="nx">anArray</span><span class="p">[</span><span class="nx">i</span><span class="p">];</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="nx">result</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This clearly isn’t a map.  The result isn’t an array at all.  It’s something else.</p>

<p>When you want to walk an array and build up a value like this, use a fold.  The Haskell function you should start with is called <code class="language-plaintext highlighter-rouge">foldl'</code>, found in the <code class="language-plaintext highlighter-rouge">Data.Foldable</code> module.  The above transliterates to this Haskell:</p>

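<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">import</span> <span class="nn">Data.Foldable</span> <span class="p">(</span><span class="nf">foldl'</span><span class="p">)</span>

<span class="n">count</span> <span class="n">l</span> <span class="o">=</span>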
    <span class="kr">let</span> <span class="n">accumulate</span> <span class="n">acc</span> <span class="n">el</span> <span class="o">=</span> <span class="n">el</span> <span class="o">+</span> <span class="n">acc</span>
    <span class="kr">in</span> <span class="n">foldl'</span> <span class="n">accumulate</span> <span class="mi">0</span> <span class="n">l</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">foldl'</code> takes a function, an initial value and the collection to walk.  This function takes the result that has been computed so far, and the next element to merge in.</p>
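
<p>The argument order of that function is easy to get backwards, so here is one more small sketch.  The <code class="language-plaintext highlighter-rouge">longest</code> function and its input are hypothetical, made up just to show an accumulator that isn’t a running sum.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Data.Foldable (foldl')

-- The step function receives the result so far first, then the next element.
longest :: [String] -&gt; Int
longest = foldl' step 0
  where
    step best word = max best (length word)

main :: IO ()
main = print (longest ["fold", "accumulate", "map"])  -- prints 10
</code></pre></div></div>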

<h2 id="accumulations-that-exit-early-sometimes">Accumulations that exit early sometimes</h2>

<p><em>Edited: Updated this section per feedback from <a href="https://www.reddit.com/user/lamefun">lamefun</a>.  Thanks!</em></p>

<p>Consider this:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">indexOf</span><span class="p">(</span><span class="nx">list</span><span class="p">,</span> <span class="nx">element</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kd">var</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">list</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span> <span class="o">++</span><span class="nx">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="nx">list</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span> <span class="o">==</span> <span class="nx">element</span><span class="p">)</span> <span class="p">{</span>
            <span class="k">return</span> <span class="nx">i</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is superficially similar to what we were doing above, but we want to stop looping when we hit a certain point.</p>

<p>When the builtin traversals don’t obviously provide something you actually want, the end-all solution is the tail-recursive loop.</p>

<p>This is the most manual way to loop in Haskell, and as such it’s the most flexible.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">indexOf'</span> <span class="n">list</span> <span class="n">element</span> <span class="o">=</span>
    <span class="kr">let</span> <span class="n">step</span> <span class="n">l</span> <span class="n">index</span> <span class="o">=</span> <span class="kr">case</span> <span class="n">l</span> <span class="kr">of</span>
            <span class="kt">[]</span> <span class="o">-&gt;</span> <span class="kt">Nothing</span>
            <span class="p">(</span><span class="n">x</span><span class="o">:</span><span class="n">xs</span><span class="p">)</span> <span class="o">-&gt;</span>
                <span class="kr">if</span> <span class="n">x</span> <span class="o">==</span> <span class="n">element</span>
                    <span class="kr">then</span> <span class="kt">Just</span> <span class="n">index</span>
                    <span class="kr">else</span> <span class="n">step</span> <span class="n">xs</span> <span class="p">(</span><span class="n">index</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
    <span class="kr">in</span> <span class="n">step</span> <span class="n">list</span> <span class="mi">0</span>
</code></pre></div></div>

<p>The pattern you want to follow is to write a helper function that takes as arguments all the state that changes from iteration to iteration.  When you want to update your state and jump to the start of the loop, do a recursive call with your new, updated arguments.</p>

<p>The only thing to worry about is to ensure that your recursive call is in <a href="https://en.wikipedia.org/wiki/Tail_call">tail position</a>.  The compiler will optimize tail calls into “goto” instructions rather than “calls.”</p>
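
<p>Here is the same pattern again in a minimal sketch that carries an accumulator along.  The function name <code class="language-plaintext highlighter-rouge">sumUntilNegative</code> and the helper <code class="language-plaintext highlighter-rouge">go</code> are my own inventions for illustration.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sum the leading non-negative numbers, stopping at the first negative one.
sumUntilNegative :: [Int] -&gt; Int
sumUntilNegative numbers = go numbers 0
  where
    -- All the state that changes per iteration is an argument of go.
    go [] total = total
    go (x:xs) total
        | x &lt; 0     = total               -- early exit: no recursive call
        | otherwise = go xs (total + x)   -- tail call: jump back to the top

main :: IO ()
main = print (sumUntilNegative [3, 4, 5, -1, 100])  -- prints 12
</code></pre></div></div>

<p>The recursive call to <code class="language-plaintext highlighter-rouge">go</code> is the last thing that branch does, which is what puts it in tail position.</p>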

<h1 id="impure-loops">Impure Loops</h1>

<h2 id="just-plain-doing-stuff">Just Plain Doing Stuff</h2>

<p><code class="language-plaintext highlighter-rouge">Data.Foldable</code> exports a function called <code class="language-plaintext highlighter-rouge">forM_</code>, which takes a foldable data structure and a monadic function, runs the function on each element, and discards the results.</p>

<p>This is as close to a C++-style <code class="language-plaintext highlighter-rouge">for()</code> loop as you’re going to get.</p>

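<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">import</span> <span class="nn">Control.Monad</span> <span class="p">(</span><span class="nf">forM_</span><span class="p">,</span> <span class="nf">when</span><span class="p">)</span>

<span class="n">main</span> <span class="o">=</span> <span class="kr">do</span>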
    <span class="n">forM_</span> <span class="p">[</span><span class="mi">1</span><span class="o">..</span><span class="mi">100</span><span class="p">]</span> <span class="o">$</span> <span class="nf">\</span><span class="n">number</span> <span class="o">-&gt;</span> <span class="kr">do</span>
        <span class="n">putStr</span> <span class="o">$</span> <span class="n">show</span> <span class="n">number</span> <span class="o">++</span> <span class="s">" "</span>
        <span class="n">when</span> <span class="p">(</span><span class="mi">0</span> <span class="o">==</span> <span class="n">number</span> <span class="p">`</span><span class="n">mod</span><span class="p">`</span> <span class="mi">3</span><span class="p">)</span> <span class="o">$</span>
            <span class="n">putStr</span> <span class="s">"Fizz"</span>
        <span class="n">when</span> <span class="p">(</span><span class="mi">0</span> <span class="o">==</span> <span class="n">number</span> <span class="p">`</span><span class="n">mod</span><span class="p">`</span> <span class="mi">5</span><span class="p">)</span> <span class="o">$</span>
            <span class="n">putStr</span> <span class="s">"Buzz"</span>
        <span class="n">putStrLn</span> <span class="s">""</span>
</code></pre></div></div>

<h2 id="mapping">Mapping</h2>
<p>If you drop the underscore and use <code class="language-plaintext highlighter-rouge">forM</code> (from <code class="language-plaintext highlighter-rouge">Data.Traversable</code>) instead, you can capture the results.</p>

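<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">import</span> <span class="nn">Control.Monad</span> <span class="p">(</span><span class="nf">forM</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">System.IO</span> <span class="p">(</span><span class="nf">hFlush</span><span class="p">,</span> <span class="nf">stdout</span><span class="p">)</span>

<span class="n">main</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="n">strings</span> <span class="o">&lt;-</span> <span class="n">forM</span> <span class="p">[</span><span class="mi">1</span><span class="o">..</span><span class="mi">5</span><span class="p">]</span> <span class="o">$</span> <span class="nf">\</span><span class="n">number</span> <span class="o">-&gt;</span> <span class="kr">do</span>
        <span class="n">putStr</span> <span class="o">$</span> <span class="s">"Enter string "</span> <span class="o">++</span> <span class="n">show</span> <span class="n">number</span> <span class="o">++</span> <span class="s">": "</span>
        <span class="c1">-- Flush so the prompt shows up before getLine blocks</span>
        <span class="n">hFlush</span> <span class="n">stdout</span>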
        <span class="n">getLine</span>

    <span class="n">print</span> <span class="n">strings</span>
</code></pre></div></div>

<h2 id="accumulating">Accumulating</h2>

<p>Honestly, if it’s impure, you can just create an <code class="language-plaintext highlighter-rouge">IORef</code>.  <code class="language-plaintext highlighter-rouge">IORef</code>s are mutable variables in Haskell.</p>

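<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">import</span> <span class="nn">Control.Monad</span> <span class="p">(</span><span class="nf">forM_</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Data.IORef</span> <span class="p">(</span><span class="nf">modifyIORef'</span><span class="p">,</span> <span class="nf">newIORef</span><span class="p">,</span> <span class="nf">readIORef</span><span class="p">)</span>

<span class="n">main</span> <span class="o">=</span> <span class="kr">do</span>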
    <span class="kr">let</span> <span class="n">increment</span> <span class="n">n</span> <span class="o">=</span> <span class="n">n</span> <span class="o">+</span> <span class="mi">1</span>

    <span class="n">count</span> <span class="o">&lt;-</span> <span class="n">newIORef</span> <span class="mi">0</span>

    <span class="n">forM_</span> <span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">50</span><span class="p">]</span> <span class="o">$</span> <span class="nf">\</span><span class="n">number</span> <span class="o">-&gt;</span> <span class="kr">do</span>
        <span class="n">modifyIORef'</span> <span class="n">count</span> <span class="n">increment</span>

    <span class="n">c</span> <span class="o">&lt;-</span> <span class="n">readIORef</span> <span class="n">count</span>
    <span class="n">print</span> <span class="n">c</span>
</code></pre></div></div>

<h2 id="better-accumulating">Better Accumulating</h2>

<p><code class="language-plaintext highlighter-rouge">foldM</code> is exactly analogous to <code class="language-plaintext highlighter-rouge">foldl'</code>, except it’s monadic.  This means that you can use it to perform side effects in your loop body as you accumulate values.</p>

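<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">import</span> <span class="nn">Control.Monad</span> <span class="p">(</span><span class="nf">foldM</span><span class="p">)</span>

<span class="n">main</span> <span class="o">=</span> <span class="kr">do</span>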
    <span class="kr">let</span> <span class="n">l</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">4</span><span class="p">]</span>
    <span class="kr">let</span> <span class="n">iter</span> <span class="n">acc</span> <span class="n">element</span> <span class="o">=</span> <span class="kr">do</span>
            <span class="n">putStrLn</span> <span class="o">$</span> <span class="s">"Executing side effect "</span> <span class="o">++</span> <span class="n">show</span> <span class="n">element</span>
            <span class="n">return</span> <span class="p">(</span><span class="n">acc</span> <span class="o">+</span> <span class="n">element</span><span class="p">)</span>
    <span class="n">total</span> <span class="o">&lt;-</span> <span class="n">foldM</span> <span class="n">iter</span> <span class="mi">0</span> <span class="n">l</span>
    <span class="n">putStrLn</span> <span class="o">$</span> <span class="s">"Total is "</span> <span class="o">++</span> <span class="n">show</span> <span class="n">total</span>
</code></pre></div></div>

<h2 id="accumulation-with-early-termination">Accumulation with early termination</h2>

<p>Just like with pure code, when libraries don’t seem to offer what you want, just write out the tail-recursive loop.  The only difference is that monadic functions generally have to <code class="language-plaintext highlighter-rouge">return</code> some value in non-recursive cases.  If you just want to do stuff and don’t have a result you want to carry back, return <code class="language-plaintext highlighter-rouge">()</code>.  Think of it as an empty tuple.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">main</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">test</span> <span class="n">a_list</span> <span class="o">=</span> <span class="kr">case</span> <span class="n">a_list</span> <span class="kr">of</span>
            <span class="kt">[]</span> <span class="o">-&gt;</span>
                <span class="n">return</span> <span class="nb">()</span>
            <span class="p">(</span><span class="n">x</span><span class="o">:</span><span class="n">xs</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kr">do</span>
                <span class="n">putStrLn</span> <span class="o">$</span> <span class="s">"Testing element "</span> <span class="o">++</span> <span class="n">show</span> <span class="n">x</span>
                <span class="kr">if</span> <span class="mi">0</span> <span class="o">==</span> <span class="n">x</span> <span class="p">`</span><span class="n">mod</span><span class="p">`</span> <span class="mi">3</span>
                    <span class="kr">then</span> <span class="n">return</span> <span class="nb">()</span>
                    <span class="kr">else</span> <span class="n">test</span> <span class="n">xs</span>
    <span class="n">test</span> <span class="p">[</span><span class="mi">1</span><span class="o">..</span><span class="mi">10</span><span class="p">]</span>
</code></pre></div></div>

<p>Here, our <code class="language-plaintext highlighter-rouge">test</code> function pattern matches on the list it is given, and stops if the list is empty or if its first element is evenly divisible by 3.  If not, it tail recurses with the rest of the list.</p>

<p>Something useful to observe here is that we are, in a certain sense, effecting a “mutable variable” by way of the recursive call.  The parameter “shrinks” with each successive recursive step.</p>
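<p>The same template can also carry an accumulator.  What follows is only a sketch: <code class="language-plaintext highlighter-rouge">sumUntil</code> and its <code class="language-plaintext highlighter-rouge">limit</code> parameter are hypothetical names invented for illustration.  It adds up elements and stops as soon as the next one would push the total past the limit.</p>

```haskell
module Main where

-- Hypothetical helper (not from the original post): add up elements,
-- announcing each one, and stop early once the total would exceed the limit.
sumUntil :: Int -> [Int] -> IO Int
sumUntil limit = go 0
  where
    -- The accumulator plays the role of the "mutable variable".
    go acc [] = return acc
    go acc (x:xs) = do
        putStrLn $ "Considering element " ++ show x
        let acc' = acc + x
        if acc' > limit
            then return acc   -- early termination: discard x and stop
            else go acc' xs   -- tail recurse with the updated total

main :: IO ()
main = do
    total <- sumUntil 10 [1..10]
    putStrLn $ "Total is " ++ show total
```

<p>Both the accumulator and the remaining list are threaded through the recursive call, so there are two “variables” changing on each step.</p>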

<p>This is also the most flexible way to write a loop.  Anything you can do in C, you can do in Haskell by way of variations on this template.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[One of the things that really gets newcomers to Haskell is that it’s got a vision of flow control that’s completely foreign. OCaml is arguably Haskell’s nearest popular cousin, and even it has basic things like while and for loops.]]></summary></entry><entry><title type="html">Testable IO in Haskell</title><link href="https://andyfriesen.com/2015/06/17/testable-io-in-haskell.html" rel="alternate" type="text/html" title="Testable IO in Haskell" /><published>2015-06-17T00:00:00+00:00</published><updated>2015-06-17T00:00:00+00:00</updated><id>https://andyfriesen.com/2015/06/17/testable-io-in-haskell</id><content type="html" xml:base="https://andyfriesen.com/2015/06/17/testable-io-in-haskell.html"><![CDATA[<p>At IMVU, we write a lot of tests.  Ideally, we write tests for every feature and bugfix
we write.  The problem we run into is one of scale: if each of IMVU’s tests were 99.9% reliable, 
1 out of every 5 runs would result in an intermittent failure.</p>
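<p>The arithmetic behind that claim can be sketched as follows, assuming that test failures are independent.  The post does not say how many tests are in the suite, so the sizes used below are hypothetical; taking the figures at face value, a 1-in-5 failure rate corresponds to a little over 200 tests.</p>

```haskell
module Main where

-- Probability that at least one of n independent tests fails spuriously,
-- given that each individual test passes with probability r.
suiteFailureRate :: Double -> Int -> Double
suiteFailureRate r n = 1 - r ^ n

main :: IO ()
main = do
    -- 223 is a hypothetical suite size chosen to land near 1 in 5.
    print (suiteFailureRate 0.999 223)
    -- A larger hypothetical suite fails far more often.
    print (suiteFailureRate 0.999 1000)
```

<p>The point is that per-test reliability compounds: the more tests there are, the more reliable each one has to be.</p>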

<p>Tests erroneously fail for lots of reasons: the test could be running in the midst of the “extra” daylight-savings hour
or a leap day (or a leap second!).  The database could have been left corrupted by another test.  CPU scheduling could prioritize one
process over another.  Maybe the random number generator just so happened to produce two zeroes in a row.</p>

<p>All of these things boil down to the same root cause: nondeterminism within the test.</p>

<p>We’ve done a lot of work at IMVU to isolate and control nondeterminism in our test frameworks.  One of my favourite
techniques is the way we make our Haskell tests provably perfectly deterministic.</p>

<p>Here’s how it works.</p>

<p>This post is Literate Haskell, which basically means you can point GHC at it directly and run it.
You can download it <a href="https://raw.githubusercontent.com/andyfriesen/andyfriesen.github.io/master/_lhs/testable-io-in-haskell.lhs">here</a>.</p>

<p>We’ll start with some boilerplate.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">{-# LANGUAGE FlexibleInstances #-}</span>
<span class="cp">{-# LANGUAGE NamedFieldPuns #-}</span>

<span class="kr">module</span> <span class="nn">Main</span> <span class="kr">where</span>

<span class="kr">import</span> <span class="nn">Control.Monad.State.Lazy</span> <span class="k">as</span> <span class="n">S</span>
</code></pre></div></div>

<p>What we’re looking to achieve here is a syntactically lightweight way of writing side-effectful logic that permits
easy unit testing.</p>

<p>In particular, a property we’d very much like to have is the ability to deny our actions access to IO when they are
running in a unit test.</p>

<p>For this example, we’ll posit that the very important business action we wish to test is to prompt the user for their
name, then say hello:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">importantBusinessAction</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="n">writeLine</span> <span class="s">"Please enter your name: "</span>
    <span class="n">name</span> <span class="o">&lt;-</span> <span class="n">readLine</span>
    <span class="kr">if</span> <span class="s">""</span> <span class="o">==</span> <span class="n">name</span>
        <span class="kr">then</span> <span class="kr">do</span>
            <span class="n">writeLine</span> <span class="s">"I really really need a name!"</span>
            <span class="n">importantBusinessAction</span>
        <span class="kr">else</span>
            <span class="n">writeLine</span> <span class="o">$</span> <span class="s">"Hello, "</span> <span class="o">++</span> <span class="n">name</span> <span class="o">++</span> <span class="s">"!"</span>
</code></pre></div></div>

<p>We’ll achieve this by defining a class of monad in which testable side effects can occur.  We’ll name this class
<code class="language-plaintext highlighter-rouge">World</code>.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="o">=&gt;</span> <span class="kt">World</span> <span class="n">m</span> <span class="kr">where</span>
    <span class="n">writeLine</span> <span class="o">::</span> <span class="kt">String</span> <span class="o">-&gt;</span> <span class="n">m</span> <span class="nb">()</span>
    <span class="n">readLine</span> <span class="o">::</span> <span class="n">m</span> <span class="kt">String</span>
</code></pre></div></div>

<p>We can now write the type of our <code class="language-plaintext highlighter-rouge">importantBusinessAction</code>:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">importantBusinessAction</span> <span class="o">::</span> <span class="kt">World</span> <span class="n">m</span> <span class="o">=&gt;</span> <span class="n">m</span> <span class="nb">()</span>
</code></pre></div></div>

<p>This type can be read as “an action producing unit for some monad <code class="language-plaintext highlighter-rouge">m</code> in <code class="language-plaintext highlighter-rouge">World</code>.”</p>

<p>When our application is running in production, we don’t require anything except IO to run, so it’s perfectly sensible
for <code class="language-plaintext highlighter-rouge">IO</code> to be a context in which <code class="language-plaintext highlighter-rouge">World</code> actions can be run.  The Haskell Prelude already offers the exact functions
we need, so this instance is completely trivial:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">instance</span> <span class="kt">World</span> <span class="kt">IO</span> <span class="kr">where</span>
    <span class="n">writeLine</span> <span class="o">=</span> <span class="n">putStrLn</span>
    <span class="n">readLine</span> <span class="o">=</span> <span class="n">getLine</span>
</code></pre></div></div>

<p>In unit tests, we specifically want to deny access to any kind of nondeterminism, so we’ll use the <code class="language-plaintext highlighter-rouge">State</code> monad.
<code class="language-plaintext highlighter-rouge">State</code> provides the illusion of a mutable piece of data through a pure computation.  We’ll pack the state of our
application up in a record.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">type</span> <span class="kt">FakeIO</span> <span class="o">=</span> <span class="kt">S</span><span class="o">.</span><span class="kt">State</span> <span class="kt">FakeState</span>
</code></pre></div></div>

<p>(I’ll get to <code class="language-plaintext highlighter-rouge">FakeState</code> in a second)</p>

<p>Aside from reliability, this design has another very useful property: It is impossible for tests to interfere with one
another even if many tests share the same state.  This means that “test fixtures” can trivially be effected by simply
running an action and using the resulting state in as many tests as desired.</p>
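<p>Here is a minimal, self-contained sketch of that fixture idea.  It uses a plain list of strings as a stand-in for the application state, and the names <code class="language-plaintext highlighter-rouge">Log</code>, <code class="language-plaintext highlighter-rouge">logLine</code> and <code class="language-plaintext highlighter-rouge">fixture</code> are hypothetical.  It assumes the <code class="language-plaintext highlighter-rouge">mtl</code> package, like the rest of this post.</p>

```haskell
module Main where

import Control.Monad.State.Lazy

-- Hypothetical stand-in for the application state: a log of lines, newest first.
type Log = [String]

logLine :: String -> State Log ()
logLine s = modify (s :)

-- The "fixture": an action whose resulting state several tests start from.
fixture :: State Log ()
fixture = logLine "user created"

-- Run the fixture once and keep the resulting state.
fixtureState :: Log
fixtureState = execState fixture []

-- Two independent "tests" that both start from the same fixture state.
resultA, resultB :: Log
resultA = execState (logLine "test A ran") fixtureState
resultB = execState (logLine "test B ran") fixtureState

main :: IO ()
main = do
    print (reverse resultA)
    print (reverse resultB)
    -- The fixture state itself is unchanged by either test.
    print fixtureState
```

<p>Because the state is an ordinary immutable value, neither test can observe what the other one did.</p>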

<p>The state record <code class="language-plaintext highlighter-rouge">FakeState</code> itself essentially captures the full state of the fake application at any one moment.</p>

<p>The <code class="language-plaintext highlighter-rouge">writeLine</code> implementation is very easy: We just need to accumulate a list of lines
that were printed.  We can carry that directly in our state record.</p>

<p>The <code class="language-plaintext highlighter-rouge">readLine</code> action is a bit more complicated.  We’re going to write all kinds of tests for our application, and we
really don’t want to burn any one particular behaviour into the framework.  We want to parameterize this on a per-test
basis.</p>

<p>We’ll solve this by embedding an action directly into our state record.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">data</span> <span class="kt">FakeState</span> <span class="o">=</span> <span class="kt">FS</span>
    <span class="p">{</span> <span class="n">fsWrittenLines</span> <span class="o">::</span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span>
    <span class="p">,</span> <span class="n">fsReadLine</span>     <span class="o">::</span> <span class="kt">FakeIO</span> <span class="kt">String</span>
    <span class="p">}</span>

<span class="n">def</span> <span class="o">::</span> <span class="kt">FakeState</span>
<span class="n">def</span> <span class="o">=</span> <span class="kt">FS</span>
    <span class="p">{</span> <span class="n">fsWrittenLines</span> <span class="o">=</span> <span class="kt">[]</span>
    <span class="p">,</span> <span class="n">fsReadLine</span> <span class="o">=</span> <span class="n">return</span> <span class="s">""</span>
    <span class="p">}</span>
</code></pre></div></div>

<p>Now, given this record, we can declare that <code class="language-plaintext highlighter-rouge">FakeIO</code> is also a valid <code class="language-plaintext highlighter-rouge">World</code> <code class="language-plaintext highlighter-rouge">Monad</code>, and provide
implementations for our platform when run under unit test.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">instance</span> <span class="kt">World</span> <span class="p">(</span><span class="kt">S</span><span class="o">.</span><span class="kt">State</span> <span class="kt">FakeState</span><span class="p">)</span> <span class="kr">where</span>
    <span class="n">writeLine</span> <span class="n">s</span> <span class="o">=</span> <span class="kr">do</span>
        <span class="n">st</span> <span class="o">&lt;-</span> <span class="kt">S</span><span class="o">.</span><span class="n">get</span>
        <span class="kr">let</span> <span class="n">oldLines</span> <span class="o">=</span> <span class="n">fsWrittenLines</span> <span class="n">st</span>
        <span class="kt">S</span><span class="o">.</span><span class="n">put</span> <span class="n">st</span> <span class="p">{</span> <span class="n">fsWrittenLines</span> <span class="o">=</span> <span class="n">s</span><span class="o">:</span><span class="n">oldLines</span> <span class="p">}</span>

    <span class="n">readLine</span> <span class="o">=</span> <span class="kr">do</span>
        <span class="n">st</span> <span class="o">&lt;-</span> <span class="kt">S</span><span class="o">.</span><span class="n">get</span>
        <span class="kr">let</span> <span class="n">readLineAction</span> <span class="o">=</span> <span class="n">fsReadLine</span> <span class="n">st</span>
        <span class="n">readLineAction</span>
</code></pre></div></div>

<p>We also write a small helper function to make unit tests read a bit more naturally:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">runFakeWorld</span> <span class="o">::</span> <span class="n">b</span> <span class="o">-&gt;</span> <span class="kt">State</span> <span class="n">b</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
<span class="n">runFakeWorld</span> <span class="o">=</span> <span class="n">flip</span> <span class="kt">S</span><span class="o">.</span><span class="n">runState</span>
</code></pre></div></div>

<p>Now, let’s write our first unit test.</p>

<p>We wish to test that our application rejects the empty string as a name.  When the user enters an empty name, we wish to verify
that they see an error message and are asked again for their name.</p>

<p>First, we’ll craft a <code class="language-plaintext highlighter-rouge">readLine</code> implementation that produces the empty string once, then the string “Joe.”</p>

<p>Making this function more natural without compromising extensibility is left as an exercise to the reader. :)</p>

<p>Note that by providing the type <code class="language-plaintext highlighter-rouge">FakeIO String</code>, we have effectively authored an action that can <em>only</em> be used in a
unit test.  The build will fail if production code tries to use this action.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">main</span> <span class="o">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="n">main</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">readLine_that_is_incorrect_once</span> <span class="o">::</span> <span class="kt">FakeIO</span> <span class="kt">String</span>
        <span class="n">readLine_that_is_incorrect_once</span> <span class="o">=</span> <span class="kr">do</span>
            <span class="kt">S</span><span class="o">.</span><span class="n">modify</span> <span class="p">(</span><span class="nf">\</span><span class="n">s</span> <span class="o">-&gt;</span> <span class="n">s</span> <span class="p">{</span> <span class="n">fsReadLine</span> <span class="o">=</span> <span class="n">return</span> <span class="s">"Joe"</span> <span class="p">})</span>
            <span class="n">return</span> <span class="s">""</span>
</code></pre></div></div>

<p>Now that we have that, we can create a <code class="language-plaintext highlighter-rouge">FakeState</code> that represents the scenario we wish to test.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="kr">let</span> <span class="n">initState</span> <span class="o">=</span> <span class="n">def</span>
            <span class="p">{</span> <span class="n">fsReadLine</span> <span class="o">=</span> <span class="n">readLine_that_is_incorrect_once</span> <span class="p">}</span>
</code></pre></div></div>

<p>And go!</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="kr">let</span> <span class="p">(</span><span class="nb">()</span><span class="p">,</span> <span class="n">endState</span><span class="p">)</span> <span class="o">=</span> <span class="n">runFakeWorld</span> <span class="n">initState</span> <span class="n">importantBusinessAction</span>
</code></pre></div></div>

<p>Note that <code class="language-plaintext highlighter-rouge">runFakeWorld</code> produces a pair of the result of the action and the final state.  We can inspect this record
freely:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">forM_</span> <span class="p">(</span><span class="n">reverse</span> <span class="o">$</span> <span class="n">fsWrittenLines</span> <span class="n">endState</span><span class="p">)</span> <span class="o">$</span> <span class="nf">\</span><span class="n">line</span> <span class="o">-&gt;</span>
        <span class="n">print</span> <span class="n">line</span>
</code></pre></div></div>

<p>That’s it!</p>
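<p>As for the exercise mentioned earlier, one possible direction is to script the input as a list.  This is only a sketch, not the framework described in this post: <code class="language-plaintext highlighter-rouge">scriptedReadLine</code>, <code class="language-plaintext highlighter-rouge">greet</code> and <code class="language-plaintext highlighter-rouge">writtenLines</code> are hypothetical names, and the class and state record are repeated so that the block compiles on its own.</p>

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Control.Monad.State.Lazy as S

class Monad m => World m where
    writeLine :: String -> m ()
    readLine :: m String

data FakeState = FS
    { fsWrittenLines :: [String]
    , fsReadLine     :: FakeIO String
    }

type FakeIO = S.State FakeState

instance World (S.State FakeState) where
    writeLine s = S.modify (\st -> st { fsWrittenLines = s : fsWrittenLines st })
    readLine = S.get >>= fsReadLine

-- Hypothetical helper: replay a fixed script of inputs, one per readLine
-- call.  Once the script runs out, every further call returns "".
scriptedReadLine :: [String] -> FakeIO String
scriptedReadLine [] = return ""
scriptedReadLine (x:xs) = do
    -- Install the rest of the script as the next readLine behaviour.
    S.modify (\st -> st { fsReadLine = scriptedReadLine xs })
    return x

-- Same logic as importantBusinessAction, renamed for this sketch.
greet :: World m => m ()
greet = do
    writeLine "Please enter your name: "
    name <- readLine
    if "" == name
        then do
            writeLine "I really really need a name!"
            greet
        else writeLine $ "Hello, " ++ name ++ "!"

-- Run greet against a script and return what was written, oldest first.
-- The script must contain a non-empty name, or greet never terminates.
writtenLines :: [String] -> [String]
writtenLines inputs = reverse (fsWrittenLines endState)
  where
    initState = FS { fsWrittenLines = [], fsReadLine = scriptedReadLine inputs }
    endState = S.execState greet initState

main :: IO ()
main = mapM_ putStrLn (writtenLines ["", "Joe"])
```

<p>The state record keeps its general <code class="language-plaintext highlighter-rouge">fsReadLine</code> action, so a test that needs something stranger than a script can still supply its own.</p>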

<p>In a real application, your <code class="language-plaintext highlighter-rouge">FakeState</code> analogue will be much more complex, potentially including things like a clock,
a pseudo-random number generator, and state for a pure database of some sort.  Some of these things are
themselves complex to build out, but, as long as those implementations are pure, everything snaps together neatly.</p>

<p>If complete isolation from IO is impractical, this technique could also be adjusted to run atop a <code class="language-plaintext highlighter-rouge">StateT</code> rather than
pure <code class="language-plaintext highlighter-rouge">State</code>.  This allows for imperfect side-effect isolation where necessary.</p>
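<p>A rough sketch of what that adjustment might look like follows.  It is an assumption about the shape of such a design, not code from a real project: the <code class="language-plaintext highlighter-rouge">fsInputs</code> field, the <code class="language-plaintext highlighter-rouge">FakeIOT</code> synonym, and the <code class="language-plaintext highlighter-rouge">action</code> and <code class="language-plaintext highlighter-rouge">demo</code> functions are all hypothetical.</p>

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Control.Monad.IO.Class (liftIO)
import Control.Monad.State.Lazy as S

class Monad m => World m where
    writeLine :: String -> m ()
    readLine :: m String

-- Hypothetical state record for this sketch: recorded output plus canned input.
data FakeState = FS
    { fsWrittenLines :: [String]
    , fsInputs       :: [String]
    }

-- The fake world layered over IO instead of being completely pure.
type FakeIOT = S.StateT FakeState IO

instance World (S.StateT FakeState IO) where
    writeLine s = S.modify (\st -> st { fsWrittenLines = s : fsWrittenLines st })
    readLine = do
        st <- S.get
        case fsInputs st of
            [] -> return ""
            (x:xs) -> do
                S.put st { fsInputs = xs }
                return x

-- Code written against World still cannot reach IO by itself.
action :: World m => m ()
action = do
    name <- readLine
    writeLine $ "Hello, " ++ name ++ "!"

-- Code written against FakeIOT can, through liftIO.  This is the
-- "imperfect isolation": nothing stops a test helper from doing real IO.
demo :: FakeIOT ()
demo = do
    action
    liftIO (putStrLn "real IO happened")

main :: IO ()
main = do
    let initState = FS { fsWrittenLines = [], fsInputs = ["Joe"] }
    (_, endState) <- S.runStateT demo initState
    mapM_ putStrLn (reverse (fsWrittenLines endState))
```

<p>The tradeoff is that determinism is no longer guaranteed by the types; it now depends on discipline about where <code class="language-plaintext highlighter-rouge">liftIO</code> is used.</p>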

<p>Happy testing!</p>

<p><a href="https://raw.githubusercontent.com/andyfriesen/andyfriesen.github.io/master/_lhs/testable-io-in-haskell.lhs">Source Code</a>.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[At IMVU, we write a lot of tests. Ideally, we write tests for every feature and bugfix we write. The problem we run into is one of scale: if each of IMVU’s tests were 99.9% reliable, 1 out of every 5 runs would result in an intermittent failure.]]></summary></entry><entry><title type="html">What does unique_ptr&amp;lt;&amp;gt; cost?</title><link href="https://andyfriesen.com/2014/10/07/unique_ptr.html" rel="alternate" type="text/html" title="What does unique_ptr&amp;lt;&amp;gt; cost?" /><published>2014-10-07T00:00:00+00:00</published><updated>2014-10-07T00:00:00+00:00</updated><id>https://andyfriesen.com/2014/10/07/unique_ptr</id><content type="html" xml:base="https://andyfriesen.com/2014/10/07/unique_ptr.html"><![CDATA[<p>I just watched <a href="http://www.youtube.com/watch?v=TH9VCN6UkyQ">Jonathan Blow’s proposal</a> for a new programming language, which got me thinking about the difficulties that motivated the talk.</p>

<p>In particular, I think <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code> is fantastic, but I’m curious about how it affects compile times and code size.  Let’s find out.</p>

<p>First, some C++:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;cstdio&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;memory&gt;</span><span class="cp">
</span>
<span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">unique_ptr</span><span class="p">;</span>

<span class="k">struct</span> <span class="nc">S</span> <span class="p">{</span>
    <span class="n">unique_ptr</span><span class="o">&lt;</span><span class="kt">int</span><span class="p">[]</span><span class="o">&gt;</span> <span class="n">ints</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">const</span> <span class="kt">int</span> <span class="n">LEN</span> <span class="o">=</span> <span class="mi">50</span><span class="p">;</span>
    <span class="k">auto</span> <span class="n">s</span> <span class="o">=</span> <span class="n">S</span> <span class="p">{</span>
        <span class="n">unique_ptr</span><span class="o">&lt;</span><span class="kt">int</span><span class="p">[]</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">LEN</span><span class="p">]</span> <span class="p">}</span>
    <span class="p">};</span>

    <span class="k">auto</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0u</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="k">auto</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0u</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">LEN</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">s</span><span class="p">.</span><span class="n">ints</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="o">++</span><span class="n">j</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="k">auto</span> <span class="n">sum</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="k">auto</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0u</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">LEN</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">sum</span> <span class="o">+=</span> <span class="n">s</span><span class="p">.</span><span class="n">ints</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="p">}</span>

    <span class="n">printf</span><span class="p">(</span><span class="s">"Sum! %i</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">sum</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Next, the equivalent C:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdlib.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;stdio.h&gt;</span><span class="cp">
</span>
<span class="k">typedef</span> <span class="k">struct</span> <span class="nc">S</span> <span class="p">{</span>
    <span class="kt">int</span><span class="o">*</span> <span class="n">ints</span><span class="p">;</span>
<span class="p">}</span> <span class="n">S</span><span class="p">;</span>

<span class="cp">#define LEN 50
</span>
<span class="kt">int</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">S</span> <span class="n">s</span><span class="p">;</span>
    <span class="kt">unsigned</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">sum</span><span class="p">;</span>

    <span class="n">s</span><span class="p">.</span><span class="n">ints</span> <span class="o">=</span> <span class="p">(</span><span class="kt">int</span><span class="o">*</span><span class="p">)</span><span class="n">malloc</span><span class="p">(</span><span class="n">LEN</span> <span class="o">*</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">int</span><span class="p">));</span>

    <span class="n">j</span> <span class="o">=</span> <span class="mi">0u</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0u</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">LEN</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">s</span><span class="p">.</span><span class="n">ints</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="o">++</span><span class="n">j</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="n">sum</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0u</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">LEN</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">sum</span> <span class="o">+=</span> <span class="n">s</span><span class="p">.</span><span class="n">ints</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="p">}</span>

    <span class="n">printf</span><span class="p">(</span><span class="s">"Sum! %i</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">sum</span><span class="p">);</span>

    <span class="n">free</span><span class="p">(</span><span class="n">s</span><span class="p">.</span><span class="n">ints</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>On my machine (a mid-2012 Retina MBP), I get these compile times (averaged over 5 runs of clang and clang++ each) and code sizes:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>c1.c:      37ms / 8kb
cpp1.cpp: 123ms / 8kb
</code></pre></div></div>

<p>Wow!  What’s going on?</p>

<p>First off, using clang++ to build the C version takes just as long as using clang.  That should have been obvious, but I wanted to test it anyway.</p>

<p>Secondly, if I add <code class="language-plaintext highlighter-rouge">#include &lt;memory&gt;</code> to the C version and run it through clang++, I see the same 123ms build times.</p>

<p>So, that’s probably most of the picture, but my mental model of templates is that you pay for them in two ways:</p>

<ol>
  <li>When you #include the header, you pay the time it takes for the compiler to parse that header, and</li>
  <li>you pay again to instantiate the template with a particular set of types.</li>
</ol>

<p>So, what’s the instantiation cost?</p>

<p>Let’s try this:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">S0</span> <span class="p">{</span> <span class="kt">int</span> <span class="n">a0</span><span class="p">;</span> <span class="n">S0</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="p">)</span> <span class="o">:</span> <span class="n">a0</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">{}</span> <span class="p">};</span> <span class="k">auto</span> <span class="n">u0</span> <span class="o">=</span> <span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">S0</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">new</span> <span class="n">S0</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="p">};</span>
<span class="k">struct</span> <span class="nc">S1</span> <span class="p">{</span> <span class="kt">int</span> <span class="n">a1</span><span class="p">;</span> <span class="n">S1</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="p">)</span> <span class="o">:</span> <span class="n">a1</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">{}</span> <span class="p">};</span> <span class="k">auto</span> <span class="n">u1</span> <span class="o">=</span> <span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">S1</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">new</span> <span class="n">S1</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="p">};</span>
<span class="c1">// 998 more!</span>
</code></pre></div></div>

<p>vs this</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">S0</span> <span class="p">{</span> <span class="kt">int</span> <span class="n">a0</span><span class="p">;</span> <span class="n">S0</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="p">)</span> <span class="o">:</span> <span class="n">a0</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">{}</span> <span class="p">};</span> <span class="k">auto</span> <span class="n">u0</span> <span class="o">=</span> <span class="k">new</span> <span class="n">S0</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
<span class="k">struct</span> <span class="nc">S1</span> <span class="p">{</span> <span class="kt">int</span> <span class="n">a1</span><span class="p">;</span> <span class="n">S1</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="p">)</span> <span class="o">:</span> <span class="n">a1</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">{}</span> <span class="p">};</span> <span class="k">auto</span> <span class="n">u1</span> <span class="o">=</span> <span class="k">new</span> <span class="n">S1</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="c1">// 998 more!</span>

<span class="c1">// also</span>
<span class="k">delete</span> <span class="n">u0</span><span class="p">;</span>
<span class="k">delete</span> <span class="n">u1</span><span class="p">;</span>
<span class="c1">// et cetera</span>
</code></pre></div></div>

<p>The results:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1000_structs.cpp:       1303ms / 79kb
1000_unique_ptrs.cpp:   6668ms / 185kb
</code></pre></div></div>

<p>Wow!  It looks like each <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code> costs about 5ms and 100 bytes.</p>
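
<p>For the record, that estimate is just the difference between the two builds above, spread over the 1000 pointers:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(6668ms - 1303ms) / 1000 = ~5.4ms per unique_ptr&lt;&gt;
(185kb  - 79kb)   / 1000 = ~0.106kb, a little over 100 bytes per unique_ptr&lt;&gt;
</code></pre></div></div>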

<p>First, let’s look at compile speed.  I wonder if it has to do with the number of unique (heh) <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code> instantiations, or mere utterance of the type name.  If we change all the <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code>s so they have the same type, we get:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1000_identical_unique_ptrs.cpp: 599ms / 79kb
</code></pre></div></div>

<p>Wait, what?  Why is it faster than doing it the hard way?  Shouldn’t our build times be worse because we’re asking it to expand a bunch of extra templates?</p>

<p>Also note that file size matches up with what we get when we deallocate explicitly: Our executable gets larger with the number of <em>kinds</em> of <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code>s we instantiate, but it doesn’t cost anything to use the same kind of pointer many times.  This makes sense: the full implementation shouldn’t be much more than a deleted copy constructor, a move constructor, and a destructor.</p>
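
<p>To make that concrete, here is a rough sketch of how little a <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code>-like type needs.  <code class="language-plaintext highlighter-rouge">TinyUniquePtr</code>, <code class="language-plaintext highlighter-rouge">Tracked</code>, and <code class="language-plaintext highlighter-rouge">liveAfterScope</code> are names I made up for illustration; this is not libc++’s actual implementation, which also has to deal with custom deleters, arrays, and so on.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;utility&gt;

// Hypothetical, stripped-down stand-in for std::unique_ptr&lt;T&gt;.
template &lt;typename T&gt;
class TinyUniquePtr {
public:
    explicit TinyUniquePtr(T* p = nullptr) : ptr_(p) {}

    // Deleted copy operations: exactly one owner at a time.
    TinyUniquePtr(const TinyUniquePtr&amp;) = delete;
    TinyUniquePtr&amp; operator=(const TinyUniquePtr&amp;) = delete;

    // Move constructor: steal the pointer and leave the source empty.
    TinyUniquePtr(TinyUniquePtr&amp;&amp; other) noexcept : ptr_(other.ptr_) {
        other.ptr_ = nullptr;
    }

    // Destructor: deleting a null pointer is a no-op.
    ~TinyUniquePtr() { delete ptr_; }

    T* get() const { return ptr_; }

private:
    T* ptr_;
};

// Illustrative type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns the number of live Tracked objects once both owners are out of scope.
int liveAfterScope() {
    {
        TinyUniquePtr&lt;Tracked&gt; a(new Tracked);
        TinyUniquePtr&lt;Tracked&gt; b(std::move(a));  // ownership moves from a to b
    }
    return Tracked::live;
}

// The wrapper stores nothing but the raw pointer.
static_assert(sizeof(TinyUniquePtr&lt;int&gt;) == sizeof(int*), "same size as a raw pointer");
</code></pre></div></div>

<p>Every distinct <code class="language-plaintext highlighter-rouge">T</code> stamps out its own copy of these members, which fits the per-kind growth in executable size, while reusing one <code class="language-plaintext highlighter-rouge">T</code> reuses the same instantiation.</p>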

<p>Could it be that all those <code class="language-plaintext highlighter-rouge">delete</code> statements account for the difference?  What happens if we remove them?</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1000_struct_no_free.cpp: 667ms / 63kb
</code></pre></div></div>

<p>This is unexpected: a very boring, monomorphic built-in language construct costs more to use than a template class.</p>

<p>We still need to look into the code size:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>clang++ <span class="nt">-S</span> <span class="nt">-Os</span> <span class="nt">-std</span><span class="o">=</span>c++11 1000_unique_ptrs.cpp
<span class="nv">$ </span>emacs 1000_unique_ptrs.s
</code></pre></div></div>

<p>I see this over and over:</p>

<pre><code class="language-asm">    .private_extern __ZNSt3__110unique_ptrI2S1NS_14default_deleteIS1_EEED1Ev
    .globl  __ZNSt3__110unique_ptrI2S1NS_14default_deleteIS1_EEED1Ev
    .weak_def_can_be_hidden __ZNSt3__110unique_ptrI2S1NS_14default_deleteIS1_EEED1Ev
    .align  1, 0x90
__ZNSt3__110unique_ptrI2S1NS_14default_deleteIS1_EEED1Ev: ## @_ZNSt3__110unique_ptrI2S1NS_14default_deleteIS1_EEED1Ev
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp7:
    .cfi_def_cfa_offset 16
Ltmp8:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp9:
    .cfi_def_cfa_register %rbp
    movq    %rdi, %rax
    movq    (%rax), %rdi
    movq    $0, (%rax)
    testq   %rdi, %rdi
    je  LBB1_1
## BB#2:                                ## %_ZNKSt3__114default_deleteI2S1EclEPS1_.exit.i.i
    popq    %rbp
    jmp __ZdlPv                 ## TAILCALL
LBB1_1:                                 ## %_ZNSt3__110unique_ptrI2S1NS_14default_deleteIS1_EEED2Ev.exit
    popq    %rbp
    retq
</code></pre>

<p>FYI, the tail call at the end is the global <code class="language-plaintext highlighter-rouge">operator delete</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>c++filt __ZdlPv
operator delete<span class="o">(</span>void<span class="k">*</span><span class="o">)</span>
</code></pre></div></div>

<p>It looks like the destructor isn’t being inlined.  That’s a shame.  I wasn’t able to coerce clang into inlining it, even though it already has the <code class="language-plaintext highlighter-rouge">__always_inline__</code> attribute.</p>

<h1 id="recap">Recap</h1>

<ul>
  <li>You pay a tiny bit of constant overhead just to <code class="language-plaintext highlighter-rouge">#include &lt;memory&gt;</code></li>
  <li>You pay a bit for each distinct specialization of <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code>, but it’s <strong>cheaper</strong> than what you pay for an explicit <code class="language-plaintext highlighter-rouge">delete</code> statement.</li>
  <li>You pay a bit of filesize for each kind of <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code>.</li>
  <li>It’s basically free to talk about many <code class="language-plaintext highlighter-rouge">unique_ptr&lt;&gt;</code>s of the same type.</li>
</ul>

<p><a href="https://github.com/andyfriesen/benchmark_unique_ptr">Code</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[I just watched Jonathan Blow’s proposal for a new programming language, which got me thinking about the difficulties that motivated the talk.]]></summary></entry><entry><title type="html">Haskell at IMVU</title><link href="https://andyfriesen.com/2014/03/25/what-its-like-to-use-haskell.html" rel="alternate" type="text/html" title="Haskell at IMVU" /><published>2014-03-25T00:00:00+00:00</published><updated>2014-03-25T00:00:00+00:00</updated><id>https://andyfriesen.com/2014/03/25/what-its-like-to-use-haskell</id><content type="html" xml:base="https://andyfriesen.com/2014/03/25/what-its-like-to-use-haskell.html"><![CDATA[<p>This is a copy of <a href="http://engineering.imvu.com/2014/03/24/what-its-like-to-use-haskell/">the article I wrote</a> for the IMVU Engineering Blog.</p>

<p>Since early 2013, we at IMVU have used Haskell to build several of the REST APIs that power our service.</p>

<p>When the company started, we chose PHP as our application server language, in part because the founders expected the website to be only a small part of the business!  IMVU was primarily about a downloadable 3D client.  We needed “a website or something” to give users a place to download our client from, but didn’t expect it would have to be much more than that. This shows that predicting the future is hard.</p>

<p>Years later, we have quite a lot of customers, and we primarily use PHP to serve them.  We’re big enough that we run multiple subteams on separate initiatives at the same time.  Performance is becoming important to us not just because it matters to our customers, but because it can easily make the difference between buying 4 servers and buying 40 servers to support some new feature.</p>

<p>So, early in 2012, we found ourselves ready to look for an alternative that would help us be more rigorous.  In particular, we were ready for the idea that sacrificing a tiny bit of short term, straight-line time to market might actually speed us up in the long run.</p>

<h1 id="how-we-got-here">How We Got Here</h1>

<p>I started learning Haskell in my spare time in part because Haskell seems like the exact opposite of PHP: Natively compiled, statically typed, and very principled.</p>

<p>My initial exploration left me interested in evaluating Haskell at real scale.  A year later, we did a live-fire test in which we taught multiple teammates Haskell while delivering an important new feature under a deadline.</p>

<p>Today, a lot of our backend code is still driven by PHP, but we have a growing amount of Haskell that powers newer features. The process has been exciting not only because we got to actually answer a lot of the questions that keep many people from trying Haskell, but also because it’s simply a better solution.</p>

<p>The experiment to start developing in Haskell took a lot of internal courage and dedication, and we had to overcome a number of quite rational concerns related to adopting a whole new language. Here are the main ones and how they worked out for us:</p>

<h1 id="scalability">Scalability</h1>

<p>The first thing we did was to replace a single service with a Haskell implementation.  We picked a service that was high-volume but was not mission critical.</p>

<p>We didn’t do any particular optimization of this new service, but it nevertheless showed excellent performance characteristics in the field.  Our little Haskell server was running on a pair of spare servers that were otherwise set for retirement, and despite this, each machine was handling about 20x as many requests as one of our high-spec PHP servers could manage.</p>

<h1 id="reliability">Reliability</h1>

<p>The second thing we did was to take our hands off the Haskell service and leave it running until it fell over.  It ran for months without intervention.</p>

<h1 id="training">Training</h1>

<p>After the reliability test, we were ready to try a live fire exercise, but we had to wait a bit for the right project.  We got our chance in early 2013.</p>

<p>The rules of the experiment were simple: Train 3 engineers to write the backend for an important new project and keep up with a separate frontend team.  Most of the code was to be new, so there was relatively little room for legacy complications.</p>

<p>We very quickly learned that we had also signed up for a lot of catch-up work to bring the Haskell infrastructure in line with what we’ve had for years in PHP.  We were very busy for a while, but once we got this infrastructure out of the way, the tables turned and the front-end team became the limiting factor.</p>

<p>Today, training an engineer to be productive in our Haskell code is not much harder than training someone to be productive in our PHP environment.  People who have prior functional programming knowledge seem to find their stride in just a few days.</p>

<h1 id="testing">Testing</h1>

<p>Correctness is becoming very important for us because we sometimes have to change code that predates every current developer.  We have enough users that mistakes become very costly, very quickly.  Solving these sorts of issues in PHP is sometimes achievable but always difficult.  We usually solve them with unit tests and production alerts, but these approaches aren’t sufficient for all cases.</p>

<p>Unit tests are incredible and great, but you’re always at the mercy of the level of discipline of every engineer at every moment. It’s easy to tell your teammates to write tests for everything, but this basically boils down to asking everyone to be at their very best every day.  People make mistakes and things slip through the cracks.</p>

<p>When using Haskell, we actually remove an entire class of defects that we have to write tests for. Thus, the number of tests we have to write is smaller, and thus there are fewer cases we can forget to write tests for.</p>

<p>We like unit testing and test-driven development (TDD) at IMVU and we’ve found not only that Haskell is better with TDD, but also that TDD is better with Haskell.  It takes fewer tests to get the same degree of reliability out of Haskell.  The static verification takes care of quite a lot of error checking that has to be manually implemented (or forgotten) in PHP.  The Haskell QuickCheck tool is also a wonderful help for developers.</p>

<p>The way Haskell separates pure computations from side effects lets us build something that isn’t practical with other languages: We built a custom monad that lets us “switch off” side effects in our tests.  This is incredible because it means that trying to escape the testing sandbox breaks compilation. While we have had to fight intermittent test failures for eight years in PHP (and at times have had multiple engineers simultaneously dedicated to the problem of test intermittency), our unit tests in Haskell cannot intermittently fail.</p>

<h1 id="deployment">Deployment</h1>

<p>Deployment is great. At IMVU, we do continuous deployment, and Haskell is no exception. We build our application as a statically linked executable, and rsync it out to our servers. We can also keep old versions around, so we can switch back, should a deployment result in unexpected errors.</p>

<p>I wouldn’t write an OS kernel in it, but Haskell is way better than PHP as a systems language. We needed a Memcached client for our Haskell code, and rather than try to talk to a C implementation, we just wrote one in Haskell.  It took about a half day to write and performs really well. And, as a side effect, if we ever read back some data we don’t expect from memcached (say, because of an unexpected version change) then Haskell will automatically detect and reject this data.</p>

<p>We’ve consistently found that we unmake whole classes of bugs by defining new data types for concepts to wrap primitive types like integers and strings.  For instance, we have two lines of code that say that “customer IDs” and “product IDs” are represented to the hardware as numbers, but they are not mutually convertible.  Setting up these new types doesn’t take very much work and it makes the type checker a LOT more helpful. PHP, and other popular dynamic server languages like JavaScript or Ruby, make doing the same very hard.</p>

<p>Refactoring is a breeze.  We just write the change we want and follow the compile errors.  If it builds, it almost certainly also passes tests.</p>

<h1 id="not-all-sunshine-and-rainbows">Not All Sunshine and Rainbows</h1>

<p>Resource leaks in Haskell are nasty.  We once had a bug where an unevaluated dictionary was the source of a space leak that would eventually take our servers down.  We also ran into an issue where an upstream library opened /dev/urandom for randomness, but never closed the file handle.  These issues don’t happen in PHP, with its process-per-request model, and they were more difficult to track down and resolve than they would have been in C++.</p>

<p>The Haskell package manager, Cabal, ended up getting in the way of our development. It lets you specify version ranges of particular packages you want, but it’s important for everyone on the team to have exactly the same versions of every package.  That means controlling transitive dependencies, and Cabal doesn’t really offer a way to handle this precisely. For a language that is so very principled on type algebra, it’s surprising that the package manager doesn’t follow suit regarding package versioning. Instead, we use Cabal for basic package installation, and a custom build tool (written in Haskell.)</p>

<h1 id="hiring">Hiring</h1>

<p>I’ll admit that I was very worried that we wouldn’t be able to hire great people if our criterion was expertise in an uncommon language with a comparatively sparse industrial track record, but the honest truth is that we found a great Haskell hacker in the Bay Area after about 4 days of looking.</p>

<p>We had a chance to hire him because we were using Haskell, not in spite of it.</p>

<h1 id="final-thoughts">Final Thoughts</h1>

<p>While it’s usually difficult to objectively measure things like choice of programming language or software stack, we’re now seeing fantastic, obvious productivity and efficiency gains.  Even a year later, all the Haskell code we have runs on just a tiny number of servers and, when we have to make changes to the code, we can do so quickly and confidently.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[This is a copy of the article I wrote for the IMVU Engineering Blog.]]></summary></entry><entry><title type="html">Literate Haskell and Jekyll</title><link href="https://andyfriesen.com/2014/03/23/lhs-to-jekyll-markdown.html" rel="alternate" type="text/html" title="Literate Haskell and Jekyll" /><published>2014-03-23T00:00:00+00:00</published><updated>2014-03-23T00:00:00+00:00</updated><id>https://andyfriesen.com/2014/03/23/lhs-to-jekyll-markdown</id><content type="html" xml:base="https://andyfriesen.com/2014/03/23/lhs-to-jekyll-markdown.html"><![CDATA[<p>I think I’ve more or less decided to switch my blog over to use <a href="http://pages.github.com/">Github Pages</a> and
<a href="http://jekyllrb.com/">Jekyll</a>.  It’s pretty neat and it means I can sleep safer knowing that I’m not inadvertently
inflicting unpatched PHP on some poor unsuspecting web host.</p>

<p>One of the things that’s kind of annoying, though, is that Jekyll won’t correctly handle Literate Haskell out of the
box.  You can drop code into a Markdown document and it will even syntax highlight it, but it doesn’t support any
syntax that also happens to line up with Literate Haskell.</p>

<p>Clearly, this is a problem in need of a <a href="https://github.com/andyfriesen/andyfriesen.github.io/blob/master/_lhs/lhs-to-jekyll-markdown.lhs">self-referential solution</a>.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">{-# LANGUAGE OverloadedStrings #-}</span>
<span class="kr">module</span> <span class="nn">Main</span> <span class="kr">where</span>

<span class="kr">import</span> <span class="nn">System.IO</span> <span class="p">(</span><span class="nf">stdin</span><span class="p">,</span> <span class="nf">stdout</span><span class="p">,</span> <span class="nf">hIsEOF</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Control.Monad</span> <span class="p">(</span><span class="nf">when</span><span class="p">,</span> <span class="nf">unless</span><span class="p">)</span>
<span class="kr">import</span> <span class="k">qualified</span> <span class="nn">Data.Text</span> <span class="k">as</span> <span class="n">T</span>
<span class="kr">import</span> <span class="nn">Data.Text.IO</span> <span class="p">(</span><span class="nf">hGetLine</span><span class="p">,</span> <span class="nf">hPutStrLn</span><span class="p">)</span>
</code></pre></div></div>

<p>We run a very basic state machine: each line is either part of a Literate Haskell block, or it isn’t.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">data</span> <span class="kt">State</span> <span class="o">=</span> <span class="kt">LHS</span> <span class="o">|</span> <span class="kt">Prose</span>
    <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Eq</span><span class="p">)</span>

<span class="n">processNextLine</span> <span class="o">::</span> <span class="kt">State</span> <span class="o">-&gt;</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="n">processNextLine</span> <span class="n">state</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="n">eof</span> <span class="o">&lt;-</span> <span class="n">hIsEOF</span> <span class="n">stdin</span>

    <span class="c1">-- UPDATE 3 April 2014: My program had a bug! It would not close the last</span>
    <span class="c1">-- Markdown group if the last line of the input program was Haskell code and</span>
    <span class="c1">-- not prose.</span>
    <span class="n">when</span> <span class="p">(</span><span class="n">eof</span> <span class="o">&amp;&amp;</span> <span class="n">state</span> <span class="o">==</span> <span class="kt">LHS</span><span class="p">)</span> <span class="o">$</span>
        <span class="n">hPutStrLn</span> <span class="n">stdout</span> <span class="s">"```"</span>

    <span class="n">unless</span> <span class="n">eof</span> <span class="o">$</span> <span class="kr">case</span> <span class="n">state</span> <span class="kr">of</span>
        <span class="kt">LHS</span>   <span class="o">-&gt;</span> <span class="n">processLhsLine</span>
        <span class="kt">Prose</span> <span class="o">-&gt;</span> <span class="n">processProseLine</span>
</code></pre></div></div>

<p>When looking at prose, all we need to do is watch out for lines that have bird tracks.  If it’s anything else,
spit the line out verbatim.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">processProseLine</span> <span class="o">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="n">processProseLine</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="n">line</span> <span class="o">&lt;-</span> <span class="n">hGetLine</span> <span class="n">stdin</span>
    <span class="kr">if</span> <span class="n">hasBirdTrack</span> <span class="n">line</span>
        <span class="kr">then</span>
            <span class="n">switchToLhs</span> <span class="n">line</span>
        <span class="kr">else</span> <span class="kr">do</span>
            <span class="n">hPutStrLn</span> <span class="n">stdout</span> <span class="n">line</span>
            <span class="n">processNextLine</span> <span class="kt">Prose</span>
</code></pre></div></div>

<p>LHS is pretty much identical, but we need to strip the bird track from each line as we read it.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">processLhsLine</span> <span class="o">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="n">processLhsLine</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="n">line</span> <span class="o">&lt;-</span> <span class="n">hGetLine</span> <span class="n">stdin</span>
    <span class="kr">if</span> <span class="n">hasBirdTrack</span> <span class="n">line</span>
        <span class="kr">then</span> <span class="kr">do</span>
            <span class="n">hPutStrLn</span> <span class="n">stdout</span> <span class="p">(</span><span class="n">stripBirdTrack</span> <span class="n">line</span><span class="p">)</span>
            <span class="n">processNextLine</span> <span class="kt">LHS</span>
        <span class="kr">else</span>
            <span class="n">switchToProse</span> <span class="n">line</span>
</code></pre></div></div>

<p>To flip between states, print out a line that signifies Haskell or not-Haskell, and resume in the new state.</p>

<p>It’s kind of funny that we call this sort of recursion “functional.”  In this program, it looks an awful lot like
“goto.” :)</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">switchToLhs</span> <span class="o">::</span> <span class="kt">T</span><span class="o">.</span><span class="kt">Text</span> <span class="o">-&gt;</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="n">switchToLhs</span> <span class="n">line</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="n">hPutStrLn</span> <span class="n">stdout</span> <span class="s">"```haskell"</span>
    <span class="n">hPutStrLn</span> <span class="n">stdout</span> <span class="p">(</span><span class="n">stripBirdTrack</span> <span class="n">line</span><span class="p">)</span>
    <span class="n">processNextLine</span> <span class="kt">LHS</span>

<span class="n">switchToProse</span> <span class="o">::</span> <span class="kt">T</span><span class="o">.</span><span class="kt">Text</span> <span class="o">-&gt;</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="n">switchToProse</span> <span class="n">line</span> <span class="o">=</span> <span class="kr">do</span>
    <span class="n">hPutStrLn</span> <span class="n">stdout</span> <span class="s">"```"</span>
    <span class="n">hPutStrLn</span> <span class="n">stdout</span> <span class="n">line</span>
    <span class="n">processNextLine</span> <span class="kt">Prose</span>

<span class="n">hasBirdTrack</span> <span class="o">::</span> <span class="kt">T</span><span class="o">.</span><span class="kt">Text</span> <span class="o">-&gt;</span> <span class="kt">Bool</span>
<span class="n">hasBirdTrack</span> <span class="n">line</span> <span class="o">=</span> <span class="s">"&gt;"</span> <span class="o">==</span> <span class="n">line</span> <span class="o">||</span> <span class="s">"&gt; "</span> <span class="o">==</span> <span class="kt">T</span><span class="o">.</span><span class="n">take</span> <span class="mi">2</span> <span class="n">line</span>

<span class="n">stripBirdTrack</span> <span class="o">::</span> <span class="kt">T</span><span class="o">.</span><span class="kt">Text</span> <span class="o">-&gt;</span> <span class="kt">T</span><span class="o">.</span><span class="kt">Text</span>
<span class="n">stripBirdTrack</span> <span class="n">line</span> <span class="o">=</span> <span class="kt">T</span><span class="o">.</span><span class="n">drop</span> <span class="mi">2</span> <span class="n">line</span>

<span class="n">main</span> <span class="o">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="n">main</span> <span class="o">=</span> <span class="n">processNextLine</span> <span class="kt">Prose</span>
</code></pre></div></div>

<p>And that’s it!</p>

<p><a href="https://github.com/andyfriesen/andyfriesen.github.io/blob/master/_lhs/lhs-to-jekyll-markdown.lhs">Source</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[I think I’ve more or less decided to switch my blog over to use Github Pages and Jekyll. It’s pretty neat and it means I can sleep safer knowing that I’m not inadvertently inflicting unpatched PHP on some poor unsuspecting web host.]]></summary></entry><entry><title type="html">Using C++ Macros to Inline Repetitious Code</title><link href="https://andyfriesen.com/2009/02/11/macros.html" rel="alternate" type="text/html" title="Using C++ Macros to Inline Repetitious Code" /><published>2009-02-11T00:00:00+00:00</published><updated>2009-02-11T00:00:00+00:00</updated><id>https://andyfriesen.com/2009/02/11/macros</id><content type="html" xml:base="https://andyfriesen.com/2009/02/11/macros.html"><![CDATA[<p>One of the neat things the IMVU client has is balls-awesome crash handling. We do a number of things to ensure that, any time our client crashes for any reason, we find out as much as we can about the crash. This serves a bunch of purposes: we use the raw number of crashes to determine whether or not we have broken something, and we use all the information we can get out of the crashes (stack traces, memory dumps, log files, and the like) to fix them.</p>

<p>One of the tricks to testing crash handling is that you can’t really do it without crashing. :) To that end, we have a large array of functions that exist only to crash our client. Moreover, we have parameterized them on the context in which we would like the crash to occur: with Python on the stack, without, from a WndProc, from within an exception handler, and so forth.</p>

<p>This would be a terrible source of duplicated code, but for a satanic little trick that you can do with the C preprocessor. Since I’m such a nice guy, I’ll even tell you what it is right up front, so you can skip the rest of this post if you want:</p>

<p>Macros can be macro arguments.</p>

<p>Here’s how it works:</p>

<p>First, we’re going to define a list of crashes, using the C preprocessor:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="cp">#define CRASH_LIST(F)               \
    F(WriteToNull)                  \
    F(ReadFromNull)                 \
    F(JumpToNull)                   \
    F(CallPureCall)                 \
    F(ThrowFromExceptionHandler)    \
    F(DestroySun)
</span>
</code></pre></div></div>

<p>Weird, right? Right. Now, the first thing we have to do with all of these crashes is declare functions in a header file:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define DECLARE_CRASH(T)       \
    void T();
</span>
<span class="n">CRASH_LIST</span><span class="p">(</span><span class="n">DECLARE_CRASH</span><span class="p">)</span>
</code></pre></div></div>
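
<p>If the expansion isn’t obvious, here is a small self-contained sketch of the same pattern.  The names (<code class="language-plaintext highlighter-rouge">FRUIT_LIST</code>, <code class="language-plaintext highlighter-rouge">DECLARE_FRUIT</code>, <code class="language-plaintext highlighter-rouge">COUNT_FRUIT</code>, <code class="language-plaintext highlighter-rouge">kFruitCount</code>) are made up for illustration and have nothing to do with the real client code.  The list macro just invokes whatever macro you hand it, once per entry:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstddef&gt;

// Hypothetical list, same shape as CRASH_LIST.
#define FRUIT_LIST(F) \
    F(Apple)          \
    F(Banana)         \
    F(Cherry)

// One "visitor" macro declares a function per entry...
#define DECLARE_FRUIT(T) const char* T();
FRUIT_LIST(DECLARE_FRUIT)
// ...which the preprocessor expands to:
//   const char* Apple(); const char* Banana(); const char* Cherry();

// A second visitor reuses the same list to count the entries.
#define COUNT_FRUIT(T) + 1
const std::size_t kFruitCount = 0 FRUIT_LIST(COUNT_FRUIT);  // expands to 0 + 1 + 1 + 1
</code></pre></div></div>

<p>Adding an entry to the list updates every expansion at once, which is the whole point.</p>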

<p>Next, we need to declare functions that crash within a WndProc:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define DECLARE_WNDPROC_CRASH(T)       \
    void T ## InWndProc();
</span>
<span class="n">CRASH_LIST</span><span class="p">(</span><span class="n">DECLARE_WNDPROC_CRASH</span><span class="p">)</span>
</code></pre></div></div>

<p>Then, we need to implement the crashes themselves:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">WriteToNull</span><span class="p">()</span> <span class="p">{</span>
    <span class="o">*</span><span class="p">((</span><span class="kt">int</span><span class="o">*</span><span class="p">)</span><span class="mi">0</span><span class="p">)</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">;</span>
<span class="p">}</span>

<span class="cm">/* etc */</span>
</code></pre></div></div>

<p>Yeah, we can’t use the fun macro trick this time. :( You can’t win them all, I guess. Next, crashing in WndProc:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">HWND</span> <span class="nf">createCrashWindow</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/*this is boring win32 gook */</span> <span class="p">}</span>

<span class="n">LRESULT</span> <span class="n">crasherWndProc</span><span class="p">(</span><span class="n">HWND</span> <span class="n">hWnd</span><span class="p">,</span> <span class="n">UINT</span> <span class="n">msg</span><span class="p">,</span> <span class="n">WPARAM</span> <span class="n">wParam</span><span class="p">,</span> <span class="n">LPARAM</span> <span class="n">lParam</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">msg</span> <span class="o">==</span> <span class="n">IMVU_WM_CRASH</span><span class="p">)</span> <span class="p">{</span>
        <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">crashFunc</span><span class="p">)()</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="p">)()</span><span class="o">&gt;</span><span class="p">(</span><span class="n">wParam</span><span class="p">);</span>
        <span class="n">crashFunc</span><span class="p">();</span>
        <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
        <span class="k">return</span> <span class="n">DefWindowProc</span><span class="p">(</span><span class="n">hWnd</span><span class="p">,</span> <span class="n">msg</span><span class="p">,</span> <span class="n">wParam</span><span class="p">,</span> <span class="n">lParam</span><span class="p">);</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="n">crashInWindow</span><span class="p">(</span><span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">crashFunc</span><span class="p">)())</span> <span class="p">{</span>
    <span class="n">HWND</span> <span class="n">hWnd</span> <span class="o">=</span> <span class="n">createCrashWindow</span><span class="p">();</span>
    <span class="n">PostMessage</span><span class="p">(</span><span class="n">hWnd</span><span class="p">,</span> <span class="n">IMVU_WM_CRASH</span><span class="p">,</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">WPARAM</span><span class="o">&gt;</span><span class="p">(</span><span class="n">crashFunc</span><span class="p">),</span> <span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>

<span class="cp">#define IMPLEMENT_WNDPROC_CRASH(T)  \
    void T ## InWndProc() {         \
        crashInWindow(&amp;T);          \
    }
</span>
<span class="n">CRASH_LIST</span><span class="p">(</span><span class="n">IMPLEMENT_WNDPROC_CRASH</span><span class="p">)</span>
</code></pre></div></div>

<p>And lastly, we need to be able to call all of these functions from Python:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define IMPLEMENT_BOOST_PYTHON_CRASH(T)    \
    def(#T, &amp;T);                        \
    def(#T "InWndProc", &amp;T ## InWndProc);
</span>
<span class="n">CRASH_LIST</span><span class="p">(</span><span class="n">IMPLEMENT_BOOST_PYTHON_CRASH</span><span class="p">)</span>
</code></pre></div></div>
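<p>This trick is sometimes called an X-macro. Here is a minimal, self-contained sketch of the same idea, assuming <code class="language-plaintext highlighter-rouge">CRASH_LIST</code> is a macro that applies its argument to every crash name. Everything in it (<code class="language-plaintext highlighter-rouge">DEMO_LIST</code>, <code class="language-plaintext highlighter-rouge">crashOne</code>, <code class="language-plaintext highlighter-rouge">findDemoEntry</code>) is made up for illustration and is not the IMVU code; the functions only return their own names instead of crashing.</p>

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for CRASH_LIST: applies X to each name in the list.
#define DEMO_LIST(X) \
    X(crashOne)      \
    X(crashTwo)      \
    X(crashThree)

// Expansion 1: define one function per name. Each one just reports its name.
#define DEFINE_DEMO_FUNC(T) \
    const char* T() { return #T; }
DEMO_LIST(DEFINE_DEMO_FUNC)

// Expansion 2: build a name -> function table from the very same list.
struct DemoEntry {
    const char* name;
    const char* (*func)();
};

#define DEMO_TABLE_ENTRY(T) { #T, &T },
const DemoEntry kDemoEntries[] = {
    DEMO_LIST(DEMO_TABLE_ENTRY)
};

// Looks up an entry by name; returns a null pointer if there is no match.
const DemoEntry* findDemoEntry(const char* name) {
    assert(name != 0);
    const int count = sizeof(kDemoEntries) / sizeof(kDemoEntries[0]);
    for (int i = 0; i < count; ++i) {
        if (std::strcmp(kDemoEntries[i].name, name) == 0) {
            return &kDemoEntries[i];
        }
    }
    return 0;
}
```

<p>Adding a fourth entry means touching one line in <code class="language-plaintext highlighter-rouge">DEMO_LIST</code>; the function and the table entry both follow automatically.</p>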

<p>Now, I’ll be the first one to wail in terror at how theoretically terrible the C preprocessor is, and what it does to
maintainability, but, in this case, at least, the payoff is undeniable: with very small effort, and without having to
build some kind of “CrashRegistry” framework, we can add additional crash cases to our client.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[One of the neat things the IMVU client has is balls-awesome crash handling. We do a number of things to ensure that, any time our client crashes for any reason, that we find out as much as we can about them. This serves a bunch of purposes: we use the raw number of crashes to determine whether or not we have broken something, and we use all the information we can get out of the crashes (stack traces, memory dumps, log files, and the like) to fix them.]]></summary></entry><entry><title type="html">Ghetto Closures in C++ III: Templates and Traits and Interfaces, oh my!</title><link href="https://andyfriesen.com/2009/02/10/closures3.html" rel="alternate" type="text/html" title="Ghetto Closures in C++ III: Templates and Traits and Interfaces, oh my!" /><published>2009-02-10T00:00:00+00:00</published><updated>2009-02-10T00:00:00+00:00</updated><id>https://andyfriesen.com/2009/02/10/closures3</id><content type="html" xml:base="https://andyfriesen.com/2009/02/10/closures3.html"><![CDATA[<p>Last time, we went over generating a tiny ASM thunk that could wrap a C++ method pointer (instance plus function) up
in a non-method <code class="language-plaintext highlighter-rouge">__stdcall</code> function pointer, suitable for use as, say, a win32 WndProc.  Next, I’m going to talk about
wrapping it all up in a convenient API. To do this, we’re going to need some template-fu.</p>

<p>Since I am using a modern compiler, I am going to see how close I can get to <code class="language-plaintext highlighter-rouge">boost::function</code>’s interface:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o">&lt;</span><span class="kt">void</span> <span class="p">()</span><span class="o">&gt;</span> <span class="n">fn</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">someFunction</span><span class="p">;</span>
</code></pre></div></div>

<p>I consider this interface to be pretty rad.</p>

<p><code class="language-plaintext highlighter-rouge">boost::function</code> works with only a single template argument, so we could go that route too. We could also accept that we’re doing something a bit different, and add a second parameter:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Thunk</span><span class="o">&lt;</span><span class="n">LRESULT</span> <span class="p">(</span><span class="n">Window</span><span class="o">::*</span><span class="p">)(</span><span class="n">HWND</span> <span class="n">hWnd</span><span class="p">,</span> <span class="n">UINT</span> <span class="n">msg</span><span class="p">,</span> <span class="n">WPARAM</span> <span class="n">wParam</span><span class="p">,</span> <span class="n">LPARAM</span> <span class="n">lParam</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">wndProcThunk</span><span class="p">;</span>
</code></pre></div></div>

<p>or</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Thunk</span><span class="o">&lt;</span><span class="n">Window</span><span class="p">,</span> <span class="n">LRESULT</span> <span class="p">(</span><span class="n">HWND</span> <span class="n">hWnd</span><span class="p">,</span> <span class="n">UINT</span> <span class="n">msg</span><span class="p">,</span> <span class="n">WPARAM</span> <span class="n">wParam</span><span class="p">,</span> <span class="n">LPARAM</span> <span class="n">lParam</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">wndProcThunk</span><span class="p">;</span>
</code></pre></div></div>

<p>I picked the first one, mostly because it was the first thing that popped into my head. Here is my test harness:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;cstdio&gt;</span><span class="cp">
</span><span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">;</span>

<span class="k">struct</span> <span class="nc">I</span> <span class="p">{</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">print</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">printSum</span><span class="p">(</span><span class="kt">double</span> <span class="n">y</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">C</span> <span class="o">:</span> <span class="n">I</span> <span class="p">{</span>
    <span class="n">C</span><span class="p">(</span><span class="kt">int</span> <span class="n">x</span><span class="p">)</span>
        <span class="o">:</span> <span class="n">x</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
    <span class="p">{</span> <span class="p">}</span>

    <span class="kt">void</span> <span class="n">print</span><span class="p">()</span> <span class="p">{</span>
        <span class="n">printf</span><span class="p">(</span><span class="s">"My x is %i</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">x</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kt">void</span> <span class="n">printSum</span><span class="p">(</span><span class="kt">double</span> <span class="n">y</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">printf</span><span class="p">(</span><span class="s">"%i + %f = %f</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kt">int</span> <span class="n">x</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">C</span> <span class="n">instance</span><span class="p">(</span><span class="mi">4</span><span class="p">);</span>

    <span class="n">Thunk</span><span class="o">&lt;</span><span class="kt">void</span> <span class="p">(</span><span class="n">I</span><span class="o">::*</span><span class="p">)()</span><span class="o">&gt;</span> <span class="n">thunk</span><span class="p">(</span><span class="o">&amp;</span><span class="n">instance</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">I</span><span class="o">::</span><span class="n">print</span><span class="p">);</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
    <span class="n">thunk</span><span class="p">.</span><span class="n">get</span><span class="p">()();</span>

    <span class="n">Thunk</span><span class="o">&lt;</span><span class="kt">void</span> <span class="p">(</span><span class="n">I</span><span class="o">::*</span><span class="p">)(</span><span class="kt">double</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">thunk2</span><span class="p">(</span><span class="o">&amp;</span><span class="n">instance</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">I</span><span class="o">::</span><span class="n">printSum</span><span class="p">);</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
    <span class="n">thunk2</span><span class="p">.</span><span class="n">get</span><span class="p">()(</span><span class="mf">3.14</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>First, I extracted the part of the code that actually constructs and cleans up the generated code. Everything else I’ll outline will just be scaffolding so that the interface is prettier.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">D</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">S</span><span class="p">&gt;</span>
<span class="n">D</span> <span class="nf">really_reinterpret_cast</span><span class="p">(</span><span class="n">S</span> <span class="n">s</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">char</span> <span class="n">__static_assert_that_types_have_same_size</span><span class="p">[</span><span class="k">sizeof</span><span class="p">(</span><span class="n">S</span><span class="p">)</span> <span class="o">==</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">D</span><span class="p">)];</span>

    <span class="k">union</span> <span class="p">{</span>
        <span class="n">S</span> <span class="n">s</span><span class="p">;</span>
        <span class="n">D</span> <span class="n">d</span><span class="p">;</span>
    <span class="p">}</span> <span class="n">u</span><span class="p">;</span>

    <span class="n">u</span><span class="p">.</span><span class="n">s</span> <span class="o">=</span> <span class="n">s</span><span class="p">;</span>
    <span class="k">return</span> <span class="n">u</span><span class="p">.</span><span class="n">d</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">C</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">M</span><span class="p">&gt;</span>
<span class="kt">void</span><span class="o">*</span> <span class="n">createThunk</span><span class="p">(</span><span class="n">C</span><span class="o">*</span> <span class="n">instance</span><span class="p">,</span> <span class="n">M</span> <span class="n">method</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">unsigned</span> <span class="kt">char</span> <span class="n">code</span><span class="p">[]</span> <span class="o">=</span> <span class="p">{</span>
        <span class="mh">0xB9</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span>   <span class="c1">// mov ecx, 0</span>
        <span class="mh">0xB8</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span>   <span class="c1">// mov eax, 0</span>
        <span class="mh">0xFF</span><span class="p">,</span> <span class="mh">0xE0</span>          <span class="c1">// jmp eax</span>
    <span class="p">};</span>

    <span class="c1">// YEEHAW</span>
    <span class="o">*</span><span class="p">((</span><span class="n">C</span><span class="o">**</span><span class="p">)(</span><span class="n">code</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="o">=</span> <span class="n">instance</span><span class="p">;</span>
    <span class="o">*</span><span class="p">((</span><span class="kt">void</span><span class="o">**</span><span class="p">)(</span><span class="n">code</span> <span class="o">+</span> <span class="mi">6</span><span class="p">))</span> <span class="o">=</span> <span class="n">really_reinterpret_cast</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">*&gt;</span><span class="p">(</span><span class="n">method</span><span class="p">);</span>

    <span class="kt">void</span><span class="o">*</span> <span class="n">thunk</span> <span class="o">=</span> <span class="n">VirtualAlloc</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">code</span><span class="p">),</span> <span class="n">MEM_COMMIT</span><span class="p">,</span> <span class="n">PAGE_EXECUTE_READWRITE</span><span class="p">);</span>
    <span class="n">memcpy</span><span class="p">(</span><span class="n">thunk</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">code</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">code</span><span class="p">));</span>
    <span class="n">FlushInstructionCache</span><span class="p">(</span><span class="n">GetCurrentProcess</span><span class="p">(),</span> <span class="n">thunk</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">code</span><span class="p">));</span>

    <span class="k">return</span> <span class="n">thunk</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="n">releaseThunk</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span> <span class="n">thunk</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">VirtualFree</span><span class="p">(</span><span class="n">thunk</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">MEM_RELEASE</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>To be honest, this interface isn’t all that bad: all you have to do is remember to manage the lifetime of the generated code and cast the void* you get to the right type. That’s kind of boring, though, so let’s instead see if we can make something kickass and typesafe.</p>

<p>I like objects, so let’s start with one of those.</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">M</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">Thunk</span> <span class="p">{</span>
    <span class="k">typedef</span> <span class="n">M</span> <span class="n">Method</span><span class="p">;</span>
    <span class="k">typedef</span> <span class="k">typename</span> <span class="n">methodptr_traits</span><span class="o">&lt;</span><span class="n">M</span><span class="o">&gt;::</span><span class="n">class_type</span> <span class="n">Class</span><span class="p">;</span>
    <span class="k">typedef</span> <span class="k">typename</span> <span class="n">methodptr_traits</span><span class="o">&lt;</span><span class="n">M</span><span class="o">&gt;::</span><span class="n">function_type</span> <span class="n">Function</span><span class="p">;</span>

    <span class="n">Thunk</span><span class="p">(</span><span class="n">Class</span><span class="o">*</span> <span class="n">instance</span><span class="p">,</span> <span class="n">Method</span> <span class="n">method</span><span class="p">)</span>
        <span class="o">:</span> <span class="n">ptr</span><span class="p">(</span><span class="n">createThunk</span><span class="p">(</span><span class="n">instance</span><span class="p">,</span> <span class="n">method</span><span class="p">))</span>
    <span class="p">{</span> <span class="p">}</span>

    <span class="o">~</span><span class="n">Thunk</span><span class="p">()</span> <span class="p">{</span>
        <span class="n">releaseThunk</span><span class="p">(</span><span class="n">ptr</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="n">Function</span> <span class="n">get</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span>
        <span class="k">return</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">Function</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ptr</span><span class="p">);</span>
    <span class="p">}</span>

<span class="nl">private:</span>
    <span class="n">Thunk</span><span class="p">(</span><span class="k">const</span> <span class="n">Thunk</span><span class="o">&amp;</span><span class="p">);</span>

    <span class="kt">void</span><span class="o">*</span> <span class="n">ptr</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>This is simple enough to be boring, except for this <code class="language-plaintext highlighter-rouge">methodptr_traits</code> thing.</p>

<p><code class="language-plaintext highlighter-rouge">methodptr_traits</code> is an instance of something called a traits class. Basically, it is a fancy template type that defines various other types. You can think of it as a way to code ad-hoc, compile-time type introspection.</p>

<p>If you’ve never used templates this way, the implementation is pretty intimidating:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">F</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">methodptr_traits</span><span class="p">;</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">ReturnType</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">methodptr_traits</span><span class="o">&lt;</span><span class="n">ReturnType</span> <span class="p">(</span><span class="n">T</span><span class="o">::*</span><span class="p">)()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">typedef</span> <span class="n">T</span> <span class="n">class_type</span><span class="p">;</span>
    <span class="k">typedef</span> <span class="n">ReturnType</span> <span class="p">(</span><span class="kr">__stdcall</span> <span class="o">*</span><span class="n">function_type</span><span class="p">)();</span>
<span class="p">};</span>
</code></pre></div></div>

<p>This is one of the more convoluted things one can do with templates, and I’ll be the first to admit that I think it’s a bit scary. Let’s rewind a bit and look at this in simpler terms. Say we want a boolean constant that’s true if a particular template type is an integer type. We can use template specialization to accomplish this pretty easily:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span> <span class="k">struct</span> <span class="nc">is_int</span> <span class="p">{</span> <span class="k">enum</span> <span class="p">{</span><span class="n">value</span> <span class="o">=</span> <span class="nb">false</span><span class="p">};</span> <span class="p">};</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="p">&gt;</span> <span class="k">struct</span> <span class="nc">is_int</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">enum</span> <span class="p">{</span><span class="n">value</span> <span class="o">=</span> <span class="nb">true</span><span class="p">};</span> <span class="p">};</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="p">&gt;</span> <span class="k">struct</span> <span class="nc">is_int</span><span class="o">&lt;</span><span class="kt">short</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">enum</span> <span class="p">{</span><span class="n">value</span> <span class="o">=</span> <span class="nb">true</span><span class="p">};</span> <span class="p">};</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="p">&gt;</span> <span class="k">struct</span> <span class="nc">is_int</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">enum</span> <span class="p">{</span><span class="n">value</span> <span class="o">=</span> <span class="nb">true</span><span class="p">};</span> <span class="p">};</span>
<span class="c1">// and so on</span>
</code></pre></div></div>

<p>I might leverage this code with something like the following:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">is_int</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;::</span><span class="n">value</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* stuff */</span> <span class="p">}</span>
</code></pre></div></div>

<p>and be off to the races.</p>
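<p>To see it run end to end, here is a small sketch built on the same <code class="language-plaintext highlighter-rouge">is_int</code>. The helper <code class="language-plaintext highlighter-rouge">isIntValue</code> is my own invention for the demo, and the <code class="language-plaintext highlighter-rouge">static_assert</code> lines assume a C++11 compiler. (The standard library’s <code class="language-plaintext highlighter-rouge">std::is_integral</code> in <code class="language-plaintext highlighter-rouge">&lt;type_traits&gt;</code> covers the same ground if you have it.)</p>

```cpp
#include <cassert>

// Primary template: anything not listed below is "not an int".
template <typename T> struct is_int        { enum { value = false }; };
// Full specializations flip the answer for specific types.
template <>           struct is_int<int>   { enum { value = true }; };
template <>           struct is_int<short> { enum { value = true }; };
template <>           struct is_int<char>  { enum { value = true }; };

// Hypothetical helper: T is deduced from the argument, so callers can ask
// about a value instead of spelling out its type.
template <typename T>
bool isIntValue(const T&) {
    return is_int<T>::value != 0;
}

// The answer is known at compile time, so it can also be checked there.
static_assert(is_int<int>::value != 0, "int is specialized as an integer type");
static_assert(is_int<double>::value == 0, "double falls through to the primary template");
```

<p>Calling <code class="language-plaintext highlighter-rouge">isIntValue(42)</code> picks the <code class="language-plaintext highlighter-rouge">int</code> specialization, while <code class="language-plaintext highlighter-rouge">isIntValue(3.14)</code> lands on the primary template.</p>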

<p>Cool, right? Now what if we instead wanted to know whether something is a std::vector, whatever the element type? The same principle applies, but now we have a template specialization that is itself a template:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template &lt;typename T&gt; struct is_vector { enum {value=false}; };
template &lt;typename E&gt; struct is_vector&lt;std::vector&lt;E&gt; &gt; { enum {value=true}; };
</code></pre></div></div>

<p>How to use this should be obvious:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (is_vector&lt;T&gt;::value) { /* do something that only works on std::vector */ }
</code></pre></div></div>
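<p>Here is the same thing as a compilable sketch. The helper <code class="language-plaintext highlighter-rouge">isVectorValue</code> is hypothetical and only exists so the trait can be poked at with real values. One caveat worth knowing: as written, the specialization only matches vectors that use the default allocator.</p>

```cpp
#include <cassert>
#include <vector>

// Primary template: not a vector.
template <typename T> struct is_vector { enum { value = false }; };

// Partial specialization: still a template, but it only matches
// std::vector<E> (with the default allocator) for some element type E.
template <typename E> struct is_vector<std::vector<E> > { enum { value = true }; };

// Hypothetical helper that deduces T from its argument.
template <typename T>
bool isVectorValue(const T&) {
    return is_vector<T>::value != 0;
}
```

<p>The compiler deduces <code class="language-plaintext highlighter-rouge">E</code> by pattern matching against the type you hand it, which is exactly the mechanism the traits class relies on.</p>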

<p><code class="language-plaintext highlighter-rouge">methodptr_traits</code> is just a tiny jump further. Most of the terror that this sort of thing inspires is really the fault of C++’s ridiculous function pointer syntax.</p>

<p>It is kind of a drag that C++ templates cannot express functions without specifying exactly how many arguments the function has. Because of this, a new specialization must be written for each argument count you want to support. Boost tends to support up to 10 arguments by default, which is good enough for almost everyone.</p>

<p>This is just a small example, so I only need 0 and 1 arguments. Here’s the specialization for a one-argument method:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">ReturnType</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">Arg1</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">methodptr_traits</span><span class="o">&lt;</span><span class="n">ReturnType</span> <span class="p">(</span><span class="n">T</span><span class="o">::*</span><span class="p">)(</span><span class="n">Arg1</span><span class="p">)</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">typedef</span> <span class="n">T</span> <span class="n">class_type</span><span class="p">;</span>
    <span class="k">typedef</span> <span class="n">ReturnType</span> <span class="p">(</span><span class="kr">__stdcall</span> <span class="o">*</span><span class="n">function_type</span><span class="p">)(</span><span class="n">Arg1</span><span class="p">);</span>
<span class="p">};</span>
</code></pre></div></div>
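<p>For what it’s worth, compilers that support C++11 variadic templates can fold all of those per-arity specializations into one. The following is a sketch under that assumption, not the code from this post: <code class="language-plaintext highlighter-rouge">__stdcall</code> is dropped so that it builds outside MSVC, and <code class="language-plaintext highlighter-rouge">Widget</code> and the <code class="language-plaintext highlighter-rouge">arity</code> member are made up for the demo.</p>

```cpp
#include <cassert>
#include <type_traits>

template <typename M>
struct methodptr_traits;

// One partial specialization covers every argument count.
template <typename ReturnType, typename T, typename... Args>
struct methodptr_traits<ReturnType (T::*)(Args...)> {
    typedef T class_type;
    // The post's version adds __stdcall here; omitted for portability.
    typedef ReturnType (*function_type)(Args...);
    // Extra, hypothetical member: the number of arguments.
    enum { arity = sizeof...(Args) };
};

// Hypothetical class, used only to exercise the traits.
struct Widget {
    void print();
    int scale(double factor, int steps);
};

typedef methodptr_traits<decltype(&Widget::print)> PrintTraits;
typedef methodptr_traits<decltype(&Widget::scale)> ScaleTraits;

static_assert(std::is_same<PrintTraits::class_type, Widget>::value,
              "class_type is the class the method belongs to");
static_assert(std::is_same<PrintTraits::function_type, void (*)()>::value,
              "zero-argument method maps to a zero-argument function pointer");
static_assert(std::is_same<ScaleTraits::function_type, int (*)(double, int)>::value,
              "argument and return types carry over unchanged");
```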

<p>And that’s it! With a single templatized class, we can dynamically generate an assembly thunk that works as a perfectly usable <code class="language-plaintext highlighter-rouge">__stdcall</code> function. We can pass this function pointer on to Win32, GLU, or whatever other C library we might need a callback function for.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Last time, we went over generating a tiny ASM thunk that could wrap a C++ method pointer (instance plus function) up in a non-method __stdcall function pointer, suitable for use as, say, a win32 WndProc. Next, I’m going to talk about wrapping it all up in a convenient API. To do this, we’re going to need some template-fu.]]></summary></entry><entry><title type="html">Ghetto Closures in C++ II: __thiscall</title><link href="https://andyfriesen.com/2009/02/09/closures2.html" rel="alternate" type="text/html" title="Ghetto Closures in C++ II: __thiscall" /><published>2009-02-09T00:00:00+00:00</published><updated>2009-02-09T00:00:00+00:00</updated><id>https://andyfriesen.com/2009/02/09/closures2</id><content type="html" xml:base="https://andyfriesen.com/2009/02/09/closures2.html"><![CDATA[<p>Today, we’re going to extend our little closure library to support the <code class="language-plaintext highlighter-rouge">__thiscall</code> calling convention.</p>

<p>After some poking around, I have discovered that some guy has figured this out already. Rad! I’m going to go over it quickly anyway so that I can build on it for tomorrow.</p>

<p>First, the setup/demo part changes a bit:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">I</span> <span class="p">{</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">print</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">C</span> <span class="o">:</span> <span class="n">I</span> <span class="p">{</span>
    <span class="n">C</span><span class="p">(</span><span class="kt">int</span> <span class="n">x</span><span class="p">)</span>
        <span class="o">:</span> <span class="n">x</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
    <span class="p">{</span> <span class="p">}</span>

    <span class="kt">void</span> <span class="n">print</span><span class="p">()</span> <span class="p">{</span>
        <span class="n">printf</span><span class="p">(</span><span class="s">"My x is %i</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">x</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kt">int</span> <span class="n">x</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">typedef</span> <span class="kt">void</span> <span class="p">(</span><span class="kr">__stdcall</span> <span class="o">*</span><span class="n">Function0</span><span class="p">)();</span>
</code></pre></div></div>

<p>It turns out that <code class="language-plaintext highlighter-rouge">__thiscall</code> and <code class="language-plaintext highlighter-rouge">__stdcall</code> are not all that different. All you need to do is stuff your <code class="language-plaintext highlighter-rouge">this</code> pointer in ECX, and you’re set. Our satanic little blob of assembly changes to:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">__asm</span> <span class="p">{</span>
    <span class="n">mov</span> <span class="n">ecx</span><span class="p">,</span> <span class="n">my_this</span>        <span class="c1">// B9 xx xx xx xx</span>
    <span class="n">mov</span> <span class="n">eax</span><span class="p">,</span> <span class="n">real_func</span>      <span class="c1">// B8 yy yy yy yy</span>
    <span class="n">jmp</span> <span class="n">eax</span>                 <span class="c1">// FF E0</span>
<span class="p">}</span>
</code></pre></div></div>

<p>As before, we just have to create a block of memory, plug the code and pointers into it, and execute it:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// YEEHAW</span>
<span class="o">*</span><span class="p">((</span><span class="n">I</span><span class="o">**</span><span class="p">)(</span><span class="n">code</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="o">=</span> <span class="n">instance</span><span class="p">;</span>
<span class="o">*</span><span class="p">((</span><span class="kt">void</span><span class="o">**</span><span class="p">)(</span><span class="n">code</span> <span class="o">+</span> <span class="mi">6</span><span class="p">))</span> <span class="o">=</span> <span class="n">really_reinterpret_cast</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">*&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="n">I</span><span class="o">::</span><span class="n">print</span><span class="p">);</span>

<span class="kt">void</span><span class="o">*</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">VirtualAlloc</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">code</span><span class="p">),</span> <span class="n">MEM_COMMIT</span><span class="p">,</span> <span class="n">PAGE_EXECUTE_READWRITE</span><span class="p">);</span>
<span class="n">memcpy</span><span class="p">(</span><span class="n">buffer</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">code</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">code</span><span class="p">));</span>
<span class="n">FlushInstructionCache</span><span class="p">(</span><span class="n">GetCurrentProcess</span><span class="p">(),</span> <span class="n">buffer</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">code</span><span class="p">));</span>

<span class="n">Function0</span> <span class="n">f0</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">Function0</span><span class="o">&gt;</span><span class="p">(</span><span class="n">buffer</span><span class="p">);</span>
</code></pre></div></div>
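<p>The magic offsets 1 and 6 fall straight out of the instruction encoding: each <code class="language-plaintext highlighter-rouge">mov</code> is one opcode byte followed by a four-byte little-endian immediate. The sketch below builds the same twelve bytes on any platform without executing them. The names <code class="language-plaintext highlighter-rouge">buildThunkBytes</code> and <code class="language-plaintext highlighter-rouge">writeImm32</code> are hypothetical, and the "addresses" are plain integers standing in for real 32-bit pointers.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two 5-byte mov instructions plus a 2-byte jmp.
const std::size_t kThunkSize = 12;

// Writes a 32-bit value in little-endian order, as x86 expects for immediates.
inline void writeImm32(unsigned char* dst, std::uint32_t value) {
    for (int i = 0; i < 4; ++i) {
        dst[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
    }
}

// Fills `out` with the thunk bytes. Nothing here is executed; this only
// demonstrates where each piece lands in the buffer.
inline void buildThunkBytes(std::uint32_t thisAddr, std::uint32_t funcAddr,
                            unsigned char (&out)[kThunkSize]) {
    out[0] = 0xB9;                  // mov ecx, imm32
    writeImm32(out + 1, thisAddr);  // immediate occupies bytes 1..4
    out[5] = 0xB8;                  // mov eax, imm32
    writeImm32(out + 6, funcAddr);  // immediate occupies bytes 6..9
    out[10] = 0xFF;                 // jmp eax (FF E0)
    out[11] = 0xE0;
}
```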

<p><code class="language-plaintext highlighter-rouge">really_reinterpret_cast</code> is a hack because we are doing Very Bad Things. The C++ standard says that you cannot convert
pointer-to-methods to any other kind of pointer (not even <code class="language-plaintext highlighter-rouge">void*</code>!). I was unable to dig up the exact reasoning, but I
think it is because the C++ implementation is allowed to make pointer-to-members have any format it wants. They don’t
even have to be the same size as a normal pointer.</p>
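<p>You can see the size difference for yourself with a few <code class="language-plaintext highlighter-rouge">sizeof</code> expressions. This is only a sketch with made-up class names, and the numbers depend on the compiler and ABI. As far as I know, GCC and Clang on most platforms use two pointer-sized words for every pointer-to-method, while MSVC uses a single word for single-inheritance classes and grows it when multiple or virtual inheritance is involved.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class hierarchy, just to have something to point into.
struct Base1 { virtual ~Base1() {} virtual void f() {} };
struct Base2 { virtual ~Base2() {} virtual void g() {} };
struct Derived : Base1, Base2 { void h() {} };

// Size of an ordinary data pointer.
inline std::size_t dataPointerSize() {
    return sizeof(void*);
}

// Size of a pointer to a method of a single-inheritance class.
inline std::size_t singleInheritanceMethodPointerSize() {
    return sizeof(void (Base1::*)());
}

// Size of a pointer to a method of a multiple-inheritance class.
inline std::size_t multipleInheritanceMethodPointerSize() {
    return sizeof(void (Derived::*)());
}
```

<p>Printing those three values on your own compiler is a quick way to find out whether the trick in this post has any chance of working there.</p>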

<p>But that’s boring. :D I doubt Microsoft will change how this works any time soon now that existing code depends on it, and there is potential here to do something that is very useful.</p>

<p>It turns out that the unspecified implementation that they chose was for a pointer-to-member-function to either point to the code directly (like a stdcall function), or to point to a thunk that will go to the right place, if the method is virtual. In other words, we can just jump to it and we will be jumping to the right place.</p>

<p>Here’s my evil, standard-subverting cast:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">D</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">S</span><span class="p">&gt;</span>
<span class="n">D</span> <span class="nf">really_reinterpret_cast</span><span class="p">(</span><span class="n">S</span> <span class="n">s</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">char</span> <span class="n">__static_assert_that_types_have_same_size</span><span class="p">[</span><span class="k">sizeof</span><span class="p">(</span><span class="n">S</span><span class="p">)</span> <span class="o">==</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">D</span><span class="p">)];</span>

    <span class="k">union</span> <span class="p">{</span>
        <span class="n">S</span> <span class="n">s</span><span class="p">;</span>
        <span class="n">D</span> <span class="n">d</span><span class="p">;</span>
    <span class="p">}</span> <span class="n">u</span><span class="p">;</span>

    <span class="n">u</span><span class="p">.</span><span class="n">s</span> <span class="o">=</span> <span class="n">s</span><span class="p">;</span>
    <span class="k">return</span> <span class="n">u</span><span class="p">.</span><span class="n">d</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

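<p>That weird-looking char array is an ad-hoc compile-time assertion. When the sizes match, the comparison is true and the array has length 1. When <code class="language-plaintext highlighter-rouge">sizeof(S) != sizeof(D)</code>, the comparison is false, the array length becomes 0, and since a zero-length array is illegal in standard C++, compilation fails.</p>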
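
<p>For what it’s worth, newer compilers give you a tidier way to say the same thing. The following is a sketch under assumptions, not the code this library uses: it assumes a C++11 compiler, swaps the array trick for <code class="language-plaintext highlighter-rouge">static_assert</code>, and swaps the union for <code class="language-plaintext highlighter-rouge">std::memcpy</code>, because reading the union member you did not write is undefined behaviour in C++. The name <code class="language-plaintext highlighter-rouge">really_reinterpret_cast_checked</code> is made up.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical variant of really_reinterpret_cast (assumes C++11 or later).
template <typename D, typename S>
D really_reinterpret_cast_checked(S s) {
    // Same check as the char array, but with a readable error message.
    static_assert(sizeof(S) == sizeof(D), "types must have the same size");
    // memcpy is only meaningful for trivially copyable types.
    static_assert(std::is_trivially_copyable<S>::value, "S must be trivially copyable");
    static_assert(std::is_trivially_copyable<D>::value, "D must be trivially copyable");

    D d;
    std::memcpy(&d, &s, sizeof(D));  // copy the raw bytes across
    return d;
}

// Round-trip a float through an integer of the same size.
// Assumes float is 32 bits wide, which is true on mainstream platforms.
void demo_checked_cast() {
    std::uint32_t bits = really_reinterpret_cast_checked<std::uint32_t>(2.5f);
    float back = really_reinterpret_cast_checked<float>(bits);
    assert(back == 2.5f);
}
```

<p>The nice side effect is that on a compiler where a pointer-to-member-function is wider than the destination type, the cast refuses to compile instead of quietly throwing bytes away. It is still every bit as evil as the union version once you actually call through the result.</p>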

<p>And, as before, the sweet thrill of victory:</p>

<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Function0</span> <span class="n">f0</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">Function0</span><span class="o">&gt;</span><span class="p">(</span><span class="n">buffer</span><span class="p">);</span>
<span class="n">f0</span><span class="p">();</span>
</code></pre></div></div>

<p>woo.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Today, we’re going to extend our little closure library to support the __thiscall calling convention.]]></summary></entry></feed>