Merkle Trees and Data Verification

A Merkle tree is a binary tree whose nodes store hashes of data, rather than the data itself. The leaf nodes store hashes of the data chunks, and each parent node stores the hash of the concatenation of its left and right children's hashes.

In this post, I will talk about how data verification is performed in Merkle trees.

Firstly, a client receives a root hash from a trusted server/source:

root_hash = "1234"

The client then asks peers on an untrusted peer-to-peer network to send it the data chunks associated with the root hash, and we receive the chunks:

Data Chunks: [1] [2] [3] [4]

The data chunks came from random peers on the network, so we cannot trust them. To verify them, we build a Merkle tree from the received data chunks:

   H(1234)
  /       \
 H12     H34
 / \     / \
H1 H2   H3 H4

We then compare the root hash of the tree we just built, H(1234), to the root hash we received from our trusted server. In this instance they match, which means the data has not been corrupted or tampered with.
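
To make this concrete, here is a minimal sketch in Go of the client-side check. It assumes SHA-256 as the hash function, hex-encoded digests and a power-of-two number of chunks; the names are illustrative, not from any particular library:

// merkle.go - a sketch of building a Merkle root from downloaded chunks.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

func hash(data []byte) []byte {
	sum := sha256.Sum256(data)
	return sum[:]
}

// merkleRoot hashes each chunk to form the leaves, then repeatedly
// hashes pairs of child hashes together until a single root remains.
func merkleRoot(chunks [][]byte) []byte {
	level := make([][]byte, len(chunks))
	for i, c := range chunks {
		level[i] = hash(c)
	}
	for len(level) > 1 {
		next := make([][]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			next = append(next, hash(append(level[i], level[i+1]...)))
		}
		level = next
	}
	return level[0]
}

func main() {
	rootHash := "..." // received from the trusted server
	chunks := [][]byte{[]byte("1"), []byte("2"), []byte("3"), []byte("4")}
	if hex.EncodeToString(merkleRoot(chunks)) == rootHash {
		fmt.Println("data verified")
	} else {
		fmt.Println("data corrupted or tampered with")
	}
}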

Audit Proof

In the example above we had to download all the chunks before we could verify the data. It would be better if we could verify chunks as they are downloaded from peers, rather than waiting for all the data to arrive. This is possible, and it is known as partial verification or an audit proof.

Our trusted server has the following Merkle tree:

   H(1234)
  /       \
 H12     H34
 / \     / \
H1 H2   H3 H4

The client downloads the following data chunk from the untrusted network:

Data Chunks: [3]

Whilst the client is downloading other chunks, we want to verify this chunk. So we perform the following steps:

  1. The client sends data chunk 3 to the trusted server.
  2. The server checks that Hash(3) is in the tree (it is).
  3. The server sends back hashes H(4) and H(12). We need these to compute the root hash.
  4. The client then computes:
H(34)   = Hash( H(3)  + H(4)  )
H(1234) = Hash( H(12) + H(34) )

We compare H(1234) to root_hash and they match. We did not need all the data chunks to verify chunk 3.
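
Here is what that check could look like in Go, reusing the hash helper and hex encoding from the sketch above; h4 and h12 are the sibling hashes sent back by the server, and the names are again illustrative:

// verifyChunk checks a single downloaded chunk against the trusted
// root hash, using only the sibling hashes supplied by the server.
func verifyChunk(chunk []byte, h4, h12 []byte, rootHash string) bool {
	h3 := hash(chunk)                 // H(3)
	h34 := hash(append(h3, h4...))    // H(34)   = Hash( H(3)  + H(4)  )
	root := hash(append(h12, h34...)) // H(1234) = Hash( H(12) + H(34) )
	return hex.EncodeToString(root) == rootHash
}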

What is also amazing about this process is how little information the trusted server had to send. If the number of data chunks were doubled, the server would only have to send one more hash, and the client would only have to perform one more hash calculation.

The Merkle tree is a really interesting data structure and is used in a wide range of applications, including Bitcoin, Git, and BitTorrent.

I hope you found this post helpful.

Fin.

Server-Sent Events (SSE) With Go

Server-Sent Events (SSE) technology is best explained by this article on MDN:

The EventSource interface is used to receive Server-Sent Events. It connects to a server over HTTP and receives events in text/event-stream format without closing the connection.

The connection is persistent, and communication is one-way, from the server to the client; the client cannot push data to the server. This is unlike WebSockets, where communication is bi-directional.

Some unique characteristics of SSE compared to WebSockets or long polling are:

  • Communication is one-way from server to client and read-only.
  • Requests are regular HTTP requests.
  • The client will attempt auto-reconnect if the connection drops.
  • Event messages can contain IDs, so if a client connection drops, the client can send the last ID it saw on reconnect, and the server can work out how many messages were missed and push them to the client.

There is one important fact that you should understand about SSE when used over HTTP/1 and HTTP/2, from the MDN article above:

When not used over HTTP/2, SSE suffers from a limitation to the maximum number of open connections, which can be specially painful when opening various tabs as the limit is per browser and set to a very low number (6). The issue has been marked as “Won’t fix” in Chrome and Firefox. This limit is per browser + domain, so that means that you can open 6 SSE connections across all of the tabs to www.example1.com and another 6 SSE connections to www.example2.com. (from Stackoverflow). When using HTTP/2, the maximum number of simultaneous HTTP streams is negotiated between the server and the client (defaults to 100).

SSE Server

Here is a basic example of an SSE server in Go using the eventsource package:

// sse_server.go
package main

import (
    "fmt"
    "net/http"
    "strconv"
    "time"

    "github.com/bernerdschaefer/eventsource"
)

func main() {
    es := eventsource.Handler(func(lastID string, e *eventsource.Encoder, stop <-chan bool) {
        var id int64
        for {
            select {
            case <-time.After(3 * time.Second):
                fmt.Println("sending event...")
                id++
                e.Encode(eventsource.Event{ID: strconv.FormatInt(id, 10),
                    Type: "add",
                    Data: []byte("some data")})
            case <-stop:
                return
            }
        }
    })
    http.HandleFunc("/events", func(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("Content-Type", "text/event-stream")
        w.Header().Set("Cache-Control", "no-cache")
        w.Header().Set("Connection", "keep-alive")
        es.ServeHTTP(w, r)
    })
    if e := http.ListenAndServe(":9090", nil); e != nil {
        fmt.Println(e)
    }
}

The above code sets up an endpoint /events and pushes events to clients every three seconds.
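
If you would rather not pull in a third-party package, the text/event-stream wire format is simple enough to write by hand. Here is a rough, standard-library-only sketch of the same server (the id:/event:/data: lines match the client output shown below):

// sse_stdlib.go - a sketch of the same server without the eventsource package.
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	http.HandleFunc("/events", func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		w.Header().Set("Connection", "keep-alive")
		var id int64
		for {
			select {
			case <-time.After(3 * time.Second):
				id++
				// Write one event in text/event-stream wire format.
				fmt.Fprintf(w, "id: %d\nevent: add\ndata: some data\n\n", id)
				flusher.Flush()
			case <-r.Context().Done():
				// The client disconnected.
				return
			}
		}
	})
	if e := http.ListenAndServe(":9090", nil); e != nil {
		fmt.Println(e)
	}
}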

Start the server:

$ go run sse_server.go

SSE Client

To view events pushed by the server, we will use the excellent HTTPie tool:

$ http --stream http://localhost:9090/events
HTTP/1.1 200 OK
Cache-Control: no-cache
Connection: keep-alive
Content-Type: text/event-stream
Transfer-Encoding: chunked
Vary: Accept

id: 1
event: add
data: some data

id: 2
event: add
data: some data

id: 3
event: add
data: some data

An example using HTML/JavaScript:

<!DOCTYPE html>
<html lang="en">
  <body>
    <h1>Getting server updates</h1>
    <div id="result"></div>
    <script>
      if(typeof(EventSource) !== "undefined") {
        var source = new EventSource("http://localhost:9090/events");
        source.addEventListener("add", (message) => {
          console.log(message);
          document.getElementById("result").innerHTML += message.data + "<br>";
        });
      } else {
        document.getElementById("result").innerHTML = "Sorry, your browser does not support server-sent events...";
      }
    </script>
  </body>
</html>

Save the above code into file sse_client.html, and then open it in your web browser.

You will see the event data “some data” being output on the page. If you check the console, you will see much more detailed information about the events:

add { target: EventSource, isTrusted: true, data: "some data", origin: "http://localhost:9090", lastEventId: "1", ports: Restricted, srcElement: EventSource, currentTarget: EventSource, eventPhase: 2, bubbles: false, … }

I hope you found this post helpful.

Fin

Secure Tunnels with WireGuard

WireGuard is a secure networking tunnel. It can be used for VPNs, for connecting data centers together over the Internet, or anywhere you need to connect two or more networks together in a secure way.

WireGuard is open source, and it is designed to be much simpler to configure than other tools such as OpenVPN or IPsec.

Below, we are going to connect two computers together:

Machine A IP address 192.168.0.40
Machine B IP address 192.168.0.41

First, we need to install WireGuard on each machine:

$ sudo apt install wireguard

Or if you are on macOS:

$ brew install wireguard-tools

On machine A and machine B run the following commands:

$ wg genkey | tee privatekey | wg pubkey > publickey

This generates a private key, writes it to the file privatekey (via tee), and writes the corresponding public key to the file publickey.

On machine A create the configuration file:

$ nano wg0.conf

And enter the following configuration data:

[Interface]
Address    = 10.0.0.40/24
PrivateKey = < copy & paste machine A's private key >
ListenPort = 58891

[Peer]
PublicKey  = < copy & paste machine B's public key >
AllowedIPs = 10.0.0.41/32 
Endpoint   = 192.168.0.41:56991

# if behind firewall or NAT
PersistentKeepalive = 25

On machine B create the configuration file:

$ nano wg0.conf

And enter the following configuration data:

[Interface]
Address    = 10.0.0.41/24
PrivateKey = < copy & paste machine B's private key >
ListenPort = 56991

[Peer]
PublicKey  = < copy & paste machine A's public key >
AllowedIPs = 10.0.0.40/32 
Endpoint   = 192.168.0.40:58891

# if behind firewall or NAT
PersistentKeepalive = 25

To start WireGuard, on both machines run the command:

wg-quick up ./wg0.conf

On machine A run:

ping 10.0.0.41

On machine B run:

ping 10.0.0.40

If you see output from the ping commands: success! You now have two machines with a secure tunnel between them.
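
You can also inspect the state of the tunnel at any time with the wg tool, which shows the latest handshake and transfer counters for each peer:

$ sudo wg show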

To stop WireGuard, on both machines run the command:

wg-quick down ./wg0.conf

I hope this introduction to WireGuard helped you.

Fin.

Configuring Remote Connections with PostgreSQL

If you need to access your PostgreSQL database from another machine, you need to edit the configuration to make the database listen for incoming connections from other hosts. Below are the steps on how to do this.

Note: do not do this on your production database environment, as it opens up a potential security hole. It should be fine to do on your staging environment for testing purposes.

On the machine running PostgreSQL, locate the configuration file postgresql.conf:

$ su - postgres
$ psql
postgres=# SHOW config_file;
               config_file
-------------------------------------------
 /etc/postgresql/9.6/main/postgresql.conf
(1 row)

Open the file:

$ sudo nano /etc/postgresql/9.6/main/postgresql.conf

Edit the line:

listen_addresses = 'localhost'

And append the machine's IP address, so it looks like:

listen_addresses = 'localhost, 192.168.0.40'

Note: you can get the machine's IP address with ifconfig -a.

Now we need to edit the configuration file pg_hba.conf. You can locate the file with:

$ su - postgres
$ psql
postgres=# SHOW hba_file;

Open the file:

$ sudo nano /etc/postgresql/9.6/main/pg_hba.conf

Add a line in the format host DATABASE USER CLIENT-ADDRESS METHOD, where the address is the client machine (or range) that is allowed to connect:

host    example    postgres    192.168.0.40/32    md5

Now restart PostgreSQL:

$ sudo service postgresql restart

Testing:
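
Before reaching for code, you can sanity-check the connection from the remote machine with psql (using the example host, database and user from above):

$ psql -h 192.168.0.40 -U postgres -d test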

Let’s connect to our PostgreSQL database from another machine, using Python:

import psycopg2

# Connect to an existing database
conn = psycopg2.connect("host='192.168.0.40' dbname='test' user='postgres' password='password'")
cur  = conn.cursor()
cur.execute("SELECT * FROM test;")
...
cur.close()
conn.close()

Fin

Compiling OpenVPN on the Raspberry Pi 3

Recently I needed to upgrade the version of OpenVPN on my Raspberry Pi 3 Model B. Below are the steps needed to do this; here we upgrade to OpenVPN version 2.4.8.

$ cd /tmp
$ wget https://swupdate.openvpn.org/community/releases/openvpn-2.4.8.tar.gz
$ tar xf openvpn-2.4.8.tar.gz
$ cd openvpn-2.4.8/
$ sudo apt-get install libssl-dev
$ sudo apt-get install liblzo2-dev
$ sudo apt-get install libpam0g-dev
$ ./configure --prefix=/usr
$ sudo make
$ sudo make install

Verify that the new version is installed:

$ openvpn --version
OpenVPN 2.4.8 armv7l-unknown-linux-gnueabihf 
[SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on May 28 2020
library versions: OpenSSL 1.1.1d  10 Sep 2019, LZO 2.10
Originally developed by James Yonan
Copyright (C) 2002-2018 OpenVPN Inc <sales@openvpn.net>
...

Hope this helps.

Fin.

Styling Markers on Mapbox Static Maps

Recently I was using the Python Mapbox API to generate static maps. I needed to style the markers which pinpointed locations on the map, and I found it difficult to find the necessary documentation on how to do this. In the end it simply involved adding the following fields:

  • marker-color (hex value).
  • marker-symbol (a Maki icon name, an integer, or a lowercase letter).
  • marker-size (small, medium, large).

To the JSON like so:

origin = {
    'type': 'Feature',
    'properties': {
        'name': 'Cambridge', 
        'marker-color': '#f600f6', 
        'marker-symbol': 'c',
        'marker-size': 'large',
        },
    'geometry': {
        'type': 'Point',
        'coordinates': [0.1218,52.2053]
        },
    }

Here is the full code sample:

from mapbox import StaticStyle

location1 = {
    'type': 'Feature',
    'properties': {
        'name': 'Cambridge', 
        'marker-color': '#f600f6', 
        'marker-symbol': 'c',
        'marker-size': 'large',
        },
    'geometry': {
        'type': 'Point',
        'coordinates': [0.1218,52.2053]
        },
}
    
location2 = {
    'type': 'Feature',
    'properties': {
        'name': 'Kings Cross London',
        'marker-color': '#26fae4',
        'marker-symbol': 'k',
        'marker-size': 'large',
        },
    'geometry': {
        'type': 'Point',
        'coordinates': [-0.125250,51.544180]
        }
}

features = [location1, location2]
service  = StaticStyle()
response = service.image(username='mapbox', 
                         style_id='streets-v9', 
                         retina=True,
                         attribution=False, 
                         logo=False,
                         features=features)
    
with open('map.png', 'wb') as img:
    img.write(response.content)

Note: remember to set the environment variable:

MAPBOX_ACCESS_TOKEN="MY_ACCESS_TOKEN"

before you run your script; the environment variable is read by the mapbox library.

Mapbox static maps documentation.

Fin.

Ignoring Directories When Backing up with Borg

Borg backup has an --exclude-if-present flag, which allows you to exclude a directory from a backup if it contains a certain ‘tag’ file.

So given the following directory structure, which we want to back up:

+ MyFiles
	+ dirA
		- file_one.pdf
		- file_two.pdf
	+ dirB
		- spreadsheet.xls
		- image.png

If we want to ignore dirB from being backed up by borg, we have to create a tag file in that directory:

$ cd dirB
$ touch borg-ignore-dir

Note: the tag file does not have to be named borg-ignore-dir; you can name it whatever you like.

Now we can run borg backup as follows:

$ borg create --exclude-if-present borg-ignore-dir ~/borgs/MyFiles::BackupName MyFiles/

Only dirA will be backed up.
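
You can confirm this by listing the contents of the new archive:

$ borg list ~/borgs/MyFiles::BackupName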

Useful Links:

BorgBackup Homepage

Fin.

Tar Archiving Using Relative Paths

If you want to archive a directory (using the tar command) but not include the absolute path to the directory, you can use tar's -C option, which essentially cd's into the parent directory before archiving. Here is an example:

$ tar czf /tmp/my_backup.tar.gz -C ~/home/coorp my_files

When un-archived, the directory structure will be:

  • /some_path/my_files

And not:

  • /some_path/home/coorp/my_files.
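
You can check the stored paths without extracting the archive by listing its contents (t lists, z for gzip, f for the file name); every entry should start with my_files/, not an absolute path:

$ tar tzf /tmp/my_backup.tar.gz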

Fin.

Make 'echo $PATH' More Readable

When you want to check the contents of your $PATH variable, to see which directories are in your session, you run the command:

$ echo $PATH

Output:

/usr/local/opt/coreutils/libexec/gnubin:/usr/local/bin:/usr/local/sbin:/opt:/usr/local/go/bin:/usr/local/texlive/2015basic/bin/x86_64-darwin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/opt/coreutils/libexec/gnubin:/usr/local/sbin:/opt:/usr/local/go/bin:/usr/local/texlive/2015basic/bin/x86_64-darwin

This is all well and good, but the output of the command can be hard to read if you have many directories in your path. What you want is a command that outputs the $PATH variable in an easier to read format. Luckily, this can be accomplished with a little bit of bash code:

path()
{
    p=$PATH;
    OIFS="$IFS";
    IFS=':';
    read -a paths <<< "${p}";
    IFS="$OIFS";
    for i in "${paths[@]}";
    do
        echo "- $i";
    done
}

Now if we run:

$ path

Output:

- /usr/local/opt/coreutils/libexec/gnubin
- /usr/local/bin
- /usr/local/sbin
- /opt
- /usr/local/go/bin
- /usr/local/texlive/2015basic/bin/x86_64-darwin
- /usr/local/bin
- /usr/bin
- /bin
- /usr/sbin
- /sbin
- /usr/local/opt/coreutils/libexec/gnubin
- /usr/local/sbin
- /opt
- /usr/local/go/bin
- /usr/local/texlive/2015basic/bin/x86_64-darwin

This is much easier to read. And better, it still displays the original search order of the directories.
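
If you don't need the leading dashes, a one-liner with tr gets you much the same result:

$ echo "$PATH" | tr ':' '\n'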

Fin.

Piping Commands into Go Binaries

In this post we will see how we can pipe commands into our Go applications. We will write a basic command line program which demonstrates the functionality. Before we get started, it’s worth going over the basics of piping commands.

The modular nature of the UNIX operating system allows the user to use basic commands to build up new commands, by letting the output of one command be used as the input of the next. The general form of the pipe command is:

command1 | command2 | command3

Here, command1 is piping its output into command2, which transforms the incoming data, and then pipes it into command3.

Some examples:

ls -l | grep "ninja" | less
who | sort
file * | grep text

Piping commands into your Go applications is quite easy, thanks to the excellent standard library. We will write a very basic command line program, ‘change’, which takes input and replaces any occurrence of word A with word B. Here is an example of the command in action:

$ echo "I hate Brussel sprouts!" | change "hate" "love"
I love Brussel sprouts!

Below is the code for the full program:

// change.go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	buf := &bytes.Buffer{}
	n, err := io.Copy(buf, os.Stdin)
	if err != nil {
		log.Fatalln(err)
	} else if n <= 1 { // buffer always contains '\n'
		log.Fatalln("no input provided")
	}
	if len(os.Args) != 3 {
		log.Fatalln("usage: echo \"hello world\" | change hello bye")
	}
	oldWord := os.Args[1]
	newWord := os.Args[2]
	r := bytes.Replace(buf.Bytes(), []byte(oldWord), []byte(newWord), -1)
	fmt.Println(string(r))
}

Build it:

$ go build change.go

Note: this code needs more work to make it more robust if it were to be used in production, but it is okay for demonstration purposes.
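
For example, one improvement would be to detect whether anything is actually being piped in, instead of blocking on an interactive terminal. A sketch of that check, which could go at the top of main (os.ModeCharDevice is set when stdin is a terminal):

fi, err := os.Stdin.Stat()
if err != nil {
	log.Fatalln(err)
}
if fi.Mode()&os.ModeCharDevice != 0 {
	// Stdin is a terminal, so nothing was piped in.
	log.Fatalln("usage: echo \"hello world\" | change hello bye")
}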

And that is how you pipe data into your Go applications. The output of our command can in turn be piped into other commands, as follows:

$ echo "I hate hate Brussel sprouts!" | change "hate" "love" | grep -o love | wc -l
2

I hope you enjoyed this blog post.

Fin.

Sync.Pool is Drained During Garbage Collection

The Go sync.Pool type stores temporary objects and provides Get and Put methods, allowing you to cache allocated but unused items for later reuse and relieving pressure on the garbage collector.

The purpose of the sync.Pool type is to reuse memory between garbage collections, which is why sync.Pool is drained during garbage collection (GC).

Here is an example of how to use the sync pool:

package main

import (
	"bytes"
	"fmt"
	"sync"
)

var bufPool = sync.Pool{
	New: func() interface{} {
		return new(bytes.Buffer)
	},
}

func main() {
	b := bufPool.Get().(*bytes.Buffer)
	b.WriteString("What is past is prologue.")
	bufPool.Put(b)
	b = bufPool.Get().(*bytes.Buffer)
	fmt.Println(b.String())
}

Output:

What is past is prologue.

Note: when calling bufPool.Get, it is not guaranteed that we will get a specific buffer from the pool. The Get method selects an arbitrary buffer from the pool, removes it from the pool, and returns it to the caller.
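
A related consequence: a recycled buffer may still hold whatever the previous user wrote to it, which is exactly what the example above relies on. In real code you would normally reset a buffer after getting it from the pool; a small sketch, reusing bufPool from above:

b := bufPool.Get().(*bytes.Buffer)
b.Reset() // clear any leftover contents from a previous user
b.WriteString("fresh data")
// ... use the buffer ...
bufPool.Put(b)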

As stated before, the interesting thing to note when using sync.Pool is that it is drained during GC; that is, all the objects within the pool are removed. Let's look at an example of this in action:

func main() {
	b := bufPool.Get().(*bytes.Buffer)
	b.WriteString("What is past is prologue.")
	bufPool.Put(b)

	// never actually call runtime.GC() in your program :)
	runtime.GC()

	b = bufPool.Get().(*bytes.Buffer)
	fmt.Println(b.String())
}

The program will no longer output “What is past is prologue.”, because the call to runtime.GC() drained bufPool; when we call bufPool.Get(), we get a brand new buffer and not a recycled one. (Note that this version also needs "runtime" added to the imports.)

If we look at the Go source code repository on GitHub, we see that within the file mgc.go, the gcStart function makes a call to the clearpools function, which drains all the sync.Pool types.

There are other points at which a pool could be drained, such as:

  • Sometime before a GC
  • Sometime after a GC
  • Clock based
  • Using weak references/pointers

Each of these techniques has its advantages and disadvantages. However, draining the pool during a GC is a good technique, as it is simple and does not circumvent the garbage collector.

Fin.

Access Raspberry Pi Externally using ngrok

In this blog post we will set up our Raspberry Pi so it can be accessed using SSH from outside our home network. Below is a diagram of the architecture:

[ Home Network : Raspberry Pi ] <– [ ngrok ] –> [ External Network ]

From our home network we will create a secure tunnel through ngrok, which we will then connect to from our external network. This will allow us to SSH into our Raspberry Pi and manage it.

What is ngrok?

ngrok is a fantastic tool which allows you to create secure tunnels to localhost, so you can do things like expose a local server behind a NAT or firewall to the internet. See the ngrok homepage for more information.

Let's get started!

Step 1: Enable Passwordless SSH Access

You need to configure your Pi for passwordless SSH access by following the official Raspberry Pi documentation (very carefully).

By the end, you should be able to type:

$ ssh USER@Pi-IP-ADDRESS

And connect to your Raspberry Pi without a password prompt. For example:

$ ssh pi@192.168.0.42

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sat Feb 25 20:24:30 2017 from 192.168.0.42

pi@raspberrypi ~ $ 

No prompt for a password.

Note: the computer you generated the SSH keys on (and copied them to the Pi from) is the computer you will be using to connect to your Pi from the external network.

Step 2: Install ngrok on your Raspberry Pi

Go to the ngrok website and follow the instructions to install ngrok on your Pi.

Step 3: Run ngrok on your Raspberry Pi

Run ngrok, forwarding TCP port 22 (SSH):

pi@raspberrypi ~ $ ngrok tcp 22

Step 4: Copy ngrok Host and Port

Once ngrok starts, it will display a tcp:// address on the forwarding line, for example:

Forwarding tcp://123xyz0.tcp.ngrok.io:17684 -> localhost:22

Copy the host name 123xyz0.tcp.ngrok.io and the port 17684.

Note: the host name and port will be different for you.

Step 5: Edit SSH config file

In step one you generated some SSH keys on your computer and then copied them to the Raspberry Pi. On the computer you generated the SSH keys on, add the following details to ~/.ssh/config:

Host ngrok
        HostName 123xyz0.tcp.ngrok.io
        IdentityFile ~/.ssh/id_rsa

Again, the host name will be different for you.

Note: we are using the same key we copied over to the Pi in step one; you can generate new SSH keys if you like.

Step 6: Access the Raspberry Pi from outside your network

Now with your computer not connected to your home network, type the command:

$ ssh pi@ngrok -p 17684

Change the username and port to match yours.

You should now be connected to your Raspberry Pi using SSH through ngrok.

Fin.

Raspberry Pi Slow SSH Fix

If you SSH into your Raspberry Pi and have noticed a lag when typing characters into the terminal, then the following fix may get rid of the lag (it worked for me!).

Log into your Raspberry Pi and type:

$ sudo nano /etc/ssh/sshd_config

At the bottom of the config file add:

UseDNS no

Save the file, then restart sshd:

$ sudo service ssh restart

Or better, reboot your Raspberry Pi:

$ sudo reboot

SSH’ing into your Raspberry Pi should no longer be slow after these steps. 🤞

Fin.

Runtime and User Level Panics in Go

How can our application code detect the difference between a runtime panic and a user level panic? Before answering this question, let's take a look at the difference between the two.

A runtime panic is one thrown by the Go runtime, and there are many things that can trigger one. The most common example would be attempting to index an array out of bounds: the Go runtime detects the illegal access and calls the built-in panic function.

A user level panic would be where code outside the runtime, i.e. your code, a third-party library, etc., makes a call to the built-in panic function:

panic("he's dead jim")

In Go 1.7 a change was made to the panic values thrown by the runtime. From the 1.7 release notes it states:

All panics started by the runtime now use panic values that implement both the builtin error, and runtime.Error, as required by the language specification.

When the runtime throws a panic, it calls the built-in panic function with a value which implements the runtime.Error interface. When user level code calls the built-in panic function, it provides a value which does not implement the runtime.Error interface.

Let's look at a code example:

package main

import (
	"errors"
	"fmt"
	"runtime"
)

func runtimePanic() {
	var a []int
	// index out of range, will trigger a runtime panic
	b := a[99]
	_ = b
}

func nonRuntimePanic() {
	panic(errors.New(":("))
}

func main() {
	defer func() {
		if x := recover(); x != nil {
			if _, ok := x.(runtime.Error); ok {
				fmt.Println("this is a runtime panic error")
			} else {
				fmt.Println("this is a non-runtime panic error")
			}
		}
	}()
	// nonRuntimePanic()
	runtimePanic()
}

In our deferred closure we call the built-in recover function, and then use the return value to see if its type satisfies the runtime.Error interface.

If we run this code, we will get the output:

this is a runtime panic error

If we uncomment the call to nonRuntimePanic, and then run the code again, we will get the output:

this is a non-runtime panic error

Now we can distinguish between runtime and user level panics. This information may be useful if you are determining whether to rescue your application from a panic, or to let it crash.

Useful links:

runtime.Error Documentation

Run-time panics

Go 1.7 Release Notes

Fin.

Creating Empty Git Commits

You may not know it, but Git allows you to create an empty commit in your repo. It doesn't like it, but you can do it anyway.

An empty commit is one where you don't actually commit any code changes, i.e. if you run git status on your repo, you get the message:

On branch master
nothing to commit, working tree clean

Why would you want to create an empty commit? You may want to communicate changes which have nothing to do with code: you have updated something that you want the rest of the team to know about, and you feel that communicating this via the git log makes sense.

Creating an empty commit

Start by creating a repo:

$ mkdir test 
$ cd test 
$ git init 
$ echo "hello world" >> hello.txt 
$ git add hello.txt 
$ git commit -m "initial commit" 

Now we have a repo with one commit and nothing else to commit:

$ git status
On branch master
nothing to commit, working tree clean

We can now use the --allow-empty flag to insert an empty commit:

$ git commit -m "this is an empty commit" --allow-empty
[master 02cdd7f] this is an empty commit

Let's see what the git log looks like:

$ git log
commit 02cdd7f74c0fe4219a061e94066251dbf1137475
Author: Philip J. Fry <fry@planet-express.com> 
Date:   Thu Dec 01 21:24:28 2016 +0000

    this is an empty commit

commit f503c2ccf476b144f69108a842782464dd3ca61e
Author: Philip J. Fry <fry@planet-express.com> 
Date:   Thu Dec 01 21:15:40 2016 +0000

    initial commit

We can also create an empty commit with an empty commit message, using the --allow-empty-message flag:

$ git commit --allow-empty-message --allow-empty
[master 8546295]

$ git log
commit 85462959fedf63fd4a59c578caccef4df8cceff8
Author: Philip J. Fry <fry@planet-express.com> 
Date:   Thu Dec 01 21:57:47 2016 +0000

commit 02cdd7f74c0fe4219a061e94066251dbf1137475
Author: Philip J. Fry <fry@planet-express.com> 
Date:   Thu Dec 01 21:24:28 2016 +0000

    this is an empty commit

commit f503c2ccf476b144f69108a842782464dd3ca61e
Author: Philip J. Fry <fry@planet-express.com> 
Date:   Thu Dec 01 21:15:40 2016 +0000

    initial commit

That's it! If you want to read more about the flags discussed, see the git commit documentation.

Fin.

Changing the Date of Git Commits

Have you ever looked at people's GitHub contribution timelines and seen a word or some ASCII art? This is done by creating a repo, then manipulating the dates of commits to create the desired design.

This blog post is not about creating funky pieces of art in your GitHub commit history timeline, but instead shows you how to change the date of a commit using Git.

When you make a git commit, you can actually set the date of the commit to anything you want. Let's get started.

First, you have to set two environment variables - GIT_AUTHOR_DATE and GIT_COMMITTER_DATE, like so:

$ export GIT_AUTHOR_DATE="Wed Jan 10 14:00 2099 +0100"
$ export GIT_COMMITTER_DATE="Wed Jan 10 14:00 2099 +0100"

Create a git repo:

$ mkdir test
$ cd test
$ git init
$ echo "hello world" >> hello.txt
$ git add hello.txt

Now lets commit our changes:

$ git commit -m "initial commit"
$ git log
commit 80b2d1f499d6ae455a4f87f68e107b275abfefe8
Author: Philip J. Fry <fry@planet-express.com>
Date:   Wed Jan 10 14:00:00 2099 +0100

    initial commit

You can see the date of our commit has changed: rather than being the current date/time, it is set to a day in the year 2099.

If you want to change the date of a commit in the past, I recommend reading this blog post.

Useful links:

Git commit date formats

Changing the timestamp of a previous Git commit

gitfiti - abusing github commit history for the lulz

Fin.

Using Subtests and Sub-benchmarks in Go

In this post we will walk through an example of how to use the new subtests and sub-benchmarks functionality introduced in Go 1.7.

Subtests

One of the nifty features in Go is the ability to write table driven tests. For example, if we wanted to test the function:

func Double(n int) int {
    return n * 2
}

Then we could write a table driven test as follows:

func TestDouble(t *testing.T) {
	testCases := []struct {
		n    int
		want int
	}{
		{2, 4},
		{4, 10},
		{3, 6},
	}
	for _, tc := range testCases {
		got := Double(tc.n)
		if got != tc.want {
			t.Errorf("fail got %v want %v", got, tc.want)
		}
	}
}

Note: The test case {4, 10} is present to make the test fail, 4 * 2 != 10 😃.

If we run this test, we get the following output:

$ go test -v
=== RUN   TestDouble
--- FAIL: TestDouble (0.00s)
        example_test.go:25: fail got 8 want 10
FAIL
exit status 1
FAIL    example    0.005s

The problem here is that we don’t know which table test case failed. It would be better, if we could identify a table test case, and display its name in the output if it fails.

This is what subtests in Go 1.7 allow us to do. The testing.T type now has a Run method, where the first argument is a string (the name of the test) and the second argument is a function. Below we re-implement the above test using the Run method:

func TestDouble(t *testing.T) {
	testCases := []struct {
		n    int
		want int
	}{
		{2, 4},
		{4, 10},
		{3, 6},
	}
	for _, tc := range testCases {
		t.Run(fmt.Sprintf("input_%d", tc.n), func(t *testing.T) {
			got := Double(tc.n)
			if got != tc.want {
				t.Errorf("fail got %v want %v", got, tc.want)
			}
		})
	}
}

A few things to note here: one, we are setting the name of each subtest from the ‘n’ value of its test case, so our tests are named “input_2”, “input_3” and “input_4”; and two, for the second parameter we are passing in a closure which has the same signature as a normal test function.

If we run this test, we get the following output:

$ go test -v
=== RUN   TestDouble
=== RUN   TestDouble/input_2
=== RUN   TestDouble/input_4
=== RUN   TestDouble/input_3
--- FAIL: TestDouble (0.00s)
    --- PASS: TestDouble/input_2 (0.00s)
    --- FAIL: TestDouble/input_4 (0.00s)
        example_test.go:43: fail got 8 want 10
    --- PASS: TestDouble/input_3 (0.00s)
FAIL
exit status 1
FAIL    example    0.006s

This time we get a more detailed output, we can see that “input_4” was the failing test case from the table. And the pass/fail status of each individual table test case.

We can run a subset of our table tests, by matching the unique names set for them (the first parameter to the Run method), as follows:

$ go test -v -run="TestZap/input_2"
=== RUN   TestZap
=== RUN   TestZap/input_2
--- PASS: TestZap (0.00s)
    --- PASS: TestZap/input_2 (0.00s)
PASS
ok      example    0.008s

Running many tests, by matching the test names:

$ go test -v -run="TestZap/input_[1-3]"
=== RUN   TestZap
=== RUN   TestZap/input_2
=== RUN   TestZap/input_3
--- PASS: TestZap (0.00s)
    --- PASS: TestZap/input_2 (0.00s)
    --- PASS: TestZap/input_3 (0.00s)
PASS
ok      example    0.006s

Here “input_[1-3]” matched “input_2” and “input_3” but not “input_4”.

Sub-benchmarks

Unlike table driven testing, there was no equivalent approach for benchmarking. But now in Go 1.7 we have the ability to create table driven benchmarks. Imagine we need to benchmark the following function:

func AppendStringN(s string, n int) {
	a := make([]string, 0)
	for i := 0; i < n; i++ {
		a = append(a, s)
	}
}  

We can define a top-level benchmark function like this:

func BenchmarkAppendStringN(b *testing.B) {
	benchmarks := []struct {
		fruit string
		n     int
	}{
		{fruit: "apple", n: 10},
		{fruit: "pear", n: 20},
		{fruit: "mango", n: 40},
		{fruit: "berry", n: 60},
		{fruit: "banana", n: 80},
		{fruit: "orange", n: 100},
	}
	for _, bm := range benchmarks {
		b.Run(bm.fruit, func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				AppendStringN(bm.fruit, bm.n)
			}
		})
	}
}

The Run method's signature is the same as described above, but for the testing.B type rather than the testing.T type. Our benchmark names are set to “apple”, “pear”, “mango”, etc.

If we run this benchmark, we get the following output:

$ go test -v -run="xxx" -bench=.
BenchmarkAppendStringN/apple-8           3000000               495 ns/op
BenchmarkAppendStringN/pear-8            2000000               713 ns/op
BenchmarkAppendStringN/mango-8           1000000              1101 ns/op
BenchmarkAppendStringN/berry-8           1000000              1156 ns/op
BenchmarkAppendStringN/banana-8          1000000              1803 ns/op
BenchmarkAppendStringN/orange-8          1000000              1153 ns/op
PASS
ok      example_test    9.428s

An important thing to note is that, each time b.Run is invoked it creates a separate benchmark. The outer benchmark function (BenchmarkAppendStringN) is only run once and it is not measured.

One last thing to mention: like subtests, you can run individual benchmarks by their unique names. Below, we run just the “berry” benchmark:

$ go test -v -run="xxx" -bench="/berry"
BenchmarkAppendStringN/berry-8           1000000              1143 ns/op
PASS
ok      subtestbench    1.165s

I hope you have found this blog post helpful.

Fin.

SVG Sprites

Within a single SVG file we can define many sprites. This consists of merging all your SVG sprites into a single .svg image file. Every sprite is wrapped in a ‘symbol’ tag, like this:

<svg class="character" width="100pt" height="100pt" version="1.1" xmlns="http://www.w3.org/2000/svg">
    <symbol id="circle-red" viewBox="0 0 100 100">    
      <circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red" />
    </symbol>
    <symbol id="circle-black" viewBox="0 0 100 100">    
      <circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" />
    </symbol>
</svg>

We can then use HTML or CSS to pick out each part of the image:

<html>
  <body>
    <svg class="c-red">
      <use xlink:href="test.svg#circle-red"></use>
    </svg>
    <svg class="c-black">
      <use xlink:href="test.svg#circle-black"></use>
    </svg>
  </body>
</html>

We can animate the sprite with CSS:

<style type="text/css">
    .c-black:hover {
        fill: #fe2fd0;
    }
</style>

If creating an SVG sprite file seems tedious or error prone, you can use a tool like gulp-svgstore to automate the process and generate a single SVG file from your individual sprite files.

One of the advantages of using SVG sprites is improved page load times. One of the disadvantages is that when linking the ‘use’ tag to the ‘symbol’ tag, the image gets injected into the Shadow DOM, meaning we lose some CSS capabilities and cannot apply some styling to the SVG image.

Fin.

Updating Third Party Packages in Go

Just a short post on how to update packages using go get.

To update all third party packages in your GOPATH use the following command:

go get -u all

To update a specific package, just provide the full package name to go get:

go get -u github.com/gorilla/mux

What about vendored packages? These are updated in exactly the same way as above:

go get -u my-project/vendor/megacorp/foo

If you want more information about your GOPATH, run the command:

go help gopath

Fin.

sort.Sort & sort.Stable

Go 1.6 improved the Sort function in the sort package to make fewer calls to the Less and Swap methods. Here are some benchmarks showing the performance of sort.Sort in Go 1.5 vs 1.6:

Sort []int with Go 1.5
BenchmarkSort_1-4       20000000              67.2 ns/op
BenchmarkSort_10-4      10000000               227 ns/op
BenchmarkSort_100-4       500000              3863 ns/op
BenchmarkSort_1000-4       30000             52189 ns/op

Sort []int with Go 1.6
BenchmarkSort_1-4       20000000              64.7 ns/op
BenchmarkSort_10-4      10000000               137 ns/op
BenchmarkSort_100-4       500000              2849 ns/op
BenchmarkSort_1000-4       30000             46949 ns/op

source: state of go

Sort does not use a stable sorting algorithm: it does not make any guarantees about the final order of equal values. A stable sorting algorithm is one in which items which have the same key stay in the same relative order during the sort. Mergesort and radix sort are stable, whereas quicksort, heapsort and shellsort are not. If this property is important to your application, then you may want to use sort.Stable.

Under the hood, sort.Sort uses a quicksort variant, whereas sort.Stable uses insertion sort combined with an in-place merge. Below is an example of Sort and Stable in action:

package main

import (
	"fmt"
	"sort"
)

type byLength []string

func (b byLength) Len() int           { return len(b) }
func (b byLength) Less(i, j int) bool { return len(b[i]) < len(b[j]) }
func (b byLength) Swap(i, j int)      { b[i], b[j] = b[j], b[i] }

func main() {
	values1 := []string{"ball", "hell", "one", "joke", "fool", "moon", "two"}
	sort.Sort(byLength(values1))
	fmt.Println("sort.Sort", values1)

	values2 := []string{"ball", "hell", "one", "joke", "fool", "moon", "two"}
	sort.Stable(byLength(values2))
	fmt.Println("sort.Stable", values2)
}

Output:

sort.Sort   [two one hell joke fool moon ball]
sort.Stable [one two ball hell joke fool moon]

Fin.

The Defer Statement

The Go programming language has a defer statement that allows for a function call to be run just before the currently running function returns. Here is how the defer statement is explained in the language specification:

“A “defer” statement invokes a function whose execution is deferred to the moment the surrounding function returns, either because the surrounding function executed a return statement, reached the end of its function body, or because the corresponding goroutine is panicking.”

Here is example usage of the defer statement:

func DoWork(f Foo) {
    defer f.CleanUp()
    f.DoTask()
}

The function call in the defer statement (CleanUp) happens just before the function ‘DoWork’ exits. Below I have listed some properties of the defer statement:

Defer’d functions are executed in LIFO order

Given the following program:

package main

import "fmt"

func a() {
	defer fmt.Println("a0")
	defer fmt.Println("a1")
	b()
}	
	
func b() {
	defer fmt.Println("b")
	c()
}


func c() {
	defer fmt.Println("c")
}

func main() {
	a()
}

Here is what the queue of defer’d calls looks like (the front executes first):

Front : [ fmt.Println(“c”) ][ fmt.Println(“b”) ][ fmt.Println(“a1”) ][ fmt.Println(“a0”) ]

And the output:

c
b
a1
a0

Defer’d functions execute even if the function panics

In the following program the function ‘Space’ panics, but the defer’d function queued up still executes.

func Space() {
    defer fmt.Println("I'm a rocket ship on my way to Mars")
    panic("On a collision course")
}

Output:

I'm a rocket ship on my way to Mars
panic: On a collision course

goroutine 1 [running]:
main.Space()
	/tmp/sandbox062698335/main.go:46 +0x160
main.main()
	/tmp/sandbox062698335/main.go:51 +0x20

When a panic occurs, the runtime begins to unwind the stack, executing any defer’d calls as it goes.

Arguments are evaluated when the defer statement is encountered

Take the following example:

func Foo() {
    var x int
    defer fmt.Println("value of x =", x)
    x = x + 1
    fmt.Println("value of x =", x)
}

You may expect the program to output:

value of x = 1
value of x = 1

But instead the output is:

value of x = 1
value of x = 0  // defer'd function call output

The reason for this is that before the defer’d function was queued, its arguments were evaluated and saved (x = 0). When the defer’d function was executed, rather than seeing that x == 1, it instead output the value of x it had saved previously. The defer section in the Go language specification states:

“Each time a “defer” statement executes, the function value and parameters to the call are evaluated as usual and saved anew but the actual function is not invoked.”

If we would like the value to be read when the defer’d call executes, we can wrap the function call in a closure:

func Foo() {
    var x int
    defer func() {
        fmt.Println("value of x =", x)
    }()
    x = x + 1
    fmt.Println("value of x =", x)
}

Output:

value of x = 1
value of x = 1

Defer’d functions can access named return values

In the following example (taken from the defer section of the language specification):

// f returns 1
func f() (result int) {
	defer func() {
		result++
	}()
	return 0
}

You can see that defer’d functions can access and modify named result parameters. Notice how the defer in function f is a closure; otherwise we could not capture the most up to date value of ‘result’ or modify it.

Fin.

Go Channel Axioms

A while ago I was watching a tech talk by Blake Caldwell on Building Resilient Services with Go. In his presentation he had a slide which listed Go channel axioms. I have listed his channel axioms here and provided some short code snippets to hopefully clarify them.

A send to a nil channel blocks forever

var ch chan bool
ch <- true // will always block

We declare a channel ‘ch’ but do not initialize it (with make), so it is a nil channel. We then attempt to send a value down the channel, this causes a blocking operation.

A receive from a nil channel blocks forever

var ch chan bool
v := <-ch // will always block

We declare a channel ‘ch’ but do not initialize it (with make), so it is a nil channel, We then attempt to receive a value from the channel, this causes a blocking operation.

A send to a closed channel panics

ch := make(chan bool)
...
close(ch)
...
ch <- true // panics!

We initialize a channel ‘ch’, later close the channel, and then attempt to send a value down it. This causes a panic, and we receive a stack trace. Note that there is no way to test whether a channel is closed before sending; the comma-ok form below applies to receives, where ok is false once the channel is closed and drained:

if v, ok := <-ch; ok {
    // ch is open
}

A receive from a closed channel returns the zero value immediately

ch := make(chan int)
...
close(ch)
...
v := <-ch // v = 0

We initialize a channel ‘ch’, later close the channel, and then attempt to receive a value from the channel. This causes the zero value to be returned immediately; it is a non-blocking operation.

dotGo 2015

On Monday the 9th of November I was in Paris attending dotGo, the European Go conference. This blog post is a summary of my time there.

Pre-Conference

The day before the conference, the Paris tech talks group organized a pre-conference meetup/party. A large number of delegates turned up, and the meetup consisted of about six or seven talks about how individuals at their respective companies were using Go. All the projects demoed and talked about were network related, e.g. using Go to write a load balancer which solved a particular problem. After the talks, there was a chance to socialize and eat pizza with other gophers :).

Venue

On the day of the conference I made my way down to the venue, the Théâtre de Paris. This venue was absolutely beautiful: the high ceilings, the seating, the theatre boxes and the stage made it truly great, and I was very grateful that the building owners allowed the conference to take place there.

Talks

dotGo is a single track conference which I prefer, as I always have a hard time making my mind up about which talks to attend. Here is a summary of the talks:

Microservices

This talk was given by Peter Bourgon, the creator of Gokit. The talk centered around his thoughts, opinions and efforts in getting people to adopt microservices in organizations.

gokit on github

Tools for working with Go code

This talk was given by Fatih Arslan, the creator of vim-go. This was a great talk where you were introduced to a wide range of tools that exist in the Go ecosystem. Tools such as gorename, generate and oracle were presented, along with examples of how to use them.

The Docker Trail

This talk was given by Jessica Frazelle, who is a core team member at Docker. She talked about three odd things the team at Docker noticed and how they went about debugging and fixing them.

Applied Concurrency in Go

A talk given by Matt Aimonetti, co-founder and CTO of Splice. Matt was running Go code (which used concurrency) on an Arduino, making some LEDs blink according to some rules. He ran various versions of the code, all of which contained concurrency related bugs, and fixed the bugs as he went along, showing all the mistakes we usually make when writing concurrent code.

Functional Go?

A talk given by the excellent Francesc Campoy Flores, a member of the Google Go team. This talk concentrated on his efforts to use Go in a functional manner, after his experiences with Haskell. The functional code he wrote for the problem he was trying to solve was scary; he really was actively working against the language (something he admitted). It was a fun talk though.

The Other Side of Go: Programming Pictures

This talk was given by Anthony Starks, the creator of SVGo. It was great to see what people were doing with Go outside of the network related projects we always see. Anthony was using Go to generate SVG for all sorts of things. One important point that stayed with me was about dissecting a complex image into just lines and arcs, which allows you to build up a replica image (using SVGo) from just these primitives. He also spoke about his project with great passion.

Gomobile

David Crawshaw, the creator of gomobile, spoke about the challenges of getting Go up and running on mobile platforms. Some of the topics he discussed were Go's calling convention and threading (goroutines, OS threads and CPUs).

gomobile on github

A Tour of the Bleve

Marty Schoch, the creator of Bleve, gave an excellent talk on the open-source full-text search library for Go. Marty presented great code examples alongside his talk, which really helped clarify the points he was trying to get across. He also spoke about how members of the community have contributed to the project. This was probably my favorite talk, and I also learned how to pronounce Bleve :)

bleve homepage

Simplicity is Complicated

A talk given by the excellent Rob Pike, co-creator of Go. He stated that even though Go is a very simple language, a lot of the complexity is hidden behind the scenes. He talked about how simplicity and complexity are part of the design, and how finding the right balance is a challenging task. This was a good talk, and while listening I thought about the go keyword.

go someFunction()

This in Go is a simple way to get concurrency into your program and all the complexity of scheduling is hidden behind the scenes.

There was also a series of ten minute lightning talks; one of these was given by Brad Fitzpatrick about HTTP/2 in Go. HTTP/2 support will be available in Go 1.6.

The full list of dotGo videos can be found online.

Overall I really enjoyed my time at dotGo, and Paris is an amazing and buzzing city. I will never forget standing on my hotel balcony and listening to a trumpeter playing for tips, with the noise of the city in the background.

Fin.

tmux Cross Platform Config.

If, like me, you use tmux on both Linux and OS X, then managing your tmux configuration can be a pain. The problem is that some configuration is specific to each OS, such as copy and paste behavior. This blog post will show you how to manage your tmux configuration across platforms in a better way.

The first thing you want to do is create a dot file for each platform, so for Linux create .tmux-linux.conf and for OS X create .tmux-osx.conf. Create and place these files in the same location as your .tmux.conf file (most likely your home directory).

Now move any OS specific settings out of .tmux.conf into .tmux-osx.conf and .tmux-linux.conf respectively.

In your .tmux.conf file add the line:

if-shell "uname | grep -q Darwin" 'source-file ~/.tmux-osx.conf' \
'source-file ~/.tmux-linux.conf'

Basically, when tmux reads in its configuration, if the OS is ‘Darwin’ (OS X) then it will read .tmux-osx.conf, otherwise it will read .tmux-linux.conf.

Note: the above code snippet assumes that the three *.conf files are placed in your home directory; if they are not, change the paths.

Things to do before committing Go code

This blog post will list some of the basic things you should really do before committing Go code into your repository.

1) Run gofmt/goimports

gofmt is probably the most popular Go tool amongst gophers. The job of gofmt is to format Go packages; your code will be formatted to be consistent across your code base. For example, if we have the following if statement:

if x == 42 { fmt.Println("The answer to everything") }

Then running gofmt on this will format the code to:

if x == 42 { 
    fmt.Println("The answer to everything") 
}

goimports is a tool that does exactly what gofmt does, but takes it a step further and adds/removes import statements as needed. For example, if you had the following code:

package main

import (
    "log"
)

func main() {
    fmt.Println("Hello")
}

As you can see from this code, package ‘log’ is not used anywhere and package ‘fmt’ has not been imported. Running goimports on this code transforms it to:

package main

import (
    "fmt"
)

func main() {
    fmt.Println("Hello")
}

Running either of these tools on your code is a must; it's great to see a code base using common idioms and a consistent format/style. To me, it's one of the main things that makes Go code bases so much more approachable than code bases in other languages.
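
Both tools take a -w flag, which writes the result back to the source files instead of printing it, so you can format a whole package in place:

$ gofmt -w .
$ goimports -w .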

2) Run golint

golint will lint your source code and make suggestions concerning coding style.

see golint on github.

3) Run go vet

go vet is another important tool; it's perfectly summed up by its documentation:

“Vet examines Go source code and reports suspicious constructs, such as Printf calls whose arguments do not align with the format string. Vet uses heuristics that do not guarantee all reports are genuine problems, but it can find errors not caught by the compilers.”

If we run go vet on the following code:

fmt.Printf("%s", 42)

We would get the following error:

test.go:6: arg 42 for printf verb %s of wrong type: int
exit status 1

go vet differs from golint: go vet is concerned with correctness, while golint is concerned with coding style.

see here for more information on vet.

4) Run build/install/run with -race flag:

There is a fully integrated race detector in the Go toolchain, with sophisticated race detection algorithms to help you hunt down those hard to find concurrency related bugs.

To enable the race detector in your code, add the -race flag on the command line:

$ go test -race mypkg    // test the package
$ go run -race mysrc.go  // compile and run the program
$ go build -race mycmd   // build the command
$ go install -race mypkg // install the package

Note: Your code will run slower when you enable this flag, as the race detector is busy doing its thing :)

5) Run a dependency management tool

It's very likely that your project will make use of third-party libraries, and if you need to capture their versions you should run a dependency management tool.

I have no real experience with these tools myself, but instead use the Go 1.5 vendor experiment feature to capture dependencies.

I hope this list helps you in some way in managing your Go projects. Remember, this is just a basic list; Go has numerous other tools, some of which are probably essential to you or your project. The ones listed above are what I believe are the basics.

Golang UK Conference 2015

On Friday the 21st of August I attended the Golang UK Conference 2015, held at the amazing Brewery in London. This post is a short write up of my time at the conference.

This was my first ever conference so apart from the talks, I did not know what else to expect. Overall though, I found the conference was excellent and I met a wide range of interesting people.

Turnout

The turnout was quite big with over 250 delegates from all over the world.

Go usage

Of the people I met, no one was really using Go in a big way, but most had an API or a small server component written in Go. However, everyone was expecting to increase their Go usage throughout the coming year.

Talks

The conference consisted of 2 tracks, a main track and a side track. The talks I went to:

  • Opening Keynote
  • Crossing the Language Chasm
  • Building API’s
  • Complex Concurrency Patterns in Go
  • Code Analysis
  • Understanding Memory Allocation
  • Whispered Secrets
  • The Go Community

Out of all the talks, the best one was Code Analysis presented by Francesc Campoy.

Food :)

The food served at the conference was absolutely amazing. Many of the delegates said that it was the best food of any conference they had been to.

Fin.

Gofmt and Rewrite Rules

One thing I absolutely love about Go is its tooling support. Whenever I use the numerous tools I always discover something new. In this short post I will be showing off gofmt’s -r flag, this flag allows you to apply a rewrite rule to your source before formatting.

A rewrite rule is a string in the following format:

pattern -> replacement

Both pattern and replacement must be valid Go expressions (more on this later). Let's apply a simple rewrite to the following code:

// test1.go
package main

import (
    "fmt"
)

func main() {
    foo := "Hello World"

    fmt.Println(foo)
}

The following rewrite rule changes the variable name from ‘foo’ to ‘bar’:

$ gofmt -r='foo -> bar' test1.go

Output:

// test1.go
package main

import (
    "fmt"
)

func main() {
    bar := "Hello World"

    fmt.Println(bar)
}

We will now apply a more powerful rule to the below code:

// test2.go
package main

func main() {
    vals := make([]int, 0)

    vals = append(vals, 15)
    vals = append(vals, 17)
    vals = append(vals, 23)

    slice := vals[1:len(vals)]

    _ = slice
}

The line:

slice := vals[1:len(vals)]

Is not very idiomatic Go, so let's change it:

$ gofmt -r='a[b:len(a)] -> a[b:]' test2.go

Output:

// test2.go
package main

func main() {
    vals := make([]int, 0)

    vals = append(vals, 15)
    vals = append(vals, 17)
    vals = append(vals, 23)

    slice := vals[1:]

    _ = slice
}

As you can see the code was correctly transformed. Notice how the rule used the characters ‘a’ and ‘b’. If your rule uses single-character lowercase identifiers, then these will serve as wild-cards matching arbitrary sub-expressions; these expressions will be substituted for the same identifiers in the replacement. So the rule:

-r='a[b:len(a)] -> a[b:]'

Would match:

x := vals[1:len(vals)] // vals[1:]
y := nums[5:len(nums)] // nums[5:]

Where on the first match:

‘a’ would be substituted for ‘vals’
‘b’ would be substituted for ‘1’

And on the second match:

‘a’ would be substituted for ‘nums’
‘b’ would be substituted for ‘5’
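
Another rule you will often see removes redundant parentheses around an expression (file.go here is just a placeholder for any source file):

$ gofmt -r='(a) -> a' file.go

Note that gofmt prints the rewritten source to standard output by default; pass the -w flag to overwrite the file in place.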

An important thing to remember when using the -r flag is that the resulting transformation must be a syntactically valid declaration list, statement list, or expression. So the following rule:

-r='a[b:len(a)] -> a[const]'

Would be syntactically incorrect (const is a reserved keyword), and you would get the error:

parsing replacement a[const] at 1:4: expected operand, found 'const' 

If you would like to learn more about rewrite rules then run:

$ godoc gofmt

Note: godoc gofmt reports the following at the end:

BUGS

The implementation of -r is a bit slow.

:)

Note: I am using Go version 1.4.2

Go and String Concatenation

When writing Go code you should try to stay away from concatenating strings using the ‘+’ and ‘+=’ operators.

Strings in Go, like in many other languages (Java, C#, etc.), are immutable; once a string has been created it is impossible to change its contents. Here is what the Go Programming Language Specification has to say about the string type:

“A string type represents the set of string values. A string value is a (possibly empty) sequence of bytes. Strings are immutable: once created, it is impossible to change the contents of a string. The predeclared string type is string.”
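
You can observe this immutability directly; trying to modify a string in place will not compile:

s := "Spring"
s[0] = 'X' // compile error: cannot assign to s[0]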

Let’s look at an example:

var town string = "Spring"
town += "field"

When the above code runs, a brand new sequence of bytes is created and assigned to the variable ‘town’. The old string “Spring” is then eligible for garbage collection. Here is what the heap would look like after the above code has run:

           "Spring"

town ----> "Springfield"

Now if we concatenate the string ‘town’ with another string:

town = "742 Evergreen Terrace " + town

The heap would look like this:

           "Spring"

           "Springfield"

town ----> "742 Evergreen Terrace Springfield"

If you concatenate a lot of strings using ‘+’ and ‘+=’ you will generate a lot of garbage. This makes the garbage collector work harder, as all those dead strings need to be analyzed and freed.

There are two ways to concatenate strings more efficiently:

1. strings.Join()

town := strings.Join([]string{"Spring", "field"}, "")

2. bytes.Buffer

func concat(vals ...string) string {
    var buffer bytes.Buffer // a growable byte buffer; no new string per append
    for _, s := range vals {
        buffer.WriteString(s)
    }
    return buffer.String() // one final string allocation
}

func main() {
    town := concat("Spring", "field")

    names := []string{"Homer", "Moe", "Barney", "Carl", "Lenny"}

    friends := concat(names...)
    ...
}

You can use either; I prefer the ‘concat’ helper as I find it easier to read.

These two methods are much more efficient because, behind the scenes, they allocate a variable-sized buffer of bytes which can be modified over and over again without leaving behind a lot of unused strings.
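
If you want to measure the difference yourself, below is a rough benchmark sketch; the file and benchmark names are my own:

// concat_test.go
package main

import (
    "bytes"
    "testing"
)

func BenchmarkPlusEquals(b *testing.B) {
    for i := 0; i < b.N; i++ {
        s := ""
        for j := 0; j < 100; j++ {
            s += "x" // allocates a new string on every iteration
        }
        _ = s
    }
}

func BenchmarkBuffer(b *testing.B) {
    for i := 0; i < b.N; i++ {
        var buf bytes.Buffer
        for j := 0; j < 100; j++ {
            buf.WriteString("x") // appends to a growable byte buffer
        }
        _ = buf.String()
    }
}

Run it with:

$ go test -bench=.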

Calling C from Go

This post will show you the basics of how to call C code from a Go package.

Let’s get started with an example:

Create a header file “add.h” with a function prototype:

#ifndef _ADD_H_
#define _ADD_H_
int add(int, int);
#endif

Create the source file “add.c” containing the definition for add:

#include "add.h"

int add(int a, int b) 
{
   return a + b;
}

Create a Go package “main.go”:

package main

// #include "add.c"
import "C"

import (
    "fmt"
)

func main() {
    r := C.add(40, 2)
    fmt.Println("result = ", r)
}

Output:

$ go build main.go
$ ./main
result = 42

Notice the comment above the import “C” statement: we include add.c, not add.h.

If import “C” is immediately preceded by comments, then those comments become a part of the compilation process. If a blank line separates a comment from the import statement, that comment is treated as a normal Go comment, for example:

// #include <math.h>

// #include <stdio.h>
// #include <errno.h>
import "C"

The header files stdio.h and errno.h are included as part of the compilation process; math.h is not.
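
As an aside, including a .c file from the preamble is a little unusual. A more common layout, sketched below under the assumption that add.c sits in the same package directory, includes only the header; the go tool compiles any .c files in a cgo package’s directory automatically. Note that this requires building the package directory (a plain go build), not a single file:

package main

// #include "add.h"
import "C"

import (
    "fmt"
)

func main() {
    r := C.add(40, 2)
    fmt.Println("result = ", r)
}

$ go build   # builds the whole package directory, compiling add.c too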

Inline C

You can also write C code directly in the comments, here is an example:

package main

/*
int fortytwo()
{
    return 42;
}
*/
import "C"

import (
    "fmt"
)

func main() {
    fmt.Println(C.fortytwo)     // address
    fmt.Println(C.fortytwo())   // invocation
}

Output:

$ go build inline.go
$ ./inline
0x40014a0
42
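
The preamble is not limited to numeric types either. As a sketch of my own (the greet function below is not from the earlier add example), a Go string can be passed to C with C.CString, which copies it to the C heap; the copy must be freed manually:

package main

/*
#include <stdlib.h>
#include <stdio.h>

void greet(const char *name)
{
    printf("hello, %s\n", name);
}
*/
import "C"

import "unsafe"

func main() {
    cname := C.CString("gopher")        // copies the Go string to the C heap
    defer C.free(unsafe.Pointer(cname)) // C memory is not garbage collected
    C.greet(cname)
}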

Accessing C structs

Assume we have the below C struct defined in a .h or .c file:

struct point {
    int x;
    int y;
};

In order to access this from a Go package, you simply prefix the type name “point” with “C.struct_”:

func main() {
    p := C.struct_point{}
    p.x = 99
    p.y = 42
    fmt.Printf("type:   %T\n", p)
    fmt.Printf("struct: %+v\n", p)
}

Output:

$ go build cstruct.go
$ ./cstruct
type:   main._Ctype_struct_point
struct: {x:99 y:42}
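
One thing worth noting (my own observation, not shown in the output above): the struct fields come back as C types such as C.int, so mixing them with Go values requires an explicit conversion:

sum := int(p.x) + int(p.y) // C.int values converted to Go int
fmt.Println(sum)           // 141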

Controlling the behaviour of the C compiler

You can pass flags to the C compiler to control its behaviour. This is done by defining CFLAGS with a #cgo directive in the preamble comments. In the example below, the -H flag will be passed to the C compiler (clang in my case) when it’s invoked. The -H flag tells clang to show the header includes and nesting depth.

package main

// #cgo CFLAGS: -H
// #include "add.c"
import "C"

import (
    "fmt"
)

func main() {
    r := C.add(40, 2)
    fmt.Println("result = ", r)
}

Output:

$ go build main.go
!! this output is from clang writing to stdout because we passed the -H flag !!
. ./add.c
.. ./add.h
. /usr/include/errno.h
.. /usr/include/sys/errno.h
... /usr/include/sys/cdefs.h
.... /usr/include/sys/_symbol_aliasing.h
.... /usr/include/sys/_posix_availability.h
...

$ ./main
result = 42

Peeking behind the scenes

What happens when we build a Go package that includes an import “C” statement?

Firstly, “C” is a pseudo-package; it is not part of the standard library. When the Go compiler sees the import “C” statement it runs the cgo command, which generates all the supporting infrastructure. It transforms main.go, outputting some .h and .c files, and these are then passed to clang to compile.

Note: the cgo command actually invokes the gcc compiler; however, on my machine gcc is an alias for clang, so it’s clang that is doing the compiling.

The output from the C compiler is an object file named _cgo_.o, which contains the compiled C code. This object file is then linked into the rest of the Go binary.

Let’s run the cgo command directly to get a better understanding:

$ go tool cgo main.go

This command will output a directory called _obj:

$ ls _obj/  

_cgo_.o
_cgo_defun.c
_cgo_export.c
_cgo_export.h
_cgo_flags
_cgo_gotypes.go
_cgo_main.c
main.cgo1.go
main.cgo2.c

As stated, _cgo_.o contains the compiled C code. One interesting file to browse is main.cgo2.c:

$ cat main.cgo2.c

#line 10 "/go/src/samples/main.go"

 #include "add.c"

// Usual nonsense: if x and y are not equal, the type will be invalid
// (have a negative array count) and an inscrutable error will come
// out of the compiler and hopefully mention "name".
#define __cgo_compile_assert_eq(x, y, name) typedef char name[(x-y)*(x-y)*-2+1];

// Check at compile time that the sizes we use match our expectations.
#define __cgo_size_assert(t, n) __cgo_compile_assert_eq(sizeof(t), n, _cgo_sizeof_##t##_is_not_##n)

__cgo_size_assert(char, 1)
__cgo_size_assert(short, 2)
__cgo_size_assert(int, 4)
typedef long long __cgo_long_long;
__cgo_size_assert(__cgo_long_long, 8)
__cgo_size_assert(float, 4)
__cgo_size_assert(double, 8)

extern char* _cgo_topofstack(void);

#include <errno.h>
#include <string.h>

void
_cgo_7474d4d504ba_Cfunc_add(void *v)
{
    struct {
        int p0;
        int p1;
        int r;
        char __pad12[4];
    } __attribute__((__packed__)) *a = v;
    char *stktop = _cgo_topofstack();
    __typeof__(a->r) r = add(a->p0, a->p1);
    a = (void*)((char*)a + (_cgo_topofstack() - stktop));
    a->r = r;
}

Take a while to browse the other files; they are quite interesting.

I hope you enjoyed this post.

Note: I am on OSX 10.10.4 64 bit using go version 1.4.2 and clang version 602.0.53.

First Post

This is my blog. There are many like it, but this one is mine.