Protobuf two ways: Code generation and reflection with Go


(Example code for this blog post lives at eggybytes/protobuf-two-ways.)

We love protocol buffers. In our work, we depend on their out-of-the-box functionality:

  • to define core types and use them consistently across our server/web/mobile stack,
  • to serialize our data efficiently in transit and at rest,
  • to generate robust distributed-systems primitives for free with gRPC,
  • and for much else.

Protobuf has reasonably good implementations for most of the languages we use (Go on the server, TypeScript and JS on the web, Swift/Python/Kotlin/Rust/C++ for iOS/ML/Android/low-level stuff/masochism respectively), and it also exposes hooks where we can customize generated and runtime code to better suit our use case.

A new Go API for protobuf was recently released that exposes much richer protobuf functionality than the previous API, with a lovely developer experience. We’d recommend it wholeheartedly to anyone writing Go.

In case it’s helpful, we’ve written up two examples of how to use the new API to extend your protobufs and make them more powerful.

A companion repo with compilable/runnable versions of the examples below is on GitHub at eggybytes/protobuf-two-ways, and shows how the pieces fit together. (It also demonstrates building your protos with Bazel, which has made our protobuf experience much smoother.)

1. Extend compile-time functionality: code generation with protogen

One powerful thing about protobuf is that it’s “just” statically-generated code: your service- and message-definition files get compiled into the language or other output of your choice with the protoc protobuf compiler. This means that protobuf functionality is fully inspectable and fast at runtime.

The previous Go API for protobufs exposed an internal package for writing plugins for the protobuf compiler. This was used internally to implement the Go protobuf and gRPC compilers, but also allowed programmers to extend or replace the code generated by default if they were willing to use the internal package. The new Go protobuf API makes this functionality publicly available and pleasant to use.

There are powerful libraries to aid in writing protobuf compiler plugins across languages, like protoc-gen-star by the lovely @rodaine (who is a treasure, and who taught me about protobuf code generation and many other things in the first place). If you’re generating Go code, it’s also easy to write custom generators with the protogen API alone.

For example, we might want the protobuf compiler to generate mock client classes for our services to help with testing.

Suppose we have this service definition:

In example.proto (defined by us):
service EggDeliveryService {
  option (annotations.client_mock) = true;

  rpc OrderEgg (OrderEggRequest) returns (OrderEggResponse);
}

By default, this will generate a client interface:

In example_grpc.pb.go (generated by protoc-gen-go-grpc):
// EggDeliveryServiceClient is the client API for EggDeliveryService service.
//
// For semantics around ctx use and closing/ending streaming RPCs, please refer to https://pkg.go.dev/google.golang.org/grpc/?tab=doc#ClientConn.NewStream.
type EggDeliveryServiceClient interface {
	OrderEgg(ctx context.Context, in *OrderEggRequest, opts ...grpc.CallOption) (*OrderEggResponse, error)
}

For ease of testing, we want to generate a client mock that conforms to the same interface, so we can mock out and test features that use the service.

What we want to appear in example.pb.custom.go (generated by our plugin):
package example

import (
	context "context"
	mock "github.com/stretchr/testify/mock"
	grpc "google.golang.org/grpc"
)

// MockEggDeliveryServiceClient is a mock EggDeliveryServiceClient which
// satisfies the EggDeliveryServiceClient interface.
type MockEggDeliveryServiceClient struct {
	mock.Mock
}

func NewMockEggDeliveryServiceClient() *MockEggDeliveryServiceClient {
	return &MockEggDeliveryServiceClient{}
}

func (c *MockEggDeliveryServiceClient) OrderEgg(ctx context.Context, in *OrderEggRequest, opts ...grpc.CallOption) (*OrderEggResponse, error) {
	args := c.Called(ctx, in)
	if args.Get(0) == nil {
		return nil, args.Error(1)
	}
	return args.Get(0).(*OrderEggResponse), args.Error(1)
}

To generate these mocks every time we invoke the protobuf compiler, we can write a short plugin using google.golang.org/protobuf/compiler/protogen. A protoc plugin is a binary that reads a serialized CodeGeneratorRequest from stdin (the parsed .proto files, the names of the files to generate, and any plugin parameters) and writes a serialized CodeGeneratorResponse containing the generated files to stdout. This is kind of a weird interface, so protogen helps us satisfy it by providing helpers to wrap inputs and outputs, as well as access to parsed protobuf type information. For example, a full plugin that generates a static but useful comment might look like:

In example main.go (written by hand):
package main

import (
	"flag"
	"fmt"

	"google.golang.org/protobuf/compiler/protogen"
)

func main() {
	var flags flag.FlagSet

	protogen.Options{
		ParamFunc: flags.Set,
	}.Run(func(plugin *protogen.Plugin) error {
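		for _, f := range plugin.Files {
			// plugin.Files also includes imported dependencies; only generate
			// output for the files protoc was asked to generate
			if !f.Generate {
				continue
			}

			filename := fmt.Sprintf("%s.pb.custom.go", f.GeneratedFilenamePrefix)
			g := plugin.NewGeneratedFile(filename, f.GoImportPath)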

			// Code generation here
			g.P("package ", f.GoPackageName)
			g.P()
			g.P("// don't forget to take some time to take a sip of water please")
		}

		return nil
	})
}

If this is run on our input example.proto, it yields output:

In example.pb.custom.go (generated by our plugin):
package example

// don't forget to take some time to take a sip of water please

Not close to what we want yet. We can extend this to iterate through the services in the passed-in files; for example, we could generate a helper function for every service with code like:

In the example main.go above (written by hand):
func main() {
	...
		for _, f := range plugin.Files {
			...

			// Code generation here
			g.P("package ", f.GoPackageName)
			g.P()
			g.P("// don't forget to take some time to take a sip of water please")

			// Iterate through all the `service`s in the passed-in file
			for _, svc := range f.Services {
				// For each service, generate a function that prints its name
				g.P()
				g.P("// Here is a useful function")
				g.P(fmt.Sprintf("func %sNamePrinter() {", svc.GoName))
				g.P(fmt.Sprintf(`	log.Println("I'm the printer for the service named %s")`, svc.GoName))
				g.P("}")
			}

			// Referencing an identifier from another package tells protogen to add
			// an import for that package ("log" here) to the generated file
			g.QualifiedGoIdent(protogen.GoIdent{GoName: "log", GoImportPath: "log"})
		}

		return nil
	})
}

This would now generate, if run on our input example.proto, the output:

In example.pb.custom.go (generated by our plugin):
package example

import (
	log "log"
)

// don't forget to take some time to take a sip of water please

// Here is a useful function
func EggDeliveryServiceNamePrinter() {
	log.Println("I'm the printer for the service named EggDeliveryService")
}

To finish implementing our client mock generator, we iterate one level deeper, through each service’s Methods (a slice of *protogen.Method), to generate a fully-formed mock.

The full example is in the companion repo, and generates the client mock code we wanted.

2. Extend runtime functionality: reflection with protoreflect

Sometimes code generation isn’t enough: sometimes you want to be able to inspect your types at runtime. The new Go protobuf API includes a rich reflection package, protoreflect, that provides a view of types and values from the protobuf type system.

For example, you might want a function that sanitizes requests received from a client by replacing any empty primitive values with nil (in proto2), both at the top level and recursively in nested messages. This might look like:

In go/reflect/clean.go (written by hand; runs at runtime not at compile-time):
// Clean replaces every zero-valued primitive field with a nil value. It recurses
// into nested messages, so nested primitives are cleaned too
func Clean(pb proto.Message) proto.Message {
	m := pb.ProtoReflect()

	m.Range(cleanTopLevel(m))

	return pb
}

func cleanTopLevel(m protoreflect.Message) func(protoreflect.FieldDescriptor, protoreflect.Value) bool {
	return func(fd protoreflect.FieldDescriptor, v protoreflect.Value) bool {
		// Repeated and map fields hold a List or Map rather than a single value, so the
		// scalar accessors below would panic on them. Leave them untouched
		if fd.IsList() || fd.IsMap() {
			return true
		}

		switch kind := fd.Kind(); kind {
		case protoreflect.BoolKind:
			if fd.Default().Bool() == v.Bool() { m.Clear(fd) }
		case protoreflect.Int32Kind, protoreflect.Sint32Kind, protoreflect.Sfixed32Kind, protoreflect.Int64Kind, protoreflect.Sint64Kind, protoreflect.Sfixed64Kind:
			if fd.Default().Int() == v.Int() { m.Clear(fd) }
		case protoreflect.Uint32Kind, protoreflect.Fixed32Kind, protoreflect.Uint64Kind, protoreflect.Fixed64Kind:
			if fd.Default().Uint() == v.Uint() { m.Clear(fd) }
		case protoreflect.FloatKind, protoreflect.DoubleKind:
			if fd.Default().Float() == v.Float() { m.Clear(fd) }
		case protoreflect.StringKind:
			if fd.Default().String() == v.String() { m.Clear(fd) }
		case protoreflect.BytesKind:
			if len(v.Bytes()) == 0 { m.Clear(fd) }
		case protoreflect.EnumKind:
			if fd.Default().Enum() == v.Enum() { m.Clear(fd) }
		case protoreflect.MessageKind, protoreflect.GroupKind:
			// Range only visits populated fields, so v holds a real nested message here
			nested := v.Message()
			nested.Range(cleanTopLevel(nested))
		}

		return true
	}
}

We can configure this further by defining an optional proto annotation on fields:

extend google.protobuf.FieldOptions {
  // If true, tells Clean() function in go/reflect not to clean this field
  optional bool do_not_clean = 80001;
}

We can then set this annotation value to true for any fields we don’t want to be Clean()ed:

message OrderEggRequest {
  optional string name = 1;
  optional string description = 2 [(annotations.do_not_clean) = true];
  optional int32 num_eggs = 3;
  optional bool with_shell = 4;
  optional Recipient recipient = 5;
}

And exclude it in our cleaning:

In go/reflect/clean.go (written by hand; runs at runtime not at compile-time):
...

func cleanTopLevel(m protoreflect.Message) func(protoreflect.FieldDescriptor, protoreflect.Value) bool {
	return func(fd protoreflect.FieldDescriptor, v protoreflect.Value) bool {

		// Skip cleaning any fields that are annotated with do_not_clean
		opts := fd.Options().(*descriptorpb.FieldOptions)
		if proto.GetExtension(opts, annotations.E_DoNotClean).(bool) {
			return true
		}

		// Otherwise, set any empty primitive fields to nil. For non-primitive fields, recurse down
		// one level with this function
		...
	}
}

What else

We extend protobufs at eggybytes to enforce access control, generate code for database access, and do many other things that involve repetitive but critical code. It’s extremely handy, and lets us ensure we have unified behavior around our types across all parts of our stack. Let us know if you’d like to dig deeper into protobuf, Bazel, Go, or anything else!