Reveal type in Crystal

06 Mar 2023

Brian J. Cardiff

Recently I came across reveal_type from Sorbet as a way to inspect the type of an expression (thanks, Brian Hicks). I wondered whether it could be ported to Crystal. You can jump to the Conclusions section if you want to copy-paste the good-enough™️ solution into your project.

Asking for the type of an expression is reasonable: when the program compiles, the compiler knows the answer for sure.

Let’s start working with a relatively simple example grabbed from Sorbet’s documentation.

def maybe(x, default)
  # what's x type here?
  if x
    x
  else
    default
  end
end

def sometimes_a_string
  rand > 0.5 ? "a string" : nil
end

maybe(sometimes_a_string, "a default value")

Existing Solution: puts debug

Debugging the execution of a program using printf/print/puts is widely used. In Crystal we could write some variation of:

puts "x = #{x.inspect} : #{x.class}"
#
# Output:
#
# x = "a string" : String
#
# or
#
# x = nil : Nil

That will show, when the program is executed, the actual value and runtime type of x. But we don’t want the runtime type; we want the compile-time type. A more accurate alternative is to use typeof, either interpolated in puts or through pp!:

puts "x = #{x.inspect} : #{typeof(x)}"
#
# Output:
#
# x = "a string" : (String | Nil)
#
# or
#
# x = nil : (String | Nil)

pp! typeof(x)
#
# Output:
#
# typeof(x) # => (String | Nil)
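To see the difference in one self-contained program, here is a minimal sketch that reuses sometimes_a_string from the example above. The top-level variable x is only there for illustration.

```crystal
def sometimes_a_string
  rand > 0.5 ? "a string" : nil
end

x = sometimes_a_string

# Runtime type: String on some runs, Nil on others.
puts x.class

# Compile-time type: the same union on every run.
puts typeof(x)
```

The first line changes from run to run, while the second one is fixed at compile time, which is the information we are after.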

Existing Solution: context tool

About 8 years ago Crystal gained some built-in tooling and one of those tools would give us exactly the information we are looking for.

Assuming the previous def maybe is defined at the beginning of a program.cr we could use the context tool as follows:

% crystal tool context -c program.cr:2:3 program.cr
1 possible context found

| Expr    | Type         |
--------------------------
| x       | String | Nil |
| default |    String    |

It gives us a bit more than we asked for, since it shows the type of every variable and parameter in scope at the given cursor position.

It also uses only compile-time information: unlike the puts approach, the program is never executed.

Unfortunately, the tool will not print the type of an arbitrary expression unless we first assign it to a variable.

The context tool is a separate implementation that relies on the compiler but essentially traverses the whole compiled program. There are some edge cases that are currently not handled.

The most important shortcoming, I think, is the developer experience: it’s not great unless it’s integrated with an editor.

Adding `reveal_type` to Crystal

The developer experience of Sorbet’s reveal_type is awesome:

The modifications needed to the program are simple and can be applied to any valid expression.
The same type checker is the one that shows the information.
There is no need to discover an internal tool command.
There is no need for additional editor integration.
It supports multiple reveal_type mentions in one pass.

I want the same for Crystal 🙂.

def maybe(x, default)
  reveal_type(x) # what's x type here?
  if x
    x
  else
    default
  end
end

% crystal build program.cr
Revealed type program.cr:2:15
  x : String | Nil

Or some similar output.

Before hacking the compiler I wanted to see whether it could be done in user code.

Crystal macros can print at compile time, and they have access to the expression’s AST.

The following will give us the first part of the message.

macro reveal_type(t)
  {%- loc = "#{t.filename.id}:#{t.line_number}:#{t.column_number}" %}
  {%- puts "Revealed type #{loc.id}" %}
  {%- puts "  #{t.id}" %}
  {{t}}
end

If we try to get the compile-time type of the expression t we will stumble on the infamous “can’t execute TypeOf in a macro”.

In program.ign.cr:4:26

  9 | {% puts "  #{t.id} : #{typeof(t)}" %}
                            ^
Error: can't execute TypeOf in a macro

To overcome this we can use the fact that defs can have macro code.

def reveal_type_helper(t : T) : T forall T
  {%- puts "   : #{T}" %}
  t
end

macro reveal_type(t)
  {%- loc = "#{t.filename.id}:#{t.line_number}:#{t.column_number}" %}
  {%- puts "Revealed type #{loc.id}" %}
  {%- puts "  #{t.id}" %}
  reveal_type_helper({{t}})
end

With this snippet at the top of our program.cr we already have the output we want. 🎉

% crystal build program.cr
Revealed type /path/to/program.cr:14:15
  x
  : (String | Nil)

Unfortunately, if we add multiple reveal_type invocations, things will not work as expected: the macro code in reveal_type_helper is executed only once per distinct type.
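A minimal sketch of the problem, calling the helper directly (the three calls are made up for illustration). Based on the behavior described above, the compile-time puts is expected to run once for String and once for Int32, not once per call.

```crystal
def reveal_type_helper(t : T) : T forall T
  {% puts "   : #{T}" %}
  t
end

# First call with a String: the def is instantiated and the macro code runs.
reveal_type_helper("a")

# Same argument type: the existing instantiation is reused, so nothing new is printed.
reveal_type_helper("b")

# New argument type: a new instantiation, so the macro code runs again.
reveal_type_helper(1)
```

At run time the helper simply returns its argument, so the surrounding program keeps working while the types are printed during compilation.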

To force a different reveal_type_helper instance for each reveal_type invocation we would need a distinct type for each. Surprisingly we can do that.

def reveal_type_helper(t : T, l) : T forall T
  {%- puts "   : #{T}" %}
  t
end

macro reveal_type(t)
  {%- loc = "#{t.filename.id}:#{t.line_number}:#{t.column_number}" %}
  {%- puts "Revealed type #{loc.id}" %}
  {%- puts "  #{t.id}" %}
  reveal_type_helper({{t}}, { {{loc.tr("/:.","___").id}}: 1 })
end

The l argument will be a named tuple of type { <loc>: Int32 } where <loc> is an identifier that depends on the invocation location of the reveal_type macro. 🤯
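To see why this works, here is a small sketch (the two keys are hypothetical locations, written out by hand): named tuples with different keys are different types, even when the values have the same type. It also shows what the tr call does to a location string, using the runtime String#tr, which is assumed here to behave like the macro method of the same name.

```crystal
# Turn a location into something usable as a named tuple key.
loc = "/path/to/program.cr:2:15"
puts loc.tr("/:.", "___") # => _path_to_program_cr_2_15

# Two named tuples whose only difference is the key name.
first = {_path_to_program_cr_2_15: 1}
second = {_path_to_program_cr_6_15: 1}

# Different keys mean different types, hence different helper instantiations.
puts typeof(first) == typeof(second) # => false
```

Note that this trick assumes the file path only contains characters that are valid in an identifier once /, : and . are replaced.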

Caveats

There are a few more caveats to this solution worth mentioning. A proper built-in feature in the compiler would not be affected by any of them. Essentially, they can be summed up as:

The reveal_type call needs to be within code that is actually used
Our implementation is very sensitive to the compiler’s internal execution order
It doesn’t handle fully recursive definitions
It could change the semantics of the program since it affects the memory layout

Feel free to skip to the next section unless you want further details and examples of each.

Due to how the Crystal compiler works, the reveal_type needs to appear within statically reachable code. Even if you start in a def whose arguments have types (type restrictions, actually), that def needs to be called; otherwise the compiler ignores it. This is similar to how C++ templates are not expanded unless they are used.

Splitting the desired output between a macro and a def makes it very sensitive to the compiler’s execution order. The following code suffers from this; in its output, both locations are printed before either type:
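A minimal sketch of this behavior, independent of reveal_type. The method names are made up for illustration, and this_method_does_not_exist is deliberately not defined on Int32. The program is still expected to compile, because never_called is never invoked and so its body is never type-checked.

```crystal
# Never called: the compiler does not instantiate it, so the bogus call goes unnoticed.
def never_called(x : Int32)
  x.this_method_does_not_exist
end

# Called below: this def is instantiated and type-checked as usual.
def called(x : Int32)
  x + 1
end

puts called(1) # => 2
```

A reveal_type placed inside never_called would stay silent for the same reason.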

"a".tap do |a|
  reveal_type a
end

1.tap do |a|
  reveal_type a
end

Revealed type /path/to/program.cr:2:15
  a
Revealed type /path/to/program.cr:6:15
  a
    : String
    : Int32

Recursive programs can hit an edge case that hides the output of reveal_type_helper. The following program has several dig_first instantiations. The reveal_type macro is expanded once per instantiation, but all of the reveal_type_helper invocations have the same t type and the same location, so we run again into the problem we solved earlier with the { <loc>: Int32 } parameter.

def dig_first(xs)
  case xs
  when Nil
    nil
  when Enumerable
    reveal_type(dig_first(xs.first))
  else
    xs
  end
end

dig_first([[1,[2],3]])

Revealed type /path/to/program.cr:6:17
  dig_first(xs.first)
Revealed type /path/to/program.cr:6:17
  dig_first(xs.first)
Revealed type /path/to/program.cr:6:17
  dig_first(xs.first)
Revealed type /path/to/program.cr:6:17
  dig_first(xs.first)
    : (Int32 | Nil)

We mostly care about the compile-time experience here, but the reveal_type_helper invocation duplicates value-type values and could change the semantics of the running program.

struct SmtpConfig
  property host : String = ""
end

struct Config
  property smtp : SmtpConfig = SmtpConfig.new
end

config = Config.new
config.smtp.host = "example.org"

pp! config # => Config(@smtp=SmtpConfig(@host="example.org"))

If we add a reveal_type around config.smtp:

reveal_type(config.smtp).host = "example.org"

we will alter the program output:

config # => Config(@smtp=SmtpConfig(@host=""))

We could write an alternative reveal_type implementation that preserves the memory layout, but it fails even to compile the previous recursive program. Either way, this is that variation:

def reveal_type_helper(t : T, l) : Nil forall T
  {%- puts "   : #{T}" %}
end

macro reveal_type(t)
  {%- loc = "#{t.filename.id}:#{t.line_number}:#{t.column_number}" %}
  {%- puts "Revealed type #{loc.id}" %}
  {%- puts "  #{t.id}" %}
  %t = uninitialized typeof({{t}})
  reveal_type_helper(%t, { {{loc.tr("/:.","___").id}}: 1 })
  {{t}}
end

That’s it, no more caveats I can think of!

Ideas for the compiler

It’s definitely feasible to implement a better reveal_type in the compiler. For a start, it would require reserving the method name for the compiler.

Since it would not require a user-defined macro/method, it would not suffer from the issues described in the caveats.

But maybe there is something intermediate we can do to allow more use cases in the future.

In the reveal_type macro we needed to show the expression and its location. This is something that is already done in AST#raise. Unfortunately it’s not possible to tweak the output and it’s always treated as a compiler error:

macro reveal_type(t)
  {%- t.raise "Lorem ipsum" %}
end

In program.cr:24:1

  24 | reveal_type(config.smtp).host = "example.org"
      ^----------
Error: Lorem ipsum

If we would like to keep reveal_type defined in user code, I think it would be nice to have something similar to AST#raise that only prints information. It could have the location, expression and ^------- already solved, while allowing us to customize the message. Additionally:

It could allow multiple informational invocations and not abort the compilation as AST#raise does.
It could have access to some additional information, such as the final node type, by having a specific execution life cycle: AST#at_exit_info, for example.
It could be used to experiment with additional compile-time tooling (e.g. checking whether a database and a model are up to date with one another).

Some of these ideas resonate a lot with requests made by paulcsmith of Lucky on how to extend the compiler’s behavior. I expect that something like AST#at_exit_info or AST#info would be useful in that regard.

Conclusions

The final shape of our solution can easily be added to a Crystal app for development purposes.

def reveal_type_helper(t : T, l) : T forall T
  {%- puts "   : #{T}" %}
  t
end

macro reveal_type(t)
  {%- loc = "#{t.filename.id}:#{t.line_number}:#{t.column_number}" %}
  {%- puts "Revealed type #{loc.id}" %}
  {%- puts "  #{t.id}" %}
  reveal_type_helper({{t}}, { {{loc.tr("/:.","___").id}}: 1 })
end

If we have an expression like foo(bar.baz) in which we are uncertain about the type of bar we can:

Surround bar with reveal_type as in foo(reveal_type(bar).baz)
Build the program as usual.
See the compiler output as

Revealed type /path/to/program.cr:14:15
  bar
    : (String | Nil)

As mentioned earlier, there are a few caveats with this solution, but I think it works for the vast majority of cases. It was great to extend the compiler tooling, somehow, via user code only.

It would be great to get feedback on whether a proper built-in alternative to this would be valuable.
