Thriftgo IDL Trimmer

Use the IDL trimming tool to remove unnecessary IDL definitions

Introduction

Thriftgo is a Thrift IDL parser and code generator implemented in Go. It supports comprehensive Thrift IDL syntax and semantic checks. Compared to the official Golang code generated by Apache Thrift, Thriftgo has made some bug fixes and supports plugin mechanisms, allowing users to customize the generated code according to their needs. The code generation tool for Kitex is one of the plugin implementations of Thriftgo.

IDL repositories that are not regularly maintained tend to accumulate unrelated content. As versions iterate, this content grows, which inflates the generated Golang code and makes the IDL harder to read. An IDL may also pull in excessive references, so irrelevant structures end up in the generated code, producing oversized output or even errors during code generation. These problems are common in complex projects and can slow down development.

Thriftgo provides IDL trimming starting from v0.3.1. The feature trims structures that are not referenced by any service, which removes generated code that RPC calls neither use nor depend on.

The feature can be used on its own through the Trimmer tool provided in the Thriftgo project, or together with the Thriftgo/Kitex command line to filter out unused content during code generation.

Trimming Principle

  • Traverse all services included in the given IDL file and mark every struct that is directly or indirectly referenced by a service or method as “used”.
  • Scan all IDLs directly or indirectly referenced by this project. Any struct that is not marked as “used” is pruned.
  • Since businesses often use IDL to introduce enums and constants, the tool does not trim typedefs, enums, and constants in the IDL.
  • The trimmed result is output as an IDL file with the .thrift extension or as generated Golang code.
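
The marking and pruning steps above amount to a reachability pass over the reference graph. The following is a minimal sketch of that idea, not Thriftgo's implementation: the IDL type, its fields, and the function names are hypothetical, and a parsed IDL is reduced to plain maps of names.

```go
package main

import (
	"fmt"
	"sort"
)

// IDL is a toy model of a parsed Thrift file set. All names here are
// hypothetical and do not mirror Thriftgo's internal AST.
type IDL struct {
	// Services maps "Service.Method" to the structs it references directly.
	Services map[string][]string
	// Structs maps a struct name to the structs used by its fields.
	Structs map[string][]string
}

// markUsed returns the set of structs reachable from any service method.
func markUsed(idl IDL) map[string]bool {
	used := map[string]bool{}
	var visit func(name string)
	visit = func(name string) {
		if used[name] {
			return // already marked; also guards against reference cycles
		}
		if _, ok := idl.Structs[name]; !ok {
			return // not a struct known to this model
		}
		used[name] = true
		for _, dep := range idl.Structs[name] {
			visit(dep)
		}
	}
	for _, refs := range idl.Services {
		for _, name := range refs {
			visit(name)
		}
	}
	return used
}

// trim returns the sorted names of the structs that survive trimming.
func trim(idl IDL) []string {
	used := markUsed(idl)
	kept := make([]string, 0, len(used))
	for name := range idl.Structs {
		if used[name] {
			kept = append(kept, name)
		}
	}
	sort.Strings(kept)
	return kept
}

func main() {
	idl := IDL{
		Services: map[string][]string{"A.Get": {"Request", "Response"}},
		Structs: map[string][]string{
			"Request":  {"Base"},
			"Response": {"Base"},
			"Base":     nil,
			"Orphan":   {"Base"}, // references Base, but nothing references Orphan
		},
	}
	fmt.Println(trim(idl))
}
```

In this model, Orphan is dropped even though it references a used struct, because reachability is computed from the services only.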

The following is an illustration of the IDL files before trimming.
[Figure: IDL files before trimming]

After using the trimming tool, it will traverse all structures based on service A in IDL A and remove unused struct structures. The final output is as follows.
[Figure: IDL files after trimming]

Using the Trimmer Tool

The Trimmer tool supports processing thrift IDL files and outputs the trimmed results as thrift IDL files.

Install the Trimmer tool:

go install github.com/cloudwego/thriftgo/tool/trimmer@latest

Check the version/verify the installation:

trimmer -version

Usage format:

trimmer [options] file

Options:
--version                           Print version information
-h, --help                          Print help information
-o, --out [file/dir name]           Specify the file name/output directory for the IDL
-r, --recurse [dir]                 Specify a root directory, copy its directory structure to the output directory, and recursively output the specified IDL and its referenced IDLs to the corresponding locations. If -o is specified, it must be a directory.
-m, --method [service.method]       Specify one or more methods to be retained for trimming

Single File Processing

When you want to trim a single IDL file, you can execute the following command:

trimmer sample.thrift

After successful execution, the location of the trimmed IDL file will be output, and you will see the following prompt:
success, dump to sample_trimmed.thrift

By default, the trimmed IDL is named after the original file with a “_trimmed” suffix. To write the trimmed IDL to a specific directory or under a different name, use the -o parameter:

trimmer -o abc/my_sample.thrift sample.thrift
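
To illustrate the default naming rule, here is a small sketch that derives the output name from the input name. The helper trimmedName is hypothetical, and the “_trimmed” suffix placement is inferred from the sample prompt above rather than taken from the tool's source; whether the tool writes next to the input or into the working directory is not covered here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// trimmedName derives "<name>_trimmed<ext>" from an input path.
// Hypothetical helper: it returns a bare file name and ignores the
// directory part of the input.
func trimmedName(input string) string {
	base := filepath.Base(input)
	ext := filepath.Ext(base)
	return strings.TrimSuffix(base, ext) + "_trimmed" + ext
}

func main() {
	fmt.Println(trimmedName("sample.thrift"))
}
```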

Note that IDL definitions do not record indentation, line breaks, or ordering, so the field order in the output IDL file may differ from the original.

Recursive Trimming

If you want to trim a specific IDL and simultaneously trim and output the related IDLs it references, you can use the -r parameter:

trimmer -r test_cases/ test_cases/my_idl/a.thrift

Note that when using -r, you need to specify the IDL file directory after the -r parameter. The tool will search for dependent IDLs in this folder, trim them, and maintain the directory structure in the output (based on the dependency of the specified IDL).
The default output location is the “trimmed_idl” folder. You can set the output folder name using -o. The directory structure after output is as follows:

.
├── test_cases    // the original folder
│   ├── my_idl
│   │   └── a.thrift
│   ├── b.thrift
│   └── c.thrift
└── trimmed_idl    // trimmed folder
    ├── my_idl
    │   └── a.thrift
    └── c.thrift

Specifying Methods to Retain

As described in the “Trimming Principle” section, the Trimmer tool by default walks all methods of all services to decide which structures to keep. To restrict trimming to one or more specific methods, use the -m parameter and specify each method in the format [service_name.method_name].

trimmer -m MyService.MethodA -m MyService.MethodB example.thrift

When there is only one service in the target IDL, you can omit the service name:

trimmer -m MethodA -m MethodB example.thrift

After execution, the other services and methods will be trimmed, and only the specified methods and their dependent structures will be retained.
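
The rule for omitting the service name can be sketched as follows. This is a hedged illustration for plain names only (it does not cover regular expressions); qualify is a hypothetical helper, and the error returned for ambiguous input is an assumption rather than documented tool behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// qualify expands a -m argument to "Service.Method".
// A bare method name is accepted only when the IDL defines exactly one service.
func qualify(arg string, services []string) (string, error) {
	if strings.Contains(arg, ".") {
		return arg, nil // already in service.method form
	}
	if len(services) != 1 {
		return "", fmt.Errorf("service name required for %q: IDL defines %d services", arg, len(services))
	}
	return services[0] + "." + arg, nil
}

func main() {
	single := []string{"MyService"}
	for _, arg := range []string{"MethodA", "MyService.MethodB"} {
		full, err := qualify(arg, single)
		fmt.Println(full, err)
	}
	_, err := qualify("MethodA", []string{"MyService", "OtherService"})
	fmt.Println(err)
}
```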

In Thriftgo v0.3.3 (not yet released, but can be obtained from the latest commit), the ability to specify methods for trimming supports regular expression matching. Users can construct regular expressions to match the method names in the format [serviceName].[methodName] and precisely specify one or more methods. For example:

trimmer -m 'Employee.*\..*' test_cases/sample1.thrift

This matches all methods of any service whose name starts with “Employee”. (Note: “.” has a special meaning in regular expressions, so it is best to use “\.” to match a literal dot.) You can also use Perl-style expressions for more advanced trimming. For example:

trimmer -m '^(?!EmployeeService.getEmployee).*$' test_cases/sample1.thrift

This matches all methods except EmployeeService.getEmployee. Executing this command removes that method and its dependencies from the IDL (assuming no other retained method depends on them).

Protecting Structures from Trimming

For specific development needs, you may want the trimming tool to retain certain structures so that you can use the corresponding generated code. Place the @preserve comment above a struct, union, or exception to have it retained unconditionally during trimming. The @preserve comment can coexist with other comments, but it must be on a line of its own.

For example:

// @preserve
struct Useless{
}

Even if this structure is not referenced, it will still be preserved after trimming.
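
The “separate line” requirement can be pictured with the sketch below, which checks whether any line of a comment block consists solely of @preserve once comment markers are stripped. This is an assumption about how such a check might look, not the trimmer's parsing logic; hasPreserve is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPreserve reports whether a comment block contains a line whose only
// content, after stripping comment markers, is "@preserve".
func hasPreserve(comment string) bool {
	for _, line := range strings.Split(comment, "\n") {
		line = strings.TrimSpace(line)
		line = strings.TrimPrefix(line, "//")
		line = strings.TrimPrefix(line, "#")
		line = strings.TrimPrefix(line, "/*")
		line = strings.TrimSuffix(line, "*/")
		line = strings.TrimPrefix(strings.TrimSpace(line), "*")
		if strings.TrimSpace(line) == "@preserve" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasPreserve("// kept for tooling\n// @preserve")) // marker on its own line
	fmt.Println(hasPreserve("// kept for tooling @preserve"))     // marker shares a line
}
```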

This feature is enabled by default and can be disabled by setting the -p or -preserve parameter to false. For example:

trimmer -p false sample.thrift

In this case, the Trimmer tool will disregard the @preserve comment and proceed with trimming.

Integration with Kitex Tool

The Trimmer feature can also be integrated directly into the code generation process of Thriftgo/Kitex. However, make sure that the Thriftgo version is not lower than v0.3.1.
Taking Kitex as an example, you can add the -thrift trim_idl parameter when using it, like this:

kitex -module xx -thrift trim_idl xxx.thrift

When using this parameter, the command line will output an additional prompt:
[WARN] You Are Using IDL Trimmer

During code generation, the source IDL will be trimmed, and the resulting generated Golang code will not contain unused structures.

Configuration Using YAML

To facilitate configuration of the trimmer tool parameters through Kitex/Thriftgo integration, Thriftgo v0.3.3 (not yet released, but can be obtained from the latest commit) provides support for a YAML configuration file for the trimmer tool. When integrating with Kitex/Thriftgo or using the trimmer tool directly, the trimmer will automatically scan and apply the trim_config.yaml configuration file located in the current execution directory (os.Getwd()).

When using the YAML configuration file, you will receive a similar prompt:
using trim config: /xxx/trim_config.yaml

Besides offering a way to pass parameters when integrating with Kitex/Thriftgo, the YAML configuration file lets IDL authors define what should be preserved during trimming. The file can then serve as a template for batch trimming or be shared with others.

YAML Configuration Example:

methods:
  - "TestService.func1"
  - "TestService.func3"
preserve: true
preserved_structs:
  - "usefulStruct"

Configuration Format

Currently, the following options can be configured using YAML:

  • methods: An array of strings, equivalent to the -m parameter, indicating the methods to be preserved during trimming. Refer to the “Specifying Methods to Retain” section for the format and functionality. If the -m parameter is specified in the command line, it will override the YAML configuration.
  • preserve: A boolean value, equivalent to the -p parameter. Set it to false to disable protection of structures from trimming, including the @preserve comment. The default value is true. If the -p parameter is specified in the command line, it will override the YAML configuration.
  • preserved_structs: An array of strings, indicating the names of structures to be preserved. If preserve or the -p parameter in the command line is set to true, the trimmer will unconditionally preserve the specified structures. This can be used for struct, union, and exception structures.
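
The precedence rules in the list above can be summarized in a short sketch: command-line values win over YAML values, preserve defaults to true, and preserved_structs only takes effect while preserve is true. The Options type and merge function are hypothetical and model only these three keys; parsing of the YAML file itself is left out.

```go
package main

import "fmt"

// Options models the three documented keys. A nil Preserve means "not set",
// so an explicit false can be told apart from the default.
type Options struct {
	Methods          []string
	Preserve         *bool
	PreservedStructs []string
}

// merge applies the documented precedence: command line over YAML, with
// preserve defaulting to true when neither source sets it.
func merge(fileOpts, cliOpts Options) (methods []string, preserve bool, preserved []string) {
	methods = fileOpts.Methods
	if len(cliOpts.Methods) > 0 {
		methods = cliOpts.Methods // -m overrides the YAML "methods" list
	}
	preserve = true
	if fileOpts.Preserve != nil {
		preserve = *fileOpts.Preserve
	}
	if cliOpts.Preserve != nil {
		preserve = *cliOpts.Preserve // -p overrides the YAML "preserve" value
	}
	if preserve {
		preserved = fileOpts.PreservedStructs
	}
	return methods, preserve, preserved
}

func main() {
	off := false
	fileOpts := Options{
		Methods:          []string{"TestService.func1", "TestService.func3"},
		PreservedStructs: []string{"usefulStruct"},
	}
	fmt.Println(merge(fileOpts, Options{}))
	fmt.Println(merge(fileOpts, Options{Methods: []string{"TestService.func2"}, Preserve: &off}))
}
```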

Last modified December 9, 2024 : Update prerequisite.md (#1178) (b5e2299)