Thriftgo IDL Trimmer
Introduction
Thriftgo is a Thrift IDL parser and code generator implemented in Go. It supports the full Thrift IDL syntax and performs semantic checks. Compared with the Go code generated by the official Apache Thrift compiler, Thriftgo fixes several bugs and adds a plugin mechanism that lets users customize the generated code. The Kitex code generation tool is one such Thriftgo plugin.
Some IDL repositories accumulate unrelated content over time because they are not maintained promptly. As versions iterate, this content grows, which bloats the generated Golang code and makes the IDL harder to read. An IDL may also include more references than it needs, so irrelevant structures end up in the generated code, leading to oversized output or even errors during code generation. These problems are common in complex projects and can slow down development.
Thriftgo provides IDL trimming starting from version v0.3.1. The feature removes structures that are not referenced by any service, which cuts generated code that RPC calls neither use nor depend on.
This feature can be used separately using the Trimmer tool provided under the Thriftgo project, or it can be used in conjunction with the Thriftgo/Kitex command line to filter out unused content during code generation.
Trimming Principle
- Traverse all services in the given IDL file and mark every struct that is directly or indirectly referenced by a service or method as “used”.
- Scan all IDLs that this project directly or indirectly references. Any struct that is not marked as “used” is pruned.
- Because projects often rely on IDL files to share enums and constants, the tool does not trim typedefs, enums, or constants.
- The trimmed result is output as an IDL file with the .thrift extension or as generated Golang code.
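The marking step above is essentially a reachability walk. The following is a minimal sketch of that idea over a toy in-memory model; the `idl` type and `usedStructs` function are hypothetical names for illustration and are not Thriftgo's actual data structures or API.

```go
package main

import (
	"fmt"
	"sort"
)

// idl is a toy model of a parsed IDL (hypothetical, not Thriftgo's AST):
// each method lists the structs it references, and each struct lists the
// structs that its fields reference.
type idl struct {
	methods map[string][]string // "Service.Method" -> referenced struct names
	structs map[string][]string // struct name -> struct names used by its fields
}

// usedStructs returns the sorted names of all structs reachable from any method.
func usedStructs(d idl) []string {
	used := map[string]bool{}
	var mark func(name string)
	mark = func(name string) {
		if used[name] {
			return // already visited; this also stops cycles
		}
		deps, ok := d.structs[name]
		if !ok {
			return // not a struct in this model (for example a base type)
		}
		used[name] = true
		for _, dep := range deps {
			mark(dep)
		}
	}
	for _, refs := range d.methods {
		for _, r := range refs {
			mark(r)
		}
	}
	out := make([]string, 0, len(used))
	for n := range used {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	d := idl{
		methods: map[string][]string{"A.Get": {"Request", "Response"}},
		structs: map[string][]string{
			"Request":  {},
			"Response": {"Item"},
			"Item":     {},
			"Orphan":   {"Item"},
		},
	}
	// Orphan is never reached from a method, so it would be pruned.
	fmt.Println(usedStructs(d))
}
```

Everything absent from the returned set is a candidate for pruning; typedefs, enums, and constants would simply be left out of such a walk.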
The following is an illustration of the IDL files before trimming.
After using the trimming tool, it will traverse all structures based on service A in IDL A and remove unused struct structures. The final output is as follows.
Using the Trimmer Tool
The Trimmer tool supports processing thrift IDL files and outputs the trimmed results as thrift IDL files.
Install the Trimmer tool:
go install github.com/cloudwego/thriftgo/tool/trimmer@latest
Check the version/verify the installation:
trimmer -version
Usage format:
trimmer [options] file
Options:
--version Print version information
-h, --help Print help information
-o, --out [file/dir name] Specify the file name/output directory for the IDL
-r, --recurse [dir] Specify a root directory, copy its directory structure to the output directory, and recursively output the specified IDL and its referenced IDLs to the corresponding locations. If -o is specified, it must be a directory.
-m, --method [service.method] Specify one or more methods to be retained for trimming
Single File Processing
When you want to trim a single IDL file, you can execute the following command:
trimmer sample.thrift
After successful execution, the location of the trimmed IDL file will be output, and you will see the following prompt: success, dump to example_trimmed.thrift
By default, the trimmed IDL is named with the original name + “trimmed” suffix. If you want to output the trimmed IDL to a specific directory or rename it, you can use the -o parameter:
trimmer -o abc/my_sample.thrift sample.thrift
Note that because the parsed IDL definitions do not record indentation, line breaks, or ordering, the field order in the new IDL file may differ from the original.
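The default naming rule can be sketched as follows. This assumes the suffix is `_trimmed` inserted before the file extension, as the prompt shown earlier suggests; `defaultTrimmedName` is a hypothetical helper, not a function exported by the tool.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// defaultTrimmedName inserts "_trimmed" before the file extension.
// The exact suffix is an assumption based on the tool's success prompt.
func defaultTrimmedName(path string) string {
	ext := filepath.Ext(path)
	return strings.TrimSuffix(path, ext) + "_trimmed" + ext
}

func main() {
	fmt.Println(defaultTrimmedName("sample.thrift"))
}
```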
Recursive Trimming
If you want to trim a specific IDL and simultaneously trim and output the related IDLs it references, you can use the -r parameter:
trimmer -r test_cases/ test_cases/my_idl/a.thrift
Note that when using -r, you need to specify the IDL file directory after the -r parameter. The tool will search for dependent IDLs in this folder, trim them, and maintain the directory structure in the output (based on the dependency of the specified IDL).
The default output location is the “trimmed_idl” folder. You can set the output folder name using -o. The directory structure after output is as follows:
.
├── test_cases // the original folder
│   ├── my_idl
│   │   └── a.thrift
│   ├── b.thrift
│   └── c.thrift
└── trimmed_idl // trimmed folder
    ├── my_idl
    │   └── a.thrift
    └── c.thrift
Specifying Methods to Retain
In the “Trimming Principle” section, it was mentioned that by default, the Trimmer tool searches all methods of all services to decide which structures to trim. If you want to restrict the trimming logic to one or more specific methods, you can use the -m parameter and specify them in the format [service_name.method_name]:
trimmer -m MyService.MethodA -m MyService.MethodB example.thrift
When there is only one service in the target IDL, you can omit the service name:
trimmer -m MethodA -m MethodB example.thrift
After execution, the other services and methods will be trimmed, and only the specified methods and their dependent structures will be retained.
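The rule for omitting the service name can be sketched like this. `resolveMethod` is a hypothetical function written for illustration; how the tool reports an ambiguous bare method name is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveMethod expands a -m value to the form "Service.Method".
// A bare method name is accepted only when the IDL defines exactly one
// service (the error for the other case is an assumption of this sketch).
func resolveMethod(spec string, services []string) (string, error) {
	if strings.Contains(spec, ".") {
		return spec, nil
	}
	if len(services) != 1 {
		return "", fmt.Errorf("method %q needs a service name: IDL defines %d services", spec, len(services))
	}
	return services[0] + "." + spec, nil
}

func main() {
	m, err := resolveMethod("MethodA", []string{"MyService"})
	if err != nil {
		panic(err)
	}
	fmt.Println(m)
}
```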
In Thriftgo v0.3.3 (not yet released, but can be obtained from the latest commit), the ability to specify methods for trimming supports regular expression matching. Users can construct regular expressions to match the method names in the format [serviceName].[methodName] and precisely specify one or more methods. For example:
trimmer -m 'Employee.*\..*' test_cases/sample1.thrift
This matches all methods of any service whose name starts with “Employee”. (Note: “.” has a special meaning in regular expressions, so use “\.” to match a literal dot.) You can also use Perl-style expressions for more advanced trimming. For example:
trimmer -m '^(?!EmployeeService.getEmployee).*$' test_cases/sample1.thrift
This matches all methods except EmployeeService.getEmployee. Executing this command removes that method and its dependencies from the IDL (assuming no other method depends on them).
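To preview which methods a pattern would select, a small filter like the one below can help. It is only a sketch: `matchMethods` is a hypothetical helper, and whole-name (anchored) matching is an assumption. Note that Go's standard regexp package uses RE2 syntax and rejects lookahead such as `(?!...)`, so this sketch covers only the first example; Perl-style patterns need a Perl-compatible engine.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchMethods returns the "Service.Method" names fully matched by pattern.
// Anchoring the whole name is an assumption of this sketch.
func matchMethods(pattern string, methods []string) ([]string, error) {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	var out []string
	for _, m := range methods {
		if re.MatchString(m) {
			out = append(out, m)
		}
	}
	return out, nil
}

func main() {
	methods := []string{"EmployeeService.getEmployee", "EmployeeAdmin.remove", "OrderService.list"}
	kept, err := matchMethods(`Employee.*\..*`, methods)
	if err != nil {
		panic(err)
	}
	fmt.Println(kept)

	// The lookahead form is not valid RE2 syntax, so compilation fails here.
	_, err = matchMethods(`^(?!EmployeeService.getEmployee).*$`, methods)
	fmt.Println("lookahead rejected:", err != nil)
}
```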
Protecting Structures from Trimming
For specific development needs, you may want the trimming tool to retain certain structures so that you can use the corresponding generated code. You can annotate a struct, union, or exception with an @preserve comment above it to indicate that it should be preserved; it will then be retained unconditionally during trimming. The @preserve comment can coexist with other comments, but make sure that it is on a separate line.
For example:
// @preserve
struct Useless{
}
Even if this structure is not referenced, it will still be preserved after trimming.
This feature is enabled by default and can be disabled by setting the -p or -preserve parameter to false. For example:
trimmer -p false sample.thrift
In this case, the Trimmer tool will disregard the @preserve comment and proceed with trimming.
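The "separate line" requirement can be pictured as a line-by-line check of the comment block above a definition. The sketch below is an assumption about the behavior, not the tool's actual parsing code; `hasPreserve` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPreserve reports whether any line of a comment block consists solely
// of "@preserve" once the comment marker is stripped. Treating only
// whole-line annotations as valid is an assumption of this sketch.
func hasPreserve(comment string) bool {
	for _, line := range strings.Split(comment, "\n") {
		line = strings.TrimSpace(line)
		line = strings.TrimPrefix(line, "//")
		line = strings.TrimPrefix(line, "#")
		if strings.TrimSpace(line) == "@preserve" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasPreserve("// kept for a downstream tool\n// @preserve"))
	fmt.Println(hasPreserve("// mentions @preserve inside a sentence"))
}
```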
Integration with Kitex Tool
The Trimmer feature can also be integrated directly into the code generation process of Thriftgo/Kitex. However, make sure that the Thriftgo version is not lower than v0.3.1.
Taking Kitex as an example, you can add the -thrift trim_idl parameter when using it, like this:
kitex -module xx -thrift trim_idl xxx.thrift
When using this parameter, the command line will output an additional prompt: [WARN] You Are Using IDL Trimmer
During code generation, the source IDL will be trimmed, and the resulting generated Golang code will not contain unused structures.
Configuration Using YAML
To facilitate configuration of the trimmer tool parameters through Kitex/Thriftgo integration, Thriftgo v0.3.3 (not yet released, but can be obtained from the latest commit) provides support for a YAML configuration file for the trimmer tool. When integrating with Kitex/Thriftgo or using the trimmer tool directly, the trimmer will automatically scan and apply the trim_config.yaml configuration file located in the current execution directory (os.Getwd()).
When using the YAML configuration file, you will receive a prompt similar to: using trim config: /xxx/trim_config.yaml
In addition to providing a configuration parameter method for integrating with Kitex/Thriftgo, IDL authors can use the YAML configuration file to define the content to be preserved during trimming, which can serve as a template for batch trimming or be shared with others.
YAML Configuration Example:
methods:
- "TestService.func1"
- "TestService.func3"
preserve: true
preserved_structs:
- "usefulStruct"
Configuration Format
Currently, the following options can be configured using YAML:
- methods: An array of strings, equivalent to the -m parameter, indicating the methods to be preserved during trimming. Refer to the “Specifying Methods to Retain” section for the format and functionality. If the -m parameter is specified in the command line, it will override the YAML configuration.
- preserve: A boolean value, equivalent to the -p parameter. Set it to false to disable protection of structures from trimming, including the @preserve comment. The default value is true. If the -p parameter is specified in the command line, it will override the YAML configuration.
- preserved_structs: An array of strings, indicating the names of structures to be preserved. If preserve or the -p parameter in the command line is set to true, the trimmer will unconditionally preserve the specified structures. This can be used for struct, union, and exception structures.
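The precedence rules in this list (command-line values override YAML values) can be sketched as a small merge step. The `TrimConfig` and `cliFlags` types and the `merge` function are hypothetical names for illustration; YAML parsing itself is omitted, and the sketch is not the tool's actual configuration code.

```go
package main

import "fmt"

// TrimConfig mirrors the three YAML options described above
// (hypothetical type, not Thriftgo's own).
type TrimConfig struct {
	Methods          []string
	Preserve         bool
	PreservedStructs []string
}

// cliFlags holds command-line values; a nil Preserve means -p was not given.
type cliFlags struct {
	Methods  []string
	Preserve *bool
}

// merge applies command-line values on top of the YAML configuration.
func merge(file TrimConfig, cli cliFlags) TrimConfig {
	out := file
	if len(cli.Methods) > 0 {
		out.Methods = cli.Methods // -m overrides "methods"
	}
	if cli.Preserve != nil {
		out.Preserve = *cli.Preserve // -p overrides "preserve"
	}
	return out
}

func main() {
	file := TrimConfig{
		Methods:          []string{"TestService.func1", "TestService.func3"},
		Preserve:         true,
		PreservedStructs: []string{"usefulStruct"},
	}
	off := false
	eff := merge(file, cliFlags{Preserve: &off})
	fmt.Printf("%+v\n", eff)
}
```

With `-p false` on the command line, the effective configuration keeps the YAML method list but turns preservation off, so `preserved_structs` would no longer protect anything.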