Kitex Remove Apache Thrift User Guide

This document describes how Kitex will remove its dependency on Apache Thrift in future versions.

Background

Kitex uses the github.com/apache/thrift library for some encoding and decoding tasks, and it also generates native Apache Codec encoding and decoding code (the Read, Write, and related methods in kitex_gen).

However, in actual usage, most services do not require this content, leading to code redundancy and a series of issues due to the dependency on Apache Thrift.

In future versions, we plan to gradually remove the dependency on Apache Thrift. The expected outcomes after removal include:

  • A nearly 50% reduction in the size of kitex_gen output
  • Elimination of issues related to the lock-in with Apache Thrift version 0.13.0

Remove Apache Thrift Dependency

For brevity, the following libraries are referred to by these short names:

  • github.com/apache/thrift => apache thrift
  • github.com/cloudwego/kitex/pkg/protocol/bthrift => kitex bthrift
  • github.com/cloudwego/gopkg/protocol/thrift => gopkg thrift

Kitex will remove the Apache Thrift dependency from its go.mod file gradually, in two phases.

Phase 1 (v0.11.0)

Kitex will remove the import dependency on Apache Thrift from the generated code and replace it with kitex bthrift and gopkg thrift.

The imports in the generated code and corresponding methods like Read, Write, etc., will also be replaced. After regenerating the code, the import dependency on Apache Thrift in kitex_gen will be transitioned to the latter two libraries.

Phase 2 (v0.12.0)

Kitex will transform the bthrift library into an independent submodule, consolidating the dependency on Apache Thrift in the go.mod file of this submodule. Additionally, with changes in the generated code, if users do not generate Apache Codec interfaces in kitex_gen (see details below), the Kitex project will no longer introduce the dependency on github.com/apache/thrift.

Furthermore, all usage of thrift encoding and decoding-related interfaces will be consolidated into the independently maintained gopkg thrift (such as the FastCodec interface definition, fastthrift tool interface, etc.).

At this point, Kitex’s go.mod file will no longer actively depend on Apache Thrift, thereby resolving issues related to specific version constraints like v0.13.0.
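To see whether your own project still pins Apache Thrift, you can inspect its go.mod. The sketch below is a rough line scan written with the Go standard library only; the function name requiredVersion and the sample go.mod text are hypothetical, and the scan assumes a gofmt-style go.mod. The output of `go mod why -m github.com/apache/thrift` remains the authoritative answer.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const apacheThrift = "github.com/apache/thrift"

// requiredVersion scans go.mod text and returns the version that a
// require directive lists for module, if any.
func requiredVersion(goMod, module string) (string, bool) {
	inRequire := false
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i] // drop trailing comments such as "// indirect"
		}
		f := strings.Fields(line)
		switch {
		case len(f) == 2 && f[1] == "(":
			inRequire = f[0] == "require" // entering a "require (" block, or some other block
		case len(f) == 1 && f[0] == ")":
			inRequire = false
		case inRequire && len(f) == 2 && f[0] == module:
			return f[1], true
		case len(f) == 3 && f[0] == "require" && f[1] == module:
			return f[2], true
		}
	}
	return "", false
}

func main() {
	// Hypothetical go.mod content; in practice read the file with os.ReadFile("go.mod").
	sample := "module demo\n\nrequire (\n\tgithub.com/apache/thrift v0.13.0 // indirect\n\tgithub.com/cloudwego/kitex v0.11.0\n)\n"
	if v, ok := requiredVersion(sample, apacheThrift); ok {
		fmt.Printf("go.mod still requires %s %s\n", apacheThrift, v)
	} else {
		fmt.Printf("go.mod has no requirement on %s\n", apacheThrift)
	}
}
```

If the requirement is still listed after upgrading Kitex and regenerating kitex_gen, some other code or dependency in the project is likely importing Apache Thrift directly.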

Remove Apache Codec Codegen

Within kitex_gen, two sets of serialization code will be generated: FastCodec (cloudwego) and Apache Codec (apache).

Apache Codec sees very little use: the majority of services encode and decode with FastCodec and never call the Apache native interfaces. To align with the dependency removal, kitex_gen will therefore gradually stop generating the code and interfaces related to Apache Codec, which will also significantly reduce the output size.

We will do this in three phases.

Phase 1 (<= v0.11.0)

By adding the parameter -thrift no_default_serdes to the kitex tool, it is possible to avoid generating Apache Codec.

In versions up to v0.11.0, we will maintain the exact same generation behavior without any changes, providing users with a transitional buffer period.

If you wish to actively remove Apache Codec and cut the output size by about half, refer to the “How to Actively Remove Apache Codec” section at the end of the document.

Phase 2 (v0.12.0)

We will add warnings and logging before the Apache Codec Read/Write operations generated by kitex_gen to strongly remind users. Additionally, we will assist in the transformation of some critical services.

If you see warnings mentioning Apache Codec after starting a service, refer to the “How to Actively Remove Apache Codec” section at the end of the document. Replacing Apache Codec with FastCodec gives better encoding and decoding performance and also protects you from the future removal of Apache Codec.

Phase 3 (v0.13.0)

From this version on, the kitex tool will by default generate no Apache Codec code and rely solely on FastCodec.

This change will reduce the output size to only half of the original, and the go.mod file will no longer depend on the github.com/apache/thrift library.

  • User impact: code that calls Apache Codec directly for serialization will find the Read/Write interfaces missing. RPC calls are not affected.
  • If needed, users can keep generating Apache Codec by specifying parameters. Detailed instructions will be provided after the version is released.
  • To prevent compilation failures caused by the missing Read/Write interfaces, we will also offer an Apache Adaptor that generates bridge code. Usage instructions will be published later.

How to check if you’re using Apache Codec

While the dependency on apache/thrift has been removed from kitex_gen, there may still be instances of apache/thrift encoding and decoding in other parts of your project. This approach is relatively inefficient and heavily relies on the Apache native Read, Write interfaces generated within kitex_gen.

Kitex plans to stop generating this code by default, which may affect places where apache/thrift is still in use. We recommend replacing such usage with Kitex’s more efficient FastCodec encoding and decoding, as outlined below:

1. Using the apache/thrift library to marshal and unmarshal

If you have code such as:

func GetThriftBinary(ctx context.Context, msg apache_thrift.TStruct) ([]byte, error) {
    t := apache_thrift.NewTMemoryBufferLen(1024)
    p := apache_thrift.NewTBinaryProtocolFactoryDefault().GetProtocol(t)

    tser := &apache_thrift.TSerializer{
        Transport: t,
        Protocol:  p,
    }

    bs, err := tser.Write(ctx, msg)
    if err != nil {
        return nil, err
    }
    return bs, nil
}

func ParseThriftBinary(msg apache_thrift.TStruct, by []byte) error {
    t := apache_thrift.NewTMemoryBufferLen(1024)
    p := apache_thrift.NewTBinaryProtocolFactoryDefault().GetProtocol(t)

    deser := &apache_thrift.TDeserializer{Transport: t, Protocol: p}
    _ = deser.Transport.Close()
    err := deser.Read(msg, by)
    if err != nil {
        return err
    }
    return nil
}

Instead, you can use Kitex FastCodec:

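// msg is a struct within kitex_gen; it has methods such as FastRead and FastWriteNocopy.

import (
    "errors"

    // gopkg thrift provides FastCodec, FastMarshal, and FastUnmarshal.
    "github.com/cloudwego/gopkg/protocol/thrift"
)

var errNotFastCodec = errors.New("message does not implement thrift.FastCodec")

// marshal
func GetThriftBinary(data interface{}) ([]byte, error) {
    msg, ok := data.(thrift.FastCodec)
    if !ok {
        return nil, errNotFastCodec
    }
    return thrift.FastMarshal(msg), nil
}

// unmarshal
func ParseThriftBinary(data interface{}, buf []byte) error {
    msg, ok := data.(thrift.FastCodec)
    if !ok {
        return errNotFastCodec
    }
    return thrift.FastUnmarshal(buf, msg)
}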

2. Make sure that FastCodec is the codec for Kitex RPC

Keyword search: WithPayloadCodec(thrift.NewThriftCodecDisableFastMode(true, true))

If you have this snippet in your code repository, it indicates that RPC requests have disabled FastCodec and are using the lower-performance Apache native Codec.

It is recommended to remove this option to enhance encoding and decoding performance.
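The keyword search above, together with a search for direct apache/thrift imports, can be automated. The following is a minimal sketch that uses only the Go standard library. The function names, the hint texts, and the choice to skip the kitex_gen and vendor directories are assumptions made for illustration; none of this is part of Kitex.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// patterns pairs each search keyword with a short explanation.
var patterns = []struct{ keyword, hint string }{
	{"NewThriftCodecDisableFastMode", "RPC codec has FastCodec disabled"},
	{"github.com/apache/thrift", "direct import of apache/thrift"},
}

// findHints returns the hint of every keyword that occurs in src.
func findHints(src string) []string {
	var hints []string
	for _, p := range patterns {
		if strings.Contains(src, p.keyword) {
			hints = append(hints, p.hint)
		}
	}
	return hints
}

// scanDir walks root and collects hints for each .go file,
// skipping generated code (kitex_gen) and vendored dependencies.
func scanDir(root string) (map[string][]string, error) {
	found := make(map[string][]string)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if d.Name() == "kitex_gen" || d.Name() == "vendor" {
				return filepath.SkipDir
			}
			return nil
		}
		if !strings.HasSuffix(path, ".go") {
			return nil
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		if hints := findHints(string(data)); len(hints) > 0 {
			found[path] = hints
		}
		return nil
	})
	return found, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	found, err := scanDir(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, "scan failed:", err)
		os.Exit(1)
	}
	for path, hints := range found {
		fmt.Printf("%s: %s\n", path, strings.Join(hints, "; "))
	}
}
```

A plain substring match can also hit comments and string literals, so treat each reported file as a candidate to review rather than a confirmed use of Apache Codec.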

How to Actively Remove Apache Codec

Removing Apache Codec can reduce output size by almost half. If your project does not fall into the minority usage scenarios mentioned earlier, you can proactively configure your settings to avoid generating this portion of code. Make sure your Kitex version is above v0.11.0.

  1. Regenerate the code with the parameter -thrift no_default_serdes:
kitex -module xxx -thrift no_default_serdes xxx.thrift

The regenerated kitex_gen then contains no Apache Codec and is almost half its previous size.

  2. Enable Skip Decoder by adding the following option when creating the client or server:
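import (
   "log"

   "github.com/cloudwego/kitex/client"
   "github.com/cloudwego/kitex/pkg/remote/codec/thrift"
   "github.com/cloudwego/kitex/server"

   "demo/kitex_gen/kitex/samples/echo/echoservice"
)

func main() {
    // Client side: FastCodec with SkipDecoder enabled.
    cli := echoservice.MustNewClient("kitex.samples.echo",
        client.WithPayloadCodec(thrift.NewThriftCodecWithConfig(thrift.FastRead|thrift.FastWrite|thrift.EnableSkipDecoder)),
    )
    _ = cli // use cli to issue RPC calls

    // Server side: handler is your implementation of the generated service interface.
    srv := echoservice.NewServer(handler,
        server.WithPayloadCodec(thrift.NewThriftCodecWithConfig(thrift.FastWrite|thrift.FastRead|thrift.EnableSkipDecoder)),
    )
    if err := srv.Run(); err != nil {
        log.Fatal(err)
    }
}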

In this way, when your service receives a Thrift Buffered message, decoding and encoding will be accomplished through SkipDecoder + FastCodec, no longer relying on Apache Codec. For other scenarios, such as TTHeader or Mesh scenarios, the logic remains unchanged, all directly utilizing FastCodec.
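To give an intuition for what "skipping" means here, the sketch below walks over a Thrift Binary encoded struct without building any Go object and reports how many bytes it occupies. It rests on an assumption not stated above: that a Thrift Buffered message carries no payload length, so the end of the struct must be found by skipping field by field before a codec that works on a complete byte slice can run. This is not Kitex's SkipDecoder implementation, and all names in it are hypothetical; only the wire layout follows the Thrift Binary protocol.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Type IDs used by the Thrift Binary protocol.
const (
	tStop, tBool, tByte, tDouble = 0, 2, 3, 4
	tI16, tI32, tI64             = 6, 8, 10
	tString, tStruct             = 11, 12
	tMap, tSet, tList            = 13, 14, 15
)

var errShort = errors.New("buffer too short")

// need returns n if b holds at least n bytes.
func need(b []byte, n int) (int, error) {
	if n < 0 || len(b) < n {
		return 0, errShort
	}
	return n, nil
}

// skipValue returns the encoded size of one value of type t at the start of b.
func skipValue(b []byte, t byte) (int, error) {
	switch t {
	case tBool, tByte:
		return need(b, 1)
	case tI16:
		return need(b, 2)
	case tI32:
		return need(b, 4)
	case tI64, tDouble:
		return need(b, 8)
	case tString: // 4-byte length, then the bytes
		if len(b) < 4 {
			return 0, errShort
		}
		return need(b, 4+int(binary.BigEndian.Uint32(b)))
	case tStruct:
		return skipStruct(b)
	case tList, tSet: // element type, 4-byte count, then the elements
		if len(b) < 5 {
			return 0, errShort
		}
		return skipElems(b, 5, binary.BigEndian.Uint32(b[1:]), b[0])
	case tMap: // key type, value type, 4-byte count, then the pairs
		if len(b) < 6 {
			return 0, errShort
		}
		return skipElems(b, 6, binary.BigEndian.Uint32(b[2:]), b[0], b[1])
	}
	return 0, fmt.Errorf("unknown thrift type %d", t)
}

// skipElems skips count groups of values of the given types, starting at off,
// and returns the offset just past the last one.
func skipElems(b []byte, off int, count uint32, types ...byte) (int, error) {
	for i := uint32(0); i < count; i++ {
		for _, t := range types {
			n, err := skipValue(b[off:], t)
			if err != nil {
				return 0, err
			}
			off += n
		}
	}
	return off, nil
}

// skipStruct returns the encoded size of the struct at the start of b.
func skipStruct(b []byte) (int, error) {
	off := 0
	for {
		if off >= len(b) {
			return 0, errShort
		}
		t := b[off]
		off++
		if t == tStop {
			return off, nil
		}
		if len(b)-off < 2 { // 2-byte field ID
			return 0, errShort
		}
		off += 2
		n, err := skipValue(b[off:], t)
		if err != nil {
			return 0, err
		}
		off += n
	}
}

func main() {
	// Field 1: i32 = 7, field 2: string "hi", STOP, then two bytes of unrelated data.
	buf := []byte{0x08, 0, 1, 0, 0, 0, 7, 0x0B, 0, 2, 0, 0, 0, 2, 'h', 'i', 0x00, 0xFF, 0xFF}
	n, err := skipStruct(buf)
	fmt.Println(n, err) // prints "17 <nil>"
}
```

Once the length is known, buf[:n] is a complete payload that can be handed to a slice-based decoder, and the trailing bytes stay available for the next message.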


Last modified December 9, 2024 : Update prerequisite.md (#1178) (b5e2299)