Interfaces and Traits in C

Original link: https://antonz.org/interfaces-in-c/

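## Interfaces in C: Emulating Go and Rust Polymorphism

This article explores implementing interface-like behavior in C, inspired by Go interfaces and Rust traits, with the goal of polymorphism without traditional class-based inheritance. The core idea is to define behavioral contracts and let types implement them without explicit declaration.

The first attempt mimics Go's structural typing but runs into a layout dependency problem when implementing multiple interfaces: the type conversion becomes unreliable. A better approach stores a pointer to the object inside the interface itself, which improves type safety and allows multiple interfaces to be implemented. However, it introduces function pointer overhead for every method.

The best solution is a "method table", a separate structure of function pointers that the interface references. It minimizes overhead and preserves type safety even for large interfaces. The article also touches on type assertions, similar to Go's, but recommends using them sparingly because of C's limitations.

In the end, implementing interfaces in C is feasible but more complex than in Go or Rust. The method table approach offers a practical balance of efficiency, type safety, and polymorphism. A complete C code example demonstrates a working `Reader` interface and its implementor, the `Zeros` struct.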


Original article

Everyone likes interfaces in Go and traits in Rust. Polymorphism without class-based hierarchies or inheritance seems to be the sweet spot. What if we try to implement this in C?

Interfaces in Go • Traits in Rust • Toy example • Interface definition • Interface data • Method table • Method table in implementor • Type assertions • Final thoughts

Interfaces in Go

An interface in Go is a convenient way to define a contract for some useful behavior. Take, for example, the honored io.Reader:

// Reader is the interface that wraps the basic Read method.
type Reader interface {
    // Read reads up to len(p) bytes into p. It returns the number of bytes
    // read (0 <= n <= len(p)) and any error encountered.
    Read(p []byte) (n int, err error)
}

Anything that can read data into a byte slice provided by the caller is a Reader. Quite handy, because the code doesn't need to care where the data comes from — whether it's memory, the file system, or the network. All that matters is that it can read the data into a slice:

// work processes the data read from r.
func work(r io.Reader) int {
    buf := make([]byte, 8)
    n, err := r.Read(buf)
    if err != nil && err != io.EOF {
        panic(err)
    }
    // ...
    return n
}

We can provide any kind of reader:

func main() {
    var total int
    b := bytes.NewBufferString("hello world")

    // bytes.Buffer implements io.Reader, so we can use it with work.
    total += work(b)
    total += work(b)

    fmt.Println("total =", total)
}

Go's interfaces are structural, which is similar to duck typing. A type doesn't need to explicitly state that it implements io.Reader; it just needs to have a Read method:

// Zeros is an infinite stream of zero bytes.
type Zeros struct{}

func (z Zeros) Read(p []byte) (n int, err error) {
    clear(p)
    return len(p), nil
}

The Go compiler and runtime take care of the rest:

func main() {
    var total int
    var z Zeros

    // Zeros implements io.Reader, so we can use it with work.
    total += work(z)
    total += work(z)

    fmt.Println("total =", total)
}

Traits in Rust

A trait in Rust is also a way to define a contract for certain behavior. Here's the std::io::Read trait:

// The Read trait allows for reading bytes from a source.
pub trait Read {
    // Readers are defined by one required method, read(). Each call to read()
    // will attempt to pull bytes from this source into a provided buffer.
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;

    // ...
}

Unlike in Go, a type must explicitly state that it implements a trait:

// An infinite stream of zero bytes.
struct Zeros;

impl io::Read for Zeros {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        buf.fill(0);
        Ok(buf.len())
    }
}

The Rust compiler takes care of the rest:

// Processes the data read from r.
fn work(r: &mut dyn io::Read) -> usize {
    let mut buf = [0; 8];
    match r.read(&mut buf) {
        Ok(n) => n,
        Err(e) => panic!("Error: {}", e),
    }
}

fn main() {
    let mut total = 0;
    let mut z = Zeros;

    // Zeros implements Read, so we can use it with work.
    total += work(&mut z);
    total += work(&mut z);

    println!("total = {}", total);
}

Either way, whether it's Go or Rust, the caller only cares about the contract (defined as an interface or trait), not the specific implementation.

Toy example

Let's make an even simpler version of Reader — one without any error handling (Go):

// Reader is an interface that wraps the basic Read method.
// Read reads up to len(p) bytes into p.
type Reader interface {
    Read(p []byte) int
}

Usage example:

// Zeros is an infinite stream of zero bytes.
type Zeros struct {
    total int // total number of bytes read
}

// Read reads len(p) bytes into p.
func (z *Zeros) Read(p []byte) int {
    clear(p)
    z.total += len(p)
    return len(p)
}

// work processes the data read from r.
func work(r Reader) int {
    buf := make([]byte, 8)
    return r.Read(buf)
}

func main() {
    z := new(Zeros)
    work(z)
    work(z)
    fmt.Println("total =", z.total)
}

Let's see how we can do this in C!

Interface definition

The main building blocks in C are structs and functions, so let's use them. Our Reader will be a struct with a single field called Read. This field will be a pointer to a function with the right signature:

// An interface that wraps the basic Read method.
// Read reads up to len(p) bytes into p.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
} Reader;

To make Zeros fully dynamic, let's turn it into a struct with a Read function pointer (I know, I know — just bear with me):

// An infinite stream of zero bytes.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
    size_t total;
} Zeros;

Here's the Zeros_Read "method" implementation:

// Reads up to len(p) bytes into p.
size_t Zeros_Read(void* self, uint8_t* p, size_t len) {
    Zeros* z = (Zeros*)self;
    for (size_t i = 0; i < len; i++) {
        p[i] = 0;
    }
    z->total += len;
    return len;
}

The work function is pretty obvious:

// Does some work reading from r.
size_t work(Reader* r) {
    uint8_t buf[8];
    return r->Read(r, buf, sizeof(buf));
}

And, finally, the main function:

int main(void) {
    Zeros z = {.Read = Zeros_Read, .total = 0};

    Reader* r = (Reader*)&z;
    work(r);
    work(r);

    printf("total = %zu\n", z.total);
}

See how easy it is to turn a Zeros into a Reader: all we need is (Reader*)&z. Pretty cool, right?

Not really. Actually, this implementation is seriously flawed in almost every way (except for the Reader definition).

Memory overhead. Each Zeros instance has its own function pointers (8 bytes per function on a 64-bit system) as "methods", which isn't practical even if there are only a few of them. Regular objects should store data, not functions.

Layout dependency. Converting from Zeros* to Reader* like (Reader*)&z only works if both structures have the same Read field as their first member. If we try to implement another interface:

// Reader interface.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
} Reader;

// Closer interface.
typedef struct {
    void (*Close)(void* self);
} Closer;

// Zeros implements both Reader and Closer.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
    void (*Close)(void* self);
    size_t total;
} Zeros;

Everything will fall apart:

int main(void) {
    Zeros z = {
        .Read = Zeros_Read,
        .Close = Zeros_Close,
        .total = 0,
    };
    Closer* c = (Closer*)&z;  // (X)
    c->Close(c);
}

Closer and Zeros have different layouts, so the type conversion at (X) is invalid and causes undefined behavior.

Lack of type safety. Using a void* as the receiver in Zeros_Read means the caller can pass any type, and the compiler won't even show a warning:

int main(void) {
    int x = 42;
    uint8_t buf[8];
    Zeros_Read(&x, buf, sizeof(buf));  // bad decision
}

size_t Zeros_Read(void* self, uint8_t* p, size_t len) {
    Zeros* z = (Zeros*)self;
    // ...
    z->total += len;                   // consequences
    return len;
}

C isn't a particularly type-safe language, but this is just too much. Let's try something else.

Interface data

A better way is to store a reference to the actual object in the interface:

// An interface that wraps the basic Read method.
// Read reads up to len(p) bytes into p.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
    void* self;
} Reader;

We could have the Read method in the interface take a Reader instead of a void*, but that would make the implementation more complicated without any real benefits. So, I'll keep it as void*.

Then Zeros will only have its own fields:

// An infinite stream of zero bytes.
typedef struct {
    size_t total;
} Zeros;

We can make the Zeros_Read method type-safe:

// Reads len(p) bytes into p.
size_t Zeros_Read(Zeros* z, uint8_t* p, size_t len) {
    for (size_t i = 0; i < len; i++) {
        p[i] = 0;
    }
    z->total += len;
    return len;
}

To make this work, we add a Zeros_Reader method that returns the instance wrapped in a Reader interface:

// Returns a Reader implementation for Zeros.
Reader Zeros_Reader(Zeros* z) {
    return (Reader){
        .Read = (size_t (*)(void*, uint8_t*, size_t))Zeros_Read,
        .self = z,
    };
}

The work and main functions remain quite simple:

// Does some work reading from r.
size_t work(Reader r) {
    uint8_t buf[8];
    return r.Read(r.self, buf, sizeof(buf));
}

int main(void) {
    Zeros z = {0};

    Reader r = Zeros_Reader(&z);
    work(r);
    work(r);

    printf("total = %zu\n", z.total);
}

This approach is much better than the previous one:

  • The Zeros struct is lean and doesn't have any interface-related fields.
  • The Zeros_Read method takes a Zeros* instead of a void*.
  • The cast from Zeros to Reader is handled inside the Zeros_Reader method.
  • We can implement multiple interfaces if needed.

Since our Zeros type now knows about the Reader interface (through the Zeros_Reader method), our implementation is more like a basic version of a Rust trait than a true Go interface. For simplicity, I'll keep using the term "interface".

There is one downside, though: each Reader instance has its own function pointer for every interface method. Since Reader only has one method, this isn't an issue. But if an interface has a dozen methods and the program uses a lot of these interface instances, it can become a problem.

Let's fix this.

Method table

Let's extract interface methods into a separate structure, the method table. The interface references its methods through the mtab field:

// An interface that wraps the basic Read method.
// Read reads up to len(p) bytes into p.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
} ReaderTable;

typedef struct {
    const ReaderTable* mtab;
    void* self;
} Reader;

Zeros and Zeros_Read don't change at all:

// An infinite stream of zero bytes.
typedef struct {
    size_t total;
} Zeros;

// Reads len(p) bytes into p.
size_t Zeros_Read(Zeros* z, uint8_t* p, size_t len) {
    for (size_t i = 0; i < len; i++) {
        p[i] = 0;
    }
    z->total += len;
    return len;
}

The Zeros_Reader method initializes the static method table and assigns it to the interface instance:

// Returns a Reader implementation for Zeros.
Reader Zeros_Reader(Zeros* z) {
    // The method table is only initialized once.
    static const ReaderTable impl = {
        .Read = (size_t (*)(void*, uint8_t*, size_t))Zeros_Read,
    };
    return (Reader){.mtab = &impl, .self = z};
}

The only difference in work is that it calls the Read method on the interface indirectly using the method table (r.mtab->Read instead of r.Read):

// Does some work reading from r.
size_t work(Reader r) {
    uint8_t buf[8];
    return r.mtab->Read(r.self, buf, sizeof(buf));
}

main stays the same:

int main(void) {
    Zeros z = {0};

    Reader r = Zeros_Reader(&z);
    work(r);
    work(r);

    printf("total = %zu\n", z.total);
}

Now the Reader instance always has a single pointer field for its methods. So even for large interfaces, it only uses 16 bytes on a 64-bit system (the mtab and self fields). This approach also keeps all the benefits from the previous version:

  • Lightweight Zeros structure.
  • Easy conversion from Zeros to Reader.
  • Supports multiple interfaces.

We can even add a separate Reader_Read helper so the client doesn't have to worry about the r.mtab->Read implementation detail:

// Reads len(p) bytes into p.
size_t Reader_Read(Reader r, uint8_t* p, size_t len) {
    return r.mtab->Read(r.self, p, len);
}

// Does some work reading from r.
size_t work(Reader r) {
    uint8_t buf[8];
    return Reader_Read(r, buf, sizeof(buf));
}

Nice!

Alternative: Method table in implementor

There's another approach I've seen out there. I don't like it, but it's still worth mentioning for completeness.

Instead of embedding the Reader method table in the interface, we can place it in the implementation (Zeros):

// An interface that wraps the basic Read method.
// Read reads up to len(p) bytes into p.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
} ReaderTable;

typedef ReaderTable* Reader;

// An infinite stream of zero bytes.
typedef struct {
    Reader mtab;
    size_t total;
} Zeros;

We initialize the method table in the Zeros constructor:

// Returns a new Zeros instance.
Zeros NewZeros(void) {
    static const ReaderTable impl = {
        .Read = (size_t (*)(void*, uint8_t*, size_t))Zeros_Read,
    };
    return (Zeros){
        .mtab = (Reader)&impl,
        .total = 0,
    };
}

work now takes a Reader pointer:

// Does some work reading from r.
size_t work(Reader* r) {
    uint8_t buf[8];
    return (*r)->Read(r, buf, sizeof(buf));
}

And main converts Zeros* to Reader* with a simple type cast:

int main(void) {
    Zeros z = NewZeros();

    Reader* r = (Reader*)&z;
    work(r);
    work(r);

    printf("total = %zu\n", z.total);
}

This keeps Zeros pretty lightweight, only adding one extra mtab field. But the (Reader*)&z cast only works because Reader mtab is the first field in Zeros. If we try to implement a second interface, things will break — just like in the very first solution.

I think the "method table in the interface" approach is much better.

Bonus: Type assertions

Go has an io.Copy function that copies data from a source (a reader) to a destination (a writer):

func Copy(dst Writer, src Reader) (written int64, err error)

There's an interesting comment in its documentation:

If src implements WriterTo, the copy is implemented by calling src.WriteTo(dst). Otherwise, if dst implements ReaderFrom, the copy is implemented by calling dst.ReadFrom(src).

Here's what the function looks like:

func Copy(dst Writer, src Reader) (written int64, err error) {
    // If the reader has a WriteTo method, use it to do the copy.
    // Avoids an allocation and a copy.
    if wt, ok := src.(WriterTo); ok {
        return wt.WriteTo(dst)
    }
    // Similarly, if the writer has a ReadFrom method, use it to do the copy.
    if rf, ok := dst.(ReaderFrom); ok {
        return rf.ReadFrom(src)
    }
    // The default implementation using regular Reader and Writer.
    // ...
}

src.(WriterTo) is a type assertion that checks if the src reader is not just a Reader, but also implements the WriterTo interface. The Go runtime handles these kinds of dynamic type checks.

Can we do something like this in C? I'd prefer not to make it fully dynamic, since trying to recreate parts of the Go runtime in C probably isn't a good idea.

What we can do is add an optional AsWriterTo method to the Reader interface:

// An interface that wraps the basic Read method.
// Read reads up to len(p) bytes into p.
typedef struct {
    // required
    size_t (*Read)(void* self, uint8_t* p, size_t len);
    // optional
    WriterTo (*AsWriterTo)(void* self);
} ReaderTable;

typedef struct {
    const ReaderTable* mtab;
    void* self;
} Reader;

Then we can easily check if a given Reader is also a WriterTo:

void work(Reader r) {
    // Check if r implements WriterTo.
    if (r.mtab->AsWriterTo) {
        WriterTo wt = r.mtab->AsWriterTo(r.self);
        // Use r as WriterTo...
        return;
    }
    // Use r as a regular Reader...
    return;
}

Still, this feels a bit like a hack. I'd rather avoid using type assertions unless it's really necessary.

Final thoughts

Interfaces (traits, really) in C are possible, but they're not as simple or elegant as in Go or Rust. The method table approach we discussed is a good starting point. It's memory-efficient, as type-safe as possible given C's limitations, and supports polymorphic behavior.

Here's the full source code if you are interested:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// An interface that wraps the basic Read method.
// Read reads up to len(p) bytes into p.
typedef struct {
    size_t (*Read)(void* self, uint8_t* p, size_t len);
} ReaderTable;

typedef struct {
    const ReaderTable* mtab;
    void* self;
} Reader;

// Reads len(p) bytes into p.
size_t Reader_Read(Reader r, uint8_t* p, size_t len) {
    return r.mtab->Read(r.self, p, len);
}

// An infinite stream of zero bytes.
typedef struct {
    size_t total;
} Zeros;

// Reads len(p) bytes into p.
size_t Zeros_Read(Zeros* z, uint8_t* p, size_t len) {
    for (size_t i = 0; i < len; i++) {
        p[i] = 0;
    }
    z->total += len;
    return len;
}

// Returns a Reader implementation for Zeros.
Reader Zeros_Reader(Zeros* z) {
    // The method table is only initialized once.
    static const ReaderTable impl = {
        .Read = (size_t (*)(void*, uint8_t*, size_t))Zeros_Read,
    };
    return (Reader){.mtab = &impl, .self = z};
}

// Does some work reading from r.
size_t work(Reader r) {
    uint8_t buf[8];
    return Reader_Read(r, buf, sizeof(buf));
}

int main(void) {
    Zeros z = {0};

    Reader r = Zeros_Reader(&z);
    work(r);
    work(r);

    printf("total = %zu\n", z.total);
}

Cheers!
