Go's escape analysis and why my function return worked


Original link: https://bonniesimon.in/blog/go-escape-analysis

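## Go's memory management and return values

Programmers coming from C/C++ may be puzzled by what Go lets a function return, because Go avoids the stack/heap pitfalls that are common in C. In C, returning a pointer to a local variable produces a dangling pointer and undefined behavior. Go, however, performs **escape analysis** at compile time. If a variable's lifetime extends beyond the function that defines it (for example, when a reference to it is returned), the compiler automatically allocates it on the **heap** instead of the stack. Go's **garbage collector** then manages that heap memory and keeps it valid for as long as it is referenced. Importantly, when you return a slice in Go, you are actually returning a *copy* of the slice header (a pointer to the underlying array, plus the length and the capacity). That header points to a heap-allocated array, which persists until nothing references it anymore. This is in sharp contrast with C, where a pointer to a stack variable becomes invalid as soon as the function returns. Go's design prioritizes safety and simplifies memory management, which leads to cleaner code without the complexity of C's manual allocation and deallocation.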

Original text

Recently I came across something interesting in Go. I wrote the following code to read logs from a log file on the server.

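func readLogsFromPartition(partition int) []LogEntry {
  // LogEntry and ReadFromFile are defined elsewhere (not shown here)
  var logs []LogEntry
  logs = ReadFromFile()

  return logs // a value created in this function, handed back to the caller
}

func main() {
  logs := readLogsFromPartition(1)
  fmt.Println(len(logs)) // use logs; Go rejects unused local variables
}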

I compiled the program and ran it, and it worked. But after I took a step back and looked at it, I couldn't make sense of why it worked. Why am I able to return a value that was created inside a function back to main?

If it isn't obvious why I was confused, here is some background. Before my Go phase, I was trying to get back into writing C, and I spent a few head-scratching days trying to understand the stack vs. the heap.

From my understanding of C/C++, the equivalent of the above code would blow up. In C, you can't create a value in a function's local storage and then return a pointer to it. When the function returns, its stack frame is released, so the returned pointer refers to memory that can be overwritten at any time, and reading through it gives garbage.

Let's see an equivalent example of this in C:

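#include <stdio.h>

int* readLogsFromPartition(void) {
    int logs[5] = {101, 102, 103, 104, 105};  // local array, lives on the stack
    return logs;  // BUG: returns the address of a local, which dangles after return
}

int main(void) {
    int* logs = readLogsFromPartition();

    printf("First log: %d\n", logs[0]);  // undefined behavior: the array is already gone

    // This call may reuse the stack space the array used to occupy
    printf("Reading logs...\n");

    // Undefined behavior: may print garbage, stale values, or crash
    for (int i = 0; i < 5; i++) {
        printf("Log %d: %d\n", i, logs[i]);
    }

    return 0;
}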

As you can see, logs is a local array defined in readLogsFromPartition, so it lives on the stack. When readLogsFromPartition returns, its stack frame is popped and that memory is free to be reused by later calls. So in main, logs points to memory that no longer holds a valid array. Reading through it is undefined behavior: we might get garbage values, or the program might crash.

In C, to avoid this, you'd declare the variable in the calling function (on the stack or the heap) and pass a pointer to it into the function. The function then writes the values through that pointer.

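#include <stdio.h>

// Fills a buffer owned by the caller, so nothing local is returned
void readLogsFromPartition(int* logs, int size) {
    for (int i = 0; i < size; i++) {
        logs[i] = 100 + i;  // write through the caller's pointer
    }
}

int main(void) {
    int logs[5];  // storage lives in main's stack frame

    // The array decays to a pointer to its first element
    readLogsFromPartition(logs, 5);

    for (int i = 0; i < 5; i++) {
        printf("Log %d: %d\n", i, logs[i]);
    }

    return 0;
}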

Or you can make the function allocate on the heap and return a pointer.

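#include <stdio.h>
#include <stdlib.h>

// Allocates the array on the heap; the caller owns it and must free it
int* readLogsFromPartition(int size) {
    int* logs = malloc(size * sizeof(int));  // heap memory outlives this call
    if (logs == NULL) {
        return NULL;  // allocation failed
    }

    for (int i = 0; i < size; i++) {
        logs[i] = 100 + i;
    }

    return logs;  // safe: points to the heap, not to this stack frame
}

int main(void) {
    int* logs = readLogsFromPartition(5);

    if (logs == NULL) {
        printf("Allocation failed!\n");
        return 1;
    }

    for (int i = 0; i < 5; i++) {
        printf("Log %d: %d\n", i, logs[i]);
    }

    free(logs);  // manual cleanup; there is no garbage collector in C
    return 0;
}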

Now you can understand why my understanding of C would leave me scratching my head when I saw the Go code.

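func readLogsFromPartition(partition int) []LogEntry {
  // LogEntry and ReadFromFile are defined elsewhere (not shown here)
  var logs []LogEntry
  logs = ReadFromFile()

  return logs // a value created in this function, handed back to the caller
}

func main() {
  logs := readLogsFromPartition(1)
  fmt.Println(len(logs)) // use logs; Go rejects unused local variables
}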

Go uses escape analysis at compile time. When the compiler sees that the data behind a variable (like logs) is still reachable after the function returns, it recognizes that the data "escapes" the function scope. When a variable escapes:

- Go allocates it on the heap (not the stack)
- The garbage collector manages its lifetime

The memory stays alive as long as something references it. That is what happens in code like this:


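// inside readLogsFromPartition
var logs []LogEntry       // slice header only; no backing array yet
logs = append(logs, log)  // append allocates the backing array
return logs               // returns a copy of the slice header

// in main
logs := readLogsFromPartition(1)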
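To make the fragment above concrete, here is a minimal, self-contained sketch. The fields of `LogEntry` and the extra `lines` parameter are assumptions made for illustration: the original `LogEntry` and `ReadFromFile` aren't shown, so this version reads from an in-memory slice of strings instead of a file.

```go
package main

import "fmt"

// LogEntry is a hypothetical stand-in for the article's type; the real
// fields are not shown in the original post.
type LogEntry struct {
	Partition int
	Message   string
}

// readLogsFromPartition builds a slice locally and returns it.
// The backing array created by append must outlive this call, so the
// compiler does not tie it to this function's stack frame (it typically
// ends up on the heap).
func readLogsFromPartition(partition int, lines []string) []LogEntry {
	var logs []LogEntry // nil slice: no backing array yet
	for _, line := range lines {
		logs = append(logs, LogEntry{Partition: partition, Message: line})
	}
	return logs // copies the slice header; the array itself is not copied
}

func main() {
	logs := readLogsFromPartition(1, []string{"boot", "ready"})
	for i, entry := range logs {
		fmt.Printf("Log %d: partition=%d message=%s\n", i, entry.Partition, entry.Message)
	}
}
```

Building with `go build -gcflags=-m` prints the compiler's escape-analysis decisions, which is one way to check where a given value ends up.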

A slice in Go is actually a small struct (called a "slice header") containing:

- A pointer to the underlying array
- The length
- The capacity

[Image: the underlying structure of a slice in Go]

When you return logs, you're returning a copy of this slice header, but the pointer still points to the same underlying array on the heap. The garbage collector won't free that array until nothing references it anymore.
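The header-copy behavior can be observed directly. The sketch below uses a made-up `makeLogs` helper and plain ints rather than the article's `LogEntry`; it shows that two copies of a header share one backing array while keeping their own length.

```go
package main

import "fmt"

// makeLogs returns a slice whose backing array is created inside the function.
func makeLogs() []int {
	logs := make([]int, 3, 8) // length 3, capacity 8
	logs[0], logs[1], logs[2] = 101, 102, 103
	return logs
}

func main() {
	a := makeLogs()
	b := a // copies only the header: pointer, length, capacity

	b[0] = 999         // writes through the shared pointer, so a sees it too
	b = append(b, 104) // fits within the capacity: same array, only b's length grows

	fmt.Println(a[0], len(a), cap(a)) // 999 3 8
	fmt.Println(b[0], len(b), cap(b)) // 999 4 8
}
```

Because the pointer is shared, a write through `b` is visible through `a`, while the length stored in each header stays independent.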

In C, returning a pointer to a local stack variable would be undefined behavior (dangling pointer). But Go's escape analysis and garbage collection specifically prevent this problem; it's one of Go's safety features.
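As a rough Go counterpart to the dangling-pointer C program earlier, the sketch below returns the address of a local array. The `newLogs` name is made up for this example. Whether the array actually lands on the heap depends on the compiler (inlining can change the decision), but the program's behavior is well defined either way.

```go
package main

import "fmt"

// newLogs returns the address of a local array. The equivalent C code
// dangles; in Go the array stays valid for as long as the pointer is
// reachable (the compiler moves it to the heap if it needs to).
func newLogs() *[5]int {
	logs := [5]int{101, 102, 103, 104, 105}
	return &logs
}

func main() {
	logs := newLogs()

	fmt.Println("First log:", logs[0])

	// Unrelated calls in between do not invalidate the pointer.
	fmt.Println("Reading logs...")

	for i, v := range logs {
		fmt.Printf("Log %d: %d\n", i, v)
	}
}
```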

So the next time you return a slice from a function in Go, know that the compiler has your back. You get the clean syntax without the danger.

claude-4.5 opus (answered my initial query about why the code worked, which triggered the hunt for understanding)

https://go.dev/blog/go-slices-usage-and-internals

https://go.dev/doc/faq#stack_or_heap
