Exploring the Mystery of Golang Timer Precision
I. Problem Introduction: How Precise Can a Timer Be in Golang?
In the world of Golang, timers have a wide range of application scenarios. However, the question of exactly how precise they are has always been a concern for developers. This article will delve deep into the management of the timer heap in Go and the mechanism for obtaining time at runtime, thus revealing to what extent we can rely on the accuracy of timers.
II. How Go Obtains Time
(I) The Assembly Function Behind time.Now
When we call time.Now, it eventually calls the following assembly function:
// func now() (sec int64, nsec int32)
TEXT time·now(SB),NOSPLIT,$16
// Be careful. We're calling a function with gcc calling convention here.
// We're guaranteed 128 bytes on entry, and we've taken 16, and the
// call uses another 8.
// That leaves 104 for the gettime code to use. Hope that's enough!
MOVQ runtime·__vdso_clock_gettime_sym(SB), AX
CMPQ AX, $0
JEQ fallback
MOVL $0, DI // CLOCK_REALTIME
LEAQ 0(SP), SI
CALL AX
MOVQ 0(SP), AX // sec
MOVQ 8(SP), DX // nsec
MOVQ AX, sec+0(FP)
MOVL DX, nsec+8(FP)
RET
fallback:
LEAQ 0(SP), DI
MOVQ $0, SI
MOVQ runtime·__vdso_gettimeofday_sym(SB), AX
CALL AX
MOVQ 0(SP), AX // sec
MOVL 8(SP), DX // usec
IMULQ $1000, DX
MOVQ AX, sec+0(FP)
MOVL DX, nsec+8(FP)
RET
Here, in TEXT time·now(SB),NOSPLIT,$16, time·now(SB) is the symbol (address) of the function now relative to the static base, the NOSPLIT flag tells the toolchain to skip the stack-split check (the function must run within the stack it already has), and $16 is the size of the local stack frame: 16 bytes of scratch space used to hold the result written by the clock functions.
(II) Function Call Process
First, the address stored in runtime·__vdso_clock_gettime_sym(SB) is loaded; it points to the vDSO clock_gettime function. If this symbol is not zero, the address of the scratch space at the top of the stack is computed with the LEA instruction and placed in SI, and DI is set to 0 (CLOCK_REALTIME). DI and SI hold the first two arguments under the C calling convention, so this is equivalent to calling clock_gettime(0, &ret). If the symbol was never initialized, execution falls into the fallback branch and calls gettimeofday instead.
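To make that equivalence concrete, here is a minimal, Linux/amd64-only sketch that issues the same clock_gettime(CLOCK_REALTIME, &ts) call as an ordinary system call (deliberately bypassing the vDSO fast path); the clockRealtime constant is just an illustrative name for the value 0:

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

func main() {
	// clock_gettime(CLOCK_REALTIME, &ts) issued as a plain syscall.
	// The assembly shown above performs the same call through the vDSO.
	const clockRealtime = 0 // CLOCK_REALTIME
	var ts syscall.Timespec
	_, _, errno := syscall.Syscall(syscall.SYS_CLOCK_GETTIME,
		uintptr(clockRealtime), uintptr(unsafe.Pointer(&ts)), 0)
	if errno != 0 {
		panic(errno)
	}
	fmt.Printf("sec=%d nsec=%d\n", ts.Sec, ts.Nsec)
}
```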
(III) Stack Space Limitation
Go guarantees that at least 128 bytes of stack are available on entry to a function call (note that this is separate from goroutine stack growth; see _StackSmall in runtime/stack.go for details). However, once the corresponding C function has been entered, stack growth is no longer under Go's control, so the remaining 104 bytes must be enough for the call not to overflow the stack. Fortunately, the two time-fetching functions are simple, so a stack overflow generally does not occur.
(IV) VDSO Mechanism
VDSO stands for Virtual Dynamic Shared Object. It is a virtual .so provided by the kernel: it does not exist on disk but lives in the kernel and is mapped into user space. It is both a mechanism for accelerating system calls and a compatibility layer. For functions like gettimeofday, a normal system call would cause a large number of context switches, which is expensive for programs that read the time frequently. With VDSO, the kernel maps a dedicated region into the user address space that exposes a few selected system calls, and the kernel itself decides the concrete calling method (syscall, int 0x80, or sysenter), avoiding compatibility problems between the glibc version and the kernel version. VDSO is also an upgraded version of vsyscall: it fixes some of its security issues, and the mapping is no longer at a fixed static address.
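As a rough illustration of why this matters, the following sketch measures the average cost of one time.Now call in a tight loop; on Linux/amd64, where time.Now goes through the vDSO, the per-call cost is typically a few tens of nanoseconds, far cheaper than a real system call (the exact numbers depend on hardware and clock source):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Measure the average cost of one time.Now call. Thanks to the vDSO
	// path there is no user/kernel context switch on Linux/amd64.
	const n = 1000000
	start := time.Now()
	for i := 0; i < n; i++ {
		_ = time.Now()
	}
	fmt.Printf("average cost of one time.Now call: %v\n", time.Since(start)/n)
}
```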
(V) The Update Mechanism for Time Acquisition in the Kernel
As can be seen in the kernel, the time value returned by these calls is updated by the timer interrupt; the call stack is as follows [5]:
Hardware timer interrupt (generated by the Programmable Interrupt Timer - PIT)
-> tick_periodic();
-> do_timer(1);
-> update_wall_time();
-> timekeeping_update(tk, false);
-> update_vsyscall(tk);
update_wall_time uses the time from the clock source, whose precision can reach the ns level. However, the Linux kernel's timer interrupt usually runs at 100 Hz, and in some configurations up to 1000 Hz; in other words, the time base is updated once every 10 ms or 1 ms during interrupt handling. From the operating system's point of view that base value has roughly ms granularity, but it is only a base: every time the time is read, the clock source is consulted again (there are many kinds of clock sources, such as a hardware counter or the jiffies driven by the interrupt, and they generally reach ns resolution), so the precision of a single reading lies between a few hundred ns and the µs level. Theoretically, for even more precise time, the CPU cycle counter has to be read directly with the rdtsc instruction.
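A small probe of this from user space: sample time.Now repeatedly and record the smallest positive step between two consecutive readings. On a typical Linux machine with a TSC clock source this comes out far below the 1–10 ms tick period, which is consistent with the picture above (a sketch; actual values vary with the clock source):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Find the smallest observable step of the clock as seen from Go.
	// The kernel's time base only advances once per tick (1-10 ms), but
	// each reading consults the clock source, so the step is usually sub-µs.
	min := time.Duration(1 << 62)
	prev := time.Now()
	for i := 0; i < 1000000; i++ {
		now := time.Now()
		if d := now.Sub(prev); d > 0 && d < min {
			min = d
		}
		prev = now
	}
	fmt.Println("smallest observed clock step:", min)
}
```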
(VI) The Search and Linking of Function Symbols
The lookup of the time-function symbols involves ELF, that is, dynamic linking: the addresses of the function symbols in the vDSO's .so are resolved and stored in function pointers such as __vdso_clock_gettime_sym. Other functions, such as TEXT runtime·nanotime(SB),NOSPLIT,$16, go through a similar process; that function obtains the time as well (it reads the monotonic clock used internally by the runtime).
III. The Management of the Timer Heap by the Go Runtime
(I) The timer Structure
// Package time knows the layout of this structure.
// If this struct changes, adjust ../time/sleep.go:/runtimeTimer.
// For GOOS=nacl, package syscall knows the layout of this structure.
// If this struct changes, adjust ../syscall/net_nacl.go:/runtimeTimer.
type timer struct {
i int // heap index
// Timer wakes up at when, and then at when+period, ... (period > 0 only)
// each time calling f(now, arg) in the timer goroutine, so f must be
// a well-behaved function and not block.
when int64
period int64
f func(interface{}, uintptr)
arg interface{}
seq uintptr
}
Timers are managed in the form of a heap. A heap is a complete binary tree and can be stored in an array; i is the timer's index within that heap. when is the time at which the timer fires, and period is the interval between firings: the next firing time is when + period, and so on. On each firing the callback f is invoked in the timer goroutine (timerproc calls it as f(arg, seq), as shown below), so f must be well behaved and must not block.
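For reference, this is roughly how the two common user-level constructors map onto these fields (a sketch stated in terms of the struct above): time.NewTimer sets only when, while time.NewTicker also sets period, so the runtime re-arms the timer instead of removing it from the heap.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// NewTimer -> one runtime timer with `when` set and period == 0:
	// it fires once and is then removed from the heap.
	timer := time.NewTimer(50 * time.Millisecond)

	// NewTicker -> a runtime timer with period > 0: after each firing the
	// runtime advances `when` by `period` and re-sifts it in the heap.
	ticker := time.NewTicker(20 * time.Millisecond)
	defer ticker.Stop()

	for i := 0; i < 3; i++ {
		select {
		case t := <-timer.C:
			fmt.Println("one-shot timer fired at", t.Format("15:04:05.000"))
		case t := <-ticker.C:
			fmt.Println("ticker tick at", t.Format("15:04:05.000"))
		}
	}
}
```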
(II) The timers Structure
var timers struct {
lock mutex
gp *g
created bool
sleeping bool
rescheduling bool
waitnote note
t []*timer
}
The entire timer heap is managed by timers. gp points to a G structure in the scheduler, i.e., the state of a goroutine; here it refers to the dedicated time-manager goroutine, which is started by the runtime only once a timer is actually used. lock guarantees thread safety for timers, and waitnote acts as a condition variable for sleeping and wakeup.
(III) The addtimer Function
func addtimer(t *timer) {
lock(&timers.lock)
addtimerLocked(t)
unlock(&timers.lock)
}
The addtimer function is the entry point of the entire timer mechanism. It simply takes the lock and then calls addtimerLocked.
(IV) The addtimerLocked Function
// Add a timer to the heap and start or kick the timer proc
// if the new timer is earlier than any of the others.
// Timers are locked.
func addtimerLocked(t *timer) {
// when must never be negative; otherwise timerproc will overflow
// during its delta calculation and never expire other runtime·timers.
if t.when < 0 {
t.when = 1<<63 - 1
}
t.i = len(timers.t)
timers.t = append(timers.t, t)
siftupTimer(t.i)
if t.i == 0 {
// siftup moved to top: new earliest deadline.
if timers.sleeping {
timers.sleeping = false
notewakeup(&timers.waitnote)
}
if timers.rescheduling {
timers.rescheduling = false
goready(timers.gp, 0)
}
}
if !timers.created {
timers.created = true
go timerproc()
}
}
In addtimerLocked, if the timer-management goroutine has not been created yet, a timerproc goroutine is started.
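A tiny probe of this lazy start, assuming a Go version that still uses the single timerproc goroutine shown above (newer runtimes handle timers per P and no extra goroutine appears, so the two counts may be equal there):

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	fmt.Println("goroutines before the first timer:", runtime.NumGoroutine())

	// Creating the first timer goes through addtimer -> addtimerLocked,
	// which starts the timer-management goroutine on first use.
	t := time.NewTimer(time.Hour)
	defer t.Stop()

	fmt.Println("goroutines after the first timer:", runtime.NumGoroutine())
}
```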
(V) The timerproc Function
// Timerproc runs the time-driven events.
// It sleeps until the next event in the timers heap.
// If addtimer inserts a new earlier event, addtimer1 wakes timerproc early.
func timerproc() {
timers.gp = getg()
for {
lock(&timers.lock)
timers.sleeping = false
now := nanotime()
delta := int64(-1)
for {
if len(timers.t) == 0 {
delta = -1
break
}
t := timers.t[0]
delta = t.when - now
if delta > 0 {
break
}
if t.period > 0 {
// leave in heap but adjust next time to fire
t.when += t.period * (1 + -delta/t.period)
siftdownTimer(0)
} else {
// remove from heap
last := len(timers.t) - 1
if last > 0 {
timers.t[0] = timers.t[last]
timers.t[0].i = 0
}
timers.t[last] = nil
timers.t = timers.t[:last]
if last > 0 {
siftdownTimer(0)
}
t.i = -1 // mark as removed
}
f := t.f
arg := t.arg
seq := t.seq
unlock(&timers.lock)
if raceenabled {
raceacquire(unsafe.Pointer(t))
}
f(arg, seq)
lock(&timers.lock)
}
if delta < 0 || faketime > 0 {
// No timers left - put goroutine to sleep.
timers.rescheduling = true
goparkunlock(&timers.lock, "timer goroutine (idle)", traceEvGoBlock, 1)
continue
}
// At least one timer pending. Sleep until then.
timers.sleeping = true
noteclear(&timers.waitnote)
unlock(&timers.lock)
notetsleepg(&timers.waitnote, delta)
}
}
The main logic of timerproc is to take the earliest timer from the top of the min-heap and call its callback. If period is greater than 0, the timer's when is advanced and the heap is re-adjusted; otherwise the timer is removed from the heap. After handling the expired timers it sleeps on an OS-level note (semaphore) until the next deadline, and it can also be woken up earlier through waitnote. When no timers are left, the goroutine behind the G structure parks itself, and the M (the OS thread hosting it) looks for other runnable goroutines to execute.
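This is also why, at the library level, time.AfterFunc does not register your function directly as the runtime callback f: the registered f is a small wrapper that starts your function in a new goroutine, so user code that blocks cannot stall the timer goroutine. A short sketch of what that means in practice:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	done := make(chan struct{})

	// The runtime-level callback registered by AfterFunc only spawns a new
	// goroutine; the user function below runs in that goroutine, so it may
	// block without holding up other timers.
	time.AfterFunc(10*time.Millisecond, func() {
		time.Sleep(50 * time.Millisecond) // blocking here is safe at user level
		close(done)
	})

	start := time.Now()
	<-done
	fmt.Println("callback finished after", time.Since(start).Round(time.Millisecond))
}
```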
(VI) The Wake-up Mechanism in addtimerLocked
When a new timer is added, a check is performed: if the newly inserted timer ends up at the top of the heap (i.e., it has the earliest deadline), the sleeping timer goroutine is woken up so that it re-checks the heap for expired timers and runs them. The sleep and wake-up paths come in two states: timers.sleeping means the M is asleep on an OS semaphore and is woken with notewakeup, while timers.rescheduling means the G has been parked by the scheduler; in that case the M is not sleeping, and goready simply makes the G runnable again. Timer expirations and the insertion of new timers together drive the timer machinery at runtime.
IV. Factors Affecting Timer Precision
Looking back at the initial question “How precise can a timer be?”, it is actually affected by two factors:
(I) The Time Granularity of the Operating System Itself
Generally this is at the µs level: the kernel's time base is only updated at the ms level (once per tick interrupt), but every reading consults the clock source directly, so the precision of a single reading can reach the µs level.
(II) The Scheduling Problem of the Timer’s Own Goroutine
If the runtime's load is too high, or the load on the operating system itself is too high, the timer goroutine may not be scheduled promptly and the timer will not fire on time. For example, a 20 ms timer and a 30 ms timer may appear to fire at the same moment, especially in container environments where a cgroup CPU quota leaves very little CPU time. Therefore, we should not rely too heavily on timer precision for the correct operation of a program. The documentation of NewTimer also stresses this: "NewTimer creates a new Timer that will send the current time on its channel after at least duration d." In other words, nobody guarantees that a timer fires exactly on time. Of course, if the time interval is large, this effect can be ignored.
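To see both factors at once, a small experiment: ask for a 20 ms timer repeatedly and print how late it actually fires. On an idle machine the error is typically well under a millisecond; under heavy CPU load or a tight cgroup CPU quota it can grow to many milliseconds (a sketch, results depend entirely on the environment):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const d = 20 * time.Millisecond
	for i := 0; i < 5; i++ {
		start := time.Now()
		<-time.After(d) // fires after *at least* d, never exactly on time
		late := time.Since(start) - d
		fmt.Printf("asked for %v, fired %v late\n", d, late)
	}
}
```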
Leapcell: The Next-Gen Serverless Platform for Golang app Hosting
Finally, I would like to recommend the most suitable platform for deploying Go services: Leapcell
1. Multi-Language Support
- Develop with JavaScript, Python, Go, or Rust.
2. Deploy unlimited projects for free
- pay only for usage — no requests, no charges.
3. Unbeatable Cost Efficiency
- Pay-as-you-go with no idle charges.
- Example: $25 supports 6.94M requests at a 60ms average response time.
4. Streamlined Developer Experience
- Intuitive UI for effortless setup.
- Fully automated CI/CD pipelines and GitOps integration.
- Real-time metrics and logging for actionable insights.
5. Effortless Scalability and High Performance
- Auto-scaling to handle high concurrency with ease.
- Zero operational overhead — just focus on building.
Explore more in the documentation!
Leapcell Twitter: https://x.com/LeapcellHQ