Practical Engineering
  • Working Knowledge of Linux Memory: Concepts

    Aug 4, 2025 · 6 min read · Linux

    I recently dealt with a server livelock issue caused by memory page thrashing. This post refreshes the Linux memory basics I found useful for debugging the issue. Much of the content is from Chapter 7 of Systems Performance: Enterprise and the Cloud. Virtual Memory Virtual memory is an abstraction that provides each …


  • Detect and fix rare cases where the primary ENI does not serve default traffic

    Jul 27, 2025 · 3 min read · Linux EC2

    During testing, we encountered a rare scenario when launching EC2 instances with multiple ENIs: the primary ENI (device index 0) does not serve default network traffic. This occurs in approximately 1 out of 10,000 launches (0.01%). For example, when configuring two ENIs on an instance—ENI-0 (deviceIndex=0) from …


  • SELinux Concepts

    Jun 15, 2025 · 5 min read · Linux SELinux

    Security-Enhanced Linux (SELinux) is a mandatory access control (MAC) system that enhances Linux security. "Mandatory" means access control is strictly enforced by predefined policy rules—users and processes cannot modify these rules at will, ensuring security is not left to individual discretion. SELinux is …


  • Modern Go idioms

    May 18, 2025 · 4 min read · Go

    Go is known for its backward compatibility, simplicity, and six-month release cycle. But that can sometimes lead to code that works but isn’t as readable and modern as it could be. This post is a living document where I note modern Go idioms I’ve applied in production codebases to improve clarity and maintainability. …


  • A Few Shell Surprises

    Apr 22, 2025 · 3 min read · Linux Shell

    Shell scripts are infamous for security issues and surprising behavior, so when possible, it's better to avoid using shell. For instance, we built a container platform using the Bottlerocket OS, and we didn't even install a shell. If someone needs to run a shell, it must be run inside a container. That said, shell is …


  • x509: certificate signed by unknown authority? Maybe the cert pool is empty

    Apr 15, 2025 · 6 min read · Linux Container SELinux Bottlerocket

    I recently worked on getting amazon-ssm-agent to run inside containers on Bottlerocket. During that process, I ran into a TLS issue connecting to amazonaws.com. The root cause turned out to be interesting, and we'll walk through it in this post. Running amazon-ssm-agent in a container: why and how? To enable sessions …


  • Sharp edges of errgroup: Lessons from an errgroup and Context mishap

    Mar 23, 2025 · 8 min read · Go Concurrency

    A recent faulty release disrupted service for some customers. The root cause was a concurrency bug involving x/sync/errgroup and context cancellation. This post shares three practices we learned from the incident. These practices will help us catch similar issues during code review or alert us to problems in …


  • Avoid panic on expected errors: lessons from operating journald-to-cwl

    Feb 23, 2025 · 3 min read · Go

    We've been using journald-to-cwl to ship journal logs from EC2 instances to CloudWatch Logs. It is lightweight and reliable. However, we recently started receiving false-positive alarms, which became annoying. This post covers the changes we made and the key lesson learned: panicking on expected errors in Go is …


  • GPG is still in use to verify downloads

    Feb 23, 2025 · 2 min read · Linux Cryptography

    This week, I needed to install the Amazon SSM Agent and was surprised to find that GPG (GNU Privacy Guard) was the only way to verify the download. I had assumed that software download verification had largely transitioned to PKI (Public Key Infrastructure). This short post is a refresher on GPG. OpenPGP is an open …


  • Debug systemd race condition with reboot loop

    Jan 20, 2025 · 1 min read · Linux Bottlerocket

    Hello! https://github.com/bcressey/bottlerocket/commits/debug-unified-fips/ https://github.com/bcressey/bottlerocket/commit/a2f3ef75b080d3cce1b077e9bc313bc0126c70c4



Peng Zhang

Software Engineer

Recent Posts

  • Simplify device path on boot with udev
  • Use KillMode=process with caution: restart loop could deplete resources
  • Spawning a New Process for Socket-Activated Daemons is Error-Prone
  • Be careful making thread-aware syscalls in Go: lock the thread
  • Speed up building Bottlerocket image in AWS CodeBuild
  • Mysterious Image Pull Failures: "401 Unauthorized" and "Not Found" After Migrating Containerd to v2
  • EC2 IMDS is Unstable During Early Boot: Always Retry
  • Who Modified My Program in Bottlerocket?


Copyright 2022-  PENG ZHANG. All Rights Reserved
