FreeBSD VNET Jails Networking

by Eric Fortis

Uxtly runs on two servers, each with three jails. The database and application-server jails communicate privately, while the reverse-proxy jails listen on the public internet.

[Diagram: Server A and Server B, each with a Rev. Proxy, App Server, and Database jail]

Each server has:

Two “virtual switches” (if_bridge):

  • ibridge is for replicating the database, and for the application servers to target their peer database.
  • xbridge is for passing incoming traffic to the reverse proxy jail and public outgoing traffic from all the jails.

Two network interfaces:

  • xnic, which faces the public internet.
  • inic, a private NIC for server-to-server traffic (database replication).

Seven virtual cables with a VNIC at either end (epair):

  • Two jail‑to‑jail (orange lines)
  • Five jail‑to‑bridge
[Diagram: VNIC labels. Server A: xnic 192.0.2.155, inic 192.168.56.110; Server B: xnic 192.0.2.165, inic 192.168.56.111. On each host: xbridge (10.0.2.1/24) and ibridge. Jail VNICs: nginx_j has ngx_b 10.0.2.20 and ngx_node_a 10.0.3.20; node_j has node_b 10.0.2.30, ngx_node_b 10.0.3.30, node_pg_a 10.0.4.30, and inode_b (192.168.56.30 on A, .31 on B); pg_j has pg_b 10.0.2.40, node_pg_b 10.0.4.40, and ipg_b (192.168.56.40 on A, .41 on B).]
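For intuition, here's how one of those cables could be created and plugged in by hand (a non-persistent sketch typed in the host shell; the persistent rc.conf form appears below):

ifconfig epair0 create        # makes both ends: epair0a and epair0b
ifconfig epair0a name ngx_a   # host end
ifconfig epair0b name ngx_b   # end that will be delegated to the nginx jail
ifconfig xbridge addm ngx_a   # plug the host end into the virtual switch
ifconfig ngx_a up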

Also, each server has four encrypted tunnels (spiped). In the diagram the arrows point to the pipe’s server‑end.

[Diagram: spiped tunnels. Four tunnels per server between the inode_b and ipg_b addresses (192.168.56.30/.31/.40/.41), using ports 5432–5435 on each end.]
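Each arrow is one spiped(1) process. As an illustration, Server A's encrypting end of the Node-to-peer-database tunnel (configured via rc.conf later in this post) is equivalent to running:

spiped -e -s '[192.168.56.30]:5432' -t '[192.168.56.41]:5433' -k /etc/spiped.key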

Load Balancing

The DNS provider has an ‘A’ record per server, so requests are load balanced free of charge by DNS round-robin.

Type  Name       Content      TTL
A     uxtly.com  192.0.2.155  Auto
A     uxtly.com  192.0.2.165  Auto
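You can verify that both records are being served with drill(1) from FreeBSD's base system; most resolvers rotate the answer order between queries, which is the round-robin effect:

drill uxtly.com A   # the answer section should list both A records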

To shut down a server, remove its ‘A’ record and wait for the TTL to expire. In Cloudflare®, that time-to-live is 5 minutes by default (Auto TTL).

By the way, for targeting a particular server by its IP, see this health‑checking post.

Configurations

The configuration files (*.conf) are identical on both servers. To make that possible, they read server-specific settings from separate files. For example, Server A’s /etc/inic contains 192.168.56.110/24.
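For example, seeding Server A’s files could look like this (the inic value is from the text above; the xnic netmask and gateway address are placeholders, not the real values):

echo '192.168.56.110/24' > /etc/inic     # from the example above
echo '192.0.2.155/24'    > /etc/xnic     # public IP from the diagram; netmask assumed
echo '192.0.2.1'         > /etc/gateway  # hypothetical gateway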

Likewise, the hardware- and company-specific parts are highlighted in green.


/etc/rc.conf

This rc.conf specifies how to create and interconnect the bridges and epairs when the host boots up.

About rc.conf’s ifconfig syntax

Both of the next two lines rename an interface (bridge0). The first shows how it’s typed in the shell, which doesn’t persist across reboots; the second is the persistent rc.conf form.

ifconfig bridge0 name xbridge
ifconfig_bridge0_name=xbridge

# Rename and configure the NICs 
ifconfig_em0_name=xnic
ifconfig_em1_name=inic
ifconfig_xnic="`/bin/cat /etc/xnic`"
ifconfig_inic="`/bin/cat /etc/inic` vlanhwtag 2222"

defaultrouter=`/bin/cat /etc/gateway`

# Create the virtual interfaces e.g., ifconfig bridge0 create
cloned_interfaces="
bridge0
bridge1
epair0
epair1
epair2
epair3
epair4
epair5
epair6
"
# Rename the bridges and VNICs
ifconfig_bridge0_name=xbridge
ifconfig_bridge1_name=ibridge
ifconfig_epair0a_name=ngx_a
ifconfig_epair0b_name=ngx_b
ifconfig_epair1a_name=node_a
ifconfig_epair1b_name=node_b
ifconfig_epair2a_name=pg_a
ifconfig_epair2b_name=pg_b
ifconfig_epair3a_name=ngx_node_a
ifconfig_epair3b_name=ngx_node_b
ifconfig_epair4a_name=node_pg_a
ifconfig_epair4b_name=node_pg_b
ifconfig_epair5a_name=inode_a
ifconfig_epair5b_name=inode_b
ifconfig_epair6a_name=ipg_a
ifconfig_epair6b_name=ipg_b

# Enable the VNICs that we don't assign IPs to
ifconfig_ngx_a=up
ifconfig_node_a=up
ifconfig_pg_a=up
ifconfig_ngx_node_a=up
ifconfig_node_pg_a=up
ifconfig_inode_a=up
ifconfig_ipg_a=up

# Connect the NICs to the bridges e.g., ifconfig ibridge addm inic
ifconfig_ibridge="
addm inic
addm inode_a
addm ipg_a
"
ifconfig_xbridge="
10.0.2.1/24
addm ngx_a
addm node_a
addm pg_a
"
# Services
jail_enable=YES
jail_reverse_stop=YES
gateway_enable=YES
pf_enable=YES
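These settings take effect at boot. To apply them on a running host without rebooting, you can rerun the standard rc network scripts (note this briefly interrupts connectivity):

service netif restart
service routing restart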

/etc/jail.conf

The jails boot up in the order they appear in this file. This file also specifies which VNICs the host will delegate to the jails.

In addition, since PostgreSQL requires System V shared memory, sysvshm = "new" gives the database jail its own namespaced instance of it.

path = "/jails/$name";

devfs_ruleset = 4;
mount.devfs;
vnet;

exec.clean;
exec.start = "/bin/sh /etc/rc";
exec.stop  = "/bin/sh /etc/rc.shutdown";

pg_j {
  sysvshm = "new";
  vnet.interface = pg_b, node_pg_b, ipg_b;
}

node_j {
  vnet.interface = node_b, ngx_node_b, inode_b, node_pg_a;
}

nginx_j {
  vnet.interface = ngx_b, ngx_node_a;
}
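After the jails start, each one should see only its delegated VNICs (plus its own lo0). A quick check from the host, with the expected names taken from node_j’s entry above (order may differ):

jls                        # lists the running jails
jexec node_j ifconfig -l   # expect: lo0 node_b ngx_node_b inode_b node_pg_a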

Jails’ rc.conf Files

This section has the jails’ boot-time configurations. These rc.conf files set up the VNICs, gateway, tunnels, and services.

Keep in mind that at this stage we are inside the jail, so the file paths are rooted at /jails/<name>. For example, on the host, the Nginx jail’s /etc/hostname lives at /jails/nginx_j/etc/hostname.

/jails/nginx_j/etc/rc.conf

hostname=`/bin/cat /etc/hostname`

ifconfig_ngx_b=10.0.2.20/24
ifconfig_ngx_node_a=10.0.3.20/24

defaultrouter=10.0.2.1

nginx_enable=YES

/jails/node_j/etc/rc.conf

hostname=`/bin/cat /etc/hostname`

ifconfig_node_b=10.0.2.30/24
ifconfig_ngx_node_b=10.0.3.30/24
ifconfig_inode_b=`/bin/cat /etc/ip_inode_b`/24
ifconfig_node_pg_a=10.0.4.30/24

defaultrouter=10.0.2.1

# Tunnel to Peer Database
spiped_enable=YES
spiped_pipes=N2P

spiped_pipe_N2P_mode=encrypt
spiped_pipe_N2P_source="[`/bin/cat /etc/ip_inode_b`]:5432"
spiped_pipe_N2P_target="[`/bin/cat /etc/ip_peer_ipg_b`]:5433"
spiped_pipe_N2P_key=/etc/spiped.key

node_enable=YES

/jails/pg_j/etc/rc.conf

hostname=`/bin/cat /etc/hostname`

ifconfig_node_pg_b=10.0.4.40/24
ifconfig_pg_b=10.0.2.40/24
ifconfig_ipg_b=`/bin/cat /etc/ip_ipg_b`/24

defaultrouter=10.0.2.1

# Tunnels
spiped_enable=YES
spiped_pipes="N2P PGS PGC"

spiped_pipe_N2P_mode=decrypt
spiped_pipe_N2P_source="[`/bin/cat /etc/ip_ipg_b`]:5433"
spiped_pipe_N2P_target="[`/bin/cat /etc/ip_ipg_b`]:5432"
spiped_pipe_N2P_key=/etc/spiped.key

spiped_pipe_PGS_mode=decrypt
spiped_pipe_PGS_source="[`/bin/cat /etc/ip_ipg_b`]:5434"
spiped_pipe_PGS_target="[`/bin/cat /etc/ip_ipg_b`]:5432"
spiped_pipe_PGS_key=/etc/spiped.key

spiped_pipe_PGC_mode=encrypt
spiped_pipe_PGC_source="[`/bin/cat /etc/ip_ipg_b`]:5435"
spiped_pipe_PGC_target="[`/bin/cat /etc/ip_peer_ipg_b`]:5434"
spiped_pipe_PGC_key=/etc/spiped.key

postgresql_enable=YES

Firewall (pf)

This firewall configuration denies all traffic by default and then allows what’s needed. It assumes xnic has a public IP; if not (for example, when simulating the infrastructure locally), remove its subnet from the <martians> table.

Allowed Incoming Traffic

  • Only the Nginx jail is open to the public.
  • SSHing to the host or jails is only possible from IPs listed in the <xpeers> table.

Rate Limits

If an IP connects 100 times within 10 seconds, it’s blocked: not only from making new connections, but also from using established ones, since flush global terminates them.

Effectively, that IP gets added to the <ratelimit> table, so you can use a cron job to re-allow those IPs. For example, to remove IPs that are at least three minutes old from that table:

/sbin/pfctl -t ratelimit -T expire 180
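And a minimal /etc/crontab entry to run it (the every-minute interval is an arbitrary choice):

# minute hour mday month wday who  command
*        *    *    *     *    root /sbin/pfctl -t ratelimit -T expire 180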

How to exclude a company from rate‑limits? Create a table with their IPs, and add a pass in quick rule before the block in quick one (think of quick as an early return). For instance, if Cloudflare® is proxying your traffic, populate a <bypass> table with their IPs.

pass in quick on xnic proto tcp from <bypass> to port 443
block in quick …
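The <bypass> table itself would be declared next to the others in pf.conf, for example loading Cloudflare’s published ranges from a file (the filename is hypothetical):

table <bypass> file "/etc/ips_cloudflare"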

Allowed Outgoing Traffic

Aside from traffic for patching the jails, only the Node jail can initiate connections to the internet. It can connect to Stripe® and Fastmail®, and since those connections need DNS, the Cloudflare® and Google® resolvers are allowed as well.

NTP is allowed via inic to private NTP servers.

How to deploy to these servers? Copy the files over with rsync. By the way, the orchestration server’s IP must be in the <xpeers> table.
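For instance, a deploy to the Node jail can go through its redirected SSH port from the pf.conf below (the user and paths are hypothetical):

rsync -az --delete -e 'ssh -p 2230' ./dist/ deploy@192.0.2.155:/usr/local/app/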

How to generate TLS certificates? As with deploying, see this post: Securely creating TLS certificates with Let’s Encrypt.

How to extract backups? rsync from the backup/orchestration server too.

The general answer to those questions is: initiate the connection from an external orchestration server.


/etc/pf.conf

xbridge_net = "10.0.2.0/24"
nginx_j     = "10.0.2.20"
node_j      = "10.0.2.30"
pg_j        = "10.0.2.40"

port_nginx  = "{ 443 80 }"
port_email  = "465"

resolvers   = "{ 1.1.1.1 8.8.8.8 }"
priv_ntp    = "{ 192.0.2.10 192.0.2.11 }"

table <ratelimit>
table <blocklist> file "/etc/ips_blocklist"  # DShield Daily
table <martians>  file "/etc/ips_martians"   # Bogon list
table <payments>  file "/etc/ips_stripe"     # Stripe API IPs
table <email>     file "/etc/ips_fastmail"   # 66.111.4/24
table <xpeers>    file "/etc/ips_xnic_peers"
table <ipg_j>     file "/etc/ip_ipg_b"
table <inode_j>   file "/etc/ip_inode_b"
table <pgpeer>    file "/etc/ip_peer_ipg_b"
table <nodepeer>  file "/etc/ip_peer_inode_b"

# Normalization. Prevents IP Fragmentation Attacks and Inspection Evasion
scrub in all fragment reassemble no-df

# Translation
nat on xnic from $xbridge_net -> (xnic:0)
rdr on xnic proto tcp from any      to port $port_nginx -> $nginx_j
rdr on xnic proto tcp from <xpeers> to port 2220 -> $nginx_j port 22
rdr on xnic proto tcp from <xpeers> to port 2230 -> $node_j  port 22
rdr on xnic proto tcp from <xpeers> to port 2240 -> $pg_j    port 22

# Blockers
antispoof quick for xnic
block in quick on xnic \
      from { <blocklist> <ratelimit> <martians> no-route urpf-failed }
block all

# Nginx Incoming Traffic
pass in quick on xnic proto tcp from any \
     to port $port_nginx keep state \
     (max-src-conn-rate 100/10, overload <ratelimit> flush global)
pass out quick on xbridge proto tcp to $nginx_j port $port_nginx

# Tunnels
pass in  quick on ipg_a   proto tcp from <ipg_j>    to <pgpeer> port 5434
pass in  quick on inic    proto tcp from <pgpeer>   to <ipg_j>  port 5434
pass in  quick on inode_a proto tcp from <inode_j>  to <pgpeer> port 5433
pass in  quick on inic    proto tcp from <nodepeer> to <ipg_j>  port 5433
pass out quick on ibridge proto tcp from any        to any      port { 5434 5433 }

# Node.js Outgoing Traffic
pass in  on xbridge proto udp from $node_j to $resolvers port 53
pass in  on xbridge proto tcp from $node_j to <payments> port 443
pass in  on xbridge proto tcp from $node_j to <email>    port $port_email
pass out on xnic    proto udp from any     to $resolvers port 53
pass out on xnic    proto tcp from any     to <payments> port 443
pass out on xnic    proto tcp from any     to <email>    port $port_email

# SSH
pass in  on xnic    proto tcp from <xpeers> to any          port 22
pass out on xbridge proto tcp from <xpeers> to $xbridge_net port 22

# NTP
pass out on inic proto udp to $priv_ntp port 123

# Uncomment for updating FreeBSD and packages
# pass in  on xbridge from $xbridge_net
# pass out on xnic
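When changing this ruleset, pfctl can validate the syntax before loading it, and then load it atomically:

pfctl -nf /etc/pf.conf   # parse and validate only
pfctl -f  /etc/pf.conf   # load the ruleset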


Open Source

Check out these ops-utils/location-server scripts for a fairly automated installation and setup of this infrastructure.

FAQs

How many requests can a bare‑metal server handle?

In 2021, Hacker News served ~6 million daily requests with an 8‑core server. Consider that servers can have up to 256 cores.

In 2022, Netflix served over 800 Gb/s per server.

Is bare‑metal more cost‑effective than cloud?

Disregarding the security risks of multi-tenant cloud instances and the extra attack surface of cloud management software, bare metal is more cost-effective once your cloud bill exceeds $250/month, which is roughly the cost of renting a pair of low-end bare-metal servers.

What happens if the hardware fails?

Besides colocating, bare-metal servers can be rented, in which case the hosting company takes care of hardware problems.

By the way, since hard-disk failures are the most common issue, you can keep hot spares and leave the damaged disks installed instead of replacing them. For example, in ZFS you can mirror disks 1 and 2 while keeping 3 and 4 as hot spares with the following command:

zpool create pool mirror $d1 $d2 spare $d3 $d4
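Note that on FreeBSD a hot spare kicks in automatically only while zfsd(8) is running, so enabling it is worth checking in your setup:

sysrc zfsd_enable=YES
service zfsd start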

What about microservices?

You can create nested jails within the application jail, or multiple top‑level application jails.

Can I simulate this infrastructure in VirtualBox?

Yes, here is the guide.
