Some of you are already curious about the new server hardware. I’ll provide some details in a minute - but let’s first start off with the old server hardware for comparison:
- Dell PowerEdge R620 system (8-bay)
- CPU: Dual socket - 2x Intel Xeon E5-2640 v2 processor, 2 GHz, 8 cores
- RAM: 64 GB DDR3 ECC memory 1600 MHz (8x 8 GB)
- Storage
  - Controller: PERC H710 Mini
  - 2x Crucial MX500 1 TB SATA (RAID-1)
  - 2x Crucial MX500 2 TB SATA (RAID-1)
While this system is about 10 years old, it still performs quite well as an ordinary web server, e.g. for WordPress hosting, Nextcloud or similar applications. But it could not keep up with the increasing demands of modern, more complex web applications such as Mastodon - at least not at a larger scale. The system performed well until November 2023, when a big wave of new Mastodon users hit the Fediverse. More federation traffic, more concurrent users and a larger amount of data to be processed in the database put too much load on the aging system.
The new system will have this hardware:
- Dell PowerEdge R640 system (10-bay)
- CPU: Dual socket - 2x Intel Xeon Gold 6154 processor, 3 GHz, 18 cores
- RAM: 192 GB Registered ECC DDR4 2400 MHz (12x 16 GB)
- Storage
  - Controller: (none, on-board SATA / NVMe)
  - 2x Samsung PM893 1 TB SATA (Boot, OS, SW RAID-1)
  - 2x Samsung PM9A3 4 TB U.2 NVMe (VM Storage, SW RAID-1)
Some notes about the configuration:
The base system / PowerEdge server: So far I’ve had a great experience with my old Dell R620 system, so I wanted to stick to something that I know and that I’ve worked with for the last 6 years. The newer R640 systems are easily available and can be bought used at many online stores.
The CPU(s): CPU power was an important aspect when looking for a suitable HW configuration, as the old server did not provide enough single-core performance to handle web requests quickly. Well - CPUs can be the most expensive component of a server, esp. if you’re heading for the > 3 GHz ones. I was ready to spend about € 600 to € 700 on a single CPU (there will be two of them!), so I checked how much performance I could get for that money. The server supplier provided me with a list of available and compatible CPUs and their prices. While there were CPU options with slightly more than 3 GHz base clock available, I decided on the Intel Xeon Gold 6154. It does not have the highest clock speed available, but should still offer decent performance. A higher base clock would eat up disproportionately more of the power budget, which would mean lower core counts (or way higher price tags). 3 GHz base clock and 3.7 GHz turbo boost seemed reasonable to me - at least for a server with 18 physical cores per CPU. With 36 physical cores in total (= 72 threads!) there should also be plenty of power for parallel workloads. Compared to our old server, we will reach almost double the single-core performance and ~ 3.5x the multi-core performance. More cores will help with increased metalhead.club federation traffic (Sidekiq) - the higher base clock speed will improve the responsiveness of individual web requests.
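To make the core and thread numbers a bit more tangible, here is a small Python sketch that redoes the arithmetic for both machines. The `CpuConfig` class and the "cores x base clock" figure are my own illustration and not a benchmark: such a proxy ignores turbo clocks, architectural improvements and memory bandwidth, so it is only a rough plausibility check for the "~ 3.5x" multi-core estimate. Two threads per core (Hyper-Threading) are assumed for both systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuConfig:
    """Hypothetical helper describing a dual-socket CPU setup."""
    sockets: int
    cores_per_socket: int
    base_clock_ghz: float
    threads_per_core: int = 2  # assumption: Hyper-Threading enabled

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def core_ghz(self) -> float:
        # Crude throughput proxy: physical cores times base clock.
        return self.cores * self.base_clock_ghz


old = CpuConfig(sockets=2, cores_per_socket=8, base_clock_ghz=2.0)   # 2x Xeon E5-2640 v2
new = CpuConfig(sockets=2, cores_per_socket=18, base_clock_ghz=3.0)  # 2x Xeon Gold 6154

print(new.cores, new.threads)         # 36 72
print(new.core_ghz / old.core_ghz)    # 3.375
```

The naive ratio of 3.375 lands in the same ballpark as the ~ 3.5x multi-core figure mentioned above, but real-world results depend on the workload.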
RAM amount: You might think: “Why 192 GB - isn’t that more than required?” There’s a simple reason for that: I was thinking about 128 GB at first, but Dell recommends populating either 6 or 12 of the 12 available DIMM slots per CPU for best performance. I decided to populate 6 slots per CPU, which left me to decide between 8 GB and 16 GB DIMMs. 8 GB DIMMs would have resulted in 8 GB x 6 slots x 2 CPUs = 96 GB, which I did not consider enough for the future… given that we reached the limits of the old 64 GB machine in November / December. Therefore it seemed to be the more future-proof choice to buy the 16 GB modules - which results in 192 GB of RAM. I also expect a higher RAM demand as soon as I increase the number of Sidekiq workers for federation. To make use of the many CPU cores I will need to create more Sidekiq processes (not threads!) - and each one of them is quite heavy on memory. While we will not need all the RAM right away, it is better to have enough headroom than to run into limits too soon.
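The RAM calculation is simple enough to write down as a tiny Python sketch. The function name `total_ram_gb` is just made up for this illustration; the slot and CPU counts are the ones from the paragraph above.

```python
def total_ram_gb(dimm_gb: int, slots_per_cpu: int = 6, cpus: int = 2) -> int:
    """Total RAM when the same DIMM size is used in every populated slot."""
    return dimm_gb * slots_per_cpu * cpus


for dimm_gb in (8, 16):
    print(f"{dimm_gb} GB DIMMs: {total_ram_gb(dimm_gb)} GB total")
# 8 GB DIMMs: 96 GB total
# 16 GB DIMMs: 192 GB total
```

With 6 slots per CPU there is no combination of identical modules that adds up to 128 GB, which is why the choice was between 96 GB and 192 GB.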
RAM speed: 2400 MHz modules are neither the fastest nor the slowest ones. I considered 2666 MHz modules, but the server supplier recommended saving the extra cost and instead relying on the better performance of the 6-channel memory mode, which is enabled by populating at least 6 memory slots per CPU. I was told that this configuration brings a bigger benefit than using faster (but fewer) memory modules. Using 12 of the faster 2666 MHz modules was not an option because it turned out to be too expensive.
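The supplier’s argument can be checked with a rough back-of-the-envelope calculation. The sketch below computes the theoretical peak bandwidth per CPU as channels x transfer rate x 8 bytes (a DDR4 channel has a 64-bit data bus, and the "MHz" rating of a module effectively is its transfer rate in MT/s). The 4-module comparison is a hypothetical example of mine - I don’t know which "fewer modules" configuration the supplier had in mind - and real-world throughput will be lower than these peak numbers.

```python
BYTES_PER_TRANSFER = 8  # 64-bit data bus per DDR4 channel


def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak memory bandwidth per CPU in GB/s."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


# Chosen configuration: 6 populated channels per CPU at 2400 MT/s
six_channels_2400 = peak_bandwidth_gb_s(6, 2400)

# Hypothetical alternative: only 4 populated channels per CPU at 2666 MT/s
four_channels_2666 = peak_bandwidth_gb_s(4, 2666)

print(f"6 x 2400: {six_channels_2400:.1f} GB/s")   # 6 x 2400: 115.2 GB/s
print(f"4 x 2666: {four_channels_2666:.1f} GB/s")  # 4 x 2666: 85.3 GB/s
```

In this comparison, populating all six channels with slower modules clearly beats the faster modules on fewer channels.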
Storage / SATA SSDs: The 10-bay backplane variant of the R640 system allows for the use of 8 NVMe and 2 SATA/SAS drives. To make use of the slower SATA bays and leave the NVMe bays for more demanding tasks, I decided to put the operating system / hypervisor environment on SATA drives. SATA is cheaper than SAS and I don’t need the extra SAS features. The Samsung Enterprise drives are reliable and fast enough and will be combined into a software RAID-1 for extra reliability. They don’t need to perform as well as the NVMe drives, since the main I/O load will be put on the bigger NVMe drives. I just need something reliable to boot from …
Storage / NVMe U.2 drives: For the main storage that hosts the actual application and database / asset storage I went with two Samsung PM9A3 Enterprise NVMe drives. They connect to the main board via PCIe 3.0 and offer great IOPS performance, which is important for database operations. 4 TB of storage should be sufficient for a while and we will even be able to host the Mastodon S3 storage ourselves. There will be no more need to rent external S3 storage.
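As a quick reminder of what the mirroring means for capacity: in a RAID-1 every drive holds a full copy of the data, so the usable size equals the size of the smallest member. A minimal Python sketch (the function name and the array labels are only illustrative, sizes are the nominal drive sizes in TB):

```python
def raid1_usable_tb(member_sizes_tb: list[float]) -> float:
    """Usable capacity of a RAID-1 mirror: the size of its smallest member."""
    if len(member_sizes_tb) < 2:
        raise ValueError("a RAID-1 mirror needs at least two drives")
    return min(member_sizes_tb)


arrays = {
    "Boot / OS (SATA)": [1, 1],
    "VM storage (NVMe)": [4, 4],
}

for name, members in arrays.items():
    print(f"{name}: {sum(members)} TB raw, {raid1_usable_tb(members)} TB usable")
# Boot / OS (SATA): 2 TB raw, 1 TB usable
# VM storage (NVMe): 8 TB raw, 4 TB usable
```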
Furthermore, the new server features two redundant 700 W power supplies (A/B power feed) and a front bezel with an LCD display. It will be connected to the upstream router via two 1 Gbit/s Ethernet uplinks in LACP mode. As with the old server, the new server will be powered by 100 % green energy.
I’m already curious about the performance and power draw/efficiency and can’t wait to get my hands on the new machine :-)