The Linux System Administrator Challenge
One common question we see at LHC is how to use Linux; this guide/challenge takes that question to another level entirely. It is geared towards understanding Linux at a more advanced, enterprise-usable level. This can be helpful if you are looking for a job as a Linux System Administrator, or if you often find yourself pentesting Linux-based environments, since understanding the infrastructure can be incredibly useful as an attack vector.
This guide was originally made by reddit.com/u/IConrad and has slowly been modified over time by him and several others; this version is a spin-off on top of those. It includes some minor modifications, more extra credit options, and some estimates of the hardware and skills required to complete it.
Set up a hypervisor. For this I highly suggest some form of type 1 hypervisor: considering the number of virtual machines required for this project, the overhead of a type 2 hypervisor can really be detrimental to your hardware requirements and is really unnecessary. For type 1, my strongest suggestion is VMware's ESXi, as it is one of the most widely used in the enterprise industry (and can also be used for free). My other suggestions, in no particular order, are KVM/QEMU, Xen, SmartOS, and Proxmox.
Where is Hyper-V?
Hyper-V requires more overhead and has no real advantages over the other options. As another strike against it, it is based on Microsoft Windows and not any kind of *nix or Linux. If you're reading this page, you are probably not interested in learning the Microsoft ecosystem at this point anyway. You can use Hyper-V if you truly wish, but I do not advise it.
At this point you have a choice: you can either continue to the next step and run everything as normal, or you can set up routing software to help containerize this project, so you don't accidentally break your entire home network at some point and anger those you happen to live with. For this I personally prefer pfSense, but there are other options such as VyOS, MikroTik RouterOS, or OPNsense, a fork of pfSense.
Inside of that hypervisor, install a provisioning management server. Use CentOS 7 as the distro for all work below. (For bonus points, set up errata importation on the CentOS channels, so you can properly see security update advisory information.)
For management servers there are two common suggestions: Spacewalk, which Red Hat's Satellite 5 was based on, or Katello, which Red Hat's current Satellite 6 is based on.
As a note for later: the Katello agent also includes Puppet, which you might find useful, and which features in a later section of this challenge.
Create a VM to provide named (DNS) and dhcp services to your entire environment. Set up the dhcp daemon to use the provisioning management server from the previous step as the pxeboot machine (thus allowing you to use PXE to do unattended OS installs). Make sure that every forward zone you create has a reverse zone associated with it. Use something like "internal.virtnet" (but not ".local") as your internal DNS zone.
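As a sketch of the PXE half of that, a dhcpd.conf fragment might look like the following. The subnet, addresses, and boot filename here are all hypothetical placeholders; substitute your own environment's values:

```
# /etc/dhcp/dhcpd.conf (fragment) -- hypothetical addressing
subnet 10.0.50.0 netmask 255.255.255.0 {
  range 10.0.50.100 10.0.50.200;
  option domain-name "internal.virtnet";
  option domain-name-servers 10.0.50.2;   # the named VM itself
  option routers 10.0.50.1;
  next-server 10.0.50.3;                  # provisioning server (PXE/TFTP)
  filename "pxelinux.0";                  # BIOS bootloader served over TFTP
}
```

The `next-server` and `filename` options are what point PXE clients at the provisioning server's TFTP service.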
Use that provisioning management server to automatically (without touching it) install a new pair of OS instances, with which you will then create a Master/Master pair of LDAP servers. Make sure they register with the provisioning management server. Do not allow anonymous bind, do not use unencrypted LDAP.
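One way to wire up the master/master replication with OpenLDAP is mirror mode via syncrepl. A cn=config sketch follows; the server IDs, hostnames, replication DN, and credentials are all placeholders you would swap for your own (the second server gets the mirror image, with rid/provider swapped):

```
# cn=config fragment on ldap1 (ldap2 is the mirror image)
olcServerID: 1 ldap://ldap1.internal.virtnet
olcSyncrepl: rid=001
  provider=ldap://ldap2.internal.virtnet
  bindmethod=simple
  binddn="cn=replicator,dc=internal,dc=virtnet"
  credentials=CHANGEME
  searchbase="dc=internal,dc=virtnet"
  type=refreshAndPersist
  retry="5 5 300 +"
  starttls=critical
olcMirrorMode: TRUE
```

`starttls=critical` keeps the replication traffic off the wire in cleartext, in keeping with the "no unencrypted LDAP" rule above.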
Reconfigure all 3 servers to use LDAP authentication and remove any sysadmin accounts that you were using to administer them.
Create two new VMs, again unattended, which will be PostgreSQL VMs. Use pgpool-II to set up master/master replication between them. Export the database from your provisioning management server and import it into the new PostgreSQL cluster. Reconfigure your provisioning management instance to run off of that cluster.
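For orientation, a minimal pgpool.conf sketch using pgpool-II's native replication mode might look like this; the backend hostnames are hypothetical:

```
# /etc/pgpool-II/pgpool.conf (fragment) -- native replication mode
replication_mode = on            # pgpool sends writes to every backend
load_balance_mode = on           # spread SELECTs across backends
backend_hostname0 = 'pgsql1.internal.virtnet'
backend_port0 = 5432
backend_weight0 = 1
backend_hostname1 = 'pgsql2.internal.virtnet'
backend_port1 = 5432
backend_weight1 = 1
```

Clients then connect to pgpool's listen port rather than to either PostgreSQL instance directly.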
Set up a configuration management server master. Plug it into the provisioning server to identify the inventory it will need to work with. For this there are a few options: Puppet (which is included in an install of Katello), Ansible, Fabric, Chef, Salt, or some combination of these.
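If you go the Puppet route, a minimal site manifest sketch shows the shape of things to come; the package and service names assume a CentOS host using sssd for LDAP authentication:

```
# site.pp (sketch): every node gets LDAP client tooling and a running sssd
node default {
  package { ['openldap-clients', 'sssd']:
    ensure => installed,
  }
  service { 'sssd':
    ensure  => running,
    enable  => true,
    require => Package['sssd'],
  }
}
```

Later steps in the challenge grow this into full per-role profiles.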
Deploy another VM. Install iSCSI tgt and nfs-kernel-server on it. Export an iSCSI LUN and an NFS share.
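A sketch of both exports follows; the IQN, paths, and subnet are placeholders for your own layout:

```
# /etc/tgt/targets.conf (fragment): export one LUN backed by a disk image
<target iqn.2019-01.virtnet.internal:storage.lun1>
    backing-store /srv/iscsi/lun1.img
    initiator-address 10.0.50.0/24
</target>

# /etc/exports (fragment): one NFS share for everything else
/srv/nfs/backups  10.0.50.0/24(rw,sync,no_subtree_check)
```

Restricting both exports to the lab subnet keeps the storage server from being reachable from the rest of your network.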
Deploy another VM and install backup server software on it, such as Bacula or whatever your choice is (rsync is not backup software, get over it). Use the PostgreSQL cluster to store the backup database, store the VM images on the iSCSI LUN, and back up every other server to the NFS share.
Deploy two more VMs. These will have httpd (Apache2) on them. Leave essentially default for now.
Deploy two more VMs. These will have Tomcat on them. Use JBoss Cache to replicate the session caches between them. Use the httpd servers as the frontends for this. The application you will run is a Spring Boot web servlet; in the interest of learning, you'll also be writing the basic hello world servlet yourself to understand the pain.
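One way to front the Tomcat pair from httpd is mod_proxy_balancer; a sketch follows, with hypothetical hostnames and context path:

```
# httpd conf fragment (mod_proxy / mod_proxy_balancer): front both Tomcats
<Proxy "balancer://tomcats">
    BalancerMember "http://tomcat1.internal.virtnet:8080"
    BalancerMember "http://tomcat2.internal.virtnet:8080"
    ProxySet stickysession=JSESSIONID
</Proxy>
ProxyPass        "/app" "balancer://tomcats/app"
ProxyPassReverse "/app" "balancer://tomcats/app"
```

Sticky sessions are optional here, since the replicated session cache means either Tomcat can serve any session.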
You guessed right: deploy another VM. This will do iptables-based NAT/round-robin load balancing between the two httpd servers.
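One common recipe for this uses the iptables `statistic` match to DNAT every other new connection to a different backend. A rules fragment in iptables-restore format is sketched below; the backend addresses are placeholders:

```
# iptables-restore fragment: round-robin DNAT across two httpd backends.
# Only the first packet of a connection traverses the nat table, so this
# balances connections, not individual packets.
*nat
-A PREROUTING -p tcp --dport 80 -m statistic --mode nth --every 2 --packet 0 -j DNAT --to-destination 10.0.50.21:80
-A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 10.0.50.22:80
-A POSTROUTING -j MASQUERADE
COMMIT
```

The second rule catches whatever the `statistic` match lets through, giving a 50/50 split.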
Deploy another VM. On this VM, install Postfix. Set it up to relay through a Gmail account so it can send email, and to accept messages only from your internal network.
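A main.cf sketch of the Gmail relay and the internal-only restriction follows; the subnet is a placeholder, and Gmail will want an app password rather than your account password:

```
# /etc/postfix/main.cf (fragment): relay outbound mail through Gmail
relayhost = [smtp.gmail.com]:587
smtp_use_tls = yes
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous

# accept mail only from the internal network
mynetworks = 127.0.0.0/8, 10.0.50.0/24
smtpd_recipient_restrictions = permit_mynetworks, reject

# /etc/postfix/sasl_passwd (run postmap on it afterwards):
# [smtp.gmail.com]:587 you@gmail.com:app-password
```

Remember to `postmap /etc/postfix/sasl_passwd` and keep that file readable by root only.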
Deploy another VM. On this VM, set up a monitoring server. Have it use SNMP to monitor the communication state of every relevant service involved above. This means doing an "is the right port open" check, an "I got the right kind of response" check, and a "we still have filesystem space free" check.
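In Nagios terms, those three checks for one of the web hosts might be sketched like this; the host name, command names, and thresholds assume a fairly standard plugin setup and are placeholders:

```
# Nagios object fragment: port, response, and disk checks for one host
define service {
    use                 generic-service
    host_name           www1
    service_description HTTP port open
    check_command       check_tcp!80
}
define service {
    use                 generic-service
    host_name           www1
    service_description HTTP responds correctly
    check_command       check_http
}
define service {
    use                 generic-service
    host_name           www1
    service_description Disk space
    check_command       check_nrpe!check_disk
}
```

Zabbix, Icinga, and the rest express the same three ideas with their own item/trigger vocabulary.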
For this there are an astounding number of options; some of the more popular enterprise choices are Nagios and Zabbix. There are also a million other options: Prometheus, SolarWinds, Zenoss, Cacti, Icinga, OpenNMS, and many more.
Deploy a new VM, or use one of the previously built servers, to host a documentation server; use whatever you want. A few options are Cowyo, Documize, Gollum, Pepperminty Wiki, PineDocs, or whatever other documentation server you choose.
Deploy another VM. On this VM, set up a syslog daemon to listen for every other server's input. Reconfigure each other server to send its logging output to various files on the syslog server. Set up Logstash, Grafana, Kibana, or Graylog to parse those logs.
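With rsyslog on both ends, the wiring can be sketched as follows; the template name and hostname are placeholders:

```
# On the syslog VM, /etc/rsyslog.conf (fragment): accept remote logs over UDP
module(load="imudp")
input(type="imudp" port="514")
# sort each sender into its own file (hypothetical template name)
$template PerHost,"/var/log/remote/%HOSTNAME%/syslog.log"
*.* ?PerHost

# On every other server, /etc/rsyslog.conf (fragment): forward everything
*.* @syslog.internal.virtnet:514
```

A single `@` forwards over UDP; use `@@` for TCP if you'd rather not lose messages under load.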
Document every last step you did in getting to this point in your brand new Wiki.
Now go back and create proper configuration files with the configuration management software you set up in step 8 to ensure that every last one of these machines authenticates to the LDAP servers, is registered to the provisioning server, and is backed up by the backup server.
Now go back, reference your documentation, and build configuration management profiles that hook into each of these services, allowing you to recreate, from scratch, each individual server.
Destroy every secondary machine you've created and use the above profile to recreate them, joining them to the clusters as needed.
Bonus exercise: create four more VMs: a CentOS 5, 6, 7, and 8 machine. Set each of these machines up to let you build custom RPMs and import them into the provisioning server instance. Ensure your configuration management profiles work on all four and produce like-for-like behavior, and add them to the backup and monitoring services as well.
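A minimal spec file is enough to exercise that pipeline end to end; the package name and payload below are invented purely for illustration:

```
# lhc-hello.spec (minimal sketch): a trivial package to test the build
# and repo-import pipeline
Name:           lhc-hello
Version:        1.0
Release:        1%{?dist}
Summary:        Hello-world RPM for the provisioning pipeline
License:        MIT
BuildArch:      noarch

%description
A trivial package used to test custom RPM builds and repo importation.

%install
mkdir -p %{buildroot}/usr/share/lhc-hello
echo "hello" > %{buildroot}/usr/share/lhc-hello/hello.txt

%files
/usr/share/lhc-hello
```

Build it with `rpmbuild -ba lhc-hello.spec` on each CentOS version and confirm the provisioning server serves all four builds.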
Welcome to real systems administration (you already hate yourself): in a large environment you won't be running just one operating system type. So, as the next bonus, set up a Fedora 30 workstation VM, a Debian 9 VM, an Ubuntu 18.04 workstation VM, and, if you really hate yourself and want some non-enterprise fun, an Arch server. Now manage all of these VMs with the configuration management, backup, provisioning, LDAP, and logging servers.
For the next bonus, since this is purely a test environment, set up another VM with Nginx and configure it as a reverse proxy that serves all the various web services under a single domain name/IP address.
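The shape of that reverse proxy can be sketched in a single server block; every hostname and path prefix here is a placeholder for whatever you actually deployed:

```
# /etc/nginx/conf.d/lab.conf (sketch): one front door for every web service
server {
    listen 80;
    server_name lab.internal.virtnet;   # hypothetical external name

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;

    location /wiki/    { proxy_pass http://wiki.internal.virtnet/; }
    location /monitor/ { proxy_pass http://monitor.internal.virtnet/; }
    location /app/     { proxy_pass http://lb.internal.virtnet/app/; }
}
```

Watch the trailing slashes on `proxy_pass`: they control whether the location prefix is stripped before the request reaches the backend.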
At some point you're probably going to need to serve the shares from the NFS server in step 9 to some Windows machines as well. In the enterprise you might be able to use NFS adapters, but their usability can be iffy, so the better option is to create SMB shares on that server for the Windows machines.
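With Samba installed on the storage server, exposing the same directory over SMB is a short smb.conf fragment; the share name, path, and group are placeholders:

```
# /etc/samba/smb.conf (fragment): expose the NFS-served directory over SMB
[backups]
    path = /srv/nfs/backups
    browseable = yes
    read only = no
    valid users = @staff    # hypothetical group from your LDAP directory
```

Restart smbd after editing, and confirm the share appears with `smbclient -L` from another host.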
Spin up a Windows VM, make sure you can attach the SMB share to it, and test it. Where possible, also integrate it with the various other services you have built so far.
There was once a point in time when you could probably pull this off with 8 GB of RAM and an old quad-core CPU if you really tried; with today's resource-intensive software, however, that just isn't a possibility anymore.
The requirements I'm suggesting below are an educated guess for the most part; you'll have to learn to keep your virtual machines to a minimum in order to do this on lesser hardware. They are also designed with only the steps up to step 21 in mind; after that, you're likely going to need even more resources for the extra VMs.
Absolute minimum suggested requirements
- CPU: At least 4 cores with SMT (Hyper-Threading/Clustered Multithreading). At oldest, a third-generation Intel; for AMD, I wouldn't attempt it on anything older than Ryzen.
- RAM: At minimum 16 GB. Speed doesn't necessarily matter; just note that a lot of VMs require a lot of RAM.
- Disk: 1 TB. Disk space is likely the least of your concerns; you can easily do this on a cheap 1 TB drive. Slow disk speeds might slow you down a bit but won't hurt you in the long run. Just be careful about how many repositories you pull in with provisioning.
Comfortable suggested requirements
- CPU: Around 12 cores would probably do fine. My suggestions here are either older Xeons or Ryzen CPUs; when you're dealing with this many VMs, core count is going to be king.
- RAM: Speed doesn't really matter here either; capacity is what counts.
- Disk: 2 TB. Disk space is still likely the least of your concerns, and you can easily do this on a cheap drive; the increased space suggestion is mostly about allowing more backups and snapshots, which I highly suggest.
If you want to do some extra learning with this bit, I suggest using something like the ZFS filesystem, or, if your hypervisor doesn't support it, a RAID array works fine as well.
Hardware is expensive, how am I supposed to afford this?
Hardware is indeed expensive, so here's a hint on where you can find suitable equipment for a large project such as this: look for used equipment, especially servers and workstations. Data centers and large companies/universities often phase in a new computer model every year, and in doing so phase one out as well. You can often find great deals on this equipment by looking in places like eBay, Craigslist, classified ads, businesses going under, etc.
You can also sometimes find rather nice hardware very cheap by asking your local university or data centers what they are phasing out; if you're really lucky, you might even find some free gear they were just going to recycle anyway.
Skills are an interesting topic: you could walk into this knowing literally nothing, but you're going to struggle every single half-step of the way. The more you know, the easier this will be, but some baseline suggestions should be established.
- Have a rather good comprehension of bash or POSIX shell scripting in general. This is going to be a key component: it's the glue for many things and makes your life much easier in the end.
- Understand at least the basics of networking: firewalls, VLANs, how traffic works, ports, subnetting, etc.
- Linux, obviously. You're going to want to be comfortable with CLI operations at minimum, and able to navigate and control your system exclusively through the command line.
- The most important skills you're going to need are research and analysis. You're going to be reading a lot of information, and you'll need to know how to dig deeper into it and decide whether it's sufficient.
The time commitment for this project is rather intimidating, even to someone who has done many of the things on this list before. u/IConrad originally estimated it would take about 3-6 months to go from "I think I'm pretty good with computers" to completing this.
Things aren't that simple, however: technology and applications become more complex every day. Add the extensions to this challenge and the decision-making time for all the options, and this project could take up to a year even if you work incredibly hard at it. It's a very long-term goal and should be spread out across a rather large amount of time, both to prevent burnout and to remember to actually take care of yourself.