I’ve been involved in IT in some fashion for over thirty years now. Running a FidoNet BBS (The Twilight Zone) in 1986 was my first interaction with a human element and where I first experienced the concept of a System Administrator. Prior to that I was flipping 16 toggle switches to load stb’s, rbr’s and the like and reading the results on 16 LEDs, keeping Navy frigates moving through the water. What fun!
I’ve been in the trenches, racking and stacking, installing the OS and Applications, backing up and restoring, and fixing broken systems and applications. And at a point in time, that was my definition of a System Administrator. It isn’t any longer.
I’m asked, “What is the real underlying problem for SysAdmins now that everything is virtual?” As I mentioned in my interview with Rick Ramsey at OOW13, elasticity is the biggest challenge for SysAdmins today. Business process demands are more complex and need to be provisioned faster than ever before. These demands span a large number of technologies and the SysAdmin needs to know them all.
SysAdmins must be able to leverage technologies such as Virtualization, Infrastructure as a Service, Database as a Service, Middleware as a Service, Storage/Network provisioning, and the pooling and consolidation of hardware resources. They need to understand the technologies and how they interact with each other to ensure they can successfully deploy them and, once deployed, manage them.
New/improved management tools need to be mastered to be successful. The SysAdmin role has been far too dependent on performing repetitive tasks and working in a reactionary mode, attempting to locate and address/repair faults manually. As the complexity of our data centers continues to grow, this model becomes a significant limiting factor. We need to understand tools like Enterprise Manager 12c, which allow applications to be rapidly deployed by end users such as developers and testers themselves through self service, with metering and charge back.
The SysAdmins need to accept the automation that these new tools provide. To shun them will lead to their undoing.
And the knowledge level needed has never been greater. As an example, I expect a SysAdmin to know DTrace if they are running Solaris or Oracle Linux. I expect them to have some basic understanding of the kernel, system calls, and the like so they understand what DTrace tells them. I expect a SysAdmin to be comfortable working in a Database and a middleware environment. They need to understand the flow between the various tiers and how to provision those tiers rapidly when there is a business demand.
Basically the System Administrator must grow a much larger skill set to be successful. Don’t grow vertically in one technology, grow horizontally amongst many technologies. Engineer solutions with the specialist teams and know enough of the solutions to have an intelligent conversation. Know enough to assist in the architecture of the solution. Be proactive, not reactive.
So to answer the question “now that all is virtual, what’s the REAL underlying problem for sysadmins? Provisioning strategy?”
I think the complexity of a provisioning strategy is the REAL underlying challenge. Understanding which of the available technologies make sense, where each solution fits into the stack, how to provision and re-provision the solution in the stack, and how to manage it will be the new measure of success or failure in the SysAdmin realm. The tools are there, and those who embrace the technology and the tools should have a very bright future.
And for those that don’t, a warning. It is coming from the other direction. I frequently interact with DBAs who are managing the entire Exadata appliance. They’ve been to the Solaris or Linux Admin classes, they’ve attended the Exadata class. The “SysAdmin” team isn’t a user, root or otherwise, on the system. The Database group has become the system administrators on the majority of those systems. I’ve made similar observations with the Exalogic engineered system as well.
Embrace the technologies and the tools. Reach out and extend yourself. Throw away the old “rules”. Soon no one will really care what is under the hood. It won’t matter if it is Solaris or Oracle Linux, if it is SPARC or x86, what will matter is the IT staff’s ability to deploy the business demands on schedule.
The last blog post gave us some brief descriptions of the various scheduling classes in Solaris. I focused on the Time Sharing (TS) class since it is the default. Hopefully we can see that the TS (and the IA class for that matter) makes its decisions based on how the threads are using the CPU. Are we CPU intensive or are we I/O intensive? It works well, but it doesn’t provide the administrator fine-grain control as it relates to resource management.
To address this, the Fair Share Scheduler (FSS) was added to Solaris in the Solaris 9 release.
The primary benefit of FSS is that it lets the admin identify and dispatch processes and their threads based on their importance, as determined by the business and implemented by the administrator.
We saw the complexity of the TS dispatch table in the earlier post. Here we see the FSS table has no such complexity.
In FSS we use the concept of CPU shares. These shares allow the admin a fine level of granularity to carve up CPU resources. We are no longer limited to allocating an entire CPU. The admin designates the importance of a workload by assigning it a number of shares: the more important the workload, the more shares it gets. Shares ARE NOT the same as CPU caps or CPU resource usage. Shares simply define the relative importance of a workload in comparison to other workloads, whereas CPU resource usage is an actual measurement of consumption. A workload may be given 50% of the shares yet at a point in time may be consuming only 5% of the CPU. I look at a CPU share as a minimum guarantee of CPU allocation, not as a cap on CPU consumption.
When we assign shares to a workload, we need to be aware of the shares that are already assigned. What matters is the ratio of the shares assigned to one workload to the shares assigned to all of the other workloads, not the absolute number.
I speak of FSS in a “Horizontal” and a “Vertical” aspect when I’m delivering for Oracle University. In Solaris 9 we were able to define projects in the /etc/project file. This is the vertical aspect. In Solaris 10 Non-Global Zones were introduced and brought with them the Horizontal aspect. I assign shares horizontally across the various zones and then vertically within each zone in the /etc/project file if needed.
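To make the ratio concrete, here is a minimal sketch of the arithmetic in plain shell. The workload names and share counts are made up for illustration, and the percentages assume every workload is busy and competing for CPU at the same time; an idle workload's shares do not count against the others.

```shell
#!/bin/sh
# share_pct SHARES TOTAL
# Print the CPU entitlement (whole percent, rounded down) that SHARES
# represents when TOTAL shares are held by the workloads competing for CPU.
share_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Hypothetical allocation: three busy workloads.
total=$(( 50 + 30 + 20 ))
echo "database:   50 of $total shares = $(share_pct 50 "$total")%"
echo "middleware: 30 of $total shares = $(share_pct 30 "$total")%"
echo "batch:      20 of $total shares = $(share_pct 20 "$total")%"

# Add a fourth workload with 100 shares. Nobody's share count changed,
# but the database entitlement drops because the total grew.
total=$(( total + 100 ))
echo "database:   50 of $total shares = $(share_pct 50 "$total")%"
```

The database workload goes from 50% to 25% without its own share count ever being touched, which is why the shares already assigned have to be checked before handing out more.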
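For the vertical aspect, a project's shares live in the attributes field of its /etc/project entry as the project.cpu-shares resource control. The sketch below only formats such an entry so the layout is easy to see; the project names, IDs, users and share counts are hypothetical, and on a real system I would let projadd or projmod write the file rather than editing it by hand.

```shell
#!/bin/sh
# project_entry NAME ID USER SHARES
# Print one /etc/project line in the standard layout
#   name:id:comment:user-list:group-list:attributes
# with the share count carried by the project.cpu-shares resource control.
project_entry() {
    printf '%s:%s::%s::project.cpu-shares=(privileged,%s,none)\n' "$1" "$2" "$3" "$4"
}

# Hypothetical projects inside one zone: 40 and 10 shares, a 4:1 ratio.
project_entry oradb 100 oracle 40
project_entry batch 101 batch 10
```

The same entry could be created with something like projadd -p 100 -U oracle -K "project.cpu-shares=(privileged,40,none)" oradb, again with made-up names.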
By default the Non-Global zones use the default scheduling class. If the system is updated with a new default class, they will obtain the new setting when booted or rebooted. The recommended scheduler to use with Non-Global Zones is the FSS. The preferred way is to set the system default scheduler to FSS and all zones then inherit it.
To display information about the loaded scheduling classes, run priocntl -l
SYS (System Class)
TS (Time Sharing)
        Configured TS User Priority Range: -60 through 60
SDC (System Duty-Cycle Class)
FX (Fixed priority)
        Configured FX User Priority Range: 0 through 60
IA (Interactive)
        Configured IA User Priority Range: -60 through 60
priocntl can be used to view or set scheduling parameters for a specified process.
To determine the global priority of a process run ps -ecl. The c option displays properties of the scheduler: we see the class (CLS) and the priority (PRI).
root@solaris:~# ps -ecl
 F S  UID  PID PPID CLS PRI ADDR   SZ WCHAN TTY TIME CMD
 1 T    0    0    0 SYS  96    ?    0       ?   0:01 sched
 1 S    0    5    0 SDC  99    ?    0     ? ?   0:02 zpool-rp
 1 S    0    6    0 SDC  99    ?    0     ? ?   0:00 kmem_tas
 0 S    0    1    0  TS  59    ?  720     ? ?   0:00 init
 1 S    0    2    0 SYS  98    ?    0     ? ?   0:00 pageout
 1 S    0    3    0 SYS  60    ?    0     ? ?   0:01 fsflush
 1 S    0    7    0 SYS  60    ?    0     ? ?   0:00 intrd
 1 S    0    8    0 SYS  60    ?    0     ? ?   0:00 vmtasks
 0 S    0  869    1  TS  59    ? 1461     ? ?   0:05 nscd
 0 S    0   11    1  TS  59    ? 3949     ? ?   0:11 svc.star
 0 S    0   13    1  TS  59    ? 5007     ? ?   0:32 svc.conf
 0 S    0  164    1  TS  59    ?  822     ? ?   0:00 vbiosd
 0 S   16  460    1  TS  59    ? 1323     ? ?   0:00 nwamd
To set the default scheduling class use dispadmin -d FSS and then dispadmin -d to ensure it changed. Then run dispadmin -l to see that it loaded.
root@solaris:~# dispadmin -d
dispadmin: Default scheduling class is not set
root@solaris:~# dispadmin -d FSS
root@solaris:~# dispadmin -d
FSS (Fair Share)
root@solaris:~# dispadmin -l
CONFIGURED CLASSES
==================
SYS (System Class)
TS (Time Sharing)
SDC (System Duty-Cycle Class)
FX (Fixed Priority)
IA (Interactive)
FSS (Fair Share)
Manually move all of the running processes into the FSS class and then verify with the ps command.
root@solaris:~# priocntl -s -c FSS -i all
root@solaris:~# ps -ef -o class,zone,fname | grep -v CLS | sort -k2 | more
FSS global auditd
FSS global automoun
FSS global automoun
FSS global bash
FSS global bash
FSS global bonobo-a
FSS global clock-ap
FSS global console-
FSS global cron
FSS global cupsd
FSS global dbus-dae
FSS global dbus-dae
FSS global dbus-lau
FSS global dbus-lau
Finally move init over to the FSS class so all children will inherit.
root@solaris:~# ps -ecf | grep init
    root     1     0   TS  59 16:33:44 ?           0:00 /usr/sbin/init
root@solaris:~# priocntl -s -c FSS -i pid 1
root@solaris:~# ps -ecf | grep init
    root     1     0  FSS  29 16:33:44 ?           0:00 /usr/sbin/init
With FSS all set, we now assign shares to our Non-Global Zones. In zonecfg, for each zone:
set cpu-shares=<number of shares>
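As a sketch of what that can look like from the global zone: zonecfg stores the value in the zone's configuration, where it is normally picked up the next time the zone boots, and prctl can adjust the zone.cpu-shares resource control on a zone that is already running. The zone names and share counts below are hypothetical. The wrapper prints the commands by default so it is safe to run anywhere; setting RUN to an empty value executes them on a Solaris host.

```shell
#!/bin/sh
# Dry run by default: RUN=echo prints each command instead of executing it.
# Export RUN= (empty) on a Solaris global zone to run the commands for real.
RUN=${RUN-echo}

# set_zone_shares ZONE SHARES
# Assign SHARES CPU shares to ZONE, both persistently and on the running zone.
set_zone_shares() {
    zone=$1
    shares=$2
    # Persist the value in the zone configuration.
    $RUN zonecfg -z "$zone" "set cpu-shares=$shares"
    # Apply the same value to the running zone without a reboot.
    $RUN prctl -n zone.cpu-shares -v "$shares" -r -i zone "$zone"
}

# Hypothetical horizontal split: the database zone is three times as
# important as the web zone.
set_zone_shares dbzone 60
set_zone_shares webzone 20
```

In dry-run mode the shell drops the quotes around the zonecfg subcommand when echoing, so treat the printed lines as a preview rather than something to paste back in.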
To display CPU consumption run prstat -Z