[Fwd: [H-GEN] partitioning again]
James McPherson
jmcphers at laurel.ocs.mq.edu.au
Wed May 12 23:38:06 EDT 1999
(Note reply-to: being general at humbug.org.au vs James McPherson <jmcphers at laurel.ocs.mq.edu.au>)
Robert Stuart writes:
> James McPherson wrote:
> > Robert Stuart writes:
> > > Robert Stuart wrote:
> > > > James McPherson wrote:
> > > >
> > > I use one large _fs_ for /, /usr, and /var (see below) on our Solaris
> > > servers because we are running SDS and the root disk is mirrored. It is
> > > simply easier to manage that way. We also have UPSs (and the building
> > > has emergency power) so I am not concerned about losing the fs except
> > > due to some sort of software problem screwing the root fs, which
> > > partitioning can't help with anyhow. This solves my fs space problems too
> > > (var filling etc). The one issue I have with this is the root disk's
> > > performance, because it holds both swap and root (and usr, var ...), both
> > > of which are hit heavily when the machine is loaded.
> > > Comments?
> >
> > According to Brian Wong's Configuration and Capacity Planning for Solaris
> > Servers book, the outer edges of the disk are the fastest, so we should be
> > placing heavily used partitions such as /usr and /var on slices 6, 7 or 8. Of
> > course, I have to wait until I get another box to reconfigure before I can try
> > that ;)
> >
>
> Realistically, how much of a difference in speed is putting different fs
> on different areas of the disk going to make? Does anyone have hard
> figures? Even if I sacrifice 10% of the performance of those
> filesystems, that's not going to worry me; there are much better ways to
> spend my time optimising the machine's performance. It's not worth the
> problems of having to juggle space. On a real production machine, there
> should be enough memory that swap should simply hold data/code that is
> never going to be used (ie the pages that are in memory are "used").
> The Ultra E450 machine I was talking about has 6x9GB drives. I have
> only root and swap on the 2 disks that are used for the root fs which
> leaves me with ~7GB free space (currently not even partitioned) because
> I don't want the root and swap disks to be slowed by other apps using
> the same drives. In the end, I'll probably put some stuff on there that
> is static and unlikely to cause a large write load (it will be a
> mirrored fs).
I believe Wong's book presents some figures, but of course it's at home so I
can't check at the moment. Certainly it's easy to configure enough memory, but
you need to have the money available in the budget to do so ;) We're going to
be replacing the SC100es later this year, possibly with E5500s. We have an
e450 as a test machine (and our backup DR box, lucky us!) with 2 cpus (300MHz),
512Mb ram and 6x9Gb ibm disks on three controllers. Currently we have it
configured as follows:
Filesystem size (Kb) mountpoint
c0t1d0s0 97306 /
c0t1d0s5 963869 /usr
c0t1d0s3 290065 /var
md/dsk/d0 927868 /opt
c0t0d0s0 97306 /solaris-251-root
c0t0d0s3 290065 /solaris-251-var
c0t0d0s4 241403 /solaris-251-opt
c0t0d0s5 963869 /solaris-251-usr
swap 1501616 /tmp
c0t0d0s7 5505694 /gnu
c0t0d0s6 963869 /home
c0t1d0s6 6772940 /usr/local
c2t0d0s2 3949167 /d00
c2t0d0s3 4131384 /d04
c2t1d0s2 3949167 /d01
c2t1d0s3 4131384 /d05
c3t0d0s2 3949167 /d02
c3t0d0s3 4131384 /d06
c3t1d0s2 3949167 /d03
c3t1d0s3 4131384 /d07
where /opt is made up of the small chunks remaining after configuring
c[23]t[01]d0 into 4Gb partitions. (The /solaris-251-* filesystems are for the
dual-boot capability.) It's great if you've got the free space to leave your
root disk mostly empty, but from my point of view it ignores the realities of
the constraints I have to work within. If I didn't have to make this box dual
boot then I'd be able to dedicate c0t0d0s* to DR capabilities, and I'd probably
trim some of the /gnu partition.
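(For anyone who hasn't played with SDS, a concat like that /opt metadevice is
roughly the following - the s4 slice numbers are just a guess on my part, and
it assumes the metadb state replicas are already set up:
# metainit d0 4 1 c2t0d0s4 1 c2t1d0s4 1 c3t0d0s4 1 c3t1d0s4
# newfs /dev/md/rdsk/d0
# mount /dev/md/dsk/d0 /opt
plus a vfstab entry for /dev/md/dsk/d0 so it survives a reboot.)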
Output from top is interesting:
last pid: 4980; load averages: 1.29, 1.23, 1.49 13:12:35
155 processes: 131 sleeping, 1 running, 21 zombie, 2 on cpu
CPU states: 47.9% idle, 49.3% user, 1.1% kernel, 1.6% iowait, 0.0% swap
Memory: 512M real, 9032K free, 105M swap in use, 1449M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
750 root 6 -25 18 2080K 1488K run 180:05 43.63% rc5des
most of the processes are actually oracle processes (100 of 132), 8 are squid,
and the rest are root's. As for swap, I've spread it across each drive:
# swap -l
swapfile dev swaplo blocks free
/dev/dsk/c0t1d0s1 32,9 16 535040 499184
/dev/dsk/c0t0d0s1 32,1 16 535040 499744
/dev/dsk/c2t0d0s1 32,241 16 527856 491440
/dev/dsk/c2t1d0s1 32,249 16 527856 490000
/dev/dsk/c3t0d0s1 32,361 16 527856 492224
/dev/dsk/c3t1d0s1 32,369 16 527856 492000
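Nothing clever behind that, by the way - each slice just gets added with
swap -a and given a vfstab entry so it comes back after a reboot, along the
lines of:
# swap -a /dev/dsk/c2t0d0s1
and in /etc/vfstab:
/dev/dsk/c2t0d0s1   -   -   swap   -   no   -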
That arrangement seems to be working quite well. BTW Robert, did you know that
the e450 has officially been transitioned by sun?
>
> >
> > # prtconf -v
> > System Configuration: Sun Microsystems sun4d
> > Memory size: 768 Megabytes
> > System Peripherals (Software Nodes):
> > SUNW,SPARCserver-1000
> > [snip]
> > cpu-unit, instance #x
> > [snip]
> > TI,TMS390Z55 <<<<---- the 512kb onboard cache module
>
> I thought the cache on the Super Sparcs was a 1MB cache?
According to sunsolve (posted by a sun engineer with MY initials no less!):
[jcm 6/12/93]
The sun4d FCS PROMs added two properties under TI,TMS390Z55 nodes:
ecache-nlines
ecache-line-size
At this point in time the value of ecache-line-size is always 32. The value of
ecache-nlines could be 0x4000 for .5M ecache, 0x8000 for 1M ecache, or 0x10000
for 2M ecache. The total size of the ecache = (ecache-nlines) * (ecache-line-size).
and prtconf -vp | grep ecache gives:
ecache-line-size: 00000040
ecache-nlines: 00004000
for each cpu. So _these_ tms390z55s are 512Kb cache modules.
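(Taking the figures from that sunsolve note at face value: 0x4000 lines x 32
bytes/line = 16384 x 32 = 524288 bytes = 512Kb.)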
>
> >
> > As for swap space, what sort of db application are you running? we've got
> > oracle 7.3.3 and 8.0.5 (testing mode) to deal with, along with three
> > production databases and two development dbs. Then there's the application
> > weight to consider - nothing's cut and dried with db servers!
>
> True. We have only one DB running on it, the test DB is on a separate
> machine (which also is a "backup" machine for if the production machine
> blows up).
>
> > > /md6 is made up of 4 9GB SCSI drives striped and mirrored.
> >
> > since when did qbssss have any money to spend on decent configurations? ;)
>
> Since it was costing us more for the maintenance of the old server
> (SS1000, 3x50MHz) than to lease a new machine (which is under 3 year
> warranty)!
sounds like your IT manager wasn't pushing the Sun sales rep enough! If you
keep pushing hard enough they'll come up with a price that is actually
reasonable. At least, that's my experience.
>
> > > NB we use DB files rather than raw partitions (at this point in time, we
> > > can afford to shutdown the DB every night for backup).
> >
> > oh don't start that one please! imho the "real databases use raw partitions
> > for speed" claim is a furphy which should have died out when 2Gb drives became
> > cheap. The speed and capacity of disks these days, coupled with the fast/wide
> > scsi standard and ever faster system busses makes it more efficient to use the
> > OS filesystem than raw partitions.
>
> I find this a bit hard to believe. Assuming Oracle (or whatever RDBMS
> vendor) has bright people etc, raw partitions should be faster (I guess
> in some ways they'd be like a FS specifically for a DB) - Oracle
> certainly claims this. But it is much simpler to treat the DB
> data as just another couple of files - no special treatment at all - and
> shut down the DB to do backups (as opposed to using table backup
> methods). Note that the term "raw partitions" as I use it above does
> not indicate that they can't be a raid "partition".
don't forget that OS filesystems these days do quite a lot of caching of both
reads and writes. Solaris' ufs does reads of 56Kb out of the box (Wong's book
has some nice graphs showing the effect of various block read/write sizes on
throughput/data availability). I think in today's environment, it _is_ much
easier to treat the DB as just another set of files on a filesystem (I've
checked this with our DBAs here), and it's only if you have the most
high-volume and high-utilization database (a data warehouse maybe) that you
should consider raw partitions. And even then there are products which take
advantage of the disks as well as raw partitions do, such as Veritas' VxFS
(mentioned in this month's SunWorld online mag).
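That 56Kb, if anyone's curious, is just maxcontig (7 by default) times the 8Kb
filesystem block size. Something like the following should show and change it
on a given slice - check the man pages first though, this is from memory, and
the slice is just one of the /d0* ones from the table above:
# mkfs -F ufs -m /dev/rdsk/c2t0d0s2
# tunefs -a 16 /dev/rdsk/c2t0d0s2
The first reports the parameters the filesystem was built with (including
maxcontig), the second bumps maxcontig to 16 for ~128Kb clusters if the
filesystem sees mostly sequential reads.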
I think it's one of those religious problems actually ;)
cheers,
James
--
Unix Systems Administrator Phone: +61.2.9850.9418
Office of Computing Services Fax: +61.2.9850.7433
Macquarie University NSW 2109 remove the extra char
AUSTRALIA in the reply-to field
--
This is list (humbug) general handled by majordomo at lists.humbug.org.au .
Postings only from subscribed addresses of lists general or general-post.