NFS Performance

The benefits of distributed file sharing on an enterprise network can be easily outweighed by poor performance. Productivity can suffer substantially when the users in an organization, unable to get fast access to the data they require, cannot get their work done efficiently. Any compromise in productivity has the potential to compromise the success of the corporation.

NFS performance has increased an order of magnitude in the last 5 years, from hundreds of operations per second to thousands. The NFS Version 3 protocol revision, client caching, and general tuning efforts have all contributed to this phenomenon. The factors contributing to increased NFS performance as well as details on the latest performance attributes of NFS Version 2 and Version 3 are discussed below.

How Caching Increases NFS Performance

The term "caching" refers to the temporary storage of file data in a fast access local storage receptacle called a cache. Once file data (including pages, attributes and directory information) is cached, subsequent client requests go directly to the cache and do not require a data transfer over the network. After the cache is populated, frequently called NFS procedures such as read, readdir, readlink, and lookup are virtually eliminated.

Caching significantly increases client side NFS performance and also enhances server scalability. Clients get faster access to files by storing large chunks of data in a local, fast access cache. If the file is being read sequentially, the NFS client can anticipate future data requirements through a process called "read-ahead" and can store this information in its local cache for future reference. Caching large amounts of file information on the client means fewer demands are made on the server, so server load decreases and scalability improves.
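The read-ahead behavior described above can be illustrated with a minimal sketch. It is not the Solaris implementation: the class name, the fetch_block callable, and the read-ahead depth of two blocks are all assumptions made for illustration, and the prefetch runs synchronously here, whereas a real client would normally fetch ahead in the background.

```python
class ReadAheadCache:
    """Toy client-side block cache with sequential read-ahead (illustrative only)."""

    def __init__(self, fetch_block, read_ahead=2):
        self.fetch_block = fetch_block  # hypothetical callable: block number -> bytes
        self.read_ahead = read_ahead    # assumed number of blocks to prefetch
        self.blocks = {}                # block number -> cached data
        self.last_block = None          # last block the application asked for
        self.hits = 0                   # application reads served from the cache
        self.misses = 0                 # application reads that had to wait for the server

    def _load(self, n):
        # Fetch a block from the (simulated) server only if it is not cached yet.
        if n not in self.blocks:
            self.blocks[n] = self.fetch_block(n)

    def read(self, n):
        if n in self.blocks:
            self.hits += 1
        else:
            self.misses += 1
            self._load(n)
        # A read of the block right after the previous one is treated as
        # sequential access, so the next few blocks are fetched in advance.
        if self.last_block is not None and n == self.last_block + 1:
            for k in range(n + 1, n + 1 + self.read_ahead):
                self._load(k)
        self.last_block = n
        return self.blocks[n]


def demo():
    # Read six consecutive blocks from a fake server and report (hits, misses).
    cache = ReadAheadCache(lambda n: f"block-{n}".encode())
    for n in range(6):
        cache.read(n)
    return cache.hits, cache.misses
```

In this model only the first two reads wait for the server. Once the access pattern is recognized as sequential, every later block is already in the cache when the application asks for it.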

In many existing NFS implementations, clients utilize fast access RAM to cache data. In these cases, however, the amount of memory available for caching is usually very limited. In addition, any information stored in a volatile RAM cache is not persistent (i.e., it will not survive a reboot). An alternative is to utilize the local disk as a cache. (See Figure 3.) Although access to a disk cache is somewhat slower, larger amounts of data can be stored there, with the added benefit that the information persists after a system reboot.

Local Disk Caching with NFS

Previous NFS client implementations on Solaris used a physical memory cache exclusively. However, a technology called CacheFS (for cache file system), which provides local disk caching capability, is now included in Solaris [4]. On Solaris, both NFS Version 2 and Version 3 clients still cache as much data in fast access RAM as they can. But when additional cache space is needed, CacheFS enables clients to also use the local disk. Because of the greater storage space available on disk, data can be cached in 64K chunks, and whole directories can be cached instead of just directory entries. Expanded cache space means clients can access more data quickly and make fewer requests of the server.

[ image deleted ]

    Figure 3 Using Local Disk Caching with NFS

Performance Improvements with NFS Version 3

NFS Version 3 introduced a variety of improvements to NFS performance and scalability. These features are covered in detail in the following sections.

Write Throughput Improvements

Applications running on client systems may periodically write data to a file, changing its contents. The amount of time an application waits for its data to be written to stable storage on the server is a measurement of the write throughput of a distributed file system. Write throughput is therefore an important aspect of performance. All distributed file systems including NFS must ensure that data is safely written to the destination file while at the same time minimizing the impact of server latency on write throughput. This section provides a background on factors contributing to NFS Version 2 write throughput, the pros and cons of alternative mechanisms provided by other distributed file systems, and an explanation of throughput improvements with NFS Version 3.

NFS Version 2 Write Throughput

When an application or process writes to a file, the NFS Version 2 client first temporarily stores the data in the local cache. After data is written to the cache, the application can continue to go about its business [5]. In parallel, the client takes data corresponding to each write request from the cache, and submits write requests to the server. The NFS server then writes the data serially to stable storage and responds to the client. Because each individual NFS write request from the client requires that the data be written to stable storage on the server before it completes, this mechanism is known as synchronous writes.

When the application closes the file, the close will not complete until all outstanding data is written to stable storage on the server, a property called close-to-open semantics. This helps ensure data is handled correctly if a failure occurs. For example, if the data was not committed to stable storage before the file was closed, the possibility of a server crash could result in the data being lost. Close-to-open semantics also helps ensure that changes are written to the file before another client attempts to read it.

Obviously, if a server is suffering from high disk latency, an application may wait for extended periods of time for its data to be successfully written. This is especially true if there is a lot of outstanding data at the time the file is closed.
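The interaction between cached client writes, synchronous server writes, and a blocking close can be sketched as follows. This is a simplified single-threaded model, not real NFS client code: the class and method names are hypothetical, and the "disk" is just a dictionary with a counter standing in for slow stable-storage operations.

```python
from collections import deque


class V2Server:
    """Toy server: every write reaches stable storage before the reply is sent."""

    def __init__(self):
        self.stable = {}       # offset -> data, standing in for the disk
        self.disk_writes = 0   # each one would cost a full disk latency

    def write(self, offset, data):
        self.stable[offset] = data
        self.disk_writes += 1
        return "ok"


class V2ClientFile:
    """Toy client file handle: writes go to a local cache, close drains it."""

    def __init__(self, server):
        self.server = server
        self.pending = deque()  # cached writes not yet acknowledged by the server

    def write(self, offset, data):
        # The application returns as soon as the data is in the local cache.
        self.pending.append((offset, data))

    def flush_one(self):
        # In a real client this happens in parallel with the application.
        if self.pending:
            offset, data = self.pending.popleft()
            self.server.write(offset, data)

    def close(self):
        # Close-to-open semantics: close does not complete until every
        # outstanding write is on stable storage at the server.
        while self.pending:
            self.flush_one()
```

With three cached writes outstanding, close has to wait for three separate stable-storage writes, which is why high disk latency on the server is felt most at close time.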

[ image deleted ]

    Figure 4 Synchronous writes with NFS Version 2: Several write operations result in data being stored in the local client cache (Steps 1, 2 and 3). The NFS client handles data corresponding to each write request sequentially (see colored arrows). A write is not considered complete until the data is safely written to stable storage on the server.

How Prestoserve Improves Write Throughput

One way to significantly increase write throughput is to utilize Prestoserve [6]. Prestoserve interposes a software driver between the file system and the disk driver and accelerates writes by utilizing nonvolatile RAM (NVRAM). When Prestoserve is installed on the server, clients can write data to NVRAM, from which it is later scheduled to be written to disk. Prestoserve is advantageous because clients can write to it faster than they can write to disk and can then move on to their next activity without being impacted by disk latency.

Other Mechanisms that Address Write Throughput

Alternative mechanisms (sometimes referred to as cache consistency mechanisms) implemented in software included with other distributed file system technologies attempt to minimize the impact of server latency on write throughput in a different way. In these schemes, the server is responsible for issuing and managing tokens which grant special privileges to the clients that possess them. For example, a write token gives a client exclusive file writing privilege. In this scenario, close-to-open semantics need not be enforced because the outstanding data can be written to the server after the application closes the file. The server will notify the client to yield its token whenever another process wishes to access the file.

However, problems can arise when a failure occurs during the time a client process possesses a token. For example, if a client holding a file's token crashes or if a network partition occurs, the server has a dilemma when a new client wishes to access the same file. Should it wait indefinitely for the first client to "come back"? This will cause the next client in line to hang, waiting for the token. Or, should the server revoke the token letting the next client proceed? This will cause problems when the first client comes back and has to resolve changes made to the file during its absence. A third, labor intensive approach is to have the system administrator decide how to handle tokens on a case by case basis.

It is easy to see that token passing software can be complicated to implement and difficult to administer. The asynchronous writes feature of NFS Version 3 (covered in the next section) solves the same problem without the administrative overhead.

Improvements to Write Throughput with NFS Version 3

The NFS Version 3 protocol offers a better way to increase write throughput by eliminating the synchronous writes requirement while retaining the benefits of close-to-open semantics. The NFS Version 3 client significantly reduces the number of write requests it makes to the server by "collecting" multiple requests and then writing the collective data through to the server's cache. Subsequently, it submits a commit request to the server, which causes the server to write all the data to stable storage at one time. This feature, referred to as safe asynchronous writes, vastly reduces the number of write requests to the server, thus significantly improving write throughput. See Figure 5.

The writes are considered "safe" because status information on the data is maintained, indicating whether or not it has been stored successfully. Therefore, if the server crashes before a commit operation, the client will know by looking at the status indication whether or not to resubmit a write request when the server comes back up. Figure 5 depicts safe asynchronous writes.
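The sketch below models one way the "status information" can work. It assumes the write verifier scheme of the NFS Version 3 protocol specification, in which the server returns a value with each write and commit reply that changes when the server restarts. The class names are hypothetical, each write is sent individually rather than collected into larger transfers, and the model is deliberately reduced to the essentials.

```python
import itertools


class V3Server:
    """Toy server with a volatile cache and a verifier that changes on reboot."""

    _boot_ids = itertools.count(1)  # stand-in for a per-boot unique value

    def __init__(self):
        self.stable = {}        # offset -> data on "disk"
        self.cache = {}         # offset -> data held only in server memory
        self.disk_flushes = 0   # number of trips to stable storage
        self.verifier = next(self._boot_ids)

    def write_unstable(self, offset, data):
        # The reply is sent once the data is in the server cache, not on disk.
        self.cache[offset] = data
        return self.verifier

    def commit(self):
        # Push everything cached so far to stable storage in one operation.
        if self.cache:
            self.stable.update(self.cache)
            self.cache.clear()
            self.disk_flushes += 1
        return self.verifier

    def crash_and_reboot(self):
        # Uncommitted data is lost, and the verifier changes to reveal it.
        self.cache.clear()
        self.verifier = next(self._boot_ids)


class V3ClientFile:
    """Toy client that keeps written data until a commit confirms it."""

    def __init__(self, server):
        self.server = server
        self.uncommitted = {}   # offset -> (data, verifier seen at write time)

    def write(self, offset, data):
        verf = self.server.write_unstable(offset, data)
        self.uncommitted[offset] = (data, verf)

    def close(self):
        while self.uncommitted:
            verf = self.server.commit()
            # Data written under a different verifier may have been lost.
            lost = {o: d for o, (d, v) in self.uncommitted.items() if v != verf}
            self.uncommitted.clear()
            for offset, data in lost.items():
                self.write(offset, data)   # resubmit, then commit again
```

In the normal case three writes cost a single flush to stable storage at commit time. If the server restarts before the commit, the changed verifier tells the client to resend its data, so nothing is silently lost.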

[ image deleted ]

    Figure 5 Safe asynchronous writes with NFS Version 3: Several write operations result in data being stored in the client cache (Steps 1, 2, and 3). Data is then collectively written "through" to the server cache (Step 4) where it is stored temporarily. The client then submits a commit request (Step 5) to make sure the data is committed to stable storage (Step 6).

NFS Version 3 Write Throughput Measurements

Figure 6 shows the difference in write throughput between NFS Version 2 and Version 3. Because there is no official NFS Version 3 benchmark [7], the throughput test consisted of transferring 40 MB files by issuing a simple file copy (cp) command. SunSoft and Digital, a substantial contributor to the NFS Version 3 definition work, combined efforts and performed these tests on two DEC 300/600 machines running DEC OSF/1.

The results of these tests show a peak write performance of 6105 KB/s for NFS Version 3 versus 5022 KB/s for NFS Version 2, an increase of roughly 20% at comparable server CPU utilization. When Prestoserve was added to the NFS Version 3 system, performance increased an additional 10%. Note that the NFS Version 2 host was configured for maximum performance potential by utilizing write gathering in addition to Prestoserve. The write gathering feature improves NFS Version 2 performance by enabling servers to gather several writes before synchronously committing to stable storage. This requires that the client be capable of sending multiple write requests concurrently to gain parallelism.
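As a quick check on the peak figures quoted above, the relative gain can be computed directly. The helper name is made up for illustration; the two inputs are the throughput numbers from the text.

```python
def percent_increase(old, new):
    """Relative increase from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


# Peak write throughput in KB/s: NFS Version 2 versus NFS Version 3.
gain = round(percent_increase(5022, 6105), 1)
print(gain)  # 21.6
```

The exact figure is a little under 22%, which the text rounds down to 20%.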

[ image deleted ]

    Figure 6 A comparison of NFS Version 2 vs. NFS Version 3 Client Performance

Read Throughput Improvements

Read throughput can be defined as the amount of time applications wait for file data to become available, subsequent to issuing a read request. Both NFS Version 2 and Version 3 clients utilize local disk caching and read-ahead (covered in the section "How Caching Increases NFS Performance") to enhance read throughput. In addition, NFS clients also maintain the cache after a file is closed. This is because in the common case where a file is reopened, read requests will often be satisfied by data that already resides in the cache. Together these features help ensure that the data an application wants to read will be in the cache in advance of demand, reducing waiting time and thus increasing read throughput.

Reduced Requests for File Attributes

Because read data can sometimes reside in the cache for extended periods of time in anticipation of demand, clients must check that their cached data has not become invalid because another application changed the file. Therefore, the NFS client periodically acquires the file's attributes, which include the time the file was last modified. Using the modification time, a client can determine whether or not its cached data is still valid.

Clients obtain file attributes by querying the server with a getattr request. It is a common misconception that NFS clients check for attributes every 30 seconds. NFS clients cache a file's attributes along with the data at the time the file is first opened. If the client subsequently needs to perform an operation that requires attribute information, it first checks to see if the cached attributes are more than 30 seconds old [8]. If so, it will request attribute information from the server. If the operation does not require attributes or if the attribute information is less than 30 seconds old, an attributes request is not sent.
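The on-demand check described above can be sketched as a small attribute cache. The class name and the getattr_rpc callable are hypothetical stand-ins, and the current time is passed in explicitly so that the behavior is easy to follow. The 30-second default mirrors the figure in the text.

```python
class AttrCache:
    """Toy attribute cache that refreshes only on demand, never on a timer."""

    def __init__(self, getattr_rpc, timeout=30.0):
        self.getattr_rpc = getattr_rpc  # hypothetical callable: path -> attributes
        self.timeout = timeout          # maximum age of cached attributes, in seconds
        self.entries = {}               # path -> (attributes, time fetched)
        self.rpc_count = 0              # getattr requests actually sent

    def attributes(self, path, now):
        entry = self.entries.get(path)
        # Ask the server only if nothing is cached or the cached copy
        # is more than `timeout` seconds old.
        if entry is None or now - entry[1] > self.timeout:
            attrs = self.getattr_rpc(path)
            self.rpc_count += 1
            self.entries[path] = (attrs, now)
            return attrs
        return entry[0]


def demo():
    # Four attribute lookups over 31 seconds against a fake server.
    cache = AttrCache(lambda path: {"mtime": 1000})
    for now in (0.0, 10.0, 29.0, 31.0):
        cache.attributes("/export/data", now)
    return cache.rpc_count
```

Four operations that need attributes produce only two getattr requests in this run: one when the file is first seen and one after the cached copy has aged past the timeout. If no operation needs attributes, no request is sent at all, however much time passes.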

Keeping attributes requests to a minimum makes the client more efficient and minimizes server load, thus increasing scalability and performance. Therefore, NFS Version 3 was designed to return attributes for all operations. This increases the likelihood that the attributes in the cache are up to date and thus reduces the number of separate attribute requests. This is an improvement over NFS Version 2 which does not always return attributes information.

Efficient Utilization of High Bandwidth Network Technology

NFS Version 2 has an 8K maximum buffer size limitation which restricts the amount of NFS data that can be transferred over the network at one time. In NFS Version 3 this limitation has been relaxed, enabling NFS to construct and send larger chunks of data. This allows NFS to more efficiently utilize high bandwidth network technologies such as FDDI and 100baseT Ethernet and has contributed substantially to NFS performance gains.
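To give a sense of scale, the number of requests needed to move a file shrinks in proportion to the transfer size. In the sketch below the 40 MB file size is borrowed from the write throughput test described earlier and the 8K value is the Version 2 limit. The 32K transfer size is an assumed value chosen purely for illustration, not a figure taken from this paper.

```python
def requests_needed(file_bytes, transfer_bytes):
    """Number of read or write requests needed to move a file (ceiling division)."""
    return -(-file_bytes // transfer_bytes)


KB = 1024
MB = 1024 * KB

v2_requests = requests_needed(40 * MB, 8 * KB)       # 8K limit of NFS Version 2
larger_requests = requests_needed(40 * MB, 32 * KB)  # assumed 32K transfer size
print(v2_requests, larger_requests)  # 5120 1280
```

Each request carries fixed protocol and processing overhead, so sending a quarter as many requests leaves more of a fast network's bandwidth for the data itself.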

Reduced Directory "Lookup" Requests

A full directory listing (such as "ls -l" on Unix systems) requires that name and attribute information be acquired from the server for all entries in the directory listing. NFS Version 2 clients first query the server for the list of file and directory names, and then obtain attribute information for each directory entry through separate lookup requests. With NFS Version 3, however, the names list and attribute information are returned at one time, relieving both client and server of the work of handling many separate requests.
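The difference in request counts can be illustrated with a toy model. It assumes one names request followed by one lookup per entry for Version 2 and a single combined request for Version 3, ignoring the fact that a large directory may need more than one reply to return all of its entries. The function names and the sample directory are hypothetical.

```python
def v2_listing(directory):
    """Version 2 style: one request for the names, then one lookup per entry."""
    requests = 1                      # fetch the list of names
    result = {}
    for name in list(directory):
        result[name] = directory[name]
        requests += 1                 # separate lookup for this entry's attributes
    return result, requests


def v3_listing(directory):
    """Version 3 style: names and attributes come back together."""
    return dict(directory), 1


# A made-up directory of 100 files with minimal attributes.
sample = {f"file{i}": {"size": i} for i in range(100)}
v2_result, v2_requests = v2_listing(sample)
v3_result, v3_requests = v3_listing(sample)
```

Both approaches yield the same listing, but in this model the 100-entry directory costs 101 requests the first way and a single request the second.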

NFS Version 2 Performance on Solaris

NFS Version 2 performance on Solaris has also increased substantially. The chart in Figure 7 shows NFS Version 2 server performance test results specifically for the last two releases of Solaris. Between Solaris 2.3 and 2.4, NFS Version 2 performance on the SPARCstation 2000 multiprocessor (as measured by the SPEC SFS benchmark) increased 26% from 2575 SPECsfs_a93 operations per second to 3242 SPECsfs_a93 operations per second. Continuing this trend, SunSoft expects further performance gains in the upcoming Solaris 2.5 release.

[ image deleted ]

    Figure 7 Comparison of NFS Version 2 performance on Solaris 2.3 vs. Solaris 2.4 as measured by SPECsfs_a93. The test platform was a SPARCstation 2000.

The measurements in the above graph were made using an NFS Version 2 server benchmark called SPECsfs_a93 (formerly LADDIS). This benchmark measures NFS server performance against simulated client loads. An independent, non-profit organization called SPEC is responsible for making sure that vendors completely disclose the product configuration (both hardware and software) used to obtain the performance test results. SPEC was formed to "establish, maintain, and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high performance computers". Having access to performance information published by SPEC helps customers approximate "apples to apples" comparisons of products from different vendors.

Footnote 4
CacheFS was originally introduced in Solaris 2.3.

Footnote 5
As long as there is ample cache space, applications can continue to submit write requests without blocking. However when the client cache is full, a subsequent write request will cause the application to block until enough cache space is freed up to hold the new data.

Footnote 6
The Prestoserve option is available for additional cost from Sun Microsystems Computer Corporation (SMCC) for Sun platforms.

Footnote 7
There is no official client NFS benchmark. However, a version of the accepted NFS server benchmark, SPEC SFS (formerly LADDIS), that can measure NFS Version 3 server performance is currently under development by members of the SPEC organization.

Footnote 8
This interval can be set by the NFS administrator.


Copyright 1996 Sun Microsystems, Inc., 2550 Garcia Ave., Mtn. View, CA 94043-1100 USA. All rights reserved.