PolyServe Release Notes
PolyServe Matrix Server 3.5.1
File Serving Utility™ 3.5.1
for Red Hat Enterprise Linux AS/ES 4.
Copyright © 1999-2007 PolyServe, Inc. Use, reproduction and distribution of this document and the software it describes are subject to the terms of the software license agreement distributed with the product (“License Agreement”). Any use, reproduction, or distribution of this document or the described software not explicitly permitted pursuant to the License Agreement is strictly prohibited unless prior written permission from PolyServe has been received.
Contents
Contents of the 3.5.1 Releases
    Matrix Server 3.5.1
    MxFS-Linux 3.5.1
Contents of the 3.5.0 Releases
    Matrix Server 3.5.0
These release notes apply to the following PolyServe software for Linux:
• Matrix Server 3.5.1
• The File Serving Utility, which includes Matrix Server 3.5.1 and MxFS-Linux 3.5.1

Contents of the 3.5.1 Releases

Matrix Server 3.5.1

This release adds support for the following operating systems:
• SuSE Linux Enterprise Server 9, SP 3 (32-bit and 64-bit). The new features added in the 3.5.0 release for RHEL4 are now available on SLES9. See “Contents of the 3.5.
filesystems and continue with the shutdown. This change reduces the amount of time required to stop Matrix Server.
• Matrix Server includes an audit feature that can be used to log administrative commands in the matrix log. This feature can now be disabled if necessary. For more information, see the PolyServe Administration Guide.
• The psfsinfo command has a new option, --blocksize, that prints only the blocksize used by the filesystem.
message was displayed when the specified filesystem was not supported for snapshots.
• Defect 15319. The psfsck command did not correct all quota-related corruption.
• Defect 15330. A problem in the performance library caused the monitor agent to crash.
• Defect 15337. A problem in the PSFS filesystem caused the system to panic with assertion failures in psfs_assert_inode_cached_debug() and cwir_acquire_function().
• Defect 15377.
• Defect 15599. The rpcquota.d daemon was not included in the pmxsquota-tools RPM.
• Defect 15600. The NETLINK facility should be used to configure virtual hosts as secondary addresses.
• Defect 15606. The rate at which nodes send SNMP requests to FC switches needed to be reduced. This change allows more and larger matrices to be connected to an FC switch.
• Defect 15636. Imported devices could not be accessed when the RHEL4 hugemem kernel was used.
• Defect 15673.
MxFS-Linux 3.5.1

This release adds support for the following operating systems:
• SuSE Linux Enterprise Server 9, SP 3 (64-bit only)
• Red Hat Enterprise Linux AS/ES 4.0, Update 4 (64-bit only)

This release includes the following implementation changes:
• When MxFS-Linux is installed, it now increases the values of several tunable parameters:
  – The number of NFS Server threads (specified by RPCNFSDCOUNT on Red Hat or USE_KERNEL_NFSD_NUMBER on SLES9) is increased to 32.
it is necessary to reset the timeout, change your scripts to use mx nfsprobe.

This release includes fixes for the following problems:
• Defect 9167. Under certain conditions, the MxFS-Linux NFS server could reject a file handle from a client as “stale” even though the file handle was valid. The conditions related to cached inode management within the exported PSFS filesystem. The filesystem now properly handles the inode cache for files accessed through NFS.
• Defect 14832.
during NFS service probes are no longer treated as service failures and will not trigger failovers.
• Defect 15322. The ExportFSsync process could dump core during product shutdown. This defect did not adversely affect the availability of NFS exports.
• Defect 15360. More messages from vstatd (the MxFS-Linux version of rpc.statd) are now sent to syslog rather than discarded. This will help diagnose startup problems for vstatd.
• Defect 15369.
Server from one server node to another. MxFS-Linux now returns a persistent file handle. A client may observe similar behavior if it mounts one PSFS filesystem from two or more different Virtual NFS Servers and accesses one file through multiple mounts. This cannot be resolved by MxFS-Linux. Clients may choose to access the file through a single mount, or to use techniques such as sync(2) or file locking to synchronize access to the file.
• Defect 16084.
deletes bookmarks to match the list of servers in the matrix to which the server belongs. If you currently have a .matrixrc file, you may need to update it to use this feature. See the PolyServe Administration Guide or the “Connect to a Matrix” online help for details.

The 3.5.0 release included the following implementation changes:
• The number of virtual hosts supported in a matrix has been increased to 128.
  – Including passwords in the .matrixrc file is now optional. You can remove the passwords from your file if desired, or select the bookmark entry on the Connect window and click Reset.

MxFS-Linux 3.5.0

This release included the following:
• Support for MxFS-Linux on RHEL4 Update 3.
• The timeout value for Export Group monitors now applies only to how long the RPC probe will wait for a response before causing a Virtual NFS Service failover.
• 03_iounmap_deadlock.patch. This patch fixes a deadlock in iounmap() by changing it to not call ioremap_change_attr() with the vmlist lock held.
• 04_cfq_entry.patch. This patch corrects a problem with the CFQ I/O scheduler. Without this patch, when the system is under heavy I/O load, I/Os can be starved indefinitely and affect operations both locally and across the cluster. This patch is applied only to the 2.6.9-34.EL kernel. The problem does not exist in the 2.6.9.42.
Open Issues and Workarounds

The following open issues affect Matrix Server and MxFS-Linux operations.

Matrix Server

Defect  Description

1615  Resized log does not display properly
When you use the Set Log Length option to set the number of lines to appear on the Server Log window, the Management Console does not refresh the display. You will need to close the Server Log window and then reopen it to see the resized log.
If Matrix Server is started and the third-party MPIO software has not previously discovered the devices, Matrix Server will make a “best effort” to discover them. In some cases, Matrix Server will initiate a single reboot. Matrix Server will not start if it cannot discover the devices. The administrator must then determine why the MPIO devices have not been discovered and resolve the issue.
8335  URL in HTTP service monitor can cause virtual host to fail
The URL assigned to an HTTP service monitor should not reference the virtual host with which the HTTP service monitor is associated. If the virtual host is down, the HTTP service monitor cannot connect to the virtual host address and will also be down. A down service monitor on a virtual host then tends to keep the virtual host from ever being instantiated.
9615  Nodes stall waiting for locks
If you are seeing alerts stating that nodes are stalled waiting for locks for a particular filesystem, the filesystem may be experiencing contention on Full Zone Bitmaps. To aid in diagnosing this problem, determine whether the following apply:
• Full Zone Bitmaps (FZBMs) are enabled on the filesystem.
• Filesystem operations that involve allocating or freeing blocks are slower than on previous versions of the filesystem.
10191  Using hostnames in .matrixrc file can cause connection delays
When servers are specified by hostname in the .matrixrc file, long connection delays (possibly minutes per hostname entry) can occur if there is a slow or unresponsive DNS server on the network. During this time, the Management Console and mx commands might be unresponsive.
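To check whether name resolution is the source of such a delay, you can time the lookup of each hostname that appears in the .matrixrc file. The following is a minimal sketch only, not a PolyServe utility: the function name time_lookups and the 2-second threshold are assumptions made for illustration, and the sketch relies solely on the standard Python socket.gethostbyname call.

```python
import socket
import time


def time_lookups(hostnames, resolve=socket.gethostbyname, slow_after=2.0):
    """Time name resolution for each hostname.

    Returns a list of (hostname, seconds, status) tuples, where status is
    "ok", "slow" (lookup took longer than slow_after seconds), or "failed".
    The resolve argument is injectable so the function can be tested offline.
    """
    results = []
    for name in hostnames:
        start = time.monotonic()
        try:
            resolve(name)
            failed = False
        except OSError:  # socket.gaierror and socket.herror derive from OSError
            failed = True
        elapsed = time.monotonic() - start
        if failed:
            status = "failed"
        elif elapsed > slow_after:
            status = "slow"
        else:
            status = "ok"
        results.append((name, elapsed, status))
    return results
```

Pass the function the hostnames copied from your own .matrixrc file. Entries reported as "slow" or "failed" are candidates for the delay described above.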
12452  Application resources are not grouped properly if application is renamed
If you rename an application containing an Export Group and then reopen the Management Console, the application will be displayed as two applications on the Applications tab. The first application will have the name of the Export Group and the second application will have the new name.
Workaround. Rename the application to match the name of the Export Group.
15103  Deadlock can occur if a node is a client for an NFS filesystem
A node running Matrix Server should not be configured as a client for an NFS filesystem. Doing this can cause a deadlock situation on the node.

15782  License file is not read immediately when installed out of sequence during upgrade
The upgrade procedure specifies when the license file should be installed.
16262  Disk information is incorrect for deported disk
When you deport a disk and change the LUN size, the disk size information shown by sandiskinfo and the Import Disk window is invalid. After the disk is imported back into the matrix, the correct disk size information will be displayed.
8808  Linux NFS client may pause uninterruptibly
NFS client applications that use blocking file locks (via fcntl()) will sometimes pause uninterruptibly for 30 seconds and then (if they are not killed in the meantime) resume running normally. This problem is caused by a bug in the Linux NFS client.
Workaround. PolyServe has provided a version of the SLES9 and RHEL4 NFS clients with this problem fixed.
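To see whether a client shows this pause, you can measure how long a blocking fcntl() lock request takes on a file in the NFS mount. The following is a minimal sketch under stated assumptions: the function name timed_exclusive_lock is hypothetical, and the file you pass must already exist on the mount and be writable. It uses the standard Python fcntl.lockf call, which issues a blocking fcntl() lock request.

```python
import fcntl
import time


def timed_exclusive_lock(path):
    """Take and release a blocking exclusive lock on path.

    Returns the number of seconds spent waiting for the lock to be granted.
    """
    # An exclusive lock requires the file to be open for writing.
    with open(path, "r+b") as f:
        start = time.monotonic()
        fcntl.lockf(f, fcntl.LOCK_EX)  # blocks until the lock is granted
        waited = time.monotonic() - start
        fcntl.lockf(f, fcntl.LOCK_UN)  # release before closing
    return waited
```

On an uncontended file, a wait close to 30 seconds would be consistent with the symptom described above. A short wait on one attempt does not rule the problem out, because the pause occurs only sometimes.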
9789  Client times out before it can mount exported filesystems
If the client's mount command times out consistently (for example, for every retry of a backgrounded mount), one possible cause is that the client IP address cannot be resolved to names at the server. The server mountd may take longer attempting to translate the client address than the client mount is prepared to wait.
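A reverse lookup run on the server can show whether the client address resolves and how long the attempt takes. The following is a minimal sketch, not part of the product: the function name reverse_lookup is hypothetical, and the client address you pass to it is your own. It uses the standard Python socket.gethostbyaddr call.

```python
import socket
import time


def reverse_lookup(ip_address, resolve=socket.gethostbyaddr):
    """Resolve ip_address to a hostname.

    Returns (hostname, seconds). hostname is None when the address does not
    resolve. The resolve argument is injectable for offline testing.
    """
    start = time.monotonic()
    try:
        hostname = resolve(ip_address)[0]  # (name, aliases, addresses)
    except OSError:  # socket.herror and socket.gaierror derive from OSError
        hostname = None
    return hostname, time.monotonic() - start
```

A result of None, or an elapsed time comparable to the client's mount timeout, points to name resolution at the server as a likely cause.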
10646  Invalid link count can be returned to NFS clients
Linux NFS clients running with Linux 2.4.x kernels will return an invalid link count via the stat() system call interface if a file with multiple hard links is unlinked by one coincident with a failover of a Virtual NFS Service.
12379  Client mounts hang indefinitely
The standard Linux NFS implementation has a defect that prevents the successful export of filesystems or directories that reside on certain devices. The symptom is that client mount attempts hang indefinitely. The defect is not related to Matrix Server or the PSFS filesystem. The affected devices have major or minor device numbers greater than 255.
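To find out whether an exported path resides on an affected device, you can read the device number of the filesystem that contains it and compare the major and minor parts against 255. The following is a minimal sketch: the function names are hypothetical, and the check uses only the standard Python os.stat, os.major, and os.minor calls.

```python
import os

# Threshold taken from the defect description: numbers above 255 are affected.
LEGACY_DEV_MAX = 255


def has_wide_device_number(major, minor):
    """Return True if either device number is greater than 255."""
    return major > LEGACY_DEV_MAX or minor > LEGACY_DEV_MAX


def path_on_affected_device(path):
    """Return True if path resides on a device with a major or minor number above 255."""
    dev = os.stat(path).st_dev  # device that contains the file or directory
    return has_wide_device_number(os.major(dev), os.minor(dev))
```

Run path_on_affected_device on the server against each directory you plan to export.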
13597  Solaris 8 and Solaris 10 clients can experience failover delays
When a Virtual NFS Service fails over to another server, a Solaris 10 client using TCP-based NFS mounts can pause for 80 to 90 seconds before I/O resumes normally. This delay is caused by a client-side NFS RPC bug and applies only to NFS RPC over TCP. UDP-based mounts are not affected. The problem can be resolved by applying the recommended Sun patches for Solaris 10.
Operating System and Environment Issues

Defect  Description

455  Parent dentries are not revalidated
During path lookups, the operating system does not revalidate parent dentries when they are specified as ".." components. The operating system assumes that the parent dentry associated with ".." is always valid; however, this is not always the case in a distributed environment. For example, server A may have a process that executes cd /a/b/c.
1041  Mount and unmount may fail if run in parallel
If you run several mount or umount commands in parallel, an operation may fail with the following error message:
    Cannot create link /etc/mtab~
    Perhaps there is a stale lock file?
The mount or umount operation will succeed; however, the /etc/mtab file will not be updated. Run the command again to update the file.
7435  QLogic Switch login can cause Matrix Server failures
Matrix Server will not start if you are logged into a QLogic FibreChannel switch and “admin start” is set. A message such as the following will appear: Switch address is not responding to SNMP SET requests. Verify the configured community string has SNMP write privileges. The FibreChannel switches are not responding to SNMP requests.
15575  RDAC driver problem can cause I/O requests to hang
When the Host Bus Adapter is returning errors because of BUSY or QUEUE_FULL conditions, the RDAC driver may leave I/O requests on the physical device's request queue, resulting in errors stating that a node is stalled waiting for locks.
Workaround. Reduce the HBA queue depth to a value of 8 or less.
Using Oracle with PolyServe Matrix Server

PolyServe Matrix Server has undergone extensive Oracle performance and stress testing by the PolyServe Database Engineering team. See the PolyServe Web site for the recommended Oracle release for use with PolyServe Matrix Server.

Asynchronous I/O Support

While certain Linux distributions may support Asynchronous I/O for raw partitions and non-clustered filesystems, these implementations are not supported on clustered filesystems.
• disk_async_io = FALSE
• _lgwr_async_io = FALSE
• _dbwr_async_io = FALSE

If 10 DBWR slaves are not sufficient for a given workload, the Oracle session wait event “free buffer waits” will be a predominant wait event as reported through statspack or utlestat.sql. To address this, increase the value assigned to the init.ora parameter dbwr_io_slaves.

Copyright © 1999-2007 PolyServe, Inc. All rights reserved.