Hardware manual

The procedures in the DiskStreamsScan module permit reading (but not writing) of a file to proceed at up
to full disk speed, provided the computation performed per page is small (no more than about 2
milliseconds). To make use of this facility, you must provide some extra buffer space to be
managed by the disk streams package, and you must sequence through the data in each page
yourself rather than obtaining it one item at a time using Gets.
The flow of control is basically as follows. You create a disk stream in the normal fashion. When you want
to start scanning the file, you pass the stream to InitScanStream, along with one or more additional page-
size buffers, and it returns a Scan Stream Descriptor (SSD). Now, every time you want to examine the
next page of the file, you call GetScanStreamBuffer, which returns a pointer to a buffer containing the
contents of that page. The contents of the buffer remain valid until the next call to GetScanStreamBuffer.
When you have scanned as much of the file as you care to, you call FinishScanStream, which destroys the
SSD and leaves the stream positioned at the beginning of the page most recently returned by
GetScanStreamBuffer. You should not execute any normal stream operations between the calls to
InitScanStream and FinishScanStream.
InitScanStream(s, bufTable, nBufs) returns SSD. Creates a Scan Stream Descriptor in preparation for
scanning the file corresponding to the stream s. bufTable is an array of pointers to page-size buffers, and
nBufs is the number of buffers (there must be at least one). That is, the buffers are located at bufTable!0,
bufTable!1, ..., bufTable!(nBufs-1). The SSD is allocated from the zone from which s was allocated.
InitScanStream does not actually initiate any disk activity.
GetScanStreamBuffer(ssd) returns a pointer to a buffer containing the next page of the file being scanned,
or zero if end-of-file has been reached. This procedure waits if necessary for the transfer of the next page
to complete, and before returning it initiates as many new disk transfers as it has buffers for. The first page
returned by GetScanStreamBuffer is the one at which the stream was positioned at the time
InitScanStream was called. The initial portion of the SSD is a public structure (defined in Streams.d)
containing the disk address, page number, and number of characters in the page most recently returned by
GetScanStreamBuffer; you may use this information for whatever purposes you wish (e.g., in building up a
file map for subsequent efficient random access to the stream).
FinishScanStream(ssd) waits for disk activity to cease, updates the state in the corresponding stream, and
destroys the SSD. The stream is left positioned at the beginning of the last page returned by
GetScanStreamBuffer, or at end-of-file if GetScanStreamBuffer most recently returned zero.
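The calling sequence and positioning rules described above can be sketched as a small model. This is a Python stand-in, not BCPL and not the package's implementation: the class names, the list-of-pages representation of a file, and the 512-byte buffer size are assumptions made for illustration, read-ahead and the disk address field are not modelled, and None stands in for the zero returned at end-of-file. What happens if FinishScanStream is called before any page has been fetched is not stated in the text; the model simply leaves the stream where it was.

```python
# Model of the scan-stream calling sequence. Python stand-in, not the Alto code.

class DiskStream:
    """Hypothetical disk stream: a list of pages plus the index of the current page."""
    def __init__(self, pages, position=0):
        self.pages = pages
        self.position = position

class ScanStreamDescriptor:
    """Hypothetical SSD. page_number and num_chars mirror two of the public
    fields the text lists; the disk address is omitted from this model."""
    def __init__(self, stream, n_bufs):
        self.stream = stream
        self.n_bufs = n_bufs              # extra buffers supplied by the caller
        self.next_page = stream.position  # first page returned is the current one
        self.page_number = None           # page most recently returned
        self.num_chars = 0                # characters in that page
        self.at_eof = False

def init_scan_stream(stream, buf_table):
    """Model of InitScanStream(s, bufTable, nBufs); starts no disk activity."""
    if len(buf_table) < 1:
        raise ValueError("at least one extra page-size buffer is required")
    return ScanStreamDescriptor(stream, len(buf_table))

def get_scan_stream_buffer(ssd):
    """Model of GetScanStreamBuffer(ssd); None stands in for zero at end-of-file."""
    pages = ssd.stream.pages
    if ssd.next_page >= len(pages):
        ssd.at_eof = True
        return None
    ssd.page_number = ssd.next_page
    ssd.num_chars = len(pages[ssd.page_number])
    ssd.next_page += 1
    return pages[ssd.page_number]

def finish_scan_stream(ssd):
    """Model of FinishScanStream(ssd): reposition the underlying stream."""
    if ssd.at_eof:
        ssd.stream.position = len(ssd.stream.pages)  # end-of-file
    elif ssd.page_number is not None:
        ssd.stream.position = ssd.page_number        # start of last page returned

def count_chars(stream, n_bufs=1):
    """Caller-side loop: handle each page as a whole instead of calling Gets per item."""
    ssd = init_scan_stream(stream, [bytearray(512) for _ in range(n_bufs)])
    total = 0
    while True:
        page = get_scan_stream_buffer(ssd)
        if page is None:
            break
        total += ssd.num_chars
    finish_scan_stream(ssd)
    return total
```

In this model, scanning to the end leaves the stream at end-of-file, while stopping early leaves it at the start of the last page fetched, which matches the two cases given for FinishScanStream.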
The package uses the stream buffer in addition to the buffers passed explicitly to InitScanStream. It is
possible to scan a file at full disk speed (assuming the file is consecutively allocated) with two buffers (i.e.,
just one additional buffer), so long as the interval between calls to GetScanStreamBuffer is no greater than
3.3 milliseconds (or about 2 milliseconds of computation on the caller’s part). If more computation per
page is required, or the amount of computation per page is highly variable, then more buffers are required
to maintain maximum throughput.
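The timing figures above can be checked with a little arithmetic. The sketch below assumes a drive with a 40 millisecond revolution and 12 sectors (pages) per track; neither number is stated in this section, so treat both as assumptions. Under them, one page passes under the head in about 3.3 milliseconds, which would account for the limit quoted above. The 1.3 millisecond difference between that limit and the 2 milliseconds of caller computation is read here as per-page overhead inside the package, which is an inference rather than something the text says.

```python
# Assumed drive parameters (not given in this section).
ROTATION_MS = 40.0        # time for one revolution
SECTORS_PER_TRACK = 12    # one page per sector

# Figures taken from the text.
MAX_INTERVAL_MS = 3.3     # longest allowed gap between GetScanStreamBuffer calls
CALLER_COMPUTE_MS = 2.0   # computation the caller may do per page

def page_time_ms(rotation_ms=ROTATION_MS, sectors=SECTORS_PER_TRACK):
    """Time for one page to pass under the head."""
    return rotation_ms / sectors

def implied_overhead_ms():
    """Per-page time left over once the caller's computation is subtracted."""
    return MAX_INTERVAL_MS - CALLER_COMPUTE_MS

def keeps_up_with_two_buffers(interval_ms):
    """True if the gap between calls stays within the limit quoted in the text."""
    return interval_ms <= MAX_INTERVAL_MS
```

A caller whose gap between calls exceeds the limit would, by the text's account, need more than one additional buffer to sustain full disk speed.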

4. Fast Streams
A fast stream structure must begin with the structure declared as FS in Streams.d; following this you can
put anything you like. To initialize this structure, use
InitializeFstream(s, itemSize, PutOverflowRoutine, GetOverflowRoutine, GetControlCharRoutine
[Noop]). The s parameter points to storage for the stream structure, lFS words long. The itemSize is as for
CreateDiskStream. The overflow routines are explained below. GetControlCharRoutine(item, s) will be
called whenever a Gets for a charItem stream is about to return an item between 0 and #37 (octal), and
the routine's result is returned as the value of the Gets. The initialization provides Gets, Puts, and
Endofs routines; the other stream procedures are left as Errors.
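The control-character hook described above can be illustrated with a small model. This is a Python sketch, not BCPL: the FastStream class, its field names, and the tab-to-space routine are hypothetical. It assumes that the range 0 to #37 is inclusive and that the default Noop routine hands the item back unchanged. The overflow routines are not modelled, so reading past the end of the data simply raises an error here.

```python
def noop(item, stream):
    """Assumed behaviour of the default routine: return the item unchanged."""
    return item

class FastStream:
    """Hypothetical model of the Gets path of a fast stream over a byte buffer."""
    def __init__(self, data, char_item=True, get_control_char_routine=noop):
        self.data = data
        self.pos = 0
        self.char_item = char_item
        self.get_control_char_routine = get_control_char_routine

    def gets(self):
        if self.pos >= len(self.data):
            # The real package would call GetOverflowRoutine here (not modelled).
            raise IndexError("buffer exhausted")
        item = self.data[self.pos]
        self.pos += 1
        # Control characters (0 through octal 37) on a charItem stream are
        # passed to the routine, whose result becomes the value of Gets.
        if self.char_item and 0 <= item <= 0o37:
            return self.get_control_char_routine(item, self)
        return item

def tab_to_space(item, stream):
    """Example routine: turn a tab (octal 11) into a space, pass others through."""
    return ord(" ") if item == 0o11 else item

def read_all(stream):
    """Drain the buffer with repeated Gets calls."""
    return [stream.gets() for _ in range(len(stream.data) - stream.pos)]
```

With the default routine the items come back untouched, so a caller that does not care about control characters pays only the cost of the extra call.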
SetupFstream(s, wordBase, currentPos, endPos) is used to set up a fast stream to transfer data to or from a
buffer in memory. WordBase is the address of the buffer in memory, and currentPos and endPos are byte
Disk Streams September 9, 1979 43
For Xerox Internal Use Only -- December 15, 1980