Typically, memory is divided into chunks (or blocks). At the main memory level, a chunk is referred to as a memory page. At the cache level, a chunk is called a cache block or a cache line. Keep in mind that if a register is 8 bits wide, a cell contains one of the bits in that register. Technically, only a CPU has registers. Memory chips have capacitors and transistors.
The memory controller keeps track of the state and location of the charges held in the capacitors and transistors of a memory chip, as we indicated earlier, in our printing business story. This combination of states and locations is called an address. The charges in the memory chips are created by the original byte transmissions, coming from the CPU's registers. The CPU sends data to memory in order to empty its registers for more calculations.
In other words, the CPU has some information it wants to get rid of. It sends that information to the memory controller. The memory controller shoves it into whichever capacitors are available and keeps track of where it put the data. It keeps track of the bits by assigning memory addresses to each bit of information.
Changes in memory chips and controllers are similar to how flat, one-page spreadsheets developed the concept of named ranges. DRAM is usually accessed through paging. A page is a related group of bytes (with their bits), similar to a range on a spreadsheet. It can be from 512 bits to several kilobytes, depending on the way the operating system is set up.
Without ranges, a spreadsheet formula must list every necessary cell in the spreadsheet. For example, we might have a formula something like =SUM(A1,B1,C1,D1,E1). Now suppose that we assign cells C1, D1, and E1 to a range, and call that range "LastWeek." We can now change the formula to use the range name: =SUM(A1,B1,LastWeek).
When a spreadsheet formula uses a named range, this is analogous to the memory controller giving a unique name to a range of charges. This range of charges is called a page address, and with a page address, the controller doesn't have to go looking for data in every single capacitor or transistor.
Fast Page Mode (FPM)
Dynamic RAM (DRAM) began with Fast Page Mode (FPM) back in the late 1980s. In fast page mode, the memory controller assumes that the data read/write following a CPU request will be in the next three pages (ranges), very much like a cache. This is somewhat like having a line of letters all ready to go in the toy stamp we spoke about in our printing business story.
Using FPM, the controller doesn't have to waste time looking for a range address for at least the next three accesses: it can read-assume-assume-assume. Note the three pauses, as we'll mention burst cycles in a moment.
When the controller passes through the memory chip, it turns off something called a data output buffer once it has finished with the page it just read or wrote to. This process takes approximately 10 nanoseconds.
Fast Page Mode is capable of processing commands in as little as 50 ns. Fifty nanoseconds is fifty billionths of a second, which used to be considered very fast. Remember that the controller first moves to a row; then to a column; then retrieves the information at an X-Y coordinate (a matrix address).
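The row-then-column lookup can be sketched in a few lines of Python. This is only an illustration: the array width of 1,024 columns per row and the function names are assumptions made for the example, not the geometry of any particular chip.

```python
# Hypothetical geometry: 1,024 columns in each row (page) of the memory array.
COLUMNS_PER_ROW = 1024

def to_row_column(cell_index, columns_per_row=COLUMNS_PER_ROW):
    """Split a linear cell index into a (row, column) matrix address."""
    row, column = divmod(cell_index, columns_per_row)
    return row, column

def same_page(a, b, columns_per_row=COLUMNS_PER_ROW):
    """True when two cells share a row, so only the column has to change."""
    return a // columns_per_row == b // columns_per_row

print(to_row_column(5000))    # (4, 904)
print(same_page(5000, 5001))  # True: the row can be "assumed"
print(same_page(5000, 6000))  # False: a new row must be selected
```

When two consecutive requests fall in the same row, the controller only needs a new column, which is the situation fast page mode is betting on.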
Once the information is validated, the controller hands it back to the CPU. The column is then deactivated, and the data output buffer is turned off. Finally, the column is prepared for the next transmission from the CPU. The memory enters a 10 ns wait state while the capacitors and transistors are precharged for the next cycle.
Extended Data Output (EDO) RAM
FPM evolved into Extended Data Out (EDO) memory. The big improvement in EDO was that once the column of data was deactivated, the data remained valid until the next cycle began. In other words, FPM removed the data bits in the column address and deactivated the column (data output buffer). EDO, on the other hand, kept the data output buffer active until the beginning of the next cycle, leaving the data bits alone.
EDO memory is sometimes referred to as hyper-page mode, and uses a specially manufactured chip that allows a timing overlap between successive read/writes. The data output buffers are not turned off when the memory controller finishes reading a page. Instead, the CPU determines the start of the deactivation process when it sends a new request to the memory controller. The result of this overlap in the process is that EDO eliminated the 10 ns per-cycle delay of fast page mode, generating faster throughput.
Both FPM and EDO memory are asynchronous. (In the English language, the "a" in front of synchronous is called a prefix. The "a" prefix generally means "not," or "the opposite.") In asynchronous memory, the memory controller and the system clock are not synchronized. DRAM is asynchronous memory. In asynchronous mode, the CPU and memory controller have to wait for each other to be ready before they can transfer data. Remember that everything on an asynchronous motherboard listens to the clock ticks coming from the motherboard oscillator.
Originally, motherboard clocks ran between 5 and 66MHz. The early x86 processors ran at 5 to 200MHz. When Pentium Pro motherboards stabilized at 66MHz, CPU speeds went up to between 200MHz and 660MHz. Suppose a motherboard has a 66MHz clock with a clock multiplier of 2. In this situation, the CPU runs at 133MHz (66 * 2, rounded up because the nominal clock is 66.67MHz). Remember, the CPU runs at a multiple of the motherboard clock speed.
Now suppose a memory chip is synchronized to that same motherboard's 66MHz clock. The memory controller will have to wait 2 clock ticks before it can interrupt the CPU, unless it accidentally happens to catch the CPU at exactly the right time. Think about it: the CPU is hearing two ticks for every one tick the memory controller is hearing. If the CPU processes one instruction for every clock tick, then it will do two things before it's ready to be interrupted by the memory controller. This makes the controller seem a bit slow in the head.
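The arithmetic behind the two-ticks-for-one picture can be sketched as follows. The 66.67MHz figure is an assumption (the usual nominal value behind a "66MHz" bus), and the function names are made up for this example.

```python
BUS_MHZ = 66.67   # assumption: nominal value of the "66MHz" motherboard clock
MULTIPLIER = 2    # clock multiplier from the example in the text

def cpu_mhz(bus_mhz, multiplier):
    """CPU clock = motherboard clock times the clock multiplier."""
    return bus_mhz * multiplier

def tick_ns(mhz):
    """Length of one clock tick in nanoseconds."""
    return 1000.0 / mhz

cpu = cpu_mhz(BUS_MHZ, MULTIPLIER)
print(round(cpu))                    # CPU speed in MHz
print(round(tick_ns(BUS_MHZ), 1))    # ns per tick heard by the memory controller
print(round(tick_ns(cpu), 1))        # ns per tick heard by the CPU
print(round(tick_ns(BUS_MHZ) / tick_ns(cpu)))  # CPU ticks per memory tick
```

With these numbers the CPU comes out at about 133MHz, and each memory-side tick of roughly 15 ns spans two CPU ticks of roughly 7.5 ns.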
If you choose, you may jump to the "Types of RAM" section at this time. This section is a more technical discussion of wait states and the memory standards used in rating memory performance.
A 60 ns DRAM module, using fast page mode, might run in 5-3-3-3 burst mode timing. The first access takes 5 cycles (on a 66MHz system), or about 75 ns. The next three accesses take only 3 cycles each, because they "assume" the range addresses. It happens that this works out to about 45 ns per access. Regular old page mode would be 5-5-5-5 burst mode, so the full FPM burst takes 14 cycles (about 210 ns) instead of 20 (about 300 ns), a ninety nanosecond savings in time (about 30 percent).
EDO RAM became popular for a time, with a typical burst cycle of 5-2-2-2, yielding about a 21 percent savings in time over FPM memory (11 cycles instead of 14). 50-nanosecond FPM isn't used anymore. Both FPM and EDO memory eventually gave way to Synchronous DRAM (SDRAM), then to Rambus memory, then to DDR-SDRAM (discussed later in this chapter). SDRAM developed a burst cycle of 5-1-1-1, which is even faster than EDO RAM, so SDRAM became the most popular type of memory toward the end of the 1990s.
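The burst-cycle figures are easy to check by adding up the cycles. The sketch below assumes a 15 ns tick (a 66MHz bus, rounded as in the text) and compares bursts purely by cycle count; the dictionary and function names are invented for the example.

```python
CYCLE_NS = 15  # assumption: one tick on a 66MHz bus, rounded to 15 ns

BURSTS = {
    "page mode": (5, 5, 5, 5),
    "FPM": (5, 3, 3, 3),
    "EDO": (5, 2, 2, 2),
    "SDRAM": (5, 1, 1, 1),
}

def burst_cycles(pattern):
    """Total clock cycles for one four-access burst."""
    return sum(pattern)

def burst_ns(pattern, cycle_ns=CYCLE_NS):
    """Total time for one burst, in nanoseconds."""
    return burst_cycles(pattern) * cycle_ns

def percent_saved(slower, faster):
    """Time saved by the faster burst, as a percentage of the slower one."""
    s, f = burst_cycles(slower), burst_cycles(faster)
    return 100.0 * (s - f) / s

for name, pattern in BURSTS.items():
    print(name, burst_cycles(pattern), "cycles,", burst_ns(pattern), "ns")

print(percent_saved(BURSTS["page mode"], BURSTS["FPM"]))     # 30.0
print(round(percent_saved(BURSTS["FPM"], BURSTS["EDO"]), 1)) # 21.4
```

Under these assumptions a full burst takes 300 ns in page mode, 210 ns in FPM, 165 ns in EDO, and 120 ns in SDRAM.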
The PC100 Standard
The speed of the actual memory chips in a module is only part of how we evaluate memory speed. The other factor is the underlying printed circuit board. Due to the physics of electricity and electronic parts, a module designed with parts that can run at 100MHz may never reach that speed. It takes time for the signals to move through the wire, and the wire itself can slow the signal speed. This led to the same old ratings problems that processors were causing.
Motherboard speeds eventually increased to 100MHz, and CPU speeds went beyond 500MHz. The industry decided that the original SDRAM chips should be synchronized at 100MHz. Someone had to set the standards for the way memory modules were clocked, so Intel developed the PC100 standard. This initial version of the standard made sure that a 100MHz module was really capable of running, and really did run, at 100MHz. Naturally, this created headaches for memory manufacturing companies, but the standard really helped in determining system performance.
The PC100 SDRAM modules required 8 ns DRAM chips, capable of operating at 125MHz. This provided a margin of error, making sure that the overall module would be able to run at 100MHz, according to the standard. The standard also called for a correctly programmed EEPROM, on a properly designed circuit board.
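The link between a chip's nanosecond rating and its top clock speed is a simple reciprocal, sketched below. The function names are made up for this example; the 8 ns and 100MHz values come from the text, and the 10 ns case is included only for comparison.

```python
def max_mhz(cycle_time_ns):
    """Highest clock (in MHz) a chip with this cycle time can keep up with."""
    return 1000.0 / cycle_time_ns

def headroom_percent(chip_ns, module_mhz):
    """Margin between the chip's top speed and the module's rated speed."""
    return 100.0 * (max_mhz(chip_ns) - module_mhz) / module_mhz

print(max_mhz(8))                 # 125.0
print(max_mhz(10))                # 100.0
print(headroom_percent(8, 100))   # 25.0
print(headroom_percent(10, 100))  # 0.0
```

An 8 ns chip leaves a 25 percent margin on a 100MHz module, while a 10 ns chip leaves none.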
At 100MHz and higher, timing is absolutely critical, and everything from the length of the signal traces to the construction of the memory chips themselves is a factor. The shorter the distance the signal needs to travel, the sooner it arrives. Noncompliant modules, those that didn't meet the PC100 specification, could significantly reduce the performance and reliability of the system. The standard caught on. Note that unscrupulous vendors would sometimes use 100MHz SDRAM chips and label the modules as PC100 compliant. A similar situation occurred with processor chips (mentioned in the next chapter). This didn't necessarily do any harm, but consumers weren't getting what they were told they were getting.
As memory speeds increased, the PC100 standard was upgraded to keep pace with the new modules. Intel released a PC133 specification, synchronized to 133MHz, and so it went. PC800 RDRAM was released to coincide with the 800 series chipset, running at 800MHz, and these days, we see a PC1066 specification, designed for high-speed RDRAM memory. As bus speeds and module designs change, so too does the specification.