
Cache Memory

Do you remember the printing business? Well, the company expanded, meaning there was more and more paperwork. Between print jobs, they had to send copies of financial statements and records off to the accounting department and the government. So the boss hired a secretary. At first, they sent these small jobs to the press room—after all, they were a printing company—but that was costing too much money. Finally, he bought a laser printer for himself (L-1 cache), and one for his secretary (L-2 cache) so they could do these quick little jobs themselves.

Whenever the boss was working up a price quote for a customer, he could set up various calculations and have his secretary print them off. Because they didn't have to go all the way to the press room (main memory), these temporary jobs were extremely quick. The CPU uses Level 1 and Level 2 caching in a similar fashion.

Level 1 (primary) cache memory is like the boss's own personal printer, right there by his desk. Level 2 (secondary) cache memory is like the secretary's printer in the next room. It takes a bit longer for the secretary to print a job and carry it back to the boss's office, but it's still much faster than having to run the job through the entire building.


Remember that the CPU uses memory caches to store data from registers that it will be using again soon. It also uses memory caches to buffer data on its way to memory, because the memory controller is too slow to capture each bit at the CPU's clock speed. L-1 and L-2 caches run at the speed of the processor bus (also known as the front side bus). This allows the caches to capture a bit on every clock tick that the processor sends one, or the reverse.

Memory Caches

Cache (pronounced "cash") is derived from the French word cacher, meaning to hide. Two types of caching are commonly used in personal computers: memory caching and disk caching. A memory cache (sometimes called a cache store, a memory buffer, or a RAM cache) is a portion of memory made up of high-speed static RAM (SRAM) instead of the slower and cheaper dynamic RAM (DRAM). Memory caching is effective because most programs access the same instructions over and over. By keeping as much of this information as possible in SRAM, the computer avoids having to access the slower DRAM.

The memory hierarchy is a way to handle differences in speed. "Hierarchy" is a fancy way of saying "the order of things; from top to bottom, fast to slow, or most important to least important." Going from fastest to slowest, the memory hierarchy is made up of registers, caches, main memory, and disks.


When the processor needs information, it looks at the top of the hierarchy (the fastest memory). If the data is there, a hit occurs and the processor uses it immediately. Otherwise, a so-called miss occurs, and the processor has to look in the next, lower level of the hierarchy. When a miss occurs, the whole block of memory containing the requested information is brought in from a lower, slower hierarchical level. If the faster level is already full, some existing blocks or pages must be removed to make room for the new one.
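The hit, miss, block-fill, and removal steps described above can be sketched in a few lines of Python. This is only a toy model: the names `BlockCache`, `BLOCK_SIZE`, and `CACHE_BLOCKS` are made up for illustration, the sizes are arbitrary, and the choice to remove the least recently used block is an assumption (the text does not say which block gets removed).

```python
from collections import OrderedDict

BLOCK_SIZE = 4      # words per block (illustrative value)
CACHE_BLOCKS = 2    # number of blocks the cache can hold (illustrative value)


class BlockCache:
    """Toy cache that holds whole blocks and evicts the least recently used one."""

    def __init__(self, main_memory):
        self.main_memory = main_memory   # list standing in for slower memory
        self.blocks = OrderedDict()      # block number -> list of words
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_no, offset = divmod(address, BLOCK_SIZE)
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)    # mark as most recently used
        else:
            self.misses += 1
            if len(self.blocks) >= CACHE_BLOCKS:
                self.blocks.popitem(last=False)  # evict the least recently used block
            start = block_no * BLOCK_SIZE
            # Bring in the whole block, not just the requested word.
            self.blocks[block_no] = self.main_memory[start:start + BLOCK_SIZE]
        return self.blocks[block_no][offset]


memory = list(range(100, 132))   # 32 words of pretend main memory
cache = BlockCache(memory)
values = [cache.read(address) for address in (0, 1, 2, 8, 0, 16, 9)]
print(values, "hits:", cache.hits, "misses:", cache.misses)
```

Addresses 1 and 2 hit because the miss on address 0 already brought in their whole block. The last read misses because its block was removed earlier to make room.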

Disk caching is different from memory caching, in that it uses a formula based on probabilities. If you are editing page one of a text, you are probably going to request page two. So even if page two has not been requested, it is retrieved and placed in a disk cache on the assumption that it will be required in the near future. Disk caches use main memory or, in some cases, additional memory included with the disk itself.
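The "page two" idea can be sketched as a simple read-ahead cache. The class name `ReadAheadDiskCache` and its methods are hypothetical, and fetching exactly one page ahead is an assumption made to keep the example small. Real disk caches use more elaborate prediction.

```python
class ReadAheadDiskCache:
    """Toy disk cache that also fetches the page after the one requested."""

    def __init__(self, disk_pages):
        self.disk_pages = disk_pages   # list standing in for the slow disk
        self.cache = {}                # page number -> page contents
        self.disk_reads = 0

    def _load(self, page_no):
        # Only go to the "disk" for pages that exist and are not cached yet.
        if 0 <= page_no < len(self.disk_pages) and page_no not in self.cache:
            self.cache[page_no] = self.disk_pages[page_no]
            self.disk_reads += 1

    def read(self, page_no):
        was_cached = page_no in self.cache
        self._load(page_no)        # fetch the requested page if needed
        self._load(page_no + 1)    # speculatively fetch the following page
        return self.cache[page_no], was_cached


disk = ["page one", "page two", "page three"]
disk_cache = ReadAheadDiskCache(disk)
first = disk_cache.read(0)    # not cached yet, so it comes from the disk
second = disk_cache.read(1)   # already cached by the read-ahead
print(first, second, "disk reads:", disk_cache.disk_reads)
```

The second request is served from the cache even though nothing had asked for that page before, which is the probability bet described above.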

Memory caching is based on things the CPU has already used. When data or an instruction has been used once, the chances are very good that the same instruction or data will be used again. Processing speed can be dramatically increased if the CPU can grab needed instructions or data from a high-speed memory cache rather than going to slower main memory or an even slower hard disk. The L-1, L-2, and L-3 caches are made up of extremely high-speed memory and provide a place to store instructions and data that may be used again.
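To get a rough feel for how much reuse matters, here is a small sketch using the standard average-access-time calculation: every access pays the cache lookup time, and each miss also pays the trip to slower memory. The function name and the latency figures are hypothetical, chosen only to make the arithmetic easy to follow. They are not measurements of any real processor.

```python
def average_access_time(hit_time_ns, miss_penalty_ns, hit_rate):
    """Average time per access: every access pays the cache lookup,
    and misses additionally pay the trip to slower memory."""
    miss_rate = 1.0 - hit_rate
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical latencies, for illustration only.
CACHE_HIT_NS = 2.0
MAIN_MEMORY_PENALTY_NS = 60.0

for hit_rate in (0.50, 0.90, 0.99):
    average = average_access_time(CACHE_HIT_NS, MAIN_MEMORY_PENALTY_NS, hit_rate)
    print(f"hit rate {hit_rate:.0%}: about {average:.1f} ns per access")
```

With these made-up numbers, the more often the CPU finds what it needs in the cache, the closer the average gets to the cache's own speed.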

Using Memory Levels

Here's another way to understand the different levels of a hierarchy. Think of the answer to the following questions, and then watch what happens in your mind. What's your name? This information is immediately available to you from something like the ROM BIOS in a computer. What day is it? This information is somewhat less available and requires a quick calculation, or "remembering" process. This is vaguely like the CMOS settings in the system.

What's your address? Once again you have fairly quick access to your long-term memory, and quickly call the information into RAM (your attention span). What's the address of the White House? Now, for the first time, you're likely to draw a blank. In that case you have two options: The first is that you might remember a particular murder-mystery movie and the title, which acts somewhat like an index pointer to retrieve "1600 Pennsylvania Avenue" from your internal hard drive. In other instances, you'll likely have to access process instructions, which point you to a research tool like the Internet or a phone book.

You should be able to see how it takes longer to retrieve something when you're less likely to use the information on a regular basis. Not only that, but an entire body of information can be stored in your mind, or you may have only a "stub." The stub then calls up a process by which you can load an entire application, which goes out and finds the information. If you expect to need something, you keep it handy, so to speak. A cache is a way of keeping information handy.


Understand that a cache is just a predefined place to store data. It can be fast or slow, large or small, and can be used in different ways.

L-1 and L-2 Cache Memory

The Intel 486 and early Pentium chips had a small cache built into the CPU (8KB on the original 486, 16KB on later 486 models and the first Pentiums), called a Level 1 (L-1), or primary, cache. Another cache is the Level 2 (L-2), or secondary, cache. The L-2 cache was generally a separate memory chip (this is much less common now), one step slower than the L-1 cache in the memory hierarchy. L-2 cache almost always uses a dedicated memory bus, also known as a backside bus (see Figure 2.10 in Chapter 2).

A die, sometimes called the chip package, is essentially the foundation for a multitude of circuit traces making up a microprocessor. Today, we have internal caches (inside the CPU housing) and external caches (outside the die). When Intel came up with the idea of a small amount of cache memory (Level 1), engineers were able to fit it right on the die. The 80486 used this process and it worked very well. Then the designers decided that if one cache was good, two would be better. However, that secondary cache (Level 2) couldn't fit on the die, so the company had to purchase separate memory chips from someone else.


Don't confuse a chip package with a chipset—the entire set of chips used on a motherboard to support a CPU.

These separate memory chips came pre-packaged from other companies, so Intel developed a small IC board to combine their own chips with the separate cache memory. They mounted the cards vertically, and changed the mounts from sockets to slots. It wasn't until later that evolving engineering techniques and smaller transistors allowed them to move the L-2 cache onto the die. In other words, not every design change is due to more efficient manufacturing.


For the purposes of the exam, you should remember that the primary (L-1) cache is internal to the processor chip itself, and the secondary (L-2) cache is almost always external. Modern systems may have the L-1 and L-2 cache combined in an integrated package, but the exam may easily differentiate an L-2 cache as being external. Up until the 486 family of chips, the CPU had no internal cache, so any external cache was designated as the "primary" memory cache. The 80486 introduced an 8KB internal L-1 cache, which was later increased to 16KB. The Pentium family added a 256KB or 512KB external, secondary L-2 cache.

Larger memory storage means more memory addresses, which, in turn, means larger numbers. A CPU register can store only a value of a certain size, so larger numbers mean wider registers, as well as wider address buses. Note that registers (discussed again in Chapter 4) are usually designed around the number of bits a CPU can process simultaneously. A 16-bit processor usually has 16-bit registers; a 32-bit processor has 32-bit registers, and so forth. These larger numbers require a correspondingly wider address bus to move a complete address out of the processor.
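The link between memory size and address width is simple powers of two, and a short sketch makes it concrete. The function names are made up for illustration, and the example assumes each address names one byte.

```python
def address_bits_needed(num_locations):
    """Smallest number of address bits that can name every location."""
    return max(1, (num_locations - 1).bit_length())


def addressable_bytes(address_bits):
    """How many byte locations an address of this width can name."""
    return 2 ** address_bits


# Assumes byte-addressable memory; widths chosen as familiar examples.
for bits in (16, 20, 32):
    print(f"{bits}-bit addresses reach {addressable_bytes(bits):,} bytes")

one_megabyte = 1024 * 1024
print("1MB needs", address_bits_needed(one_megabyte), "address bits")
```

Each extra address bit doubles the memory that can be reached, which is why larger memories push designers toward wider registers and buses.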


You should be getting a sense of how larger and faster CPUs generate a chain of events that lead to whole new chipsets and motherboards. Not only does the chip run faster, but the internal registers grow larger, or new ways to move instructions more quickly demand faster bus speeds. Although we can always add cells to a memory chip, it isn't so easy to add registers to a microprocessor.

Larger numbers mean the memory controller takes more time to decode the addresses and to find stored information. Faster processing requires more efficient memory storage, faster memory chips, and better bus technology. Everything associated with timing, transfers, and interruptions must be upgraded to support the new central processor.

L-3 Caches

You may see references to an L-3, or Level 3, cache. Tertiary (third) caches originated in server technology, where high-end systems use more than a single processor. One way to add an L-3 cache is to build some additional memory chips directly into the North Bridge. Another way is to place the cache into a controller sub-system between the CPU and its dependent devices. These small I/O managers are part of hub architecture, discussed in Chapter 5. Newer Pentium 4 processors use pipelines up to 20 stages deep; an L-3 cache would also be a way to offload next-due instructions from a memory controller.

Simply put: More and more CPUs have both the L-1 and L-2 cache built right onto the die. If a third cache remains outside the die, many people refer to it as a Level 3 cache. Level 3 caches are usually larger than L-2 caches, more often in the 1MB size range. All three types of cache usually run at the processor speed, rather than the speed of a slower memory bus. (Benchmark tests on single-processor systems have shown that an L-2 cache peaks out at about 512KB, so adding more memory to a third-level cache isn't always going to increase system performance.)
