
How a CPU Works: The Nitty Gritty

Started by March 21, 2007 03:02 AM
25 comments, last by python_regious 17 years, 5 months ago
You might find OpenCores.org interesting - lots of VHDL and Verilog source for processors and subsystems of various types.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Quote: Original post by Endar
That's just not fair. I did both the Architecture and Operating Systems subjects (unfortunately I didn't pay any attention in Architecture, but I was able to pass), but we didn't do anything approaching implementing a processor in Verilog.

That's too bad--we built a simple processor in Verilog in a 200-level class at my university (required course for CS and EE students). It was really an eye-opening project for me. To pass the class, you had to run an undisclosed program provided by the professor on your processor (flashed onto an FPGA) and get the correct output.
Zmurf: To understand this, you'll want to study digital logic. There are two broad categories: combinational and sequential. Combinational logic produces an output that depends only on the current inputs (think AND, OR, NOT, XOR, etc.). Sequential logic retains state, so its output can also depend on previous inputs. Sequential circuits are built from combinational elements plus circuits that have "memory" -- latches and flip-flops.
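As a loose software analogy (a sketch in C; hardware isn't actually evaluated this way, and all the names here are made up for illustration), the full adder below is combinational -- its outputs are pure functions of its current inputs -- while the flip-flop is sequential, because it keeps state between "clock edges":

```c
#include <stdio.h>

/* Combinational: outputs depend only on the current inputs.
   A 1-bit full adder expressed with AND/OR/XOR gates. */
unsigned fa_sum(unsigned a, unsigned b, unsigned cin)   { return a ^ b ^ cin; }
unsigned fa_carry(unsigned a, unsigned b, unsigned cin) { return (a & b) | (a & cin) | (b & cin); }

/* Sequential: the output depends on stored state as well.
   A D flip-flop: on each clock edge, Q takes the value of D and holds it. */
struct dff { unsigned q; };
void dff_clock(struct dff *ff, unsigned d) { ff->q = d & 1; }

int main(void)
{
    struct dff ff = { 0 };
    dff_clock(&ff, 1);   /* the state persists after the "edge" */
    printf("sum=%u carry=%u q=%u\n",
           fa_sum(1, 1, 0), fa_carry(1, 1, 0), ff.q);   /* sum=0 carry=1 q=1 */
    return 0;
}
```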

From there on out, it's just a matter of more abstraction. If you're in college, take some digital logic courses and learn an HDL.
----Bart
Here's another reference: Intel® 64 and IA-32 Architectures Software Developer's Manuals
"I thought what I'd do was, I'd pretend I was one of those deaf-mutes." - the Laughing Man
Has anyone used or read this book (Computer Architecture: A Quantitative Approach)? Amazon recommends it in conjunction with Computer Organization and Design. How much is there in the way of overlap - and how do they compare?
[TheUnbeliever]
Quote: Has anyone used or read this book (Computer Architecture: A Quantitative Approach)? Amazon recommends it in conjunction with Computer Organization and Design. How much is there in the way of overlap - and how do they compare?


My Computer Design supervisor informs me that it has pretty much everything Computer Organization and Design has, plus more.

I've read a bit of Computer Architecture, and it means it when it says 'A Quantitative Approach': there's a fair amount of (simple) maths comparing various architectural approaches. It pretty much assumes you know the basics, though (e.g. the basics of pipelining are covered in an appendix), so Computer Organization probably gives more in-depth coverage of them.

Though really, if you want to know the nitty gritty of how a CPU works, I recommend (as others in this thread have) grabbing some free tools from Xilinx or Altera, plus a hardware board, and making a CPU yourself. Books generally don't go into precisely how these things work: you may get lots of detail about how to deal with pipeline hazards, but when it comes to, say, stalling or flushing the pipeline, they don't say how that is actually done in hardware (probably because it can vary quite a bit depending on how you've constructed the thing).
Quote: Original post by Extrarius
While it's possible to make an N-bit adder from a 1-bit full adder, it's generally not the best way to do it (the latency is too high).
Anyway, another proper name for this is "Digital Logic". Once you understand transistors, it's not too big a leap to logic gates and flip-flops (the simplest unit of memory), and once you have those, it's almost trivial to make an ALU (arithmetic logic unit, which does adding etc.). For something simple, you'd just have every possible type of calculation computed (addition, shift, etc.) and then have a multiplexer pick which value to output based on the instruction bits in the opcode/microcode.
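(As a rough sketch of that mux-style ALU in C -- the opcode values and names here are made up for illustration; real instruction encodings vary -- every operation is computed, and the "multiplexer" picks one result based on the opcode bits:)

```c
#include <stdint.h>

enum { OP_ADD, OP_SUB, OP_AND, OP_SHL };   /* hypothetical opcode values */

uint32_t alu(uint32_t a, uint32_t b, unsigned op)
{
    /* Compute every result unconditionally, as the parallel hardware would... */
    uint32_t sum  = a + b;
    uint32_t diff = a - b;
    uint32_t conj = a & b;
    uint32_t shl  = a << (b & 31);

    /* ...then the output multiplexer selects one based on the opcode bits. */
    switch (op) {
        case OP_ADD: return sum;
        case OP_SUB: return diff;
        case OP_AND: return conj;
        case OP_SHL: return shl;
        default:     return 0;
    }
}
```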


I was thinking about this recently while some people in a group I'm in were taking a digital circuitry class that involved making a 4-bit ALU (exactly as you describe). I knew that a, say, 64-bit adder built as just a series of 1-bit full adders would have too long a propagation delay. Just now, I came up with a method for adding that acts somewhat like a Fast Fourier Transform algorithm, and I wanted to know if it is similar to what is done in real processors.

Say you have two 64-bit numbers A and B and you wish to generate the sum C. Break apart A, B, and C in half into A1, A2, B1, B2, C1, and C2 where X1 is the lower order bits and X2 is the higher order bits.

A1 + B1 = C1 (with or without a carry out)
A2 + B2 + carry = C2a
A2 + B2 = C2b

Once C2a and C2b are calculated, it simply becomes a matter of outputting the version of C2 that corresponds to whether or not there was a carry.

This method can be broken down further. Divide A1, B1, and C1 in half and perform the same technique as before. Continue until you reach a size where the propagation delay is negligible.
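(For illustration, here is that idea -- in hardware terms it's known as a carry-select adder -- sketched in C. In hardware the three additions proceed in parallel; C just makes the structure visible:)

```c
#include <stdint.h>

/* A sketch of the scheme above: add the low halves, add the high halves
   twice -- once assuming no carry in, once assuming a carry in -- then a
   multiplexer picks the right high half. */
uint64_t carry_select_add64(uint64_t a, uint64_t b)
{
    uint32_t a1 = (uint32_t)a, a2 = (uint32_t)(a >> 32);   /* low/high halves */
    uint32_t b1 = (uint32_t)b, b2 = (uint32_t)(b >> 32);

    uint64_t low = (uint64_t)a1 + b1;      /* C1; carry out lands in bit 32 */
    uint32_t c2b = a2 + b2;                /* C2b: assumed no carry in      */
    uint32_t c2a = a2 + b2 + 1;            /* C2a: assumed carry in         */

    uint32_t carry = (uint32_t)(low >> 32);
    uint32_t c2 = carry ? c2a : c2b;       /* the selecting multiplexer     */
    return ((uint64_t)c2 << 32) | (uint32_t)low;
}
```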

My question is: is this how it's done, or is there a better trick?

C++: A Dialog | C++0x Features: Part1 (lambdas, auto, static_assert) , Part 2 (rvalue references) , Part 3 (decltype) | Write Games | Fix Your Timestep!

It's generally done using a carry look-ahead adder. As I don't remember precisely how it works and I'm feeling too lazy to remind myself, I don't know how similar it is to your method.
SlimDX | Ventspace Blog | Twitter | Diverse teams make better games. I am currently hiring capable C++ engine developers in Baltimore, MD.
Yep, that's pretty much how it's done.
Quote: Original post by nobodynews
Say you have two 64-bit numbers A and B and you wish to generate the sum C. Break apart A, B, and C in half into A1, A2, B1, B2, C1, and C2 where X1 is the lower order bits and X2 is the higher order bits.

A1 + B1 = C1 (with or without a carry out)
A2 + B2 + carry = C2a
A2 + B2 = C2b

Once C2a and C2b are calculated, it simply becomes a matter of outputting the version of C2 that corresponds to whether or not there was a carry.

This method can be broken down further. Divide A1, B1, and C1 in half and perform the same technique as before. Continue until you reach a size where the propagation delay is negligible.

My question is: is this how it's done, or is there a better trick?


Your explanation is so high-level that it's hard for me to tell which implementation details you have in mind that would lessen propagation delay. The 1-bit-series implementation is actually exactly your description if you break it up all the way down to 1 bit.

Break all the additions down into 1-bit slices, then have the carry out of each slice feed the carry in of the next. That carry determines whether C2a or C2b is chosen: a carry of 1 implicitly selects C2a, a carry of 0 selects C2b.

If you are going to stop breaking it down at some point, for example at 8 blocks of 8 bits, then you have to explain how you are going to make the 8-bit addition within each block fast. This is not really a problem, but whatever circuit you use to implement the 8-bit blocks does not automatically make the design faster than the 1-bit-slices-in-series design. You would have to calculate the propagation delay of those circuits and see whether the total is actually less than that of the 1-bit adders in series.
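To make the 1-bit-slices-in-series version concrete, here's a ripple-carry sketch in C, with each slice written in terms of the gates it would use (this shows the structure, not the gate timing):

```c
#include <stdint.h>

/* One full-adder slice: sum = a XOR b XOR cin, cout = majority(a, b, cin). */
static unsigned full_adder(unsigned a, unsigned b, unsigned cin, unsigned *cout)
{
    *cout = (a & b) | (a & cin) | (b & cin);
    return a ^ b ^ cin;
}

/* Ripple carry: each slice's carry out feeds the next slice's carry in,
   so the worst-case path (the carry chain) grows linearly with the width. */
uint64_t ripple_add64(uint64_t a, uint64_t b)
{
    uint64_t sum = 0;
    unsigned carry = 0;
    for (int i = 0; i < 64; i++)
        sum |= (uint64_t)full_adder((a >> i) & 1, (b >> i) & 1, carry, &carry) << i;
    return sum;
}
```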

Carry look-ahead adders are a modification of the ripple-carry adder that provably lessens propagation delay compared to ripple carry. Read the Wikipedia article, and also follow the 'Lookahead carry unit' link at the bottom, and you can see them build up a 64-bit adder.
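Here's a minimal sketch of a 4-bit carry look-ahead block in C, following the textbook generate/propagate formulation (wider adders chain or nest blocks like this one):

```c
/* 4-bit carry look-ahead (generate/propagate form).
   g_i = a_i AND b_i  (bit i generates a carry by itself)
   p_i = a_i XOR b_i  (bit i propagates an incoming carry)
   Every carry below is a flat two-level AND-OR expression of g, p, and c0,
   so no carry has to ripple through the slices before it. */
unsigned cla_add4(unsigned a, unsigned b, unsigned c0, unsigned *cout)
{
    unsigned g0 = a & b & 1,                p0 = (a ^ b) & 1;
    unsigned g1 = (a >> 1) & (b >> 1) & 1,  p1 = ((a ^ b) >> 1) & 1;
    unsigned g2 = (a >> 2) & (b >> 2) & 1,  p2 = ((a ^ b) >> 2) & 1;
    unsigned g3 = (a >> 3) & (b >> 3) & 1,  p3 = ((a ^ b) >> 3) & 1;

    unsigned c1 = g0 | (p0 & c0);
    unsigned c2 = g1 | (p1 & g0) | (p1 & p0 & c0);
    unsigned c3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & c0);
    *cout = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0)
               | (p3 & p2 & p1 & p0 & c0);

    /* Sum bit i is just p_i XOR c_i. */
    return (p0 ^ c0) | ((p1 ^ c1) << 1) | ((p2 ^ c2) << 2) | ((p3 ^ c3) << 3);
}
```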

It's too bad they only show block diagrams and not the gates within. If they did, it would be easy to compare the two and see that the longest path is much shorter with carry look-ahead than with ripple carry.

Don't think ripple-carry adders are never used, though. If you need a design that takes up little physical area, and the speed of a ripple-carry adder is not a problem, then you could use one.
