Milktrader

Iterating Until Convergence

Wednesday, July 7, 2010

C Pointers, an Un-Tutorial

Most times you find a post on a topic, it's authored by someone who has an idea of what they're talking about. Or at least that is the assumption. The following is a violation of the blogging social contract. It is more questions than answers.

In C, we define a variable by preceding it with a data type, thus:

int target;  // target is a variable of type integer 


This is not difficult to grasp. The syntax is awkward, yes, but Dennis Ritchie skipped out on some Literature classes in college apparently, so this is the historical legacy we have to deal with.

Now to the issue of pointers. A pointer is declared thusly:

int *arrow;   // arrow is a variable of type int pointer


Okay, great. We've got a pointer. It doesn't point to anything though. We can fix that, but this is where the mystery starts. I've 'pointed' arrow to target in two different ways. Which one is correct, and why?

*arrow = &target;  // & is required as an address operator

arrow = &target;   // notice the * prefix has been omitted 


In the above example, we are attempting to point the 'arrow' variable at the 'target' address. In the next example, we would like to assign an integer value to 'arrow'. Two different ways of typing it in, which one is correct and why?

*arrow = 3;  

arrow = 3;   


This is why it's important to raise your children to read novels, literature and other forms of creative writing. They may be writing the next programming language and your grandchildren will sacrifice precious brain cells trying to decipher needlessly cryptic syntax. My last question: Really?

6 comments:

  1. For reading C declarations, see http://unixwiz.net/techtips/reading-cdecl.html

    For the rest, you have to remember that a pointer is just a variable that holds a memory location, the * operator dereferences the pointer (uses whatever it points to as a memory location), and the & operator retrieves the memory location of a variable (the opposite of *).

    So in your first example, in line 1 you're assigning the address of target (&target) to the location pointed to by arrow. Unless the location pointed to by arrow is itself a pointer, this is probably not what you want. In line 3 of the first example, you're assigning the address of target to arrow itself, and since arrow is a pointer, arrow now points to target.

    In the second example, in line 1, you're assigning 3 to the address pointed to by arrow. So if arrow pointed to target (arrow = &target) then target would now have the value 3. In line 3 you're assigning 3 to arrow, so arrow now points at whatever is in memory address 3 - probably an illegal memory address on a PC but this might be a valid thing to do on an embedded system.

    All this stuff gets much, much easier once you've done some assembly language programming and you're used to throwing around values and the addresses of values, but the easiest way to approach it is to try and remember that pointers are just variables that hold a memory address.
  2. Thanks Tim.

    I'm not quite to the aha moment, but you've moved me to the ah moment.
  3. Starting with the big picture: C is probably the wrong language to explore trading systems in. C could be the right language to build the core of say high performance backtesting software, but it takes many lines of code and a lot of debugging time to create a correct program. (I like C a lot and have written lots of it, but I don't use it for exploring conceptual problems like trading system development because it takes too much dev time.)

    I think Scheme is a great balance between computational performance, code size, and development time. Also since Scheme is a small language there are many high quality FOSS implementations available.

    Medium picture: there are not very many good uses for C pointers and lots of bad uses. One exception is pointers to structs or large arrays. If you have a bunch of data to pass around then pass the pointer to the struct or array. Don't put a large block of data on the call stack. There is even a special syntax to make it quick to simultaneously dereference a struct pointer and access an element: mystruct_ptr->element

    Small picture: there are Linux apps called valgrind and splint which can help you find all of the problems that writing code in C will cause you. They are the bandaids you will need if you tap-dance in a drawer full of knives ;0

    Micro picture: Do write "int* arrow;" instead of your declaration example. This is more pedantic, putting the type designation "pointer to int" together in one piece. This reserves the phrase "*arrow" to only mean follow(arrow).

    Don't write "*arrow = &target;". Don't write "arrow = 3". I would suggest pronouncing "&target" as "get_address(target)".
  4. Thanks Matthew:

    I think I'm past the mental block, and your -- hate to use the word -- pointers as well as Tim's have been clarifying.

    Since you are a fan of the Scheme dialect, are you at all familiar with Clojure? It appears it has some utility in the world of machine learning algorithms.

    On the point of Medium Picture - it helps that I have a potential use for pointers, and as far as trading applications go, struct and large arrays will be my focus.

    In the Big Picture, I'm learning C as a springboard to learning other languages, and as a tool for future R package development, since R is written on top of C. I don't plan on becoming an 'expert' of C in 30 days, but I'd like a working proficiency of it.

    Here is my mnemonic for keeping the syntax of pointers straight.

    arrow is an inactive thing in a quiver; all you can do with it is read the print on it and write notes (or addresses) on it. So arrow = &target makes sense, since you get the address of target and sharpie it onto the arrow. arrow = 3 means that you sharpie the number 3 onto the arrow. 3 is an improper address, just as if you were to try to send a bill to:

    Company
    3
    Planet Earth

    Now if you assign 1570fff6 to arrow, now we're talking. That is a proper address. Make sure you're not sending your bill payment to Grandma's house though.

    *arrow is a notched, activated thing whose activity is to point somewhere. All notched arrows have to point somewhere. As with all weapons, never, ever point an arrow at a place you don't want it to go. *arrow = &target reads as "point the arrow at get the address of target". This may make sense to a Buddha since they talk like this all the time ('the shape of my clock is orange', for example) but for normal conversation it makes zero sense. *arrow = 3 is a bit of a stretch with my metaphor, but you assign the number 3 to the tip of the arrow, and when it reaches its target it will attach the number 3 to it.
  5. Yes, it sounds like you figured pointers out.

    I haven't used Clojure. If I was doing concurrent programming on the JVM, I would look seriously at it. For my purposes I like the performance of non-VM Schemes like Larceny and Ikarus. Also code that I write in Scheme today is going to work in 20 years. Clojure code from today is probably not going to work in 20 years because the language is younger and still growing.
  6. I'm not sure how I learn best, hence my need to iterate.

    Method 1: create a clever metaphor that connects an unnatural expression to a real-world event you can comprehend.

    Method 2: hack your own code until it works.

    Perhaps a hybrid approach is best.