by Doug Coulter » Tue Jun 21, 2011 10:58 am
One thing to know about any programming language is how it "scopes" things. That's a shorthand way of saying "over what space and time is this thing visible, or in existence".
Depending on the machine and compiler, getting this right can do good things for you.
The main scopes that exist (you have control over a lot of this) are project, file, and routine. For the moment, let's just look at scope as it applies to variables (it can also apply to any named thing).
At project scope, a declared variable exists always, and is visible and changeable by anything. We call those "globals". They are both useful and incredibly dangerous. Some people go to ridiculous lengths to avoid them. I don't, but this is a place for some caution -- if anything can change a variable, it might not have the value you expect next time you check. That can be good (if you're using it for communication between things) or really bad, depending. The always-exists part means the variable's memory is always in use, so it can't be allocated to anything else -- an issue anywhere, but especially on a platform where ram is tight. Also, you need to understand that some variable declarations generate code, which runs at various times depending on scope. For example, at project scope the statement:
float SomePreciseNumber = 3.1415927; // get a copy of Pi -- everyone can see this, and can write it too. Blessing or curse is up to you.
Creates some code that loads up this variable with the number -- but only at startup time. This obviously eats some rom to hold that code, and that number -- you might care, or not.
You can in some cases save yourself some of that via using something like
#define PI 3.1415927
And then everywhere you use PI in code, the compiler simply substitutes the number. Depending on a lot of things (including how often and where you use PI) this might either save code, ram and cycles, or not. Complex expressions in defines are all calculated in the compiler, once, at compile time and take no cycles or space in the target machine, and if you need to use them to make things more obvious, you should -- let the PC do the heavy lifting, once, to save resources in the target project.
Now, what if you were using PI all over? The #define construct can waste memory by putting a copy of the constant everyplace you use it. Not good. For that, C provides another way:
const float PI = 3.1415927; // every use of PI now just shares this one memory location, const means the compiler will throw an error if any code is detected writing to this
// but note, the protection const gives isn't perfect -- if you have this near an array, and write the array out of bounds, the compiler can't know about that. This is the classic buffer overflow
// bug you see in bad windows code all the time in routines that don't check bounds.
For example, you might want to know an array's bounds for some reason -- there's more than one way to do it, as they say over in Perl-land. Here's one way:
#define ARRAYSIZE 32 // you might want to be able to change this for everything that sees the array once in awhile as you're building code -- we do it for buffers here all the time
#define ARRAYBOUND ((ARRAYSIZE * sizeof(int)) - 1) // size in bytes, minus one because things start at zero in C -- so this is the max legal byte offset into the array
int Array[ARRAYSIZE]; // actually declare the array and the memory it's going to eat, and name it "Array"
Some other code can then use ARRAYBOUND to make sure a pointer into the array actually points inside it, by checking against the size in bytes and the array start address, no matter what data type the array is actually made of. This is handy if you're checking by looking at the actual number in a pointer to see where it points, which is going to be in bytes usually.
If the #defines are at project scope - everything can use them. If they have internal calculations, those are all done once in the compiler, which can speed things up at runtime.
Going to the other extreme of scope - let's look at the inside of a subroutine. You might have something along the lines of:
int SomeRoutine (int firstarg, int* secondarg)
{
int localvar = firstarg + *secondarg; // add the args
return localvar;
}
This construct allocates a ram location for localvar every time the routine is run, and frees the memory every time the routine returns. This isn't a perfect example, but just shows that this little bit of memory can be used, and reused all over the program. But only inside this routine does the name localvar exist -- no one else can see it at all, and it's gone between runs of the routine, completely. Now, a seasoned C programmer wouldn't do this for the example case, he'd just write instead:
return firstarg + *secondarg; // saves typing and a memory location because the compiler also allocates one for the return value anyway.
The CCS compiler is insanely good at this, it builds a very complete "call tree" that tells it what routines can call what other ones from inside them -- and makes sure things don't step on each other, while saving the most memory possible -- this is a very useful feature in a PIC to say the least. You can, btw, get that call tree to be displayed, and it will help you understand your own code sometimes...nice.
Suppose you want some variable visible only to some routine(s), but that sticks around between calls? Again, there's more than one way. The classic way is to use some extra brackets to create a scope, like this:
{
int counter = 0; // initialized once at startup only
int countCounter(void)
{
return ++counter; // counts counter, returns new value to caller
}
} // end of special scope that keeps counter statically available, but just to routines inside these brackets
(Note: standard C doesn't actually allow a bare brace block like this at file scope -- some compilers accept it as an extension, but don't count on it being portable.)
This particular case shows up enough that C provides a syntactically simpler way that acts the same. You can just say:
int countCounter(void)
{
static int counter = 0;
return ++counter;
}
And yet another way -- just define the counter someplace global and it will always be there -- but it can also be seen and changed by anything. C gives you a ton of fine-grained control over stuff like this and as usual, it's a mixed blessing. The relevant saying is "C gives you enough rope to shoot yourself in the foot" and truer words are rarely spoken.
To sum up (there's a lot more that could be said, but for now -- keeping it simple): scope determines what the compiler-generated code can see from where, and when any code associated with initialization will run (either wasting cycles, or using them wisely, depending).
Another case where scope comes up is where and when you define and declare stuff, so the compiler has, for instance, seen a subroutine before the code that calls it. While in theory the compiler could just iterate over the code endlessly till it found everything, most (including this one) don't do that, and there are "rules" in C to avoid the necessity of that. You have to at least declare something before you use it (it can be actually defined elsewhere). In my case, I tend to avoid having to do "forward declarations" as it's more typing, and needlessly slows things down some (at compile time only, but it's my time too).
Thus, I'll just define all my global variables and subroutines up at the beginning of things before using them -- it's a way to put together code that's easy to read and understand later. This puts the crucial global variables at the very top of the file (so they're quick to find later) and the all important main code at the very end (so it's easy to find later) while all the subs live in the middle somewhere. I avoid writing pairs of routines that can call one another when I can (almost always) as this requires some kind of forward declaration.
Ok, enough English, now to show this in a real language.
// example of a forward declaration
int DoSomething(int, float); // just declares that someplace else you're going to define this subroutine for real
void somedumbroutine(void) // this somedumbroutine is defined fully right here
{
int someint = 1; float somefloat = 2.0; // example arguments, just for illustration
DoSomething(someint, somefloat); // this would give a compile error if DoSomething wasn't declared first.
}
//later on
int DoSomething (int firstarg, float secondarg)
{
// do whatever DoSomething does here
return 0; // it's declared to return an int, so it has to return something
}
A lot of people do things this way -- I really try to avoid it. For both me and the compiler, it's less work to just define things in a good order and not have to use this "workaround" much.
One use of scope in more advanced situations is when you write reusable libraries -- you might not want to make every variable and routine in some library visible to its users, for various good reasons. The nice thing about libraries is that a well-designed one is a nice chunk of very re-usable code you can pull out of your bag of tricks for many projects. If well written, the user won't have to know much about what's inside, usually. Now, the CCS linker that would be real helpful in using libraries is brain-dead, there are some issues with all those fancy non-C chip-specific directives getting tangled up, and frankly, the PIC is small enough that library reuse is questionable if the library has anything in it not used in this project, so I don't use it! At any rate, most libraries as used are only compiled one last time once they're finished and debugged. To use one, you link in the compiled code and include a header file for that library that specifies just what the library wants to expose to the public -- which can be very good and save a lot of hassle with naming collisions -- you'll run into those when you start doing anything big and complex.
Good names are real important to understanding later on, but doggone it, some of the good names aren't that unique -- clearArray, MoveData, Init() and so on might want to show up a lot, so scope can make it possible to use the same names without confusing the compiler. You just have to be careful you don't confuse yourself.
As usual, there's more than one way, and in my case I make "libraries" that are just #include files....and include them up at the top of the main code so all the declarations are there by the time any of that stuff is called. If I want to use scope to hide some things in that included source, the "extra brackets" trick works, among others. This is a case where what you'd do for a tiny embedded job is a heck of a lot different than what you'd do in a big PC program, where any real handy library is probably already available, dynamically loaded, and can be shared safely across a few running programs as the opsys is clever enough to keep any variables used in a dynamic library as separate copies for each program or process that uses the library -- in a PC it saves space. In a PIC, not so much...C is flexible enough (long rope!) to handle either case well -- but you have to provide all the brains, always. One size does NOT fit all here, embedded is its own specialty and depends more closely on what's available in the platform in use.
This is one reason I don't recommend eating up every book on C programming on the cutout shelf at the bookstore. Yeah, all that stuff is nice to know someday, but some of the more general C tricks that are perfectly appropriate in a PC with "unlimited memory and cycles" are just dumb in a PIC or other embedded device -- in the latter, you're smarter to be a lot more "down on the metal" and not use too much "magic between you and the machine" in general. PICs were originally designed to be logic replacements -- just replace a few gates and flops. They've grown, but the basic idea is still the same, and KISS never applied more than in a small SOC system. You are generally just wanting to make each chunk of PIC hardware do just one thing, well...
Posting as just me, not as the forum owner. Everything I say is "in my opinion" and YMMV -- which should go for everyone without saying.