From: wallace@netcom.com (David E. Wallace)
Subject: Re: Revised <inttypes.h> spec ready for review
Date: 10 Sep 1997
Newsgroups: comp.std.c

In article <5ulkj8$7rm@dfw-ixnews3.ix.netcom.com>,
Douglas Gwyn  <gwyn@ix.netcom.com> wrote:
[On Stupid Pointer Tricks that exploit pointer->integer conversions]:

>Such tricks exploit specific knowledge of an implementation's
>representation(s) of pointers, and are inherently nonportable.
>I'm not proposing to stop nonportable programs from using such
>tricks (when necessary, which is *very* rarely), but rather
>that the C standard stop saying anything about this if what it
>says has no force anyway -- it's just misleading.

>I'm not sure what good it will do you to have a test for the
>existence of an ill-defined intptr_t, for two reasons:
>(a) you still need to know more, namely enough about the
>representation details to find spare bits (if any; they might
>not be available) for your Stupid Pointer Tricks; and
>(b) if you're going to have alternative code for use when there
>is no intptr_t, then you don't need the hackery in the first place.

Ok, I'll take that as a challenge.  I believe the following code is
portable to any implementation where you can store a converted pointer
in an unsigned long, regardless of the details of the implementation's
pointer representation.  No #ifdefs.  You don't necessarily get the
desired performance benefits on architectures that differ from the
assumptions made in the code, but it should compile and run correctly.
In C9X, you could change the typedef to intptr_t, and have
it be portable to any implementation that supports intptr_t, even
if pointers don't fit in an unsigned long.

I cooked up this example based on a conversation with a friend who is
one of the most performance-oriented C programmers I've ever met.  If
I recall correctly, he claimed he once observed around a 15-20% speedup
on some code of his by cache-aligning certain data structures.  Your mileage
may, and probably will, vary.

I don't normally recommend doing this sort of thing just for the heck of it.
But I can see that there could be some applications where you really
need the performance on some of your target platforms, and are willing
to live with whatever you get on the rest.

Dave W.
*************************
#include <stdlib.h>

typedef unsigned long ptrint;  /* int type that can hold a pointer */
typedef long a[4]; /* a is array of 4 longs */
#define NUM_LONGS_IN_A (sizeof(a)/sizeof(long))

int find_num_of_as_to_allocate(void);
void routine_expecting_array_of_alen_as(a arg[]);

/* The following routine allocates an array of alen a objects for
 * the routine routine_expecting_array_of_alen_as to play with.
 * On sane architectures, the code here should align this array
 * to a boundary that is a multiple of sizeof(a) bytes.  This may
 * result in performance improvements if the cache line size is
 * a multiple of sizeof(a) bytes, by ensuring that no element of
 * the array straddles a cache line.  Performance gains (if any) depend
 * on the access patterns in the above routine, of course.  The code is
 * intended to be strictly conforming, except for architectures where
 * the ptrint typedef cannot hold a converted pointer.
 *
 * I do not guarantee that this code is bug-free.  It has, however,
 * been compiled with "gcc -pedantic -ansi -Wall" without warnings.
 *
 */
int main(void)
{
        ptrint i;
        int alen, base_offset;
        long *baseptr, *alignptr;

        alen = find_num_of_as_to_allocate();

        /* Allocate one extra array element so we have room to play with
         * the alignment */
        baseptr = malloc((alen + 1) * sizeof(a));

        if (baseptr) {
                i = (ptrint) baseptr;
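                /* Turn the byte address into an index in longs before
                 * reducing it.  malloc's result is long-aligned, so
                 * (i % NUM_LONGS_IN_A) on the raw byte address would be
                 * 0 whenever sizeof(long) is a multiple of 4. */
                base_offset = NUM_LONGS_IN_A
                        - ((i / sizeof(long)) % NUM_LONGS_IN_A);

                base_offset = base_offset % NUM_LONGS_IN_A;
                /* This last line maps an offset of NUM_LONGS_IN_A back
                 * to 0 when baseptr is already aligned.  It is also
                 * paranoia, in case ptrint becomes a signed type in the
                 * future.  We now know that
                 * 0 <= base_offset < NUM_LONGS_IN_A. */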

                alignptr = baseptr + base_offset;
                /* On sane architectures, where the low-order bits of
                 * a pointer are the byte address, and pointer->int
                 * conversions preserve these bits in the low-order bits
                 * of the converted int, alignptr should now be
                 * an address that is an exact multiple of sizeof(a).
                 * On other architectures, this produces a pointer
                 * that is long-aligned, but doesn't necessarily observe
                 * any stricter alignment.
                 */

                routine_expecting_array_of_alen_as((a *)alignptr);
                free(baseptr);
        }
        return 0;
}
--
Dave Wallace		(wallace@netcom.com)
It is quite humbling to realize that the storage occupied by the longest
line from a typical Usenet posting is sufficient to provide a state space
so vast that all the computation power in the world can not conquer it.
