Date: Mon, 04 Jan 2010 17:43:46 +0100
From: Terje Mathisen <"terje.mathisen at tmsw.no">
Newsgroups: comp.arch
Subject: Re: Larrabee delayed: anyone know what's happening?
Message-ID: <4fi917-n1q1.ln1@ntp.tmsw.no>

Thomas Womack wrote:
> How fast are the computation fronts in your jobs, and how big the
> kernels?  Bit-interleaving the addresses seems to be the standard
> trick for dealing with two- and three-dimensional mainly-local jobs in
> one-dimensional memories with caches; at least the cache lines now
> contain data which is generally used together.

This same trick is so useful that I have seen at least one modern core
which has dedicated instructions for 2D and 3D interleave.

I'm guessing that it wouldn't be too hard to use those operations to
support a few higher dimensions as well, with one or two more steps.
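
For readers without such instructions, the interleave can be sketched in
portable C with the usual shift-and-mask steps. This is only an
illustration: the names spread_by_2, spread_by_3, interleave2 and
interleave3 are made up here, and the limits of 16-bit coordinates (2D)
and 10-bit coordinates (3D) are assumptions chosen so the result fits in
32 bits. It is not the dedicated instruction mentioned above.

```c
#include <stdint.h>

/* Spread the low 16 bits of v so that input bit i lands on bit 2*i. */
static uint32_t spread_by_2(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Spread the low 10 bits of v so that input bit i lands on bit 3*i. */
static uint32_t spread_by_3(uint32_t v)
{
    v &= 0x000003FFu;
    v = (v | (v << 16)) & 0xFF0000FFu;
    v = (v | (v << 8))  & 0x0300F00Fu;
    v = (v | (v << 4))  & 0x030C30C3u;
    v = (v | (v << 2))  & 0x09249249u;
    return v;
}

/* 2D interleave: x occupies the even bits, y the odd bits. */
uint32_t interleave2(uint32_t x, uint32_t y)
{
    return spread_by_2(x) | (spread_by_2(y) << 1);
}

/* 3D interleave: x, y and z take every third bit, in that order. */
uint32_t interleave3(uint32_t x, uint32_t y, uint32_t z)
{
    return spread_by_3(x) | (spread_by_3(y) << 1) | (spread_by_3(z) << 2);
}
```

Each extra dimension widens the gap between input bits by one, so a
higher-dimensional version follows the same pattern with different masks
and fewer bits per coordinate.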

Many, many years ago it was used to tile geometry for a flight
simulator; it is probably still used like that.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"


Date: Mon, 04 Jan 2010 19:25:31 +0100
From: Terje Mathisen <"terje.mathisen at tmsw.no">
Newsgroups: comp.arch
Subject: Re: Larrabee delayed: anyone know what's happening?
Message-ID: <tdo917-69q1.ln1@ntp.tmsw.no>

Thomas Womack wrote:
> each way using 'standard' instructions; IIRC Larrabee doesn't have an
> uninterleave operation, you interleave to get an address for a texture
> sampler and who would want to uninterleave afterwards.  Can't use a
> naive magic multiplier: 8N=2 and 64N=4 are incompatible even mod 2^32.

I think the proper response is simply "Don't do that!", i.e. make sure
you never need to uninterleave by keeping the normal coordinates around.
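
To give a sense of what is being avoided, here is a rough sketch of a 2D
uninterleave using only standard shift-and-mask operations. The names
even_bits and uninterleave2 are hypothetical, and the sketch assumes a
32-bit interleaved value with x in the even bits and y in the odd bits.
It costs about as many steps per coordinate as the interleave itself,
which is the work saved by keeping x and y around.

```c
#include <stdint.h>

/* Gather the even-numbered bits of v into the low 16 bits of the result. */
uint32_t even_bits(uint32_t v)
{
    v &= 0x55555555u;
    v = (v | (v >> 1)) & 0x33333333u;
    v = (v | (v >> 2)) & 0x0F0F0F0Fu;
    v = (v | (v >> 4)) & 0x00FF00FFu;
    v = (v | (v >> 8)) & 0x0000FFFFu;
    return v;
}

/* Recover x (even bits) and y (odd bits) from an interleaved value m. */
void uninterleave2(uint32_t m, uint32_t *x, uint32_t *y)
{
    *x = even_bits(m);
    *y = even_bits(m >> 1);
}
```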

>> Many, many years ago it was used to tile geometry for a flight
>> simulator, it is probably still used like that.
>
> The ATI detailed documentation indicates that textures are stored in
> an interleaved format, though I think it's not bitwise, it's something
> like interleaved up to X[6] then Y[5..3] X[5..3] Y[2..0] X[2..0], but

You would not want bitwise interleaving of the actual data; use the
bit-interleaved value as an index into the texture array instead, i.e.
load a whole element of 8-128 bits from each index.
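
A minimal sketch of that indexing scheme, under stated assumptions: a
square 256x256 texture, 32-bit texels, and the made-up names TEX_SIZE,
texel_index and fetch_texel. The texel data itself is left untouched;
only the position in the array is bit-interleaved, so each aligned 2x2
(and 4x4, 8x8, ...) block of texels sits in consecutive memory.

```c
#include <stddef.h>
#include <stdint.h>

#define TEX_SIZE 256u  /* assumed: square texture, power-of-two side */

/* Spread the low 16 bits of v so that input bit i lands on bit 2*i. */
static uint32_t spread_by_2(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Array position of texel (x, y); coordinates wrap at TEX_SIZE. */
size_t texel_index(uint32_t x, uint32_t y)
{
    x &= TEX_SIZE - 1u;
    y &= TEX_SIZE - 1u;
    return (size_t)(spread_by_2(x) | (spread_by_2(y) << 1));
}

/* Load one whole texel; texels must hold TEX_SIZE * TEX_SIZE entries. */
uint32_t fetch_texel(const uint32_t *texels, uint32_t x, uint32_t y)
{
    return texels[texel_index(x, y)];
}
```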

I think that flight simulator used the interleaved index to load 256x256
pixel textures.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
