7th March 2012
Filter Logic 1
In this article (hopefully the first of an occasional series) I will talk about using filters for gameplay - actual game logic, as opposed to graphical effects.
The ActionScript code running in Flash is notoriously slow. This is no longer a problem for the average game, because modern computers are fast enough to deal with reasonably complex game logic for many objects. Even, with the right algorithms, many mutually interacting objects. However, there are still limits to what can be reasonably done.
One way of getting more work done is to arrange for it to be done in bulk by optimised native code.
Flash 8 and above has access to several such functions as graphical filters. These quickly process every pixel
of an image and can provide a very significant speed-up when working on appropriate data - although as gameplay
is not their intended purpose there are arcane and case-specific restrictions.
There will typically be extra manipulations needed to form up the data at each stage, and the functions available
are not always as suitable as we might wish.
With that in mind, it's necessary to design the game around these limitations - if, like me, you are interested in the challenge.
A bitmap within Flash is effectively a 2-dimensional array of either 24-bit or 32-bit values: 8 bits each for red, green and blue, plus optional transparency. These channels are generally processed separately (which greatly restricts the usable number range for many operations). However, how the alpha channel is used depends on the operation; for some it will interact with the other channels.
The processing functions available likewise have some awkward aspects. Rather than dryly describe each in detail, I will try to cover the uses of different calls I have discovered on my forays. For now suffice it to say that we can set, add, subtract and multiply values, as well as move them around and combine them in various ways.
The significant bottleneck to using filters computationally is reading and writing of the information where it
has to be done individually, pixel by pixel. (In AS3 the situation may be slightly better, but at best still involves
ferrying data around.)
Unless the dataset is a reasonable size, and the data-processing pathway long and directly amenable to filter
functions, the cost of doing this will generally mitigate the speed gain from using filters.
This implies that it is necessary to maintain state in bitmap form, as opposed to
attempting to read, update and write any significant portion of the array per game cycle.
When that is accepted, we are lead to to the concept of 'taps'. A tap is any position of the data array which must be read or written during a state cycle for gameplay purposes. I will return to this in a later article.
The most obvious fit for filter processing are Cellular Automata. These are a regular grid of cells, each in a finite number of states. The grid is advanced to a new state by applying a function to each cell, deriving the new state for each from the prior state of itself and neighbouring cells.
Here is a version of Life using filters. The area is 256x256 (pixel-doubled) cells, with wraparound enabled by simply copying the edge-but-one row of each side to the opposite side.
The basic inner-loop is very simple in concept. 'live' cells are white, 0xffffff, while 'dead' cells are
This makes it unnecessary to read out state - instead simply display the resultant bitmap. A convolutionFilter processes each channel separately, so consider each cell as either zero or 255. The filter adds up the eight neighbours plus nine times the cell's old contents, and divides through by 255. Thus they become a value in the range 0..17. These are then processed to set them as either live or dead, which can easily be done with a series of thresholds:
//initialisation ... var bg:BitmapData = new BitmapData(cawidth,caheight,false,0); var pt:Point=new Point(); var cf:ConvolutionFilter=new ConvolutionFilter(3,3,[1,1,1,1,9,1,1,1,1],255); var w:Number=0xffffff; ... //inner loop bg.applyFilter(bg,bg.rectangle,pt,cf); bg.threshold(bg,bg.rectangle,pt,"==",3,w,63);//3 neighbours -> live bg.threshold(bg,bg.rectangle,pt,"==",9+2,w,63);//cell+2 neighbours -> live bg.threshold(bg,bg.rectangle,pt,"==",9+3,w,63);//cell+3 neighbours -> live bg.threshold(bg,bg.rectangle,pt,"<",18,0,63);//everything else -> dead
If you're paying attention, you'll have realised that the Convolution filter is actually doing three times as much work as it needs to, since it processes red, green and blue channels separately. If you were using a CA like this for a large area, it might be a win to overlay different areas together using copyChannel, and split them out for display, particularly if only a fraction is displayed onscreen.
I have a slight optimisation, which replaces the thresholds with one paletteMap filter pass:
var lifelut:Array=[//3,9+2,9+3 -> live else dead 0, 0, 0, w, 0, 0, 0, 0, 0, 0, 0, w, w, 0, 0, 0, 0, 0]; var zero:Array=[//disregard channel 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]; ... bg.applyFilter(bg,bg.rectangle,pt,cf); bg.paletteMap(bg,bg.rectangle,pt,lifelut,zero,zero);
This turns out to be slightly quicker - around 10% saving over the whole cycle. However, the main reason for mentioning this is that paletteMap is a fantastically useful filter. It can often replace several other filters, and will allow you to combine information from different channels. I'm actually using an optimisation of the paletteMap which gives a further 1% speedup, by supplying remapping information for just one colour channel - but I'll leave that as an exercise for the interested reader.
Here are average timings over a representative number of cycles for the core of the above system. I excluded wraparound code so as to concentrate on the inner loop.
|lightly optimised actionscript|| |
|convolutionFilter + 4 threshold passes|| |
|convolutionFilter + paletteMap (3 channel)|
|convolutionFilter + paletteMap (1 channel)|
I'm sure significantly faster actionscript versions of life could be made with progressively more effort; the version above is fairly naive. However, there is scope for improvement of the filter version too (i.e. using all channels for data). Therefore, I think it's fair to say that using filters confers something like two orders of magnitude speed improvement for this algorithm. Your milage will vary depending on filter pathway and processor used.
More next time.
- Filters are fast but awkward
- Game must be designed around filter abilities
- IO is a bottleneck; state must be fully maintained with filters.
- convolutionFilter combines neighbouring cells
- threshold selects certain values
- paletteMap arbitrarily replaces values, and can replace many threshold calls