Gamebuino forum

by **rodot** » Fri Jan 09, 2015 7:48 pm

Well you should call gb.display.clear() before and after you use the display buffer to be sure it's clean.
You have to use the buffer within one frame, as it's updated onto the screen at the end of each frame. Also make sure you draw your sprites and tiles after you use and erase it.

Haha that's quite a short .ino indeed. I use notepad++ too to work on the Gamebuino library.

by **Marcus** » Fri Jan 09, 2015 7:53 pm

Amazing design, I love the animation and the backgrounds, too. Keep up the good work :-)

by **Jamish** » Fri Jan 09, 2015 9:20 pm

rodot wrote:Well you should call gb.display.clear() before and after you use the display buffer to be sure it's clean.
You have to use the buffer within one frame, as it's updated onto the screen at the end of each frame. Also make sure you draw your sprites and tiles after you use and erase it.

Haha that's quite a short .ino indeed. I use notepad++ too to work on the Gamebuino library.

That makes sense. Flashing to memory upon loading the level may be the best course of action. Unless that expansion with the SPI RAM is coming out soon.

Do you use any notepad++ extensions? Something to auto-complete the gb calls, maybe?

Oh, and I wanted to share this part because I thought it was neat... but also because I need some help. In my game, each tile is 4 bits, giving me 16 different tile IDs. But, I dynamically assign the sprites to tiles--that is, rather than using one tile ID for each block type (left corner, right corner, top, interior/wall), I only have one block tile ID stored in the map data. I check for neighboring blocks to decide if it's a corner, top, or interior block. Same thing for trees--there is really only one tree tile ID, but the sprite it draws (top, middle, and trunk) is based off of neighboring tree tiles. That way I'm not wasting 4 tile IDs on blocks and 3 tile IDs on trees. My maps can have a lot of variety with very little memory usage. Unfortunately, I have to access the PROGMEM ~4 times per block to check the neighbors.
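The neighbor check described above can be sketched roughly as follows. This is a host-side illustration, not code from the game: the tile IDs, sprite names, map contents, and the rule that out-of-range neighbors count as empty are all assumptions. On the Gamebuino itself the map would live in PROGMEM and be read with pgm_read_byte.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tile IDs and sprite names, chosen for illustration only.
enum TileId : uint8_t { TILE_EMPTY = 0, TILE_BLOCK = 1 };
enum BlockSprite : uint8_t {
    SPRITE_TOP, SPRITE_LEFT_CORNER, SPRITE_RIGHT_CORNER, SPRITE_INTERIOR
};

constexpr int MAP_W = 4;  // tiles per row
constexpr int MAP_H = 3;
static_assert(MAP_W % 2 == 0, "rows must pack into whole bytes");

// Two 4-bit tile IDs per byte, high nibble first.
// Plain array here; on the device this would be a PROGMEM array.
const uint8_t levelMap[MAP_H * MAP_W / 2] = {
    0x00, 0x00,  // row 0: . . . .
    0x11, 0x10,  // row 1: # # # .
    0x11, 0x10,  // row 2: # # # .
};

// Returns the 4-bit tile ID at (x, y); anything off the map reads as empty.
uint8_t getTile(int x, int y) {
    if (x < 0 || y < 0 || x >= MAP_W || y >= MAP_H) return TILE_EMPTY;
    uint8_t packed = levelMap[(y * MAP_W + x) / 2];
    return static_cast<uint8_t>((x & 1) ? (packed & 0x0F) : (packed >> 4));
}

// Picks the sprite for a block tile from its neighbors (up to 3 extra reads).
BlockSprite blockSprite(int x, int y) {
    if (getTile(x, y - 1) == TILE_BLOCK) return SPRITE_INTERIOR;      // covered from above
    if (getTile(x - 1, y) != TILE_BLOCK) return SPRITE_LEFT_CORNER;   // nothing to the left
    if (getTile(x + 1, y) != TILE_BLOCK) return SPRITE_RIGHT_CORNER;  // nothing to the right
    return SPRITE_TOP;
}
```

Only one block ID is stored in the map; the four sprite variants are derived at draw time, which is where the extra reads per tile come from.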

I seem to recall something about how arrays are accessed--that if you're traversing an array sequentially, it's faster than jumping around a lot (which I'm doing--when I read a tile above or below, that's jumping to a non-adjacent part of the array). How can I make that faster? Would making it a 2D array help at all? Or does that really only apply to architectures with a true memory hierarchy with registers, L1/L2/L3 caches, etc?

Alas, we programmers must always decide between CPU efficiency and RAM efficiency

by **Drakker** » Fri Jan 09, 2015 10:53 pm

You could have a small ram buffer that fits only the blocks visible on screen. Then every frame you could fill that and do all your checks on this one instead. I'm not sure how much slower accessing progmem is, but it might be a worthy trade off.
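A rough host-side sketch of that idea, under stated assumptions: the map size, window size, tile values, and the names fillViewBuffer and viewBuffer are all made up for illustration, and the map is stored one ID per byte in ordinary memory rather than in PROGMEM.

```cpp
#include <cassert>
#include <cstdint>

constexpr int MAP_W = 8, MAP_H = 4;    // hypothetical level size, in tiles
constexpr int VIEW_W = 4, VIEW_H = 2;  // hypothetical visible window, in tiles

// One tile ID per byte for readability; stands in for the PROGMEM map.
const uint8_t levelMap[MAP_H][MAP_W] = {
    {0, 0, 0, 0, 0, 0, 0, 0},
    {0, 0, 2, 2, 0, 0, 0, 0},
    {1, 1, 1, 1, 1, 0, 0, 1},
    {1, 1, 1, 1, 1, 1, 1, 1},
};

// Small RAM copy of just the tiles that are on screen.
uint8_t viewBuffer[VIEW_H][VIEW_W];

// Copies the window whose top-left tile is (camTileX, camTileY) into RAM.
// Tiles outside the map are stored as 0 (empty).
void fillViewBuffer(int camTileX, int camTileY) {
    for (int y = 0; y < VIEW_H; ++y) {
        for (int x = 0; x < VIEW_W; ++x) {
            int mx = camTileX + x;
            int my = camTileY + y;
            bool inside = mx >= 0 && my >= 0 && mx < MAP_W && my < MAP_H;
            viewBuffer[y][x] = inside ? levelMap[my][mx] : 0;
        }
    }
}
```

Each map tile is then read from the slow source once per frame, and all neighbor checks run against viewBuffer. The cost is VIEW_W * VIEW_H bytes of RAM.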

by **Jamish** » Fri Jan 09, 2015 11:03 pm

Drakker wrote:You could have a small ram buffer that fits only the blocks visible on screen. Then every frame you could fill that and do all your checks on this one instead. I'm not sure how much slower accessing progmem is, but it might be a worthy trade off.

I considered that. According to http://forum.arduino.cc/index.php?topic ... msg1013574, it takes 62.5 ns more per byte to read from PROGMEM than from SRAM. There are 17x8 tiles on a screen (more if you consider that half of a tile could be visible on either side). That's maybe 162 tiles (18x9). Even if every tile is accessed 4 times, or 648 reads, a RAM buffer would only save about 0.04 ms per frame. Since a frame is 50 ms at 20 FPS, it doesn't seem worth it.
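The arithmetic behind that estimate, written out as a small sketch. The 62.5 ns figure is the one quoted from the linked post, and the tile and read counts are the ones given above; the constant and function names are made up here.

```cpp
#include <cassert>

constexpr double EXTRA_NS_PER_PROGMEM_READ = 62.5;  // quoted from the linked post
constexpr int TILES_ON_SCREEN = 162;                // 18 x 9, counting partly visible tiles
constexpr int READS_PER_TILE = 4;                   // the tile itself plus neighbor checks
constexpr int FRAMES_PER_SECOND = 20;

constexpr int totalReads() { return TILES_ON_SCREEN * READS_PER_TILE; }

// Time a RAM buffer could save per frame, in milliseconds.
constexpr double savedMsPerFrame() {
    return totalReads() * EXTRA_NS_PER_PROGMEM_READ / 1e6;
}

constexpr double frameBudgetMs() { return 1000.0 / FRAMES_PER_SECOND; }

// Saved time as a fraction of the whole frame budget.
constexpr double savedFractionOfFrame() {
    return savedMsPerFrame() / frameBudgetMs();
}
```

With these numbers, 648 reads at 62.5 ns each come to 0.0405 ms, which is under a tenth of a percent of a 50 ms frame.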

Come to think of it, I wonder if calculating which tile to draw is even the slow part. Could it be slow because I'm drawing over 100 bitmaps to the screen? Is it any slower to draw a bunch of smaller bitmaps vs one giant bitmap?

by **Drakker** » Sat Jan 10, 2015 12:27 am

I don't know much about Arduino in general so I can't be of much help, but one thing I noticed is that drawing bitmaps in the buffer takes a lot of time. In my bricks game at 40 fps I can't fill more than half the screen before the CPU usage grows enough to go over 100%. In my gray test utility I hit the same problem, the latest version (which I haven't posted yet) fills like 1/4 of the screen and CPU usage quickly grows. Strangely, I can fill the screen with characters/strings, and the CPU usage doesn't grow much if at all, yet it is still drawing the same number of pixels on screen. My guess is that something is really slow in the gb.display.drawBitmap function.

by **Jamish** » Sat Jan 10, 2015 6:19 am

rodot wrote: I've always read that Arduino doesn't support dynamic allocation, that everything has to be declared and allocated at compilation time. But I just stumbled on that, which seems to provide a way to do it. They say that it may cause heap fragmentation though.

Ugh. I'm so glad you pointed this out, because I'm running into exactly that problem. If there isn't enough contiguous space in the memory, the creation of an enemy fails silently. Shoot. That throws a massive wrench in the object-oriented programming plan...

by **Myndale** » Sat Jan 10, 2015 9:32 am

You actually have a number of options here. The best plan is probably to design your architecture so that you never have to do any per-frame allocations (e.g. by maintaining a bank of objects) but if that's not an option then one workaround is to declare your own form of the placement new operator like this:

Code:

    inline void * operator new(size_t, void * mem) { return mem; }

You can then create and destroy your instances by calling malloc/free manually and invoking the ctor and dtor directly:

Code:

    Foo * foo = ::new(malloc(sizeof(Foo))) Foo;
    // instance is in scope here
    foo->~Foo();
    free(foo);

Ordinarily you would hide this further by templating new versions of the new and delete operator, but unfortunately Arduino doesn't have a proper pre-processor so you'd want to wrap this in a function or something. Either way the key thing to note is that memory is being allocated with a call to malloc() and you're specifying the size manually. If fragmentation is a serious problem then you can often alleviate it by setting this value to the size of your largest object. You'll allocate more memory overall, of course, but you'll effectively be breaking your memory up into fixed-size chunks, which prevents fragmentation into smaller blocks.
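For the first option (a bank of objects with no per-frame allocation), a minimal sketch might look like the following. It assumes a hypothetical Enemy type and pool size, and it uses the standard <new> header for placement new; on a toolchain that lacks that header, the one-line operator shown above would take its place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>  // standard placement new

// Hypothetical game object, for illustration only.
struct Enemy {
    int16_t x, y;
    int8_t hp;
    Enemy(int16_t px, int16_t py) : x(px), y(py), hp(3) {}
};

constexpr size_t MAX_ENEMIES = 4;  // assumed upper bound, fixed at compile time

// Statically reserved, fixed-size slots: the heap is never touched.
alignas(Enemy) static unsigned char slots[MAX_ENEMIES][sizeof(Enemy)];
static bool slotUsed[MAX_ENEMIES];

// Constructs an Enemy in the first free slot, or returns nullptr if the
// pool is full, so a failed spawn is visible to the caller.
Enemy* spawnEnemy(int16_t x, int16_t y) {
    for (size_t i = 0; i < MAX_ENEMIES; ++i) {
        if (!slotUsed[i]) {
            slotUsed[i] = true;
            return new (slots[i]) Enemy(x, y);
        }
    }
    return nullptr;
}

// Runs the destructor and marks the slot as free again.
void destroyEnemy(Enemy* e) {
    if (e == nullptr) return;
    size_t i = static_cast<size_t>(
        reinterpret_cast<unsigned char*>(e) - &slots[0][0]) / sizeof(Enemy);
    assert(i < MAX_ENEMIES && slotUsed[i]);
    e->~Enemy();
    slotUsed[i] = false;
}
```

Because every slot has the same size and lives in static storage, the pool cannot fragment. The trade-off is that the RAM for MAX_ENEMIES objects is reserved for the whole run.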

by **rodot** » Sat Jan 10, 2015 12:54 pm

Jamish wrote:Come to think of it, I wonder if calculating which tile to draw is even the slow part. Could it be slow because I'm drawing over 100 bitmaps to the screen? Is it any slower to draw a bunch of smaller bitmaps vs one giant bitmap?

Yes, the real bottlenecks here are drawPixel(), drawBitmap() (which might be optimized) and the SPI bus (which is used to communicate with the screen). You can increase the SPI bus clock, but it's already at the screen's maximum so you'll get random pixels if you do so.

The line to change is in Libraries/Gamebuino/display.cpp/Display::begin()

Code:

    SPI.setClockDivider(SPI_CLOCK_DIV8); //can be set to 4 but some random pixels will start to appear on some displays

by **Jamish** » Sat Jan 10, 2015 11:26 pm

rodot wrote:Yes, the real bottleneck here are the drawPixel(), drawBitmap() (which might be optimized)

Funny you should mention that, because I was poking around in drawBitmap a bit last night. Original code:

Code:

    int8_t w = pgm_read_byte(bitmap);
    int8_t h = pgm_read_byte(bitmap + 1);
    bitmap = bitmap + 2; //add an offset to the pointer to start after the width and height
    int8_t i, j, byteWidth = (w + 7) / 8;
    for (j = 0; j < h; j++) {
      for (i = 0; i < w; i++) {
        if (pgm_read_byte(bitmap + j * byteWidth + i / 8) & (B10000000 >> (i % 8))) {
          drawPixel(x + i, y + j);
        }
      }
    }

I didn't do very *official* speed tests, but I used getCpuLoad when drawing a bitmap scene in my game (I removed all enemies to avoid fuzziness) and got "90".

Going from "byteWidth = (w + 7) / 8" to "byteWidth = (w + 7) >> 3" gave me "88", or saving 2 CPU units (what's the max? 128?). I worried that I was bitshifting a signed integer (w) and will encounter problems with sign extension, so I switched w and h to unsigned. You'd think that the "i < w" and "j < h" signed-and-unsigned comparisons would slow it down, but it hovered at "87".

Also, I noticed that you calculated byteWidth once outside the loops, and figured I could do something similar: why not switch the order of i and j around and calculate "i/8" and "i%8" n times rather than n^2 times? Here's the final code I ended up with:

Code:

    int8_t i, j, byteNum, bitNum, byteWidth = (w + 7) / 8;
    for (i = 0; i < w; i++) {
      byteNum = i / 8;
      bitNum = i % 8;
      for (j = 0; j < h; j++) {
        if (pgm_read_byte(bitmap + j * byteWidth + byteNum) & (128 >> (bitNum))) {
          drawPixel(x + i, y + j);
        }
      }
    }

which gave me a CPU load of 80. That's over 10% faster! And all I was drawing was ~50 6x6 bitmaps (which was also running in my game engine--I'm looking to test it with a bitmap-only example to know for sure). And it should only use, what, 2 more bytes of RAM to store byteNum and bitNum?
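One way to sanity-check the swapped loops off-device is a host-side comparison like the sketch below. It is not the library code: pgm_read_byte is replaced by a plain array read, drawPixel writes into a small made-up framebuffer, and the divModCount counter and function names are hypothetical. The tile data is the 8x6 test tile from the barebones example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr int FB_W = 16, FB_H = 16;    // made-up framebuffer size
static uint8_t framebuffer[FB_H][FB_W];
static unsigned divModCount;           // how often i/8 and i%8 get evaluated

// Stand-in for the library's drawPixel: marks one cell in the framebuffer.
void drawPixel(int x, int y) {
    if (x >= 0 && y >= 0 && x < FB_W && y < FB_H) framebuffer[y][x] = 1;
}

// Width 8, height 6, then one byte per row (same data as the test tile).
const uint8_t tile[] = {8, 6, 0xFC, 0x24, 0xFC, 0x90, 0xFC, 0x48};

// Original order: rows outside, columns inside; i/8 and i%8 run w*h times.
void drawRowMajor(int x, int y, const uint8_t* bitmap) {
    int w = bitmap[0], h = bitmap[1];
    bitmap += 2;  // skip the width and height bytes
    int byteWidth = (w + 7) / 8;
    for (int j = 0; j < h; j++) {
        for (int i = 0; i < w; i++) {
            ++divModCount;
            if (bitmap[j * byteWidth + i / 8] & (0x80 >> (i % 8))) drawPixel(x + i, y + j);
        }
    }
}

// Swapped order: columns outside, so i/8 and i%8 run only w times.
void drawColumnMajor(int x, int y, const uint8_t* bitmap) {
    int w = bitmap[0], h = bitmap[1];
    bitmap += 2;
    int byteWidth = (w + 7) / 8;
    for (int i = 0; i < w; i++) {
        int byteNum = i / 8;
        int bitNum = i % 8;
        ++divModCount;
        for (int j = 0; j < h; j++) {
            if (bitmap[j * byteWidth + byteNum] & (0x80 >> bitNum)) drawPixel(x + i, y + j);
        }
    }
}
```

Drawing the tile with each version into a cleared framebuffer and comparing the results with memcmp confirms that both set the same pixels, while the counter shows w*h evaluations for the first version and w for the second. This checks correctness only; it says nothing about timing on the actual hardware.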

EDIT: follow up. Barebones example.

Code:

    #include <Gamebuino.h>
    #include <SPI.h>
    extern const byte font3x5[];

    const byte tile[] PROGMEM = {8,6,0xFC,0x24,0xFC,0x90,0xFC,0x48,};

    Gamebuino gb;

    void setup() {
      gb.begin();
    }

    void loop() {
      if(gb.update()){
        gb.display.setFont(font3x5);
        gb.display.print(F("CPU: "));
        gb.display.println(gb.getCpuLoad());
        for (int i = 0; i < LCDWIDTH; i += 6) {
          for (int j = 5; j < LCDHEIGHT; j += 6) {
            gb.display.drawBitmap(i, j, tile, 0, 0);
          }
        }
      }
    }

Before and after: [CPU load screenshots not preserved]

That would be 16% faster, my friends

Want me to check the changes into the beta branch, or branch off beta and make a pull request?

[WIP] A generic platformer w/ object-oriented game engine
