Levelling Out The Wear

I've been playing around with an Arduino since May, but up until now I haven't had any need to use the 1024 bytes of EEPROM embedded in the ATMEGA328 microcontroller at the heart of the Arduino UNO. EEPROM, or to give it its full name Electrically Erasable Programmable Read-Only Memory, is really useful for storing configuration data as any values saved to EEPROM are retained even when the power is removed.

The main problem with EEPROM is that it can quickly wear out. The datasheet for the ATMEGA328 gives an expected life of just 100,000 writes for each 32 byte page. If you are just using the EEPROM to save configuration data, which won't change very often, then this won't be a problem, but if you intend to frequently write to the EEPROM this could quickly become a real issue. To mitigate this problem what we need is a wear levelling algorithm.

In the sketch I'm working on I have just four bytes that I need to store as configuration data. The easiest option would be to simply store them at offsets 0, 1, 2 and 3, but of course this would mean that every time they were changed the same four bytes (and hence the same 32 byte page) would be re-written, while the rest of the EEPROM would remain unused. Clearly this won't result in even wear across the EEPROM.

My solution, which probably isn't optimal, is to simply shift the data along by one byte each time it needs to be written. This completely ignores the idea of 32 byte pages, but if data is being written often enough then hopefully the pages should get fairly evenly used.

Now the main problem of shifting the data along by one byte each time it changes is knowing where to read it back from when the Arduino starts up. The trick here is to employ some form of multi-byte marker that denotes the location of the data block. In this example I'm going to use the four characters "TS01" to stand for Test Sketch version 1. This allows me to increment the version number if I change the format of the stored data in the future, and tells me which sketch the data refers to. This second point is important if you are going to use the EEPROM across different sketches (less important if you dedicate an Arduino to a single project). So enough talking lets start putting this into practice.

Firstly, while the core Arduino software does include an EEPROM library it's fairly rudimentary. Fortunately another Arduino user has contributed the EEPROMex library which offers a whole bunch of useful methods (note that I did have a few problems getting this library to work; you have to make sure that the folder and filenames all share the same capitalization -- EEPROMex, and the header file doesn't need to include the original EEPROM library). This extended library is capable of writing arbitrary structures to EEPROM, so instead of having separate variables for the configuration data it's easier to store them in a custom structure giving us the following code:
#include <EEPROMex.h>

#define CONFIG_VERSION "TS01"

struct ConfigStructure {
    byte a, b, c, d;
    char version[5];
} config = {
    76, 74, true, 15,
    CONFIG_VERSION
}, stored;

int offset = 0;
As you can see this snippet imports the library, defines a constant to hold the multi-byte marker and then creates the custom structure. We also create two instances of the structure config and stored giving the first one some sensible default values for each variable, so that if we don't find any stored settings the sketch can still operate. You'll also notice that I've put the multi-byte marker at the end. This is so that it only gets written if we have successfully written the rest of the data -- I'll come back to this point later.

Now before we get into actually reading and writing the settings I'm going to introduce a utility function for comparing the current and stored settings.
boolean compareSettings() {
  if (config.a != stored.a) return false;
  if (config.b != stored.b) return false;
  if (config.c != stored.c) return false;
  if (config.d != stored.d) return false;
 
  return true; 
}
Ideally I would have liked to actually pass the two instances to the function so that it wasn't a hard coded comparison of the config and stored instances, but there is an issue with passing user defined structures to functions in the version of the Arduino IDE I'm using (it has been fixed in 1.0.2 but I'm still on 1.0.0). Anyway this function simply checks that the two instances hold identical values for each variable.

So now we can turn our attention to reading stored settings out of the EEPROM.
void setup() {
  //search through the EEPROM for a valid config structure
  for ( ; offset < EEPROMSizeUno-sizeof(stored) ; ++offset) {
      
      //read a struct sized block from the EEPROM
      EEPROM.readBlock(offset, stored);

      if (strcmp(stored.version, CONFIG_VERSION) == 0) {
        //if the block has the right version tag then we can
        //assume that the rest of the struct is valid

        //overwrite the default settings with those we have
        //just loaded from the EEPROM
        config = stored;

        //we don't need to look any further so break out of
        //the loop to speed things up a bit
        break;
     }
  }
  
  if (!compareSettings()) {
    //if they haven't changed the settings from the default then
    //there is no reason to waste an EEPROM write in storing them
    //so assume the stored version is the default
    stored = config;  

    //as there wasn't anything in the EEPROM set the offset to -1
    offset = -1;
  }
}
Hopefully the comments make this code easy to understand, but essentially all it does is read a block of data from the EEPROM, see if it is a valid settings object by checking if the version variable is the same as the CONFIG_VERSION constant. If it is then we ensure that the current config structure is the same as the stored structure, otherwise we move on 1 byte and try again (this all happens in lines 3 to 20, ). Once we have searched through the entire EEPROM, either we will have found some settings and config and stored will be equal, or we haven't found any and they will differ. Note that we control the loop using the EEPROMSizeUno constant defined by the EEPROMex library, if you have a different Arduino then you will need to change this. If, when we have searched through the EEPROM, config and stored aren't the same then lines 22 to 30 make them so (so we don't store the default values to EEPROM for no reason) and reset the offset so that when it is incremented to save the settings it moves to the beginning of the address range.

Now in the main part of your sketch when you decided the settings need to change you can simply update the config instance and call a saveSettings function to actually write them to the EEPROM.
void saveSettings() {
  if (!compareSettings()) {
    //we need to store the new settings in EEPROM

    //move on one position from the previous place we stored things
    ++offset;

    //if writing at offset would mean going outside the EEPROM limit then
    //reset the pointer to the beginning of the EEPROM
    if (offset > EEPROMSizeUno-sizeof(config)) offset = 0;

    //do the actual EEPROM writing
    EEPROM.updateBlock(offset, config);

    //stored should now be the same as config
    stored = config;
  }
}
Essentially if the config and stored instances differ then we need to write config to the EEPROM. We increment the offset (the wear levelling part), check we haven't gone outside the valid range, and then write the block (we actually use update so that we don't write bytes that won't actually change) and now we know that config and stored are the same again.

I mentioned earlier that the reason for putting the multi-byte marker at the end of the config structure was so that I knew once it had been written then the rest of the data must have been as well. The problem here is that if writing fails before the start of the marker we will have no way of knowing, other than we will see random configuration data. The easiest option would be to add a different marker at the start. If both markers are present and in the right place then the data between them must be valid. The problem here is that a four character marker takes up more space than the config data I'm saving so having two of them seems like a waste.

The approach I'm taking is to compute a hash of the four config values and store this as the first piece of data. This means that when I read the structure back I can check for the multi-byte end marker and that the data matches the hash. The problem is that the hash function is specific to the number and type of config variables you want to store and so isn't a generic solution but something you would have to adapt to your needs. As an example though here is how I'm currently hashing the four bytes of configuration data.
const int prime = 31;
int result = 1;
result = prime * result + config.a;
result = prime * result + config.b;
result = prime * result + config.c;
result = prime * result + config.d;
This code (which is based on what Eclipse generates for a four byte java structure) generates an integer (two bytes on the Arduino) based upon the four config values. Changing any of the config values changes the result of the hash function. There is also a second reason for using a hash function: data stored in EEPROM can degrade over time and the hash function allows us to check that the data hasn't changed since we stored it.

I know this hasn’t led to a single example sketch you can use, but hopefully the snippets I've presented will come in handy.

2 comments:

  1. I recognised a couple of words there. The order had me stumped though. Good job you know what you are talking about.

    ReplyDelete
  2. Thanks Mark. I will tag this for the future. I just hope I never need it.

    ReplyDelete