Quick and Dirty AArch64 MMU setup

13.9.2024

Introduction

This post covers the process of initializing the memory management unit (MMU) for an ARMv8-A processor executing in AArch64 state. The example code is from a bare metal project I am currently working on and is meant to be run on the QEMU system emulator's virt board with CPU type cortex-a53 and 2 GB of RAM.

Entering Exception Level 1 (EL1)

The setup is done in Exception level 1 (EL1), where OS kernels usually operate. Before the MMU configuration we need to make sure that we are in EL1. This is done by looking at the value of the CurrentEL register. Only bits [3:2] of CurrentEL carry information: they hold the current Exception level. All other bits are reserved and read as zero. In the boot assembly we write:


// ...
    msr     spsel, 1
    mrs     x0, CurrentEL
    cmp     x0, 0b0100
    beq     in_el1
    blo     in_el0
    cmp     x0, 0b1000
    beq     in_el2
    b       in_el3      // Anything higher than EL2 is EL3
in_el0:
    b       .
in_el3:
    mrs     x0, scr_el3
    orr     x0, x0, (1 << 10)
    orr     x0, x0, (1 << 0)
    and     x0, x0, ~(1 << 3)
    and     x0, x0, ~(1 << 2)
    and     x0, x0, ~(1 << 1)
    msr     scr_el3, x0
    mov     x0, 0b01001
    msr     spsr_el3, x0
    adr     x0, in_el2
    msr     elr_el3, x0
    eret
in_el2:
    mrs     x0, hcr_el2
    orr     x0, x0, (1 << 31)
    and     x0, x0, ~(1 << 5)
    and     x0, x0, ~(1 << 4)
    and     x0, x0, ~(1 << 3)
    msr     hcr_el2, x0
    mov     x0, 0b00101
    msr     spsr_el2, x0
    adr     x0, in_el1
    msr     elr_el2, x0
    eret
in_el1:
    mov     x0, 0b0101
    msr     spsr_el1, x0
    msr     DAIFSet, 0b1111
// We begin MMU initialization here.

If the CurrentEL value equals 0b0100 (bits [3:2] equal to 0b01) we are in EL1 and can jump straight to in_el1. If CurrentEL is lower than 0b0100 we know that we are in EL0. Since one cannot enter a higher Exception level from a lower one, we just enter an infinite loop. In the case of CurrentEL being 0b1000 we jump to in_el2. If it is higher we are in EL3 and continue to in_el3.
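The decoding of CurrentEL can be checked on the host with a few lines of Python. This is only a sketch for experimenting with the bit layout described above; the function name current_el is my own and is not part of the boot code.

```python
def current_el(reg_value: int) -> int:
    """Return the Exception level stored in bits [3:2] of a CurrentEL value."""
    return (reg_value >> 2) & 0b11


# The four raw values the boot code can observe.
for raw in (0b0000, 0b0100, 0b1000, 0b1100):
    print(f"CurrentEL={raw:#06b} -> EL{current_el(raw)}")
```

The comparisons in the assembly work on the raw register value, which is why EL1 shows up there as 0b0100 and EL2 as 0b1000.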

In EL3

While in EL3 we modify the Secure Configuration Register scr_el3 and the Saved Program Status Register spsr_el3. By setting bit [10] (RW) of scr_el3 to 1 we specify that the next lower level (EL2) is AArch64. By setting bit [0] (NS) to 1 we specify that the Exception levels below EL3 are in Non-secure state, and that memory accesses from those ELs cannot access Secure memory. This may not be necessary in all cases of MMU initialization. We also make sure that bits [3] (EA), [2] (FIQ) and [1] (IRQ) are all set to 0 so that external aborts, SError interrupts, physical FIQs and physical IRQs are not taken to EL3 when executing at other ELs, EL1 in our case.
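To make the read-modify-write on scr_el3 concrete, here is a small host-side Python sketch of the same bit operations. The constant and function names are made up for this illustration; only the bit positions come from the description above.

```python
SCR_RW = 1 << 10   # next lower level is AArch64
SCR_EA = 1 << 3    # route external aborts and SErrors to EL3
SCR_FIQ = 1 << 2   # route physical FIQs to EL3
SCR_IRQ = 1 << 1   # route physical IRQs to EL3
SCR_NS = 1 << 0    # lower levels are Non-secure


def configure_scr_el3(old_value: int) -> int:
    """Mirror the orr/and sequence of the boot code on a plain integer."""
    value = old_value | SCR_RW | SCR_NS
    value &= ~(SCR_EA | SCR_FIQ | SCR_IRQ)
    return value


print(hex(configure_scr_el3(0b1110)))
```

Whatever the routing bits were before, they end up cleared, while RW and NS end up set.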

For spsr_el3 we set bit [4] to 0 so that eret returns to AArch64 state. We set bits [3:0] to 0b1001: bits [3:2] hold the target Exception level (0b10 for EL2) and bit [0] is set to 1 to select that level's own stack pointer (SP_EL2) instead of SP_EL0. eret restores the processor state from spsr_el3, so this is the state we end up in when continuing at in_el2.

Next we write the address of in_el2 to the Exception Link Register elr_el3. We then execute eret, which switches the Exception level from EL3 to EL2 and jumps to in_el2.

In EL2

The steps in EL2 are fairly similar to those in EL3. The configuration register of EL2 is the Hypervisor Configuration Register hcr_el2. We set bit [31] (RW) to 1 since EL1 will be AArch64. Bits [5] (AMO), [4] (IMO) and [3] (FMO) should all be set to 0 so that external aborts, SError interrupts, physical IRQs and physical FIQs are not taken to EL2 when executing at ELs lower than EL2.

The spsr_el2 is set up the same way as spsr_el3 but with bits [3:2] corresponding to EL1 (0b01). We write the address of in_el1 to elr_el2 and execute eret to switch to EL1 and jump to in_el1.
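The two SPSR values used so far (0b01001 for the return to EL2 and 0b00101 for the return to EL1) follow the same pattern. A minimal Python sketch, with a helper name (spsr_mode) that I made up for illustration, shows how they are put together:

```python
def spsr_mode(target_el: int, use_sp_elx: bool = True, aarch32: bool = False) -> int:
    """Build bits [4:0] of an SPSR for an eret to EL1, EL2 or EL3.

    Bit [4] selects AArch32 (1) or AArch64 (0), bits [3:2] hold the target
    Exception level and bit [0] selects SP_ELx (1) or SP_EL0 (0).
    """
    if target_el not in (1, 2, 3):
        raise ValueError("this sketch only covers EL1 to EL3")
    return (int(aarch32) << 4) | (target_el << 2) | int(use_sp_elx)


print(bin(spsr_mode(2)), bin(spsr_mode(1)))
```

Note that the mask bits D, A, I and F (bits [9:6]) are left at zero by these values, which is why the boot code masks interrupts separately once it reaches EL1.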

In EL1

In EL1 we set up spsr_el1 the same way as in the other ELs. We also set all interrupt mask bits (D, A, I and F) to 1 through DAIFSet. In addition to these steps one could also set up the exception vector table at this point.

Setting up the MMU

Since this is a "quick and dirty" setup, we try to get a working MMU with as little configuration as possible. We will be using five system registers: the Memory Attribute Indirection Register mair_el1, the Translation Control Register tcr_el1, Translation Table Base Registers ttbr0_el1/ttbr1_el1 and the System Control Register sctlr_el1.

Since we are in AArch64 our virtual address space spans from 0x0000_0000_0000_0000 to 0xffff_ffff_ffff_ffff. We are going to separate the space into user (0x0000_0000_0000_0000-0x0000_ffff_ffff_ffff) and kernel space (0xffff_0000_0000_0000-0xffff_ffff_ffff_ffff).

We will be using a 4 KB granule size with 48-bit virtual addresses. A 48-bit virtual address has the following structure:


+--------------+--------------+--------------+--------------+-------------+
| bits [47:39] | bits [38:30] | bits [29:21] | bits [20:12] | bits [11:0] |
|--------------+--------------+--------------+--------------+-------------|
| L0 table idx | L1 table idx | L2 table idx | L3 table idx | offset & pa |
+--------------+--------------+--------------+--------------+-------------+

The bits [11:0] describe the offset within the 4 KB page (pa refers to physical address). The bits [20:12] act as an index into a table (Level 3 table) of 4 KB pages with 512 entries (indexes range from 0b000000000 to 0b111111111). The bits [29:21] in turn act as an index into a Level 2 table whose entries each point to a Level 3 table. A Level 2 table entry can also point to a 512 * 4 KB = 2 MB block of memory. Similarly, bits [38:30] act as an index into the Level 1 table, with each entry pointing to either a Level 2 table or a 512 * 512 * 4 KB = 1 GB memory block. Each Level 0 entry (indexed by bits [47:39]) covers 512 GB and points to a Level 1 table. With the 4 KB granule on ARMv8.0 cores such as the cortex-a53, a Level 0 entry cannot describe a memory block.
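The splitting of a virtual address into table indexes is easy to try out on the host. The following Python sketch (the function name split_va and the sample address are only illustrative) extracts the four 9-bit indexes and the 12-bit page offset for the 4 KB granule:

```python
def split_va(va: int) -> dict:
    """Split a 48-bit virtual address into table indexes (4 KB granule)."""
    return {
        "l0": (va >> 39) & 0x1FF,   # bits [47:39]
        "l1": (va >> 30) & 0x1FF,   # bits [38:30]
        "l2": (va >> 21) & 0x1FF,   # bits [29:21]
        "l3": (va >> 12) & 0x1FF,   # bits [20:12]
        "offset": va & 0xFFF,       # bits [11:0]
    }


print(split_va(0x4008_0123))
```

For an address in the second gigabyte, such as the one above, the Level 0 index is 0 and the Level 1 index is 1.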

For the purpose of my bare metal project I wanted the translation table to cover 2 GB of physical memory. Since each Level 0 entry covers 512 GB, a single entry in that table is enough in our case. The Level 1 table holds two 1 GB memory blocks. This means that there is no need to set up tables lower than Level 1!

At this point you may be wondering about the remaining bits [63:48] of the virtual address. These bits tell which translation table base register to use. If they are all set to zero, ttbr0_el1 is used; if they are all set to one, ttbr1_el1 is used. If the bits hold a mix of ones and zeros, a translation fault is generated. This plays well with our kernel and user space separation: ttbr0_el1 holds the translation table for the user space and ttbr1_el1 for the kernel space. In our simplified case ttbr0_el1 and ttbr1_el1 can be set to point to the same table.
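The rule for picking the base register can be written down in a few lines. This Python sketch assumes the 48-bit configuration used in this post (16 upper bits that must be all zeros or all ones); select_ttbr is a hypothetical helper name, and the ValueError stands in for the translation fault the hardware would raise.

```python
def select_ttbr(va: int) -> str:
    """Pick the base register from bits [63:48] of a 64-bit virtual address."""
    top = (va >> 48) & 0xFFFF
    if top == 0x0000:
        return "ttbr0_el1"   # user half of the address space
    if top == 0xFFFF:
        return "ttbr1_el1"   # kernel half of the address space
    raise ValueError(f"translation fault: mixed upper bits in {va:#018x}")


print(select_ttbr(0x0000_0000_4000_0000))
print(select_ttbr(0xFFFF_0000_4000_0000))
```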

With this information we are ready to configure the MMU.

TCR_EL1

The tcr_el1 register holds information relevant to the structure of our virtual addresses. We are going to set up tcr_el1 the following way:

IPS, bits [34:32]

Intermediate Physical Address Size
Set to 0b101 which corresponds to 48 bits

TG1, bits [31:30]

ttbr1_el1 granule size
Set to 0b10 which corresponds to 4 KB

SH1, bits [29:28]

Shareability attribute for memory associated with ttbr1_el1 related translation table walks
Set to 0b11 which corresponds to Inner Shareable

ORGN1, bits [27:26]

Outer cacheability attribute for memory associated with ttbr1_el1 related translation table walks
Set to 0b01 which corresponds to Normal memory, Outer Write-Back Write-Allocate Cacheable

IRGN1, bits [25:24]

Inner cacheability attribute for memory associated with ttbr1_el1 related translation table walks
Set to 0b01 which corresponds to Normal memory, Inner Write-Back Write-Allocate Cacheable

EPD1, bit [23]

Disable translation table walk for translations using ttbr1_el1
Set to 0b0 since we want to perform translation table walks using ttbr1_el1

A1, bit [22]

Selects whether ttbr0_el1 or ttbr1_el1 defines the Address Space Identifier (ASID)
Set to 0b0 (ttbr0_el1 defines the ASID)

T1SZ, bits [21:16]

The size offset of the memory region addressed by ttbr1_el1.
Set to 16, since we want 64 - 16 = 48 significant bits for our virtual addresses.

TG0, bits [15:14]

ttbr0_el1 granule size
Set to 0b00 which corresponds to 4 KB

SH0, bits [13:12]

Shareability attribute for memory associated with ttbr0_el1 related translation table walks
Set to 0b11 which corresponds to Inner Shareable

ORGN0, bits [11:10]

Outer cacheability attribute for memory associated with ttbr0_el1 related translation table walks
Set to 0b01 which corresponds to Normal memory, Outer Write-Back Write-Allocate Cacheable

IRGN0, bits [9:8]

Inner cacheability attribute for memory associated with ttbr0_el1 related translation table walks
Set to 0b01 which corresponds to Normal memory, Inner Write-Back Write-Allocate Cacheable

EPD0, bit [7]

Disable translation table walk for translations using ttbr0_el1
Set to 0b0 since we want to perform translation table walks using ttbr0_el1

Bit [6]

Reserved, RES0

T0SZ, bits [5:0]

The size offset of the memory region addressed by ttbr0_el1.
Set to 16, since we want 64 - 16 = 48 significant bits for our virtual addresses.

Assembly for setting up tcr_el1:


    ldr     x0, =0x5b5103510
    msr     tcr_el1, x0
    isb
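The constant loaded into tcr_el1 can be reproduced from the field list above. The following Python sketch is just a host-side sanity check; the dictionary layout and the name build_tcr are my own choices for this illustration.

```python
# Field name -> (lowest bit position, value), taken from the list above.
TCR_FIELDS = {
    "IPS": (32, 0b101),
    "TG1": (30, 0b10),
    "SH1": (28, 0b11),
    "ORGN1": (26, 0b01),
    "IRGN1": (24, 0b01),
    "EPD1": (23, 0b0),
    "A1": (22, 0b0),
    "T1SZ": (16, 16),
    "TG0": (14, 0b00),
    "SH0": (12, 0b11),
    "ORGN0": (10, 0b01),
    "IRGN0": (8, 0b01),
    "EPD0": (7, 0b0),
    "T0SZ": (0, 16),
}


def build_tcr(fields: dict) -> int:
    """OR every field value into place at its bit position."""
    value = 0
    for shift, field_value in fields.values():
        value |= field_value << shift
    return value


print(hex(build_tcr(TCR_FIELDS)))  # 0x5b5103510
```

Keeping the fields in a table like this makes it easy to see what changes if, for example, a different T0SZ is wanted.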

MAIR_EL1

The mair_el1 register acts as a table containing memory attributes associated with EL1 address translations. Each entry of mair_el1 holds one byte. We will have two types of memory attributes: Device-nGnRnE memory, which covers 0x00000000-0x3fffffff (the first 1 GB; in QEMU the peripherals etc. lie here) of our 2 GB memory, and Write-Back Non-transient Normal memory (both Inner and Outer), which covers 0x40000000-0x7fffffff (all the code will be located here) of the 2 GB memory. So we will write the following to mair_el1:

Attr1, bits [15:8]
Set to 0b11111111, Normal Memory
Attr0, bits [7:0]
Set to 0b00000000, Device-nGnRnE memory

Assembly for setting up mair_el1:


    ldr     x0, =MAIR_EL1_VALUE     // 0xff00 with the two attributes above
    msr     mair_el1, x0
    isb
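The value behind MAIR_EL1_VALUE is not spelled out in the listing, but with the two attributes described above it works out to 0xff00. A short Python sketch (build_mair and the constant names are hypothetical) packs the attribute bytes the same way the register does:

```python
ATTR_DEVICE_NGNRNE = 0x00   # Attr0: Device-nGnRnE memory
ATTR_NORMAL_WB = 0xFF       # Attr1: Normal memory, Write-Back


def build_mair(attrs: list) -> int:
    """Pack up to eight attribute bytes, Attr0 in the lowest byte."""
    if len(attrs) > 8:
        raise ValueError("mair_el1 has room for eight attributes")
    value = 0
    for index, attr in enumerate(attrs):
        value |= (attr & 0xFF) << (8 * index)
    return value


print(hex(build_mair([ATTR_DEVICE_NGNRNE, ATTR_NORMAL_WB])))  # 0xff00
```

The position of an attribute in the list is the index that the translation table entries later refer to through AttrIndx.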

TTBR0_EL1 and TTBR1_EL1

In our simple case ttbr0_el1 and ttbr1_el1 hold the base address of the same Level 0 translation table. This table in turn holds the address of a single Level 1 table. As we discussed earlier, an entry of a translation table can point either to a memory block or to a translation table of the next level. To distinguish between these two kinds of entries, bits [1:0] of the entry need to be set to 0b11 (table descriptor/entry) or 0b01 (block entry). Since the Level 0 table holds an address corresponding to a table descriptor, we will set its bits [1:0] to 0b11.

The Level 1 table on the other hand holds two block entries, so the bits [1:0] will be set to 0b01.

We will also have to set a couple of other attributes for these block entries. For the block entry corresponding to the first 1 GB of our 2 GB physical memory:

UXN, bit [54]

The Unprivileged Execute-never bit
Set to 0b1 since the region is not executable

PXN, bit [53]

The Privileged Execute-never bit
Set to 0b1 since the region is not executable

AF, bit [10]

The Access flag
Set to 0b1 so that the first access to the region does not generate an Access flag fault

SH, bits [9:8]

Shareability field
Set to 0b10 which corresponds to Outer Shareable

AP[2:1], bits [7:6]

Data Access Permission bits
Set to 0b00 which corresponds to Read/Write from EL1 and no access from EL0

NS, bit [5]

Non-secure bit
Set to 0b0 meaning that the addresses in the block are in the Secure address map (the bit only has an effect for accesses from Secure state)

AttrIndx[2:0], bits [4:2]

mair_el1 index field
Set to 0b000 since Attr0 of mair_el1 corresponds to Device-nGnRnE memory.

To the block entry corresponding to the second 1GB of our 2 GB physical memory we set the following attributes:

UXN, bit [54]

The Unprivileged Execute-never bit
Set to 0b0 since the region is executable

PXN, bit [53]

The Privileged Execute-never bit
Set to 0b0 since the region is executable

AF, bit [10]

The Access flag
Set to 0b1 so that the first access to the region does not generate an Access flag fault

SH, bits [9:8]

Shareability field
Set to 0b11 which corresponds to Inner Shareable

AP[2:1], bits [7:6]

Data Access Permission bits
Set to 0b00 which corresponds to Read/Write from EL1 and no access from EL0

NS, bit [5]

Non-secure bit
Set to 0b0 meaning that the addresses in the block are in the Secure address map (the bit only has an effect for accesses from Secure state)

AttrIndx[2:0], bits [4:2]

mair_el1 index field
Set to 0b001 since Attr1 of mair_el1 corresponds to Normal memory.

Note: We can store these attributes in the entries because a descriptor only needs the upper bits of the address it points to. Translation tables are aligned to 4 KB and 1 GB blocks to 1 GB, so the lowest 12 bits of a table address (and the lowest 30 bits of a block address) are always zero and are free to hold attributes.
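To see how the attribute bits listed above end up in the two block entries, here is a host-side Python sketch that assembles the descriptors from their fields. The helper names block_descriptor and table_descriptor and the example table address are assumptions made for this illustration; only the bit positions and field values come from the lists above.

```python
def block_descriptor(pa, attr_index, sh, uxn, pxn, ap=0b00, ns=0, af=1):
    """Build a block entry from the fields described above."""
    return (
        (uxn << 54) | (pxn << 53)    # execute-never bits
        | pa                         # output address (1 GB aligned here)
        | (af << 10) | (sh << 8)     # Access flag, shareability
        | (ap << 6) | (ns << 5)      # access permissions, Non-secure bit
        | (attr_index << 2)          # index into mair_el1
        | 0b01                       # bits [1:0]: block entry
    )


def table_descriptor(next_table_pa):
    """Build a table descriptor pointing at a 4 KB aligned next-level table."""
    return next_table_pa | 0b11


device = block_descriptor(0x0000_0000, attr_index=0, sh=0b10, uxn=1, pxn=1)
normal = block_descriptor(0x4000_0000, attr_index=1, sh=0b11, uxn=0, pxn=0)
print(hex(device), hex(normal))
print(hex(table_descriptor(0x4008_1000)))  # example address, not from the project
```

With the base addresses masked out, the two results are the constants 0x60000000000601 and 0x705 that are ORed into the entries.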

Assembly for setting up ttbr0_el1 and ttbr1_el1:


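    ldr     x0, =pagetable_level0
    ldr     x1, =pagetable_level1
    orr     x2, x1, 0x3             // Level 0 entry 0: table descriptor for the Level 1 table
    str     x2, [x0]
    ldr     x4, =0x00000000         // First 1 GB block (peripherals)
    lsr     x5, x4, 30              // x5 = Level 1 index, address bits [38:30]
    and     x5, x5, 0x1ff
    lsl     x4, x5, 30              // x4 = block base address
    ldr     x6, =0x60000000000601   // UXN, PXN, AF, SH = 0b10, AttrIndx = 0, block entry
    orr     x4, x4, x6
    str     x4, [x1, x5, lsl 3]     // Each entry is 8 bytes wide

    ldr     x4, =0x40000000         // Second 1 GB block (code and data)
    lsr     x5, x4, 30
    and     x5, x5, 0x1ff
    lsl     x4, x5, 30
    ldr     x6, =0x00000000000705   // AF, SH = 0b11, AttrIndx = 1, block entry
    orr     x4, x4, x6
    str     x4, [x1, x5, lsl 3]
    msr     ttbr0_el1, x0           // User and kernel halves share the same table
    msr     ttbr1_el1, x0
    isb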

Reserve memory for the page tables:


.balign 0x1000
pagetable_level0:
    .space 0x1000
.balign 0x1000
pagetable_level1:
    .space 0x1000

SCTLR_EL1

All we have left is enabling the MMU. This is done by setting bit [0] (M) of sctlr_el1 to 1. We also set bit [12] (I) to 1, enabling instruction caches at EL1.

Assembly for enabling the MMU:


    mrs     x0, sctlr_el1
    orr     x0, x0, 0x1
    orr     x0, x0, (0x1 << 12)
    msr     sctlr_el1, x0
    isb

And there we have it! The MMU should now be set up and you should be able to read/write to addresses residing in 0x00000000-0x3fffffff and 0xffff000000000000-0xffff00003fffffff and read/write/execute addresses located in 0x40000000-0x7fffffff and 0xffff000040000000-0xffff00007fffffff.
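To double check these address ranges without booting anything, the table walk can be imitated in software. The Python sketch below is a simplified model I wrote for this purpose, not what the hardware literally does: it only knows about the single Level 0 entry and the two 1 GB block entries set up in this post, and it reports faults as ValueError.

```python
# Model of the tables built above: one Level 0 entry, two Level 1 blocks.
L1_TABLE = [0] * 512
L1_TABLE[0] = 0x0060_0000_0000_0601   # device block, not executable
L1_TABLE[1] = 0x0000_0000_4000_0705   # normal memory block, executable
L0_TABLE = [None] * 512
L0_TABLE[0] = L1_TABLE

BLOCK_OFFSET_MASK = (1 << 30) - 1          # offset inside a 1 GB block
OUTPUT_ADDR_MASK = 0x0000_FFFF_C000_0000   # bits [47:30] of a block entry


def translate(va: int):
    """Return (physical address, executable) or raise on a fault."""
    top = (va >> 48) & 0xFFFF
    if top not in (0x0000, 0xFFFF):
        raise ValueError("translation fault: bad upper bits")
    l1_table = L0_TABLE[(va >> 39) & 0x1FF]
    if l1_table is None:
        raise ValueError("translation fault at Level 0")
    entry = l1_table[(va >> 30) & 0x1FF]
    if entry & 0b11 != 0b01:
        raise ValueError("translation fault at Level 1")
    pa = (entry & OUTPUT_ADDR_MASK) | (va & BLOCK_OFFSET_MASK)
    executable = ((entry >> 53) & 1) == 0   # PXN clear
    return pa, executable


for va in (0x0000_1000, 0x4000_1000, 0xFFFF_0000_4000_1000):
    pa, executable = translate(va)
    print(f"{va:#018x} -> {pa:#010x} executable={executable}")
```

Both the low and the high alias of an address resolve to the same physical address because the two base registers point at the same Level 0 table.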

This was my first post related to bare metal programming, so there might be some errors or poorly explained sections. If you find any mistakes or room for improvement, feel free to email me! If you found this post helpful, please do the same.