Support Large Files with Doubly-Indirect Blocks

In this homework, you will extend xv6 so a single file can grow beyond the current direct-plus-single-indirect limit.

Refer to lectures, the xv6 textbook, and the filesystem code in fs.c, fs.h, and file.h to understand how xv6 maps file offsets to disk blocks.

Pull the latest code from https://github.com/sysec-uic/xv6-public/, and switch to hw7.

git pull origin
git checkout hw7

Step 1: Understand the new inode layout

In the starter code, one direct block slot has already been traded for a doubly-indirect pointer. That means each inode now supports:

  • 11 direct blocks
  • 128 blocks through one indirect block
  • 128 * 128 blocks through one doubly-indirect block

The new maximum file size is therefore 16523 data blocks.

The starter already updates the inode layout in fs.h and file.h so that:

  • addrs[0] through addrs[10] are direct blocks
  • addrs[11] is the single-indirect block
  • addrs[12] is the doubly-indirect block

Most of your work will be in fs.c.

Step 2: Update bmap() to support doubly-indirect blocks

The function bmap() translates a logical file block number into a disk block number, allocating blocks as needed.

The starter already handles:

  • direct blocks
  • single-indirect blocks

You need to finish the TODO(hw7) code in bmap() so that xv6 can:

  1. Allocate the doubly-indirect root block when needed
  2. Use that root block to find or allocate a first-level indirect block
  3. Use the first-level indirect block to find or allocate the final data block

The starter currently returns failure once the file reaches the doubly-indirect region.

Step 3: Update itrunc() to free doubly-indirect blocks

When a file is deleted, itrunc() frees all of its blocks.

The starter already frees:

  • direct blocks
  • blocks reachable from the single-indirect block

You need to finish the TODO(hw7) code in itrunc() so that xv6 also frees:

  1. Every data block reachable through the doubly-indirect block
  2. Every first-level indirect block in the doubly-indirect tree
  3. The doubly-indirect root block itself

If this part is wrong, your kernel may appear to work once, but later runs may fail because blocks were leaked.

Step 4: Test your implementation with bigfile

The template includes a user program named bigfile. It writes a large file, reads it back, deletes it, and then repeats the whole process. A correct solution should print:

$ bigfile
bigfile: writing 16523 blocks
PASS

The provided starter should fail before you implement the missing logic. Here is the expected starter behavior:

$ bigfile
bigfile: writing 16523 blocks
FAIL: wrote 139 blocks, expected 16523 (round 1)

The value 139 comes from:

  • 11 direct blocks
  • 128 single-indirect blocks

Some hints

  • The existing single-indirect code in bmap() is the model for the doubly-indirect case. The doubly-indirect case simply adds one more level of indexing.
  • Be careful with the logical block number after subtracting off the direct and single-indirect ranges.
  • For a logical block number in the doubly-indirect region, you need both:
    • the outer index: bn / NINDIRECT
    • the inner index: bn % NINDIRECT
  • Do not forget to call log_write() after modifying an indirect block that is cached in a buffer.
  • In itrunc(), free blocks from the leaves upward: data blocks first, then first-level indirect blocks, then the doubly-indirect root block.

Submission

Please follow instructions on Gradescope to submit your solution.