8

I have a script that creates large temporary files.

I'm inclined to use tmpfs for this. However, I did a quick search for tmpfs performance and found this, which reports speed as about 2 GiB/s.

Modern NVMe SSDs have comparable speeds. Putting aside differences other than performance (e.g. longer SSD life by not touching disk, SSDs having a larger capacity while RAM is limited), are there any performance advantages of using tmpfs over SSDs?

7
  • 1
    Did you actually benchmark tmpfs on your own system? You need to do that, rather than thinking that someone else's old system will be representative of yours. Do you actually have a very new server with PCIe 4.0? If not, you won't get those speeds out of your NVMe drive. Oct 4, 2020 at 15:42
  • 2
    Hate to tell you, but tmpfs keeps the files in memory, so it takes memory. Obviously it will have more speed: SSDs are still SERIOUSLY slower than RAM. But then it is not "virtual" by any means, it uses memory.
    – TomTom
    Oct 4, 2020 at 15:55
  • 2
    There's really only one option here - you need to test both on the actual kit you want to use, anything else is guesswork.
    – Chopper3
    Oct 5, 2020 at 10:15
  • 1
    You found an answer from over two years ago about a problem with tmpfs speed, and decided that's indicative of tmpfs speed in general? Oh dear.
    – Ian Kemp
    Oct 5, 2020 at 11:16
  • 1
    I don't have to have more information to point out that you are basing the premise of your question on suspect data, meaning you might want to consider whether said question is fatally flawed.
    – Ian Kemp
    Oct 6, 2020 at 13:26

3 Answers

12

tmpfs, being an extension of the pagecache, really operates as a "transparent" ramdisk. This means it provides very fast sequential read/write speed, but especially fast random IOPS (compared to a storage device).

Some examples, collected on an aging Ryzen 1700 with run-of-the-mill memory:

  • dd if=/dev/zero of=test.img bs=1M count=4096 shows 2.8 GB/s

  • overwriting the just allocated files with dd if=/dev/zero of=test.img bs=1M count=4096 conv=notrunc,nocreat shows 3.5 GB/s

  • fio --rw=randread (random read IOPS) shows 492K iops for queue depth 1 (single-thread) workload, with 2.2M iops for queue depth 8 (8-threads) workloads. This vastly exceeds any NVMe flash-based disk (eg: Intel P4610) and even XPoint-based disks (eg: Intel Optane P4801X)
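If dd is awkward to run from inside the script itself, a rough equivalent of the sequential write test can be sketched in Python. This is only an approximation of the dd runs above, not the method used to collect those numbers: the function name `sequential_write_rate` and the 256 MiB default are made up for illustration, and `/dev/shm` is assumed to be a tmpfs mount (it usually is on Linux, but check yours). No fsync is issued, so on a disk-backed filesystem the result mostly reflects the page cache.

```python
import os
import tempfile
import time


def sequential_write_rate(directory, total_mib=256, block_mib=1):
    """Write total_mib MiB of zeros to a temp file in `directory`.

    Returns throughput in bytes per second. Hypothetical helper: no fsync
    is issued, so a disk-backed filesystem is measured through its cache.
    """
    block = bytes(block_mib * 1024 * 1024)   # one zero-filled block
    blocks = total_mib // block_mib
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.perf_counter()
            for _ in range(blocks):
                f.write(block)
            f.flush()                        # hand everything to the kernel
            elapsed = time.perf_counter() - start
    finally:
        os.unlink(path)                      # always remove the test file
    return blocks * len(block) / elapsed


if __name__ == "__main__":
    # Both targets are assumptions: /dev/shm as tmpfs, "." as the
    # filesystem you would otherwise use for the temporary files.
    for target in ("/dev/shm", "."):
        if os.path.isdir(target):
            rate = sequential_write_rate(target)
            print(f"{target}: {rate / 1e9:.2f} GB/s")
```

Treat the output as a sanity check only. fio remains the better tool for random I/O and queue-depth experiments.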

For comparable performance, you would need an array of NVMe disks or, even better, memory-attached storage as NVDIMM.

In short: if you can live with tmpfs volatile storage (ie: if you lose power, you will lose any written data) it is difficult to beat it (or ramdisks in general).

However, you asked about writing large files to tmpfs, and this can be a challenge on its own: after all, writing GB-sized files will readily eat your available memory size (and budget).
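One way to guard against that in a script is to check the free space on the tmpfs mount before writing, and fall back to the regular temp directory otherwise. The following is a minimal sketch under stated assumptions: `has_room` and `pick_temp_dir` are hypothetical helper names, `/dev/shm` is assumed to be the tmpfs mount, and the 10% reserve is an arbitrary choice. Note that the free space reported for tmpfs reflects the mount's size limit, which is not a guarantee that this much RAM is actually idle.

```python
import os
import shutil
import tempfile


def has_room(directory, needed_bytes, reserve_fraction=0.1):
    """True if `directory` can hold needed_bytes and keep a reserve free."""
    usage = shutil.disk_usage(directory)
    reserve = int(usage.total * reserve_fraction)
    return usage.free - needed_bytes >= reserve


def pick_temp_dir(needed_bytes, preferred="/dev/shm"):
    """Prefer the (assumed) tmpfs mount when it has room, else the default temp dir."""
    if os.path.isdir(preferred) and has_room(preferred, needed_bytes):
        return preferred
    return tempfile.gettempdir()


if __name__ == "__main__":
    size = 8 * 1024**3  # example: an 8 GiB temporary file
    print("writing temporary files under", pick_temp_dir(size))
```

The directory returned can then be passed as `dir=` to `tempfile.NamedTemporaryFile` or `tempfile.mkstemp`.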

1
  • (Note that @shodanshok's fio numbers used threads to get the higher depth. Up until a year ago, fio's libaio was the highest performing regular async ioengine on Linux, but old-style Linux AIO only really works with direct I/O, and direct I/O is not supported on tmpfs. In the new world you may find that setting the depth to 8 via the io_uring ioengine with buffered I/O and a single thread gives even higher IOPS (which just reinforces shodanshok's answer), but this requires a suitably modern setup and fio.)
    – Anon
    Oct 9, 2020 at 8:00
8

One number you found is not a substitute for testing. Run your workload on your hardware and check if it is satisfactory. Are you doing small random-access I/O, or reading and writing the entire file?

Yes, modern busses and interconnects mean DRAM and NVMe can both drive GB/s class of sequential I/O. It also can be true that certain reads from DRAM are 10x to 100x faster than persistent solid state storage. Whether this matters to you depends on your workload. See the various visualizations of Latency Numbers Every Programmer Should Know to get an idea of the orders of magnitude.

After establishing performance, the operational considerations should not be dismissed. tmpfs will go away on reboot. If your use can tolerate no durability, great, but usually that is annoying. Yes, not writing to SSDs reduces their wear. Whether this matters depends on - you guessed it - your workload.

1

In terms of how it's implemented, tmpfs uses the same filesystem caching as other Linux filesystems, but the operations to flush data to disk are not implemented, so it's effectively cache only. In practice, this may mean that if your workload never flushes to disk, and fits entirely in memory (the latter would be a requirement to use tmpfs anyway), then even for filesystems that are backed by disk, requests would typically be served out of cache anyway, so you would see minimal performance difference between tmpfs and another filesystem.

Of course, if your workload does flush to disk, then disk write times will matter, so the suggestion given by other answers - to test performance with as realistic a workload as possible - is still the right answer.
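To see how much flushing costs on a given machine, one option is to time the same write with and without an explicit fsync. The sketch below is an illustration under assumptions, not a rigorous benchmark: `timed_write` is a hypothetical helper, the 64 MiB payload is arbitrary, and `/dev/shm` is assumed to be tmpfs. On tmpfs there is no backing device to write to, so the fsync call should return almost immediately there, while on a disk-backed filesystem it has to wait for the device.

```python
import os
import tempfile
import time


def timed_write(directory, payload, sync):
    """Write payload to a temp file in `directory` and return elapsed seconds.

    With sync=True the timing includes an fsync, i.e. the flush to the
    backing device (if the filesystem has one).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.perf_counter()
            f.write(payload)
            f.flush()                  # move Python's buffer into the kernel
            if sync:
                os.fsync(f.fileno())   # ask the kernel to write the data back
            return time.perf_counter() - start
    finally:
        os.unlink(path)                # remove the test file in all cases


if __name__ == "__main__":
    data = bytes(64 * 1024 * 1024)     # 64 MiB of zeros, arbitrary size
    for target in ("/dev/shm", "."):   # assumed tmpfs vs. current filesystem
        if os.path.isdir(target):
            cached = timed_write(target, data, sync=False)
            flushed = timed_write(target, data, sync=True)
            print(f"{target}: no fsync {cached:.3f}s, with fsync {flushed:.3f}s")
```

If the two timings are close on the disk-backed filesystem as well, flushing is unlikely to be what separates it from tmpfs for this workload.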

2
  • Testing my workload is obviously helpful, at the same time understanding the mechanisms at play can help me modify my workload to best utilize the hardware. You've reduced the difference between tmpfs and regular fs to flushing. Are there resources that can help me understand what governs flushing? Oct 6, 2020 at 9:40
  • @PeeyushKushwaha the most important thing will be the software you're running. If it's code you control, then you can control whether and when data is flushed to disk, and potentially write it to use mmapped IO, or use madvise or fadvise to control how and when it writes stuff back. If it's an existing application, then you may have less control. For databases, it's often fsync behaviour and configuration that you want to look for in the documentation.
    – James_pic
    Oct 6, 2020 at 10:14
