
Our small computing cluster has 3 computing nodes and 1 file server. The file server holds around 70 TB of data that needs to be accessed by the 3 computing nodes. Each of the 3 computing nodes has three 4 TB SSDs which are currently idle.

Currently the computing nodes use NFS to access files on the file server, and all the servers are connected to a 1 Gbps network. Is there a proper way to create a "pool of cache" from those SSDs so that the nodes don't always have to fetch files from the file server? Is that even the right approach, or should I just stick with a local SSD cache on each node?

I have tried setting up GlusterFS across those SSDs with FS-Cache on top of the GlusterFS mount, but it caches nothing.

2 Answers


fscache is what you want, but be aware that it won't cache files opened for writing unless you're running NFS v4 (see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/storage_administration_guide/fscachelimitnfs). In fact, opening a file for writing invalidates any of that file's cached data.
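A minimal per-client setup sketch, assuming the stock cachefilesd package and made-up paths for the cache directory and export (put the cache directory on one of the idle SSDs):

    # install and enable the userspace cache daemon
    apt-get install cachefilesd          # or: yum install cachefilesd

    # /etc/cachefilesd.conf -- point the cache at an SSD-backed directory
    #   dir /ssd/fscache
    #   tag node-cache

    systemctl enable --now cachefilesd

    # mount the export with the "fsc" option so NFS reads go through FS-Cache
    mount -t nfs -o fsc,vers=4 fileserver:/export /mnt/nfs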

Create some small test files on the NFS share, then try cat-ing them (or anything else that opens them for reading) from the NFS client machine. If you don't see anything populating in your configured cache directory, you probably don't have fscache fully configured or enabled yet.
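For example, something along these lines (paths assume the mount and cache directory from the sketch above):

    # on the file server: create a small test file in the export
    dd if=/dev/urandom of=/export/cachetest bs=1M count=10

    # on a client: read it, then check whether the cache backend grew
    cat /mnt/nfs/cachetest > /dev/null
    du -sh /ssd/fscache              # should grow after the read
    cat /proc/fs/fscache/stats       # hit/miss counters, if your kernel exposes them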

I would typically configure each NFS client with its own local cache. This won't entirely prevent network access: fscache still requires that the client communicate with the server regularly for cache coherency, locks, etc., but it does reduce how many data blocks must be transferred for commonly-read parts of files.
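To make that per-client cache persistent across reboots, a hypothetical /etc/fstab entry on each compute node (same export and options as above) would be:

    fileserver:/export  /mnt/nfs  nfs  fsc,vers=4,_netdev  0  0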

As for making a distributed cache, you may not see any performance improvement from that, since the point of the cache is to avoid network traffic as much as possible (mostly data reads and writes; metadata, coherency, etc. must still hit the network). Something like GlusterFS adds more network traffic back into the mix. Still, it may be worth testing for your specific case.
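A rough way to test is to time the same large read twice, dropping the page cache in between so the second pass has to come from the SSD-backed FS-Cache rather than RAM (run as root; the file path is just an example):

    sync; echo 3 > /proc/sys/vm/drop_caches
    time dd if=/mnt/nfs/bigfile of=/dev/null bs=1M   # cold: pulled over the network
    sync; echo 3 > /proc/sys/vm/drop_caches
    time dd if=/mnt/nfs/bigfile of=/dev/null bs=1M   # warm: should hit the local cache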


The problem you face comes down to what handles the file locking and cache coherency.

Because you're using an NFS-based solution, the NFS server itself has to track which clients have locked any given file and, once changes are made, ensure that its own cache remains coherent (usually by flushing its cached copy of the file as it previously was).

If you created a per-node caching layer on the client machines, they would have to coordinate with the server to ensure their cached copies of files stayed synchronised with any writes to those files, and I don't believe there are any protocols available to accomplish this.

Obviously, individual nodes could cache these files with some form of version identifier (probably the last-written date/time stamp) and check that it hasn't changed before any cache read. However, that is typically done as an in-memory cache; you would need to write something very specific to keep it in sync if local disk were used.
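As a rough illustration of that idea (all paths are hypothetical, and this ignores races between the check and the read):

    #!/bin/sh
    # use the local SSD copy only if it is at least as new as the NFS original
    SRC=/mnt/nfs/data/input.bin
    CACHE=/ssd/cache/input.bin

    if [ ! -f "$CACHE" ] || [ "$(stat -c %Y "$SRC")" -gt "$(stat -c %Y "$CACHE")" ]; then
        cp "$SRC" "$CACHE"       # refresh the local copy from the file server
    fi
    # ...read from "$CACHE" instead of the NFS path...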

You may well be on the right track with the distributed file-system approach, as the locking, and thus the cache coherence, is managed in a distributed fashion by the nodes themselves rather than by a centralised server. While there are many different distributed file systems out there, I know that Ceph and a few others allow for node-led caching of shared file-system objects/files.

  • Maybe I misunderstand something, but isn't FS-Cache already a per-node caching layer?
    – Leo
    Jan 12 at 20:30

