Distributed file storage with Seaweed-FS

For one of our current projects we need to store a huge number of files and make them easily accessible. Access control is handled in the application layer, so there is no need for a heavy POSIX-compliant file system. And since we feel at home on the world wide web, we decided to go with a distributed storage engine that communicates via HTTP: the still young but innovative Seaweed-FS.

Seaweed-FS consists of a cluster of master servers, which coordinate the communication, and a number of volume servers, each of which handles a set of volumes. Volume servers are logically grouped into racks, and racks are logically grouped into data centers, so one is free to build any topology for file storage.
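To make the grouping concrete, here is a minimal sketch that models such a topology in plain JavaScript. The server addresses, the data center and rack names, and the buildTopology helper are all made up for illustration; this is not how Seaweed-FS stores its topology internally.

```javascript
// Hypothetical inventory: every volume server is tagged with a data center and a rack.
const volumeServers = [
  { url: "10.0.0.1:8080", dataCenter: "dc1", rack: "rack1" },
  { url: "10.0.0.2:8080", dataCenter: "dc1", rack: "rack2" },
  { url: "10.1.0.1:8080", dataCenter: "dc2", rack: "rack1" },
];

// Group the flat list into dataCenter -> rack -> [volume server urls].
function buildTopology(servers) {
  const topology = {};
  for (const { url, dataCenter, rack } of servers) {
    topology[dataCenter] ??= {};
    topology[dataCenter][rack] ??= [];
    topology[dataCenter][rack].push(url);
  }
  return topology;
}

console.log(JSON.stringify(buildTopology(volumeServers), null, 2));
```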

But why should we do this?

As you may have guessed, there is a purpose behind all this. It's all about replication and the availability of our business data, which is crucial for running our service. When we want to save a file in a Seaweed-FS infrastructure, the first thing we do is ask the master where a free volume on a volume server is. With a handy combination of 3 digits, we can tell the master server if and how we want our files to be replicated. Replicas can live in different data centers, on a different rack or, if that is enough, just on another volume server in the same rack. It's up to us to decide what fits our use case. The master server will create a so-called layout for us and assign volumes to it. We can even set a collection name to identify this layout.

Examples of replication types

Replication   Meaning
000           no replication, just one copy in the system
110           replicate once in a different data center and once on a different rack
200           replicate twice, in two other data centers
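The three digits can be decoded with a few lines of code. The sketch below assumes the usual Seaweed-FS reading of the code: the first digit counts replicas in other data centers, the second counts replicas on other racks in the same data center, and the third counts replicas on other servers in the same rack. The parseReplication helper and its field names are our own invention for illustration.

```javascript
// Decode a replication code such as "110" into its three counters.
// Assumption: digits mean (other data centers, other racks, same rack).
function parseReplication(code) {
  if (!/^[0-9]{3}$/.test(code)) {
    throw new Error(`invalid replication code: ${code}`);
  }
  const [otherDataCenters, otherRacks, sameRack] = [...code].map(Number);
  return {
    otherDataCenters,
    otherRacks,
    sameRack,
    // The original copy plus every requested replica.
    totalCopies: 1 + otherDataCenters + otherRacks + sameRack,
  };
}

console.log(parseReplication("110").totalCopies); // 3
```

Under this reading, both 110 and 200 result in three physical copies of every file, they just differ in where those copies are placed.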

After that, the master server returns the id of the file under which our binary data will be stored, together with the id of the volume. The volume id can map to one or more volumes on different volume servers, depending on our replication option. We send the data via HTTP to the volume server, and Seaweed-FS does the rest.
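The two steps, assign and upload, can be sketched with nothing but the fetch, FormData and Blob globals that ship with Node.js 18 and later. The sketch assumes the master's /dir/assign endpoint with its replication and collection query parameters, and a response carrying fid and url fields. The function names and the injectable fetchImpl parameter are our own additions for illustration and are not taken from node-seaweedfs.

```javascript
// Step 1: ask the master for a file id and a volume server to write to.
async function assignFileId(masterUrl, { replication, collection } = {}, fetchImpl = fetch) {
  const query = new URLSearchParams();
  if (replication) query.set("replication", replication);
  if (collection) query.set("collection", collection);
  const res = await fetchImpl(`${masterUrl}/dir/assign?${query}`);
  if (!res.ok) throw new Error(`assign failed with status ${res.status}`);
  const body = await res.json();
  if (body.error) throw new Error(body.error);
  return body; // expected shape (assumption): { fid: "3,01637037d6", url: "127.0.0.1:8080", ... }
}

// Step 2: send the bytes as a multipart form to the assigned volume server.
async function uploadFile(assignment, data, filename, fetchImpl = fetch) {
  const form = new FormData();
  form.append("file", new Blob([data]), filename);
  const res = await fetchImpl(`http://${assignment.url}/${assignment.fid}`, {
    method: "POST",
    body: form,
  });
  if (!res.ok) throw new Error(`upload failed with status ${res.status}`);
  return res.json();
}

// The volume id is the part of the file id before the comma.
function volumeIdOf(fid) {
  return Number(fid.split(",")[0]);
}
```

Assuming a master on localhost:9333, a call chain would be assignFileId("http://localhost:9333", { replication: "110" }) followed by uploadFile(assignment, buffer, "photo.jpg"). Passing fetchImpl explicitly makes both functions easy to test without a running cluster.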

Of course we don't want to talk plain HTTP ourselves, so we took cruzzr's node-weedfs and refactored it into our own node-seaweedfs module, which lets us interact with weed via promises, streams and all the other sugar node.js provides. The integration into hapi is done via another module of ours, hapi-seaweedfs, which integrates nicely with hapi-intercom, an event bus system for coupling hapi plugins horizontally. See the repos for some usage examples.

hapi-seaweedfs can be configured to query the master server regularly and will emit an event on the "seaweedfs" intercom channel. SeaweedFS supports automatic failover and will elect a new leader in the master cluster if the old one goes down. In later versions of our modules we will implement a failover feature that switches transparently to the newly elected leader.
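The idea of polling the masters and reacting to a leader change can be sketched with a plain EventEmitter. This is not the hapi-seaweedfs implementation: the MasterWatcher class and the event names are hypothetical, and the sketch assumes that each master answers on /cluster/status with a JSON body containing a Leader field.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical watcher that tracks the current leader of the master cluster.
class MasterWatcher extends EventEmitter {
  constructor(masterUrls, fetchImpl = fetch) {
    super();
    this.masterUrls = masterUrls;
    this.fetchImpl = fetchImpl;
    this.leader = null;
  }

  // Ask each known master in turn until one reports the current leader.
  async check() {
    for (const url of this.masterUrls) {
      try {
        const res = await this.fetchImpl(`${url}/cluster/status`);
        if (!res.ok) continue;
        const { Leader } = await res.json();
        if (!Leader) continue;
        if (Leader !== this.leader) {
          this.leader = Leader;
          this.emit("leaderChanged", Leader);
        }
        return Leader;
      } catch (err) {
        // This master is unreachable, so try the next one.
      }
    }
    this.emit("unavailable");
    return null;
  }

  // Poll at a fixed interval until stop() is called.
  start(intervalMs) {
    this.timer = setInterval(() => this.check(), intervalMs);
    return this;
  }

  stop() {
    clearInterval(this.timer);
  }
}
```

A client could subscribe with watcher.on("leaderChanged", ...) and point subsequent assign requests at the reported leader, which is roughly what a transparent failover would need.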

As always, feedback is appreciated. Just open an issue in the corresponding repository.
