
How to share a folder in Linux?

18h 23m ago by mander.xyz/u/ranzispa in sysadmin

I work on an HPC and often have to share files with other users. The most approachable solution is to have an external cloud storage and rclone files back and forth. However, some projects are quite heavy (several TB), so that is unfeasible. We do not have a shared group. The following is the only solution I found which is not just setting all permissions to 777, and I still don't like it.

Create a directory and set an ACL to give access to the selected users. This works fine if the users create new files in there, but it does not work if they copy from somewhere else, as the default umask is 022. Thus the only workable solution is to change the default umask to 002, which however affects file creation system wide. The alternative is to change permissions every time you copy something, but you all know very well that is not going to happen.
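
A minimal sketch of the setup and the failure mode described above. The share path and user names (alice, bob) are hypothetical; the runnable part just demonstrates why a copy under umask 022 loses the group/other write bits that the ACL mask depends on.

```shell
# The ACL setup (illustrative only, paths and users are made up):
#   setfacl -m u:alice:rwx,u:bob:rwx /path/to/share        # access ACL
#   setfacl -d -m u:alice:rwx,u:bob:rwx /path/to/share     # default ACL for new files

# The copy problem in miniature: under the usual umask of 022, a file
# copied into the share arrives as 644, so the effective ACL mask drops
# write access for the named users.
workdir=$(mktemp -d)
umask 022
touch "$workdir/results.dat"                   # a file created elsewhere, mode 644
cp "$workdir/results.dat" "$workdir/copy.dat"  # cp applies the umask to the new file
stat -c '%a' "$workdir/copy.dat"               # prints 644: group write bit masked off
```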

Does it really have to be such a pain in the ass?

Uh, why not create the shared group? That's more or less exactly what groups exist for.

Lots of paperwork involved around our HPC. Politics are in the middle, and we would plausibly have to sign a confidentiality agreement with everyone who shares anything in there, which would have to go through a review process that generally takes about 6 months.

There’s a reason for this governance and you’re putting your whole team at risk trying to do this yourself

It's just default Linux permissions. People who get access to the HPC through the same institution are placed in the same group. As you may know, scientific collaboration is quite important. Indeed, when collaborating we sign all the necessary paperwork, but that does not carry over to the HPC administrators, who are part of a separate institution. To request a separate group you have to contact the HPC institution, and they will have to contact the institutions involved. Those institutions will have to check that NDAs are already in place. If the NDAs are in place, they will have to check that the data to be shared is actually covered by the project. I will have to fill out a bunch of paperwork. This will be sent to an external auditor to check that everything is correct, and then everything goes back up that chain.

I already waste way too much of my time on paperwork. Worst case scenario is a collaborator leaks some data which will be published publicly in a few months anyway. And those collaborators will have access to such data anyway, just through other less comfortable means.

The fact that you’re sharing this internal policy stuff so openly is definitely a red flag.

I'm no sysadmin, I just run my homelab. Let me get this straight... You want to bypass system-level access restrictions with some form of control, but not go through your company's standard method of doing so because of bureaucracy?

If that's the case: why not put something in front, like OpenCloud for example?

I mean, maybe OC is not what you need, but conceptually... would a middleman solution work for you? If so, you could go with a thousand different alternatives depending on your needs.

A cloud solution is indeed an option, however not a very palatable one. The main problem with a cloud solution would be pricing. From what I can see, you can get 1TB for about 10€/month. We'd need substantially more than that. The cost is feasible and not excessive, but frankly it's a bit of a joke to have to use someone else's server when we have our own.

You want to bypass system-level access restrictions with some form of control, but not go through your company's standard method of doing so because of bureaucracy?

Yes. Not a company but public research, which means asking for a group change may lead to several people in the capital discussing whether that is appropriate or not. I'd like this to be a joke, but it is not. We'd surely get access eventually if we did that, but it has an unfortunate side effect: if we work that way, every new person who needs access has to wait through all that paperwork.

Don't bypass your organizational policies

I think he meant self-hosting Opencloud

Yes. That's what I recommended. Self-host whatever middleman software. Opencloud, WebDAV, S3, FTP, anything he puts in the middle can accomplish what he wants.

I'm pretty sure you can do this by adding default user entries to the directory acl which will then be set on files added to that dir.

Default user entries are in there and do work, however when copying existing files those get masked with the existing group permissions. As such, the only solution I found is to have everyone set their umask to 002 as otherwise we would not get write access to files which are copied and not created in place.

Ah, I see. Well, it's ugly, but you could use inotify to trigger a tiny script that updates the perms when files are added or copied to the share dir.

That is a possibility, but what would the setup look like? Only the owner can update the permissions. This would mean that all users need an inotify daemon on that folder for whenever they copy something in there. Not to mention, this is an HPC and we mostly live in login nodes; our sessions are limited to 8 hours, which makes setting up such a daemon a bit tricky. I could probably set up a cronjob somewhere else to connect and start it, but it feels a bit cumbersome.

Running the inotify script as a service as root would require only one instance. You could trigger it on close_write and then run setfacl to add ACL entries to the new file for all the share users.
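
A sketch of that root-run watcher, assuming inotify-tools is installed and the service can run as root. The share path and user names are hypothetical. It is written to a file here so the script itself can be syntax-checked without actually needing root or inotify support:

```shell
# Watcher sketch: react to completed writes in the share and grant the
# collaborators rw via ACL entries. SHARE and USERS are made-up values.
cat > /tmp/share-watcher.sh <<'EOF'
#!/bin/bash
SHARE=/path/to/share          # hypothetical share directory
USERS="alice bob"             # hypothetical collaborators

# close_write fires once a written file is closed; moved_to catches files
# moved into the tree. For each event, add a rw ACL entry per user.
inotifywait -m -r -e close_write -e moved_to --format '%w%f' "$SHARE" |
while read -r file; do
    for u in $USERS; do
        setfacl -m "u:$u:rw" "$file"
    done
done
EOF
bash -n /tmp/share-watcher.sh && echo "syntax OK"
```

On a systemd host this would typically be wrapped in a simple unit file so one instance runs system-wide, rather than per user.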

If you can't add a daemon or service to the system then you can skip inotify and just slam a cron job at it every minute to find new files and update their perms if needed. Ugly but effective.
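
The cron sweep in miniature. On the real system the `-exec` would be a `setfacl` call for the share users; plain `chmod` is used here so the sketch runs anywhere, and a temp dir stands in for the hypothetical share path:

```shell
# Crontab entry would look like:
#   * * * * *  find /path/to/share -type f ! -perm -g+w -exec setfacl -m u:alice:rw {} +
share=$(mktemp -d)
touch "$share/new.dat"
chmod 644 "$share/new.dat"            # a freshly copied file, no group write

# The sweep: find files missing group write and open them up.
find "$share" -type f ! -perm -g+w -exec chmod g+rw {} +
stat -c '%a' "$share/new.dat"         # prints 664
```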

Another option to consider: You could write a little script that changes umask, copies files, and changes it back. Tell people they must use that "share_cp" script to put files into the share dir.
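
A minimal version of that share_cp idea. The umask change happens inside a subshell, so the caller's environment is untouched; file names below are hypothetical. Note that umask can only remove bits, so the source file still needs group read/write in its own mode:

```shell
# share_cp: copy under a group-friendly umask without touching the caller's umask.
share_cp() {
    ( umask 002 && cp "$@" )
}

# Demo: a 664 source copied under the default umask loses the group write
# bit; copied via share_cp it keeps it.
src=$(mktemp); chmod 664 "$src"
dest=$(mktemp -d)
umask 022
cp "$src" "$dest/plain.dat"
share_cp "$src" "$dest/shared.dat"
stat -c '%a' "$dest/plain.dat"      # prints 644
stat -c '%a' "$dest/shared.dat"     # prints 664
```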

We can not set up a common group, and there is no way we get root privileges. A cron job would not work either: it is a cluster with many nodes, many of which are login nodes. Cron jobs do not work on such systems.

A share_cp script would in fact be a good solution, I may try that and see if people pick it up.

Here's someone that solved this by monitoring the directory using inotifywait, but based on the restrictions you already mentioned I'm assuming you can't install packages or set up root daemons, correct?

https://bbs.archlinux.org/viewtopic.php?id=280937

Edit: CallMeAI beat me with this exact same answer by 15 minutes.

A dedicated file sharing application.

What do you mean? Is there an application that allows easily sharing files on one Linux system? That would be nice!

If you mean going through an external server or peer-to-peer transfer, that is not too feasible. I do not have other storage places with tens of terabytes available, and transferring that much data through some P2P layer, while feasible, would probably be even less user friendly.

NFS (or Samba, if someone is running Windows)
sftp
scp
You could use Python's built-in HTTP server
rsync

This is a large computing cluster, there are no such mountpoints available and I'm definitely not allowed to go there and plug a few disks into the racks.

My answer regarding sftp/scp/rsync remains unchanged. Also: rclone, Globus, FUSE.

None of these programs allow overriding Linux permissions. You can not rclone/rsync into another user's directory. You can not sftp/scp into another user's directory. My problem is not about transferring data across different systems, but rather accessing data on one system through different users. All users should be able to read and modify the files.

Oh, I see. Then I completely misunderstood. Sorry

You can't override user permissions. If you could they would be useless

Skip NFS and ftp as they cause more problems than they solve

I would hire someone who knows what they are doing. It sounds like you are out of your element here which is risky.

To answer your question, you have a few options:

  • Samba

    • Samba is just an SMB server. If you have a local Active Directory setup, use this.
  • SCP

    • this just copies files over an SSH connection
  • Rsync

    • this performs a sync of one directory to the other. It can run over SSH
  • Unison

    • like rsync but bidirectional. For it to work, it needs to track state
  • Syncthing

    • never actually used it but it might be close to what you want.

Maybe some sticky bit https://www.redhat.com/en/blog/suid-sgid-sticky-bit

I thought sticky bits were used to allow other users to edit files but not delete them. Do they also allow inheriting the parent directory permissions?

I didn't intend, and don't think, that the sticky bit stuff will or could be a complete solution for you. You've got some oddly specific and kinda cruddy restrictions to work around, and when they get that nonsensical, one ends up solidly in "cruddy hack" territory.

From the article:

group + s (special)

Commonly noted as SGID, this special permission has a couple of functions:

  • If set on a file, it allows the file to be executed as the group that owns the file (similar to SUID)
  • If set on a directory, any files created in the directory will have their group ownership set to that of the directory owner
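
The directory case from the article, in miniature. A temp dir stands in for the real share directory; mode 2775 is 775 plus the setgid bit:

```shell
# SGID on a directory: files created inside inherit the directory's group.
share=$(mktemp -d)
chmod 2775 "$share"                 # rwxrwxr-x plus the setgid bit
stat -c '%a' "$share"               # prints 2775
touch "$share/data.txt"
stat -c '%G' "$share/data.txt"      # same group as the directory itself
```

On the real system the directory would first be chgrp'd to the shared group; in this sandboxed demo the inherited group is just the user's primary group.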

You could run something like https://pypi.org/project/uploadserver/ in screen or run a cron every minute that just recursively sets the correct permissions.

Wow, that group +s seems to be exactly the clean solution I was looking for! I'll test it out and report back. I'll have to wait until Monday for my colleagues to be back on the server, but it seems very promising.

Thank you very much!

Wahoo! Best of luck!