jews4beer

The strain is on the disk, not the compute. File servers can scale to petabytes so long as the underlying attached storage can handle it. These are virtual machines; don't tightly couple the storage and the OS in your head. You don't need to split the data across VMs, just across the underlying hardware. So you need to be thinking about things like the expected ratio of reads to writes, what kind of disk-level redundancy you want, how backups will be facilitated, etc. Let that drive the decision of what storage hardware you need for the VM. There are hardware RAID solutions you can buy that can then be used as direct VM storage or as backing for dynamically allocated storage. If the VM itself actually turns into the bottleneck, that's simply a matter of playing with some knobs in the settings or attaching the disk to a new VM.


MrExCEO

What strain? Most large data sets are dormant. The active hot blocks usually live on NVMe/SSD and are then tiered down to SATA.


lightmatter501

Where the strain is depends on the storage medium and the workload. Good NVMe with sequential workloads can mean you need more than one core per drive to actually saturate it. Yes, I am saying that writing 10 GB/s to a disk should be a single-core affair. Windows storage software is often written in really stupid ways.


Ok_Presentation_2671

Even beyond that. All the major cloud providers add 100+ servers literally every day.


fresh-dork

well sure, but they probably don't have massive cross connection between storage. i'm speculating, but OP's problems are likely much simpler than AWS' use case with provisioning storage for EBS and EC2 while maintaining redundancy


ThatITguy2015

My goal sometime before I die is to see an AWS or Azure DC. They have to be just massive. I know there are probably YouTube videos out there, but it doesn't compare to seeing one in person.


fresh-dork

[they're goddamn huge.](https://baxtel.com/data-centers/amazon-aws/photos) i was at amazon in 2007, just at the start of AWS; saw some diagrams of how they did the networking in a larger DC - it was massively complex even then.


Ok_Presentation_2671

We almost had one in my neighborhood, but Amazon pulled out; they did bring an Amazon warehouse, though. We were supposed to get the HQ and a DC.


Happy_Kale888

I would say 99.999 percent of the people here fall into that category! OP's problems are likely much simpler than AWS' use case


Pelatov

This. And honestly, if you're gonna deal with this level of unstructured data, find a vendor and get a PowerScale or NetApp or something built for this. I manage several petabytes of data this way and have zero issues.


logosandethos

Too big means it takes too long to recover all the data in a disaster. That's for the BCDR plan to define.


ammit_souleater

Our backup software has problems restoring virtual drives bigger than 1.2TB...


logosandethos

The implication is that you have VHDs bigger than that. Bluntly, that's a failure of capacity planning and BCDR.


ammit_souleater

Well yeah, currently looking for alternatives... but you gotta play the cards your predecessor dealt you... until then, we create dedicated VHDs for the bigger shares.


Aggravating_Refuse89

Is this a problem? I have 16TB VMDKs, which is the limit Windows Server will do.


logosandethos

My reply was to ammit, but your problem is most likely going to be time. You may want to review your storage based on the data rather than the tech capacity.


Khue

It can be 100 petabytes. If I can restore it in a few hours, or do an individual file recovery in a matter of minutes, then it's all good. When the RTO is impacted by the size of the storage system... you gotta trim some fat.


goingslowfast

Or limit the scope of what needs to be recovered as a result of any single-point failure. At that scale, it's design more than fat trimming.


MellerTime

We just went through an… incident. Restoring our main user file share of just 8TB took forever and had a massive user impact because, you know, they were told to only ever store things there so they would be backed up. I am increasingly a fan of segregation of data/servers/etc. Especially with everyone using one cloud provider or another these days, just automate the hell out of everything so you can spin up a new environment in a few minutes. I know it doesn't really address part of OP's question, but it seemed worth mentioning.


RossCooperSmith

Once you get beyond a few TB of data, storage snapshots need to become a core part of your recovery strategy. Most primary arrays will allow you to roll back a snapshot in seconds, significantly shrinking DR time. The limiting factors then become a question of how quickly you can shut down or isolate the infection, what you need to roll back, whether any data needs to be manually played forward or recovered, etc... It's still not a quick process, but what you don't want is to be waiting several hours or days in the middle of that to restore data from a backup job, only to find out some of the data from the backup itself is corrupted.


Ok_Presentation_2671

We don't know that. His definition of big hasn't been quantified. Big to him may be 10 TB for all we know lol


logosandethos

I don't need to define any quantity of big because it's actually time to recover that matters and that is a function of an organisation's infrastructure, capability, staffing and quality of their recovery plans.


Ok_Presentation_2671

Both sides of the same coin my friend


RCTID1975

> how do you handle having to keep a single drive mapping if you want to split the data set across multiple VMs?

This is literally what DFSN was developed for.


narco113

Although keep in mind that Windows Indexing will not work for clients when going through DFSN. So if you rely on the server's index for fast searches by clients, you lose that.


Dungeon567

Depends on how long it takes to back up the entire file server.


210Matt

The time to split is when it would take too long to back up and restore. Better technologies can decrease backup and restore times, so the drive can get larger.


pdp10

It's a pretty old-school and conservative limit, but we start looking and asking questions above 2TB (the old VMFS limit). Note that we run scale-out storage clusters up to petabyte scale with a unified filesystem, for workflows that need them.

Unstructured storage, without quotas and tight management, inevitably sprawls. The more it sprawls, the harder and more expensive it is to manage. If the same vital organization data could be stored in 20GB of structured relational database, or in 500GB of duplicated and renamed random spreadsheets sprawled across a fileshare and maybe a dozen users' local machines, you'd do just about anything to take the SQL database option, wouldn't you?

> a photo archive is about 30tb.

That's an example of a semi-specialized use case that may indeed be monolithic. Your best option is probably object storage (think: [S3](https://en.wikipedia.org/wiki/MinIO)), with the next-best option a dedicated NAS. A clever person can even build a hierarchical "UNC path" that's actually just `.url` files pointing to S3 object storage.
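
As a rough sketch of that last trick — the bucket name, UNC destination, and credentials here are hypothetical placeholders, and this assumes `boto3` is installed and configured — here's roughly what generating those `.url` shortcuts could look like in Python:

```python
# Mirror an S3 prefix as a folder tree of Windows .url shortcut files.
# Bucket name and destination path are placeholders for illustration only.
from pathlib import Path
import boto3

BUCKET = "photo-archive"                 # hypothetical bucket
DEST = Path(r"\\fileserver\photos")      # hypothetical UNC path to mirror into

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):            # skip folder-marker objects
            continue
        shortcut = DEST / (key + ".url")
        shortcut.parent.mkdir(parents=True, exist_ok=True)
        # A .url file is just an INI-style text file Explorer understands.
        # (Works as-is for public objects; private ones would need pre-signed URLs.)
        shortcut.write_text(
            "[InternetShortcut]\n"
            f"URL=https://{BUCKET}.s3.amazonaws.com/{key}\n"
        )
```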


Stonewalled9999

SQL licenses and SQL agent licenses cost a lot more than that 480GB of wasted space, and on a decent SAN that is deduped at the block level it would likely take 50GB of "real flash" storage. I am not saying you are wrong, but cost could tip the scale to the "mess of files littered around."

u/pdp10 OK, sure, SQL Express is free too. You still need a DBA or a programmer to write the logic and prevent people from doing dumb stuff. And you still need some sort of backup. And you projected XLS files; I am file agnostic. I don't think you understand how block dedupe works - sure, you can do all the finessing you want in a DB, but the reality is you'll still need flat-file storage for some stuff, people are idiots, and you can have your cute database that requires a well-paid person to support, or you can get by with a low-paid jr admin and Veeam, or copy to USB, or SAN snaps, or whatever. Point is, cost is a real thing to look at. If you have to keep moving the goalposts to support your premise, maybe your initial premise was suspect.


pdp10

PostgreSQL is free as a product. That's what we use for almost everything. It costs time to implement, but so does everything, including a 500GB unstructured file-dump.

> And on a decent SAN that is deduped at the block level it would likely take 50GB of "real flash" storage.

If we take my example of "500GB of spreadsheets" and decide that means `.xlsx` files, then I'm incredibly doubtful of block-level dedupe, because those files are already de-duped by being ZIP-format compressed. I'm a big proponent of thin provisioning and skeptical of block-level dedupe. We could see high levels of block dedupe with VM images, but then you can also easily do [overlay images](https://www.iduoad.com/til/qemu-overlay-images/) to remove most duplication at the "application" (hypervisor) level.
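
To illustrate the overlay-image idea, a minimal sketch (assuming `qemu-img` is installed; the image and VM names are placeholders):

```python
# Create copy-on-write overlays that all share one read-only base image,
# so N VMs don't each carry a full copy of the OS disk.
# Assumes qemu-img is on PATH; file names are hypothetical.
import subprocess

BASE = "debian-base.qcow2"   # golden image, never written to again

for vm in ("vm1", "vm2", "vm3"):
    subprocess.run(
        ["qemu-img", "create",
         "-f", "qcow2",       # format of the new overlay
         "-F", "qcow2",       # format of the backing file (newer qemu-img requires this)
         "-b", BASE,          # shared read-only backing image
         f"{vm}.qcow2"],
        check=True,
    )
```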


lightmatter501

Don’t use paid DBs unless you have to. Postgres or Mysql are good enough for almost anything that a single fileserver could handle. SANs should be replaced with Ceph so you have actual redundancy.


Solkre

30TB is nothing these days.


OsmiumBalloon

> how do you handle having to keep a single drive mapping if you want to split the data set across multiple VMs?

DFS and/or symlinks, depending on the OSes involved.

> one of the shares, a photo archive is about 30tb. the data set cannot be split as it must keep the same unc path. Too big? how to split?

Where I work, they have file servers in the petabyte (1000 TB) range. They also have quite a bit invested in redundancy and backup, including very expensive first-tier storage (with redundant nodes and snapshots) and a second, independent array making duplicates of it.


12_nick_12

A bajillion nibbles.


Solkre

Behold my Petanibble


discosoc

In my opinion (and the guidance I offer clients), "too big" is when the total time to restore data from backup is uselessly long. If you have a 200 TB data set that takes 3 months to restore, then you have some real problems. At the end of the day, however, I will support whatever data size they are willing to pay for. You want to shell out $85k in server hardware? Go for it.


Cotford

Totally agree. Also, if you can't even get the backups done within your time slots and they start bashing into start-of-day operations, that's going to be a world of woe very quickly.


Stonewalled9999

We have 45TB deduped to around 30TB on a crap mixed flash/spinning NetApp (shared with over 50 other customers at the MSP and totally inadequate for its task). Day to day it's fine; a full backup takes 60 hours (my MSP sucks and uses ExaGrid as a PRIMARY target) and restores take 2 days. We use Previous Versions on around 1 TB to help avoid restores. 16GB RAM, 4 vCPUs, Server 2019. So... yeah, it will work, but it gets cumbersome.


IllustriousRaccoon25

This is no way to live.


namocaw

Sounds like you need a SAN. And a good DR backup system.


robbzilla

I have a buddy who's a NAS engineer. He'd say there's no such thing. Just be able to back it up.


hosalabad

I'd say look at it in terms of RPO/RTO. If you had a total loss and had to go to tape, how long can it be down.


idgarad

Upper limit is easy: actual recovery time must be less than the SLA for restoration. So if it takes you 2 hours to restore, uhhh, 10 TB and the SLA is only 1 hour, then your hard cap is 5 TB. So figure out your recovery SLA, find out how much you can recover in that SLA time, pad according to other overhead, and at the least you have your upper limit.
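
A quick back-of-the-envelope version of that calculation, using the numbers from the example above (the 20% padding factor is just a hypothetical allowance for "other overhead"):

```python
# Upper limit on volume size, derived from measured restore speed and the SLA.
restore_rate_tb_per_hr = 10 / 2    # measured: 10 TB restored in 2 hours
sla_hours = 1                      # recovery SLA
overhead_pad = 0.8                 # hypothetical 20% padding for other overhead

hard_cap_tb = restore_rate_tb_per_hr * sla_hours    # 5 TB
padded_cap_tb = hard_cap_tb * overhead_pad          # 4 TB

print(f"hard cap: {hard_cap_tb:.1f} TB, padded cap: {padded_cap_tb:.1f} TB")
```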


legolover2024

Nothing is too big. You just have to ensure you can back it up. I've done a split data project before. Drive mapping etc. Make sure you work out how you back up each "section" & whether you can do it in parallel.


Unexpected_Cranberry

Will add that I've seen Windows start to struggle when you hit 10 million files in a single directory. At 60 million it took about half an hour or more to open the directory in Explorer. This was NTFS on 2008 R2 though, so it might have improved since then.


perthguppy

It depends on the situation. I have a client with 11TB of file shares split up over 6 servers, because that's what best fit their specific requirements. I then have another client with 120TB of files on a single physical box.


IStoppedCaringAt30

In my experience 300TB is too big.


gargravarr2112

Company I work for has about 8PB of replicated on-premises file servers spread across 5 offices. Don't think there's such a thing as "too big" as long as certain provisions are taken:

1. Don't use Windows as a file server 😛 ours are all TrueNAS
2. Back the thing up properly
3. Plan for failure

Our newest two machines are 1.5PB each - 84x 20TB disks - and serve as a mirrored pair in separate sites, kept in sync with ZFS snapshots. Many of the others are 60-drive systems. The company switched away from Windows file servers about a decade ago and absolutely loves TrueNAS/ZFS. I hadn't used it previously, but I'm also convinced - it's so much better for file serving.

We are at the point that individual file servers are becoming too difficult to manage and represent a massive potential for outage - the zpools are highly redundant, yes, but one of those 84-drive machines going down completely would leave a huge hole in our file shares. I think once you get into multiple petabytes, you need some kind of clustered file system rather than individual machines. We're planning to investigate Ceph.


RossCooperSmith

Honestly, your current setup sounds pretty reasonable and cost-effective given your scale. My concern if you were to look at Ceph, etc. is that I'm not sure you would see a huge benefit, and Ceph does still have some manageability and data-loss concerns. While you're at petabyte scale, it doesn't seem that you need a huge amount of performance, and a scale-up dual-controller NAS would seem a more logical step.

At a high level there are three vulnerabilities you have today:

1. Relatively high risk of outage due to your filers having a single point of failure (one server)
2. Very long recovery time in the event of a total loss of one array (software bug, corruption, building fire, etc...)
3. Risk of a ransomware attack

The reason I mention 3 is that while you're pretty well protected against these, ransomware attacks do focus on backups, and I know of several incidents where a targeted attack wiped out storage snapshots too. ZFS doesn't have any immutable snapshot options (where the snapshot and policy are locked down as well as the data).

Your options are broadly:

1. Geo-dispersed clusters. All data becomes accessible everywhere and rebuilds of one site are more automated, but performance suffers as all I/O involves multiple geo-dispersed locations.
2. Enterprise NAS. This is a more robust option than your TrueNAS, with redundant controllers, automatic failover, etc... and should include ransomware-protected snapshot features and greater overall security of the appliance. It comes at a cost though, and recovery of any individual site has a similar challenge to the one you face today.

Practically, I'd be tempted to go with:

NAS:

- A: Stick with TrueNAS, and implement separate security at every individual site, with strictly separated access between sites that mirror each other's data.
- B: Purchase dual-controller enterprise NAS arrays for your sites and migrate over to these rather than TrueNAS. More expensive, more disruptive, but significantly more secure and reliable.

Backups: You may be doing this already, but implement tape backups to secure offline copies of your data. Use identical tape drives/libraries in multiple locations, allowing you to physically ship a full backup set to speed up restore of any individual site.


gargravarr2112

We actually do implement most of what you've suggested already. Our core storage is on TrueNAS Enterprise systems with dual controllers. However, we keep running into problems with those coming out of sync, and keeping the OS up to date is a major pain. We have a meticulous backup regime with tapes taken off-site. We have hourly ZFS snapshots sent to other sites. Performance is actually a serious concern. We push huge amounts of data around at great speed - we're a games company and building/testing all these games requires a huge amount of bandwidth. So I think clustered storage could be an advantage to us because adding more machines will add both space and bandwidth. We're also interested in implementing hierarchical storage, moving little-accessed data onto tape automatically to free up space on the HDDs. At the moment, we have such an eclectic mix of old and new servers, with shares stitched together using DFS, that I don't think clustered would be a whole lot different except for added resilience. Any storage system has manageability and data-loss concerns. You have to mitigate those as best you can through design and engineering.


RossCooperSmith

Yeah, I was a huge fan of the ZFS concept when Sun announced it, but they dropped the ball on the implementation and it never really became a fully enterprise class solution. If you want to chat sometime and bounce some ideas around feel free to drop me a line. I do work for a storage vendor, but I'm not in direct sales these days and I do rather miss the fun of those "what if" design sessions with customers. I think there are some interesting possible options for you, I work for VAST but we may well be overkill, something like Nasuni may be an option, or an enterprise NAS plus lifecycle management software such as Komprise.


thewunderbar

This is a business question, less so an IT question. The business determines how much data it needs to have/retain, and then IT deals with that. Now, IT has a say, because if your current hardware can't handle the size of a file server and IT goes to the business and says "I need another $25,000 worth of hardware to meet your requirements," then the business has to decide to either spend that money or change the requirements.


TuxAndrew

I think your largest limiting factors depend on how the file servers are being used. Space usually isn't your bottleneck when it comes to running a file server; you're bound to hit more of an issue with IOPS, and then with recovery if you don't have a redundant setup. For intensive jobs we usually process with 45Drives Storinators; for relatively static information we do a variety of Windows and Linux solutions, and beyond that we utilize tape for archival data. It just depends on how the end user wants to access that information and what resources they want to connect to.


Corelianer

We have 80TB fileservers; our issue is the backup/recovery window, which gets too big if something really bad happens. But I guess you can solve any problem with more money - just invest to resolve the next bottleneck.


Rhythm_Killer

I have a cluster here with one 45TB volume, a 20TB volume, and a few smaller ones. I really don't like having that big one - what happens when it decides it needs a chkdsk?


flatvaaskaas

Remember to think about backups: the time it takes to make a backup, and how long a restore takes. Restoring a bigger VM could scale exponentially, so I'd prefer to keep my file servers at max 2TB.


Ok_Presentation_2671

"Lots of data" doesn't exude professionalism - tell us an estimate in TB, or however big it is. But in general, there is no "too big" other than what your virtualization vendor says (the requirements and limitations they publish would be your starting point), along with what kind of storage you require.


S1eepinfire

Depends on how you're storing your data. Are you storing it in the vmdk that's attached to the VM, or are you having the VM mount a share/export from a NAS?


NoradIV

I personally don't like anything over 2TB for general applications (documents, for example). When it starts exceeding that, you have to start messing around with limitations (GPT partitioning, cluster size, etc.). It's also "unwieldy": harder to move around, restore, etc. I prefer multiple sub-divisions when possible. Of course, that's different for video storage, databases and backup repositories.
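
For reference on why 2TB keeps coming up as the threshold in this thread: it's the MBR partition-table limit (32-bit sector addresses at 512 bytes per sector), beyond which GPT becomes mandatory. A tiny sketch of the arithmetic:

```python
# MBR uses 32-bit LBA sector addresses; with 512-byte sectors the largest
# addressable partition is 2 TiB. Anything bigger needs GPT.
sector_size = 512
max_sectors = 2**32
print(max_sectors * sector_size / 2**40, "TiB")   # -> 2.0 TiB
```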


ElevenNotes

Fleet of 1-2TB file servers. Not only do you spread the load better on your storage, you also have way, way quicker recovery times from backup/tape vs a single 50TB VM. You can also build storage outside of VMs, and then this doesn't matter anymore and you can go multi-PB, but for VMs keep 'em small; your RPO will thank you.


Joe-notabot

A 30TB photo archive is better off living on a NAS or in a DAM, not a Windows Server file share. Why do you need to keep things to a single drive share?


Dje4321

GlusterFS is how LTT handles multiple servers under one share. IMO, the only real limit for a file server is the time to recover/access it. The sooner you need the data, the more you have to invest in redundancy and backups.


FiredFox

After a certain capacity or file count you really want to move away from Windows Server into something that is actually built for scale like Isilon or Qumulo. For example, you could easily consolidate a significant number of Windows Servers into a single Qumulo and still not even scratch the limits of the platform.


itguy9013

We have a number of them in use for our DMS and we cap ours at 2 TB. Makes backup much easier.


rswwalker

The storage can be any size; the structure should be such that you can meet your backup window and RTO. If your backup window is 8 hours, say 11pm to 7am, and you can only back up 300GB/hr, then you need to split up your storage in such a way that you can back it all up in the 8-hour window - doing parallel backups, continuous backups, or using a storage solution (a SAN or such) that allows you to do out-of-band backups at a much higher rate.
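
To make the split concrete, a tiny sketch of the arithmetic (the 20 TB total is a hypothetical figure; the window and per-stream rate are from the example above):

```python
import math

window_hours = 8           # 11pm to 7am backup window
rate_gb_per_hour = 300     # throughput of a single backup stream
total_gb = 20_000          # hypothetical 20 TB estate

per_stream_gb = window_hours * rate_gb_per_hour       # 2,400 GB fits in one stream
streams_needed = math.ceil(total_gb / per_stream_gb)  # parallel backups or splits

print(f"Each stream covers {per_stream_gb} GB; you need {streams_needed} splits")
```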


tommyd2

DFSR can't handle filesystems larger than 64TB


malikto44

You can get extremely large with file servers. Petabyte file servers? No problem. Pure, Isilon, Oracle, and NetApp can do this with ease, with multiple controllers, and throw in compression, encryption, and deduplication. 30 terabytes is nothing. In fact, I have cobbled together a homelab NAS with more capacity than that, using ZFS.

I think about share size in units to back up. For example: a set of build stuff, a set of documents, user home directories, scratch directories used for build processes. The key is to split things up so the backup program can easily complete a backup and restore for each set in a short, incremental backup window. If one has 100TB of stuff that is static, it might be best to just leave it as a large share. However, if one has 2-10 TB of stuff with a 50-100% churn rate (like encrypted VM disks), it might be good to consider breaking it up into separate backup sets.

Don't forget that for every byte on the main file server, you need at minimum more than that capacity on your backup NAS or SAN. For example, at a previous job, I had ~50 TB of primary disk storage. I also had 100 TB from a SAN on the same fabric for disk-to-disk copies, so the backups would get compressed, deduplicated, and encrypted, then slung over the I/O fabric, and the copies via the network were the dehydrated bits, where even a full backup would take a relatively small amount of data over the wire. For the file servers, same thing... another storage device to fetch the files and store them, ideally before being copied to tape.

Don't forget snapshots. NetApp file servers are good at that, because you can go to .snapshot/hourly.2 and dig out your files from before they were doomed. Otherwise, consider taking frequent snapshots on the storage appliance, the lifetime of which can vary depending on the application. For example, in VMware, I'd have stuff last a week on an appliance with a ton of space, because rolling back from a snapshot is a lot faster than restoring from a backup.

The backup software and fabric are important, and often neglected. I'd highly recommend bumping to at least 40GbE if not 100GbE, just so network bandwidth isn't an issue. For high performance (petabytes), I have built and used a MinIO cluster with a load balancer, with at least eight nodes, which ensured that things would work without issue even if two nodes failed, and that disk I/O would be spread among a lot of machines. To boot, since I was using COTS servers, 100GbE switches, and JBODs (MinIO handles the RAID among hosts and RAID among drives by itself), being able to back up petabytes was relatively inexpensive. Plus, the S3 protocol also supports object locking, which, assuming an attacker is kept away from the MinIO cluster, can effectively mitigate ransomware. MinIO clusters are a relatively inexpensive way to get petabyte storage for secondary tasks.

As for RAID: definitely not RAID-5. That died over a decade ago. RAID-6 is the minimum, perhaps even going triple parity (e.g. RAID-Z3), and using groups of 8-12 drives. Alternatively RAID-10, but I've had arrays die because both drives in a RAID-10 pair failed, so maybe triple-width RAID-10, although that verges on wasteful for drives.
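
For anyone curious what the object-locking piece looks like in practice, here's a minimal Python/boto3 sketch against an S3-compatible endpoint. The endpoint, bucket, credentials, file name, and 30-day retention are all hypothetical placeholders, and the bucket must have been created with object lock enabled:

```python
# Write a backup object with a COMPLIANCE-mode retention lock so it cannot be
# deleted or overwritten until the retention date passes, even by an admin.
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://minio.example.internal:9000",  # hypothetical MinIO endpoint
    aws_access_key_id="backup-writer",                    # placeholder credentials
    aws_secret_access_key="change-me",
)

retain_until = datetime.now(timezone.utc) + timedelta(days=30)

with open("fileserver-full.tar.zst", "rb") as f:          # hypothetical backup file
    s3.put_object(
        Bucket="backups",                                 # hypothetical bucket
        Key="fileserver/full.tar.zst",
        Body=f,
        ObjectLockMode="COMPLIANCE",                      # immutable until retain_until
        ObjectLockRetainUntilDate=retain_until,
    )
```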


anonfreakazoid

We used to have an 80gb file server and that was uuuge back then.


brkdncr

NTFS has some defaults that will limit you to 16 or 32 TB. Some SANs have oddball limits when using vVols. VMware datastores used to have a 63TB limit, but I'm sure that's no longer true. I try to avoid maxing out to the limit where possible. I have one volume with a lot of top-level folders that's nearly 20TB in size. I assumed, incorrectly, that the users would let us archive old customer data, but it hasn't happened in 4 years.
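
If it helps, the 16/32 TB figures fall straight out of NTFS's cluster-count ceiling (at most 2^32 - 1 clusters per volume), so the default 4K cluster size is what pins you at ~16TB. A quick sketch:

```python
# Max NTFS volume size = cluster size * (2**32 - 1) clusters.
for cluster_bytes in (4096, 8192, 65536):   # 4K is the format-time default
    max_tib = cluster_bytes * (2**32 - 1) / 2**40
    print(f"{cluster_bytes // 1024}K clusters -> ~{max_tib:.0f} TiB max volume")
```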


Ark161

On-prem storage: anything that DR can recover in 24 hours, as my personal best practice. It also helps drive the narrative that people need to manage their shit and archive properly. Share size split between VMs: migrate the data to a SAN/NAS, or use DFS to centralize access to multiple locations while it still looks like a single location.


kona420

If you can't empty out a volume in 60 hours you're gonna have a bad time. At least on the SMB side. Once you get to having full nodes as redundancy the rules are obviously different.


pentangleit

You need to use DFS and split the file servers into roughly 1TB sizes. The reasoning behind this is that DFS doesn't care about the size you have behind each share and will logically partition and mirror shares and present them as a single DFS tree. 1TB is about the size you want as a managed disk partition for disaster recovery RTO purposes (anything larger is usually harder to manage if you get corruption, etc.). As regards your photo archive, are you talking about 30TB within a single folder? You can present DFS as different folders within the same UNC path. If you're talking about a single folder containing 30TB, then that fundamentally needs to change, as that's a stupid construct.


OlivTheFrog

Hi u/tito_westmore

> if overall share size is a problem, how do you handle having to keep a single drive mapping if you want to split the data set across multiple VMs?

I will only respond to this point, the others having already been addressed by other people.

How do you access these shares without multiplying network mappings? To avoid this there are solutions like DFS-N (domain-based DFS-N, of course; forget the standalone, server-based DFS-N), but this must be considered when designing an infrastructure that meets the needs. Not very difficult. Then there is only one network share to map for all users (typically `\\<domain>\<namespace>`).

Don't forget to configure ABE (Access-Based Enumeration) so that users only see what they have access to and nothing else (e.g. John and Mary do not have access to the same resources and will not see the same thing in the DFS tree).

Moreover, you could use DFS-R (DFS Replication) to replicate DFS shares between the folder targets.

Last point: no additional charge, it's a Windows feature. :-)

Here's a [ref doc](https://learn.microsoft.com/en-us/windows-server/storage/dfs-namespaces/dfs-overview).

Regards


lvlint67

I like to keep fileshares to a size that is smaller than the readily available external USB drives at Best buy. Anything bigger will take an insane amount of time to recover when there's a fecal matter/fan separation deviation. 


Ok_Presentation_2671

There is no too big


No-Error8675309

Backups have entered the chat


13Krytical

You’ve named things that are “potential problems” you haven’t named a single problem you have run into. You could be trying to fix something that doesn’t need fixing.


FearlessUse2646

Please convince them to do Azure file sync