Synology NAS Drive failure question
Posted on 3/31/23 at 1:25 pm
I’m currently running a 4 drive setup using the Synology Hybrid Raid (SHR) configuration.
Long story relatively short: I started noticing the bad sector count increase on Drive 4 over the past month or so, and the drive went into critical status the other day. Luckily, I had a new drive on hand, so I replaced Drive 4 with the new drive and rebuilt the storage pool just fine. It was then recommended that I run data scrubbing, and during this process Drive 3 went into "critical" status; when I noticed the estimated time for the data scrubbing to complete had increased substantially, I canceled the scrubbing altogether. Drive 3 had also shown a number of bad sectors over the past few months, but it was stable, and I would check it regularly for any increases. However, the data scrubbing sent it into critical status and the bad sector count has skyrocketed. It's clear the drive is failing.
My question is: what should I do until the new drive arrives, which isn't until next Tuesday? Should I continue to let the system run as normal, deactivate the critical drive and remove it from the storage pool, or just power down the entire NAS until the new drive arrives? I've seen recommendations online for all of the above, as well as reasons against all of the above. Lol. I tried to contact Synology through live chat, but of course it's down right now for some reason.
ETA: And of course both Seagate IronWolf drives were out of warranty, but I am curious if anyone on this board has had experience getting Seagate to replace failed drives that were under warranty. Was it a painless process, or did they make you jump through hoops? Time frame to get it replaced?
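For reference, this is roughly how I've been checking the counts from the command line between Storage Manager checks (just a sketch: it assumes SSH access to the NAS with smartctl available, and the /dev/sata1-style device names are a guess for my model, so adjust as needed):

```python
#!/usr/bin/env python3
"""Sketch: print the sector-health counters for each drive in the NAS.

Assumptions (mine, not gospel): smartctl is present on the box, this runs over
SSH with enough privileges, and the drives show up as /dev/sata1..4 (newer DSM)
or /dev/sda..sdd (older DSM). Adjust DEVICES for your unit.
"""
import subprocess

DEVICES = ["/dev/sata1", "/dev/sata2", "/dev/sata3", "/dev/sata4"]  # guess; adjust
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}

for dev in DEVICES:
    out = subprocess.run(["smartctl", "-A", dev],
                         capture_output=True, text=True, check=False).stdout
    print(f"--- {dev} ---")
    for line in out.splitlines():
        fields = line.split()
        # SMART attribute rows: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] in WATCHED:
            print(f"{fields[1]}: raw={fields[9]}")
```

If the raw counts keep climbing between runs, that's the trend I was watching.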
This post was edited on 3/31/23 at 3:30 pm
Posted on 3/31/23 at 4:19 pm to PhilipMarlowe
Ironically, I was just reading up on this today out of curiosity. If your unit supports hot swapping, you should be able to deactivate/repair the failing drive and pop the new one in when it arrives, then complete the rebuild/replacement. It was also noted to do the scrubbing prior to replacement. If it's not hot swappable, you'd have to deactivate it, power down, replace the drive, power back up, and complete the rebuild/repair.
Guess it's more a question of whether you have a backup and what the impact is of not having access to the data until the new drive arrives. If you have a backup, I would look at rebooting and performing a disk test and data scrub.
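If you want a quick safety net before touching anything, something like this would copy the important shares off to an attached USB drive first (just a sketch; the share names and the USB mount path are placeholders, and it assumes rsync is on the box, which it normally is):

```python
#!/usr/bin/env python3
"""Sketch: copy a few shares to an attached USB drive before swapping drives.

The paths are placeholders (DSM typically mounts USB storage under
/volumeUSB1/usbshare); point SHARES and DEST at whatever actually matters.
"""
import subprocess

SHARES = ["/volume1/photos", "/volume1/documents"]   # placeholder share paths
DEST = "/volumeUSB1/usbshare/nas-backup"             # placeholder USB mount

for share in SHARES:
    # -a preserves permissions/timestamps, -h prints human-readable sizes
    subprocess.run(["rsync", "-ah", "--progress", share, DEST], check=True)
```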
Posted on 3/31/23 at 7:27 pm to PhilipMarlowe
I just wouldn't do *anything*. I have two Synology NAS units using SHR. In my experience the bad sector count can go up *and down*, and it doesn't mean the drive is going to die anytime soon. Also, look into Backblaze B2 to provide catastrophe insurance for your data. It's shockingly cheap.
When you look at the SMART data associated with the drives in critical status, what are you seeing?
I've primarily used WD Red Pro drives, and they *always* die after the warranty period. Usually drives with double the capacity are pretty cheap by then, so I just drill or sledgehammer the old one and recycle it. For my new NAS, I did put a couple of Seagates and a couple of WDs in the same storage pool just to create some manufacturer diversity.
Posted on 3/31/23 at 8:11 pm to PhilipMarlowe
quote:
Was it a painless process, or did they make you jump through hoops? Time frame to get it replaced?
Their system seems very similar to WD's: enter the serial number, and not much else. I think I've only replaced one drive that was under warranty, and it probably took ten days or so. I'm not that tight with cash (thanks, divorce!), so I just buy another higher-capacity drive that shows up in 1-2 days from Newegg or Amazon and keep the warranty replacement as a spare for one of the other drives that will eventually fail.
Just in case you're pondering it, I once tried to put a fully functional desktop drive into my first NAS. The vibration/heat from the other drives caused complete failure of the drive within two weeks. Spend the cash on NAS drives!
Posted on 3/31/23 at 9:35 pm to PhilipMarlowe
Does anyone else feel like drives go bad way more often when they are in a NAS vs. a desktop/laptop/whatever else? Heat/vibration, or just my imagination?
Kind of off-topic but related: I think I will move away from a NAS for my own use. I feel like they're too constraining with the fixed number of drives, matching capacities, and whatnot. I *still* have not completed the project I posted about here a long time ago regarding a "hyperconverged" Proxmox cluster. The delay is mostly because my house is older and there isn't really a good place to put a server rack, so I'm shifting gears again: instead of used servers I'll put together something like Project TinyMiniMicro. Those boxes are pretty cheap (used) and quiet, so I'll see what kind of drives I can put in them (probably limited to 2.5" spinners or SSDs) unless I hang some off USB. Anyway, I love the concept of Ceph for storage since I'll be able to add a host+drive(s) at will with arbitrary capacity, basically networked JBOD. Maybe one day I'll get it up and running and report back on how it performs vs. a NAS.
Posted on 3/31/23 at 10:11 pm to Korkstand
quote:
Does anyone else feel like drives go bad way more often when they are in a NAS vs. a desktop/laptop/whatever else? Heat/vibration, or just my imagination?
If you are using desktop drives (the equivalent of a WD Blue or Black series), you will 100% experience increased failure rates. In my old 5-bay NAS, the drives in the middle bays sit at 88 degrees while the ones at the edges are at 82, and I have the fans on full blast. I put a desktop drive in that NAS, as I posted earlier, and after three-ish years of desktop service it died within two weeks of going into the NAS.
With NAS drives I don't have this problem. I choose drives based mainly on warranty length, but I also check the Backblaze failure rate stats.
It's not completely clear to me how Project TinyMiniMicro replaces a NAS, redundancy included. Are you going to buy three of them and connect JBOD via eSATA cable or something?
I've built my own domain controller for my house (and HTPCs, etc.), and now I realize my time is worth more than the headache of dealing with unrefined interfaces and recovery options. I don't know how you generate other excuses to stay away from your wife, but there are easier ways. Synology's DSM is a no-brainer, even at extra cost. There's support available, but it's mostly fire and forget. I don't care to deal with minutiae any more than necessary at this point.
With certain Synology models you can buy an expansion unit, but it's also child's play to replace a drive with a higher-capacity unit and expand the volume (within RAID constraints, of course).
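If you're curious what you'd actually gain from a bigger drive, my back-of-the-napkin version of the SHR-1 math (single-drive redundancy, ignoring filesystem overhead; Synology's online RAID calculator is the real authority here) looks like this:

```python
def shr1_usable_tb(drive_sizes_tb):
    """Rough SHR-1 (single redundancy) usable capacity.

    SHR layers RAID across same-sized slices of the drives, which works out to
    the total capacity minus the largest drive. Treat it as an estimate only.
    """
    return sum(drive_sizes_tb) - max(drive_sizes_tb)

# e.g. a 4-bay box after swapping one drive for a bigger one (made-up sizes):
print(shr1_usable_tb([4, 4, 4, 8]))  # 12 -- the extra 4 TB sits idle until a second 8 TB goes in
```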
Posted on 3/31/23 at 11:07 pm to LemmyLives
quote:
If you are using desktop drives (the equivalent of a WD Blue or Black series), you will 100% experience increased failure rates.
I don't think I have ever put a desktop drive in a NAS. Always NAS drives, and failures still seem pretty common. Not that I deal with a lot of them, so it could just be small sample size or, of course, my imagination.
quote:
It's not completely clear to me how Project TinyMiniMicro replaces a NAS, redundancy included. Are you going to buy three of them and connect JBOD via eSATA cable or something?
I would run Proxmox+Ceph. And yes, I would have at least 3 mini boxes, with the idea being that I can add a 4th, 5th, on up to an arbitrary number of nodes to scale my storage and compute capacity as needed. Ceph recommends 10gbit dedicated to the storage cluster, but I'll see how it goes with gigabit (plenty of people say it works just fine, though performance isn't exceptional). You can set redundancy via either replication or erasure coding, to whatever level of durability you want (rough napkin math below).
quote:
I've built my own domain controller for my house (and HTPCs, etc.), and now I realize my time is worth more than the headache of dealing with unrefined interfaces and recovery options. I don't know how you generate other excuses to stay away from your wife, but there are easier ways. Synology's DSM is a no-brainer, even at extra cost. There's support available, but it's mostly fire and forget. I don't care to deal with minutiae any more than necessary at this point.
This project will definitely be pretty hands-on, that's for sure, but I think it's a good trade-off for me. And since I need distributed virtualization as well, I hope to satisfy all of my requirements with a single "hyperconverged" cluster that can grow along with me, adding compute and storage resources incrementally. I look at it like yes, it will be a "project", but it'll be like a garden that I tend to from time to time.
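For anyone wondering why I care about erasure coding, the napkin math (ignoring the free-space headroom Ceph wants you to keep and assuming enough hosts to place the chunks) looks like this:

```python
def usable_fraction_replicated(size):
    """Replicated pool: every object is stored `size` times (size=3 is Ceph's default)."""
    return 1.0 / size

def usable_fraction_ec(k, m):
    """Erasure-coded pool: k data chunks + m coding chunks, survives losing m of them."""
    return k / (k + m)

# Both setups below tolerate two failed OSDs/hosts, but EC gives twice the usable space:
print(f"3x replication: {usable_fraction_replicated(3):.0%} usable")  # 33%
print(f"EC 4+2:         {usable_fraction_ec(4, 2):.0%} usable")       # 67%
```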

Posted on 4/1/23 at 8:34 am to PhilipMarlowe
Use only Western Digital Red Pro drives that you order directly from Western Digital, not from a reseller like Amazon or Newegg. Those drives have a five-year warranty and a much lower failure rate. The Red Pro is made for NAS use.
Posted on 4/1/23 at 10:49 am to FriedEggBowL
When I got my Synology RS-820 I went with the recommended Synology drives, and so far so good. But when I had a Netgear ReadyNAS I made sure to check the model of WD Red drives and stuck mainly with those. I've stayed away from Seagate for anything HDD-related for at least the last 5 years.
Posted on 4/1/23 at 1:13 pm to LemmyLives
[screenshots of the failing drive's SMART data]
Tbh, I don't know what most of those numbers mean. I did check the bad sector count again this morning: yesterday it peaked at around 17k, and this morning it dropped back to 0. I ran another SMART quick test and the drive has now moved from critical to failing. Unsurprisingly, when I run a Seagate IronWolf drive check it comes back as "healthy." Lol
I hear you on not doing anything based just on bad sector numbers, as that's what I had been doing for the last several months with these two drives. They showed bad sectors but were stable at that number for months... until they weren't. Plus, I started to notice some unusual noises from them, which tells me they are/were dying.
There's been a lot of helpful info in here, but I'm still not sure what's best: leave the NAS running with the failing drive, remove said drive from the storage pool and operate in a degraded state until the new drive gets here, or just power the whole thing down?
Posted on 4/1/23 at 9:49 pm to PhilipMarlowe
One thing not noted by others: if you have to replace a drive, the rebuilding process is going to cause extra wear on all the drives. It's not uncommon for one failure to domino and take out other drives that were already living on the edge, since their workload goes up while they're being used to reconstruct the data onto the replacement drive.
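Just to put rough numbers on it (made-up drive size and a guessed rebuild speed, not anything measured):

```python
# Why a rebuild stresses the surviving drives: each of them gets read more or
# less end-to-end to reconstruct the replacement.
DRIVE_TB = 8                 # made-up capacity per drive
SURVIVING_DRIVES = 3         # 4-bay SHR with one drive being replaced
REBUILD_MB_PER_S = 120       # guessed sustained rebuild speed

data_read_tb = DRIVE_TB * SURVIVING_DRIVES
hours = (DRIVE_TB * 1e6) / REBUILD_MB_PER_S / 3600
print(f"~{data_read_tb} TB read from the surviving drives, roughly {hours:.0f} hours of sustained I/O")
```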
Posted on 4/3/23 at 11:04 pm to PhilipMarlowe
TBH, the SMART data doesn't look great. The reason I asked is that sometimes the threshold is (for instance) 44 and the max recorded error rate is 45, but it shows as bad the same way a measurement of 500 against a threshold of 45 would.
Keep in mind that shutting the unit down and going through the startup process, including drive spin-up, *may* be more damaging than waiting another day or two to just replace the drive.
A 3-year-old drive I have reports a zero raw read error rate, a zero seek error rate, a power cycle count of 697 (in nearly 3 years)... you get the idea.
Plug in the new drive, but plug the NAS into a $50 UPS. A power cycle count of nearly 14000? That will totally kill drives, even if that's over a three or four year period. That is immediate cringe, so whatever is causing it, make it stop.
My power-on hours are 21k on the drive I'm looking at in my old NAS, so we're nearly the same, but your drive's SMART failure metric is *massively* higher, by orders of magnitude, than what I have. Cross your fingers, and order 2-day shipping from Newegg next time.
For actually helpful information: if you bought all of the drives at the same time, they are statistically likely to fail relatively close together.
The serial numbers are probably sequential, or close to it, if you bought them together. Someone sneezed in the clean room that week, who knows. Put the replacement you want for the next drive to fail on your wishlist just in case. Even with the WD Red Pro/Plus drives, when one went, I expected the next one to report as dead within 4-6 weeks.
But with 14000 power cycles on drives that have 20000 active hours? I don't know if your cat is turning off the power strip, but that is B-A-D.
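To make the threshold thing concrete, this is roughly how I read the normalized columns (sketch only, with made-up numbers; pull your real ones from DSM or smartctl):

```python
# SMART's normalized columns: VALUE starts high (often 100 or 200) and decays
# toward THRESH as the drive degrades. The attribute gets flagged once the
# value reaches the threshold, no matter how far past it things go.
attributes = [
    # (name, value, worst, thresh) -- illustration numbers, not real data
    ("Raw_Read_Error_Rate", 44, 44, 44),   # just reached the line -> flagged
    ("Seek_Error_Rate",     10, 10, 44),   # blown way past it -> flagged exactly the same way
    ("Power_Cycle_Count",   99, 99, 20),   # healthy
]

for name, value, worst, thresh in attributes:
    failing = min(value, worst) <= thresh
    print(f"{name:22s} value={value:3d} worst={worst:3d} thresh={thresh:3d} "
          f"{'FAILING' if failing else 'ok'}")
```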
Posted on 4/4/23 at 12:49 am to LemmyLives
quote:
But with 14000 power cycles on drives that have 20000 active hours? I don't know if your cat is turning off the power strip, but that is B-A-D.
Never would've thought about this. I do have the NAS powered on 24/7, but the HDDs are set to hibernate after 15 minutes. Could that be a factor in the high power cycle count? Nothing is turning my power strip on and off repeatedly, and I have no clue what else would be causing it. I rarely lose power except in the odd thunderstorm, and I've never noticed my lights flickering or any other issues with the electronics plugged in. Your comment has me concerned now. Lol.
Even the new drive that I just put in a few days ago shows a power-on time of 36 hours and already has 27 drive power cycles.
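Doing the quick math on the new drive versus yours (treating the hibernation wake-ups as what's being counted, which is my guess at what's going on):

```python
def cycles_per_day(power_cycles, power_on_hours):
    """Power cycles per day of power-on time."""
    return power_cycles / (power_on_hours / 24)

print(f"my new drive: {cycles_per_day(27, 36):.0f} cycles/day")      # ~18 per day
print(f"your drive:   {cycles_per_day(697, 21000):.1f} cycles/day")  # under 1 per day
```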
Any recommended UPS? I've never purchased one before.
Also, what temps do your HDDs run at? Mine are usually right around 41-43°C.
Posted on 4/4/23 at 2:40 pm to PhilipMarlowe
Tripp Lite, APC, and CyberPower have all been fine from what I've seen in home/small business use... I just bought a new Tripp Lite for my rack a month or two ago.
For reference, my Synology has 13,610 hours with 20 drive power cycles and 0 read errors.